Thanks for the replies, they were helpful!

I wasn't using the -O3 optimization flag for the code running in either the NW or the SW. Now I am, and the execution times in the NW and the SW are quite similar for the example I was testing.

Now I'm testing another example and I'm getting some interesting results. The code below performs an image transformation: it goes through every position in an array of integers and writes a slightly modified version of each old value to a new array:

// start timer here

for (i = 0; i < size; i++) {
    color = oldp[i];

    alpha = (color >> 24) & 0xff;
    red   = (color >> 16) & 0xff;
    green = (color >> 8) & 0xff;
    blue  = color & 0xff;

    lum = (int) (red * 0.299 + green * 0.587 + blue * 0.114);

    newp[i] = (alpha << 24) | (lum << 16) | (lum << 8) | lum;
}

// end timer here
// check timer diff and print result
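
For anyone who wants to reproduce the measurement, here is a minimal self-contained sketch of the loop above wrapped in a function. The function name to_grayscale and the fixed-width unsigned types are assumptions made for illustration; the original variables may be declared differently, and the timer calls are left out because they depend on the environment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (name and types are assumptions, not from the
 * original code): converts ARGB pixels to grayscale like the loop above. */
void to_grayscale(const uint32_t *oldp, uint32_t *newp, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        uint32_t color = oldp[i];

        /* split the pixel into its four 8-bit channels */
        uint32_t alpha = (color >> 24) & 0xff;
        uint32_t red   = (color >> 16) & 0xff;
        uint32_t green = (color >> 8) & 0xff;
        uint32_t blue  = color & 0xff;

        /* same floating-point weights as in the original loop */
        uint32_t lum = (uint32_t)(red * 0.299 + green * 0.587 + blue * 0.114);

        /* keep alpha, write the luminance into all three color channels */
        newp[i] = (alpha << 24) | (lum << 16) | (lum << 8) | lum;
    }
}
```

Calling to_grayscale(oldp, newp, size) between the two timer reads should correspond to the measured region.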

I'm testing this exact same code in both the secure and non-secure domains.

In the NW I'm getting about 155 ms of execution time, which for that buffer and transformation seems ok. On the other hand, the SW is giving me about 610 ms of execution time.

I can't seem to find a reasonable explanation for this time difference, since the code running in both scenarios is exactly the same. The secure code is running inside the TZ_VMM example.

Do you have any idea what might be happening here?

Thanks in advance, Tiago

2016-06-23 10:16 GMT+01:00 Norman Feske <norman.feske@...1...>:

Hi Tiago,

> I'm thinking that this alternative may suffer from the same problem as
> before if Genode's time clock becomes inconsistent whenever Linux is
> being executed in NW.
>
> Do you know any other way to benchmark a world switch + processing +
> world switch scenario? Is there any timer I can execute inside TZ_VMM?

have you considered the use of a performance counter for measuring
low-level code paths? For reference, you may take a look at the
'timestamp' function for ARM:

https://github.com/genodelabs/genode/blob/master/repos/os/include/spec/arm_v7/trace/timestamp.h

Compared to the other time sources, the counter is precise while having
very little overhead. The exact meaning of the counter value may depend
on the platform. E.g., on the Raspberry Pi where I used it, the counter
increases every 64 clock cycles.
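
As a rough illustration of how two counter readings could be turned into a duration, here is a small sketch. It assumes a 32-bit counter value and the divider of 64 mentioned above for the Raspberry Pi; both may differ on other platforms. The helper names elapsed_cycles and cycles_to_us and the cpu_hz parameter are hypothetical and not part of Genode.

```c
#include <stdint.h>

/* Assumption: the counter advances once every 64 CPU clock cycles
 * (the Raspberry Pi case); adjust for other platforms. */
enum { CYCLES_PER_TICK = 64 };

/* Hypothetical helper: CPU cycles elapsed between two 32-bit counter
 * readings. Unsigned subtraction tolerates a single counter wrap-around. */
uint64_t elapsed_cycles(uint32_t start, uint32_t end)
{
    uint32_t ticks = end - start;
    return (uint64_t)ticks * CYCLES_PER_TICK;
}

/* Hypothetical helper: converts cycles to microseconds for a CPU
 * clock frequency given in Hz (must be non-zero). */
uint64_t cycles_to_us(uint64_t cycles, uint64_t cpu_hz)
{
    return cycles * 1000000u / cpu_hz;
}
```

The two readings would be taken immediately before and after the code path of interest, and the interval has to be short enough that the counter wraps at most once.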

As far as I know, the feature must be explicitly enabled by adding the
following line to your <build-dir>/etc/specs.conf:

SPECS += perf_counter

Be aware that further (TZ configuration) steps may be required to expose
the counter to the normal world.

Cheers
Norman

--
Dr.-Ing. Norman Feske
Genode Labs

http://www.genode-labs.com · http://genode.org

Genode Labs GmbH · Amtsgericht Dresden · HRB 28424 · Sitz Dresden
Geschäftsführer: Dr.-Ing. Norman Feske, Christian Helmuth
