Benchmarking Genode TrustZone

Thu Jun 23 10:33:02 CEST 2016

Hello Tiago,

On 06/21/2016 03:30 PM, Tiago Brito wrote:
> Hi, I want to benchmark the execution of a function running in the secure
> world of the TZ_VMM scenario in the i.MX53 QSB.
> 
> I have added a syscall to Linux which allows me to trigger a world switch
> from a user program running in Linux. In this program I have a function
> which allocates a buffer and processes it (each buffer position is changed
> in some way). This same function is coded inside TZ_VMM.
> 
> This is what I'm testing:
> 
>    1. Inside my user program in Linux I use gettimeofday before and after
>    the execution of the function in order to get the amount of milliseconds in
>    between. This is my NW test.
>    2. Inside my user program in Linux I use gettimeofday to get the start
>    time, then I execute the syscall which in turn does a world switch. Then
>    the function is executed inside the SW and it returns to the user program
>    inside Linux. After this I call another gettimeofday in order to get the
>    amount of milliseconds of execution.
> 
> The problem is that test 1 is giving me about 90 ms of real time execution,
> but test 2 gives me about 40 ms.

Well, I do not know how big your buffer is or how compute-intensive the
operation is, but in general it is not implausible that a
compute-intensive task completes faster in the secure world than in the
normal world, given our experimental TrustZone VMM/hypervisor. The reason
is that the secure world immediately receives any secure IRQ, even while
the normal world is processing the buffer, and each such IRQ may cause an
expensive world switch. In contrast, while the secure world is executing,
it is not "disturbed" by normal-world IRQs, which means no additional
world switches.
Nevertheless, this alone probably does not explain the large gap of 50 ms.

> 
> I suspect it might be a problem with Linux virtualization in the TZ_VMM
> example, which may be causing a drift in Linux's clock once it loses
> control to the SW. What I mean is, when there isn't a syscall triggering
> the SMC, Linux can count time just fine, but once the control is lost to
> the secure world the clock inside Linux becomes inconsistent and doesn't
> count time while the secure world is executing. Is this right?

That is totally right. As I've described above, Linux won't get any IRQs
as long as the secure world is executing.

> 
> Since I really need to benchmark a scenario similar to this I think that
> the best alternative is to offload the time functionality to Genode (SW). I
> create another syscall which is responsible for starting a timer inside
> Genode, then I call the SMC syscall which processes the buffer in the SW,
> then I call the time syscall again and check the difference. When I want to
> benchmark the NW function I follow the same steps as before. Will this work
> as intended?
> 

It sounds quite expensive, but should work in general.

> I'm thinking that this alternative may suffer from the same problem as
> before if Genode's time clock becomes inconsistent whenever Linux is being
> executed in NW.

No, Genode's timer service will work consistently, because its secure IRQ
is prioritized higher than Linux's normal-world IRQs.

> 
> Do you know any other way to benchmark a world switch + processing + world
> switch scenario? Is there any timer I can execute inside TZ_VMM?
> 

Well, in theory, if you need a specific IRQ latency in the normal world,
you need to guarantee that the normal world is executed regularly.
Therefore, you would need to turn your synchronous secure-world call
into an asynchronous one. Currently, the normal world is not executed
until the call returns. In the asynchronous case, the "SMC" call would
return immediately, and to deliver the response the VMM would instead
have to inject an IRQ into the normal world.
Moreover, the normal world's execution context must not be prioritized
lower than the secure-world component that does the buffer processing.
However, this way you would turn the whole scenario into a fundamentally
different execution model with a lot of implications regarding security
and liveness. For example, the VMM cannot count on the shared memory's
consistency because the normal world is executed in parallel, and a
higher priority of the VM can lead to starvation of secure components.

To sum it up, if it's "just" for the measurements, I would not change the
fundamental setup if I were in your position.

Regards
Stefan

> Thanks in advance, Tiago

-- 
Stefan Kalkowski
Genode Labs

http://www.genode-labs.com/ · http://genode.org/