base-hw: Virtualbox thread priorities

Martin Stein martin.stein at ...1...
Tue Jul 7 15:26:06 CEST 2015


Hi Adrian,

On 06.07.2015 15:07, Adrian-Ken Rueegsegger wrote:
> I have implemented the above mentioned steps, see [1], and the boot time
> of a custom Linux (buildroot) from the bootloader menu to the prompt has
> been halved from ~1 min 03 seconds down to ~33 seconds. 

Great to hear that the configuration already had a positive effect :)

> However it seems
> that tweaking/playing around with the quota values [2] has no further
> effect on the execution time.

There is a bug [1] in the current master branch that prevents the
initial quota configuration on the construction of a thread. Thus,
threads only receive quota when another thread of the same session gets
created or destroyed. Could you please apply the commit in the
referenced issue and try again?
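
Just to make sure that we mean the same thing by "quota values": per
component, CPU quota is assigned in the init configuration of the
scenario through a CPU resource declaration in the corresponding start
node, with the quantum denoting a percentage of the CPU quota. A
hypothetical snippet (the component name and the numbers are mere
placeholders, not a recommendation) would look like this:

  <start name="vbox">
    <resource name="RAM" quantum="1G"/>
    <resource name="CPU" quantum="60"/>
  </start>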

> Changing the base-hw super period to 100ms and the timeslice to 1ms in
> [3] reduces the boot time to ~19 seconds. I am not quite sure about the
> exact reason for the speedup but I presume it is due to the fact that
> the super period is much shorter and thus the quota of each thread is
> refilled more frequently.

I've talked with Alexander about that. Our main ideas regarding this:

* The shorter super period is mainly a benefit for threads with quota.
It makes it less probable that quota threads leave parts of their quota
unused during a super period and, as you mentioned, quota gets refilled
more often. Thus, non-quota threads more often have only the unassigned
quota left for their purposes. The shorter time slice lowers the latency
during non-quota execution but also reduces the throughput in this mode.
In fact, finding the best super period and time slice is an open problem
because we do not have much "real world" data regarding the base-hw
scheduling so far.

* For comparison with NOVA, it might be a good idea to assign 100% of
the CPU quota, because then, priorities are absolute on base-hw as well
(see the configuration sketch after this list). Later, you may switch
the best-effort parts of the scenario back to a non-quota mode and see
whether the performance benefits from it.

* For a better debugging basis, it might be useful to get the
CPU-utilization stats of the scenario. This can be achieved in the
kernel pretty simply and without distorting the results. You can find a
basic implementation and demonstration on my branch [2] (all .* commits).
The output is triggered via Kernel::print_char(0), which can be called
from anywhere in the userland, but you may also print it e.g. on the
console-input IRQ through THREAD_QUOTA_STATS_IRQ in the last commit. The
printed "spent" values are the timer ticks that a thread has used on its
own behalf or spent helping another thread. The printed "used" values
are the timer ticks during which the thread was actually executed (on
its own or through a helper).

* Besides vbox, Genode device drivers (especially the timer) should
also receive a reasonable amount of quota. On x86, the timer is pretty
time-intensive, too: it should be able to update its state at least 19
times during a super period. In my Qemu tests with cpu_quota.run, 10% of
the super period (5 ms per update) was definitely enough for the timer,
but I assume it also works with less quota.
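
To illustrate the 100%-quota idea and the timer quota, here is a sketch
of a distribution that hands out the whole CPU quota. The priorities,
component names, and percentages are only assumptions for the sake of
the example, not a tested configuration:

  <config prio_levels="2">
    <start name="timer" priority="0">
      <resource name="RAM" quantum="1M"/>
      <resource name="CPU" quantum="10"/>
      <provides> <service name="Timer"/> </provides>
    </start>
    <start name="vbox" priority="-1">
      <resource name="RAM" quantum="1G"/>
      <resource name="CPU" quantum="90"/>
    </start>
  </config>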

> I gave the cpu_quota scenario including your changes regarding #1616 [4]
> a try on hw_x86_64 but it seems that it does not complete. Should the
> test pass successfully with your latest changes?

I think the problem is the timer. On ARM, the timer is configured once
with an X-seconds timeout and then sleeps until this timeout is over.
On x86, however, the timer needs the guarantee of being scheduled
frequently also during the timeout, as mentioned above. Thus, the test
simply takes too long, although the measured results are good in
principle. I'm currently working on this and will keep you up to date.

> As this execution time is still a lot slower than e.g. NOVA/Virtualbox,
> which boots the same system in about ~7 seconds, there still seems to be
> lingering issue(s) with regards to the base-hw scheduling. I would be
> glad if you could investigate this problem.

As outlined above, I think that we can still raise the performance
through modifications at the application level. It would also be helpful
to see your CPU-utilization stats if the scheduling remains a problem.

Cheers,
Martin

[1] https://github.com/genodelabs/genode/issues/1620
[2] https://github.com/m-stein/genode/tree/hw_cpu_quota_stats



