base-hw: Virtualbox thread priorities

Thu Jul 2 16:48:22 CEST 2015

Hi Adrian,

On 02.07.2015 12:02, Adrian-Ken Rueegsegger wrote:
> Hi,
> 
> While investigating the cause for the performance discrepancy of
> Virtualbox on NOVA vs hw_x86_64[_muen] [1], I have determined, with help
> from Alexander, that Virtualbox thread priorities [2] are not applied on
> base-hw. Even though priorities are specified when constructing the
> Cpu_connection [3], they have no effect since no quota is subsequently
> assigned/transferred. According to the 14.11 release notes [4], threads
> without quota have no guarantee to be scheduled and their priority is
> ignored. I believe this is the main cause for the difference in
> execution speed of Virtualbox on NOVA and base-hw.

First of all, be aware that in the 15.05 release the integration of the
base-hw scheduling model has changed a bit [1]. Let's say you configure
a component with something like '<resource name="CPU" quantum="30"/>'.
Before 15.05, these 30% of the parent quota were merely transferred to
the env()->cpu_session() of the new component, without any effect until
you explicitly stated "thread X shall receive Y percent of this quota"
inside the component.

Since 15.05, all quota of a CPU session is automatically and permanently
distributed to the threads of the session. So as long as there is only
the main thread in our example component, this thread receives 30% of
the parent quota. After starting two additional threads, every thread
would receive 10% and so on.

To influence this distribution, there are the so-called thread weights.
By default, every thread gets the same weight 10. But if you create a
thread with a higher weight, it receives relatively more quota from its
session. So, in the above example, if there is the main thread (default
weight 10) and an additional thread with weight 90, the main thread
receives 3% of the parent quota while the other thread receives 27%.
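
To make the arithmetic concrete, here is a small stand-alone C++ sketch
of the weight-based split. It is only a model of the rule described
above, not Genode code; the helper name thread_shares is made up for
this mail.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Illustrative model only (not part of Genode): split the quota of one
// CPU session among its threads in proportion to their weights.
// 'session_percent' is the session's share of the parent quota.
std::vector<double> thread_shares(double session_percent,
                                  std::vector<unsigned> const &weights)
{
    unsigned long const total =
        std::accumulate(weights.begin(), weights.end(), 0UL);
    assert(total > 0); // at least one thread with a non-zero weight

    std::vector<double> shares;
    for (unsigned w : weights)
        shares.push_back(session_percent * w / total);
    return shares;
}
```

With thread_shares(30.0, {10, 90}) this gives 3 and 27, and with three
threads of equal weight it gives 10 each, matching the numbers above.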

However, you're right in general. CPU quota is our way to restrict the
time during which a priority is allowed to take effect within a
scheduling super-period. So, if a prioritized thread has quota 0, the
scheduler never applies its priority.

> I was wondering if you could confirm my analysis of the issue and
> provide some pointers on how to best achieve the proper application of
> Virtualbox thread priorities on base-hw.

AFAIK, in the virtualbox component, we create a CPU session for each
priority level. To get these priorities active, the following would have
to be done:

* Configure the vbox component to receive some quota from init (add
<resource name="CPU" quantum="${PERCENTAGE}"/> as a subnode to the vbox
start node)

* Set the reference account of every new CPU connection in vbox to the
env session by doing
my_cpu_connection.ref_account(Genode::env()->cpu_session()). This
enables you to transfer quota from the reference account to your CPU
connection.

* Let env()->cpu_session() transfer its quota to the new CPU
connections. This is a bit tricky. When transferring, you state the
percentage of the source-session quota scaled up to
Cpu_session::QUOTA_LIMIT. So, if you state (Cpu_session::QUOTA_LIMIT /
2), this means 50% of the source-session quota while
(Cpu_session::QUOTA_LIMIT / 3) means 33% and so on. Furthermore, the
value refers to the quota that is left at the source session, not to its
initial quota. Thus, if you want to distribute quota evenly to 3 other
sessions A, B, C, start with transferring 33% to A, then transfer 50% to
B, and then transfer 100% to C. For an example see
Init::Child::Resources::transfer_cpu_quota in os/include/init/child.h.
It takes the percentages that refer to the initial env quota from the
init config and translates them to percentages that refer to the latest
env quota.

* Optional: Specify individual weights for the vbox threads when
constructing them. If you roughly know which thread needs more time for
its task and which one needs less, specifying individual weights may
optimize the overall performance. It goes without saying that this also
depends on how you've distributed quota in the previous steps. Thread
weights influence only the session-local distribution.
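
The "percentage of what is left" rule from the transfer step can be
sketched as follows. This is a stand-alone model, not the code from
os/include/init/child.h: the helper transfer_values is hypothetical, and
the constant is only a stand-in for Cpu_session::QUOTA_LIMIT with an
assumed value, so use the real enum in actual code.

```cpp
#include <cassert>
#include <vector>

// Assumed stand-in for Cpu_session::QUOTA_LIMIT (value chosen for
// illustration only; the real constant is defined by Genode).
constexpr unsigned long QUOTA_LIMIT = 1UL << 15;

// Hypothetical helper: 'parts' are the shares of the *initial* source
// quota that each destination session shall get, out of 'total' parts.
// The result is the value to state for each successive transfer, that
// is, the share of the quota still *left* at the source, scaled up to
// QUOTA_LIMIT.
std::vector<unsigned long>
transfer_values(std::vector<unsigned long> const &parts,
                unsigned long total)
{
    std::vector<unsigned long> values;
    unsigned long remaining = total;
    for (unsigned long part : parts) {
        assert(part <= remaining); // cannot hand out more than is left
        values.push_back(remaining ? QUOTA_LIMIT * part / remaining : 0);
        remaining -= part;
    }
    return values;
}
```

For three equal parts, transfer_values({1, 1, 1}, 3) yields
QUOTA_LIMIT / 3, QUOTA_LIMIT / 2 and QUOTA_LIMIT, i.e., the 33%, 50%,
100% sequence described above.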

Now, let's say you have 4 threads. Thread A needs low latencies but has
little to do when it gets scheduled. Thread B also needs low latencies
but normally has a bigger work load than thread A. Thread C does not
need low latencies but should have the guarantee to be scheduled
frequently nevertheless. In addition, thread C normally has the biggest
workload. Thread D needs no guarantee about whether it gets scheduled
frequently and with which latency. So, an exemplary configuration would
be the following:

CPU connection 1: 40% of env quota, higher prio
   thread A: weight 1 (10% of env quota)
   thread B: weight 3 (30% of env quota)

CPU connection 2: 60% of env quota, lower prio
   thread C: default weight (60% of env quota)

CPU connection 3: 0% of env quota, prio doesn't matter
   thread D: default weight (0% of env quota)

Thread D could also be created at the env session if you distribute all
env quota. Furthermore, if you want to do this distribution on a more
reasonable basis, you can use Cpu_session::quota(). It tells you how
long a super period is in microseconds and how much of that time your
session owns.
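
As a rough sketch of how such absolute numbers could be used: the
struct below is a hypothetical mirror of the two values just mentioned
(super-period length and the session's own time, both in microseconds).
Its name and field names are assumptions made for this mail, not the
actual Genode type, and the helper functions are made up as well.

```cpp
#include <cassert>

// Hypothetical model of the information Cpu_session::quota() provides.
// Type and field names are assumptions, not the real Genode interface.
struct Quota_model
{
    unsigned long super_period_us; // length of one super period
    unsigned long us;              // time per super period owned by the session
};

// Time per super period that a thread with 'weight' can expect when
// the weights of all threads of the session sum up to 'total_weight'.
unsigned long thread_budget_us(Quota_model q, unsigned weight,
                               unsigned total_weight)
{
    assert(total_weight > 0 && weight <= total_weight);
    return q.us * weight / total_weight;
}

// Share of the super period owned by the session, in percent.
double session_percent(Quota_model q)
{
    assert(q.super_period_us > 0);
    return 100.0 * q.us / q.super_period_us;
}
```

Assuming, purely for illustration, a super period of 1000000 us of
which CPU connection 1 owns 400000 us, thread A (weight 1 of 4) would
get 100000 us and thread B (weight 3 of 4) 300000 us per super period.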

Be aware that there might be a problem with CPU quotas on hw_x86_64
currently. The test os/cpu_quota.run doesn't succeed on hw_x86_64 (it
states that it is successful but that is a bug because err in
check_counter in cpu_quota.run is 1 and not 0.01). There might be a
problem with the timer or the quota calculation (64 bit). I'll have a
look at this ASAP. However, the test finishes and shows a correct
CPU-utilization stat over all counter threads. The source code of the
test might also serve as a showcase for how CPU quotas (should) work.

If you have further questions, don't hesitate to ask ;)

Cheers,
Martin

[1]
http://genode.org/documentation/release-notes/15.05#Dynamic_thread_weights