Hi Adrian,
On 02.07.2015 12:02, Adrian-Ken Rueegsegger wrote:
> Hi,
> While investigating the cause for the performance discrepancy of Virtualbox on NOVA vs hw_x86_64[_muen] [1], I have determined, with help from Alexander, that Virtualbox thread priorities [2] are not applied on base-hw. Even though priorities are specified when constructing the Cpu_connection [3], they have no effect since no quota is subsequently assigned/transferred. According to the 14.11 release notes [4], threads without quota have no guarantee to be scheduled and their priority is ignored. I believe this is the main cause for the difference in execution speed of Virtualbox on NOVA and base-hw.
First of all, be aware that in the 15.05 release the integration of the base-hw scheduling model has changed a bit [1]. Let's say you configure a component with something like '<resource name="CPU" quantum="30"/>'. Before 15.05, these 30% of the parent quota were merely transferred to the env()->cpu_session() of the new component, without any effect until you explicitly stated "thread X shall receive Y percent of this quota" inside the component.
Since 15.05, all quota of a CPU session is automatically and permanently distributed to the threads of the session. So as long as there is only the main thread in our example component, this thread receives 30% of the parent quota. After starting two additional threads, every thread would receive 10% and so on.
To influence this distribution, there are the so-called thread weights. By default, every thread gets the same weight of 10. But if you create a thread with a higher weight, it receives relatively more quota from its session. So, in the above example, if there is the main thread (default weight 10) and an additional thread with weight 90, the main thread receives 3% of the parent quota while the other thread receives 27%.
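To make the arithmetic concrete, here is a minimal sketch of the proportional split. It is plain C++, not Genode code: the helper name distribute_by_weight is made up for illustration, and the real distribution happens inside core, not in the component.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper, NOT part of the Genode API: split a session's quota
// (given in percent of the parent quota) among the session's threads in
// proportion to their weights.
std::vector<double> distribute_by_weight(double session_quota,
                                         std::vector<unsigned> const &weights)
{
    unsigned const sum = std::accumulate(weights.begin(), weights.end(), 0u);

    std::vector<double> shares;
    for (unsigned w : weights)
        // a session without threads (sum == 0) hands out nothing
        shares.push_back(sum ? session_quota * w / sum : 0.0);
    return shares;
}
```

Calling distribute_by_weight(30, {10, 90}) reproduces the 3% and 27% from the example, and three threads with the default weight each end up with 10%.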
However, you're right in general. CPU quota is our way to restrict the time during which a priority is allowed to take effect within a scheduling super-period. So, if a prioritized thread has quota 0, the priority is never applied by the scheduler.
> I was wondering if you could confirm my analysis of the issue and provide some pointers on how to best achieve the proper application of Virtualbox thread priorities on base-hw.
AFAIK, in the virtualbox component, we create a CPU session for each priority level. To get these priorities active, the following would have to be done:
* Configure the vbox component to receive some quota from init (add <resource name="CPU" quantum="${PERCENTAGE}"/> as subnode to the vbox start node)
* Set the reference account of every new CPU connection in vbox to the env session by doing my_cpu_connection.ref_account(Genode::env()->cpu_session()). This enables you to transfer quota from the reference account to your CPU connection.
* Let env()->cpu_session() transfer its quota to the new CPU connections. This is a bit tricky. When transferring, you state the percentage of the source-session quota scaled up to Cpu_session::QUOTA_LIMIT. So, if you state (Cpu_session::QUOTA_LIMIT / 2), this means 50% of the source-session quota, while (Cpu_session::QUOTA_LIMIT / 3) means 33%, and so on. Furthermore, the value refers to the quota that is left at the source session, not to its initial quota. Thus, if you want to distribute quota evenly to 3 other sessions A, B, C, start with transferring 33% to A, then transfer 50% to B, and then transfer 100% to C. For an example, see Init::Child::Resources::transfer_cpu_quota in os/include/init/child.h. It takes the percentages that refer to the initial env quota from the init config and translates them to percentages that refer to the latest env quota.
* Optional: Specify individual weights for the vbox threads when constructing them. If you roughly know which thread needs more time for its task and which one needs less, specifying individual weights may optimize the overall performance. It goes without saying that this also depends on how you've distributed quota in the previous steps. Thread weights influence only the session-local distribution.
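The translation from "percent of the initial env quota" to "share of the quota that is left" in the third step can be sketched as follows. This is a standalone illustration, not the code from child.h: the helper name transfer_values is made up, and the local QUOTA_LIMIT constant is only a stand-in whose value is an assumption; in a real component you would use Cpu_session::QUOTA_LIMIT.

```cpp
#include <vector>

// Stand-in for Cpu_session::QUOTA_LIMIT. The concrete value is an
// assumption made for this illustration only.
constexpr unsigned long QUOTA_LIMIT = 1UL << 15;

// Hypothetical helper, NOT part of the Genode API: for a sequence of
// transfers, each given in percent of the INITIAL source quota, return the
// value to state for each transfer, i.e. the share of the quota that is
// STILL LEFT at the source, scaled up to QUOTA_LIMIT.
std::vector<unsigned long>
transfer_values(std::vector<unsigned> const &initial_percent)
{
    std::vector<unsigned long> values;
    unsigned long remaining = 100; // percent of the initial quota still at the source

    for (unsigned long p : initial_percent) {
        if (p > remaining)
            p = remaining; // cannot hand out more than is left
        values.push_back(remaining ? (p * QUOTA_LIMIT) / remaining : 0);
        remaining -= p;
    }
    return values;
}
```

For four sessions that should each get 25% of the initial quota, this yields QUOTA_LIMIT / 4, QUOTA_LIMIT / 3, QUOTA_LIMIT / 2, and QUOTA_LIMIT, which is the same pattern as the 33%, 50%, 100% sequence above. The integer division rounds down, so a session may end up with marginally less than requested.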
Now, let's say you have 4 threads. Thread A needs low latencies but has little to do when it gets scheduled. Thread B also needs low latencies but normally has a bigger work load than thread A. Thread C does not need low latencies but should have the guarantee to be scheduled frequently nevertheless. In addition, thread C normally has the biggest workload. Thread D needs no guarantee about whether it gets scheduled frequently and with which latency. So, an exemplary configuration would be the following:
CPU connection 1: 40% of env quota, higher prio
  thread A: weight 1 (10% of env quota)
  thread B: weight 3 (30% of env quota)

CPU connection 2: 60% of env quota, lower prio
  thread C: default weight (60% of env quota)

CPU connection 3: 0% of env quota, prio doesn't matter
  thread D: default weight (0% of env quota)
Thread D could also be created at the env session if you distribute all env quota. Furthermore, if you want to do this distribution on a more reasonable basis, you can use Cpu_session::quota(). It tells you how long a super-period is in microseconds and how much of that time your session owns.
Be aware that there might currently be a problem with CPU quotas on hw_x86_64. The test os/cpu_quota.run doesn't succeed on hw_x86_64 (it states that it is successful, but that is a bug because err in check_counter in cpu_quota.run is 1 and not 0.01). There might be a problem with the timer or the quota calculation (64 bit). I'll have a look at this ASAP. However, the test finishes and shows a correct CPU-utilization stat over all counter threads. The source code might also serve as a showcase for how CPU quotas (should) work.
If you have further questions, don't hesitate to ask ;)
Cheers, Martin
[1] http://genode.org/documentation/release-notes/15.05#Dynamic_thread_weights