I am working with affinities in order to start and bind various components on different cores.
I am using QEMU as the emulation platform to start the system with four cores.
My situation looks as follows:
I have one parent component which creates two children. Each of these, including the init component, should be assigned to a different core.
The idea was to start init with an affinity configuration 4x1 in the run script:
=> <affinity-space width="4" height="1"/>
The custom parent component should be configured to get an affinity subspace of 3x1, beginning at the second core:
=> <affinity xpos="1" ypos="0" width="3" height="1"/>
The next idea was to create two affinities with a subspace of 1x1 for the two children and start them within the parent component.
The two children should run on the third and fourth core respectively, so my intention was to declare them at xpos 1 and 2 relative to the parent's subspace of 3x1:
=> const Genode::Affinity &aff_count{Genode::Affinity::Space(1), Genode::Affinity::Location(1, 0)};
=> const Genode::Affinity &aff_val{Genode::Affinity::Space(1), Genode::Affinity::Location(2, 0)};
To check this attempt, I inserted a log call in the constructor of core's Cpu_session_component:
=> log("CORE ",__func__,": Affinity location: xpos: ",_location.xpos()," ypos: ",_location.ypos()," width: ",_location.width()," height: ",_location.height());
The log output seems largely consistent with what was originally planned:
=> init cpu_session: "CORE Cpu_session_component: Affinity location: xpos: 0 ypos: 0 width: 4 height: 1"
=> parent cpu_session: "CORE Cpu_session_component: Affinity location: xpos: 1 ypos: 0 width: 3 height: 1"
=> child_1 cpu_session: "CORE Cpu_session_component: Affinity location: xpos: 2 ypos: 0 width: 3 height: 1"
=> child_2 cpu_session: "CORE Cpu_session_component: Affinity location: xpos: 3 ypos: 0 width: 3 height: 1"
Unfortunately, right after the second child is started, the program hangs in a loop and does not terminate.
If I change the affinities for the children to
=> const Genode::Affinity &aff_count{Genode::Affinity::Space(1), Genode::Affinity::Location(0, 0)};
=> const Genode::Affinity &aff_val{Genode::Affinity::Space(1), Genode::Affinity::Location(1, 0)};
the log outputs also change to
=> child_1 cpu_session: "CORE Cpu_session_component: Affinity location: xpos: 1 ypos: 0 width: 3 height: 1"
=> child_2 cpu_session: "CORE Cpu_session_component: Affinity location: xpos: 2 ypos: 0 width: 3 height: 1"
and the program continues its work as intended.
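The values logged in both runs are consistent with a simple model in which the location requested for a session is treated as an offset into the requesting component's own subspace, while the size of that subspace is carried over unchanged. The following is a minimal sketch of that model in plain C++. It is not Genode code: the struct and the functions `subordinate` and `same` are hypothetical names, and the rule itself is an assumption inferred only from the log lines above.

```cpp
// Simplified model, not Genode code: all names here are hypothetical.
struct Location { int xpos, ypos, width, height; };

// Assumption: a requested session location is interpreted as an offset
// into the requesting component's subspace; the width and height of that
// subspace are carried over unchanged.
constexpr Location subordinate(Location subspace, int session_x, int session_y)
{
    return Location { subspace.xpos + session_x,
                      subspace.ypos + session_y,
                      subspace.width,
                      subspace.height };
}

// Field-wise comparison of two locations.
constexpr bool same(Location a, Location b)
{
    return a.xpos == b.xpos && a.ypos == b.ypos
        && a.width == b.width && a.height == b.height;
}

// Subspace assigned to the parent component in the init configuration.
constexpr Location parent { 1, 0, 3, 1 };

// First attempt: children requested at Location(1, 0) and Location(2, 0).
static_assert(same(subordinate(parent, 1, 0), Location { 2, 0, 3, 1 }), "child_1");
static_assert(same(subordinate(parent, 2, 0), Location { 3, 0, 3, 1 }), "child_2");

// Second attempt: children requested at Location(0, 0) and Location(1, 0).
static_assert(same(subordinate(parent, 0, 0), Location { 1, 0, 3, 1 }), "child_1");
static_assert(same(subordinate(parent, 1, 0), Location { 2, 0, 3, 1 }), "child_2");
```

Under this model the reported width stays 3 for every child, which suggests that the `Space(1)` argument does not by itself narrow a child to a single core. In the first attempt, the second child's location starts at xpos 3 with width 3 and therefore extends past the 4x1 affinity space.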
If you could point out any errors in my reasoning, or tell me how to get each component assigned to its respective core, I would appreciate it a lot.
--
Thank you,
Stephan Lex
Hi Stephan,
On 19.07.2017 14:47, Lex, Stephan wrote:
My situation looks as follows: ...
I'm struggling to follow the description of your scenario. Would you mind making it (or better, a minimal version thereof) available as a branch on GitHub with an accompanying run script, so that it becomes reproducible?
Cheers Norman
Hello Stephan,
On 03.08.2017 12:27, Stephan Lex wrote:
I created a testcase in our repository for you, that shows the functionality we would like to achieve. ...
thanks for the test case. I just gave your branch a try. But please allow me the remark that I feel quite uneasy about debugging problems on a fork of an old Genode version. Aside from the inconvenience of switching to an old tool chain, I was left uncertain about the following assumptions:
* Do I need to use the argos-research/genode repository with the branch called 'checkpointRestore'?
* Do I need to create a build directory for your customized version of the base-foc platform, called 'focnados_pbxa9'?
To lower the bar for people to assist you, it would be of great help if you rebased the test scenario to the most current version of Genode. That way, it becomes much easier to reproduce, and we don't run the risk of debugging problems that are already solved in a later version of Genode (not presuming that this is the case for the issue at hand).
What we were trying to achieve is running init as well as the three components on different cores in the system. Unfortunately the documentation for the affinity subspacing was a bit confusing for us, so it would be great, if you can tell us whether this is the right approach or what logic we are still missing.
I have not completely investigated your scenario but made the following interesting observations while tinkering with it:
* Even though you instruct Qemu to emulate a 4-core machine, the Platform::affinity_space method reports a physical affinity space of 2x1. Maybe the CPU detection is at fault? Or the kernel indeed only sees 2 CPUs? Anyhow, the logical affinity space of 4x1 as defined in your init configuration is ultimately mapped to a physical affinity space of 2x1, which is certainly not your intention.
* The member variable '_location' is designated for the location of a thread within the physical affinity space, but it is not used anywhere. Instead, the value should somehow end up being passed to a kernel operation during thread creation. Hence, the affinity arguments remain without effect, regardless of their values.
* Unfortunately, the Fiasco.OC kernel debugger is quite incomplete when using the pbxa9 platform. I suggest first debugging the problem on a more common platform such as x86_32, where the kernel debugger is able to show the CPU number of each thread (via the 'lp' command). A quick test, however, shows that the 'focnados' base platform is apparently tied to ARM (I get compile errors with ARM register names on x86_32). But maybe the base platform can be easily revived for x86_32?
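To illustrate the first observation: if a logical affinity space is mapped proportionally onto a smaller physical one, several logical columns end up on the same CPU. The following is a minimal sketch under that assumption, in plain C++. It is not Genode code; the function name `scale_column` and the proportional rule are hypothetical and serve only to show the effect of a 4x1 to 2x1 mapping.

```cpp
// Simplified model, not Genode code: the function name and the
// proportional mapping rule are assumptions made for illustration.
constexpr int scale_column(int logical_x, int logical_width, int physical_width)
{
    // Multiply before dividing so that integer division does not
    // discard precision earlier than necessary.
    return logical_x * physical_width / logical_width;
}

// A logical 4x1 space mapped onto a physical 2x1 space:
// two logical columns share each physical CPU.
static_assert(scale_column(0, 4, 2) == 0, "column 0");
static_assert(scale_column(1, 4, 2) == 0, "column 1");
static_assert(scale_column(2, 4, 2) == 1, "column 2");
static_assert(scale_column(3, 4, 2) == 1, "column 3");

// With four physical CPUs, each logical column keeps its own CPU.
static_assert(scale_column(3, 4, 4) == 3, "column 3 on 4 CPUs");
```

With only two physical CPUs reported, four components configured for four distinct logical columns could therefore not each get a core of their own under such a mapping.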
I hope this little input brings you a bit further.
Cheers Norman
Hello Norman,
- Do I need to use the argos-research/genode repository with the branch called 'checkpointRestore'?
- Do I need to create a build directory for your customized version of the base-foc platform, called 'focnados_pbxa9'?
My test case is intended to work with Genode 16.08, Fiasco.OC, and the pbx-A9 platform. You should be able to test it with vanilla Genode 16.08 in a foc_pbxa9 build directory with the original Genode 16.08 ARM toolchain. Focnados and the rest of our Rtcr component are not used in this test case. Unfortunately, we could not rebase the test case to 17.05 because it is part of my final thesis, which is based on 16.08; understanding the changes in 17.05 and adapting the test case to them would exceed the time I have available.
- Even though you instruct Qemu to emulate a 4-core machine, the Platform::affinity_space method reports a physical affinity space of 2x1. Maybe the CPU detection is at fault? Or the kernel indeed only sees 2 CPUs? Anyhow, the logical affinity space of 4x1 as defined in your init configuration is ultimately mapped to a physical affinity space of 2x1, which is certainly not your intention.
In my case, an inserted log reports the desired 4 cores from Platform::affinity_space. The log call is located in <GENODE-DIR>/repos/base-foc/src/core/platform.cc inside Platform::affinity_space and outputs the variable 'cpus_online'.
- The member variable '_location' is designated for the location of a thread within the physical affinity space but it is not used anywhere. Instead, the value should somehow end up being specified to a kernel operation during the thread creation. Hence, regardless of the affinity arguments, they remain without effect.
This is a bit confusing for me. As I understand it, the following should happen in my case: at child creation, I pass a capability for a CPU session as well as a Child::Initial_thread object that is seeded with the same CPU session. I create the CPU session by opening a connection to it and passing the desired affinity along with it. Shouldn't this be enough to let the child use this affinity, or did I miss something there?
I also attached a small image that shows how I want the components to be bound to the cores. First, is it possible to bind different components to different fixed cores as shown in the image? Second, is it possible to dynamically spawn a child on another core from within the parent's code (affinity_test_parent in my example)?
If my approach of distributing the components to different cores is not correct, I would be grateful if you could tell me how such a partitioning is meant to be set up in the Genode framework.
Thank you and kind regards, Stephan Lex
Hello Stephan,
Unfortunately, we could not rebase the test case to 17.05 because it is part of my final thesis, which is based on 16.08; understanding the changes in 17.05 and adapting the test case to them would exceed the time I have available.
please accept my apology that I cannot follow up on your problem. Given the many current developments, I am unable to justify spending my time debugging problems on a one-year-old Genode version, using an outdated kernel and tool chain. I am sorry that this does not help you, but I did not want to leave your last message unanswered.
Cheers Norman