Hi Daniel,
> Our results so far demonstrate that, in every case, a component running on core 1 utilizes at least the ROM and LOG services provided by init, which is running on core 0. Using a second-level init component (as suggested by the Genode book) did not change the resulting behavior either.
This observation is correct. All session interfaces provided by core are served by a single thread running on the boot CPU, and the init instance spawned by core is started on the same CPU. There are two reasons for this mode of operation.
First, however complex your Genode system is, the ultimate allocation and arbitration of physical resources must happen - and be synchronized across CPUs - at a central point. Whether the synchronization happens via multiple threads (each running on a different CPU) contending for a lock or by serializing RPC calls to a single thread, cross-CPU synchronization is inevitable. We picked the latter path because it is simpler.
Second, core has no prior knowledge about the demands of the user land and does not even try to be clever (e.g., it does not automatically balance threads among CPUs). The first point where such policy enters the picture is the configuration of init. Hence, a secondary init can be moved to a different CPU by means of the configuration of the primary init (see the sketch below), but the primary init instance remains bound to the boot CPU.
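For illustration, a minimal sketch of such a configuration, assuming an affinity space of two CPUs (the component name "init2" and the RAM quantum are placeholders):

  <config>
    <affinity-space width="2" height="1"/>
    ...
    <start name="init2">
      <binary name="init"/>
      <resource name="RAM" quantum="16M"/>
      <!-- pin the secondary init, and thereby all of its children,
           to the second CPU -->
      <affinity xpos="1" width="1"/>
      <config> ... </config>
    </start>
  </config>

The <affinity-space> node declares how the primary init partitions the available CPUs, and the <affinity> node within the <start> entry selects the portion of that space assigned to the child.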
> Of course, Genode is not meant to be a separation kernel by design, but would it still be possible to map/assign resources or partitions to particular cores? Or do all components running on different cores still have to contact Genode's core component, which is running on core 0?
Should multi-core scalability become a concern (today's workloads clearly don't suffer from the current design), I see three principal ways to improve.
1. Combining the notions of a low-count SMP Genode system and a "distributed" Genode system (in multikernel fashion). In this picture, there would be multiple loosely coupled Genode systems running on the same machine, sharing only a few coarse-grained resources. Each of these Genode systems would control a group of CPUs that are close to each other according to the physical CPU topology, so the current way of inter-CPU synchronization would work well within each of these systems.
2. Core could create an entrypoint for each physical CPU. For certain core services (PD, IRQ), it could take the client-provided session-affinity information into account to create the respective session object on the same CPU as the client. RPC calls could then be served independently, although, as noted above, the entrypoints would still need to synchronize with each other. For IRQ sessions, this could generally be beneficial since each IRQ session is fairly free-standing. On the other hand, kernels like base-hw and NOVA don't actually need any interplay between the user-level device driver and core for servicing interrupts anyway.
3. Should we observe high contention on a specific core service, let's say the RAM service, one could implement "cached" versions of this service as separate components. Such a component could be a RAM server that initially allocates dataspaces from core and hands them out to its clients. The physical RAM could be partitioned among several such cache services, each running on a different CPU (a possible composition is sketched below). This could also work well for the ROM service.
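To make the idea a bit more tangible, here is a hypothetical init configuration with one cache instance per CPU (the "ram_cache" component does not exist; all names and quanta are made up):

  <config>
    <affinity-space width="2" height="1"/>
    ...
    <start name="ram_cache_1">
      <binary name="ram_cache"/>
      <resource name="RAM" quantum="256M"/>
      <!-- run this cache instance on the second CPU -->
      <affinity xpos="1" width="1"/>
      <provides> <service name="RAM"/> </provides>
    </start>
    ...
    <start name="client">
      <resource name="RAM" quantum="8M"/>
      <affinity xpos="1" width="1"/>
      <route>
        <!-- direct RAM-session requests to the cache on the client's CPU -->
        <service name="RAM"> <child name="ram_cache_1"/> </service>
        <any-service> <parent/> </any-service>
      </route>
    </start>
  </config>

Each cache instance would obtain its backing store from core once and then serve allocations locally, thereby taking pressure off core's single entrypoint.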
That said, there are currently no concrete plans to pursue any of those ideas.
Cheers
Norman