Genode Page-Fault Handling and IRQ architecture
Daniel Waddington
d.waddington at ...60...
Wed Dec 19 22:42:03 CET 2012
Hi Norman, thanks for your quick reply. Responses inline...
> -----Original Message-----
> From: Norman Feske [mailto:norman.feske at ...1...]
> Sent: Wednesday, December 19, 2012 11:08 AM
> To: genode-main at lists.sourceforge.net
> Subject: Re: Genode Page-Fault Handling and IRQ architecture
>
> Hi Daniel,
>
> thanks for providing more details about the motivation behind your
> questions. From this information, I gather that you actually do not
> require the implementation of on-demand-paging policies outside of core.
> Is this correct?
Yes.
> > 1.) Distribute physical memory management across N cores and have each
> > locally handle page faults. The purpose is to eliminate contention on
> > physical memory AVL trees and avoid cross-core IPC as much as
> > possible. Of course this requires partitioning the physical memory
> > space. I also want to avoid eager mapping.
>
> I can clearly see how to achieve these goals:
Good. That's encouraging!
> * Genode on Fiasco.OC already populates page tables in a lazy way.
> There are no eager mappings. When attaching a dataspace to the
> RM session of a process, the page table of the process remains
> unchanged. The mapping gets inserted not before the process touches
> the virtual memory location (and thereby triggers the page-fault
> mechanism implemented by core).
>
> * To avoid cross-core IPC, there should be one pager thread per core.
> When setting the affinity of a thread, the pager should be set
> accordingly. This way, the handling of page faults would never cross
>   CPU boundaries. That is actually quite straightforward to implement.
> On NOVA, we are even using one pager per thread. The flexibility to
>   do that is built into the framework.
>
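Just to make sure I follow the lazy path, here is a minimal sketch of how I
understand it against the current base API (corrections welcome): attach()
only reserves the region in the RM session, and the page-table entry is
inserted by core's pager on the first access.

  #include <base/env.h>
  #include <dataspace/capability.h>

  using namespace Genode;

  /* back the region with RAM from our RAM session */
  Dataspace_capability ds = env()->ram_session()->alloc(4096);

  /* attach to our RM session - no page-table entry exists yet */
  char *ptr = env()->rm_session()->attach(ds);

  /* first touch faults; core's pager resolves it and inserts the mapping */
  ptr[0] = 1;
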
> Also, I would investigate using multiple entrypoints (i.e., one for
> each CPU) to handle core's services. For doing this, we could attach
> affinity information as session arguments and then direct the session
> request to the right entrypoint at session-creation time.
Sounds sensible. I would use a mask to future-proof for other schedulers.
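To illustrate what I have in mind (purely hypothetical - an 'affinity'
session argument does not exist today), the client side could pass a CPU
mask much like 'priority' is passed now:

  #include <base/snprintf.h>

  /* CPUs 0 and 1, for example */
  unsigned long cpu_mask = 0x3;

  /* hypothetical session-argument string carrying a CPU affinity mask,
     analogous to the existing 'priority' argument of CPU sessions */
  char args[64];
  Genode::snprintf(args, sizeof(args),
                   "ram_quota=4K, affinity=0x%lx", cpu_mask);

  /* the parent would then route the session request to the entrypoint
     running on (one of) the CPUs in the mask */
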
> * To partition the physical memory, RAM sessions would need to carry
>   affinity information with them - similar to how CPU sessions already
>   have priority information associated with them. Each RAM
> session would then use a different pool of physical memory.
Could I not use nested RM sessions/dataspaces to help with this?
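For example (a rough sketch, if I read the RM interface right): a sub RM
session acts as a managed dataspace, so each node could populate its own
window from a node-local RAM pool and attach the window as one piece.

  #include <base/env.h>
  #include <rm_session/connection.h>

  using namespace Genode;

  enum { WINDOW_SIZE = 16*1024*1024 };

  /* sub RM session that represents a 16 MiB managed dataspace */
  Rm_connection window(0, WINDOW_SIZE);

  /* fill the window with memory, ideally taken from a node-local RAM pool */
  window.attach(env()->ram_session()->alloc(1024*1024));

  /* attach the whole managed dataspace to our address space */
  void *base = env()->rm_session()->attach(window.dataspace());
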
> > 2.) For the purpose of scaling device drivers, I would like to avoid
> > serialization on the IRQ handling. I would like the kernel to deliver
> > the IRQ request message directly to a registered handler or via a
> > 'core' thread on the same core. We then use ACPI to nicely route IRQs
> > across cores and parallelize IRQ handling load - this is useful for
> multi-queue NICs etc.
>
> Maybe the sequence diagram was a bit misleading. It is displaying the
> sequence for just one IRQ number. For each IRQ, there is a completely
> different flow of control. I.e., there is one thread per IRQ session in
> core. The processing of IRQs is not serialized at all. They are processed
> concurrently.
>
> There are opportunities for optimizations though. For example, we could
> consider delegating the capability selectors for the used kernel IRQ
> objects to the respective IRQ-session clients. This way, we could take
> core completely out of the loop for the IRQ handling on Fiasco.OC. We
> haven't implemented this optimization yet for two mundane reasons.
> First, we wanted to avoid a special case for Fiasco.OC unless we are sure
> that the optimization is actually beneficial. And second, we handle shared
> IRQs in core. We haven't yet taken the time for investigating how to
> handle shared IRQs with the sole use of kernel IRQ objects.
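For reference, this is how I currently consume an IRQ from the client side
(a sketch, assuming the Irq_session interface as I understand it); one
session - and thus one thread in core - per interrupt line:

  #include <irq_session/connection.h>

  using namespace Genode;

  /* one IRQ session per interrupt line, e.g. GSI 11 */
  Irq_connection irq(11);

  for (;;) {
      irq.wait_for_irq();   /* blocks until the interrupt fires */
      /* ... service the device, then loop to re-arm the IRQ ... */
  }
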
>
> > I have been exploring the idea of setting up another child of core
> > that has special kernel capabilities (yes I know TCB expansion) so it
> > can set up threads and IRQs in this way. What do you think of this
> > idea?
>
> To me this looks like you are creating a new OS personality (or runtime)
> on top of Genode - similar to how L4Linux works. So you are effectively
> bypassing Genode (and naturally working around potential scalability
> issues). Personally, I would prefer to improve the underlying framework
> (in particular the implementation of core) to accommodate your
> requirements in the first place. This way, all Genode components would
> benefit, not only those that are children of your runtime.
Yes, I guess so - that is, provided it makes sense for the general Genode
distribution. Also, this might be a stop-gap until we/you get some of the
changes into Genode.
Daniel