Hello Norman,
Hi Denis,
I am working on my thesis, which is to implement a real-time capable checkpoint/restore mechanism in Genode/L4-Fiasco.OC, at the Chair of Operating Systems, TUM Department of Informatics. A checkpoint component A has to store the internal state of another component B (more or less) transparently and restore it on another machine that also runs Genode.
Let me refer to a recent discussion (back in March) that was closely related to this topic:
https://sourceforge.net/p/genode/mailman/genode-main/thread/CAPyx70Cy_v3WBcq...
To motivate your work, I recommend that you state a tangible goal, i.e., a concrete example scenario that you want to realize. My sentiment is that many supposed use cases of a general checkpointing mechanism can be addressed differently and more simply on Genode. Please take into consideration the line of thinking that I presented as the second approach in the discussion above.
A typical use case for my checkpoint/restore mechanism is a vehicle system. The mechanism shall serve as a tool for fault tolerance and load balancing on an ECU that runs several tasks concurrently:
If ECU A enters an error state and has to be rebooted, or if it runs many tasks while other ECUs run only a few, all or some of its tasks shall be migrated to other ECUs.
My mechanism shall checkpoint a task periodically during its execution while imposing as little overhead/downtime on the task as possible. Thus, I want to implement an incremental checkpointing mechanism, which stores in memory only the changes made to the internal state since the last checkpoint.
If an event that triggers the migration of a task occurs (error, high load on the ECU), the stored data is transferred to a suitable ECU and the task is restarted from the point where it was last checkpointed.
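The incremental scheme described above can be sketched in plain C++, independent of Genode. This is only an illustration under stated assumptions: the task's memory is modelled as a vector of 4 KiB pages, and changed pages are found by comparing against the previous snapshot rather than by a dirty-bit or page-fault mechanism. The names `Incremental_checkpointer`, `checkpoint`, and `restore` are hypothetical and not part of any Genode API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed page size (4 KiB), chosen only for illustration.
constexpr std::size_t page_size_bytes = 4096;

using Page   = std::array<std::uint8_t, page_size_bytes>;
using Memory = std::vector<Page>;

// Hypothetical helper: keeps the last checkpointed image of a task's memory
// and copies only the pages that differ from that image.
class Incremental_checkpointer
{
	Memory _snapshot;

public:

	// Takes a checkpoint and returns the number of pages that were copied.
	std::size_t checkpoint(Memory const &task_memory)
	{
		// The first checkpoint is a full copy.
		if (_snapshot.empty()) {
			_snapshot = task_memory;
			return _snapshot.size();
		}

		// The sketch assumes that the memory size does not change.
		assert(_snapshot.size() == task_memory.size());

		std::size_t copied = 0;
		for (std::size_t i = 0; i < task_memory.size(); ++i) {
			if (_snapshot[i] != task_memory[i]) {
				_snapshot[i] = task_memory[i];
				++copied;
			}
		}
		return copied;
	}

	// Returns the state as of the last checkpoint, e.g. for the target ECU.
	Memory const &restore() const { return _snapshot; }
};
```

Comparing every page costs time proportional to the whole memory, which is exactly the overhead a notification-based (dirty-page) approach would avoid. The sketch only shows what ends up in the checkpoint: a write made after the last `checkpoint()` call is not part of the image returned by `restore()`.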
That said, I still find it worthwhile to experiment with the design and implementation of such a mechanism. Don't let me discourage you. ;-)
If I understand the second approach correctly, a Component A, which is to be checkpointed, actively reports its state to a Component C, which checkpoints Component A. After a restart, Component A begins again from the start.
My goal is to provide a transparent checkpoint. Component A shall not know that it is being checkpointed. Also, the component shall be restarted from the last checkpoint, not from the beginning.
By searching the genode-main mailing list, I found an idea for emulating the dirty-bit mechanism. In the message [1], the author suggests the use of managed dataspaces and a custom RAM service. The RAM server detaches a dataspace that a client is using, then re-attaches and marks the dataspace when the client next wants to read it or write to it. Thus, the server of the RAM service notices whenever a client uses a specific dataspace.
[1] https://sourceforge.net/p/genode/mailman/message/34787057/
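The detach/mark/re-attach cycle from the referenced message can be modelled without any Genode API. The following is a simulation under stated assumptions, not the RAM-service implementation: the class `Tracking_ram_server` and its methods are hypothetical, a "fault" is just a method call, and dataspaces are identified by a plain index.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical model of a RAM server that emulates dirty bits by keeping
// dataspaces detached until the client touches them.
class Tracking_ram_server
{
	std::vector<bool>     _attached;   // one flag per dataspace
	std::set<std::size_t> _marked;     // dataspaces used since the last re-arm
	std::size_t           _faults = 0; // number of simulated faults

public:

	// All dataspaces start detached, so the first access is noticed.
	explicit Tracking_ram_server(std::size_t num_dataspaces)
	: _attached(num_dataspaces, false) { }

	// Models a client read or write. Only an access to a detached
	// dataspace reaches the server, which then marks and re-attaches it.
	void access(std::size_t index)
	{
		assert(index < _attached.size());
		if (!_attached[index]) {
			++_faults;
			_marked.insert(index);
			_attached[index] = true;
		}
	}

	// Called at checkpoint time: hands out the marked dataspaces and
	// detaches everything again so that later accesses are noticed.
	std::set<std::size_t> collect_and_rearm()
	{
		std::set<std::size_t> result;
		result.swap(_marked);
		_attached.assign(_attached.size(), false);
		return result;
	}

	std::size_t fault_count() const { return _faults; }
};
```

In this model, only the first access to a dataspace after a re-arm costs a fault; later accesses go through unnoticed until the next checkpoint. Like the scheme in the message, it marks a dataspace on reads as well as writes, so it over-approximates the set of modified dataspaces.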
The problem with this approach is the granularity of the dataspaces. I want to be notified when a virtual memory page is changed and store only that page. But using one RM session per 4 KiB dataspace (the typical page size) is not efficient, because an RM session uses 64 KiB for itself.
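To put a number on that inefficiency, here is a small calculation that uses only the two figures stated above (4 KiB per page, 64 KiB per RM session) and assumes one RM session per tracked page. The function name `tracking_overhead_bytes` is made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t KiB              = 1024;
constexpr std::size_t page_bytes       = 4 * KiB;   // page size from the text
constexpr std::size_t rm_session_bytes = 64 * KiB;  // per-session cost from the text

// Memory consumed by RM sessions alone when every page of a task gets
// its own session (task size rounded up to whole pages).
constexpr std::size_t tracking_overhead_bytes(std::size_t task_bytes)
{
	return ((task_bytes + page_bytes - 1) / page_bytes) * rm_session_bytes;
}

// Each tracked page costs 16 times its own size in session memory.
static_assert(tracking_overhead_bytes(page_bytes) == 16 * page_bytes,
              "one page needs one 64 KiB session");
```

Under these assumptions, tracking a task with 1 MiB of memory (256 pages) would need 16 MiB for the RM sessions alone.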
This has changed in Genode 16.05. Now, one RM session can be used to create many managed dataspaces. Thereby, managed dataspaces have become much cheaper. The former functionality of the 'Rm_session' has moved to 'Region_map'. Each PD is readily equipped with 3 region maps. The new 'Rm_session' interface allows the client to create further region maps. See the section in the release notes [1]. Please also consider following the changes I made in the new revision of the book [2] for version 16.05.
[1] http://genode.org/documentation/release-notes/16.05#Consolidation_of_core_s_...
[2] https://github.com/nfeske/genode-manual/commit/7c165e92697932d2433e00316e5c9...
Thank you for the hint. I planned to implement the mechanism on Genode 15.08 because other work at the chair on this project was implemented on that version. I will discuss this issue with my advisor.
- Do I understand this approach correctly?
- Is there another way to be notified when a virtual memory page is
accessed?
Not via the Genode API. If you tie your work to a specific kernel, there may be additional kernel-specific mechanisms you could use. E.g., virtualization components like L4Linux (Fiasco.OC), Seoul (NOVA), or VirtualBox (NOVA) invoke the kernel interface directly to manage a guest OS.
- If not, is it possible to extend the core with this mechanism? If so,
which modules are involved?
I cannot give you a single true answer. Various designs are possible, ranging from no modification of core at all (e.g., via the approach followed by the Noux runtime or the RAM server from the discussion you mentioned), through the use of a kernel-specific virtualization feature (monitoring the to-be-checkpointed component at a very low level, like a virtual machine), to the extension of core's PD-session interface with a serialization/de-serialization feature. I am sure there are plenty more options.
A virtualization feature like a virtual machine increases the checkpointing and restoring overhead (e.g., more data to be transferred, or a longer restart time until the task runs on the second ECU), which I want to reduce. But I will reconsider this approach, and I will look into the PD-session interface for the serialization/de-serialization feature. Thank you for the hints :)
Cheers
Norman
Kind regards
Denis