Incremental checkpointing of components

Tue Jul 12 10:10:58 CEST 2016

Hello Norman,

> Hi Denis,
>
>> I am working on my thesis to implement a real-time capable
>> checkpoint/restore mechanism in Genode/L4-Fiasco.OC on the Chair of
>> Operating Systems, TUM Department of Informatics. A checkpoint
>> component A has to transparently (more or less) store the internal
>> state of another component B and restore it on another machine also
>> running Genode.
>
> let me refer to a recent discussion (back in March) that was closely
> related to the topic:
>
>
> https://sourceforge.net/p/genode/mailman/genode-main/thread/CAPyx70Cy_v3WBcqtsBuN5r%3DDwBDM_xH6COq_Z86aubHYdYR_3w%40mail.gmail.com/#msg34991766
>
> to motivate your work, I recommend that you state a tangible goal,
> i.e., a concrete example scenario that you want to realize. My
> sentiment is that many supposed use cases of a general checkpointing
> mechanism can be addressed differently and more simply on Genode. Please
> take the line of thinking that I presented as the second approach in the
> discussion above into consideration.

A typical use case for my checkpoint/restore mechanism is a vehicle
system. The mechanism shall enable fault tolerance and load balancing
for ECUs that run several tasks concurrently:

If an ECU A enters an error state and has to be rebooted, or if it runs
many tasks while other ECUs run only a few, all or some of its tasks
shall be migrated to other ECUs.

My mechanism shall checkpoint a task periodically during its execution
while imposing as little overhead/downtime on the task as possible.
Thus, I want to implement an incremental checkpointing mechanism, which
stores in memory only the changes of the internal state since the last
checkpoint.
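
To make the incremental idea concrete, here is a minimal sketch in plain
C++. It is not Genode code and not my actual design: 'Task_memory',
'Checkpoint_store', and 'PAGE_BYTES' are hypothetical names, and the set
of dirty pages is simply maintained by the write function, whereas the
real mechanism would have to learn about modified pages transparently.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical model for illustration only, nothing here is Genode API.
constexpr std::size_t PAGE_BYTES = 4096;
using Page = std::array<std::uint8_t, PAGE_BYTES>;

// Memory of the task to be checkpointed, modelled as a flat array of pages
struct Task_memory
{
    std::vector<Page>     pages;   // zero-initialized page contents
    std::set<std::size_t> dirty;   // pages written since the last checkpoint

    explicit Task_memory(std::size_t num_pages) : pages(num_pages)
    {
        // before the first checkpoint, every page counts as changed
        for (std::size_t i = 0; i < num_pages; i++) dirty.insert(i);
    }

    void write(std::size_t addr, std::uint8_t value)
    {
        std::size_t const page = addr / PAGE_BYTES;
        assert(page < pages.size());
        pages[page][addr % PAGE_BYTES] = value;
        dirty.insert(page);
    }
};

// Checkpoint image that is updated page by page
struct Checkpoint_store
{
    std::map<std::size_t, Page> image;   // page index -> last saved content

    // Copy only the dirty pages and return how many pages were copied
    std::size_t checkpoint(Task_memory &mem)
    {
        std::size_t const copied = mem.dirty.size();
        for (std::size_t page : mem.dirty)
            image[page] = mem.pages[page];
        mem.dirty.clear();
        return copied;
    }

    // Rebuild the memory content as of the last checkpoint
    void restore(Task_memory &mem) const
    {
        for (auto const &entry : image) {
            assert(entry.first < mem.pages.size());
            mem.pages[entry.first] = entry.second;
        }
        mem.dirty.clear();
    }
};
```

In this model, the first call of 'checkpoint' copies all pages, and each
later call copies only the pages written in between. Writes that happen
after the last checkpoint are not part of the restored state.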

If an event that triggers the migration of a task occurs (error, high
load on the ECU), the stored data are transferred to a suitable ECU, and
the task is restarted there from its last checkpoint.

> That said, I still find it worthwhile to experiment with the design and
> implementation of such a mechanism. Don't let me discourage you. ;-)

If I understand the second approach correctly, a Component A, which is
to be checkpointed, actively reports its state to a Component C, which
checkpoints Component A. On restart, Component A starts again from the
beginning.

My goal is to provide a transparent checkpoint. Component A shall not
know that it is being checkpointed. Also, the component shall be
restarted from the last checkpoint, not from the beginning.

>> By searching the genode-main mailing list I found an idea for emulating
>> the dirty-bit mechanism. In the message [1], the author suggests the use
>> of managed dataspaces and a custom RAM service. The RAM server detaches
>> a dataspace which a client is using, re-attaches and marks the dataspace
>> when the client wants to read it or write to it. Thus, the server of the
>> RAM service notices whenever a client uses a specific dataspace.
>>
>> [1] https://sourceforge.net/p/genode/mailman/message/34787057/
>>
>> The problem of this approach is the granularity of the dataspaces. I
>> want to be notified when a virtual memory page is changed and store only
>> this page. But using one RM session per 4 KiB dataspace (typical page
>> size) is not efficient, because an RM session uses 64 KiB for itself.
>
> This has changed in Genode 16.05. Now, one RM session can be used to
> create many managed dataspaces. Thereby, managed dataspaces have become
> much cheaper. The former functionality of the 'Rm_session' has moved to
> 'Region_map'. Each PD is readily equipped with 3 region maps. The new
> 'Rm_session' interface allows the client to create further region caps.
> See the section in the release notes [1]. Please also consider following
> the changes I made in the new revision of the book [2] for version 16.05.
>
> [1]
> http://genode.org/documentation/release-notes/16.05#Consolidation_of_core_s_SIGNAL__CAP__RM__and_PD_services
> [2]
> https://github.com/nfeske/genode-manual/commit/7c165e92697932d2433e00316e5c9cfb60c8a56d

Thank you for the hint. I had planned to implement the mechanism on
Genode 15.08 because other work at the chair on this project was
implemented on this version. I will discuss this issue with my advisor.
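
For reference, this is how I picture the detach/re-attach idea quoted
above, written as a plain C++ simulation. It does not use the Genode API:
'Tracked_region' and its functions are hypothetical names, and a "fault"
is just a function call here. On Genode, the fault would be delivered to
the server that backs the managed dataspace.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model for illustration only, nothing here is Genode API.
class Tracked_region
{
    std::vector<bool> _attached;     // page currently mapped for the client?
    std::vector<bool> _marked;       // page accessed since the last collection?
    std::size_t       _faults = 0;   // number of faults the server handled

  public:

    // All pages start detached, so the first access to each page faults
    explicit Tracked_region(std::size_t num_pages)
    : _attached(num_pages, false), _marked(num_pages, false) { }

    // Called for every read or write access of the client
    void access(std::size_t page)
    {
        assert(page < _attached.size());

        // page is mapped: no fault occurs and the server notices nothing
        if (_attached[page]) return;

        // fault: the server re-attaches the page and marks it
        _faults++;
        _attached[page] = true;
        _marked[page]   = true;
    }

    // Return all marked pages and detach them to observe the next access
    std::vector<std::size_t> collect_and_rearm()
    {
        std::vector<std::size_t> result;
        for (std::size_t i = 0; i < _marked.size(); i++) {
            if (!_marked[i]) continue;
            result.push_back(i);
            _marked[i]   = false;
            _attached[i] = false;
        }
        return result;
    }

    std::size_t faults() const { return _faults; }
};
```

The model costs one fault per accessed page and checkpoint interval, and
repeated accesses to an already attached page go unnoticed. Regarding
granularity: with one 64 KiB RM session per 4 KiB page, the bookkeeping
would take 16 times the memory that is tracked.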

>> * Do I understand this approach correctly?
>> * Is there another way to be notified when a virtual memory page is
>> accessed?
>
> Not via the Genode API. If you tie your work to a specific kernel, there
> may be additional kernel-specific mechanisms you could use. E.g.,
> virtualization components like L4Linux (Fiasco.OC), Seoul (NOVA), or
> VirtualBox (NOVA) invoke the kernel interface directly to manage a guest OS.
>
>> * If not, is it possible to extend core with this mechanism? If yes,
>> which modules are involved?
>
> I cannot give you a single true answer. There may be various designs
> possible, ranging from no modifications of core at all (e.g., via the
> approach followed by the Noux runtime or the RAM-server mentioned in the
> discussion you mentioned), over the use of a kernel-specific
> virtualization feature (monitoring the to-be-checkpointed component at a
> very low level like a virtual machine), to the extension of core's
> PD-session interface with a serialization/de-serialization feature. I am
> sure there are plenty more options.

A virtualization feature like a virtual machine increases the
checkpointing and restoring overhead (e.g., more data to be
transferred, or a longer restart time until the task runs on the
second ECU), which I want to reduce. Still, I will reconsider this
approach, and I will look into the PD-session interface with regard to
the serialization/de-serialization feature. Thank you for the hints :)

> Cheers
> Norman
>

Kind regards
Denis