Restoring child with checkpointed state

Wed Dec 7 14:13:11 CET 2016

Hello Denis,

> thanks to you [1], I could implement my Checkpoint/Restore mechanism on 
> Genode/Fiasco.OC. I also added the incremental checkpoint optimization, 
> to store only changed memory regions compared to the last checkpoint 
> (although this is not working reliably due to a Fiasco.OC bug, which 
> Stefan Kalkowski found for me [2]). I also managed to checkpoint the 
> capability map and restore it with new badges and insert missing 
> capabilities into the capability space of Fiasco.OC.

this is impressive!

> My problem is, although I restore all RPC objects, especially the 
> instruction and stack pointer of the main thread, and the capability map 
> and space, the target component just starts its execution from the 
> beginning of its Component::construct function.

This confuses me. If you restore the states of all threads, why don't
the threads start at their checkpointed state? Starting the execution at
the 'Component::construct' function feels wrong. In particular, code
that is expected to be executed only once is in fact executed twice,
once by the original component and a second time after waking up the
restored component.

> My approach:
> For the restore phase, I use Genode's native bootstrap mechanism (i.e. I 
> create a Genode::Child object) until it requests a LOG session from my 
> Checkpoint/Restore component. I force a LOG session request in 
> ::Constructor_component::construct() just before 
> "Genode::call_component_construct(env);" in
> 
> https://github.com/genodelabs/genode/blob/16.08/repos/base/src/lib/base/entrypoint.cc#L154
> 
> Until the session request several RAM dataspaces are created, among 
> other RPC objects, and attached to the address space. In my restore 
> mechanism I identify the RPC objects, which were created by the 
> bootstrap/startup mechanism, and only restore their state.

What you observe here is the ELF loading of the child's binary. As part
of the 'Child' object, the so-called '_process' member is constructed.
You can find the corresponding code at
'base/src/lib/base/child_process.cc'. The code parses the ELF executable
and loads the program segments, specifically the read-only text segment
and the read-writable data/bss segment. For the latter, a RAM dataspace
is allocated and filled with the content of the ELF binary's data. In
your case, when resuming, this procedure is wrong. After all, you want
to supply the checkpointed data to the new child, not the initial data
provided by the ELF binary.

Fortunately, I encountered the same problem when implementing fork for
noux. I solved it by letting the 'Child_process' constructor accept an
invalid dataspace capability as ELF argument. This has two effects:
First, the ELF loading is skipped (obviously - there is no ELF to load).
Second, the creation of the initial thread is skipped as well.

In short, by supplying an invalid dataspace capability as binary for the
new child, you avoid all those unwanted operations. The new child will
not start at 'Component::construct'. You will have to manually create
and start the threads of the new child via the PD and CPU session
interfaces.
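
To make the control flow concrete, here is a minimal, self-contained model
of the idea. It does not use Genode's headers or API: 'Dataspace_cap',
'Thread_snapshot', 'Model_child', 'create_child', 'restore_threads', and
the two address constants are hypothetical stand-ins that I made up for
illustration. The only things taken from the discussion above are the two
effects of an invalid binary capability and the need to start the threads
by hand afterwards.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins, NOT Genode's real types or signatures.

struct Dataspace_cap { bool valid; };      // models a possibly invalid capability

struct Thread_snapshot {                   // per-thread state kept in a checkpoint
    std::string   name;
    std::uint64_t ip;                      // instruction pointer
    std::uint64_t sp;                      // stack pointer
};

struct Model_child {
    bool                         elf_loaded = false;
    std::vector<Thread_snapshot> threads;  // threads that have been started
};

// Made-up addresses for the normal startup path.
constexpr std::uint64_t ELF_ENTRY  = 0x1000;
constexpr std::uint64_t INITIAL_SP = 0x7fff0000;

// Models child creation: an invalid binary capability skips both the ELF
// loading and the creation of the initial thread.
Model_child create_child(Dataspace_cap binary)
{
    Model_child child;
    if (!binary.valid)
        return child;                      // nothing loaded, nothing started

    child.elf_loaded = true;               // text and data/bss come from the ELF
    child.threads.push_back({"initial", ELF_ENTRY, INITIAL_SP});
    return child;
}

// Models the manual step: start every checkpointed thread at its saved
// ip/sp. Only meaningful for a child created without a binary.
void restore_threads(Model_child &child,
                     std::vector<Thread_snapshot> const &checkpoint)
{
    assert(!child.elf_loaded && child.threads.empty());
    for (Thread_snapshot const &t : checkpoint)
        child.threads.push_back(t);
}
```

In this model, 'create_child(Dataspace_cap{false})' yields a child with no
threads, and a subsequent 'restore_threads' call makes each thread begin at
its checkpointed ip instead of the ELF entry point. In the real system, the
memory content would additionally have to be supplied from the checkpoint,
which the sketch leaves out.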

> During that process the mandatory CPU threads are identified (three of 
> them: "ep", "signal_handler", and "childs_rom_name") and restored to 
> their checkpointed state, especially the ip and sp registers. I did that 
> through the use of Cpu_thread::state(Thread_state), but without luck. 
> Also, although I know that the CPU threads were already started, I tried 
> to call Cpu_thread::start(ip, sp), but without success.

The approach looks good. I presume that you encounter base-foc-specific
peculiarities of the thread-creation procedure. I would try to follow
the code in 'base-foc/src/core/platform_thread.cc' to see what the
interaction of core with the kernel looks like. The order of operations
might be important.

One remaining problem may be that - even though you may be able to
restore most of the thread state - the kernel-internal state cannot
be captured. E.g., think of a thread that was blocking in the kernel via
'l4_ipc_reply_and_wait' when checkpointed. When resumed, the new thread
can naturally not be in this blocking state because the kernel's state
is not part of the checkpointed state. The new thread would possibly
start its execution at the instruction pointer of the syscall and issue
the system call again, but I am not sure what really happens in practice.

> After the restoration which happens entirely during the LOG session 
> request of the child, my component returns with a valid session object 
> to the child. Now the child should continue the work from the point 
> where it was checkpointed, but it continues its execution right after 
> the LOG session request, ignoring the setting of the instruction pointer.

I think that you don't need the LOG-session quirk if you follow my
suggestion to skip the ELF loading for the restored component
altogether. Could you give it a try?

Cheers
Norman

-- 
Dr.-Ing. Norman Feske
Genode Labs

http://www.genode-labs.com · http://genode.org

Genode Labs GmbH · Amtsgericht Dresden · HRB 28424 · Sitz Dresden
Geschäftsführer: Dr.-Ing. Norman Feske, Christian Helmuth