Hello Denis,
thanks to you [1], I could implement my Checkpoint/Restore mechanism on Genode/Fiasco.OC. I also added the incremental checkpoint optimization, to store only the memory regions that changed since the last checkpoint (although this is not working reliably due to a Fiasco.OC bug, which Stefan Kalkowski found for me [2]). I also managed to checkpoint the capability map, restore it with new badges, and insert missing capabilities into the capability space of Fiasco.OC.
this is impressive!
My problem is, although I restore all RPC objects, especially the instruction and stack pointer of the main thread, and the capability map and space, the target component just starts its execution from the beginning of its Component::construct function.
This confuses me. If you restore the states of all threads, why don't the threads start at their checkpointed state? Starting the execution at the 'Component::construct' function feels wrong. In particular, code that is expected to be executed only once is in fact executed twice, once by the original component and a second time after waking up the restored component.
My approach: For the restore phase, I use Genode's native bootstrap mechanism (i.e. I create a Genode::Child object) until it requests a LOG session from my Checkpoint/Restore component. I force a LOG session request in ::Constructor_component::construct() just before "Genode::call_component_construct(env);" in
https://github.com/genodelabs/genode/blob/16.08/repos/base/src/lib/base/entr...
Until the session request several RAM dataspaces are created, among other RPC objects, and attached to the address space. In my restore mechanism I identify the RPC objects, which were created by the bootstrap/startup mechanism, and only restore their state.
What you observe here is the ELF loading of the child's binary. As part of the 'Child' object, the so-called '_process' member is constructed. You can find the corresponding code at 'base/src/lib/base/child_process.cc'. The code parses the ELF executable and loads the program segments, specifically the read-only text segment and the read-writable data/bss segment. For the latter, a RAM dataspace is allocated and filled with the content of the ELF binary's data. In your case, when resuming, this procedure is wrong. After all, you want to supply the checkpointed data to the new child, not the initial data provided by the ELF binary.
Fortunately, I encountered the same problem when implementing fork for noux. I solved it by letting the 'Child_process' constructor accept an invalid dataspace capability as ELF argument. This has two effects: First, the ELF loading is skipped (obviously - there is no ELF to load). Second, the creation of the initial thread is skipped as well.
In short, by supplying an invalid dataspace capability as binary for the new child, you avoid all those unwanted operations. The new child will not start at 'Component::construct'. You will have to manually create and start the threads of the new child via the PD and CPU session interfaces.
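To make the control flow concrete, here is a minimal, self-contained C++ model of the idea. It is not Genode code: 'Toy_dataspace_cap', 'Toy_thread_state', 'Toy_thread', 'Toy_child_process', and 'create_and_start' are hypothetical stand-ins, and the register values are made up. It only shows that an invalid ELF capability skips both the ELF loading and the initial thread, so that the threads have to be created by hand afterwards.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for illustration only. None of these types are the Genode API.
struct Toy_dataspace_cap {
    bool is_valid;
    bool valid() const { return is_valid; }
};

struct Toy_thread_state { unsigned long ip; unsigned long sp; };

struct Toy_thread {
    std::string      name;
    Toy_thread_state state;
};

struct Toy_child_process {
    bool                    elf_loaded = false;
    std::vector<Toy_thread> threads;

    explicit Toy_child_process(Toy_dataspace_cap elf_ds)
    {
        // Invalid capability: skip ELF loading and initial-thread creation.
        if (!elf_ds.valid())
            return;

        // Regular start: load the segments and create the initial thread
        // at the ELF entry point (values here are made up).
        elf_loaded = true;
        Toy_thread initial{"initial", Toy_thread_state{0x1000, 0x8000}};
        threads.push_back(initial);
    }

    // Models the manual step: the parent creates and starts each thread
    // from its checkpointed registers.
    void create_and_start(std::string const &name, Toy_thread_state const &saved)
    {
        assert(!name.empty());
        Toy_thread restored{name, saved};
        threads.push_back(restored);
    }
};

// Usage example for the restore path.
void demo_restore_path()
{
    Toy_child_process restored(Toy_dataspace_cap{false});
    assert(!restored.elf_loaded && restored.threads.empty());

    restored.create_and_start("ep", Toy_thread_state{0x2000, 0x9000});
    assert(restored.threads.size() == 1);
}
```

In the real system, the manual step would go through the PD and CPU session interfaces rather than a helper like 'create_and_start'.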
During that process the mandatory CPU threads are identified (three of them: "ep", "signal_handler", and "childs_rom_name") and restored to their checkpointed state, especially the ip and sp registers. I did that through the use of Cpu_thread::state(Thread_state), but without luck. Also, although I know that the CPU threads were already started, I tried to call Cpu_thread::start(ip, sp), but without success.
The approach looks good. I presume that you encounter base-foc-specific peculiarities of the thread-creation procedure. I would try to follow the code in 'base-foc/src/core/platform_thread.cc' to see what the interaction of core with the kernel looks like. The order of operations might be important.
One remaining problem may be that - even though you may be able to restore most of the thread state - the kernel-internal state cannot be captured. E.g., think of a thread that was blocking in the kernel via 'l4_ipc_reply_and_wait' when checkpointed. When resumed, the new thread can naturally not be in this blocking state because the kernel's state is not part of the checkpointed state. The new thread would possibly start its execution at the instruction pointer of the syscall and issue the system call again, but I am not sure what really happens in practice.
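The gap can be pictured with a small toy model in C++. Everything in it is hypothetical ('Toy_user_regs', 'Toy_kernel_thread', 'checkpoint', 'restore'), and it makes no claim about how Fiasco.OC actually behaves. It only illustrates that a checkpoint built from the user-visible registers cannot carry a kernel-internal "blocked in IPC" condition over to the new thread.

```cpp
#include <cassert>

// Hypothetical model, not kernel or Genode code.
struct Toy_user_regs { unsigned long ip; unsigned long sp; };

struct Toy_kernel_thread {
    Toy_user_regs regs;            // visible to the checkpointer
    bool          blocked_in_ipc;  // kernel-internal, never exported
};

// The checkpoint sees only the user-visible registers.
Toy_user_regs checkpoint(Toy_kernel_thread const &t) { return t.regs; }

// A freshly created thread gets the saved registers but starts out
// runnable, because the blocking condition was never saved.
Toy_kernel_thread restore(Toy_user_regs const &saved)
{
    Toy_kernel_thread t{saved, false};
    return t;
}

// Usage example: registers survive, the blocking state does not.
void demo_restore_gap()
{
    Toy_kernel_thread original{Toy_user_regs{0x4000, 0xa000}, true};
    Toy_kernel_thread resumed = restore(checkpoint(original));

    assert(resumed.regs.ip == original.regs.ip);
    assert(resumed.regs.sp == original.regs.sp);
    assert(!resumed.blocked_in_ipc);
}
```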
After the restoration which happens entirely during the LOG session request of the child, my component returns with a valid session object to the child. Now the child should continue the work from the point where it was checkpointed, but it continues its execution right after the LOG session request, ignoring the setting of the instruction pointer.
I think that you don't need the LOG-session quirk if you follow my suggestion to skip the ELF loading for the restored component altogether. Could you give it a try?
Cheers Norman