Hi all,
I want to be able to 'trace' every memory write access of an arbitrary target application, in order to 'duplicate' those writes into another memory location.
So that I would then have at all times one exact copy of the memory area (stack and heap) used by that application.
My plan was to realize this with the help of a parent application that acts as a session proxy for the target, i.e. the target requests a RAM session from the parent, which forwards the request to core and then provides the session to the target; thus, the parent later knows which dataspaces/memory areas are used by the target.
It could then flag that whole memory area with watchpoints. Hence, parent would be notified about all write accesses made by target and could consequently duplicate those writes in a secondary memory location.
(I know that this would probably incur a slowdown factor of 10-1000 times, but that is allowed in this scenario.)
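To make the goal concrete, here is a minimal sketch of the duplication step on its own, assuming some mechanism (watchpoint or page fault) already reports every write as an offset, a value, and a size. The names Mirror and apply_write are hypothetical and not part of Genode or Fiasco.OC.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical model: 'primary' stands for the target's memory area,
// 'shadow' for the secondary location that must stay identical.
struct Mirror
{
    std::vector<std::uint8_t> primary;
    std::vector<std::uint8_t> shadow;

    explicit Mirror(std::size_t size) : primary(size, 0), shadow(size, 0) { }

    // Apply one traced write of at most 4 bytes at 'offset' to both copies.
    // Returns false (and changes nothing) if the access is out of range.
    bool apply_write(std::size_t offset, std::uint32_t value, std::size_t size)
    {
        if (size > sizeof(value) || offset > primary.size() ||
            size > primary.size() - offset)
            return false;

        std::memcpy(primary.data() + offset, &value, size);
        std::memcpy(shadow.data()  + offset, &value, size);
        return true;
    }

    // True as long as both copies hold the same bytes.
    bool in_sync() const { return primary == shadow; }
};
```

The hard part of the question is not this copy but how to get notified about each write in the first place.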
Sadly, only now I have read that watchpoints are not implemented in Genode, and most probably neither are they in the underlying kernel, Fiasco.OC, which I have to use; breakpoints on the other hand seem to work just fine.
My question is: Could there possibly be another way to achieve my goal, or do I have to bite the bullet and implement watchpoints myself?
Thanks for bearing with me and for any hints at all.
Best,
Josef
Hi Josef,
I want to be able to 'trace' every memory write access of an arbitrary target application, in order to 'duplicate' those writes into another memory location. ... Sadly, only now I have read that watchpoints are not implemented in Genode, and most probably neither are they in the underlying kernel, Fiasco.OC, which I have to use; breakpoints on the other hand seem to work just fine.
My question is: Could there possibly be another way to achieve my goal, or do I have to bite the bullet and implement watchpoints myself?
this question reminds me of the student project of Martin Stein several years ago. Martin's goal was to emulate I/O devices backed by an on-target VHDL simulation. Both the driver and the simulated I/O device would be hosted on the same machine. Each time the driver accesses a memory-mapped I/O register, the simulated device would need to handle that access.
Martin solved this problem on the base-hw kernel on ARM. Please refer to his thesis for more information.
http://genode-labs.com/publications/mstein-device_emulation-2011.pdf
Cheers Norman
Hello Norman,
this question reminds me of the student project of Martin Stein several years ago. Martin's goal was to emulate I/O devices backed by an on-target VHDL simulation. Both the driver and the simulated I/O device would be hosted on the same machine. Each time the driver accesses a memory-mapped I/O register, the simulated device would need to handle that access.
the thesis indeed seems like a good starting point. I will try to get the original code to run under Genode 11.08, then change the code so that, instead of MMIO accesses, normal memory accesses within some special region are caught, and make other first adjustments for my use case; does this seem correct? Or is it more feasible to leave the MMIO mechanism as-is and then assign some emulated MMIO memory area to the target process (= the process to be monitored)? After that, I would try to port that to a more recent version of Genode and then implement the rest.
Martin solved this problem on the base-hw kernel on ARM. Please refer to his thesis for more information. http://genode-labs.com/publications/mstein-device_emulation-2011.pdf
The thesis seems to implement it (albeit on an older version of) Fiasco.OC, which is what I'm using; is there any way to access the complete source code of the thesis or is it proprietary code? If not then I'll just try to make it work using the code-snippets inside the document.
Also, I found the follow-up thesis here: http://www.genode-labs.com/publications/mstein-hw-codesign-2013.pdf which now actually went back to implement it on bare-hw and made further improvements, but it seems that for my purposes the first thesis is more than enough (I admit having only peeked into the second one).
Thanks a lot for the quick help! Cheers, Josef
Hi again,
so I contacted the author, Martin Stein, who gave me quite some insight and convinced me to use [1] as a starting point instead of [2] for several reasons:
* The whereabouts of the older code from [2] are unknown due to a switch from svn to git.
* [2] uses the now obsolete base-mb kernel.
* base-mb ran only on microblaze. Obtaining the toolchain and getting everything to run would be a big hassle.
* [1] is much more tested and stable and the thesis is probably written in a more comprehensible way.
* [1] is newer and thus easier to port to more current versions.
So now as a first step I'm trying to get the original code from [3] to run. I'm using the toolchain 12.11-x86_64. Martin suggested trying the scenario run/ptc_hdl_env from the branch ptc_hdl_env.
However, now I'm running into some compilation trouble. So far I did the following steps:
$ git clone https://github.com/m-stein/genode_hdl_env.git ptc_hdl_env
$ cd ptc_hdl_env/
$ git checkout ptc_hdl_env
$ tool/create_builddir hw_pbxa9 BUILD_DIR=build
# enable ports and libports repos inside build/etc/build.conf
libports$ make prepare PKG="verilator"
libports$ make prepare PKG="libc"
libports$ make prepare PKG="stdcxx"
$ cd build
# point /etc/tools.conf to the toolchain
# add MAKE += -j4 to etc/build.conf
$ make run/ptc_hdl_env
I get the following errors:
===============================================
...
Program core/pbxa9/core
COMPILE environ.o
COMPILE err.o
COMPILE file.o
In file included from /home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../platform_timer.h:21:0,
    from /home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../../nova/timer_session_component.h:23,
    from /home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../../nova/timer_root.h:24,
    from /home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../../main.cc:19:
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/platform_timer_base.h:27:28: error: ‘SP804_CLOCK’ is not a member of ‘Genode::Board_base’
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/platform_timer_base.h:27:28: error: ‘SP804_CLOCK’ is not a member of ‘Genode::Board_base’
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/platform_timer_base.h:27:59: error: template argument 1 is invalid
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/platform_timer_base.h: In constructor ‘Platform_timer_base::Platform_timer_base()’:
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/platform_timer_base.h:38:22: error: ‘SP804_MMIO_SIZE’ is not a member of ‘Genode::Board_base’
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/platform_timer_base.h:40:4: error: class ‘Platform_timer_base’ does not have any field named ‘Sp804_base’
In file included from /home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../../nova/timer_session_component.h:23:0,
    from /home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../../nova/timer_root.h:24,
    from /home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../../main.cc:19:
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../platform_timer.h: In constructor ‘Platform_timer::Platform_timer()’:
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../platform_timer.h:44:59: error: ‘max_value’ was not declared in this scope
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../platform_timer.h:44:60: error: ‘tics_to_us’ was not declared in this scope
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../platform_timer.h: In member function ‘long unsigned int Platform_timer::curr_time()’:
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../platform_timer.h:61:41: error: ‘value’ was not declared in this scope
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../platform_timer.h:63:55: error: ‘max_value’ was not declared in this scope
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../platform_timer.h:70:43: error: ‘tics_to_us’ was not declared in this scope
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../platform_timer.h: In member function ‘void Platform_timer::schedule_timeout(long unsigned int)’:
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../platform_timer.h:99:39: error: ‘us_to_tics’ was not declared in this scope
/home/josef/git/ptc_hdl_env/os/src/drivers/timer/hw/pbxa9/../platform_timer.h:100:28: error: ‘run_and_wrap’ was not declared in this scope
COMPILE ldso_types.o
COMPILE lock.o
/home/josef/git/ptc_hdl_env/base/mk/generic.mk:46: recipe for target 'main.o' failed
make[3]: *** [main.o] Error 1
var/libdeps:261: recipe for target 'timer.prg' failed
make[2]: *** [timer.prg] Error 2
make[2]: *** Waiting for unfinished jobs....
COMPILE main.o
...
MERGE ld.lib.so
LINK init
Makefile:206: recipe for target 'gen_deps_and_build_targets' failed
make[1]: *** [gen_deps_and_build_targets] Error 2
make[1]: Leaving directory '/home/josef/git/ptc_hdl_env/build'
Error: Genode build failed
Makefile:237: recipe for target 'run/ptc_hdl_env' failed
make: *** [run/ptc_hdl_env] Error 252
===============================================
It seems that base-hw is borrowing a few header-files from nova, and grep showed that SP804_CLOCK is declared inside base/include/platform/vea9x4/drivers/board_base.h (which looks like another board), so maybe that's about where things go wrong. Can someone help me out?
Cheers, Josef
[1] http://www.genode-labs.com/publications/mstein-hw-codesign-2013.pdf
[2] http://genode-labs.com/publications/mstein-device_emulation-2011.pdf
[3] https://github.com/m-stein/genode_hdl_env
Hi Josef,
On 07.12.2017 at 21:57, Stark, Josef wrote:
It seems that base-hw is borrowing a few header-files from nova, and grep showed that SP804_CLOCK is declared inside base/include/platform/vea9x4/drivers/board_base.h (which looks like another board), so maybe that's about where things go wrong. Can someone help me out?
I just pushed a fix [1] that solves the problem. I also had a compile error like:
Program test/ptc_hdl_env/ptc/test-ptc_hdl_env-ptc
COMPILE integration.o
genode-arm-g++: error: integration.cc: No such file or directory
genode-arm-g++: fatal error: no input files
But it disappeared when I ran "make run/ptc_hdl_env" a second time, so it seems to be a sporadic problem with the build system. Apart from that, the PTC worked fine for me.
Martin
[1] https://github.com/m-stein/genode_hdl_env/commit/b3714bfb81e9e9c37cc068e640f...
Hi all, hello Martin,
first of all, thank you for the rapid fix.
So since I only need a tiny fraction of Vinit (the device emulator), some time ago I decided that it would be easier for my project to only 'reintegrate' the stuff I need from Vinit into Genode 16.08 instead of porting the whole thing. Basically I create a child from my parent and register as pager. When my child requests a dataspace, I take note of size and address but don't actually create one. So then each time my child accesses data, I get notified via a pagefault. Then, with each PF, I would need to do 4 basic steps:
1. Get instruction pointer and decode instruction
2. Simulate memory access, redundantly
3. Increase instruction pointer
4. Resume child without ever attaching the dataspace
First, I experimented with a standalone app of my own, but then moved on to adapting our existing checkpointing app rtcr accordingly [1] [2], since I figured that I could later reuse some parts of it for C/R. Step 4 is relatively easy and already working, so I now have an app that pagefaults and gets resumed repeatedly at the same address and IP, since I couldn't implement the rest of the steps so far. The reason is that I can't figure out how to access the Thread_state of the thread causing the pagefault. Vinit uses an imprint written into the State that can be used to correlate an RM client to the pagefault State. I'm trying to integrate this imprint as well, but I'm struggling because the Genode architecture has obviously changed quite a bit since the creation of Vinit (at least for a newcomer like me). E.g. I'm having trouble mapping the required changes to add_client() from [3] onto the 16.08 version (it exists in a different place with a different signature and a much smaller body).
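The four steps can be sketched roughly as follows. This is only a model under simplifying assumptions: the instruction is assumed to be already decoded (step 1), the child runs in ARM mode with 4-byte instructions, and the machine is little-endian. Regs, Decoded, and handle_fault are hypothetical stand-ins, not Genode or Vinit types.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified stand-ins for the thread state and the decoded
// instruction; the real Genode and Vinit types differ.
struct Regs    { std::uint32_t r[16]; std::uint32_t ip; };
struct Decoded { bool writes; unsigned reg; std::size_t size; };

// Handle one fault at 'offset' into the backing buffers 'mem' and 'copy'
// (both 'len' bytes long). Returns false if the access cannot be emulated.
bool handle_fault(Regs &regs, Decoded const &d, std::size_t offset,
                  std::uint8_t *mem, std::uint8_t *copy, std::size_t len)
{
    if (d.reg >= 16 || d.size > 4 || offset > len || d.size > len - offset)
        return false;

    if (d.writes) {
        // step 2: perform the store on behalf of the child, redundantly
        std::memcpy(mem  + offset, &regs.r[d.reg], d.size);
        std::memcpy(copy + offset, &regs.r[d.reg], d.size);
    } else {
        // step 2: perform the load and hand the value back in the register
        std::uint32_t value = 0;
        std::memcpy(&value, mem + offset, d.size);
        regs.r[d.reg] = value;
    }

    // step 3: skip the emulated instruction (4 bytes in ARM mode)
    regs.ip += 4;

    // step 4 is left to the caller: write 'regs' back and resume the child
    return true;
}
```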
So I'm wondering if maybe in the meantime there exists an easier way to access the IP (and other registers) of a pagefaulting thread from within the fault-handler (in my example _handle_fault in [4])? Especially considering that my parent task has only this one child (with currently only 1 thread), if that makes it easier.
Thank you.
Best regards, Josef
[1] https://github.com/jmstark/genode (Base OS modifications required for rtcr)
[2] https://github.com/jmstark/rtcr (rtcr, to be cloned into repos)
[3] https://github.com/jmstark/genode_hdl_env/commit/79014e19861a7b02f028c6b5918...
[4] https://github.com/jmstark/rtcr/blob/red_mem/src/rtcr/intercept/ram_session....
Hi Josef,
On 12/01/18 at 16:51, Stark, Josef wrote:
Step 4 is relatively easy and already working, so I now have an app that pagefaults and gets resumed repeatedly at the same address and IP, since I couldn't implement the rest of the steps so far. The reason is that I can't figure out how to access the Thread_state of the thread causing the pagefault. Vinit uses an imprint written into the State that can be used to correlate an RM client to the pagefault State. I'm trying to integrate this imprint as well, but I'm struggling because the Genode architecture has obviously changed quite a bit since the creation of Vinit (at least for a newcomer like me). E.g. I'm having trouble mapping the required changes to add_client() from [3] onto the 16.08 version (it exists in a different place with a different signature and a much smaller body).
So I'm wondering if maybe in the meantime there exists an easier way to access the IP (and other registers) of a pagefaulting thread from within the fault-handler (in my example _handle_fault in [4])? Especially considering that my parent task has only this one child (with currently only 1 thread), if that makes it easier.
As you should be the parent of the traced component, you can intercept the CPU session and remember all thread capabilities created by the component. On a fault you iterate through all threads to select those that are currently in a page fault. A good example for this is the GDB monitor [1]. Now you want to select from the remaining threads those that are in a fault on your specific address. This can't be done with the current Genode API, but an easy way to achieve it would be to expand the Cpu_state [2] to deliver also the value of the ARM Data Fault Address Register or DFAR when calling Cpu_thread_client::state (make sure to update the dfar member in [4] as is done in [3]).
Now you can arbitrarily choose one of the remaining threads, read its IP from the Cpu_state, emulate its instruction, and write back results via Cpu_thread_client::state. Then, you would have to continue execution of the thread. Unfortunately, this is normally done automatically by Genode's core when it sees a new mapping that matches the fault address. You don't want to do such a mapping, but there is no explicit "resume this faulter". So, you might have to add such an RPC call [5] and its back end [6] to the RM interface. This shouldn't be very invasive.
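The selection described above could look roughly like this. It is a sketch over a hypothetical snapshot type: in Genode the flag would come from Thread_state::unresolved_page_fault, while the fault_addr field models the proposed DFAR extension and does not exist in the stock API.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-thread snapshot collected by the monitoring parent.
struct Thread_info
{
    unsigned      id;
    bool          unresolved_page_fault;  // thread is stuck in a page fault
    std::uint32_t fault_addr;             // assumed DFAR value (extension)
    std::uint32_t ip;                     // instruction pointer at the fault
};

// Return the first thread that is in an unresolved page fault at 'addr',
// or nullptr if no such thread exists.
Thread_info const *find_faulter(std::vector<Thread_info> const &threads,
                                std::uint32_t addr)
{
    for (Thread_info const &t : threads)
        if (t.unresolved_page_fault && t.fault_addr == addr)
            return &t;

    return nullptr;
}
```

Returning the first match corresponds to "arbitrarily choose one of the remaining threads"; any other policy would work as well.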
If there are further questions please do not hesitate to ask ;)
Cheers, Martin
[1] ports/src/app/gdb_monitor/cpu_session_component.cc: Cpu_session_component::handle_unresolved_page_fault
[2] base/include/spec/arm/cpu/cpu_state.h
[3] base-hw/src/core/spec/arm_v7/trustzone/kernel/vm.cc: Vm::exception
[4] base-hw/src/core/spec/arm/kernel/thread.cc: Thread::exception
[5] base/include/region_map/region_map.h
[6] base/src/core/region_map_component.cc
Hi,
As you should be the parent of the traced component, you can intercept the CPU session and remember all thread capabilities created by the component. On a fault you iterate through all threads to select those that are currently in a page fault. A good example for this is the GDB monitor [1].
Ok, so far, so good, I could successfully get the threads within my target process and indeed access the correct instruction pointer. But the next problem is that Thread_state::unresolved_page_fault is always 0 for me, for all threads (I provoked a pagefault by detaching the dataspace), so I can't filter out the faulter(s). Do you have a clue? As long as I don't attach a dataspace at the corresponding address, it should indeed be an unresolved page fault, or am I wrong here?
Now you want to select from the remaining threads those that are in a fault on your specific address. This can't be done with the current Genode API, but an easy way to achieve it would be to expand the Cpu_state [2] to deliver also the value of the ARM Data Fault Address Register or DFAR when calling Cpu_thread_client::state (make sure to update the dfar member in [4] as is done in [3]).
Good. I'm using Fiasco.OC though, so I'll have to figure out how to do this there. Writing a custom kernel function making the assembler call copied from Genode just froze the VM, but I already asked the Fiasco.OC people for help.
Then, you would have to continue execution of the thread. Unfortunately, this is normally done automatically by Genode's core when it sees a new mapping that matches the fault address. You don't want to do such a mapping, but there is no explicit "resume this faulter". So, you might have to add such an RPC call [5] and its back end [6] to the RM interface. This shouldn't be very invasive.
At least this seems to work.
If there are further questions please do not hesitate to ask ;)
That's very nice of you, thanks a lot for your help!
Cheers, Josef
Hi Josef,
On 25/01/18 at 18:28, Stark, Josef wrote:
As you should be the parent of the traced component, you can intercept the CPU session and remember all thread capabilities created by the component. On a fault you iterate through all threads to select those that are currently in a page fault. A good example for this is the GDB monitor [1].
Ok, so far, so good, I could successfully get the threads within my target process and indeed access the correct instruction pointer. But the next problem is that Thread_state::unresolved_page_fault is always 0 for me, for all threads (I provoked a pagefault by detaching the dataspace), so I can't filter out the faulter(s). Do you have a clue? As long as I don't attach a dataspace at the corresponding address, it should indeed be an unresolved page fault, or am I wrong here?
At least in the current Genode release, the 'unresolved_page_fault' flag is set in [1], and hence in [2], when core gives up resolving the fault and hands it over to the userland handler. I assume you're using a patched fork of an older Genode version, so things may be different in your case. However, have you had a look at the GDB monitor?
Now you want to select from the remaining threads those that are in a fault on your specific address. This can't be done with the current Genode API, but an easy way to achieve it would be to expand the Cpu_state [2] to deliver also the value of the ARM Data Fault Address Register or DFAR when calling Cpu_thread_client::state (make sure to update the dfar member in [4] as is done in [3]).
Good. I'm using Fiasco.OC though, so I'll have to figure out how to do this there. Writing a custom kernel function making the assembler call copied from Genode just froze the VM, but I already asked the Fiasco.OC people for help.
Then, you would have to continue execution of the thread. Unfortunately, this is normally done automatically by Genode's core when it sees a new mapping that matches the fault address. You don't want to do such a mapping, but there is no explicit "resume this faulter". So, you might have to add such an RPC call [5] and its back end [6] to the RM interface. This shouldn't be very invasive.
At least this seems to work.
Could you please explain in more detail what you mean by that? How do you get the faulter to continue execution without attaching a dataspace?
Cheers, Martin
[1] base-foc/src/core/pager_object.cc
[2] base/src/core/region_map_component.cc:207
Hello Martin,
At least in the current Genode release, the 'unresolved_page_fault' flag is set in [1], and hence in [2], when core gives up resolving the fault and hands it over to the userland handler. I assume you're using a patched fork of an older Genode version, so things may be different in your case. However, have you had a look at the GDB monitor?
Yes, I've seen in your referenced code how GDB monitor uses this flag. We're using Genode 16.08, but I can't build the gdb_monitor.run sample, neither in the vanilla 16.08 nor in our patched version: "cp: cannot stat 'bin/ld.lib.so': No such file or directory". So next I'll look into how it's handled on newer versions.
Anyway, I tried to read the DFAR register as you recommended, but it seems I didn't quite understand what you meant, as an L4 hacker responded to me in [3] that DFAR/DFSR registers are non-thread-specific CPU registers, and thus cannot be used to differentiate between the threads like I tried.
For now, however, I managed to pass the IP via the pagefault report to the fault handler and can identify the corresponding thread with it. But I guess that there could be some corner case where two threads try to access different addresses from the same instruction, so I might still look into how to do this using DFAR.
How do you get the faulter to continue execution without attaching a dataspace?
By calling faulter->continue_after_resolved_fault(). You're completely right, it doesn't really continue execution as I haven't increased the instruction pointer yet, nor did I attach the dataspace. But it faults again at the same spot, which shows me that it *would* work if I had emulated the instruction and increased the IP.
Thank you. Josef
[1] base-foc/src/core/pager_object.cc
[2] base/src/core/region_map_component.cc:207
[3] http://os.inf.tu-dresden.de/pipermail/l4-hackers/2018/008205.html
Hi Josef,
On 29/01/18 at 15:22, Stark, Josef wrote:
At least in the current Genode release, the 'unresolved_page_fault' flag is set in [1], and hence in [2], when core gives up resolving the fault and hands it over to the userland handler. I assume you're using a patched fork of an older Genode version, so things may be different in your case. However, have you had a look at the GDB monitor?
Yes, I've seen in your referenced code how GDB monitor uses this flag. We're using Genode 16.08, but I can't build the gdb_monitor.run sample, neither in the vanilla 16.08 nor in our patched version: "cp: cannot stat 'bin/ld.lib.so': No such file or directory". So next I'll look into how it's handled on newer versions.
Please note that the GDB monitor test is supported only on Nova and real hardware (no Qemu), as the run script should tell you. You should also definitely create a fresh build dir when switching versions, because the build system might have changed in the meantime, which can lead to unpleasant errors.
Anyway, I tried to read the DFAR register as you recommended, but it seems I didn't quite understand what you meant, as an L4 hacker responded to me in [3] that DFAR/DFSR registers are non-thread-specific CPU registers, and thus cannot be used to differentiate between the threads like I tried.
The DFAR/DFSR are registers of the CPU that might or might not be saved per thread by a kernel. As far as I understand the mail from the L4 guys, Fiasco.OC doesn't save DFAR/DFSR to the thread state. However, the DFAR/DFSR are thread specific as their state corresponds to a specific fault that is caused by a specific thread. So, I assume you would have to modify fault-handling code (where DFAR/DFSR are inspected) or even exception-entry code (where general purpose registers are saved) in Fiasco.OC to gain access to DFAR/DFSR in userland.
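As a toy illustration of "saved per thread by a kernel": the register exists once per CPU, and the value only becomes attributable to a thread if the exception-entry path copies it into that thread's state. The types below are a hypothetical model and not Fiasco.OC code.

```cpp
#include <cstdint>

// Hypothetical model: one CPU-wide fault-address register ...
struct Cpu { std::uint32_t dfar; };

// ... and a per-thread save area that userland could later query.
struct Thread_context
{
    std::uint32_t saved_dfar;
    bool          in_fault;
};

// Called on a data abort taken while 'current' was running. Without this
// copy, the next fault on the same CPU would overwrite the only instance
// of the value.
void on_data_abort(Cpu const &cpu, Thread_context &current)
{
    current.saved_dfar = cpu.dfar;
    current.in_fault   = true;
}
```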
For now, however, I managed to pass the IP via the pagefault report to the fault handler and can identify the corresponding thread with it. But I guess that there could be some corner case where two threads try to access different addresses from the same instruction, so I might still look into how to do this using DFAR.
This corner case is pretty common. Imagine a loop that writes a large contiguous RAM region. In this case you would have only one write instruction that reads the RAM address from a GPR whose content is incremented or decremented with each iteration.
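A small model of such a loop shows why the IP alone cannot stand in for the fault address. All names are hypothetical, and every store is assumed to fault.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// One recorded page fault: where the instruction was, what it touched.
struct Fault { std::uint32_t ip; std::uint32_t addr; };

// Model of a loop whose single store instruction at 'store_ip' writes
// 'count' consecutive 32-bit words starting at 'base'.
std::vector<Fault> trace_fill_loop(std::uint32_t store_ip,
                                   std::uint32_t base, unsigned count)
{
    std::vector<Fault> faults;
    for (unsigned i = 0; i < count; i++)
        faults.push_back(Fault{store_ip, base + 4 * i});
    return faults;
}

// Number of distinct instruction pointers in a trace.
unsigned distinct_ips(std::vector<Fault> const &faults)
{
    std::set<std::uint32_t> seen;
    for (Fault const &f : faults) seen.insert(f.ip);
    return static_cast<unsigned>(seen.size());
}

// Number of distinct fault addresses in a trace.
unsigned distinct_addrs(std::vector<Fault> const &faults)
{
    std::set<std::uint32_t> seen;
    for (Fault const &f : faults) seen.insert(f.addr);
    return static_cast<unsigned>(seen.size());
}
```

For a loop of eight iterations the trace contains one distinct IP but eight distinct addresses.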
How do you get the faulter to continue execution without attaching a dataspace?
By calling faulter->continue_after_resolved_fault(). You're completely right, it doesn't really continue execution as I haven't increased the instruction pointer yet, nor did I attach the dataspace. But it faults again at the same spot, which shows me that it *would* work if I had emulated the instruction and increased the IP.
No, continue_after_resolved_fault DOES continue execution (although the IP isn't yet the one you need). What I meant is that you can't do this in vanilla Genode from userland (not core) without attaching an appropriate dataspace. So, you have already extended the RM session by an RPC that triggers the core-internal 'continue_after_resolved_fault'? I think this is the best way to do it in your situation.
Cheers, Martin
Hi Martin,
No, continue_after_resolved_fault DOES continue execution (although the IP isn't yet the one you need). What I meant is that you can't do this in vanilla Genode from userland (not core) without attaching an appropriate dataspace. So, you have already extended the RM session by an RPC that triggers the core-internal 'continue_after_resolved_fault'? I think this is the best way to do it in your situation.
Yes, this works. Currently I get the thread state of the faulting thread, increase the IP by one instruction, and continue the thread. (The instruction emulation is missing, but the rest looks like it works.) My current problem is, as explained in the other mail, that I don't know how to find the memory / the instruction that is pointed to by the IP; I would like to do it without using imprints. I am the parent of the target (and it is the only child) and I can access its sessions. But I don't know where to look for this memory. Once I have the instruction, I could emulate it.
Another question of the other mail I could answer myself:
Another thing that I don't completely understand: The pagefault report includes the memory address where the pagefault occurred. I can successfully find the corresponding dataspace. Experimenting a bit showed me that the reported address seems to be 8-byte-aligned. (Because incrementing the accessed address in the test application byte by byte only results in an 8-byte jump of the reported address 'state.addr' every 8 bytes. Inside an 8-byte group it stays the same.) But how can I find out which of the 8 byte(s) was actually accessed? Especially considering that single-byte access doesn't have to be aligned. I think that for your emulator this information was not necessary, so [1] doesn't provide it. But is it even contained in the instruction?
After peeking into a disassembly, it looks like the second half of the instruction contains the exact address, which now seems logical to me as the processor somehow needs to know what exactly should be read/written.
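One hedged refinement: for the ARM form that a disassembler prints as [rN, #imm], the low 12 bits hold an offset relative to a base register rather than an absolute address, so the exact byte would follow from the base register's value in the thread state plus or minus that offset. A minimal sketch with hypothetical helper names:

```cpp
#include <cstdint>

// Effective address of an ARM load/store with immediate offset:
// base register value plus or minus the 12-bit immediate.
std::uint32_t effective_addr(std::uint32_t base, std::uint32_t imm12, bool add)
{
    return add ? base + imm12 : base - imm12;
}

// Round an address down to the 8-byte granularity observed in the report.
std::uint32_t align8(std::uint32_t addr)
{
    return addr & ~std::uint32_t{7};
}
```

If align8(effective_addr(...)) equals the reported 'state.addr', the computed address identifies the byte inside the 8-byte group.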
So the missing piece of the puzzle is getting the instruction. Any hint/idea?
Thanks so far and best regards, Josef
[1] os/src/vinit/arm_v7a/instruction.h: load_store()
Hello,
I circumvented the problem of finding the correct dataspace by letting the pagefault handler open another ROM connection to the binary at its instantiation. The ELF file contains the VM address of the code section, which allows me to fetch the correct instruction when a pagefault happens.
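The lookup amounts to translating the faulting IP into an offset within the ELF image via the loadable segments. The sketch below uses a reduced, hypothetical Segment type whose fields mirror p_vaddr, p_offset, and p_filesz of an ELF program header; parsing the headers themselves is left out.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, reduced view of one loadable ELF segment.
struct Segment
{
    std::uint32_t vaddr;    // virtual start address (p_vaddr)
    std::uint32_t offset;   // start of the segment within the file (p_offset)
    std::uint32_t filesz;   // bytes of the segment present in the file (p_filesz)
};

// Translate 'ip' into an offset within the ELF image. Returns false if no
// segment contains a complete 4-byte instruction at that address.
bool ip_to_file_offset(std::vector<Segment> const &segs, std::uint32_t ip,
                       std::uint32_t &out)
{
    for (Segment const &s : segs) {
        if (ip < s.vaddr)
            continue;

        std::uint32_t const delta = ip - s.vaddr;
        if (s.filesz >= 4 && delta <= s.filesz - 4) {
            out = s.offset + delta;
            return true;
        }
    }
    return false;
}
```

The instruction word can then be read from the ROM dataspace at the returned offset.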
Then, I can use the instruction decoder (see Vinit @ [1]):
bool ldst = Instruction::load_store(instr, writes, state.format, state.reg);
size_t access_size = state.format == Region_map::LSB8 ? 1 :
                     (state.format == Region_map::LSB16 ? 2 : 4);
With this information I should be able to emulate the instruction:
if (!writes) {
    state.value = 0;
    memcpy(&state.value, addr + state.addr, access_size);
    thread_state.set_gpr(state.reg, state.value);
} else {
    thread_state.get_gpr(state.reg, state.value);
    memcpy(addr + state.addr, &state.value, access_size);
}
(For set_gpr() and get_gpr() see Vinit @ [2])
I have a simple sample application that for now only reads repeatedly a memory value and prints it:
    unsigned n = 0x12abcdef;
    while (1) {
        log(n, " sheep");
        timer.msleep(1000);
    }
I let this app run normally for a few seconds and then detach the dataspace to see if emulation is working... it isn't.
The register write - i.e. the simulated load (memory read) instruction - results in weird app behavior: The app outputs "0 sheep" - wrong number - without pausing 1 second. If I comment out the register write, it outputs the same but pauses 1 second like it should. The instruction decoder gives the following info (pf IP is 0x1001bb0):
instruction: e599116c, load, LSB32, reg: 1
Which conforms with what objdump tells me:
1001bb0: e599116c ldr r1, [r9, #364] ; 0x16c
Still, it doesn't work. By trial and error (going through all regs) I found out that writing to r9 instead of r1 (in the pf handler) gives me the desired behaviour: Printing the correct number and waiting 1 second.
But I have no idea why my code above doesn't work (and why r9 does). I know the target register number and write the value into it. Vinit also seems to do it like this: [3] Isn't this exactly what the processor would do? What could I be missing?
Best regards, Josef
[1] os/src/vinit/arm_v7a/instruction.h
[2] base-hw/include/arm_v7/base/thread_state.h
[3] state() & processed() in os/src/vinit/include/rm_session/component.h
Hi Josef,
During the last weeks I was out of office which is why this answer is a bit delayed.
On 26/02/18 at 21:26, Josef Stark wrote:
Hello,
I circumvented the problem of finding the correct dataspace by letting the pagefault handler open another ROM connection to the binary at its instantiation. The ELF file contains the VM address of the code section, which allows me to fetch the correct instruction when a pagefault happens.
As far as I can see, this is a good idea for static clients. At least it spares you the duplicated bookkeeping of guest attachments in your observer component. However, with dynamic linking or JIT compilation, the RM-inception approach might be the better choice.
Then, I can use the instruction decoder (see Vinit @ [1]):
    bool ldst = Instruction::load_store(instr, writes, state.format,
                                        state.reg);
    size_t access_size = state.format == Region_map::LSB8 ? 1 :
                         (state.format == Region_map::LSB16 ? 2 : 4);
With this information I should be able to emulate the instruction:
    if (!writes) {
        state.value = 0;
        memcpy(&state.value, addr + state.addr, access_size);
        thread_state.set_gpr(state.reg, state.value);
    } else {
        thread_state.get_gpr(state.reg, state.value);
        memcpy(addr + state.addr, &state.value, access_size);
    }
(For set_gpr() and get_gpr() see Vinit @ [2])
I have a simple sample application that for now only reads repeatedly a memory value and prints it:
    unsigned n = 0x12abcdef;
    while (1) {
        log(n, " sheep");
        timer.msleep(1000);
    }
I let this app run normally for a few seconds and then detach the dataspace to see if emulation is working... it isn't.
The register write - i.e. the simulated load (memory read) instruction - results in weird app behavior: The app outputs "0 sheep" - wrong number - without pausing 1 second. If I comment out the register write, it outputs the same but pauses 1 second like it should. The instruction decoder gives the following info (pf IP is 0x1001bb0):
instruction: e599116c, load, LSB32, reg: 1
Which conforms with what objdump tells me:
1001bb0:      e599116c       ldr    r1, [r9, #364] ; 0x16c
Still, it doesn't work. By trial and error (going through all regs) I found out that writing to r9 instead of r1 (in the pf handler) gives me the desired behaviour: Printing the correct number and waiting 1 second.
But I have no idea why my code above doesn't work (and why r9 does). I know the target register number and write the value into it. Vinit also seems to do it like this: [3] Isn't this exactly what the processor would do? What could I be missing?
First, I would look at the assembly context at 0x1001bb0 especially what is done with R1/R9 in the subsequent instructions. You might also play around with explicit register assignment:
register unsigned my_var asm("r1") = n;
or with inline assembler, to get a better understanding of what is actually going wrong from the perspective of the client.
Maybe the client returns from emulation with a bad IP which you could check by adding an assembler instruction with an observable side effect directly behind the LDR (or using a debugger).
You could even write an inline assembler snippet that writes the whole GPR state after the LDR to RAM when running without emulation. This way, you can compare the GPR states of the emulation case (inside observer) and the non-emulation case.
Cheers, Martin
Hi Martin,
On 02/28/2018 04:02 PM, Martin Stein wrote:
Hi Josef, [...] First, I would look at the assembly context at 0x1001bb0 especially what is done with R1/R9 in the subsequent instructions. You might also play around with explicit register assignment:
register unsigned my_var asm("r1") = n;
or with inline assembler, to get a better understanding of what is actually going wrong from the perspective of the client.
Maybe the client returns from emulation with a bad IP which you could check by adding an assembler instruction with an observable side effect directly behind the LDR (or using a debugger).
You could even write an inline assembler snippet that writes the whole GPR state after the LDR to RAM when running without emulation. This way, you can compare the GPR states of the emulation case (inside observer) and the non-emulation case.
I experimented a bit and after comparing the actual register contents of R0-R15 inside the child application with the contents of the Thread_state register backup (R0-R15) that is delivered to my fault handler, it seems like Genode (or Fiasco.OC or the glue code) delivers the registers in a strange mapping, where some original regs are mapped to different regs in the backup, and some do not appear at all. The mapping is like this:
Thread_state | Child | Child - Alternative Possibility
---------------------------------------------------------
R0           | R9    | R7
R1           | R10   | R4
R2           | R11   |
R3           |       |
R4           |       |
R5           |       |
R6           |       |
R7           |       |
R8           | R0    |
R9           | R1    |
R10          | R2    |
R11          | R3    |
R12          | R12   | R5
R13          | R13   |
R14          | R14   |
R15          | R15   |
             | R6    |
             | R8    |
(Alternative Possibility means I couldn't tell which of the two is actually correct due to equal values. The Thread_state registers that have no mapping contained values that did not match any of the original register contents; some of those values appeared in two distinct Thread_state registers. I assured myself that the child didn't modify the registers before dumping the contents.) While this looks pretty strange, I verified the 'mapping' on a few occasions and after incorporating the mapping into the instruction emulator/the redundant memory writer, it does what it should, at least for a limited test case (which doesn't access one of the unmapped registers).
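The comparison described above can be automated with a small helper. This is only a sketch of the matching step, assuming both register snapshots are available as arrays of 16 values. The names Regs and mapping_candidates are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

using Regs = std::array<uint32_t, 16>;

// For every slot of the state delivered to the fault handler, collect
// the child registers that held the same value at dump time. A slot
// with several candidates is ambiguous, an empty slot matched nothing.
inline std::array<std::vector<unsigned>, 16>
mapping_candidates(Regs const &reported, Regs const &child)
{
    std::array<std::vector<unsigned>, 16> result;
    for (unsigned slot = 0; slot < 16; slot++)
        for (unsigned reg = 0; reg < 16; reg++)
            if (reported[slot] == child[reg])
                result[slot].push_back(reg);
    return result;
}
```

Loading a distinct marker value into each general-purpose register before triggering the fault would likely remove the ambiguous rows, because equal values are the only reason a slot can have more than one candidate.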
I should mention that we use a slightly modified Fiasco.OC kernel and kernel interface, since in the unmodified Genode 16.08/Fiasco.OC, calling Cpu_thread_component::state() [1] internally calls Platform_thread::state(), which does not return the contents of all registers. Instead our modified Platform_thread::state() calls our own method all_regs() [2] which does that. The modified Fiasco.OC kernel source files are [3] and [4] with modifications marked with a comment mentioning "rtcr". But no remapping or other strange things are done there.
My guess is that this register magic is not happening in the base-hw version of Genode (since Vinit instruction emulation is not using any remapping and still works), but maybe you still have a clue of what is going on there.
Best regards, Josef
[1] state(): repos/base/src/core/cpu_thread_component.cc
[2] all_regs(): https://github.com/jmstark/cr-genode/blob/red_mem/repos/base-focnados/src/co...
[3] https://github.com/argos-research/foc/blob/checkpointRestore/l4/pkg/l4sys/in...
[4] https://github.com/argos-research/foc/blob/checkpointRestore/kernel/fiasco/s...
Hi Josef,
On 12/03/18 at 18:19, Josef Stark wrote:
I experimented a bit and after comparing the actual register contents of R0-R15 inside the child application with the contents of the Thread_state register backup (R0-R15) that is delivered to my fault handler, it seems like Genode (or Fiasco.OC or the glue code) delivers the registers in a strange mapping, where some original regs are mapped to different regs in the backup, and some do not appear at all. The mapping is like this:
Thread_state | Child | Child - Alternative Possibility
---------------------------------------------------------
R0           | R9    | R7
R1           | R10   | R4
R2           | R11   |
R3           |       |
R4           |       |
R5           |       |
R6           |       |
R7           |       |
R8           | R0    |
R9           | R1    |
R10          | R2    |
R11          | R3    |
R12          | R12   | R5
R13          | R13   |
R14          | R14   |
R15          | R15   |
             | R6    |
             | R8    |
(Alternative Possibility means I couldn't tell which of the two is actually correct due to equal values. The Thread_state registers that have no mapping contained values that did not match any of the original register contents; some of those values appeared in two distinct Thread_state registers. I assured myself that the child didn't modify the registers before dumping the contents.) While this looks pretty strange, I verified the 'mapping' on a few occasions and after incorporating the mapping into the instruction emulator/the redundant memory writer, it does what it should, at least for a limited test case (which doesn't access one of the unmapped registers).
I should mention that we use a slightly modified Fiasco.OC kernel and kernel interface, since in the unmodified Genode 16.08/Fiasco.OC, calling Cpu_thread_component::state() [1] internally calls Platform_thread::state(), which does not return the contents of all registers. Instead our modified Platform_thread::state() calls our own method all_regs() [2] which does that. The modified Fiasco.OC kernel source files are [3] and [4] with modifications marked with a comment mentioning "rtcr". But no remapping or other strange things are done there.
My guess is that this register magic is not happening in the base-hw version of Genode (since Vinit instruction emulation is not using any remapping and still works), but maybe you still have a clue of what is going on there.
Sorry, but I don't have a clue what the problem is. It looks as if there are issues with your all_regs syscall. Maybe the regs array layout is not as you expect it to be, or the v->mr array gets modified afterwards in the kernel? In your position, I would look into what really happens during the syscall, which values get written to v->mr, and what is done with them.
In contrast to FOC, the base-hw kernel was written with the sole purpose of fulfilling the requirements of Genode's core/userland in the simplest manner possible, and thus the kernel supported returning all registers of a thread right from the beginning.
Cheers, Martin
Hey Martin,
the next problem that I'm facing now is that I don't know how to access the instruction that caused the pagefault. I have the instruction pointer but not the instruction itself (opcode and operands). Your vinit code [1] uses an imprint to identify the corresponding Rm_client and then find the correct region by the IP address:
    Rm_client * const rm_client = Rm_client::by_id(state.imprint);
    addr_t off, ip = client_state->ip;
    Rm_session_component * const rm = rm_client->session();
    Region * const region = rm->_find_region((void *)ip, &off);
    Dataspace_capability ds_cap = region->ds_cap();
    void * local = env()->rm_session()->attach(ds_cap, 0, region->offset());
    unsigned instr = *(unsigned *)((addr_t)local + off);
However, I'm again wondering if there's an easier way to find the dataspace considering that my checkpointer only has this one child (and I also know the binary name), and for the thread state it was already possible after you explained it to me. Where do I look? I tried looking through the RAM dataspaces, but so far, trial and error didn't yield any success. Probably because I'm doing something wrong, and due to the architectural changes introduced between 12.11 and 16.08 it's again hard for me to re-use vinit code, so maybe you can push me in the right direction again.
Another thing that I don't completely understand: The pagefault report includes the memory address where the pagefault occurred. I can successfully find the corresponding dataspace. Experimenting a bit showed me that the reported address seems to be 8-byte-aligned. (Because incrementing the accessed address in the test application byte by byte only results in an 8-byte jump of the reported address 'state.addr' every 8 bytes. Inside an 8-byte group it stays the same.) But how can I find out which of the 8 byte(s) was actually accessed? Especially considering that single-byte access doesn't have to be aligned. I think that for your emulator this information was not necessary, so [2] doesn't provide it. But is it even contained in the instruction?
Best regards, Josef
[1] os/src/vinit/include/rm_session/component.h: state()
[2] os/src/vinit/arm_v7a/instruction.h: load_store()