Hi Josef,
On 29/01/18 at 15:22, Stark, Josef wrote:
At least in the current Genode release, the 'unresolved_page_fault' flag is set in [1], and hence in [2], when core gives up resolving the fault and hands it over to the userland handler. I assume you're using a patched fork of an older Genode version, so maybe things are different in your case. However, have you had a look at the GDB monitor?
Yes, I've seen in your referenced code how GDB monitor uses this flag. We're using Genode 16.08, but I can't build the gdb_monitor.run sample, either in vanilla 16.08 or in our patched version: "cp: cannot stat 'bin/ld.lib.so': No such file or directory". So next I'll look into how it's handled in newer versions.
Please consider that the GDB monitor test is supported only on Nova and real hardware (no Qemu), as the run script should tell you. You should also definitely create a fresh build dir when switching versions, because the build system might have changed in the meantime, which can lead to unpleasant errors.
Anyway, I tried to read the DFAR register as you recommended, but it seems I didn't quite understand what you meant, as an L4 hacker responded to me in [3] that DFAR/DFSR registers are non-thread-specific CPU registers, and thus cannot be used to differentiate between the threads like I tried.
The DFAR/DFSR are registers of the CPU that might or might not be saved per thread by a kernel. As far as I understand the mail from the L4 guys, Fiasco.OC doesn't save DFAR/DFSR to the thread state. However, the DFAR/DFSR are thread specific as their state corresponds to a specific fault that is caused by a specific thread. So, I assume you would have to modify fault-handling code (where DFAR/DFSR are inspected) or even exception-entry code (where general purpose registers are saved) in Fiasco.OC to gain access to DFAR/DFSR in userland.
For now, however, I managed to pass the IP via the pagefault report to the fault handler and can identify the corresponding thread with it. But I guess that there could be some corner case where two threads try to access different addresses from the same instruction, so I might still look into how to do this using DFAR.
This corner case is pretty common. Imagine a loop that writes a large contiguous RAM region. In this case you would have only one write instruction, which reads the RAM address from a GPR whose content is decremented or incremented with each iteration.
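To make this concrete, here is a minimal sketch in plain C++ (not Genode code). The names 'fill_region', 'pages_touched', and the 4 KiB 'PAGE_SIZE' are assumptions made only for illustration. A typical compiler turns the loop body into a single store instruction, so every page of the region would be touched, and could fault, from the same IP:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical page size, assumed only for this illustration.
constexpr std::size_t PAGE_SIZE = 4096;

// Fill a contiguous region byte by byte. The loop body is one store whose
// target address comes from a register that advances on each iteration,
// so all faults raised by this loop would report the same instruction
// pointer while the fault address differs from page to page.
inline void fill_region(volatile std::uint8_t *base, std::size_t size,
                        std::uint8_t value)
{
    for (std::size_t i = 0; i < size; ++i)
        base[i] = value;
}

// Number of distinct pages spanned by 'size' bytes starting at 'base',
// i.e. an upper bound on how many separate page faults the loop above
// could raise from that one store instruction.
inline std::size_t pages_touched(std::uintptr_t base, std::size_t size)
{
    if (size == 0)
        return 0;

    std::uintptr_t const first = base / PAGE_SIZE;
    std::uintptr_t const last  = (base + size - 1) / PAGE_SIZE;
    return static_cast<std::size_t>(last - first + 1);
}
```

If two threads ran such a loop over different regions at the same time, a handler keyed only on the IP could not tell their faults apart, whereas the fault address would differ.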
How do you get the faulter to continue execution without attaching a dataspace?
By calling faulter->continue_after_resolved_fault(). You're completely right, it doesn't really continue execution as I haven't increased the instruction pointer yet, nor did I attach the dataspace. But it faults again at the same spot, which shows me that it *would* work if I had emulated the instruction and increased the IP.
No, continue_after_resolved_fault DOES continue execution (although the IP isn't yet the one you need). What I meant is that you can't do this in vanilla Genode from userland (not core) without attaching an appropriate dataspace. So, you have already extended the RM session by an RPC that triggers the core-internal 'continue_after_resolved_fault'? I think this is the best way to do it in your situation.
Cheers, Martin