Hi Roman,
On 26.03.20 15:08, Roman Iten wrote:
> I'm lost and would appreciate some inspiration on how to further debug a problem in a scenario that is similar to `run/depot_query`.
Can you please share which Genode version you are using? In particular, I want to ensure that you are using commit [1]. (It would not explain the different behavior between x86_64 and ARM, though.)
[1] https://github.com/genodelabs/genode/commit/f85ec313de2cb723b1ca004866f03163...
For investigating issues like this, I usually start with looking at the page-fault address. What does 'dmesg' tell you? If the address lies within the stack area, the issue might be a stack overflow. If the address is very small, it would hint at a de-referenced null pointer. What is the code around the faulting instruction pointer doing (using 'objdump -lSd' on the faulting binary and searching for the instruction pointer)?
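As a rough illustration of this first triage step, the following Python sketch sorts a page-fault address into the two cases mentioned above. The function name, the 4 KiB null-page limit, and the stack-area base and size in the example are assumptions made for illustration only; the real stack-area bounds depend on the platform and have to be looked up for the scenario at hand.

```python
def classify_fault_address(addr, stack_base, stack_size, null_limit=0x1000):
    """Give a first guess about the cause of a page fault.

    addr        -- faulting address as reported by the kernel log
    stack_base  -- start of the stack area (assumption: supplied by the caller)
    stack_size  -- size of the stack area in bytes (assumption, see above)
    null_limit  -- addresses below this value count as "very small"
                   (assumed to be one 4 KiB page)
    """
    if addr < 0:
        raise ValueError("address must be non-negative")
    if addr < null_limit:
        return "likely null-pointer dereference"
    if stack_base <= addr < stack_base + stack_size:
        return "possibly stack overflow"
    return "unclassified"


# Placeholder values, not taken from any particular platform.
EXAMPLE_STACK_BASE = 0x40000000
EXAMPLE_STACK_SIZE = 0x10000000

print(classify_fault_address(0x8, EXAMPLE_STACK_BASE, EXAMPLE_STACK_SIZE))
print(classify_fault_address(0x400ff000, EXAMPLE_STACK_BASE, EXAMPLE_STACK_SIZE))
```

The result is only a hint about where to look next. The disassembly around the faulting instruction pointer is still needed to confirm it.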
Since you are using base-linux, have you tried obtaining a backtrace via the GNU debugger? You may find the steps given at [2] useful.
[2] https://genodians.org/ssumpf/2019-04-26-java-gdb
I vaguely remember that you have tweaked the tool chain for base-linux on ARM using hard fp. Does the problem also occur with the original tool chain? I'm asking just to rule out tricky tool-chain-related technicalities.
Is the problem deterministic? If yes, have you tried the same scenario (same binaries) on another ARM kernel, e.g., base-hw on pbxa9? By cross-correlating different kernels, you can see whether the problem is specific to Linux or generally applies to 32-bit ARM.
You mention that the problem occurs with one particular deploy.config but not with another. So you may try gradually turning the one (bad) version into the other (good) and see when it stops breaking (bisecting the issue). Similarly, you could try reducing the "bad" configuration as far as possible while the problem persists, eventually reaching a minimal test case.
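The bisecting idea can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the differences between the good and the bad deploy.config are given as an ordered list of individual changes, the hypothetical callback `fails` builds and runs the scenario with a given subset of changes applied and reports whether it faults, and the failure is deterministic and caused by a single change. None of these names come from Genode's tooling.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_breaking_change(changes: Sequence[T],
                          fails: Callable[[Sequence[T]], bool]) -> T:
    """Binary-search for the change that first makes the scenario fail.

    Assumes that applying no change gives the good configuration, that
    applying all changes gives the bad one, and that once a prefix of
    the changes fails, every longer prefix fails too.
    """
    if fails([]):
        raise ValueError("the good configuration already fails")
    if not fails(changes):
        raise ValueError("the bad configuration does not fail")

    # Invariant: changes[:lo] passes, changes[:hi] fails.
    lo, hi = 0, len(changes)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(changes[:mid]):
            hi = mid
        else:
            lo = mid
    return changes[hi - 1]


# Stand-in for a real test run: pretend the third change is the culprit.
example_changes = ["change-a", "change-b", "change-c", "change-d"]
culprit = first_breaking_change(example_changes,
                                lambda applied: "change-c" in applied)
print(culprit)
```

With n differences, this needs roughly log2(n) test runs instead of n. If the failure depends on a combination of changes, the monotonicity assumption no longer holds and the manual reduction toward a minimal test case is the safer route.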
Good luck!
Norman