Hello Genodians
Since we rolled out the update to 24.11, we have been seeing the above message about once or twice a day in the logs of our testing pipeline.
It happens on different hardware (Up-Xtreme and imx8mp_iot_gate in the last two days) and in different run scripts. The run scripts range from a fairly simple TPM run script to complicated run scripts that test the resilience of the operating system to stopping/starting of whole subsystems and reboots.
Currently, we don't have a case where this happens reproducibly. It could be related to https://github.com/genodelabs/genode/issues/5382, but it could also be something completely different.
How could we narrow down the issue and gather logs to identify the cause of it?
Kind regards, Pirmin
Hi Pirmin,
On Thu, Jan 16, 2025 at 10:34:40AM +0100, Pirmin Duss via users wrote:
> Hello Genodians
> Since we rolled out the update to 24.11, we have been seeing the above message about once or twice a day in the logs of our testing pipeline.
> It happens on different hardware (Up-Xtreme and imx8mp_iot_gate in the last two days) and in different run scripts. The run scripts range from a fairly simple TPM run script to complicated run scripts that test the resilience of the operating system to stopping/starting of whole subsystems and reboots.
> Currently, we don't have a case where this happens reproducibly. It could be related to https://github.com/genodelabs/genode/issues/5382, but it could also be something completely different.
Unfortunately, the above message, although quite alarming, does not provide much information about the reason for the exception inside the kernel. It might be an MMU fault or a system call triggered by the kernel. The latter was actually the case in issue 5382, which you referred to: as a potential consequence of C++ exception handling in the C++ support library, a kernel call was triggered. We are in the process of removing all C++ exceptions from core and the base library. Once that is done, we can build and link base-hw's core, which implicitly is the kernel image, without C++ exception support, and thereby guarantee that such a system call inside the kernel cannot occur. Nonetheless, after looking at the remaining throws, we are almost sure that no C++ exception is left in the kernel context of base-hw.
To make it short: I don't think it's related to the above issue.
> How could we narrow down the issue and gather logs to identify the cause of it?
You need more information about the kernel context that triggered the recursive kernel entry. I had already planned to provide code for this at the end of last year but was busy with other things. Now I've opened an issue:
https://github.com/genodelabs/genode/issues/5425
to document further progress on it. I intend to tackle this very soon.
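For illustration only: the following is a minimal, self-contained sketch of the kind of information meant here, namely what caused the nested entry (MMU fault or system call) and at which instruction pointer. It is not the code tracked in the issue above and not base-hw's actual implementation. All names (Entry_cause, Entry_context, Kernel_entry_guard) are hypothetical, and the sketch models the idea in plain user-level C++.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical reasons for entering the kernel (not Genode's real types).
enum class Entry_cause { NONE, MMU_FAULT, SYSTEM_CALL };

// Hypothetical snapshot of the CPU state taken at kernel entry.
struct Entry_context {
    Entry_cause   cause;
    std::uint64_t ip;    // instruction pointer at the time of entry
    std::uint64_t addr;  // fault address (only meaningful for MMU faults)
};

// Tracks whether kernel code is currently running and remembers the
// context of an entry that happened while already inside the kernel.
struct Kernel_entry_guard {
    bool          inside         = false;
    bool          recursive_seen = false;
    Entry_context first  { Entry_cause::NONE, 0, 0 };  // regular entry
    Entry_context nested { Entry_cause::NONE, 0, 0 };  // recursive entry

    // Returns false if the entry occurred while already inside the kernel.
    bool enter(Entry_context const &ctx)
    {
        if (inside) {
            recursive_seen = true;
            nested = ctx;
            return false;
        }
        inside = true;
        first  = ctx;
        return true;
    }

    void leave() { inside = false; }
};

static char const *cause_name(Entry_cause c)
{
    switch (c) {
    case Entry_cause::MMU_FAULT:   return "MMU fault";
    case Entry_cause::SYSTEM_CALL: return "system call";
    default:                       return "none";
    }
}

// Prints the details one would want in the log instead of a bare message.
void report(Kernel_entry_guard const &g)
{
    if (!g.recursive_seen)
        return;
    std::printf("recursive kernel entry: cause=%s ip=0x%llx addr=0x%llx\n",
                cause_name(g.nested.cause),
                static_cast<unsigned long long>(g.nested.ip),
                static_cast<unsigned long long>(g.nested.addr));
}
```

In this sketch, a second enter() before leave() is recorded as a recursive entry, and report() prints its cause and instruction pointer, which is the detail missing from the bare log message.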
Regards Stefan
Hello Stefan
Thanks for the explanation and creating the issue.
On 1/17/25 17:17, Stefan Kalkowski wrote:
> You need more information about the kernel context that triggered the recursive kernel entry. I had already planned to provide code for this at the end of last year but was busy with other things. Now I've opened an issue:
> https://github.com/genodelabs/genode/issues/5425
> to document further progress on it. I intend to tackle this very soon.
We will gladly support you in testing and gathering data regarding this.
Cheers, Pirmin