Hi Pirmin,
On 23.02.21 16:34, Pirmin Duss wrote:
Eventually init throws 'Genode::Bit_array_base::Invalid_clear' and terminates (see the backtrace below).
Maybe it is noteworthy that packages with sub inits that have components that use resource saturation are started.
Does anybody have an idea on how to proceed with debugging the problem?
I have never seen this exception before, which got me thinking.
The backtrace is already quite insightful. Thanks for posting it.
My interpretation: The problem occurs while closing a session. The server informs init that it has finished the closing of a session (by calling Parent::session_response), which prompts init to discard the local copy of the session capability. The reference count of this capability reaches zero (which is expected), triggering the deletion of the capability's local ID from init's capability space. For some reason, the corresponding slot is already marked as free, which is unexpected.
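To make this interpretation concrete, here is a toy model in plain C++. It is not init's or the base library's actual code: `Id_slots`, `Cap_entry`, and `release` are made-up names, and only the exception name mirrors the one from the backtrace. It assumes that two bookkeeping entries end up referring to the same ID, which is just one conceivable way to arrive at a slot that is already free.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for the exception seen in the backtrace.
struct Invalid_clear : std::runtime_error
{
    Invalid_clear() : std::runtime_error("slot is already marked as free") { }
};

// Toy ID allocator backed by a bit array (one bit per capability ID).
class Id_slots
{
    private:

        std::bitset<64> _used { };

    public:

        void set(std::size_t id)
        {
            assert(id < _used.size());
            _used.set(id);
        }

        // Frees a slot, throws if the slot is not in use.
        void clear(std::size_t id)
        {
            if (!_used.test(id))
                throw Invalid_clear();

            _used.reset(id);
        }

        bool used(std::size_t id) const { return _used.test(id); }
};

// Toy entry of a capability space: a local ID plus a reference count.
struct Cap_entry
{
    std::size_t id;
    unsigned    ref_count;
};

// Drops one reference and frees the ID slot once the count reaches zero.
inline void release(Id_slots &slots, Cap_entry &cap)
{
    if (cap.ref_count > 0 && --cap.ref_count == 0)
        slots.clear(cap.id);
}
```

In this model, calling `release` on a second `Cap_entry` that carries the same `id` after the first one has already freed the slot throws `Invalid_clear`. The interesting question for the real scenario is which code path freed the slot first.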
To investigate, it is important to understand the situation a bit better. For example, which session type is the troublemaker? Who is the client, and who is the server? Of course, plain instrumentation with debug messages would produce too much noise. I would hence add conditional messages that are printed only in "interesting" situations. The attached patch, for instance, prints a message with session information, but only in the unexpected exception case. The output may help to form a better mental picture of what is happening, e.g., whether the session is an environment session or a higher-level session.
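The general pattern of such conditional instrumentation can be sketched in plain C++ as follows. This is not the attached patch: `Session_info`, `close_with_diagnostics`, and the message format are hypothetical, and the message is collected in a string only to keep the example self-contained.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical summary of the session that is being closed.
struct Session_info
{
    std::string service;  // session type, e.g., "ROM" or "Timer"
    std::string client;   // label of the client
    std::string server;   // name of the server
};

// Executes 'close_fn'. Nothing is logged in the normal case. If 'close_fn'
// throws, the session details are appended to 'log' and the exception is
// passed on unchanged.
template <typename FN>
void close_with_diagnostics(Session_info const &session,
                            std::string        &log,
                            FN           const &close_fn)
{
    try {
        close_fn();
    }
    catch (...) {
        log += "unexpected exception while closing session:"
               " service=" + session.service +
               " client="  + session.client  +
               " server="  + session.server  + "\n";
        throw;
    }
}
```

Because the message is generated only in the failure path, the log stays quiet during the many regular session closes and contains exactly one line for the problematic session.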
Regarding general debugging hints, it depends on how "sporadic" the issue is. Can you easily provoke it? If not, I would first work on the reproducibility, e.g., by increasing the frequency of init reconfigurations. I'd also try to strip down the complexity of the scenario step by step, e.g., by removing one component after another while validating that the problem persists.
Best wishes! I'm eager to learn what just happened here.
Norman