Hi all,
I guess you are familiar with the problem of stack overflows in multi-threaded components. I have already run into a bunch of weird errors that were hard to track down until I remembered that they could simply be caused by a stack that is too small.
I know that Genode's policy is to use single-threaded components. Occasionally, however, one needs additional threads, especially when using 3rd party libraries.
Stack overflows are not only very annoying and time-consuming to debug but can (imo) also be mitigated rather easily. I therefore think it would be worth implementing a protection or detection mechanism for this in Genode.
The "problem" here is actually the `Stack_allocator`, which places the stacks of the component threads consecutively in memory. That is, if one thread exceeds its allocated stack area, it likely corrupts the stack of another thread (of the same component).
Hence, one could improve the `Stack_allocator` so that it keeps an unmapped guard page between the stacks in order to cause a page fault on any stack overflow. This comes at the cost of slightly increased complexity and possibly also increased memory consumption (because it requires a second-level translation table).
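To make the guard-page idea a bit more concrete, here is a rough sketch of the address arithmetic only. All names and sizes (`page_bytes`, `stack_bytes`, `Slot`, `slot_for`, `in_guard`) are made up for illustration and are not taken from Genode's `Stack_allocator`. It assumes stacks grow downward, so the guard page sits at the low end of each slot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sizes, chosen only for this sketch.
constexpr std::size_t page_bytes  = 4096;
constexpr std::size_t stack_bytes = 16 * page_bytes;
constexpr std::size_t slot_bytes  = stack_bytes + page_bytes; // stack + guard

struct Slot {
    std::uintptr_t guard_base; // lowest address of the slot, left unmapped
    std::uintptr_t stack_base; // lowest usable stack address
    std::uintptr_t stack_top;  // one past the highest stack address
};

// Compute the layout of slot `index` within a stack area of `slots` slots.
inline Slot slot_for(std::uintptr_t area_base, std::size_t index, std::size_t slots)
{
    assert(index < slots);
    std::uintptr_t const guard = area_base + index * slot_bytes;
    return { guard, guard + page_bytes, guard + slot_bytes };
}

// True if `addr` falls into the guard page of any slot, i.e. an access
// there would page-fault instead of touching a neighbouring stack.
inline bool in_guard(std::uintptr_t area_base, std::size_t slots, std::uintptr_t addr)
{
    if (addr < area_base) return false;
    std::uintptr_t const off = addr - area_base;
    if (off / slot_bytes >= slots) return false;
    return off % slot_bytes < page_bytes;
}
```

With this layout, a thread that runs one byte below its `stack_base` lands in its own guard page before it can reach the stack of the slot beneath it. An overflow that jumps more than one page at once (e.g. a large local array) could still skip the guard.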
Alternatively, I can imagine a kernel-level (base-hw) approach which uses canaries at the top of each stack. Every time the kernel switches to a user thread, it checks whether the canary is still alive. If not, another thread's stack must have overflowed. Of course, this method is only reliable if we can assume that every memory word on the stack will be initialised (preferably sequentially); an overflow that skips over the canary word without writing to it would go unnoticed.
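As a rough illustration of the canary check, here is a small user-level simulation. It is not base-hw code: `Stack_area`, `arm`, `canary_alive`, `push_words` and the canary value are all hypothetical, and the stacks are modelled as consecutive chunks of one array that grow downward, with the canary in the highest word of each stack.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint32_t canary_value    = 0xDEADC0DEu; // arbitrary pattern
constexpr std::size_t   words_per_stack = 64;
constexpr std::size_t   num_stacks      = 4;

// Consecutive stacks: stack i occupies words [i*64, (i+1)*64).
struct Stack_area {
    std::array<std::uint32_t, words_per_stack * num_stacks> words{};

    // The canary lives in the highest word of each stack.
    static std::size_t canary_index(std::size_t stack)
    {
        assert(stack < num_stacks);
        return (stack + 1) * words_per_stack - 1;
    }

    void arm(std::size_t stack) { words[canary_index(stack)] = canary_value; }

    // What the kernel would check before switching to a thread.
    bool canary_alive(std::size_t stack) const
    {
        return words[canary_index(stack)] == canary_value;
    }
};

// Simulate thread `stack` writing `depth` words downward, starting just
// below its own canary. Writes past its slot spill into the stack below.
inline void push_words(Stack_area &area, std::size_t stack,
                       std::size_t depth, std::uint32_t value)
{
    std::size_t const top = Stack_area::canary_index(stack);
    for (std::size_t i = 1; i <= depth && i <= top; ++i)
        area.words[top - i] = value;
}
```

In this model, stack 1 can hold 63 words below its canary. Pushing a 64th word overwrites the canary of stack 0, which the check then reports on the next switch to that thread. Note that the check fires for the victim, not the offender, so the kernel would still have to work out which neighbour overflowed.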
Note that I'm not eager to implement these techniques in the near future. Nevertheless, I thought it would be great to start a discussion and collect comments and additional ideas.
Looking forward to your feedback!
Cheers Johannes