Stack overflow protection/detection

Stefan Kalkowski stefan.kalkowski at ...1...
Wed Aug 31 13:15:44 CEST 2016

Hi Johannes,

On 08/30/2016 09:40 PM, Johannes Schlatow wrote:
> Norman, thanks a lot for the clarification!
> I must admit that I only had a rather brief look at the existing code.
> Your explanation still leaves me puzzled with one little question though:
> Where does the "invalid signal-context capability" come from? I
> actually noticed that a couple of times in the past and was wondering
> what's causing this.

"invalid signal-context capability" is printed when someone uses an
invalid capability (e.g., because no signal handler was set) to submit a
signal. It can be printed for many reasons. Typically, you see it when a
fault in the thread context area cannot be resolved.

I was wondering: which kernel are you using right now? I ask because I also
stumbled over the problem that on certain kernels (e.g., the old Fiasco,
Pistachio, seL4...) we do not print an error message when a page fault
cannot be resolved within a managed region-map area (e.g., within the
thread context area). I will open an issue for this on GitHub.
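
To make such a fault report easier to interpret, here is a minimal sketch of
how a fault address could be mapped to a stack slot. It relies only on the
1 MiB, 1 MiB-aligned slots described in Norman's quoted mail below. The base
address and size of the area, and all names (STACK_AREA_BASE, STACK_AREA_SIZE,
Fault_info, classify_fault), are assumptions made up for illustration and are
not taken from the Genode sources.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout constants. Only the 1 MiB slot size comes from the
// discussion; base and size of the area are assumed for this sketch.
constexpr std::uintptr_t STACK_AREA_BASE = 0x40000000;
constexpr std::uintptr_t STACK_AREA_SIZE = 0x10000000;           // 256 MiB
constexpr std::uintptr_t SLOT_SIZE       = std::uintptr_t{1} << 20; // 1 MiB

struct Fault_info
{
    bool           in_stack_area; // false if the address lies outside the area
    std::uintptr_t slot_index;    // index of the 1 MiB slot that was hit
    std::uintptr_t slot_base;     // 1 MiB-aligned start address of that slot
};

// Classify a page-fault address relative to the (assumed) stack area.
constexpr Fault_info classify_fault(std::uintptr_t addr)
{
    if (addr < STACK_AREA_BASE || addr - STACK_AREA_BASE >= STACK_AREA_SIZE)
        return { false, 0, 0 };

    std::uintptr_t const index = (addr - STACK_AREA_BASE) / SLOT_SIZE;
    return { true, index, STACK_AREA_BASE + index * SLOT_SIZE };
}
```

With these assumed constants, a fault at STACK_AREA_BASE + 5*SLOT_SIZE + 0x100
falls into slot 5. Since only a few KiB of each slot are populated, an
unresolvable fault inside the area is a strong hint that the thread owning
that slot ran off its stack.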


> On Tue, 30 Aug 2016 20:53:34 +0200
> Norman Feske <norman.feske at ...1...> wrote:
>> Hi Johannes,
>> I'm afraid that you misinterpreted the role of the "stack allocator".
>> Stacks are actually not allocated consecutively but within a sparsely
>> populated area (called stack area) within the component's virtual
>> memory space.
>> We introduced the current stack allocation scheme back in Genode
>> 10.02:
>> In short, the stack allocator is used to allocate slots within the
>> stack area, that hosts all the stacks. Each slot is 1 MiB of virtual
>> memory, aligned to a 1 MiB boundary. The actual stack (typically just
>> a few KiB) is placed within the slot but most of the slot remains
>> unpopulated. Consequently, guard pages are already in place - plenty
>> of them.
>> The only thing that changed since 10.02 is the naming. I removed the
>> notion of the "thread context area" earlier this year and just speak
>> of "stack area" instead. This was done to simplify the terminology
>> used within the framework's implementation.
>>> Stack overflows are not only very annoying and time consuming but
>>> can (imo) also be mitigated rather easily. I therefore think it
>>> would be worth implementing a protection or detection mechanism for
>>> this in Genode.  
>> Usually, when a stack overflows, you get a message indicating that an
>> unresolvable page fault has occurred within the virtual-memory range of
>> the stack area. On base-linux, the address can be found in the dmesg
>> output. On the other kernels, core's pager prints a message (mostly
>> accompanied with something like "invalid signal-context capability").
>> I doubt that stack corruptions were the reason for the trouble you
>> observed. I can vividly remember nerve-wracking bug hunting sessions
>> prior to version 10.02 that were caused by stack overflows corrupting
>> adjacent memory, but this hasn't been an issue since then.
>>> Alternatively, I can imagine a kernel-level (base-hw) approach
>>> which uses canaries at the top of each stack. Every time the kernel
>>> switches to a user thread, it checks whether the canary is still
>>> alive. If not, another thread's stack must have overflowed. Of
>>> course, this method is only reliable if we can assume that every
>>> memory word on the stack will be initialised (preferably
>>> sequentially).   
>> Stack canaries are actually a good idea, which we will investigate in
>> the near future - but not to counter stack overflows but as a
>> protection measure against deliberate stack-smashing attacks.
>> Cheers
>> Norman

Stefan Kalkowski
Genode Labs
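
For reference, the canary idea from the quoted discussion can be sketched as
a small simulation. It assumes the scenario Johannes had in mind, namely
stacks placed back to back and growing downwards, which is not how Genode
lays out stacks. Memory is modelled as a flat array of words, and every name
and constant here (CANARY, STACK_WORDS, push_words, ...) is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint32_t CANARY      = 0xdeadbeef; // arbitrary marker value
constexpr std::size_t   STACK_WORDS = 64;         // words per simulated stack
constexpr std::size_t   NUM_STACKS  = 2;

// Flat memory holding adjacent stacks; stack i occupies the index range
// [i*STACK_WORDS, (i+1)*STACK_WORDS) and grows towards lower indices.
using Memory = std::array<std::uint32_t, STACK_WORDS * NUM_STACKS>;

// Index of the top (highest) word of a stack, where its canary lives.
constexpr std::size_t top_index(std::size_t stack)
{
    return (stack + 1) * STACK_WORDS - 1;
}

void init_canaries(Memory &mem)
{
    mem.fill(0);
    for (std::size_t i = 0; i < NUM_STACKS; ++i)
        mem[top_index(i)] = CANARY;
}

// The check a kernel could perform before switching to a thread.
bool canary_alive(Memory const &mem, std::size_t stack)
{
    return mem[top_index(stack)] == CANARY;
}

// Simulate a thread writing 'count' words below its canary without any
// check against the lower end of its own stack.
void push_words(Memory &mem, std::size_t stack, std::size_t count,
                std::uint32_t value)
{
    std::size_t idx = top_index(stack);
    for (std::size_t i = 0; i < count && idx > 0; ++i)
        mem[--idx] = value;
}
```

In this model, stack 1 can hold 63 words below its canary. The 64th word
lands on the top word of stack 0 and kills that canary, which the check
detects the next time stack 0's thread is scheduled. As noted in the thread,
the check misses an overflow that skips over the canary word, and with the
sparsely populated stack area the overflowing thread faults on an unmapped
page before reaching a neighbouring stack anyway.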