C++ exception handling deadlock with gcc unwind code in dynamic library

Alexander Tormasov a.tormasov at innopolis.ru
Mon Oct 12 23:30:36 CEST 2020

What I found is a deadlock caused by a recursive acquisition of Linker::mutex.

- If an exception is thrown in some code (e.g. code that calls a NOVA syscall; in my case this is the attach_at() RPC call), it is processed in the caller.
In particular, during processing, the gcc-injected function _Unwind_Resume produces the following call stack. Pay attention to the function dl_iterate_phdr():

#0  Linker::mutex () at /home/tor/gen/20.08/repos/base/src/lib/ldso/main.cc:68
#1  0x0000000000124997 in dl_iterate_phdr (callback=0x119e7a0 <_Unwind_IteratePhdrCallback>, data=0x403fdde0) at /home/tor/gen/20.08/repos/base/src/lib/ldso/exception.cc:41
#2  0x000000000119fa0f in _Unwind_Find_FDE (pc=0x119dc76 <_Unwind_Resume+54>, bases=bases@entry=0x403fe128) at /home/tor/gen/20.08/contrib/gcc-20345a83596fa42a25a85938329aea54bb4b2146/src/noux-pkg/gcc/libgcc/unwind-dw2-fde-dip.c:469
#3  0x000000000119bfc3 in uw_frame_state_for (context=context@entry=0x403fe080, fs=fs@entry=0x403fdec0) at /home/tor/gen/20.08/contrib/gcc-20345a83596fa42a25a85938329aea54bb4b2146/src/noux-pkg/gcc/libgcc/unwind-dw2.c:1257
#4  0x000000000119cfe0 in uw_init_context_1 (context=context@entry=0x403fe080, outer_cfa=outer_cfa@entry=0x403fe2b0, outer_ra=0x1000bcd <Genode::Region_map::attach_at(Genode::Capability<Genode::Dataspace>, unsigned long, unsigned long, long)+259>) at /home/tor/gen/20.08/contrib/gcc-20345a83596fa42a25a85938329aea54bb4b2146/src/noux-pkg/gcc/libgcc/unwind-dw2.c:1586
#5  0x000000000119dc77 in _Unwind_Resume (exc=0x1b41a8 <Genode::init_cxx_heap(Genode::Env&)::initial_block+5256>) at /home/tor/gen/20.08/contrib/gcc-20345a83596fa42a25a85938329aea54bb4b2146/src/noux-pkg/gcc/libgcc/unwind.inc:235
#6  0x0000000001000bcd in Genode::Region_map::attach_at (this=0x1304068 <vm_reg0+8648>, ds=..., local_addr=0x80000000, size=0x40000, offset=0x0) at /home/tor/gen/20.08/repos/base/include/region_map/region_map.h:127

The code of dl_iterate_phdr():
extern "C" int dl_iterate_phdr(int (*callback) (Phdr_info *info, size_t size, void *data), void *data)
{
    int err = 0;
    Phdr_info info;

    /* the linker mutex is held for the whole iteration, including the callbacks */
    Mutex::Guard guard(mutex());

    for (Object *e = obj_list_head();e; e = e->next_obj()) {

        info.addr  = e->reloc_base();
        info.name  = e->name();
        info.phdr  = e->file()->phdr.phdr;
        info.phnum = e->file()->phdr.count;

        if (verbose_exception)
            log(e->name(), " reloc ", Hex(e->reloc_base()));

        /* a non-zero return value from the callback stops the iteration */
        if ((err = callback(&info, sizeof(Phdr_info), data)))
            break;
    }

    return err;
}

Pay attention that it takes the Linker::_mutex object (lock).

Inside, it calls the callback() function, which for the main C++ code resolves to _Unwind_IteratePhdrCallback (see frame #1 above)
from contrib/gcc-20345a83596fa42a25a85938329aea54bb4b2146/src/noux-pkg/gcc/libgcc/unwind-dw2-fde-dip.c.
That function internally calls get_fde_encoding() and get_cie_encoding(), which contain this very simple line:

  p = aug + strlen ((const char *)aug) + 1; /* Skip the augmentation string.  */

strlen() is not inlined here.
In the machine code it calls strlen@plt, which means that strlen is assumed to live in a shared library and is normally resolved by the linker's relocation code.
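
To illustrate what such a lazily bound call looks like, here is a minimal sketch. It is not Genode or libgcc code: the slot, the stub, and the resolver are hypothetical stand-ins for a PLT entry and the linker's binding routine. The point is only that the first call through an unresolved slot enters the linker.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical model of one PLT/GOT slot: a function pointer that initially
// targets a stub instead of the real function.
using Strlen_fn = std::size_t (*)(const char *);

int resolver_calls = 0;                   // counts trips into the "linker"

std::size_t lazy_stub(const char *s);     // forward declaration of the stub

Strlen_fn strlen_slot = lazy_stub;        // slot starts out unresolved

// The real target the slot should end up pointing to.
std::size_t real_strlen(const char *s) { return std::strlen(s); }

// Plays the role of the linker's slot resolver (hypothetical name).
Strlen_fn resolve_strlen()
{
    ++resolver_calls;
    return real_strlen;
}

// First call lands here: ask the "linker" for the address, patch the slot,
// then forward the call to the real function.
std::size_t lazy_stub(const char *s)
{
    strlen_slot = resolve_strlen();
    assert(strlen_slot != lazy_stub);     // slot is now bound
    return strlen_slot(s);
}

// What the caller's "call strlen@plt" amounts to in this model.
std::size_t call_strlen_via_slot(const char *s) { return strlen_slot(s); }
```

In this model only the first call to call_strlen_via_slot() goes through resolve_strlen(); later calls jump straight to real_strlen().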

To find the code, it calls jmp_slot via the PLT, which in turn
calls this function from src/lib/ldso/main.cc:294:
Elf::Addr Ld::jmp_slot(Dependency const &dep, Elf::Size index)
    Mutex::Guard guard(mutex());

    if (verbose_relocation)

Pay attention that it takes the same Linker::_mutex object (lock).
So the same thread acquires the same linker mutex recursively, and we get a deadlock during exception processing.

The key problem here is definitely the use of the linker mutex in the Genode implementation of dl_iterate_phdr().

So, the question is: how to fix this?
Maybe we need different mutexes for Ld::jmp_slot and for dl_iterate_phdr?
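
A rough sketch of the two-mutex idea, using std::mutex in place of the Genode Mutex. The mutex names and the *_model functions are hypothetical, and I have not checked whether the real jmp_slot touches state that the object-list lock would have to protect, so this only shows that the nesting itself no longer blocks.

```cpp
#include <cassert>
#include <mutex>

std::mutex object_list_mutex;  // hypothetical: guards only the object-list walk
std::mutex binding_mutex;      // hypothetical: guards only lazy symbol binding
int bound_symbols = 0;         // counts completed bindings in this model

// Stands in for Ld::jmp_slot(): now takes its own mutex.
void jmp_slot_model()
{
    std::lock_guard<std::mutex> guard(binding_mutex);
    ++bound_symbols;
}

// Stands in for the unwinder callback that triggers lazy binding of strlen.
int callback_model(void *) { jmp_slot_model(); return 0; }

// Stands in for dl_iterate_phdr(): holds only the list mutex across the
// callback, so the nested jmp_slot_model() call can take binding_mutex.
int iterate_model(int (*callback)(void *), void *data)
{
    assert(callback != nullptr);
    std::lock_guard<std::mutex> guard(object_list_mutex);
    return callback(data);
}
```

With this split the lock order is always list mutex first, binding mutex second. Any code path that took them in the opposite order would reintroduce a deadlock, this time between threads.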

