Deadlock in combination with pthread, lwip and grpc

Christian Helmuth christian.helmuth at genode-labs.com
Fri May 22 12:11:25 CEST 2020


Hello Sid,

as I'm not able to fix the floating-point exception, I used the
following script instead and ran it in parallel with the command line
shown below it.

  netcat.sh:

    #!/bin/bash
    for i in $(seq 10000); do
      dd if=/dev/zero status=none count=4 | netcat 10.0.2.55 8899 > /dev/null
    done

  > parallel -j 20 ./netcat.sh -- $(seq 20)

I also removed the diagnostic messages and after 422 rounds I got the
following log messages.

[init -> test-tcp_echo_server] waiting for connection 422
[init -> test-tcp_echo_server] Error: tcp_err_callback arg=null
[init -> test-tcp_echo_server] Error: Assertion "pbuf_free: p->ref > 0" /plain/krishna/src/genode/genode_tmp.git/contrib/lwip-6e0661b21fde397041389d5d8db906b5a6543700/src/lib/lwip/src/core/pbuf.c:753
[init -> test-tcp_echo_server] Error: tcp_poll_callback
[init -> test-tcp_echo_server] Error: tcp_poll_callback socket=653982970 state=4
[init -> test-tcp_echo_server] Error: tcp_poll_callback local_port=8899 remote_port=34522 state=4
[init -> test-tcp_echo_server] Error: tcp_poll_callback

This assertion seems to be very serious: the code decrements the
reference counter even if the assertion fails, which should render
this pbuf unreclaimable for a long time due to the integer
wrap-around. In my test case, the lwip stack still happily responds to
ARP and ICMP ping.

Is my test approach reasonable? Should we hunt the issue depicted above?

Regards
-- 
Christian Helmuth
Genode Labs

https://www.genode-labs.com/ · https://genode.org/
https://twitter.com/GenodeLabs · /ˈdʒiː.nəʊd/

Genode Labs GmbH · Amtsgericht Dresden · HRB 28424 · Sitz Dresden
Geschäftsführer: Dr.-Ing. Norman Feske, Christian Helmuth


