Hi all,
We are trying to get OpenVPN to work under 17.05 again. One issue we encountered was that lwip sockets seem to function incorrectly, so we tried switching to lxip sockets.
However, sockets in lxip require the use of the with_libc() function. When I wrap the openvpn_main() call in it, I get the following error:
Error: void Libc::Kernel::run(Libc::Application_code&) called from non-kernel context
The catch is that I'm calling all this from inside a Genode::Thread that executes the main function of OpenVPN. It seems that I cannot use with_libc() in any thread other than the entrypoint thread.
Calling with_libc() inside Component::construct() does not help the new thread either: it will still fail when calling socket().
Is there any way of using sockets correctly inside another thread?
Hello Boris,
On Fri, Jun 23, 2017 at 02:24:01PM +0200, Boris Mulder wrote:
However, sockets in lxip require the use of the with_libc() function. When I wrap the openvpn_main() call in it, I get the following error:
Error: void Libc::Kernel::run(Libc::Application_code&) called from non-kernel context
The catch is that I'm calling all this from inside a Genode::Thread that executes the main function of OpenVPN. It seems that I cannot use with_libc() in any thread other than the entrypoint thread.
We also identified that exposing with_libc() in the libc API was a step in the wrong direction. Therefore, we'll work on moving this aspect back into the libc internals in the future. You're right that with_libc() is neither permitted nor needed for threads/pthreads other than the main entrypoint in libc applications; blocking situations are handled differently for those threads. The main entrypoint thread, however, needs to finish I/O operations by handling the I/O signals of the Genode sessions used by the component.
Calling with_libc() inside Component::construct() does not help the new thread either: it will still fail when calling socket().
Is there any way of using sockets correctly inside another thread?
How does socket() fail if you do not wrap the call in with_libc()? I'd expect the thread to open a socket_fs file and perhaps block until the I/O operation completes. Also, is there any reason to use a Genode::Thread that uses POSIX interfaces only (besides the admittedly more concise syntax compared to pthread_create())?
Greets
Hello,
How does socket() fail if you do not wrap the call in with_libc()? I'd expect the thread to open a socket_fs file and perhaps block until the I/O operation completes. Also, is there any reason to use a Genode::Thread that uses POSIX interfaces only (besides the admittedly more concise syntax compared to pthread_create())?
Basically, if I do not use with_libc(), the call to socket() hangs forever inside the first read() from the socket file. I used a Genode::Thread because OpenVPN already did so. Do you think using a pthread might be better in this case?
Hey,
On Fri, Jun 23, 2017 at 03:21:28PM +0200, Boris Mulder wrote:
Basically, if I do not use with_libc(), the call to socket() hangs forever inside the first read() from the socket file.
So, which code does your initial entrypoint execute? As I wrote before, the initial entrypoint is responsible for completing the I/O operations in Libc components. In other words, if the initial entrypoint does not block and then handle libc I/O signals, other threads blocked in the libc will never resume.
The reason I used a Genode::Thread was because openvpn already did that. Do you think using a pthread might be better in this case?
No, I was just curious ;-)
Greets
The entrypoint creates the root component, spawns the thread and returns. It will then handle RPC requests, as entrypoints do IIRC.
The program acts as a server (serving Nic sessions asynchronously) and as a client to the lxip VFS plugin via the libc. The code can be found at [1].
How can I have the entrypoint handle I/O signals in libc while also being able to serve clients in Genode?
[1] https://github.com/genodelabs/genode/blob/master/repos/ports/src/app/openvpn...
Hello Boris,
On Fri, Jun 23, 2017 at 03:59:53PM +0200, Boris Mulder wrote:
The entrypoint creates the root component, spawns the thread and returns. It will then handle RPC requests, as entrypoints do IIRC.
The program acts as a server (serving Nic sessions asynchronously) and as a client to the lxip VFS plugin via the libc. The code can be found at [1].
How can I have the entrypoint handle I/O signals in libc while also being able to serve clients in Genode?
This should happen automatically under the hood as libc processes signals in ordinary I/O signal handlers in the entrypoint.
Are you able to run the scenario under Linux and inspect the processing of both threads via GDB? I fear I cannot help with the specifics of OpenVPN, but I may be able to guide you with more details about the blocking situation. It may be interesting to know whether any network packets reach the OpenVPN code at all.
Greets
Hello Christian,
Actually, the OpenVPN code hangs as soon as it calls the libc socket() function. Internally, this function issues a blocking write(), which is handled by Libc::Kernel.
So OpenVPN does not send or receive any packets yet, as it is blocked in socket().
Earlier, we used lwip as the socket library. With lwip, socket() (and connect() in TCP mode) did work, but OpenVPN failed to send any initial data to the server, likewise blocking in some function.
We are reaching the limits of our knowledge of the Genode libc and the side effects of the asynchronous entrypoint. At this point our debugging has descended into the libc kernel, and there is a limit to how deep we can go. Help on this topic would be appreciated.
We uploaded the 17.05-ready code of OpenVPN (including a run script that can be started via 'make run/openvpn') to https://github.com/nlcsl/genode/tree/openvpn_17.05 .
If you have the time, could you try to run it and see whether it is possible to make it produce a single UDP packet? For this, it is not necessary to set up a server. From there, we could pick it up again.
We appreciate it,
Boris
Hello Boris,
On Mon, Jun 26, 2017 at 02:46:08PM +0200, Boris Mulder wrote:
Actually, the OpenVPN code hangs as soon as it calls the libc socket() function. Internally, this function issues a blocking write(), which is handled by Libc::Kernel.
Thanks to your test case and the hint about the "blocking write", I was able to validate my suspicion about the blocker in your scenario. A rough sketch of my solution can be found here:
https://github.com/chelmuth/genode/commits/openvpn_17.05.
The issue is the unfortunate interplay between I/O-signal handling in the initial entrypoint and the current implementation of the VFS plugin that interfaces with our file-system session. In the "blocking write" case, the VFS plugin calls wait_and_dispatch_one_io_signal() directly on the initial entrypoint. In your scenario, this results in the initial-entrypoint thread and the OpenVPN thread racing to handle the first I/O signal. As the entrypoint always wins, the OpenVPN thread stays blocked until another I/O signal occurs (which may never happen in the startup phase).
The sketched solution just reverses the roles of the first and second application thread. Now, the initial entrypoint implements OpenVPN (handling its own I/O signals) and the additional entrypoint implements the NIC server (with root and session component).
I hope this helps.
Greets
Hi,
Christian, thanks a lot for your speedy refactoring of the OpenVPN port to run the OpenVPN code in the main thread. The OpenVPN code no longer blocks on opening a socket and now tries to set up a VPN connection with the configured server. Unfortunately, we are now stumbling upon two new problems.
With OpenVPN configured to use UDP, the OpenVPN component starts the TLS handshake but fails. After some debugging, we noticed a pattern of retransmissions by the OpenVPN client. It appears that the client cannot read incoming packets from the socket until after (again) writing to the socket (which happens due to retransmission after a timeout). If you are interested, take a look at the attached pcap in Wireshark and note the duplication of messages. For reference, I also added a pcap of the OpenVPN port on 16.05.
Also, we notice that OpenVPN reads on the socket are non-blocking, as evidenced by the massive number of "READ (len -1)" debug messages. This was previously not the case.
With OpenVPN configured to use TCP, the TLS handshake and key exchange complete successfully, yielding an OpenVPN connection between client and server. We would now expect the corresponding Nic session to become available to the Genode client that issued the Nic session request, but this is not the case. Instead, the client blocks indefinitely in the constructor of the Nic::Connection. In the OpenVPN server, Root::_create_session returns and the Root calls _ep.manage(..) etc. What could keep the constructor of Nic::Connection blocked? Is this somehow related to the new asynchronous session-creation procedure?
Met vriendelijke groet / kind regards,
Martijn Verschoor
Cyber Security Labs B.V. | Gooimeer 6-31 | 1411 DD Naarden | The Netherlands +31 35 631 3253 (office) | +31 616 014 087 (mobile)
_______________________________________________
genode-main mailing list
genode-main@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/genode-main