Hi Genodians,
With release 15.11 I've built a new working Genode system with lighttpd on top of the Fiasco kernel, the only kernel that likes my hardware.
I solved the 256Kb problem a while ago. However, there is one issue with lighttpd remaining. It has an artificial limit on URLs or filenames. It doesn't like this one: http://eccentric-authentication.org/blog/2014/03/26/how-to-design-a-distribu... . It gives a 404, while nginx on Linux serves the contents. As it's probably lighttpd, I'll check that out.
There is another issue: the whole site stops responding after a little while. Clicking new URLs just gives timeouts.
Please advise how to proceed.
I've attached the generated config file.
With regards, Guido Witmond.
On 12/22/15 23:33, Guido Witmond wrote:
Hi Genodians,
With release 15.11 I've built a new working Genode system with lighttpd on top of the Fiasco kernel, the only kernel that likes my hardware.
Correction: my system runs under Fiasco.OC 64-bit.
There is another issue: the whole site stops responding after a little while. Clicking new URLs just gives timeouts.
I've investigated:
It's a sporadic issue: sometimes it happens almost immediately, other times only after 5 hours. It is independent of the amount of HTTP traffic.
First I got this error on the console:
Ipxe_session_component::_receive(const char*, unsigned int): failed to process received packet
This was the evil code responsible for that:
repos/dde_ipxe/src/drivers/nic/main.cc:85 } catch (...) { PDBG("failed to process received packet"); }
Disabling that catch-all and running again gave this error:
[init -> nic_drv] Uncaught exception of type 'N6Genode20Packet_stream_sourceINS_20Packet_stream_po17Packet_descriptorELj1024ELj1024EcEEE19Packet_alloc_failedE'
[init -> nic_drv] abort called - thread: 'nic_drv_ep'
To me, this looks like the place where the error gets thrown but not caught: repos/os/src/server/nic_bridge/packet_handler.cc:121
It looks to me as if the system dies on an unexpected packet from the outside. And I don't want to place a Linux-based firewall in front of Genode. ;-)
I've tried to add some code to show details of the packet and why it cannot be allocated. However, I can't get this change to compile and build.
Here I'm stuck.
Please help me: how can I change the repos/os part and get it to compile? Or tell me whatever else I'm doing wrong.
Cheers, Guido.
PS. If you are going to have a New Year's resolution (or a Roadmap entry), please remove those evil catch-alls that hide the root causes. There are quite a few in the codebase.
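To illustrate the point about catch-alls, here is a minimal sketch in plain C++. It is not the Genode driver code: `Packet_alloc_failed` here is a hypothetical stand-in type, and `process_packet`, `receive_catch_all`, and `receive_specific` are made-up names. It only shows how `catch (...)` collapses different failures into one message, while a specific handler leaves unexpected errors visible to the caller.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an allocation failure; NOT the Genode type.
struct Packet_alloc_failed : std::runtime_error {
    Packet_alloc_failed() : std::runtime_error("packet allocation failed") {}
};

// Hypothetical receive step: kind 0 succeeds, 1 and 2 fail in different ways.
void process_packet(int kind) {
    assert(kind >= 0 && kind <= 2);
    if (kind == 1) throw Packet_alloc_failed();
    if (kind == 2) throw std::logic_error("malformed packet");
}

// Catch-all: every failure ends up as the same generic message.
std::string receive_catch_all(int kind) {
    try { process_packet(kind); }
    catch (...) { return "failed to process received packet"; }
    return "ok";
}

// Specific handler: only the expected failure is handled here;
// anything else propagates to the caller with its type intact.
std::string receive_specific(int kind) {
    try { process_packet(kind); }
    catch (const Packet_alloc_failed &e) {
        return std::string("dropped: ") + e.what();
    }
    return "ok";
}
```

With the catch-all, kinds 1 and 2 are indistinguishable in the log. With the specific handler, the allocation failure is reported by name and the malformed-packet error reaches the caller.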
Congratulations on running a server with Genode!
On Sat, Dec 26, 2015 at 02:33:41PM +0100, Guido Witmond wrote:
PS. If you are going to have a New Year's resolution (or a Roadmap entry), please remove those evil catch-alls that hide the root causes. There are quite a few in the codebase.
If there's a trend, perhaps this needs to be cleaned up. In situations like this, I'd probably want it to just crash with a coredump. It's especially great having drivers in user-space for this. :)
Cheers, Guido.
Cheers, Jookia.
Hi Guido,
On 26 December 2015 14:33:41 CET, Guido Witmond <guido@...231...> wrote:
First I got this error on the console:
Ipxe_session_component::_receive(const char*, unsigned int): failed to process received packet
This was the evil code responsible for that:
repos/dde_ipxe/src/drivers/nic/main.cc:85 } catch (...) { PDBG("failed to process received packet"); }
Disabling that catch-all and running again gave this error:
[init -> nic_drv] Uncaught exception of type 'N6Genode20Packet_stream_sourceINS_20Packet_stream_po17Packet_descriptorELj1024ELj1024EcEEE19Packet_alloc_failedE'
[init -> nic_drv] abort called - thread: 'nic_drv_ep'
To me, this looks like the place where the error gets thrown but not caught: repos/os/src/server/nic_bridge/packet_handler.cc:121
This is a different component - the nic_bridge. The exception does occur in nic_drv.
It looks to me as if the system dies on an unexpected packet from the outside. And I don't want to place a Linux-based firewall in front of Genode. ;-)
I don't know what you mean by "unexpected" - the nic_drv always expects packets from the Ethernet. To forward these packets to its single client, it has to allocate a buffer and a packet descriptor in the incoming packet stream. For some reason this fails, most often because the client got stuck and no longer acknowledges incoming packets, which makes the allocations for further packets fail.
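The failure mode described above can be sketched with a small bounded pool. This is a simplified model under stated assumptions, not the Genode packet-stream API: `Packet_pool`, `Alloc_failed`, and `deliver` are hypothetical names, and the pool size of 4 in the usage note is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical exception; NOT Genode's Packet_alloc_failed.
struct Alloc_failed : std::runtime_error {
    Alloc_failed() : std::runtime_error("no free packet slot") {}
};

// Hypothetical fixed-size pool standing in for the buffer shared
// between the driver and its single client.
class Packet_pool {
    std::size_t _capacity;
    std::size_t _in_flight = 0;
public:
    explicit Packet_pool(std::size_t capacity) : _capacity(capacity) {}

    // Driver side: claim a slot for an incoming packet.
    void alloc() {
        if (_in_flight == _capacity) throw Alloc_failed();
        ++_in_flight;
    }

    // Client side: acknowledging a packet frees its slot.
    void ack() {
        assert(_in_flight > 0);
        --_in_flight;
    }

    std::size_t in_flight() const { return _in_flight; }
};

// Deliver `packets` packets; the client acknowledges only while it is alive.
// Returns how many packets could not be allocated.
std::size_t deliver(Packet_pool &pool, std::size_t packets, bool client_alive) {
    std::size_t failed = 0;
    for (std::size_t i = 0; i < packets; ++i) {
        try { pool.alloc(); }
        catch (const Alloc_failed &) { ++failed; continue; }
        if (client_alive) pool.ack();
    }
    return failed;
}
```

With a pool of 4 slots, a client that keeps acknowledging never causes a failed allocation. A stuck client lets the first 4 packets through and every later allocation fails, regardless of what the packets contain.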
I've tried to add some code to show details of the packet and why it cannot be allocated. However, I can't get this change to compile and build.
Here I'm stuck.
Please help me: how can I change the repos/os part and get it to compile? Or tell me whatever else I'm doing wrong.
Would you mind sharing your changes as a GitHub branch? I can't promise anything because of vacation time, but if I have a minute in the next few days I might have a look.
Greets
On 12/26/15 16:28, Christian Helmuth wrote:
Hi Guido,
On 26 December 2015 14:33:41 CET, Guido Witmond <guido@...231...> wrote:
It looks to me as if the system dies on an unexpected packet from the outside. And I don't want to place a Linux-based firewall in front of Genode. ;-)
I don't know what you mean by "unexpected" - the nic_drv always expects packets from the Ethernet. To forward these packets to its single client, it has to allocate a buffer and a packet descriptor in the incoming packet stream. For some reason this fails, most often because the client got stuck and no longer acknowledges incoming packets, which makes the allocations for further packets fail.
What I meant by unexpected is a packet that doesn't get parsed correctly and triggers an exception, like a Ping of Death. I jumped to that idea because it was independent of the amount of traffic I generated.
Before I took out the catchall I would get the 'failed to process received packet' error multiple times, but again, no apparent link to the amount of http-traffic.
I've reinstated the catch-all and got some more 'measurements':
- there could be more than one of these messages;
- the system still responds to pings;
- listening on a different port than 80 also gives these error messages;
- the hanging seems independent of the messages.
But on a different port than 80, I don't seem to experience hangs. It could be that lighttpd receives a request from somewhere out there that makes it hang. So more like a Request of Death.
I'll start with checking the input validation of lighttpd before badmouthing the kernel. Expect more from me soon.
Cheers, Guido.
On 12/27/15 21:51, Guido Witmond wrote:
So more like a Request of Death.
It had! See: http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-5533
"The http_request_split_value function in request.c in lighttpd before 1.4.32 allows remote attackers to cause a denial of service (infinite loop) via a request with a header containing an empty token, as demonstrated using the "Connection: TE,,Keep-Alive" header. "
I don't have any packet captures to prove I got hit by those, but anyway, I've upgraded to lighttpd 1.4.38. The change is here: https://github.com/gwitmond/genode/commit/074130
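For reference, the empty-token case from the CVE description quoted above can be handled by a splitter that always advances past the separator. The following is a minimal sketch in C++, not lighttpd's `http_request_split_value`; the function name `split_header_value` and the choice to skip empty tokens are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split a comma-separated header value into trimmed, non-empty tokens.
// Hypothetical helper; NOT the lighttpd implementation.
std::vector<std::string> split_header_value(const std::string &value) {
    std::vector<std::string> tokens;
    std::size_t pos = 0;
    while (pos <= value.size()) {
        std::size_t comma = value.find(',', pos);
        if (comma == std::string::npos) comma = value.size();
        assert(comma >= pos);  // the separator never lies behind the cursor

        // Trim spaces and tabs around the token.
        std::size_t begin = pos, end = comma;
        while (begin < end && (value[begin] == ' ' || value[begin] == '\t')) ++begin;
        while (end > begin && (value[end - 1] == ' ' || value[end - 1] == '\t')) --end;

        // Empty tokens (as in "TE,,Keep-Alive") are skipped, not retried.
        if (begin < end) tokens.push_back(value.substr(begin, end - begin));

        // Always move past the separator, so the loop makes progress
        // even when the token is empty.
        pos = comma + 1;
    }
    return tokens;
}
```

Because `pos` strictly increases on every iteration, the loop terminates for any input, including the "Connection: TE,,Keep-Alive" value from the advisory.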
However, I still experience the hangs. :-(
Could it have to do with the remark by ChristianH: [1] "This hints we may have an issue with our file descriptor handling on poll/select."
I have reasons to believe the system is busy waiting instead of polling. The power usage monitor shows a constant 56.8 watts when running Fiasco.OC with lighttpd and a mere 43.6 when running Linux. A cpuburn on one CPU reaches just 1 watt less than Genode.
Is this polling to be expected from Fiasco.OC with debug mode enabled? Is Genode smart enough to prevent busy loops, or is this worth investigating as a cause of the hangs?
Cheers, Guido.
[1] https://github.com/genodelabs/genode/issues/987#issuecomment-129775238