Dear Genodians,

Here are two (very minor) notes after upgrading to 23.02 (plus cherry-picking the patch from #4785), regarding two "bugs" I had found, which turned out to be mistakes on my side, not on Genode's side.
These two don't justify opening tickets, but they might deserve a quick mention here on the mailing list:
* vfs_pipe: components that use the "pipe" plug-in should make sure to enable RTC (<libc rtc="/dev/rtc"...>) as of 23.02 -- previously it would work without specifying an RTC path. This probably has zero impact since everyone properly configures their libc config (unlike me in the one scenario that brought this up..!)
* FS "stalls" when a vfs-handle fails read()ing in a specific way: I found out that when writing a vfs plug-in and its vfs handles, I ought to make sure not to return READ_ERR in the read() hook. Error cases should be detected earlier, at the open() stage (-> return a nil vfs-handle and an error code, obviously). Once a valid vfs-handle is returned by open(), it's not ok to say "sorry, my bad, the file cannot actually be read and should not have been opened in the first place" every time read() is called... Otherwise this entails a bunch of consequences, the most visible of which is big red "packet operation failed" error messages in the LOG! Again, others seem more careful, but I stumbled into making that mistake and thought I should post a cautionary tale about it ;-)
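
To illustrate the second point, here is a minimal stand-alone sketch of the "fail at open(), not at read()" rule. It does not use Genode's actual VFS interfaces: `Backend`, `Handle`, `Open_result`, `Read_result`, `open_file()` and `read_file()` are hypothetical stand-ins made up for this example, and a real plug-in would implement the corresponding hooks of the VFS instead.

```cpp
// Simplified stand-ins for illustration only. These are NOT Genode's
// actual VFS types or signatures.
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

enum class Open_result { OK, ERR_UNACCESSIBLE };

// ERR_IO is meant for genuine I/O failures only, never for files that
// should not have been opened in the first place.
enum class Read_result { OK, ERR_IO };

// Hypothetical backing store of the plug-in
struct Backend
{
    std::string content;
    bool        readable;
};

struct Handle
{
    Backend const &backend;
    std::size_t    offset;

    explicit Handle(Backend const &b) : backend(b), offset(0) { }
};

// open() performs every check that could make a later read() impossible.
// On failure, no handle is handed out at all.
Open_result open_file(Backend const &backend, std::unique_ptr<Handle> &out)
{
    out.reset();

    if (!backend.readable)
        return Open_result::ERR_UNACCESSIBLE;

    out = std::make_unique<Handle>(backend);
    return Open_result::OK;
}

// read() on a valid handle always succeeds in this sketch. End of file is
// reported as OK with zero bytes rather than as an error on every call.
Read_result read_file(Handle &h, char *dst, std::size_t count,
                      std::size_t &out_count)
{
    assert(h.offset <= h.backend.content.size());

    std::size_t const remaining = h.backend.content.size() - h.offset;
    out_count = count < remaining ? count : remaining;

    std::memcpy(dst, h.backend.content.data() + h.offset, out_count);
    h.offset += out_count;
    return Read_result::OK;
}
```

With a backend marked unreadable, `open_file()` returns `ERR_UNACCESSIBLE` and leaves the handle pointer empty, so the client never gets to issue a read that is bound to fail.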
Cédric
P.S. If one is in need of a quick temporary hack before vfs-handle usage is cleaned up, it seems there is an alternative way for the stalls to be kept at bay or mitigated by modifying Genode's fs_file_system.h, adding a forced call to wakeup_vfs_user() (I'm still investigating... hopefully won't have to dig much though, fixing my code to fail earlier at the open() stage should do the trick).
Hi Cédric,
thanks for sharing your findings.
On 2023-06-29 16:44, ttcoder@netcourrier.com wrote:
> - vfs_pipe: components that use the "pipe" plug-in should make sure to enable RTC (<libc rtc="/dev/rtc"...>)
> as of 23.02 -- previously it would work without specifying an RTC path. This probably has zero impact since everyone properly configures their libc config (unlike me in the one scenario that brought this up..!)

This made me curious because the nightly tests [1,2] for the vfs_pipe plugin do not in fact configure the rtc for the libc, yet are known to work.

[1] https://github.com/genodelabs/genode/blob/master/repos/libports/recipes/pkg/...
[2] https://github.com/genodelabs/genode/blob/master/repos/libports/recipes/pkg/...

I guess that the use of the rtc has a side effect that covers up the symptom of a problem that might still lurk there. To get to the bottom of it, I'd need more information or - ideally - a scenario to reproduce the non-working state.
There is one suspicion: The VFS changes in 23.02 foster the batching of I/O in the file-system session. When operating both sides of a pipe through the same file-system session, there is the risk that a write operation fits into the file-system session's packet-stream buffer but not into the pipe buffer of the remote pipe. Once the pipe is saturated, a reader must consume content before any new progress can be made. But when the reader issues its read request through the same file-system session as used by the writer - which is still clogged by the not yet completed write operation - we end up in a deadlock situation. Might this be the case in your scenario?
In practice, such situations don't occur whenever both ends of one pipe are operated by different components - each using a distinct file-system session. But when interacting with a pipe behind a chain of VFS servers, stalling effects are plausible. But as I said, I'm just speculating.
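
The suspected stall can be pictured with a small toy model. This is only a sketch of the reasoning above, not Genode's file-system session code: `Request`, `Pipe`, `Session`, `step()` and `run()` are names made up for the illustration, and the model assumes that a session handles its requests strictly in order, so that an incomplete request at the head blocks everything queued behind it.

```cpp
// Toy model for illustration only, not Genode's file-system session code.
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct Request { bool is_write; std::size_t bytes; };

struct Pipe { std::size_t capacity; std::size_t fill; };

// Assumption: requests of one session are processed strictly in order.
using Session = std::deque<Request>;

// Try to advance the head request of one session, return true on progress
bool step(Session &session, Pipe &pipe)
{
    if (session.empty())
        return false;

    Request &head = session.front();

    // A write is limited by free pipe space, a read by the pipe's fill level.
    std::size_t const room = head.is_write ? pipe.capacity - pipe.fill
                                           : pipe.fill;
    std::size_t const n    = head.bytes < room ? head.bytes : room;

    if (head.is_write) pipe.fill += n; else pipe.fill -= n;
    head.bytes -= n;

    bool const done = (head.bytes == 0);
    if (done)
        session.pop_front();

    return n > 0 || done;
}

// Run until all sessions are drained (true) or nothing can advance (false)
bool run(std::vector<Session> sessions, Pipe pipe)
{
    for (;;) {
        bool progress = false, pending = false;

        for (Session &s : sessions) {
            progress |= step(s, pipe);
            pending  |= !s.empty();
        }
        if (!pending)  return true;
        if (!progress) return false;
    }
}
```

Under these assumptions, with a pipe capacity of 4 bytes, a single session holding an 8-byte write followed by an 8-byte read makes `run()` return false: the write fills the pipe and then blocks the read queued behind it. Placing the same two requests into two separate sessions makes `run()` return true because the reader can drain the pipe independently of the writer.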
Should you by any chance find a way to come up with a run script that shows the non-working behavior, I'd love to investigate it.
Cheers
Norman
>> - vfs_pipe: components that use the "pipe" plug-in should make sure to
>> enable RTC (<libc rtc="/dev/rtc"...>)
>> as of 23.02 -- previously it would work without specifying an RTC path.
>
> This made me curious because the nightly tests [1,2] for the vfs_pipe plugin do not in fact configure the rtc for the libc, yet are known to work.
>
> There is one suspicion: The VFS changes in 23.02 foster the batching of I/O in the file-system session. When operating both sides of a pipe
> ..
> completed write operation - we end up in a deadlock situation. May this be the case in your scenario?
>
> In practice, such situations don't occur whenever both ends of one pipe are operated by different components - each using a distinct file-system session. But when interacting with a pipe behind a chain of VFS servers, stalling effects are plausible. But as I said, I'm just speculating.
>
> Should you by any chance find a way to come up with a run script that shows the non-working behavior, I'd love to investigate it.

Just filed ticket #4951.

Although... now that we've discussed it, I see that:
1) it's a cross-component issue in my case, which might rule out the intra-component deadlock you describe
2) it's not a deadlock anyhow, but a problem whereby write() always returns -1; it won't work no matter what I throw at it.

But the ticket is there if you still think it's worth investigating,
Cédric