Hi Martijn,
> - When I remove the USB stick, the usb driver detects removal, but
>   the rump_fs remains unaware. The CLI component can successfully open
>   new file-system sessions and even list the files in the root
>   directory, even though the actual storage device is detached...
>   this is where the problem begins.

Unlike the NIC session, neither the block session nor the file-system session has any notion of unplugging devices. Once connected, a client expects the session to stay available until it is closed.
> - The rump_fs server aborts when a file system is not of the expected
>   type.

I think this is adequate behavior in this situation. From the file system's perspective, this is a fatal condition.
> - To complicate matters more, the target platform is booted from a
>   - different - USB stick. Currently, the usb driver detects this USB
>   stick as a mass-storage device, and the rump_fs aborts because the
>   file system is not the expected ext2fs.

What you describe is the general case of using hot-swappable storage. To build a system that works, we need to anticipate that storage sizes and file-system types may differ, and the system must stay robust nevertheless.
> For the latter finding, I am aware that the usb driver supports a
> policy mechanism for raw devices (in combination with the
> usb_report_filter component). But to my knowledge, such a policy
> mechanism does not exist for storage devices, right?
Our mid-term goal is to remove the built-in storage/HID/networking support from the USB driver and move this functionality into dedicated components that use the USB-session interface. This will make us much more flexible because the policy configuration can then be used to explicitly assign devices to clients. Right now, the USB driver's built-in policy provides the first storage device as a block session. This is quite limiting: e.g., there is no good way to access multiple storage devices at the same time.
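Just to give an idea of the direction, an explicit assignment could eventually look like the following sketch. Note that this is hypothetical: no such storage policy exists yet, and the attributes are merely modeled after the existing raw-device policy:

  <!-- hypothetical future usb driver configuration, not implemented -->
  <config>
    <storage>
      <!-- assign one specific stick to the client labeled "rump_fs" -->
      <policy label="rump_fs" vendor_id="0x1234" product_id="0x5678"/>
    </storage>
  </config>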
> Regarding detachment / reattachment of USB storage, I understand that
> at startup of this composition, the rump_fs server immediately
> requests a block session at the part_blk server, which in turn
> requests a block session at the usb driver. This whole chain blocks
> until a USB storage device is plugged in. When this happens, the chain
> of session requests is set up and the file-system client can access
> the medium. Now, if the USB storage device is detached, what happens
> to the open sessions?
They ultimately fail. From their perspective, the situation is no different from a hard disk that just died.

To implement your scenario, we need to come up with a protocol that takes care of closing the sessions in an orderly fashion before the medium disappears:
1. We need to tell the client to release the file-system session,
   e.g., via a config update or by killing the client.

2. Once the client has complied (or ceased to exist), we need to tell
   the (now client-less) file-system server to close the block session.
   In principle, we could just kill it since it has no client anyway.
   But in practice, we want to make sure that the file system writes
   back the content of its block cache before closing the block
   session.

3. Once the file-system server is gone, we need to tell part_blk to
   release the block session at the driver, or kill it.

4. Once part_blk is gone, there is no longer a block client at the USB
   driver, so we can safely remove the USB stick. The next time a
   client connects, it will perform the regular procedure that worked
   the first time.
As of now, Genode provides no established solution to realize such a protocol. The dynamic init that I outlined in my road-map posting will make such scenarios much easier to implement. But until it is ready, I am afraid that you will need to implement it in the form of a custom runtime.
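To sketch how the protocol above could map onto such a dynamically generated init configuration once it exists (again hypothetical, and the component name "fs_client" is just a placeholder for the file-system client):

  <!-- initial state of the managed subtree -->
  <config>
    <start name="fs_client"/>  <!-- removed in step 1 -->
    <start name="rump_fs"/>    <!-- removed in step 2, after the block
                                    cache was written back -->
    <start name="part_blk"/>   <!-- removed in step 3 -->
  </config>
  <!-- with all three start nodes gone (step 4), no block client
       remains at the usb driver and the stick can be pulled -->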
> As a way to support detachment / reattachment of USB storage, I'm
> thinking about placing the rump_fs and part_blk components in a child
> subtree of the CLI component that is spawned on demand and cleaned up
> after use. But this seems a bit like overkill.
That's exactly the right solution, and I don't think it is overkill either. Spawning rump_fs and part_blk dynamically is certainly quick enough. Memory-wise, it does not take more resources than a static scenario either. By letting your CLI component implement the protocol outlined above, you have full control over the chain of events. Also, an aborting rump_fs is no longer fatal but can be handled gracefully by the CLI component. As another benefit, this solution spares us from adding the notion of hot-plugging to the file-system and block-session interfaces, which would otherwise inflate the complexity of these interfaces (and thereby of all the clients that rely on them).
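For illustration, the subtree spawned by your CLI component could be composed along these lines. This is only a sketch: the RAM quotas, the partition number, and the labels are placeholders that depend on your actual scenario, and the Block session of part_blk is routed to the usb driver outside the subtree via the parent:

  <start name="part_blk">
    <resource name="RAM" quantum="2M"/>
    <provides> <service name="Block"/> </provides>
    <config> <policy label="rump_fs" partition="1"/> </config>
    <route> <any-service> <parent/> </any-service> </route>
  </start>

  <start name="rump_fs">
    <resource name="RAM" quantum="16M"/>
    <provides> <service name="File_system"/> </provides>
    <config fs="ext2fs"> <policy label="" root="/" writeable="yes"/> </config>
    <route>
      <service name="Block"> <child name="part_blk"/> </service>
      <any-service> <parent/> </any-service>
    </route>
  </start>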
Cheers
Norman