Sent: Tuesday, 27 April 2021 at 16:03
From: "Norman Feske" norman.feske@genode-labs.com
To: users@lists.genode.org
Subject: Re: Restart able block devices
Hello Uwe,
On 22.04.21 21:50, Uwe wrote:
I have seen the video (0). At the end there is the question of how to make drivers for block devices restartable. According to the video, the only limiting factor is state. Raft (1) was invented to store state in restartable processes. This implementation in particular is modular enough to be adapted to store its data on raw block devices.
thanks for commenting on my talk and for sharing your perspective.
The stance I expressed is primarily an economic one. For the driver classes covered by the talk (network, graphics, input), the architectural change towards pluggable device drivers actually reduced the interface complexity, which is beautiful. The main concern was drivers on the order of 100 thousand lines of code (thinking of the wifi or Intel graphics drivers), which are no longer critical for the liveness of the system. So when looking at the effort, complexity implications, and benefit, this change is a clear win.
For block drivers, the equation looks different.
First, each of our block drivers (NVMe, AHCI, USB block) comprises about 1500 lines of code. With this rather low complexity, we can attain robustness by means of code-quality measures. When equating risk with source-code complexity, the pressure to make block drivers restartable is two orders of magnitude weaker compared to the drivers mentioned above.
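As a quick check of the "two orders of magnitude" estimate, the two figures quoted above can be compared directly. This is only a back-of-the-envelope sketch: it takes "100 thousand" and "1500" at face value and, like the argument itself, treats line count as a stand-in for risk. The constant names are made up for illustration.

```python
import math

# Figures as quoted in the message: roughly 100 thousand lines for the
# large drivers (wifi, Intel graphics), about 1500 lines per block driver.
LARGE_DRIVER_SLOC = 100_000
BLOCK_DRIVER_SLOC = 1_500

# How many times larger the big drivers are, and the same gap on a log10 scale.
ratio = LARGE_DRIVER_SLOC / BLOCK_DRIVER_SLOC
orders_of_magnitude = math.log10(ratio)

print(f"size ratio:          {ratio:.1f}")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

With these inputs the gap comes out slightly below two full orders of magnitude, which fits the rough estimate given above.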
The Raft implementations have around 2000 SLOC. Although I don't believe in SLOC as a measure of complexity, I would consider that comparable. But if the implementation can be shared like ROMFile, this would be unimportant because the complexity would be distributed.
Second, the restartability mechanism would introduce new risks. We are faced with the problem of replicating large states. Even though solutions exist, they introduce new complexity instead of taking away complexity. Intuitively, I think that a solution would likely exceed the 1500 lines of code of the block driver. So the net gain from a risk-assessment perspective is dubious.
The Raft implementation would allow additional uses. Just as the reorganization of the graphics allows screenshots for free, the Raft implementation would allow simply powering off the computer. If it is used to allow seamless updates of applications, it would at the same time allow continuing the work after a power-off and reboot. With additional connectivity to a matching cloud service, it would even be possible to unexpectedly destroy a computer, boot a brand-new computer, connect to the server, and continue the work where it stopped when the old computer was destroyed. But in contrast to remote desktop, the work is done locally and the service only stores data. The difference is important if the connection is flaky.
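To make the "power off and continue" idea a bit more concrete, here is a minimal sketch of just the persistence part: an append-only log of (term, index, payload) entries, one per fixed-size block, that is replayed after a restart. It is not Raft itself (no leader election, no replication) and it does not use Genode's block session interface. The 512-byte block size, the on-disk layout, and all names (`encode_entry`, `recover`, `append`) are assumptions for illustration, and an in-memory `io.BytesIO` stands in for the raw block device.

```python
import io
import struct
import zlib

BLOCK_SIZE = 512                    # assumed block size of the device
MAGIC = b"RLOG"                     # hypothetical marker for a used block
BODY_HEADER = struct.Struct("<4sQQH")   # magic, term, index, payload length
MAX_PAYLOAD = BLOCK_SIZE - 4 - BODY_HEADER.size


def encode_entry(term: int, index: int, payload: bytes) -> bytes:
    """Pack one log entry into exactly one block, protected by a CRC32."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload does not fit into one block")
    body = BODY_HEADER.pack(MAGIC, term, index, len(payload)) + payload
    body = body.ljust(BLOCK_SIZE - 4, b"\0")
    return struct.pack("<I", zlib.crc32(body)) + body


def decode_entry(block: bytes):
    """Return (term, index, payload), or None if the block is not a valid entry."""
    if len(block) != BLOCK_SIZE:
        return None
    (crc,) = struct.unpack_from("<I", block, 0)
    body = block[4:]
    if zlib.crc32(body) != crc:
        return None                 # torn or corrupted write
    magic, term, index, length = BODY_HEADER.unpack_from(body, 0)
    if magic != MAGIC or length > MAX_PAYLOAD:
        return None
    start = BODY_HEADER.size
    return term, index, body[start:start + length]


def recover(device) -> list:
    """Replay entries from block 0 until the first invalid or out-of-order block."""
    entries = []
    device.seek(0)
    while True:
        entry = decode_entry(device.read(BLOCK_SIZE))
        if entry is None or entry[1] != len(entries) + 1:
            break
        entries.append(entry)
    return entries


def append(device, entries: list, term: int, payload: bytes) -> None:
    """Write the next entry to its block, then record it in memory."""
    index = len(entries) + 1
    device.seek((index - 1) * BLOCK_SIZE)
    device.write(encode_entry(term, index, payload))
    device.flush()                  # a real device would need a sync/barrier here
    entries.append((term, index, payload))


# Stand-in for a raw block device: eight zeroed blocks in memory.
disk = io.BytesIO(bytes(BLOCK_SIZE * 8))
log = recover(disk)
append(disk, log, term=1, payload=b"open document")
append(disk, log, term=1, payload=b"type paragraph")

# Simulated restart: the in-memory list is gone, only the "disk" survives.
restored = recover(disk)
print(restored)
```

The checksum per block is what lets recovery stop cleanly at a half-written entry after a sudden power-off. Replicating such a log to a remote service, as suggested above, would be the part that actually requires a consensus protocol.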
That being said, please don't take my words as discouragement. If you are interested in exploring this direction, I'd be delighted to learn about your findings.
It is pretty discouraging, because there is an interdependence between Raft and block devices. The other features probably won't work without Raft in the block device.
Cheers
Norman

--
Dr.-Ing. Norman Feske
Genode Labs
https://www.genode-labs.com · https://genode.org
Genode Labs GmbH · Amtsgericht Dresden · HRB 28424 · Sitz Dresden Geschäftsführer: Dr.-Ing. Norman Feske, Christian Helmuth
Genode users mailing list users@lists.genode.org https://lists.genode.org/listinfo/users