Hi Menno,
thank you for chiming in. Great that we can get a joint discussion going!
This is great. Currently we're trying to accomplish a similar thing for RPC calls and signals. The approach with statically configured proxies that you take also made the most sense to us, which is what we did for RPC calls, and we're now trying to use it for signals. Combined with your work, it seems that together we have made the first steps towards distributed use of the fundamental IPC mechanisms provided by Genode (shared memory, RPC, and signals).
Intuitively, I see the appeal of proxying those low-level mechanisms. Once these three mechanisms are covered, any session interface could be reused in a distributed fashion. Practically, however, I doubt that this approach is the best way to go because it discards a substantial benefit that Genode gives us: the knowledge about the semantics of the information to transmit (1). At the same time, it will ultimately become complex (2). Let me illustrate both problems separately.
1) Exploiting our knowledge about the transferred information
For the sake of argumentation, let us take a look at the framebuffer session interface. It basically consists of an RPC call to request a shared-memory buffer (as dataspace) and an RPC call for reporting rectangular areas within the framebuffer to be updated. The actual pixels are transferred via the shared-memory buffer. Signals are used to let the client synchronize its output to the refresh rate of the framebuffer. With the approach of merely wrapping the three low-level communication mechanisms into network packets, the interface would rely on RPC via TCP, a distributed shared-memory technique, and a signaling mechanism, possibly also realized via a TCP connection.
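To make the discussion concrete, the three elements could be sketched like this (a simplification for illustration, not the literal interface found in the source tree):

  #include <base/signal.h>
  #include <dataspace/capability.h>

  /* simplified sketch of the framebuffer session, details omitted */
  struct Framebuffer_session_sketch
  {
      /* RPC: request the shared-memory buffer holding the pixels */
      virtual Genode::Dataspace_capability dataspace() = 0;

      /* RPC: report a rectangular area of the framebuffer as updated */
      virtual void refresh(int x, int y, int w, int h) = 0;

      /* register a signal handler used to synchronize to the refresh rate */
      virtual void sync_sigh(Genode::Signal_context_capability sigh) = 0;

      virtual ~Framebuffer_session_sketch() { }
  };

Each function maps directly onto one of the three low-level mechanisms, which is what makes the generic wrapping approach appear so attractive at first sight.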
Wrapping the mechanisms like this is not how a network protocol for remote desktops is usually designed. Instead, such protocols leverage the knowledge about the specific domain, in this case the transfer of pixels. Exploiting these known semantics, they can compress the pixels via pixel-oriented compression algorithms, drop intermediate frames when detecting low network bandwidth, and possibly decide not to proxy the sync signals over the network at all because the network latency would render them useless anyway. In that case, the local proxy would produce artificial periodic sync events.
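To illustrate the last point, a local proxy could generate such artificial sync events with a plain timer instead of forwarding the remote signals. A minimal sketch, assuming the timer session's 'sigh' and 'trigger_periodic' functions (the surrounding class is made up for illustration):

  #include <base/env.h>
  #include <timer_session/connection.h>

  /* hypothetical part of a local framebuffer proxy */
  struct Local_sync_source
  {
      Timer::Connection _timer;

      Local_sync_source(Genode::Env &env) : _timer(env) { }

      /* called when the local client registers its sync handler */
      void sync_sigh(Genode::Signal_context_capability sigh)
      {
          /* let the timer deliver periodic signals directly to the client */
          _timer.sigh(sigh);
          _timer.trigger_periodic(1000*1000/60); /* ~60 Hz, period in us */
      }
  };

The remote sync signals would then never cross the network at all.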
Similar observations can be made for the other session interfaces as well. E.g., it is perfectly fine to drop network packets of a NIC session, but it would be wrong to do that for block requests. When looking at the file-system session, the issue becomes even more apparent. The file-system session uses shared memory to carry the payload between client and server. With the low-level proxy approach, we would again employ a distributed shared-memory mechanism instead of straightforward packet-based communication. This is just wrong.
In short: Genode's session interfaces are painfully bad as network protocols because they are not designed as network protocols.
Hence, instead of proxying the low-level mechanisms, I warmly recommend taking a step back and solving the proxying problem for the different session interfaces individually. Granted, this approach does not "scale" with a growing number of session interfaces. On the other hand, we don't actually have a scalability problem. The number of session interfaces has remained almost constant over the past years, so it is unlikely that we will suddenly see an influx of new ones. In practice, we are talking about approx. 10 session interfaces (Terminal, ROM, Report, Framebuffer, Input, Nic, Block, File-system, LOG) to cover. Consequently, the presumed generality of the low-level proxy approach does not solve any real scalability problem.
As another benefit of solving the proxying in a session-specific way, we can reuse existing protocols in a straightforward manner, e.g., by just using VNC for proxying the framebuffer session. As a side effect, this would greatly improve the interoperability of distributed Genode systems with existing infrastructure.
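For instance, such a session-specific framebuffer proxy would merely need to push dirty rectangles to the remote side. A rough sketch, where 'Remote_display' is a made-up stand-in for whatever VNC/RFB back end one would use:

  /* hypothetical interface to a remote-desktop back end, e.g., a VNC server */
  struct Remote_display
  {
      virtual void send_rect(void const *pixels, int x, int y, int w, int h) = 0;
      virtual ~Remote_display() { }
  };

  /* session-specific proxy serving the framebuffer session to a local client */
  struct Framebuffer_proxy
  {
      Remote_display &_remote;
      void const     *_pixels; /* locally mapped pixel dataspace */

      Framebuffer_proxy(Remote_display &remote, void const *pixels)
      : _remote(remote), _pixels(pixels) { }

      /* implementation of the session's 'refresh' RPC function */
      void refresh(int x, int y, int w, int h)
      {
          /* only the dirty rectangle (possibly compressed) crosses the wire */
          _remote.send_rect(_pixels, x, y, w, h);
      }
  };

Pixel compression, frame dropping, and format conversion would all be local policies of this one component.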
2) Complexity
As you mentioned, Genode relies on three low-level communication mechanisms (synchronous RPC, signals, and shared memory). Genode's session interfaces rely on certain characteristics of those mechanisms:
* Low latency of RPC. Usually, the synchronous nature of RPC calls is leveraged by the underlying kernel to make scheduling decisions. E.g., NOVA and base-hw transfer the scheduling context between the client and the server to attain low latency. Over the network, this assumption no longer holds.
* The order of memory accesses in shared memory must be preserved. E.g., in packet-stream-based session interfaces, packet descriptors are enqueued into the request queue _after_ the payload has been written into the bulk buffer (see the sketch after this list). If we don't preserve this order (and most attempts at distributed shared memory don't), the server may observe a request before the associated bulk-buffer data is current.
* Signals must not be dropped.
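To recap how the ordering matters in practice, here is roughly how a packet-stream client submits a request (sketch only; 'SOURCE' stands for the session's tx source type, error handling omitted):

  #include <base/stdint.h>
  #include <util/string.h> /* Genode::memcpy */

  template <typename SOURCE>
  void submit_request(SOURCE &source, void const *data, Genode::size_t size)
  {
      /* allocate space within the shared bulk buffer */
      auto packet = source.alloc_packet(size);

      /* first, write the payload into the bulk buffer ... */
      Genode::memcpy(source.packet_content(packet), data, size);

      /* ... and only then enqueue the descriptor into the request queue */
      source.submit_packet(packet);
  }

A transport that does not preserve the order of the payload write and the descriptor enqueue would break this protocol.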
When proxying the low-level mechanisms, those characteristics must be preserved. This is extremely complicated, especially for shared memory. The resulting sophistication would ultimately defeat Genode's biggest advantage, which is its low complexity.
Where we both have some form of serialized commands (in your case it's handled by the IP backend with an enum identifying the action being sent over), we choose, for now, not to specify any sort of transport mechanism except a bare NIC session, and send raw data over NIC sessions. In case we need an IP layer, we plan to move this into a separate component which knows how to handle IP, IPsec, or some other protocol.
That sounds very cool!
Some challenges we're still looking into are:
- Manual marshalling. Right now we marshal the RPC calls manually.
Genode already has a great marshalling mechanism in place; however, I didn't manage to reuse it, so for now I do it by hand. This seems like a bit of a waste, so at a later stage I hope to look into this again.
It is tempting. But as I discussed above, I think that the attempt to leverage Genode's RPC at the transport level for the distributed scenario is futile.
- Cross-CPU-architecture use of arguments. How to handle integer
arguments or structs with integer arguments between big- and little-endian architectures, or between systems with different notions of word length and struct alignment?
- We're still working on Signals.
- Routing. Presently we statically configure how the client-proxy and
the server-proxy are connected. It would be better if we had more flexibility here, like what's now being provided by init.
- Capabilities. It would be great to have some sort of distributed
capability system to have stricter control over who talks to who.
The endianness issues, signals, and capabilities are further indicators that the approach may be the wrong one.
Your remark about routing and dynamics points, of course, at a limitation of the static proxies (which are actually modeled after device drivers, which also operate on static resources, namely the devices). I propose not to take on this problem before we have developed a good understanding of our requirements. E.g., I don't foresee that we will distribute a Genode system arbitrarily. Instead, we will intuitively select our "cut points" in ways that minimize remote communication. So maybe the flexibility of init's routing won't be needed, or we will discover that we need some other kind of flexibility.
Best regards
Norman