Hello Genodians
We are implementing hardware-accelerated encryption for the CBE and have a question pertaining to this:
How should the Cbe_crypto::Interface be implemented when the underlying hardware is asynchronous, e.g. driven by interrupts?
I think I understand the semantics of submit_encryption_request and encryption_request_complete but this mechanism doesn't seem to help with asynchronous completion. As far as I can tell, no signals are used with the Cbe_crypto::Interface.
What am I missing?
Best regards Stefan
Hello Stefan,
> We are implementing hardware-accelerated encryption for the CBE and have a question pertaining to this:
I wonder, is performance your primary motivation? If so, there might be other opportunities for optimization that are worth implementing first, before resorting to hardware acceleration. Should your primary motivation be the hiding of the keys from software, that's really cool!
> How should the Cbe_crypto::Interface be implemented when the underlying hardware is asynchronous, e.g. driven by interrupts?
> I think I understand the semantics of submit_encryption_request and encryption_request_complete but this mechanism doesn't seem to help with asynchronous completion. As far as I can tell, no signals are used with the Cbe_crypto::Interface.
I cannot speak for the 'Cbe_crypto::Interface' specifically. But in general, a VFS plugin with outstanding requests is polled for the completion (request_complete) each time after I/O happened, i.e., after an I/O signal occurred.
You may take a look at how the implementation of the VFS terminal plugin responds to incoming data for reading ('_read_avail_handler') [1]. The 'read_avail_sigh' corresponds to your interrupt.
[1] https://github.com/genodelabs/genode/blob/master/repos/os/src/lib/vfs/termin...
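To sketch just that signal-handling part in isolation (this is not the actual terminal-plugin code - the class and member names below are invented for the example), an asynchronous backend could register an 'Io_signal_handler' for its completion event roughly like this:

  /* illustrative sketch only - names are invented, not taken from vfs/terminal */
  #include <base/component.h>
  #include <base/signal.h>

  class Crypto_device
  {
    private:

      Genode::Env &_env;

      /* becomes true once the device reports completion of the submitted job */
      bool _job_done = false;

      /*
       * An 'Io_signal_handler' (in contrast to a plain 'Signal_handler') tells
       * the entrypoint that the signal denotes I/O progress, so that VFS
       * clients re-check their outstanding requests afterwards.
       */
      Genode::Io_signal_handler<Crypto_device> _completion_handler {
        _env.ep(), *this, &Crypto_device::_handle_completion };

      void _handle_completion() { _job_done = true; }

    public:

      Crypto_device(Genode::Env &env) : _env(env)
      {
        /* hand '_completion_handler' to the IRQ/device session here */
      }

      bool job_done() const { return _job_done; }
  };

The handler merely records that something happened; the actual result is picked up later when the plugin is asked for the progress of its outstanding request.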
To support batches of outstanding requests at the VFS (for hiding the latency of the crypto device), one may consider maintaining multiple file handles (at the client of the VFS plugin) where each file handle can have one request in flight. So the degree of parallelism can be tuned by the number of open file handles. Right now, I think that the CBE does not schedule more than one crypto request at a time though.
Cheers Norman
Hi Norman
Thanks for your answer.
> I wonder, is performance your primary motivation? If so, there might be other opportunities for optimization that are worth implementing first, before resorting to hardware acceleration. Should your primary motivation be the hiding of the keys from software, that's really cool!
Our primary motivation is indeed hiding the keys from the software and reducing the risk of side channel attacks via timing or power analysis. Offloading the computation from the CPU is a nice side benefit though.
> I cannot speak for the 'Cbe_crypto::Interface' specifically. But in general, a VFS plugin with outstanding requests is polled for the completion (request_complete) each time after I/O happened, i.e., after an I/O signal occurred.
This mechanism doesn't seem to be integrated into the 'Cbe_crypto::Interface', which is why we would like to hear the opinion of the original author of this interface before adapting it.
> To support batches of outstanding requests at the VFS (for hiding the latency of the crypto device), one may consider maintaining multiple file handles (at the client of the VFS plugin) where each file handle can have one request in flight. So the degree of parallelism can be tuned by the number of open file handles. Right now, I think that the CBE does not schedule more than one crypto request at a time though.
For the moment I'm just trying to avoid polling and multi-threading although latency might be a concern later on.
Best regards Stefan
Hi Stefan,
> This mechanism doesn't seem to be integrated into the 'Cbe_crypto::Interface', which is why we would like to hear the opinion of the original author of this interface before adapting it.
even though I'm not the author of the 'Cbe_crypto' interface, let me clarify how the control flow of asynchronous I/O works in the VFS in general.
The control flow is always driven by the client of the VFS, e.g., the libc or the VFS server. Whenever the client detects that I/O happened (an I/O signal occurred), it checks each outstanding VFS request for possible progress. In the case of the CBE, there is a chain. For example, the CBE client requests the reading of a block. The CBE, in turn, requests the 'Cbe_crypto' to decrypt data as a prerequisite of finishing the block-read request.
Each time some I/O happened, the client (e.g., the libc's 'read') asks the VFS for possible progress of the read request (by calling 'complete_read'). This call eventually reaches the CBE. The CBE, in turn, checks the progress of its outstanding crypto operation ('complete_read'). This is what I meant by polling. It is not busy polling but checking for progress each time after I/O happened. This checking is done transitively (libc checks the VFS, VFS checks the CBE, CBE checks crypto).
For this to work, one has to make sure that the I/O backends of the VFS plugins use an 'Io_signal_handler', not a mere 'Signal_handler', for the reception of asynchronous events like your device interrupt. Otherwise, the VFS client won't notice that I/O happened. You may follow the use of the 'Io_progress_handler' interface in base/entrypoint.h to connect the dots.
Hence, there is no special feature in the 'Cbe_crypto::Interface' needed to support asynchronous operation. The pair of 'request' (submit work) and 'complete' (check the state of the submitted work) functions suffices.
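As a purely illustrative sketch (the parameter list and the 'Result' type below are assumptions for the example, not the actual signatures of 'Cbe_crypto::Interface'), such a non-blocking pair on top of an interrupt-driven device could look like this:

  /* sketch only - not derived from the real 'Cbe_crypto::Interface' */
  #include <base/stdint.h>

  struct Async_crypto
  {
    enum class State { IDLE, SUBMITTED, DONE };

    State _state = State::IDLE;

    /* invoked from the device driver's 'Io_signal_handler' on interrupt */
    void handle_completion_irq() { _state = State::DONE; }

    /* submit: program the hardware and return immediately, never block */
    bool submit_encryption_request(Genode::uint64_t block_number,
                                   Genode::uint32_t key_id)
    {
      if (_state != State::IDLE)
        return false;   /* busy - the caller retries after the next I/O signal */

      (void)block_number; (void)key_id;
      /* ... write the job descriptor to the device here ... */
      _state = State::SUBMITTED;
      return true;
    }

    /* complete: report progress when polled, never block */
    struct Result { bool valid; };

    Result encryption_request_complete()
    {
      if (_state != State::DONE)
        return { false };   /* no progress yet - polled again after the next I/O signal */

      _state = State::IDLE;   /* the ciphertext can now be fetched from the device */
      return { true };
    }
  };

The interrupt handler only records the completion; the result is handed out when the surrounding code polls for it after the next I/O signal.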
However, as I said above, I've no experience with the 'Cbe_crypto' interface specifically and hope that I'm not getting anything wrong.
Maybe Martin or Josef can chime in to substantiate?
> To support batches of outstanding requests at the VFS (for hiding the latency of the crypto device), one may consider maintaining multiple file handles (at the client of the VFS plugin) where each file handle can have one request in flight. So the degree of parallelism can be tuned by the number of open file handles. Right now, I think that the CBE does not schedule more than one crypto request at a time though.
> For the moment I'm just trying to avoid polling and multi-threading although latency might be a concern later on.
Sorry for the confusion. I was not speaking of busy polling or multi-threading at all. If the batching of crypto operations is not a concern of yours (right now), please just ignore this paragraph.
Cheers Norman
Hi Stefan,
I'm sorry for my delayed response; somehow I missed your initial mail.
On 23.09.21 12:37, Norman Feske wrote:
> The control flow is always driven by the client of the VFS, e.g., the libc or the VFS server. Whenever the client detects that I/O happened (an I/O signal occurred), it checks each outstanding VFS request for possible progress. In the case of the CBE, there is a chain. For example, the CBE client requests the reading of a block. The CBE, in turn, requests the 'Cbe_crypto' to decrypt data as a prerequisite of finishing the block-read request.
> Each time some I/O happened, the client (e.g., the libc's 'read') asks the VFS for possible progress of the read request (by calling 'complete_read'). This call eventually reaches the CBE. The CBE, in turn, checks the progress of its outstanding crypto operation ('complete_read'). This is what I meant by polling. It is not busy polling but checking for progress each time after I/O happened. This checking is done transitively (libc checks the VFS, VFS checks the CBE, CBE checks crypto).
> For this to work, one has to make sure that the I/O backends of the VFS plugins use an 'Io_signal_handler', not a mere 'Signal_handler', for the reception of asynchronous events like your device interrupt. Otherwise, the VFS client won't notice that I/O happened. You may follow the use of the 'Io_progress_handler' interface in base/entrypoint.h to connect the dots.
> Hence, there is no special feature in the 'Cbe_crypto::Interface' needed to support asynchronous operation. The pair of 'request' (submit work) and 'complete' (check the state of the submitted work) functions suffices.
I can confirm all of this and have little to add. The Crypto (like any other module in the CBE) is polled for progress, but only driven by external events - a kind of "conditional polling". In the request chains that Norman mentioned, which span the different internal modules of the program, the last module before the Crypto talks to the Crypto only by means of "add request" and "poll whether the request finished".
As soon as the Crypto wants to communicate that there was some progress, it signals the user of the CBE (see '_io_handler' and '_backend_io_response_handler' in repos/gems/src/lib/vfs/cbe/vfs.cc or, for a less VFS-ish context, '_sigh' in repos/gems/src/app/cbe_tester/main.cc). The user then polls the CBE, and the CBE polls each of its modules for progress again, including the one that talks to the Crypto. The latter then polls the Crypto and reacts to the progress. This avoids different external events having different entry points into the program, which keeps the execution flow plain and simple.
Note that there are modules internal to the CBE (block cache, tree walks, request scheduling, etc.) and modules that must be considered external, like crypto, block back-end, or trust anchor. Pending progress of internal modules is always resolved in the course of one "root poll" of the CBE to the point where each internal module either waits for progress of an external module again or is idle (no requests). Thus, the signal feedback (towards the CBE user) is only applied for modules that must be considered external.
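As a rough picture of this wake-up path (all names and the structure below are invented for illustration - this is not the actual code in vfs.cc or cbe_tester/main.cc):

  /* rough picture only - names and structure invented for illustration */
  struct External_module   /* crypto, block back-end, or trust anchor */
  {
    /* ferry requests/results between the CBE and the external module,
       return whether any progress was made */
    virtual bool transfer() = 0;
  };

  struct Cbe_library   /* stands in for the CBE with its internal modules */
  {
    bool execute() { /* drive all internal modules */ return false; }
  };

  struct Cbe_user
  {
    Cbe_library     &_cbe;
    External_module &_crypto;
    External_module &_block_backend;

    /* registered as 'Io_signal_handler' for every external completion event */
    void handle_io_signal()
    {
      for (bool progress = true; progress; ) {

        /* one "root poll": resolve all pending internal progress */
        progress = _cbe.execute();

        /* poll the external modules and react to their progress */
        progress |= _crypto.transfer();
        progress |= _block_backend.transfer();
      }
      /* now each internal module is idle or waits for an external event again */
    }
  };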
I hope this helped. Don't hesitate to ask if you have further questions.
Cheers, Martin
Hi Martin, hi Norman,
Thanks for your answers, we will try to implement this as you suggest.
Best regards Stefan
Hi Martin, hi Norman
Thanks for your answers, which work great for our hardware crypto backend.
However, I had to make a few changes to the CBE crypto interface to enable asynchronous backends: https://github.com/throwException/genode/commit/52245b9df53d289e66b10b7fa564...
Do you think this is a useful direction for our purpose? Do you have any suggestions to keep this as closely aligned with your implementation as possible?
Kind regards Stefan
Sent: Wednesday, 13 October 2021, 11:46 From: "Stefan Thöni" stefan.thoeni@gapfruit.com To: users@lists.genode.org Subject: Re: CBE crypt interface
> Hi Martin, hi Norman
> Thanks for your answers, which work great for our hardware crypto backend.
> However, I had to make a few changes to the CBE crypto interface to enable asynchronous backends: https://github.com/throwException/genode/commit/52245b9df53d289e66b10b7fa564...
I have some issues with giving the env() to the library. For me, it's akin to giving suid to an executable. As long as the complexity of the driver is low, there is no problem. But you want to define that at the *interface* level, so every driver gets that access, even one with high complexity. And, for my taste, that is too much access. Is there some interface with less access that would work instead? Platform, for instance?
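For illustration only (the class name and members below are hypothetical, not existing code), a narrower constructor could hand the backend just the platform session and the entrypoint instead of the whole 'Env':

  /* hypothetical sketch of a narrower constructor, not existing code */
  #include <base/entrypoint.h>
  #include <platform_session/connection.h>

  struct Hw_crypto_backend
  {
    Platform::Connection &_platform;   /* device resources (MMIO, IRQ) only  */
    Genode::Entrypoint   &_ep;         /* needed for the 'Io_signal_handler' */

    Hw_crypto_backend(Platform::Connection &platform, Genode::Entrypoint &ep)
    : _platform(platform), _ep(ep) { }
  };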
> Do you think this is a useful direction for our purpose? Do you have any suggestions to keep this as closely aligned with your implementation as possible?
> Kind regards Stefan
Hi Stefan,
> Thanks for your answers, which work great for our hardware crypto backend.
it's good to know that it worked out so well for you. Thank you for the positive feedback.
> However, I had to make a few changes to the CBE crypto interface to enable asynchronous backends: https://github.com/throwException/genode/commit/52245b9df53d289e66b10b7fa564...
> Do you think this is a useful direction for our purpose? Do you have any suggestions to keep this as closely aligned with your implementation as possible?
I'd like to leave the ultimate answer up to Martin but there are no reservations from my side.
For the next few months, there are no immediate changes planned to the CBE crypto interface. So you don't need to worry about diverging directions.
Cheers Norman