Hi, I was wondering if anyone else has had problems with file vault performance. I tested a Debian VM with and without the file vault and performance was noticeably impacted by its use. Additionally, heavy file vault I/O (copying lots of data or growing the vault) seems to tie up one of the CPUs completely, although the top tool shows that the nvme fs component is taking more of the CPU load than the file vault, which surprised me... but that may just be an artifact of the parent/child relationship.
For some specific numbers:

  ~47% load for init -> runtime -> nvme-1.3.fs
  ~14% load for init -> runtime -> file_vault -> file_vault -> tresor
  ~8% load for init -> runtime -> nvme
  ~7% load for init -> runtime -> nvme-1.part

All on one CPU; the others were idle. This was while expanding the file vault capacity from 30 to 64 GB. The process has taken over half an hour so far (still running at the time of this email).
This is on a newer machine that supports AES-NI and which has no problem dealing with LUKS full disk encryption, so it surprised me to see these results.
I performed some follow-up testing to verify that it was indeed the file vault and not another component.
First I checked the NVMe itself and noticed that the thermal pad was not aligned with the flash chip. I fixed that issue but it didn't seem to make much of a difference (perhaps a slight improvement).
Next I ran some tests, using three otherwise identical Debian VMs. One VM was installed on the file vault which was on a recall fs on the NVMe; one was on the recall fs with no encryption; and one was on the recall fs using Debian's default LUKS encryption.
I ran the following tests:

  head -c 1G </dev/urandom >/dev/null   (control test, <5s each run)
  head -c 1G </dev/urandom >1Gfile      (write 1G file with random data)
  sync                                  (run immediately following the writing of the file)
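For anyone who wants to repeat the measurement, the write and the sync can be timed separately with a small wrapper. This is only a sketch, not what was used for the numbers in this email: the function name bench_write and its two parameters are made up for illustration, and it assumes a head that accepts size suffixes such as 1G (GNU coreutils does).

```shell
# bench_write SIZE TARGET
# Writes SIZE bytes of random data to TARGET, then runs sync, and reports
# the wall-clock time of each step in whole seconds.
# (Hypothetical helper; SIZE uses head -c suffixes such as 64K or 1G.)
bench_write() {
    size="$1"
    target="$2"

    t0=$(date +%s)
    head -c "$size" </dev/urandom >"$target"
    t1=$(date +%s)
    sync
    t2=$(date +%s)

    echo "write: $((t1 - t0))s"
    echo "sync: $((t2 - t1))s"
    echo "total: $((t2 - t0))s"
}

# Example call, matching the test above:
#   bench_write 1G ./1Gfile
```

Resolution is one second, which is coarse but enough given that the differences between the three setups are in the range of tens of seconds to minutes.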
These were the results:

  File vault    - 2.5 minutes to write the file, another ~3m to finish sync
  No encryption - 20s to write the file, another 20s to finish sync
  LUKS          - 45s to write the file, another ~15s to finish sync
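To put those timings in more familiar terms, they can be converted to rough throughput. The sketch below assumes that 1G in head -c means 1 GiB (1024 MiB), as it does with GNU head, and takes the approximate timings above at face value (2.5 minutes as 150s, ~3m as 180s). The throughput helper is a name invented here.

```shell
# throughput MIB SECONDS -> prints MiB/s with one decimal place
# (hypothetical helper; LC_ALL=C keeps the decimal separator a dot)
throughput() {
    LC_ALL=C awk -v mib="$1" -v s="$2" 'BEGIN { printf "%.1f\n", mib / s }'
}

# Each row: label, write time in seconds, sync time in seconds.
for row in "file-vault 150 180" "no-encryption 20 20" "luks 45 15"; do
    set -- $row
    printf '%-14s write %5s MiB/s, write+sync %5s MiB/s\n' \
        "$1" "$(throughput 1024 "$2")" "$(throughput 1024 "$(($2 + $3))")"
done
```

Counting write plus sync, that works out to roughly 3 MiB/s for the file vault, about 26 MiB/s without encryption, and about 17 MiB/s with LUKS. These are coarse figures measured inside the VM, not a proper benchmark.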
One CPU was also pegged while writing to the file vault, while usage was visible but reasonable on the other two tests.
The tests weren't perfect and I could have been more precise about collecting the data, but this was good enough to point to the problem.
LUKS does cause a performance penalty for the VM, but the result is still in an acceptable range if you want encryption-at-rest for your VM data. The file vault performance, however, is clearly inadequate for this purpose. Since the file vault sits outside the VM and is not encumbered by the extra virtualization layer, I was surprised by these results.
Now, I don't like the idea of relying entirely on LUKS for my VM encryption, as it's not fully encrypted--the bootloader is still unencrypted. Someone could therefore modify GRUB to either intercept the LUKS password or to store the unlocked key somewhere on /boot (there is prior art for this).
The middle ground could be to put /boot (with the LUKS header, why not) on a partition stored on the file vault, with the remaining data encrypted only using LUKS on the recall fs. This has an additional benefit in that LUKS is battle hardened, and as long as nobody gets root in the VM then your data should be safe. Putting the LUKS header on /boot isn't strictly necessary here but it won't hurt.
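As a rough illustration of the detached-header part, cryptsetup's --header option can keep the LUKS header in a file on /boot while the encrypted payload lives on a separate device. I have not tested this exact setup; the paths, the device name, the mapping name vmdata and the two function names are all placeholders, and getting the initramfs to find the header at boot is not covered.

```shell
# Placeholder paths; adjust to the actual VM layout.
HEADER="${HEADER:-/boot/luks-header.img}"   # header file on the file-vault-backed /boot
DATA_DEV="${DATA_DEV:-/dev/vdb}"            # payload device on the plain recall fs (assumed name)

# One-time setup (destroys data on $DATA_DEV): the LUKS header is written
# to $HEADER instead of to the start of the data device.
format_detached() {
    cryptsetup luksFormat --header "$HEADER" "$DATA_DEV"
}

# Unlock using the detached header; the plaintext device then appears
# as /dev/mapper/vmdata.
open_detached() {
    cryptsetup open --header "$HEADER" "$DATA_DEV" vmdata
}
```

Both functions need root and only define the commands; nothing runs until one of them is called. Without the header file the payload device on its own carries no LUKS metadata, which is the point of keeping the header on the file vault.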
In the long run I'd rather rely on the file vault for all encryption, but this should be an acceptable compromise until more eyes have looked over the file vault code.
Thanks mntn, for your thorough testing and considerations of alternative approaches to VM FDE based on the file vault!
What you discovered is a known shortcoming of the current file-vault implementation: performance optimization for heavy throughput, as well as latency concerns, have not yet been addressed. For example, independent requests on the vault are not yet parallelized, and base primitives (e.g., encryption) are not segmented and distributed across multiple CPUs. LUKS, on the other hand, has had years of optimization in those regards.
The current work-in-progress is focused on reasonable complexity, robustness, and absence of errors in the underlying tresor library, the block-encryption layer. Common use cases comprise storage of credentials like wifi passwords or passphrases, as well as journal/notes files that should be kept separate from VMs. Unfortunately, we are midway through replanning future file-vault activities, as the main developer left the team after passing the baton on to me. I also consider the vault a valuable asset of Genode, but I have to admit that other tasks currently have higher priority.
Regards
Thanks for confirming the current state of the file vault; glad it's not just my setup. Fortunately it is functional enough for a split file vault/LUKS approach, and it seems to be stable with reasonably sized vaults.
I'll report any specific issues I encounter on GitHub.