Hi everyone,
I'm currently measuring the memory bandwidth of the Genode ARMv8 Linux virtualization by running tinymembench [1] and lmbench's memory-bandwidth benchmark [2]. On the NXP i.MX8, both benchmarks showed a degradation of roughly 25 to 35 percent (depending on the exact benchmark) compared to native Linux.
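For context, both benchmarks essentially time streaming accesses over buffers that are much larger than the caches. A stripped-down sketch of the kind of loop being measured (illustrative C only, not the actual tinymembench or lmbench code):

  /* Illustrative memcpy-bandwidth loop, not the actual benchmark code.
   * The buffer size is chosen to exceed the last-level cache. */
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <time.h>

  int main(void)
  {
      size_t size  = 64UL * 1024 * 1024;   /* 64 MiB */
      int    iters = 20;
      char *src = malloc(size), *dst = malloc(size);

      memset(src, 1, size);                /* fault the pages in */
      memset(dst, 0, size);

      struct timespec t0, t1;
      clock_gettime(CLOCK_MONOTONIC, &t0);
      for (int i = 0; i < iters; i++)
          memcpy(dst, src, size);
      clock_gettime(CLOCK_MONOTONIC, &t1);

      double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
      printf("memcpy: %.1f MiB/s\n", (double)size * iters / (1024 * 1024) / secs);
      return 0;
  }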
To ensure comparability with the native setup, I disabled all but one core in native Linux and fixed that core's frequency to 1 GHz (which should be the frequency Genode uses). Both setups use the same kernel version.
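Concretely, the pinning boils down to the standard Linux CPU-hotplug and cpufreq sysfs interface. A rough sketch of what I mean (assuming a quad-core i.MX8 and that the userspace governor is available; needs root):

  /* Sketch: take cpu1..cpu3 offline and fix cpu0 to 1 GHz via sysfs.
   * Standard Linux hotplug/cpufreq paths; the available governors and
   * frequency steps depend on the board and kernel config. */
  #include <stdio.h>

  static void write_sysfs(const char *path, const char *value)
  {
      FILE *f = fopen(path, "w");
      if (!f) { perror(path); return; }
      fputs(value, f);
      fclose(f);
  }

  int main(void)
  {
      write_sysfs("/sys/devices/system/cpu/cpu1/online", "0");
      write_sysfs("/sys/devices/system/cpu/cpu2/online", "0");
      write_sysfs("/sys/devices/system/cpu/cpu3/online", "0");

      write_sysfs("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor",
                  "userspace");
      write_sysfs("/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed",
                  "1000000");   /* in kHz, i.e. 1 GHz */
      return 0;
  }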
To improve performance, I played with the register bits controlling caching: clearing HCR_EL2.CD to enable stage-2 data cacheability, and setting VTCR_EL2.IRGN0 and VTCR_EL2.ORGN0 to 0b11, i.e. Normal memory, Inner/Outer Write-Back Read-Allocate No Write-Allocate Cacheable. However, this had no significant effect.
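For completeness, this is roughly what that experiment looks like in code (it has to run at EL2, i.e. inside the kernel; bit positions taken from the ARMv8-A reference manual; the sketch is illustrative rather than the actual Genode code):

  #include <stdint.h>

  /* Illustrative only: enable stage-2 data cacheability and set the
   * cacheability attributes used for stage-2 translation-table walks. */
  static inline void enable_stage2_cacheability(void)
  {
      uint64_t hcr, vtcr;

      /* HCR_EL2.CD (bit 32) = 0: stage-2 data accesses are cacheable */
      asm volatile("mrs %0, hcr_el2" : "=r"(hcr));
      hcr &= ~(1ULL << 32);
      asm volatile("msr hcr_el2, %0" :: "r"(hcr));

      /* VTCR_EL2.IRGN0 (bits 9:8) and ORGN0 (bits 11:10) = 0b11:
       * Normal memory, Inner/Outer Write-Back Read-Allocate
       * No Write-Allocate Cacheable for the stage-2 table walks */
      asm volatile("mrs %0, vtcr_el2" : "=r"(vtcr));
      vtcr |= (3ULL << 8) | (3ULL << 10);
      asm volatile("msr vtcr_el2, %0" :: "r"(vtcr));

      asm volatile("isb" ::: "memory");
  }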
What's your opinion on these results? Based on your experience with virtualization, do these numbers surprise you, or is this an expected degradation?
All the best, Chris
[1] https://github.com/ssvb/tinymembench
[2] http://lmbench.sourceforge.net/
Hi Chris,
On Mon, Jun 08, 2020 at 10:33:49PM +0200, Chris Hofer wrote:
> Hi everyone,
> I'm currently measuring the memory bandwidth of the Genode ARMv8 Linux virtualization by running tinymembench [1] and lmbench's memory-bandwidth benchmark [2]. On the NXP i.MX8, both benchmarks showed a degradation of roughly 25 to 35 percent (depending on the exact benchmark) compared to native Linux.
> To ensure comparability with the native setup, I disabled all but one core in native Linux and fixed that core's frequency to 1 GHz (which should be the frequency Genode uses). Both setups use the same kernel version.
> To improve performance, I played with the register bits controlling caching: clearing HCR_EL2.CD to enable stage-2 data cacheability, and setting VTCR_EL2.IRGN0 and VTCR_EL2.ORGN0 to 0b11, i.e. Normal memory, Inner/Outer Write-Back Read-Allocate No Write-Allocate Cacheable. However, this had no significant effect.
> What's your opinion on these results? Based on your experience with virtualization, do these numbers surprise you, or is this an expected degradation?
I would expect the memory bandwidth to be better, but that is more of a gut feeling; I've never measured the memory bandwidth of a virtualized Linux VM before. What I have measured is the overhead of a 1:1-paged two-stage mapping using LPAE on ARMv7, which uses the same format we've chosen for ARMv8 with respect to the page-table depth of stage 1 and stage 2. Back then, I measured a memory bandwidth degradation of ~4%.
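To put a rough number on the pure translation overhead: on a TLB miss, every level of the guest's stage-1 walk is itself translated through stage 2, so with n stage-1 and m stage-2 levels the worst case is

  n*(m+1) + m   memory accesses per miss
  n = m = 4:    4*5 + 4 = 24   (versus 4 for a native 4-level walk)
  n = m = 3:    3*4 + 3 = 15   (versus 3)

The actual depth depends on granule size and IPA width, and walk caches plus large block mappings hide most of this in streaming workloads, which would be consistent with the ~4% figure above.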
So I assume the VM is not the only component being scheduled during the measured time span? Are there any other components running, like a user-land timer, drivers, etc.? If so, that would naturally explain why you lose time for copying within your measurement time span.
Best regards,
Stefan