Cache invalidation issue on arm

Stefan Kalkowski stefan.kalkowski at
Mon Mar 4 12:19:22 CET 2019

Hello Tomasz,

On Sun, Mar 03, 2019 at 01:19:06AM +0100, Tomasz Gajewski wrote:
> Hi,
> during my attempts to make framebuffer driver working on rpi3b+ I had
> problems with memory caching. Effect was that I received from gpu
> information that result message was processed but couldn't see results -
> read memory always contained content of message and not response.
> Before I found a simple fix - ask dataspace for an uncached memory - I
> tried to call 'Kernel::update_data_region' but it didn't work for
> me. During my experiments I tried to dump memory from
> 'Kernel::Thread::_call_update_data_region()' for virtual address range
> passed to this function and it worked.
> In that function there is a comment that states that kernel operates in
> different address space than the caller for threads other than core. In
> that case different method of cache invalidation was called:
> 'clean_invalidate_data_cache', that should invalidate all cache. For
> core threads 'clean_invalidate_data_cache_by_virt_region' was used.
> Successfully dumping memory passed to _call_update_data_region() (using
> virtual address) made me think that a comment there about address spaces
> is wrong and - as an attempt to see what happens - I changed code to
> always call invalidating memory for given region only. This change [1]
> resolved issues with not seeing responses from gpu regarding framebuffer
> setup.
> If I'm not mistaken then:
>  1. Comment in 'Kernel::Thread::_call_update_data_region()' is not true
>     anymore and code can be changed to always call more efficient
>     version that invalidates always only cache for region of memory.

You are right, that comment and the branch are artefacts from the time
when the kernel used a separate address space. The code should always
invalidate only the cache lines of the given region instead of the
whole cache. In my defense, those functions are hardly used by any
component. The only use case I know of is the JavaScript JIT compiler
of Arora. But anyway, it needs to be fixed.

>  2. 'clean_invalidate_data_cache' is not working on rpi3b+ properly. I
>     wouldn't be really surprised as I used code from Cortex A15,
>     available already in Genode, without checking if Cortex A53 requires
>     a different code (even for AArch32 mode).

Well, this leaves me puzzled. I would have assumed that at least the
overall cross-core cache invalidation should work here too. Actually,
there are not many differences between the various Armv7 cores with
regard to the data-cache clean/invalidate operations, apart from the
special outer L2 cache of Cortex A9 CPUs.
Maybe the cross-core cache coherency is not set up appropriately? To
me it is not quite clear how SMP setup is done on Cortex A53. When
looking into the manual, it seems to me that the ACTLR register is
implemented differently, but I'm not sure whether I understood it
correctly. On Cortex A9 / A15 CPUs, you had to enable the coherency
unit by setting the SMP bit in the ACTLR register before enabling the
MMU in multi-core environments.
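
As a small sketch of the SMP-bit step mentioned above: the helpers below
only manipulate a register value and do not access the hardware. The
names 'with_smp_enabled' and 'smp_enabled' are hypothetical, and the bit
position (bit 6 of ACTLR) is an assumption based on the Cortex A9 / A15
documentation. For Cortex A53, the TRM appears to document the
corresponding SMPEN bit in CPUECTLR rather than ACTLR, which should be
verified before relying on it.

```cpp
#include <cstdint>

// Assumption: bit 6 is the SMP bit of ACTLR on Cortex A9 / A15.
constexpr std::uint32_t ACTLR_SMP_BIT = 1u << 6;

// Hypothetical helper: returns the register value with the SMP bit set,
// leaving all other bits untouched.
constexpr std::uint32_t with_smp_enabled(std::uint32_t actlr)
{
    return actlr | ACTLR_SMP_BIT;
}

// Hypothetical helper: tells whether the SMP bit is set in the value.
constexpr bool smp_enabled(std::uint32_t actlr)
{
    return (actlr & ACTLR_SMP_BIT) != 0;
}
```

In a boot path, the value read from the register would be passed through
'with_smp_enabled' and written back before the MMU and caches are
switched on.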


> Please give me information if kernel is now mapped into every address
> space (as is stated in the aforementioned comment as a future goal) and
> my change is a correct one. If it is not the case can you provide some
> other possible explanation of this?
> Tomasz Gajewski
> PS. As a minor addition I found a trivial function documentation bug that
>     I fixed in [2].
> [1]
> [2]

Stefan Kalkowski
Genode labs |
