Hi Michael,
On Fri, 20 Jan 2023 16:40:10 +0100 Michael Grunditz <michael.grunditz@gmail.com> wrote:
Hello,
Is there any particular reason why it is empty? My rect copy to the fb in RISC OS uses NEON. It gives a speed gain of about 40% compared to a word/long-word copy from C. But I don't know how much it would affect Genode.
It seems like it ends up in /* eight bytes chunks */ but isn't that a byte copy?
There is no particular reason why the implementation is (i.e. "was") empty. You can find a recent commit on the staging branch that applies a few obvious optimisations to all architectures, though: https://github.com/genodelabs/genode/commit/4d06661d7c3f7b798ec8228f04983bd4...
For 32-bit Arm, I optimised the memcpy_cpu implementation a while ago (see Issue #4456). Interestingly, I could not see any improvement when using NEON, at least on ARMv7. I got the impression that instruction density is not an issue when using the multi-word load/store instructions (ldm/stm).
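Regarding the "eight bytes chunks" question, here is a minimal sketch of what a chunked copy generally looks like. It is only an illustration under stated assumptions, not the code from the commit or from memcpy_cpu: the name copy_chunked is hypothetical, and the sketch relies on the compiler lowering a fixed-size std::memcpy of eight bytes to a single 64-bit load/store (or a pair of 32-bit ones), which is common but not guaranteed.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

/*
 * Hypothetical helper for illustration only, not Genode's memcpy_cpu.
 * Copies 'size' bytes from 'src' to 'dst' in eight-byte chunks and
 * finishes the remainder byte by byte. The buffers must not overlap.
 */
inline void *copy_chunked(void *dst, void const *src, std::size_t size)
{
	auto       *d = static_cast<unsigned char *>(dst);
	auto const *s = static_cast<unsigned char const *>(src);

	/* eight-byte chunks: one 64-bit value is moved per iteration */
	for (; size >= 8; size -= 8, d += 8, s += 8) {
		std::uint64_t chunk;
		std::memcpy(&chunk, s, sizeof(chunk));  /* load, alignment-safe  */
		std::memcpy(d, &chunk, sizeof(chunk));  /* store, alignment-safe */
	}

	/* tail: at most seven remaining bytes */
	for (; size > 0; --size)
		*d++ = *s++;

	return dst;
}
```

Whether such a loop really ends up as wide loads/stores rather than a byte copy depends on the compiler and the optimisation level, so it is worth checking the generated assembly for the target in question.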
Johannes