On Fri, 20 Jan 2023 at 20:15, Michael Grunditz michael.grunditz@gmail.com wrote:
There is no particular reason why the implementation is (i.e. "was") empty. You can find a recent commit on the staging branch that applies a few obvious optimisations to all architectures, though: https://github.com/genodelabs/genode/commit/4d06661d7c3f7b798ec8228f04983bd4...
Is there any content in the commit?
For 32-bit ARM, I optimised the memcpy_cpu implementation a while ago (see Issue #4456). Interestingly, I could not see any improvement when using NEON, at least on ARMv7. I got the impression that instruction density is not an issue when using the multi-word load/store instructions (ldm/stm).
OK, I think that makes more sense now. And yes, for v7 it might not help. In my experience, the only way NEON can be used effectively is in a test/copy routine that handles all sizes. Most arm/arm64 libc implementations do that.
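To illustrate what a "test/copy routine for all sizes" could look like, here is a rough sketch in portable C++. It is not the Genode memcpy_cpu code and not taken from any libc; the names copy_small, copy_blocks and copy_dispatch and the 32-byte threshold are made up for the example. A real arm/arm64 version would replace the block loop with hand-written ldm/stm or NEON loads and stores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: byte-wise copy for short lengths and for the tail.
static void copy_small(unsigned char *d, const unsigned char *s, std::size_t n)
{
    while (n--)
        *d++ = *s++;
}

// Hypothetical helper: copy as many whole 32-byte blocks as fit and return
// the number of bytes copied. The fixed-size std::memcpy stands in for the
// multi-register load/store an assembly version would use.
static std::size_t copy_blocks(unsigned char *d, const unsigned char *s, std::size_t n)
{
    std::size_t done = 0;
    for (; n - done >= 32; done += 32)
        std::memcpy(d + done, s + done, 32);
    return done;
}

// Test the size first, then pick the copy strategy (non-overlapping buffers).
void *copy_dispatch(void *dst, const void *src, std::size_t n)
{
    assert(n == 0 || (dst != nullptr && src != nullptr));

    auto *d = static_cast<unsigned char *>(dst);
    auto *s = static_cast<const unsigned char *>(src);

    if (n < 32) {               // short copy: skip the block loop entirely
        copy_small(d, s, n);
        return dst;
    }

    std::size_t const done = copy_blocks(d, s, n);  // bulk part
    copy_small(d + done, s + done, n - done);       // remaining tail (< 32 bytes)
    return dst;
}
```

The point of the sketch is only the structure: one entry point that tests the length and branches, so that short copies never pay the setup cost of the wide path.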
I would like to try this, but I have no idea where to put the .s file in order to build it. I don't want to have it inline since it is quite big.
I have something that seems to work, even though I get a crash from test-log that I haven't solved. It could be because the .S file is built into every component. So the question is: where do I put it? It needs to live in "base", I guess, but I don't know how to set that up. The rest of the system doesn't resolve the symbol.
/Michael