memcpy_cpu on 64bit arm
Michael Grunditz
michael.grunditz at gmail.com
Mon Jan 23 12:13:11 CET 2023
On Fri, 20 Jan 2023 at 20:15, Michael Grunditz
<michael.grunditz at gmail.com> wrote:
>
> > There is no particular reason why the implementation is (i.e. "was")
> > empty. You can find a recent commit on the staging branch that applies a
> > few obvious optimisations to all architectures, though:
> > https://github.com/genodelabs/genode/commit/4d06661d7c3f7b798ec8228f04983bd4ae7cddcf
> >
> Is there any content in the commit?
>
> > For 32bit arm, I optimised the memcpy_cpu implementation a while ago
> > (see Issue #4456). Interestingly, I could not see any improvements when
> > using neon, at least on arm v7. I got the impression that the
> > instruction density is not an issue when using the multi-word
> > load/store (ldm/stm).
>
> Ok. I think that it makes more sense now. But yes, for v7 it might not
> help. The only way, from my experience, that neon can be used
> effectively is in a test/copy routine for all sizes. Most arm/arm64
> libcs do that.
>
> I would like to try this, but I have no idea where to put the .s file
> in order to build it. I don't want to have it inline since it is
> quite big.
I have something that seems to work, although I get a crash from
test-log that I haven't solved yet.
It could be because the .S file is built into every component. So the
question is: where do I put it?
I guess it needs to live in "base", but I don't know how to do that.
The rest of the system doesn't resolve the symbol.
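
To illustrate what is meant by a test/copy routine for all sizes, here
is a minimal sketch in portable C. It is not the assembly routine
discussed above and not Genode's memcpy_cpu; the name copy_by_size and
the 16-byte threshold are assumptions chosen only for illustration. The
idea is to test the length first and then branch to a copy path suited
to that size class.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical size-dispatching copy (illustration only).
 * The buffers must not overlap, as with memcpy.
 */
void *copy_by_size(void *dst, const void *src, size_t n)
{
    unsigned char       *d = dst;
    const unsigned char *s = src;

    /* Small copies: a plain byte loop avoids any setup cost. */
    if (n < 16) {
        while (n--)
            *d++ = *s++;
        return dst;
    }

    /*
     * Bulk path: move 16 bytes per iteration. The fixed-size memcpy
     * calls are an alignment-safe way to express 64-bit loads and
     * stores in C. In a hand-written AArch64 routine this is where
     * paired or NEON loads/stores would be used instead.
     */
    while (n >= 16) {
        uint64_t a, b;
        memcpy(&a, s,     8);
        memcpy(&b, s + 8, 8);
        memcpy(d,     &a, 8);
        memcpy(d + 8, &b, 8);
        s += 16;
        d += 16;
        n -= 16;
    }

    /* Tail: the remaining 0..15 bytes. */
    while (n--)
        *d++ = *s++;

    return dst;
}
```

A real implementation would usually have more size classes and would
handle alignment explicitly; this sketch only shows the dispatch
structure.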
/Michael