Message ID | 20230415112340.38431-1-xry111@xry111.site |
---|---|
Series | LoongArch: Multiarch string and memory copy routines for unaligned access |

We are preparing a series of patches that includes ifunc support (aligned, unaligned, and vectorized assembly implementations) for the str/mem functions, tunable functionality, and a vectorized _dl_runtime_resolve. However, we are not currently able to submit them to the upstream community. We may consider publishing them on GitHub in the future, like gcc and binutils. We will keep your patches for the time being.

On 2023/4/15 at 7:23 PM, Xi Ruoyao wrote:
> LoongArch CPUs may have hardware unaligned access support. For the
> launched LoongArch CPUs, those branded as Loongson-3 (for desktops or
> servers) have hardware unaligned access support, but those branded as
> Loongson-2 (for embedded or industrial applications) do not.
>
> On Linux, the unaligned access support is indicated by a HWCAP bit
> provided by the kernel. So we can multiarch stpcpy and memcpy with
> ifunc to take the advantage on the CPUs with unaligned access support.
>
> On a Loongson-3A5000HV CPU running at 2.5GHz, "make bench" has shown
> these changes can really improve the performance:
>
> - https://www.linuxfromscratch.org/~xry111/loongarch-ual-bench/bench-stpcpy-summary.txt
> - https://www.linuxfromscratch.org/~xry111/loongarch-ual-bench/bench-memcpy-summary.txt
>
> Xi Ruoyao (5):
>   LoongArch: Add bits/hwcap.h for Linux
>   LoongArch: Add LOONGARCH_HAVE_UAL macro
>   string: stpcpy.c: Only alias __stpcpy to stpcpy if STPCPY undefined
>   LoongArch: Multiarch stpcpy for unaligned access
>   LoongArch: Multiarch memcpy for unaligned access
>
>  string/stpcpy.c | 3 ++
>  sysdeps/loongarch/loongarch-features.h | 26 ++++++++++
>  sysdeps/loongarch/multiarch/Makefile | 6 +++
>  sysdeps/loongarch/multiarch/memcpy-generic.c | 27 ++++++++++
>  sysdeps/loongarch/multiarch/memcpy-ual.c | 50 +++++++++++++++++++
>  sysdeps/loongarch/multiarch/memcpy.c | 39 +++++++++++++++
>  sysdeps/loongarch/multiarch/stpcpy-generic.c | 25 ++++++++++
>  sysdeps/loongarch/multiarch/stpcpy-ual.c | 43 ++++++++++++++++
>  sysdeps/loongarch/multiarch/stpcpy.c | 37 ++++++++++++++
>  .../loongarch/multiarch/wordcopy-ual-inline.c | 31 ++++++++++++
>  .../unix/sysv/linux/loongarch/bits/hwcap.h | 37 ++++++++++++++
>  .../sysv/linux/loongarch/loongarch-features.h | 30 +++++++++++
>  sysdeps/unix/sysv/linux/loongarch/sysdep.h | 1 +
>  13 files changed, 355 insertions(+)
>  create mode 100644 sysdeps/loongarch/loongarch-features.h
>  create mode 100644 sysdeps/loongarch/multiarch/Makefile
>  create mode 100644 sysdeps/loongarch/multiarch/memcpy-generic.c
>  create mode 100644 sysdeps/loongarch/multiarch/memcpy-ual.c
>  create mode 100644 sysdeps/loongarch/multiarch/memcpy.c
>  create mode 100644 sysdeps/loongarch/multiarch/stpcpy-generic.c
>  create mode 100644 sysdeps/loongarch/multiarch/stpcpy-ual.c
>  create mode 100644 sysdeps/loongarch/multiarch/stpcpy.c
>  create mode 100644 sysdeps/loongarch/multiarch/wordcopy-ual-inline.c
>  create mode 100644 sysdeps/unix/sysv/linux/loongarch/bits/hwcap.h
>  create mode 100644 sysdeps/unix/sysv/linux/loongarch/loongarch-features.h
>
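
For readers following the thread, the HWCAP-based selection described in the quoted cover letter can be sketched roughly as below. This is not the code from the patches: the names `SKETCH_HWCAP_UAL`, `copy_generic`, `copy_ual`, and `select_copy` are hypothetical, the bit position is an assumption (bit 2, which is how I recall `HWCAP_LOONGARCH_UAL` being defined in the Linux uapi headers), and plain function-pointer selection stands in for a real ifunc resolver so that the sketch builds on any platform.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: unaligned-access support is reported in bit 2 of AT_HWCAP.
   Check the kernel's hwcap.h for LoongArch before relying on this value.  */
#define SKETCH_HWCAP_UAL (1UL << 2)

typedef void *(*copy_fn) (void *, const void *, size_t);

/* Byte-by-byte copy: never issues a misaligned multi-byte access, so it
   is safe on CPUs without hardware unaligned access support.  */
void *
copy_generic (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  while (n--)
    *d++ = *s++;
  return dst;
}

/* Word-at-a-time copy that does not align the pointers first.  The
   fixed-size memcpy calls are the portable way to express an unaligned
   8-byte load/store; a compiler targeting a CPU with unaligned access
   support can typically lower each to a single instruction.  */
void *
copy_ual (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  for (; n >= sizeof (uint64_t); n -= sizeof (uint64_t))
    {
      uint64_t w;
      memcpy (&w, s, sizeof w);
      memcpy (d, &w, sizeof w);
      s += sizeof w;
      d += sizeof w;
    }
  while (n--)          /* Copy the remaining tail bytes.  */
    *d++ = *s++;
  return dst;
}

/* Pick an implementation from the hwcap word, as a resolver would.  */
copy_fn
select_copy (unsigned long hwcap)
{
  return (hwcap & SKETCH_HWCAP_UAL) ? copy_ual : copy_generic;
}
```

On a running Linux system the `hwcap` argument would come from `getauxval (AT_HWCAP)` in `<sys/auxv.h>`. With ifunc, the selection is made once by the dynamic linker at symbol resolution time rather than by the caller, so call sites pay no per-call dispatch cost.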