[0/6] LoongArch: Add ifunc support for {raw}memchr,

Message ID 20230828072651.3085034-1-dengjianbo@loongson.cn

dengjianbo Aug. 28, 2023, 7:26 a.m. UTC
This patch series adds multiple versions of rawmemchr, memchr, memrchr,
memset, and memcmp, implemented with LoongArch basic instructions, LSX
instructions, and LASX instructions. Compared with the current generic
versions, these implementations show performance degradation in a few
cases, but overall the performance gains are significant.

See:
https://github.com/jiadengx/glibc_test/blob/main/bench/rawmemchr_compare.out
https://github.com/jiadengx/glibc_test/blob/main/bench/memchr_compare.out
https://github.com/jiadengx/glibc_test/blob/main/bench/memrchr_compare.out
https://github.com/jiadengx/glibc_test/blob/main/bench/memset_compare.out
https://github.com/jiadengx/glibc_test/blob/main/bench/memcmp_compare.out

In the data, a positive value in parentheses indicates that our
implementation took less time (a performance improvement); a negative
value indicates that our implementation took more time (a performance
decrease). The following is a summary of the performance compared with
the generic version in the glibc microbenchmark:

Name                   Percent of time reduced
rawmemchr-lasx         40%-80%
rawmemchr-lsx          40%-66%
rawmemchr-aligned      20%-40%

memchr-lasx            37%-83%
memchr-lsx             30%-66%
memchr-aligned         0%-15%

memrchr-lasx           20%-83%
memrchr-lsx            20%-64%

memset-lasx            15%-75%
memset-lsx             15%-50%
memset-unaligned       30%-70% for lengths 8-128; close to generic for
                       lengths larger than 128
memset-aligned         20%-50% for lengths 8-128; close to generic for
                       lengths larger than 128

memcmp-lasx            16%-74%
memcmp-lsx             20%-50%
memcmp-aligned         5%-20%
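
As a reading aid for the "Percent of time reduced" column above, here is
a minimal sketch of how such a figure could be computed from two
benchmark timings. It assumes the figure is the time saved relative to
the generic version's time; the benchmark scripts behind the linked
outputs are not shown here, and the function name percent_time_reduced
is hypothetical.

```c
/* Sketch only: the formula is an assumption, not taken from the
   benchmark scripts that produced the linked *_compare.out files.  */

/* Percentage of time saved relative to the generic version.
   Positive: the new implementation took less time.
   Negative: the new implementation took more time.
   generic_time must be non-zero.  */
static double
percent_time_reduced (double generic_time, double new_time)
{
  return 100.0 * (generic_time - new_time) / generic_time;
}
```

Under this reading, a generic timing of 100 units against a new timing
of 60 units gives 40, and a new timing of 125 units gives -25.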

dengjianbo (6):
  LoongArch: Add ifunc support for rawmemchr{aligned, lsx, lasx}
  LoongArch: Add ifunc support for memchr{aligned, lsx, lasx}
  LoongArch: Add ifunc support for memrchr{lsx, lasx}
  LoongArch: Add ifunc support for memset{aligned, unaligned, lsx, lasx}
  LoongArch: Add ifunc support for memcmp{aligned, lsx, lasx}
  LoongArch: Change loongarch to LoongArch in comments
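
For readers unfamiliar with ifunc, the idea behind choosing among the
aligned, lsx, and lasx variants named in the patch titles above can be
sketched in plain C. This is an illustration, not the code in this
series: it uses an ordinary function pointer instead of the GNU ifunc
mechanism, the FEATURE_* flags and select_memchr are hypothetical names,
the three variants are stand-ins that forward to the C library memchr,
and the preference order (LASX, then LSX, then aligned) is an
assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical CPU feature flags.  The real series obtains hardware
   capabilities through glibc internals not shown in this cover letter.  */
enum { FEATURE_LSX = 1 << 0, FEATURE_LASX = 1 << 1 };

typedef void *(*memchr_fn) (const void *, int, size_t);

/* Stand-ins for the three variants.  Each forwards to the C library
   memchr so the sketch runs on any machine.  */
static void *
memchr_aligned (const void *s, int c, size_t n)
{
  return memchr (s, c, n);
}

static void *
memchr_lsx (const void *s, int c, size_t n)
{
  return memchr (s, c, n);
}

static void *
memchr_lasx (const void *s, int c, size_t n)
{
  return memchr (s, c, n);
}

/* Pick the widest vector variant the CPU reports, else fall back to
   the variant built from basic instructions (assumed order).  */
static memchr_fn
select_memchr (unsigned int features)
{
  if (features & FEATURE_LASX)
    return memchr_lasx;
  if (features & FEATURE_LSX)
    return memchr_lsx;
  return memchr_aligned;
}
```

A caller would run select_memchr once with the detected feature bits and
then call through the returned pointer. With real ifunc, the dynamic
loader performs that selection when the symbol is resolved, so callers
simply call memchr.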

 sysdeps/loongarch/lp64/multiarch/Makefile     |  16 +
 .../lp64/multiarch/dl-symbol-redir-ifunc.h    |  24 ++
 .../lp64/multiarch/ifunc-impl-list.c          |  40 +++
 .../loongarch/lp64/multiarch/ifunc-memchr.h   |  40 +++
 .../loongarch/lp64/multiarch/ifunc-memcmp.h   |  40 +++
 .../loongarch/lp64/multiarch/ifunc-memrchr.h  |  40 +++
 .../lp64/multiarch/ifunc-rawmemchr.h          |  40 +++
 .../loongarch/lp64/multiarch/memchr-aligned.S |  95 ++++++
 .../loongarch/lp64/multiarch/memchr-lasx.S    | 117 +++++++
 sysdeps/loongarch/lp64/multiarch/memchr-lsx.S | 102 ++++++
 sysdeps/loongarch/lp64/multiarch/memchr.c     |  37 +++
 .../loongarch/lp64/multiarch/memcmp-aligned.S | 292 ++++++++++++++++++
 .../loongarch/lp64/multiarch/memcmp-lasx.S    | 207 +++++++++++++
 sysdeps/loongarch/lp64/multiarch/memcmp-lsx.S | 269 ++++++++++++++++
 sysdeps/loongarch/lp64/multiarch/memcmp.c     |  43 +++
 .../loongarch/lp64/multiarch/memcpy-aligned.S |   2 +-
 .../loongarch/lp64/multiarch/memcpy-lasx.S    |   2 +-
 sysdeps/loongarch/lp64/multiarch/memcpy-lsx.S |   2 +-
 .../lp64/multiarch/memcpy-unaligned.S         |   2 +-
 .../lp64/multiarch/memmove-aligned.S          |   2 +-
 .../loongarch/lp64/multiarch/memmove-lasx.S   |   2 +-
 .../loongarch/lp64/multiarch/memmove-lsx.S    |   2 +-
 .../lp64/multiarch/memmove-unaligned.S        |   2 +-
 .../lp64/multiarch/memrchr-generic.c          |  23 ++
 .../loongarch/lp64/multiarch/memrchr-lasx.S   | 123 ++++++++
 .../loongarch/lp64/multiarch/memrchr-lsx.S    | 105 +++++++
 sysdeps/loongarch/lp64/multiarch/memrchr.c    |  33 ++
 .../loongarch/lp64/multiarch/memset-aligned.S | 174 +++++++++++
 .../loongarch/lp64/multiarch/memset-lasx.S    | 142 +++++++++
 sysdeps/loongarch/lp64/multiarch/memset-lsx.S | 135 ++++++++
 .../lp64/multiarch/memset-unaligned.S         | 162 ++++++++++
 sysdeps/loongarch/lp64/multiarch/memset.c     |  37 +++
 .../lp64/multiarch/rawmemchr-aligned.S        | 124 ++++++++
 .../loongarch/lp64/multiarch/rawmemchr-lasx.S |  82 +++++
 .../loongarch/lp64/multiarch/rawmemchr-lsx.S  |  71 +++++
 sysdeps/loongarch/lp64/multiarch/rawmemchr.c  |  37 +++
 .../loongarch/lp64/multiarch/strchr-aligned.S |   2 +-
 .../loongarch/lp64/multiarch/strchr-lasx.S    |   2 +-
 sysdeps/loongarch/lp64/multiarch/strchr-lsx.S |   2 +-
 .../lp64/multiarch/strchrnul-aligned.S        |   2 +-
 .../loongarch/lp64/multiarch/strchrnul-lasx.S |   2 +-
 .../loongarch/lp64/multiarch/strchrnul-lsx.S  |   2 +-
 .../loongarch/lp64/multiarch/strcmp-aligned.S |   2 +-
 sysdeps/loongarch/lp64/multiarch/strcmp-lsx.S |   2 +-
 .../loongarch/lp64/multiarch/strlen-aligned.S |   2 +-
 .../loongarch/lp64/multiarch/strlen-lasx.S    |   2 +-
 sysdeps/loongarch/lp64/multiarch/strlen-lsx.S |   2 +-
 .../lp64/multiarch/strncmp-aligned.S          |   2 +-
 .../loongarch/lp64/multiarch/strncmp-lsx.S    |   2 +-
 .../lp64/multiarch/strnlen-aligned.S          |   2 +-
 .../loongarch/lp64/multiarch/strnlen-lasx.S   |   2 +-
 .../loongarch/lp64/multiarch/strnlen-lsx.S    |   2 +-
 52 files changed, 2674 insertions(+), 24 deletions(-)
 create mode 100644 sysdeps/loongarch/lp64/multiarch/dl-symbol-redir-ifunc.h
 create mode 100644 sysdeps/loongarch/lp64/multiarch/ifunc-memchr.h
 create mode 100644 sysdeps/loongarch/lp64/multiarch/ifunc-memcmp.h
 create mode 100644 sysdeps/loongarch/lp64/multiarch/ifunc-memrchr.h
 create mode 100644 sysdeps/loongarch/lp64/multiarch/ifunc-rawmemchr.h
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memchr-aligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memchr-lasx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memchr-lsx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memchr.c
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memcmp-aligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memcmp-lasx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memcmp-lsx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memcmp.c
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memrchr-generic.c
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memrchr-lasx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memrchr-lsx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memrchr.c
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memset-aligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memset-lasx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memset-lsx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memset-unaligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/memset.c
 create mode 100644 sysdeps/loongarch/lp64/multiarch/rawmemchr-aligned.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/rawmemchr-lasx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/rawmemchr-lsx.S
 create mode 100644 sysdeps/loongarch/lp64/multiarch/rawmemchr.c