From patchwork Sat Apr 15 11:23:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xi Ruoyao
X-Patchwork-Id: 1769277
To: libc-alpha@sourceware.org
Cc: caiyinyu, Wang Xuerui, Adhemerval Zanella Netto, Xi Ruoyao
Subject: [PATCH 5/5] LoongArch: Multiarch memcpy for unaligned access
Date: Sat, 15 Apr 2023 19:23:40 +0800
Message-Id: <20230415112340.38431-6-xry111@xry111.site>
X-Mailer: git-send-email 2.40.0
In-Reply-To: <20230415112340.38431-1-xry111@xry111.site>
References: <20230415112340.38431-1-xry111@xry111.site>
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: Xi Ruoyao via Libc-alpha
From: Xi Ruoyao
Reply-To: Xi Ruoyao

When the CPU supports unaligned access, we can align the dest pointer
first (this performs better than relying solely on the hardware's
unaligned access support), then copy the memory area word by word as
if both src and dest were aligned.
---
 sysdeps/loongarch/multiarch/Makefile          |  3 +-
 sysdeps/loongarch/multiarch/memcpy-generic.c  | 27 ++++++++++
 sysdeps/loongarch/multiarch/memcpy-ual.c      | 50 +++++++++++++++++++
 sysdeps/loongarch/multiarch/memcpy.c          | 39 +++++++++++++++
 .../loongarch/multiarch/wordcopy-ual-inline.c | 31 ++++++++++++
 5 files changed, 149 insertions(+), 1 deletion(-)
 create mode 100644 sysdeps/loongarch/multiarch/memcpy-generic.c
 create mode 100644 sysdeps/loongarch/multiarch/memcpy-ual.c
 create mode 100644 sysdeps/loongarch/multiarch/memcpy.c
 create mode 100644 sysdeps/loongarch/multiarch/wordcopy-ual-inline.c

diff --git a/sysdeps/loongarch/multiarch/Makefile b/sysdeps/loongarch/multiarch/Makefile
index 958752bcbd..34e2f2a334 100644
--- a/sysdeps/loongarch/multiarch/Makefile
+++ b/sysdeps/loongarch/multiarch/Makefile
@@ -1,5 +1,6 @@
 ifeq ($(subdir),string)
-sysdep_routines += stpcpy-generic stpcpy-ual
+sysdep_routines += stpcpy-generic stpcpy-ual memcpy-generic memcpy-ual
 
 CFLAGS-stpcpy-ual.c += -mno-strict-align
+CFLAGS-memcpy-ual.c += -mno-strict-align
 endif
diff --git a/sysdeps/loongarch/multiarch/memcpy-generic.c b/sysdeps/loongarch/multiarch/memcpy-generic.c
new file mode 100644
index 0000000000..9374ced033
--- /dev/null
+++ b/sysdeps/loongarch/multiarch/memcpy-generic.c
@@ -0,0 +1,27 @@
+/* Multiarch memcpy for LoongArch. Generic version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+
+extern __typeof (memcpy) __memcpy_generic attribute_hidden;
+
+#define MEMCPY __memcpy_generic
+#undef libc_hidden_def
+#define libc_hidden_def(name)
+
+#include
diff --git a/sysdeps/loongarch/multiarch/memcpy-ual.c b/sysdeps/loongarch/multiarch/memcpy-ual.c
new file mode 100644
index 0000000000..e7cd8f253b
--- /dev/null
+++ b/sysdeps/loongarch/multiarch/memcpy-ual.c
@@ -0,0 +1,50 @@
+/* Multiarch memcpy for LoongArch. Unaligned access version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+
+extern __typeof (memcpy) __memcpy_ual attribute_hidden;
+
+#include "wordcopy-ual-inline.c"
+
+#define OPSIZ (sizeof (op_t))
+
+void *
+__memcpy_ual (void *dest, const void *src, size_t len)
+{
+  unsigned long int dstp = (long int) dest;
+  unsigned long int srcp = (long int) src;
+
+  /* If there are not too few bytes to copy, use word copy. */
+  if (len >= OP_T_THRES)
+    {
+      /* Copy just a few bytes to make DSTP aligned. Not needed with
+         unaligned access support, but it improves the performance. */
+      len -= (-dstp) % OPSIZ;
+      BYTE_COPY_FWD (dstp, srcp, (-dstp) % OPSIZ);
+
+      _wordcopy_fwd_ual (dstp, srcp, len / OPSIZ);
+      dstp += len & -OPSIZ;
+      srcp += len & -OPSIZ;
+      len %= OPSIZ;
+    }
+
+  BYTE_COPY_FWD (dstp, srcp, len);
+
+  return dest;
+}
diff --git a/sysdeps/loongarch/multiarch/memcpy.c b/sysdeps/loongarch/multiarch/memcpy.c
new file mode 100644
index 0000000000..6a3089f88c
--- /dev/null
+++ b/sysdeps/loongarch/multiarch/memcpy.c
@@ -0,0 +1,39 @@
+/* Multiple versions of memcpy. LoongArch version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#if defined SHARED && IS_IN (libc)
+# undef memcpy
+# define memcpy __redirect_memcpy
+# include
+# undef memcpy
+
+extern __typeof (__redirect_memcpy) __libc_memcpy;
+
+extern __typeof (__redirect_memcpy) __memcpy_generic attribute_hidden;
+extern __typeof (__redirect_memcpy) __memcpy_ual attribute_hidden;
+
+# include
+# define INIT_ARCH()
+
+libc_ifunc (__libc_memcpy,
+            LOONGARCH_HAVE_UAL ? __memcpy_ual : __memcpy_generic);
+strong_alias (__libc_memcpy, memcpy);
+libc_hidden_ver (__libc_memcpy, memcpy)
+#else
+# include
+#endif
diff --git a/sysdeps/loongarch/multiarch/wordcopy-ual-inline.c b/sysdeps/loongarch/multiarch/wordcopy-ual-inline.c
new file mode 100644
index 0000000000..a552aa6946
--- /dev/null
+++ b/sysdeps/loongarch/multiarch/wordcopy-ual-inline.c
@@ -0,0 +1,31 @@
+/* Reuse subroutine from string/wordcopy.c for LoongArch unaligned access.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+
+static __always_inline void _wordcopy_fwd_ual (long int, long int, size_t);
+static void _nouse_1 (long int, long int, size_t) __attribute__ ((unused));
+static void _nouse_2 (long int, long int, size_t) __attribute__ ((unused));
+static void _nouse_3 (long int, long int, size_t) __attribute__ ((unused));
+
+#define WORDCOPY_FWD_ALIGNED _wordcopy_fwd_ual
+#define WORDCOPY_BWD_ALIGNED _nouse_1
+#define WORDCOPY_FWD_DEST_ALIGNED _nouse_2
+#define WORDCOPY_BWD_DEST_ALIGNED _nouse_3
+
+#include
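
The approach in the commit message (byte-copy until dest is aligned, then copy whole words whatever the alignment of src) can be sketched in portable C. This is only an illustration, not the code in this patch. Every name in it (sketch_memcpy_ual, load_word, store_word, WORD_THRESHOLD, sketch_selfcheck) is hypothetical, and the threshold of 16 is an assumed stand-in for OP_T_THRES. It uses memcpy-based word loads and stores to stay within ISO C, where the patch instead relies on -mno-strict-align and the wordcopy.c routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; none of these come from glibc.  */
typedef unsigned long word_t;
#define WORD_SIZE (sizeof (word_t))
#define WORD_THRESHOLD 16 /* assumed cutoff, in the role of OP_T_THRES */

/* Load or store one word at a possibly unaligned address.  Going
   through memcpy keeps this well-defined; on targets that allow
   unaligned access a compiler can typically lower it to one access.  */
static inline word_t
load_word (const unsigned char *p)
{
  word_t w;
  memcpy (&w, p, WORD_SIZE);
  return w;
}

static inline void
store_word (unsigned char *p, word_t w)
{
  memcpy (p, &w, WORD_SIZE);
}

void *
sketch_memcpy_ual (void *dest, const void *src, size_t len)
{
  unsigned char *d = dest;
  const unsigned char *s = src;

  if (len >= WORD_THRESHOLD)
    {
      /* Bytes needed to reach the next word boundary of D (0 if D is
         already aligned).  Always less than WORD_THRESHOLD.  */
      size_t head = (size_t) ((-(uintptr_t) d) % WORD_SIZE);
      len -= head;
      for (; head > 0; head--)
        *d++ = *s++;

      /* D is aligned now; S may not be.  Copy whole words anyway.  */
      for (size_t n = len / WORD_SIZE; n > 0; n--)
        {
          store_word (d, load_word (s));
          d += WORD_SIZE;
          s += WORD_SIZE;
        }
      len %= WORD_SIZE;
    }

  /* Remaining tail, or the whole buffer when it is short.  */
  for (; len > 0; len--)
    *d++ = *s++;

  return dest;
}

/* Exercise every src/dest misalignment for lengths 0..64.
   Returns 1 if all copies match, 0 otherwise.  */
int
sketch_selfcheck (void)
{
  unsigned char src[64 + WORD_SIZE], dst[64 + WORD_SIZE];
  for (size_t i = 0; i < sizeof src; i++)
    src[i] = (unsigned char) (i * 7 + 1);

  for (size_t so = 0; so < WORD_SIZE; so++)
    for (size_t dof = 0; dof < WORD_SIZE; dof++)
      for (size_t len = 0; len <= 64; len++)
        {
          memset (dst, 0, sizeof dst);
          void *r = sketch_memcpy_ual (dst + dof, src + so, len);
          assert (r == dst + dof);
          if (memcmp (dst + dof, src + so, len) != 0)
            return 0;
        }
  return 1;
}
```

Only the stores are guaranteed to be aligned here; the loads still depend on the hardware (or the compiler) handling an unaligned src. That matches the trade-off the commit message describes.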