From patchwork Fri Apr 7 23:07:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Evan Green X-Patchwork-Id: 1766757 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=rivosinc-com.20210112.gappssmtp.com header.i=@rivosinc-com.20210112.gappssmtp.com header.a=rsa-sha256 header.s=20210112 header.b=LAOuux/I; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4PtYtZ0y2Hz1yZF for ; Sat, 8 Apr 2023 09:07:50 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1A3DF3857017 for ; Fri, 7 Apr 2023 23:07:46 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) by sourceware.org (Postfix) with ESMTPS id 9E7E63858C1F for ; Fri, 7 Apr 2023 23:07:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 9E7E63858C1F Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivosinc.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com Received: by mail-pl1-x62a.google.com with SMTP id kx12so2223644plb.12 for ; Fri, 07 Apr 2023 16:07:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20210112.gappssmtp.com; s=20210112; t=1680908847; x=1683500847; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=YkaGDQE+tPHHgll1cZde3EZiWrEIMjVRjFYMfVGQSEg=; b=LAOuux/I61btKejMDMRE2yCa8y0XX47M9Az9mTRmj/pQT55Ew/0jHMW95y8poHAI5R 6DCA4Hh/lraVGzvP6/yR8uQp0AX0IMQTHJCVNen7pKkCdj1kHSejIFizDhtG4Ab/hWap KfpMjBd6PQS8pTrtMBuTLiNBlbvYJan+5tsSmxHpKOYl6STMwYy9eMYVWSs2qq21dnvY hiWMfBgSkIJDEyiLyALR7WaPMj5+HWme1YaAJgprmwqJoBR2g+9SkTYEstvJVlCNVTnv MyF0WxKJRNPzHVDNQNxOiCy8PvsdTsYgTGHHGsdUkrJcU90F9n9eBtaVG4MZh72JDvC5 SDzQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680908847; x=1683500847; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=YkaGDQE+tPHHgll1cZde3EZiWrEIMjVRjFYMfVGQSEg=; b=l/Q2HpJC4jCyCStBHN88Jt1LsiIV8YjQp5vq8sxCA0+1aDjddLfvr12Of3tnqa+e7x S1q+LIu+GUP9xqdqdIPCv1fafnhyw/Ts0ZfxPZUgVFJg5ycHflPN/zoGSvqNyKCb6y8K TSOtXscA9xATgmxTEN3jNkztF8bel9hSdM5kcn7yJGAruYga5bcZzPho5oVdGmj6mNE1 nbklLZkoVWdbjxXcY+x4ZJ/KEBlBy8SdRqs1psmLI0vJM2AoVrjq7yHP06VwyDE8GxYc NF1tkiVVzKLwt3JN1yvuE80RtqgzyDcCxIq8cOAM0M9qvdZdvOoTkIEiTgvmuqLm9v1x Xd+g== X-Gm-Message-State: AAQBX9fpxcZNzA09p7iP/cUcwWaJiRLwzFzQIINmR/1yAMFWIvijJg0C c3HOJnkYp9hcjZpzbvzYRq/bWaKEm6FAxbowZxg= X-Google-Smtp-Source: AKy350ZBA4WGPEMIaHfMPNnRf9X/aKqKVAN/0Nsh3ZO+q+5DgWr7+9f2/mRanYHG0PK3yXCq9x0H1Q== X-Received: by 2002:a17:90b:4c89:b0:23d:1e5f:eea6 with SMTP id my9-20020a17090b4c8900b0023d1e5feea6mr4140293pjb.24.1680908847149; Fri, 07 Apr 2023 16:07:27 -0700 (PDT) Received: from evan.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id ei7-20020a17090ae54700b00240ab3c5f66sm3224691pjb.29.2023.04.07.16.07.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 07 Apr 2023 16:07:26 -0700 (PDT) From: Evan Green To: libc-alpha@sourceware.org Cc: vineetg@rivosinc.com, palmer@rivosinc.com, slewis@rivosinc.com, Evan Green Subject: [PATCH v3 3/3] riscv: Add and use alignment-ignorant memcpy Date: Fri, 7 Apr 2023 16:07:10 -0700 Message-Id: <20230407230711.2621614-4-evan@rivosinc.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230407230711.2621614-1-evan@rivosinc.com> References: <20230407230711.2621614-1-evan@rivosinc.com> MIME-Version: 1.0 X-Spam-Status: No, score=-12.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org Sender: "Libc-alpha" For CPU implementations that can perform unaligned accesses with little or no performance penalty, create a memcpy implementation that does not bother aligning buffers. It will use a block of integer registers, a single integer register, and fall back to bytewise copy for the remainder. Signed-off-by: Evan Green Reviewed-by: Palmer Dabbelt --- Changes in v3: - Word align dest for large memcpy()s. - Add tags - Remove spurious blank line from sysdeps/riscv/memcpy.c Changes in v2: - Used _MASK instead of _FAST value itself. --- sysdeps/riscv/memcopy.h | 28 ++++ sysdeps/riscv/memcpy.c | 64 +++++++++ sysdeps/riscv/memcpy_noalignment.S | 121 ++++++++++++++++++ sysdeps/unix/sysv/linux/riscv/Makefile | 4 + .../unix/sysv/linux/riscv/memcpy-generic.c | 24 ++++ 5 files changed, 241 insertions(+) create mode 100644 sysdeps/riscv/memcopy.h create mode 100644 sysdeps/riscv/memcpy.c create mode 100644 sysdeps/riscv/memcpy_noalignment.S create mode 100644 sysdeps/unix/sysv/linux/riscv/memcpy-generic.c diff --git a/sysdeps/riscv/memcopy.h b/sysdeps/riscv/memcopy.h new file mode 100644 index 0000000000..21f6081b5f --- /dev/null +++ b/sysdeps/riscv/memcopy.h @@ -0,0 +1,28 @@ +/* memcopy.h -- definitions for memory copy functions. RISC-V version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include + +/* + * Redefine the generic memcpy implementation to __memcpy_generic, so + * the memcpy ifunc can select between generic and special versions. + * In rtld, don't bother with all the ifunciness. + */ +#if IS_IN (libc) +#define MEMCPY __memcpy_generic +#endif diff --git a/sysdeps/riscv/memcpy.c b/sysdeps/riscv/memcpy.c new file mode 100644 index 0000000000..fdb8dc3208 --- /dev/null +++ b/sysdeps/riscv/memcpy.c @@ -0,0 +1,64 @@ +/* Multiple versions of memcpy. + All versions must be listed in ifunc-impl-list.c. + Copyright (C) 2017-2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#if IS_IN (libc) +/* Redefine memcpy so that the compiler won't complain about the type + mismatch with the IFUNC selector in strong_alias, below. */ +# undef memcpy +# define memcpy __redirect_memcpy +# include +#include +#include + +#define INIT_ARCH() + +extern __typeof (__redirect_memcpy) __libc_memcpy; + +extern __typeof (__redirect_memcpy) __memcpy_generic attribute_hidden; +extern __typeof (__redirect_memcpy) __memcpy_noalignment attribute_hidden; + +static inline __typeof (__redirect_memcpy) * +select_memcpy_ifunc (void) +{ + INIT_ARCH (); + + struct riscv_hwprobe pair; + + pair.key = RISCV_HWPROBE_KEY_CPUPERF_0; + if (__riscv_hwprobe(&pair, 1, 0, NULL, 0) != 0) + return __memcpy_generic; + + if ((pair.key > 0) && + (pair.value & RISCV_HWPROBE_MISALIGNED_MASK) == + RISCV_HWPROBE_MISALIGNED_FAST) + return __memcpy_noalignment; + + return __memcpy_generic; +} + +libc_ifunc (__libc_memcpy, select_memcpy_ifunc ()); + +# undef memcpy +strong_alias (__libc_memcpy, memcpy); +# ifdef SHARED +__hidden_ver1 (memcpy, __GI_memcpy, __redirect_memcpy) + __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memcpy); +# endif + +#endif diff --git a/sysdeps/riscv/memcpy_noalignment.S b/sysdeps/riscv/memcpy_noalignment.S new file mode 100644 index 0000000000..80f5e09ebb --- /dev/null +++ b/sysdeps/riscv/memcpy_noalignment.S @@ -0,0 +1,121 @@ +/* memcpy for RISC-V, ignoring buffer alignment + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#include +#include + +/* void *memcpy(void *, const void *, size_t) */ +ENTRY (__memcpy_noalignment) + move t6, a0 /* Preserve return value */ + + /* Round down to the nearest "page" size */ + andi a4, a2, ~((16*SZREG)-1) + beqz a4, 2f + add a3, a1, a4 + + /* Copy the first word to get dest word aligned */ + andi a5, t6, SZREG-1 + beqz a5, 1f + REG_L a6, (a1) + REG_S a6, (t6) + + /* Align dst up to a word, move src and size as well. */ + addi t6, t6, SZREG-1 + andi t6, t6, ~(SZREG-1) + sub a5, t6, a0 + add a1, a1, a5 + sub a2, a2, a5 + + /* Recompute page count */ + andi a4, a2, ~((16*SZREG)-1) + beqz a4, 2f + +1: + /* Copy "pages" (chunks of 16 registers) */ + REG_L a4, 0(a1) + REG_L a5, SZREG(a1) + REG_L a6, 2*SZREG(a1) + REG_L a7, 3*SZREG(a1) + REG_L t0, 4*SZREG(a1) + REG_L t1, 5*SZREG(a1) + REG_L t2, 6*SZREG(a1) + REG_L t3, 7*SZREG(a1) + REG_L t4, 8*SZREG(a1) + REG_L t5, 9*SZREG(a1) + REG_S a4, 0(t6) + REG_S a5, SZREG(t6) + REG_S a6, 2*SZREG(t6) + REG_S a7, 3*SZREG(t6) + REG_S t0, 4*SZREG(t6) + REG_S t1, 5*SZREG(t6) + REG_S t2, 6*SZREG(t6) + REG_S t3, 7*SZREG(t6) + REG_S t4, 8*SZREG(t6) + REG_S t5, 9*SZREG(t6) + REG_L a4, 10*SZREG(a1) + REG_L a5, 11*SZREG(a1) + REG_L a6, 12*SZREG(a1) + REG_L a7, 13*SZREG(a1) + REG_L t0, 14*SZREG(a1) + REG_L t1, 15*SZREG(a1) + addi a1, a1, 16*SZREG + REG_S a4, 10*SZREG(t6) + REG_S a5, 11*SZREG(t6) + REG_S a6, 12*SZREG(t6) + REG_S a7, 13*SZREG(t6) + REG_S t0, 14*SZREG(t6) + REG_S t1, 15*SZREG(t6) + addi t6, t6, 16*SZREG + bltu a1, a3, 1b + andi a2, a2, (16*SZREG)-1 /* Update count */ + +2: + /* Remainder is smaller than a page, compute native word count */ + beqz a2, 6f + andi a5, a2, ~(SZREG-1) + andi a2, a2, (SZREG-1) + add a3, a1, a5 + /* Jump directly to byte copy if no words. */ + beqz a5, 4f + +3: + /* Use single native register copy */ + REG_L a4, 0(a1) + addi a1, a1, SZREG + REG_S a4, 0(t6) + addi t6, t6, SZREG + bltu a1, a3, 3b + + /* Jump directly out if no more bytes */ + beqz a2, 6f + +4: + /* Copy the last few individual bytes */ + add a3, a1, a2 +5: + lb a4, 0(a1) + addi a1, a1, 1 + sb a4, 0(t6) + addi t6, t6, 1 + bltu a1, a3, 5b +6: + ret + +END (__memcpy_noalignment) + +hidden_def (__memcpy_noalignment) diff --git a/sysdeps/unix/sysv/linux/riscv/Makefile b/sysdeps/unix/sysv/linux/riscv/Makefile index 45cc29e40d..aa9ea443d6 100644 --- a/sysdeps/unix/sysv/linux/riscv/Makefile +++ b/sysdeps/unix/sysv/linux/riscv/Makefile @@ -7,6 +7,10 @@ ifeq ($(subdir),stdlib) gen-as-const-headers += ucontext_i.sym endif +ifeq ($(subdir),string) +sysdep_routines += memcpy memcpy-generic memcpy_noalignment +endif + abi-variants := ilp32 ilp32d lp64 lp64d ifeq (,$(filter $(default-abi),$(abi-variants))) diff --git a/sysdeps/unix/sysv/linux/riscv/memcpy-generic.c b/sysdeps/unix/sysv/linux/riscv/memcpy-generic.c new file mode 100644 index 0000000000..0abe03f7f5 --- /dev/null +++ b/sysdeps/unix/sysv/linux/riscv/memcpy-generic.c @@ -0,0 +1,24 @@ +/* Re-include the default memcpy implementation. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include + +extern __typeof (memcpy) __memcpy_generic; +hidden_proto(__memcpy_generic) + +#include