From patchwork Thu Jun 9 04:16:51 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Noah Goldstein
X-Patchwork-Id: 1641013
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: bilbo.ozlabs.org;
 dkim=pass (1024-bit key; secure) header.d=sourceware.org header.i=@sourceware.org header.a=rsa-sha256 header.s=default header.b=jFHZONxL;
 dkim-atps=neutral
Authentication-Results: ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=)
Received: from sourceware.org (server2.sourceware.org [8.43.85.97])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by bilbo.ozlabs.org (Postfix) with ESMTPS id 4LJW5T58qHz9s09
 for ; Thu, 9 Jun 2022 14:17:17 +1000 (AEST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
 by sourceware.org (Postfix) with ESMTP id 661003834F0F
 for ; Thu, 9 Jun 2022 04:17:12 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 661003834F0F
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1654748232;
 bh=lIrxltMACzljYbvTppdVyxPytzxSlKwDxXhgCu5z3uU=;
 h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post:
 List-Help:List-Subscribe:From:Reply-To:From;
 b=jFHZONxL04U6QIrJ73tOel6NGy/WaSO2hbxG9o1nwplYqc6S75EzvSDwjAiAAqK8+
 D+Y++KFSg/kphHpFu+xK5RI82v20Vvmasil8WrwUwiH1I7zKSOPLLz+AaEfZK4r5f6
 ou37maLLnShDGOcjNczAXwW3fbaVpMgG+86J0qEw=
X-Original-To: libc-alpha@sourceware.org
Delivered-To: libc-alpha@sourceware.org
Received: from mail-pf1-x42c.google.com (mail-pf1-x42c.google.com [IPv6:2607:f8b0:4864:20::42c])
 by sourceware.org (Postfix) with ESMTPS id 3531A38356A4
 for ; Thu, 9 Jun 2022 04:16:57 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 3531A38356A4
Received: by mail-pf1-x42c.google.com with SMTP id z17so20039117pff.7
 for ; Wed, 08 Jun 2022 21:16:57 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version
 :content-transfer-encoding;
 bh=lIrxltMACzljYbvTppdVyxPytzxSlKwDxXhgCu5z3uU=;
 b=fGxrCsD6nUlYOfDF0sIW7bDK6Nkpu1AzFdIfJcnIHCnvuxVuqKSJOOz6GWW49QEfud
 VN6m+oV9s4KePIJqGe1tyiST62dV+ztT8t/0JPW2fkKMx7EG5lKzMpC0hfYxrAlc+qLM
 YVPD6JwvC+YRZPn8zlG3xzRjQL2qBcGWSZGjEM0XzK252eYqF71SB2OfaMrFNwjSYTSb
 3ws171JVoifxiTCQyRXks+lSX8EG0dwFwDZgzzAKSearNNmJrWAP/e1tXAVsP5z2fdmt
 NN15KIQQfm3O+YldPcGKsuBwPd7G+Fw/8lHlVl6dBZ9ILRk3pgiX7hOS1ZSDC9WqPBDI
 QNCg==
X-Gm-Message-State: AOAM531oCTFRdlsbarWrF0CWsGBfs4ZJHKYXH9wHTHnKoLMQDVmrQ+2t
 eSy/haFscpiFpgVcSsu6yy4JwcIKdBsbeQ==
X-Google-Smtp-Source: ABdhPJwcTDCbyfjB/fyoPLYtGKnneeTu77CQYat8PiExs1lhEwXqbyclrpgBVvANRZDG4D6U/PnjhQ==
X-Received: by 2002:a05:6a00:a19:b0:51e:48cc:3acf with SMTP id p25-20020a056a000a1900b0051e48cc3acfmr1511421pfh.68.1654748216111;
 Wed, 08 Jun 2022 21:16:56 -0700 (PDT)
Received: from noah-tgl..
 ([2600:1010:b04a:6ef:d217:ff37:61dd:fb1])
 by smtp.gmail.com with ESMTPSA id d10-20020a170902e14a00b00166d8100b7bsm11833644pla.176.2022.06.08.21.16.55
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Wed, 08 Jun 2022 21:16:55 -0700 (PDT)
To: libc-alpha@sourceware.org
Subject: [PATCH v1 1/3] x86: Align varshift table to 32-bytes
Date: Wed, 8 Jun 2022 21:16:51 -0700
Message-Id: <20220609041653.2515397-1-goldstein.w.n@gmail.com>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0
X-Spam-Status: No, score=-12.0 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0,
 KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-Patchwork-Original-From: Noah Goldstein via Libc-alpha
From: Noah Goldstein
Reply-To: Noah Goldstein
Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org
Sender: "Libc-alpha"

This ensures the load will never split a cache line.
---
 sysdeps/x86_64/multiarch/varshift.c | 5 +++--
 sysdeps/x86_64/multiarch/varshift.h | 3 ++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/sysdeps/x86_64/multiarch/varshift.c b/sysdeps/x86_64/multiarch/varshift.c
index c8210f0546..d27767520a 100644
--- a/sysdeps/x86_64/multiarch/varshift.c
+++ b/sysdeps/x86_64/multiarch/varshift.c
@@ -16,9 +16,10 @@
    License along with the GNU C Library; if not, see
    . */
-#include "varshift.h"
+#include
-const int8_t ___m128i_shift_right[31] attribute_hidden =
+const int8_t ___m128i_shift_right[31] attribute_hidden
+  __attribute__((aligned(32))) =
   {
     0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
     -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1
diff --git a/sysdeps/x86_64/multiarch/varshift.h b/sysdeps/x86_64/multiarch/varshift.h
index af30694488..ffd12d79e4 100644
--- a/sysdeps/x86_64/multiarch/varshift.h
+++ b/sysdeps/x86_64/multiarch/varshift.h
@@ -19,7 +19,8 @@
 #include
 #include
-extern const int8_t ___m128i_shift_right[31] attribute_hidden;
+extern const int8_t ___m128i_shift_right[31] attribute_hidden
+  __attribute__ ((aligned (32)));
 static __inline__ __m128i
 __m128i_shift_right (__m128i value, unsigned long int offset)

From patchwork Thu Jun 9 04:16:52 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Noah Goldstein
X-Patchwork-Id: 1641014
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: bilbo.ozlabs.org;
 dkim=pass (1024-bit key; secure) header.d=sourceware.org header.i=@sourceware.org header.a=rsa-sha256 header.s=default header.b=W+5ubZvO;
 dkim-atps=neutral
Authentication-Results: ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=)
Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by bilbo.ozlabs.org (Postfix) with ESMTPS id 4LJW6L3Y2Xz9s09
 for ; Thu, 9 Jun 2022 14:18:02 +1000 (AEST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
 by sourceware.org
 (Postfix) with ESMTP id D377E3834F2C
 for ; Thu, 9 Jun 2022 04:17:58 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org D377E3834F2C
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1654748278;
 bh=DVuSc1h+Y0rnlqx4QPk9vhBsHKZ5AIyDWwqOC592zOs=;
 h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe:
 List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:
 From;
 b=W+5ubZvOm8BfXgrtIwj1TJrP94r0Uc+EZrxusqHjf4DSR/vBESMhhqscH2x1TZr8h
 iiUQw5zEHY+GvpmUmTHZ3yI4KyjMfCZKgofkAeb91hYHxTHhx3MbPQpgKSomav75Ag
 hIKRss/bgsnklri5xVZmgqEX+B9M5X14Wt6F3WR4=
X-Original-To: libc-alpha@sourceware.org
Delivered-To: libc-alpha@sourceware.org
Received: from mail-pl1-x631.google.com (mail-pl1-x631.google.com [IPv6:2607:f8b0:4864:20::631])
 by sourceware.org (Postfix) with ESMTPS id BF5E138356AB
 for ; Thu, 9 Jun 2022 04:16:59 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BF5E138356AB
Received: by mail-pl1-x631.google.com with SMTP id r1so2067619plo.10
 for ; Wed, 08 Jun 2022 21:16:59 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references:mime-version:content-transfer-encoding;
 bh=DVuSc1h+Y0rnlqx4QPk9vhBsHKZ5AIyDWwqOC592zOs=;
 b=cYVM1U791MNmMSvgbpLJcWuAsAO7i74giB36dWPh3kfIazqN2IfsLt6wCUPrTOryGk
 WDDCULX436BiAXkuEo8g4mJrryGgKraflpM8S9CRzA6mFk+wl4CbJOyAlq2uq1zyZ+9X
 atoO1s3ffNsRUSPTZXyQ6Of52FT5pFvAGNrvaPJx8S17zYgq6MVNM6PFWSE9YIENM5Or
 8WRJAtMGtCC3+he2+UbVOlX1q8dXBjctfuEwcoY1QXJZ8RaZTdZdz2O9mdxRWnq7PQUV
 b3NdjfCXDPCSaG5Ybbac/YqQhKbkP9VW6mR5DR15lQVcJ+5EvdjHuUeMWAKRshGR/N6+
 X1tg==
X-Gm-Message-State: AOAM533AxjNNVje6CozeNLMLrgxslDOOBtxrtvcPSbUc1Yx2FKGd4Yyp
 KJoD/TVInWH5PXZfHLbDBAhj9vrME3IF+A==
X-Google-Smtp-Source: ABdhPJxUnbXYn+fcoA9w6fEqyUvCTAu9koGc6qjFDYykerb5DBjbKyByluDCMmuCxck0NPV6kM2jgw==
X-Received: by 2002:a17:90b:3e88:b0:1e8:8d83:8782 with SMTP id
 rj8-20020a17090b3e8800b001e88d838782mr1436004pjb.0.1654748218461;
 Wed, 08 Jun 2022 21:16:58 -0700 (PDT)
Received: from noah-tgl.. ([2600:1010:b04a:6ef:d217:ff37:61dd:fb1])
 by smtp.gmail.com with ESMTPSA id d10-20020a170902e14a00b00166d8100b7bsm11833644pla.176.2022.06.08.21.16.57
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Wed, 08 Jun 2022 21:16:58 -0700 (PDT)
To: libc-alpha@sourceware.org
Subject: [PATCH v1 2/3] x86: Add avx compiled version for strspn, strcspn, and strpbrk
Date: Wed, 8 Jun 2022 21:16:52 -0700
Message-Id: <20220609041653.2515397-2-goldstein.w.n@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220609041653.2515397-1-goldstein.w.n@gmail.com>
References: <20220609041653.2515397-1-goldstein.w.n@gmail.com>
MIME-Version: 1.0
X-Spam-Status: No, score=-11.2 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0,
 KAM_SHORT, KAM_STOCKGEN, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS,
 TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-Patchwork-Original-From: Noah Goldstein via Libc-alpha
From: Noah Goldstein
Reply-To: Noah Goldstein
Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org
Sender: "Libc-alpha"

No change to the actual logic of the functions. The goal is for
avx/avx2 machines to rely less on sse instructions.

Full xcheck passes on x86_64.
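For illustration only (not part of the patch): a minimal standalone
sketch of the order of checks that the updated IFUNC_SELECTOR in
ifunc-avx.h applies, i.e. the AVX build first, then the SSE4.2 build,
then the baseline. The struct, enum and function names (demo_*) are
hypothetical stand-ins and are not glibc's cpu_features API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the feature bits the selector consults.
   This is NOT glibc's struct cpu_features.  */
struct demo_cpu_features
{
  bool avx_fast_unaligned_load;
  bool sse4_2_usable;
};

/* Hypothetical tags for the three implementations.  */
enum demo_impl
{
  DEMO_IMPL_BASELINE,
  DEMO_IMPL_SSE42,
  DEMO_IMPL_AVX
};

/* Same order of checks as the patched selector: prefer the AVX build,
   then the SSE4.2 build, otherwise fall back to the baseline.  */
static enum demo_impl
demo_select (const struct demo_cpu_features *features)
{
  if (features->avx_fast_unaligned_load)
    return DEMO_IMPL_AVX;

  if (features->sse4_2_usable)
    return DEMO_IMPL_SSE42;

  return DEMO_IMPL_BASELINE;
}
```

With both bits set the sketch picks DEMO_IMPL_AVX; with only
sse4_2_usable set it picks DEMO_IMPL_SSE42.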
---
 sysdeps/x86_64/multiarch/Makefile | 21 ++++++++++-----
 .../multiarch/{ifunc-sse4_2.h => ifunc-avx.h} | 4 +++
 sysdeps/x86_64/multiarch/ifunc-impl-list.c | 6 +++++
 sysdeps/x86_64/multiarch/strcspn-c-avx.c | 21 +++++++++++++++
 .../{strcspn-c.c => strcspn-c-sse4.c} | 26 ++++++++++++-------
 sysdeps/x86_64/multiarch/strcspn.c | 2 +-
 sysdeps/x86_64/multiarch/strpbrk-c-avx.c | 23 ++++++++++++++++
 .../{strpbrk-c.c => strpbrk-c-sse4.c} | 6 ++---
 sysdeps/x86_64/multiarch/strpbrk.c | 2 +-
 sysdeps/x86_64/multiarch/strspn-c-avx.c | 21 +++++++++++++++
 .../multiarch/{strspn-c.c => strspn-c-sse4.c} | 15 ++++++++---
 sysdeps/x86_64/multiarch/strspn.c | 2 +-
 12 files changed, 122 insertions(+), 27 deletions(-)
 rename sysdeps/x86_64/multiarch/{ifunc-sse4_2.h => ifunc-avx.h} (89%)
 create mode 100644 sysdeps/x86_64/multiarch/strcspn-c-avx.c
 rename sysdeps/x86_64/multiarch/{strcspn-c.c => strcspn-c-sse4.c} (90%)
 create mode 100644 sysdeps/x86_64/multiarch/strpbrk-c-avx.c
 rename sysdeps/x86_64/multiarch/{strpbrk-c.c => strpbrk-c-sse4.c} (89%)
 create mode 100644 sysdeps/x86_64/multiarch/strspn-c-avx.c
 rename sysdeps/x86_64/multiarch/{strspn-c.c => strspn-c-sse4.c} (92%)

diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile
index 3d153cac35..27f306c7c8 100644
--- a/sysdeps/x86_64/multiarch/Makefile
+++ b/sysdeps/x86_64/multiarch/Makefile
@@ -76,7 +76,8 @@ sysdep_routines += \
   strcpy-evex \
   strcpy-sse2 \
   strcpy-sse2-unaligned \
-  strcspn-c \
+  strcspn-c-avx \
+  strcspn-c-sse4 \
   strcspn-sse2 \
   strlen-avx2 \
   strlen-avx2-rtm \
@@ -108,22 +109,28 @@ sysdep_routines += \
   strnlen-evex \
   strnlen-evex512 \
   strnlen-sse2 \
-  strpbrk-c \
+  strpbrk-c-avx \
+  strpbrk-c-sse4 \
   strpbrk-sse2 \
   strrchr-avx2 \
   strrchr-avx2-rtm \
   strrchr-evex \
   strrchr-sse2 \
-  strspn-c \
+  strspn-c-avx \
+  strspn-c-sse4 \
   strspn-sse2 \
   strstr-avx512 \
   strstr-sse2-unaligned \
   varshift \
 # sysdep_routines
-CFLAGS-varshift.c += -msse4
-CFLAGS-strcspn-c.c += -msse4
-CFLAGS-strpbrk-c.c += -msse4
-CFLAGS-strspn-c.c += -msse4
+
+CFLAGS-strcspn-c-avx.c += -mavx
+CFLAGS-strcspn-c-sse4.c += -msse4
+CFLAGS-strpbrk-c-avx.c += -mavx
+CFLAGS-strpbrk-c-sse4.c += -msse4
+CFLAGS-strspn-c-avx.c += -mavx
+CFLAGS-strspn-c-sse4.c += -msse4
+
 CFLAGS-strstr-avx512.c += -mavx512f -mavx512vl -mavx512dq -mavx512bw -mbmi -mbmi2 -O3
 endif
diff --git a/sysdeps/x86_64/multiarch/ifunc-sse4_2.h b/sysdeps/x86_64/multiarch/ifunc-avx.h
similarity index 89%
rename from sysdeps/x86_64/multiarch/ifunc-sse4_2.h
rename to sysdeps/x86_64/multiarch/ifunc-avx.h
index b555ff2fac..891f3ddcac 100644
--- a/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
+++ b/sysdeps/x86_64/multiarch/ifunc-avx.h
@@ -21,12 +21,16 @@
 extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (sse42) attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE (avx) attribute_hidden;
 static inline void *
 IFUNC_SELECTOR (void)
 {
   const struct cpu_features* cpu_features = __get_cpu_features ();
+  if (CPU_FEATURES_ARCH_P (cpu_features, AVX_Fast_Unaligned_Load))
+    return OPTIMIZE (avx);
+
   if (CPU_FEATURE_USABLE_P (cpu_features, SSE4_2))
     return OPTIMIZE (sse42);
diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
index 58f3ec8306..507c563669 100644
--- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
@@ -529,6 +529,8 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
   /* Support sysdeps/x86_64/multiarch/strcspn.c. */
   IFUNC_IMPL (i, name, strcspn,
+              IFUNC_IMPL_ADD (array, i, strcspn, CPU_FEATURE_USABLE (AVX),
+                              __strcspn_avx)
               IFUNC_IMPL_ADD (array, i, strcspn, CPU_FEATURE_USABLE (SSE4_2),
                               __strcspn_sse42)
               IFUNC_IMPL_ADD (array, i, strcspn, 1, __strcspn_sse2))
@@ -605,6 +607,8 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
   /* Support sysdeps/x86_64/multiarch/strpbrk.c. */
   IFUNC_IMPL (i, name, strpbrk,
+              IFUNC_IMPL_ADD (array, i, strpbrk, CPU_FEATURE_USABLE (AVX),
+                              __strpbrk_avx)
               IFUNC_IMPL_ADD (array, i, strpbrk, CPU_FEATURE_USABLE (SSE4_2),
                               __strpbrk_sse42)
               IFUNC_IMPL_ADD (array, i, strpbrk, 1, __strpbrk_sse2))
@@ -612,6 +616,8 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
   /* Support sysdeps/x86_64/multiarch/strspn.c. */
   IFUNC_IMPL (i, name, strspn,
+              IFUNC_IMPL_ADD (array, i, strspn, CPU_FEATURE_USABLE (AVX),
+                              __strspn_avx)
               IFUNC_IMPL_ADD (array, i, strspn, CPU_FEATURE_USABLE (SSE4_2),
                               __strspn_sse42)
               IFUNC_IMPL_ADD (array, i, strspn, 1, __strspn_sse2))
diff --git a/sysdeps/x86_64/multiarch/strcspn-c-avx.c b/sysdeps/x86_64/multiarch/strcspn-c-avx.c
new file mode 100644
index 0000000000..b8d983f79f
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/strcspn-c-avx.c
@@ -0,0 +1,21 @@
+/* strcspn with AVX intrinsics
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#define STRCSPN __strcspn_avx
+#define SECTION "avx"
+#include "strcspn-c-sse4.c"
diff --git a/sysdeps/x86_64/multiarch/strcspn-c.c b/sysdeps/x86_64/multiarch/strcspn-c-sse4.c
similarity index 90%
rename from sysdeps/x86_64/multiarch/strcspn-c.c
rename to sysdeps/x86_64/multiarch/strcspn-c-sse4.c
index c312fab8b1..848c3cfb14 100644
--- a/sysdeps/x86_64/multiarch/strcspn-c.c
+++ b/sysdeps/x86_64/multiarch/strcspn-c-sse4.c
@@ -52,9 +52,16 @@
    when either CFlag or ZFlag is 1. If CFlag == 1, ECX has the offset
    X for case 1. */
-#ifndef STRCSPN_SSE2
-# define STRCSPN_SSE2 __strcspn_sse2
-# define STRCSPN_SSE42 __strcspn_sse42
+#ifndef STRCSPN_FALLBACK
+# define STRCSPN_FALLBACK __strcspn_sse2
+#endif
+
+#ifndef STRCSPN
+# define STRCSPN __strcspn_sse42
+#endif
+
+#ifndef SECTION
+# define SECTION "sse4.2"
 #endif
 #ifdef USE_AS_STRPBRK
@@ -69,16 +76,15 @@
 char *
 #else
 size_t
 #endif
-STRCSPN_SSE2 (const char *, const char *) attribute_hidden;
-
+STRCSPN_FALLBACK (const char *, const char *) attribute_hidden;
 #ifdef USE_AS_STRPBRK
 char *
 #else
 size_t
 #endif
-__attribute__ ((section (".text.sse4.2")))
-STRCSPN_SSE42 (const char *s, const char *a)
+__attribute__ ((section (".text." SECTION)))
+STRCSPN (const char *s, const char *a)
 {
   if (*a == 0)
     RETURN (NULL, strlen (s));
@@ -116,10 +122,10 @@ STRCSPN_SSE42 (const char *s, const char *a)
       maskz_bits = _mm_movemask_epi8 (maskz);
       if (maskz_bits == 0)
         {
-          /* There is no NULL terminator. Don't use SSE4.2 if the length
-             of A > 16. */
+          /* There is no NULL terminator. Don't use pcmpstri based approach if the
+             length of A > 16. */
           if (a[16] != 0)
-            return STRCSPN_SSE2 (s, a);
+            return STRCSPN_FALLBACK (s, a);
         }
       aligned = s;
diff --git a/sysdeps/x86_64/multiarch/strcspn.c b/sysdeps/x86_64/multiarch/strcspn.c
index 4848fa8677..63e1cf052e 100644
--- a/sysdeps/x86_64/multiarch/strcspn.c
+++ b/sysdeps/x86_64/multiarch/strcspn.c
@@ -24,7 +24,7 @@
 # undef strcspn
 # define SYMBOL_NAME strcspn
-# include "ifunc-sse4_2.h"
+# include "ifunc-avx.h"
 libc_ifunc_redirected (__redirect_strcspn, strcspn, IFUNC_SELECTOR ());
diff --git a/sysdeps/x86_64/multiarch/strpbrk-c-avx.c b/sysdeps/x86_64/multiarch/strpbrk-c-avx.c
new file mode 100644
index 0000000000..2918013994
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/strpbrk-c-avx.c
@@ -0,0 +1,23 @@
+/* strpbrk with AVX intrinsics
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#define USE_AS_STRPBRK
+#define STRCSPN_FALLBACK __strpbrk_sse2
+#define STRCSPN __strpbrk_avx
+#define SECTION "avx"
+#include "strcspn-c-sse4.c"
diff --git a/sysdeps/x86_64/multiarch/strpbrk-c.c b/sysdeps/x86_64/multiarch/strpbrk-c-sse4.c
similarity index 89%
rename from sysdeps/x86_64/multiarch/strpbrk-c.c
rename to sysdeps/x86_64/multiarch/strpbrk-c-sse4.c
index abf4ff7f1a..2efd38d809 100644
--- a/sysdeps/x86_64/multiarch/strpbrk-c.c
+++ b/sysdeps/x86_64/multiarch/strpbrk-c-sse4.c
@@ -17,6 +17,6 @@
    . */
 #define USE_AS_STRPBRK
-#define STRCSPN_SSE2 __strpbrk_sse2
-#define STRCSPN_SSE42 __strpbrk_sse42
-#include "strcspn-c.c"
+#define STRCSPN_FALLBACK __strpbrk_sse2
+#define STRCSPN __strpbrk_sse42
+#include "strcspn-c-sse4.c"
diff --git a/sysdeps/x86_64/multiarch/strpbrk.c b/sysdeps/x86_64/multiarch/strpbrk.c
index 04e300ea71..ab5b04a482 100644
--- a/sysdeps/x86_64/multiarch/strpbrk.c
+++ b/sysdeps/x86_64/multiarch/strpbrk.c
@@ -24,7 +24,7 @@
 # undef strpbrk
 # define SYMBOL_NAME strpbrk
-# include "ifunc-sse4_2.h"
+# include "ifunc-avx.h"
 libc_ifunc_redirected (__redirect_strpbrk, strpbrk, IFUNC_SELECTOR ());
diff --git a/sysdeps/x86_64/multiarch/strspn-c-avx.c b/sysdeps/x86_64/multiarch/strspn-c-avx.c
new file mode 100644
index 0000000000..9d5fdb9550
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/strspn-c-avx.c
@@ -0,0 +1,21 @@
+/* strspn with AVX intrinsics
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#define STRSPN __strspn_avx
+#define SECTION "avx"
+#include "strspn-c-sse4.c"
diff --git a/sysdeps/x86_64/multiarch/strspn-c.c b/sysdeps/x86_64/multiarch/strspn-c-sse4.c
similarity index 92%
rename from sysdeps/x86_64/multiarch/strspn-c.c
rename to sysdeps/x86_64/multiarch/strspn-c-sse4.c
index 6124033ceb..6a91def2e0 100644
--- a/sysdeps/x86_64/multiarch/strspn-c.c
+++ b/sysdeps/x86_64/multiarch/strspn-c-sse4.c
@@ -53,10 +53,17 @@
 extern size_t __strspn_sse2 (const char *, const char *) attribute_hidden;
+#ifndef STRSPN
+# define STRSPN __strspn_sse42
+#endif
+
+#ifndef SECTION
+# define SECTION "sse4.2"
+#endif
 size_t
-__attribute__ ((section (".text.sse4.2")))
-__strspn_sse42 (const char *s, const char *a)
+__attribute__ ((section (".text." SECTION)))
+STRSPN (const char *s, const char *a)
 {
   if (*a == 0)
     return 0;
@@ -95,8 +102,8 @@ __strspn_sse42 (const char *s, const char *a)
       maskz_bits = _mm_movemask_epi8 (maskz);
       if (maskz_bits == 0)
         {
-          /* There is no NULL terminator. Don't use SSE4.2 if the length
-             of A > 16. */
+          /* There is no NULL terminator. Don't use pcmpstri based approach if the
+             length of A > 16. */
           if (a[16] != 0)
             return __strspn_sse2 (s, a);
         }
diff --git a/sysdeps/x86_64/multiarch/strspn.c b/sysdeps/x86_64/multiarch/strspn.c
index 07f5def155..c3c5e7a3cc 100644
--- a/sysdeps/x86_64/multiarch/strspn.c
+++ b/sysdeps/x86_64/multiarch/strspn.c
@@ -24,7 +24,7 @@
 # undef strspn
 # define SYMBOL_NAME strspn
-# include "ifunc-sse4_2.h"
+# include "ifunc-avx.h"
 libc_ifunc_redirected (__redirect_strspn, strspn, IFUNC_SELECTOR ());

From patchwork Thu Jun 9 04:16:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Noah Goldstein
X-Patchwork-Id: 1641015
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: bilbo.ozlabs.org;
 dkim=pass (1024-bit key; secure) header.d=sourceware.org header.i=@sourceware.org header.a=rsa-sha256 header.s=default header.b=cMcC51X+;
 dkim-atps=neutral
Authentication-Results: ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=)
Received: from sourceware.org (server2.sourceware.org [8.43.85.97])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by bilbo.ozlabs.org (Postfix) with ESMTPS id 4LJW764VQJz9s09
 for ; Thu, 9 Jun 2022 14:18:42 +1000 (AEST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
 by sourceware.org (Postfix) with ESMTP id B2A9138346B0
 for ; Thu, 9 Jun 2022 04:18:40 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B2A9138346B0
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1654748320;
 bh=w6llKQnDJlzuR27F90QLbfRINmn1xPfUBOa/O/l4+ZU=;
 h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe:
 List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:
 From;
 b=cMcC51X+Fq5qvbq8aV9QVp0Xs0XwjgQ7WB8ODwtg9XFG1tvJAtjqbmghR+H0FS3bF
 DCk1e5w0UiV2DKlo6QaVvkBCxMxlFJ6oPEliKartoerUCJDF5SH3NI8WbyZZHrWb3b
 /2Rhe7I2B1P99avN9WXQQXXlSQpBZl68huIBbvuI=
X-Original-To: libc-alpha@sourceware.org
Delivered-To: libc-alpha@sourceware.org
Received: from mail-pl1-x632.google.com (mail-pl1-x632.google.com [IPv6:2607:f8b0:4864:20::632])
 by sourceware.org (Postfix) with ESMTPS id AD92F3834F2A
 for ; Thu, 9 Jun 2022 04:17:00 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org AD92F3834F2A
Received: by mail-pl1-x632.google.com with SMTP id r1so2067646plo.10
 for ; Wed, 08 Jun 2022 21:17:00 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references:mime-version:content-transfer-encoding;
 bh=w6llKQnDJlzuR27F90QLbfRINmn1xPfUBOa/O/l4+ZU=;
 b=xpaXTJyRBaQ9EwSPoxmfh2j2W4mk/uRv1FeiEZTJ1J3zvuG+T/Kerp2OfmUTzw5fMf
 m611pqFg13mjltnHSy+xz5efEnlbE4cvrUTyQkw2sl+JMvKUg+rZC6MA6PgN026e3WLN
 MPefOISa0T1Z/buOSiNl26Keh/PfV5BcA8s8XhG/NlzmveTlGLvwxrziTpeIXsoQ7e+h
 nUYc/79vQnaIkjtlDuoyfFgbXu5zAX6fS4kW1MRkWKST+rRQeNwEHx+DHBcL5zowAbou
 +QvrcOW7/uBXKigEYcTBfsYW+IKoZYmERbxC5njScefeN9C8REymnQxT8P7PLeRmVzw8
 k/hA==
X-Gm-Message-State: AOAM533HfcCSrtH0+ZLXBjQS3w3xvuKuOT/LDc/bni7+wGDAMZi2yVMy
 aIBYiGdA6cxkX41OsHnbnKBo3t69fQCMZg==
X-Google-Smtp-Source: ABdhPJxycn7S3X5jFf25g1NnfmiQzfr1K4cK5Ild/BahF+p4k8Qi3aUUk2AxTNUn1iMNIQwsfYA9Dg==
X-Received: by 2002:a17:90b:4b8d:b0:1e3:5147:6e63 with SMTP id lr13-20020a17090b4b8d00b001e351476e63mr1438169pjb.162.1654748219505;
 Wed, 08 Jun 2022 21:16:59 -0700 (PDT)
Received: from noah-tgl..
 ([2600:1010:b04a:6ef:d217:ff37:61dd:fb1])
 by smtp.gmail.com with ESMTPSA id d10-20020a170902e14a00b00166d8100b7bsm11833644pla.176.2022.06.08.21.16.58
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Wed, 08 Jun 2022 21:16:59 -0700 (PDT)
To: libc-alpha@sourceware.org
Subject: [PATCH v1 3/3] x86: Rename generic functions with unique postfix for clarity
Date: Wed, 8 Jun 2022 21:16:53 -0700
Message-Id: <20220609041653.2515397-3-goldstein.w.n@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220609041653.2515397-1-goldstein.w.n@gmail.com>
References: <20220609041653.2515397-1-goldstein.w.n@gmail.com>
MIME-Version: 1.0
X-Spam-Status: No, score=-11.2 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0,
 KAM_SHORT, KAM_STOCKGEN, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS,
 TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-Patchwork-Original-From: Noah Goldstein via Libc-alpha
From: Noah Goldstein
Reply-To: Noah Goldstein
Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org
Sender: "Libc-alpha"

No functions are changed. This only renames the generic implementations
from '{func}_sse2' to '{func}_generic'. The "_sse2" postfix was
overloaded: it was used both for files that had hand-optimized sse2
assembly implementations and for files that just redirected back to
the generic implementation.

Full xcheck passed on x86_64.
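For illustration only (not part of the patch): the ifunc-strcpy.h and
ifunc-wcslen.h hunks in this patch keep sse2 as the default suffix and
let an including file such as stpncpy.c override it with
'# define GENERIC generic'. The sketch below imitates that pattern with
hypothetical DEMO_* macros and demo_* functions; glibc's real OPTIMIZE
macro is defined elsewhere and may differ in detail.

```c
/* Hypothetical two-level paste so that macro arguments are expanded
   before they are glued into an identifier.  */
#define DEMO_PASTE(sym, suffix) demo_##sym##_##suffix
#define DEMO_EXPAND(sym, suffix) DEMO_PASTE (sym, suffix)
#define DEMO_OPTIMIZE(suffix) DEMO_EXPAND (DEMO_SYMBOL_NAME, suffix)

/* What an including file such as stpncpy.c does in this patch.  */
#define GENERIC generic
#define DEMO_SYMBOL_NAME stpncpy

/* What the shared header does: keep sse2 unless GENERIC was set.  */
#ifndef GENERIC
# define GENERIC sse2
#endif

/* Two hypothetical fallbacks so the result of the paste is observable.  */
int demo_stpncpy_generic (void) { return 1; }
int demo_stpncpy_sse2 (void) { return 2; }

/* Expands to demo_stpncpy_generic () because GENERIC is "generic".  */
int
demo_call_fallback (void)
{
  return DEMO_OPTIMIZE (GENERIC) ();
}
```

Dropping the '#define GENERIC generic' line in the sketch makes the
same call resolve to demo_stpncpy_sse2, which mirrors how selectors
that do not opt in keep their existing '_sse2' fallback name.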
---
 sysdeps/x86_64/multiarch/Makefile | 6 +++---
 sysdeps/x86_64/multiarch/ifunc-avx.h | 4 ++--
 sysdeps/x86_64/multiarch/ifunc-impl-list.c | 16 ++++++++--------
 sysdeps/x86_64/multiarch/ifunc-strcpy.h | 8 ++++++--
 sysdeps/x86_64/multiarch/ifunc-wcslen.h | 8 ++++++--
 sysdeps/x86_64/multiarch/stpncpy-c.c | 2 +-
 sysdeps/x86_64/multiarch/stpncpy.c | 1 +
 sysdeps/x86_64/multiarch/strcspn-c-sse4.c | 2 +-
 .../multiarch/{strcspn-sse2.c => strcspn-c.c} | 2 +-
 sysdeps/x86_64/multiarch/strncat-c.c | 2 +-
 sysdeps/x86_64/multiarch/strncat.c | 1 +
 sysdeps/x86_64/multiarch/strncpy-c.c | 2 +-
 sysdeps/x86_64/multiarch/strncpy.c | 1 +
 sysdeps/x86_64/multiarch/strpbrk-c-avx.c | 2 +-
 sysdeps/x86_64/multiarch/strpbrk-c-sse4.c | 2 +-
 .../multiarch/{strpbrk-sse2.c => strpbrk-c.c} | 2 +-
 sysdeps/x86_64/multiarch/strspn-c-sse4.c | 4 ++--
 .../multiarch/{strspn-sse2.c => strspn-c.c} | 2 +-
 sysdeps/x86_64/multiarch/wcscpy-c.c | 2 +-
 sysdeps/x86_64/multiarch/wcscpy.c | 4 ++--
 sysdeps/x86_64/multiarch/wcsnlen-c.c | 4 ++--
 sysdeps/x86_64/multiarch/wcsnlen.c | 1 +
 22 files changed, 45 insertions(+), 33 deletions(-)
 rename sysdeps/x86_64/multiarch/{strcspn-sse2.c => strcspn-c.c} (96%)
 rename sysdeps/x86_64/multiarch/{strpbrk-sse2.c => strpbrk-c.c} (96%)
 rename sysdeps/x86_64/multiarch/{strspn-sse2.c => strspn-c.c} (96%)

diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile
index 27f306c7c8..9b1e0add1a 100644
--- a/sysdeps/x86_64/multiarch/Makefile
+++ b/sysdeps/x86_64/multiarch/Makefile
@@ -76,9 +76,9 @@ sysdep_routines += \
   strcpy-evex \
   strcpy-sse2 \
   strcpy-sse2-unaligned \
+  strcspn-c \
   strcspn-c-avx \
   strcspn-c-sse4 \
-  strcspn-sse2 \
   strlen-avx2 \
   strlen-avx2-rtm \
   strlen-evex \
@@ -109,16 +109,16 @@ sysdep_routines += \
   strnlen-evex \
   strnlen-evex512 \
   strnlen-sse2 \
+  strpbrk-c \
   strpbrk-c-avx \
   strpbrk-c-sse4 \
-  strpbrk-sse2 \
   strrchr-avx2 \
   strrchr-avx2-rtm \
   strrchr-evex \
   strrchr-sse2 \
+  strspn-c \
   strspn-c-avx \
   strspn-c-sse4 \
-  strspn-sse2 \
   strstr-avx512 \
   strstr-sse2-unaligned \
   varshift \
diff --git a/sysdeps/x86_64/multiarch/ifunc-avx.h b/sysdeps/x86_64/multiarch/ifunc-avx.h
index 891f3ddcac..30efbd29d0 100644
--- a/sysdeps/x86_64/multiarch/ifunc-avx.h
+++ b/sysdeps/x86_64/multiarch/ifunc-avx.h
@@ -19,7 +19,7 @@
 #include
-extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden;
+extern __typeof (REDIRECT_NAME) OPTIMIZE (generic) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (sse42) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (avx) attribute_hidden;
@@ -34,5 +34,5 @@ IFUNC_SELECTOR (void)
   if (CPU_FEATURE_USABLE_P (cpu_features, SSE4_2))
     return OPTIMIZE (sse42);
-  return OPTIMIZE (sse2);
+  return OPTIMIZE (generic);
 }
diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
index 507c563669..23a2d7114d 100644
--- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
@@ -372,7 +372,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
                               __stpncpy_evex)
               IFUNC_IMPL_ADD (array, i, stpncpy, 1,
                               __stpncpy_sse2_unaligned)
-              IFUNC_IMPL_ADD (array, i, stpncpy, 1, __stpncpy_sse2))
+              IFUNC_IMPL_ADD (array, i, stpncpy, 1, __stpncpy_generic))
   /* Support sysdeps/x86_64/multiarch/stpcpy.c. */
   IFUNC_IMPL (i, name, stpcpy,
@@ -533,7 +533,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
                               __strcspn_avx)
               IFUNC_IMPL_ADD (array, i, strcspn, CPU_FEATURE_USABLE (SSE4_2),
                               __strcspn_sse42)
-              IFUNC_IMPL_ADD (array, i, strcspn, 1, __strcspn_sse2))
+              IFUNC_IMPL_ADD (array, i, strcspn, 1, __strcspn_generic))
   /* Support sysdeps/x86_64/multiarch/strncase_l.c. */
   IFUNC_IMPL (i, name, strncasecmp,
@@ -587,7 +587,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
                               __strncat_evex)
               IFUNC_IMPL_ADD (array, i, strncat, 1,
                               __strncat_sse2_unaligned)
-              IFUNC_IMPL_ADD (array, i, strncat, 1, __strncat_sse2))
+              IFUNC_IMPL_ADD (array, i, strncat, 1, __strncat_generic))
   /* Support sysdeps/x86_64/multiarch/strncpy.c. */
   IFUNC_IMPL (i, name, strncpy,
@@ -603,7 +603,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
                               __strncpy_evex)
               IFUNC_IMPL_ADD (array, i, strncpy, 1,
                               __strncpy_sse2_unaligned)
-              IFUNC_IMPL_ADD (array, i, strncpy, 1, __strncpy_sse2))
+              IFUNC_IMPL_ADD (array, i, strncpy, 1, __strncpy_generic))
   /* Support sysdeps/x86_64/multiarch/strpbrk.c. */
   IFUNC_IMPL (i, name, strpbrk,
@@ -611,7 +611,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
                               __strpbrk_avx)
               IFUNC_IMPL_ADD (array, i, strpbrk, CPU_FEATURE_USABLE (SSE4_2),
                               __strpbrk_sse42)
-              IFUNC_IMPL_ADD (array, i, strpbrk, 1, __strpbrk_sse2))
+              IFUNC_IMPL_ADD (array, i, strpbrk, 1, __strpbrk_generic))
   /* Support sysdeps/x86_64/multiarch/strspn.c. */
@@ -620,7 +620,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
                               __strspn_avx)
               IFUNC_IMPL_ADD (array, i, strspn, CPU_FEATURE_USABLE (SSE4_2),
                               __strspn_sse42)
-              IFUNC_IMPL_ADD (array, i, strspn, 1, __strspn_sse2))
+              IFUNC_IMPL_ADD (array, i, strspn, 1, __strspn_generic))
   /* Support sysdeps/x86_64/multiarch/strstr.c. */
   IFUNC_IMPL (i, name, strstr,
@@ -703,7 +703,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
   IFUNC_IMPL (i, name, wcscpy,
               IFUNC_IMPL_ADD (array, i, wcscpy, CPU_FEATURE_USABLE (SSSE3),
                               __wcscpy_ssse3)
-              IFUNC_IMPL_ADD (array, i, wcscpy, 1, __wcscpy_sse2))
+              IFUNC_IMPL_ADD (array, i, wcscpy, 1, __wcscpy_generic))
   /* Support sysdeps/x86_64/multiarch/wcslen.c. */
   IFUNC_IMPL (i, name, wcslen,
@@ -755,7 +755,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
               IFUNC_IMPL_ADD (array, i, wcsnlen,
                               CPU_FEATURE_USABLE (SSE4_1),
                               __wcsnlen_sse4_1)
-              IFUNC_IMPL_ADD (array, i, wcsnlen, 1, __wcsnlen_sse2))
+              IFUNC_IMPL_ADD (array, i, wcsnlen, 1, __wcsnlen_generic))
   /* Support sysdeps/x86_64/multiarch/wmemchr.c. */
   IFUNC_IMPL (i, name, wmemchr,
diff --git a/sysdeps/x86_64/multiarch/ifunc-strcpy.h b/sysdeps/x86_64/multiarch/ifunc-strcpy.h
index a15afa44e9..80529458d1 100644
--- a/sysdeps/x86_64/multiarch/ifunc-strcpy.h
+++ b/sysdeps/x86_64/multiarch/ifunc-strcpy.h
@@ -20,7 +20,11 @@
 #include
-extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden;
+#ifndef GENERIC
+# define GENERIC sse2
+#endif
+
+extern __typeof (REDIRECT_NAME) OPTIMIZE (GENERIC) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2_unaligned) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2) attribute_hidden;
@@ -49,5 +53,5 @@ IFUNC_SELECTOR (void)
   if (CPU_FEATURES_ARCH_P (cpu_features, Fast_Unaligned_Load))
     return OPTIMIZE (sse2_unaligned);
-  return OPTIMIZE (sse2);
+  return OPTIMIZE (GENERIC);
 }
diff --git a/sysdeps/x86_64/multiarch/ifunc-wcslen.h b/sysdeps/x86_64/multiarch/ifunc-wcslen.h
index 2b29e7608a..88c1c502af 100644
--- a/sysdeps/x86_64/multiarch/ifunc-wcslen.h
+++ b/sysdeps/x86_64/multiarch/ifunc-wcslen.h
@@ -19,7 +19,11 @@
 #include
-extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden;
+#ifndef GENERIC
+# define GENERIC sse2
+#endif
+
+extern __typeof (REDIRECT_NAME) OPTIMIZE (GENERIC) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (sse4_1) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2) attribute_hidden;
 extern __typeof (REDIRECT_NAME) OPTIMIZE (avx2_rtm) attribute_hidden;
@@ -48,5 +52,5 @@ IFUNC_SELECTOR (void)
   if (CPU_FEATURE_USABLE_P (cpu_features, SSE4_1))
     return OPTIMIZE (sse4_1);
-  return OPTIMIZE (sse2);
+  return OPTIMIZE (GENERIC);
 }
diff --git a/sysdeps/x86_64/multiarch/stpncpy-c.c b/sysdeps/x86_64/multiarch/stpncpy-c.c
index b016e487e1..eb62fcf388 100644
--- a/sysdeps/x86_64/multiarch/stpncpy-c.c
+++ b/sysdeps/x86_64/multiarch/stpncpy-c.c
@@ -1,4 +1,4 @@
-#define STPNCPY __stpncpy_sse2
+#define STPNCPY __stpncpy_generic
 #undef weak_alias
 #define weak_alias(ignored1, ignored2)
 #undef libc_hidden_def
diff --git a/sysdeps/x86_64/multiarch/stpncpy.c b/sysdeps/x86_64/multiarch/stpncpy.c
index 82fa53957d..879bc83f0b 100644
--- a/sysdeps/x86_64/multiarch/stpncpy.c
+++ b/sysdeps/x86_64/multiarch/stpncpy.c
@@ -25,6 +25,7 @@
 # undef stpncpy
 # undef __stpncpy
+# define GENERIC generic
 # define SYMBOL_NAME stpncpy
 # include "ifunc-strcpy.h"
diff --git a/sysdeps/x86_64/multiarch/strcspn-c-sse4.c b/sysdeps/x86_64/multiarch/strcspn-c-sse4.c
index 848c3cfb14..8541035ccb 100644
--- a/sysdeps/x86_64/multiarch/strcspn-c-sse4.c
+++ b/sysdeps/x86_64/multiarch/strcspn-c-sse4.c
@@ -53,7 +53,7 @@
    X for case 1. */
 #ifndef STRCSPN_FALLBACK
-# define STRCSPN_FALLBACK __strcspn_sse2
+# define STRCSPN_FALLBACK __strcspn_generic
 #endif
 #ifndef STRCSPN
diff --git a/sysdeps/x86_64/multiarch/strcspn-sse2.c b/sysdeps/x86_64/multiarch/strcspn-c.c
similarity index 96%
rename from sysdeps/x86_64/multiarch/strcspn-sse2.c
rename to sysdeps/x86_64/multiarch/strcspn-c.c
index 3a04bb39fc..423de2e2b2 100644
--- a/sysdeps/x86_64/multiarch/strcspn-sse2.c
+++ b/sysdeps/x86_64/multiarch/strcspn-c.c
@@ -19,7 +19,7 @@
 #if IS_IN (libc)
 # include
-# define STRCSPN __strcspn_sse2
+# define STRCSPN __strcspn_generic
 # undef libc_hidden_builtin_def
 # define libc_hidden_builtin_def(STRCSPN)
diff --git a/sysdeps/x86_64/multiarch/strncat-c.c b/sysdeps/x86_64/multiarch/strncat-c.c
index 93a7fab7ea..b729c033d9 100644
--- a/sysdeps/x86_64/multiarch/strncat-c.c
+++ b/sysdeps/x86_64/multiarch/strncat-c.c
@@ -1,2 +1,2 @@
-#define STRNCAT __strncat_sse2
+#define STRNCAT __strncat_generic
 #include
diff --git a/sysdeps/x86_64/multiarch/strncat.c
b/sysdeps/x86_64/multiarch/strncat.c index b649343a97..50fba8a41f 100644 --- a/sysdeps/x86_64/multiarch/strncat.c +++ b/sysdeps/x86_64/multiarch/strncat.c @@ -24,6 +24,7 @@ # undef strncat # define SYMBOL_NAME strncat +# define GENERIC generic # include "ifunc-strcpy.h" libc_ifunc_redirected (__redirect_strncat, strncat, IFUNC_SELECTOR ()); diff --git a/sysdeps/x86_64/multiarch/strncpy-c.c b/sysdeps/x86_64/multiarch/strncpy-c.c index 57c45ac7ab..183b0b8e0f 100644 --- a/sysdeps/x86_64/multiarch/strncpy-c.c +++ b/sysdeps/x86_64/multiarch/strncpy-c.c @@ -1,4 +1,4 @@ -#define STRNCPY __strncpy_sse2 +#define STRNCPY __strncpy_generic #undef libc_hidden_builtin_def #define libc_hidden_builtin_def(strncpy) diff --git a/sysdeps/x86_64/multiarch/strncpy.c b/sysdeps/x86_64/multiarch/strncpy.c index 2a780a7e16..7fc7d72ec5 100644 --- a/sysdeps/x86_64/multiarch/strncpy.c +++ b/sysdeps/x86_64/multiarch/strncpy.c @@ -24,6 +24,7 @@ # undef strncpy # define SYMBOL_NAME strncpy +# define GENERIC generic # include "ifunc-strcpy.h" libc_ifunc_redirected (__redirect_strncpy, strncpy, IFUNC_SELECTOR ()); diff --git a/sysdeps/x86_64/multiarch/strpbrk-c-avx.c b/sysdeps/x86_64/multiarch/strpbrk-c-avx.c index 2918013994..363daebd9e 100644 --- a/sysdeps/x86_64/multiarch/strpbrk-c-avx.c +++ b/sysdeps/x86_64/multiarch/strpbrk-c-avx.c @@ -17,7 +17,7 @@ . */ #define USE_AS_STRPBRK -#define STRCSPN_FALLBACK __strpbrk_sse2 +#define STRCSPN_FALLBACK __strpbrk_generic #define STRCSPN __strpbrk_avx #define SECTION "avx" #include "strcspn-c-sse4.c" diff --git a/sysdeps/x86_64/multiarch/strpbrk-c-sse4.c b/sysdeps/x86_64/multiarch/strpbrk-c-sse4.c index 2efd38d809..a02c951dfd 100644 --- a/sysdeps/x86_64/multiarch/strpbrk-c-sse4.c +++ b/sysdeps/x86_64/multiarch/strpbrk-c-sse4.c @@ -17,6 +17,6 @@ . 
*/ #define USE_AS_STRPBRK -#define STRCSPN_FALLBACK __strpbrk_sse2 +#define STRCSPN_FALLBACK __strpbrk_generic #define STRCSPN __strpbrk_sse42 #include "strcspn-c-sse4.c" diff --git a/sysdeps/x86_64/multiarch/strpbrk-sse2.c b/sysdeps/x86_64/multiarch/strpbrk-c.c similarity index 96% rename from sysdeps/x86_64/multiarch/strpbrk-sse2.c rename to sysdeps/x86_64/multiarch/strpbrk-c.c index d03214c4fb..d31acfe495 100644 --- a/sysdeps/x86_64/multiarch/strpbrk-sse2.c +++ b/sysdeps/x86_64/multiarch/strpbrk-c.c @@ -19,7 +19,7 @@ #if IS_IN (libc) # include -# define STRPBRK __strpbrk_sse2 +# define STRPBRK __strpbrk_generic # undef libc_hidden_builtin_def # define libc_hidden_builtin_def(STRPBRK) diff --git a/sysdeps/x86_64/multiarch/strspn-c-sse4.c b/sysdeps/x86_64/multiarch/strspn-c-sse4.c index 6a91def2e0..9323a117ab 100644 --- a/sysdeps/x86_64/multiarch/strspn-c-sse4.c +++ b/sysdeps/x86_64/multiarch/strspn-c-sse4.c @@ -51,7 +51,7 @@ We exit from the loop for case 1. */ -extern size_t __strspn_sse2 (const char *, const char *) attribute_hidden; +extern size_t __strspn_generic (const char *, const char *) attribute_hidden; #ifndef STRSPN # define STRSPN __strspn_sse42 @@ -105,7 +105,7 @@ STRSPN (const char *s, const char *a) /* There is no NULL terminator. Don't use pcmpstri based approach if the length of A > 16. 
*/ if (a[16] != 0) - return __strspn_sse2 (s, a); + return __strspn_generic (s, a); } aligned = s; offset = (unsigned int) ((size_t) s & 15); diff --git a/sysdeps/x86_64/multiarch/strspn-sse2.c b/sysdeps/x86_64/multiarch/strspn-c.c similarity index 96% rename from sysdeps/x86_64/multiarch/strspn-sse2.c rename to sysdeps/x86_64/multiarch/strspn-c.c index 61cc6cb0a5..6b50c36432 100644 --- a/sysdeps/x86_64/multiarch/strspn-sse2.c +++ b/sysdeps/x86_64/multiarch/strspn-c.c @@ -19,7 +19,7 @@ #if IS_IN (libc) # include -# define STRSPN __strspn_sse2 +# define STRSPN __strspn_generic # undef libc_hidden_builtin_def # define libc_hidden_builtin_def(STRSPN) diff --git a/sysdeps/x86_64/multiarch/wcscpy-c.c b/sysdeps/x86_64/multiarch/wcscpy-c.c index 26d6984e9b..fa38dd898d 100644 --- a/sysdeps/x86_64/multiarch/wcscpy-c.c +++ b/sysdeps/x86_64/multiarch/wcscpy-c.c @@ -1,5 +1,5 @@ #if IS_IN (libc) -# define WCSCPY __wcscpy_sse2 +# define WCSCPY __wcscpy_generic #endif #include diff --git a/sysdeps/x86_64/multiarch/wcscpy.c b/sysdeps/x86_64/multiarch/wcscpy.c index 6a2d1421d9..53c3228dc2 100644 --- a/sysdeps/x86_64/multiarch/wcscpy.c +++ b/sysdeps/x86_64/multiarch/wcscpy.c @@ -26,7 +26,7 @@ # define SYMBOL_NAME wcscpy # include -extern __typeof (REDIRECT_NAME) OPTIMIZE (sse2) attribute_hidden; +extern __typeof (REDIRECT_NAME) OPTIMIZE (generic) attribute_hidden; extern __typeof (REDIRECT_NAME) OPTIMIZE (ssse3) attribute_hidden; static inline void * @@ -37,7 +37,7 @@ IFUNC_SELECTOR (void) if (CPU_FEATURE_USABLE_P (cpu_features, SSSE3)) return OPTIMIZE (ssse3); - return OPTIMIZE (sse2); + return OPTIMIZE (generic); } libc_ifunc_redirected (__redirect_wcscpy, __wcscpy, IFUNC_SELECTOR ()); diff --git a/sysdeps/x86_64/multiarch/wcsnlen-c.c b/sysdeps/x86_64/multiarch/wcsnlen-c.c index e1ec7cfbb5..1c9c04241a 100644 --- a/sysdeps/x86_64/multiarch/wcsnlen-c.c +++ b/sysdeps/x86_64/multiarch/wcsnlen-c.c @@ -1,9 +1,9 @@ #if IS_IN (libc) # include -# define WCSNLEN __wcsnlen_sse2 +# define 
WCSNLEN __wcsnlen_generic -extern __typeof (wcsnlen) __wcsnlen_sse2; +extern __typeof (wcsnlen) __wcsnlen_generic; #endif #include "wcsmbs/wcsnlen.c" diff --git a/sysdeps/x86_64/multiarch/wcsnlen.c b/sysdeps/x86_64/multiarch/wcsnlen.c index baa26666a8..05b7a211de 100644 --- a/sysdeps/x86_64/multiarch/wcsnlen.c +++ b/sysdeps/x86_64/multiarch/wcsnlen.c @@ -24,6 +24,7 @@ # undef __wcsnlen # define SYMBOL_NAME wcsnlen +# define GENERIC generic # include "ifunc-wcslen.h" libc_ifunc_redirected (__redirect_wcsnlen, __wcsnlen, IFUNC_SELECTOR ());