From patchwork Tue Jul 11 12:14:06 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: develop--- via Libc-alpha
X-Patchwork-Id: 1806258
To: libc-alpha@sourceware.org
Cc: rajis@linux.ibm.com, bergner@linux.ibm.com,
 adhemerval.zanella@linaro.org, Mahesh Bodapati
Subject: [PATCH v6] PowerPC: Influence cpu/arch hwcap features via
 GLIBC_TUNABLES.
Date: Tue, 11 Jul 2023 07:14:06 -0500
Message-Id: <20230711121406.962921-1-bmahi496@linux.ibm.com>
X-Mailer: git-send-email 2.39.3
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: bmahi496--- via Libc-alpha
From: develop--- via Libc-alpha
Reply-To: bmahi496@linux.ibm.com

From: Mahesh Bodapati

This patch adds the option to influence the hwcaps used by PowerPC.
The environment variable GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,-zzz....
can be used to enable CPU/ARCH feature yyy and to disable CPU/ARCH
features xxx and zzz, where the feature names are case-sensitive and
have to match the ones mentioned in the file
sysdeps/powerpc/dl-procinfo.c.
Note that the hwcap tunables are only used in the IFUNC selection.
---
 manual/tunables.texi                          |   5 +-
 sysdeps/powerpc/cpu-features.c                |  96 ++++++++++++++++-
 sysdeps/powerpc/cpu-features.h                | 102 ++++++++++++++++++
 sysdeps/powerpc/dl-tunables.list              |   3 +
 sysdeps/powerpc/hwcapinfo.c                   |   4 +
 .../power4/multiarch/ifunc-impl-list.c        |   4 +-
 .../powerpc32/power4/multiarch/init-arch.h    |  10 +-
 sysdeps/powerpc/powerpc64/dl-machine.h        |   2 -
 .../powerpc64/multiarch/ifunc-impl-list.c     |   7 +-
 9 files changed, 221 insertions(+), 12 deletions(-)

diff --git a/manual/tunables.texi b/manual/tunables.texi
index 4ca0e42a11..776fd93fd9 100644
--- a/manual/tunables.texi
+++ b/manual/tunables.texi
@@ -513,7 +513,10 @@ On s390x, the supported HWCAP and STFLE features can be found in
 @code{sysdeps/s390/cpu-features.c}.  In addition the user can also set
 a CPU arch-level like @code{z13} instead of single HWCAP and STFLE features.
 
-This tunable is specific to i386, x86-64 and s390x.
+On powerpc, the supported HWCAP and HWCAP2 features can be found in
+@code{sysdeps/powerpc/dl-procinfo.c}.
+
+This tunable is specific to i386, x86-64, s390x and powerpc.
 @end deftp
 
 @deftp Tunable glibc.cpu.cached_memopt
diff --git a/sysdeps/powerpc/cpu-features.c b/sysdeps/powerpc/cpu-features.c
index 0ef3cf89d2..2b727b5917 100644
--- a/sysdeps/powerpc/cpu-features.c
+++ b/sysdeps/powerpc/cpu-features.c
@@ -19,14 +19,108 @@
 #include
 #include
 #include
+#include
+#include
+#define MEMCMP_DEFAULT memcmp
+#define STRLEN_DEFAULT strlen
+
+static void
+TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp)
+{
+  /* The current IFUNC selection is always using the most recent
+     features which are available via AT_HWCAP or AT_HWCAP2.  But in
+     some scenarios it is useful to adjust this selection.
+
+     The environment variable:
+
+     GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,....
+
+     Can be used to enable HWCAP/HWCAP2 feature yyy and disable HWCAP/HWCAP2
+     feature xxx, where the feature name is case-sensitive and has to match
+     the ones mentioned in the file sysdeps/powerpc/dl-procinfo.c.  */
+
+  /* Copy the features from dl_powerpc_cpu_features, which contains the
+     features provided by AT_HWCAP and AT_HWCAP2.  */
+  struct cpu_features *cpu_features = &GLRO (dl_powerpc_cpu_features);
+  unsigned long int tcbv_hwcap = cpu_features->hwcap;
+  unsigned long int tcbv_hwcap2 = cpu_features->hwcap2;
+  unsigned int tun_count;
+  const char *token = valp->strval;
+  tun_count = sizeof (hwcap_tunables) / sizeof (hwcap_tunables[0]);
+  do
+    {
+      const char *token_end, *feature;
+      bool disable;
+      size_t token_len, i, feature_len, offset = 0;
+      /* Find token separator or end of string.  */
+      for (token_end = token; *token_end != ','; token_end++)
+        if (*token_end == '\0')
+          break;
+
+      /* Determine feature.  */
+      token_len = token_end - token;
+      if (*token == '-')
+        {
+          disable = true;
+          feature = token + 1;
+          feature_len = token_len - 1;
+        }
+      else
+        {
+          disable = false;
+          feature = token;
+          feature_len = token_len;
+        }
+      for (i = 0; i < tun_count; ++i)
+        {
+          const char *hwcap_name = hwcap_names + offset;
+          /* Check the tunable name on the supported list.  */
+          if (STRLEN_DEFAULT (hwcap_name) == feature_len
+              && MEMCMP_DEFAULT (feature, hwcap_name, feature_len)
+                 == 0)
+            {
+              /* Update the hwcap and hwcap2 bits.  */
+              if (disable)
+                {
+                  /* Id is 1 for hwcap2 tunable.  */
+                  if (hwcap_tunables[i].id)
+                    cpu_features->hwcap2 &= ~(hwcap_tunables[i].mask);
+                  else
+                    cpu_features->hwcap &= ~(hwcap_tunables[i].mask);
+                }
+              else
+                {
+                  /* Enable the features and also check that no unsupported
+                     features were enabled by user.  */
+                  if (hwcap_tunables[i].id)
+                    cpu_features->hwcap2 |= (tcbv_hwcap2 & hwcap_tunables[i].mask);
+                  else
+                    cpu_features->hwcap |= (tcbv_hwcap & hwcap_tunables[i].mask);
+                }
+              break;
+            }
+          offset += STRLEN_DEFAULT (hwcap_name) + 1;
+        }
+      token += token_len;
+      /* ... and skip token separator for next round.  */
+      if (*token == ',') token++;
+    }
+  while (*token != '\0');
+}
 
 static inline void
-init_cpu_features (struct cpu_features *cpu_features)
+init_cpu_features (struct cpu_features *cpu_features, uint64_t hwcaps[])
 {
+  /* Fill the cpu_features with the supported hwcaps
+     which are set by __tcb_parse_hwcap_and_convert_at_platform.  */
+  cpu_features->hwcap = hwcaps[0];
+  cpu_features->hwcap2 = hwcaps[1];
   /* Default is to use aligned memory access on optimized function unless
      tunables is enable, since for this case user can explicit disable
      unaligned optimizations.  */
   int32_t cached_memfunc = TUNABLE_GET (glibc, cpu, cached_memopt, int32_t,
                                         NULL);
   cpu_features->use_cached_memopt = (cached_memfunc > 0);
+  TUNABLE_GET (glibc, cpu, hwcaps, tunable_val_t *,
+               TUNABLE_CALLBACK (set_hwcaps));
 }
diff --git a/sysdeps/powerpc/cpu-features.h b/sysdeps/powerpc/cpu-features.h
index d316dc3d64..e5fce88e5e 100644
--- a/sysdeps/powerpc/cpu-features.h
+++ b/sysdeps/powerpc/cpu-features.h
@@ -19,10 +19,112 @@
 # define __CPU_FEATURES_POWERPC_H
 
 #include
+#include
 
 struct cpu_features
 {
   bool use_cached_memopt;
+  unsigned long int hwcap;
+  unsigned long int hwcap2;
+};
+
+static const char hwcap_names[] = {
+  "4xxmac\0"
+  "altivec\0"
+  "arch_2_05\0"
+  "arch_2_06\0"
+  "archpmu\0"
+  "booke\0"
+  "cellbe\0"
+  "dfp\0"
+  "efpdouble\0"
+  "efpsingle\0"
+  "fpu\0"
+  "ic_snoop\0"
+  "mmu\0"
+  "notb\0"
+  "pa6t\0"
+  "power4\0"
+  "power5\0"
+  "power5+\0"
+  "power6x\0"
+  "ppc32\0"
+  "ppc601\0"
+  "ppc64\0"
+  "ppcle\0"
+  "smt\0"
+  "spe\0"
+  "true_le\0"
+  "ucache\0"
+  "vsx\0"
+  "arch_2_07\0"
+  "dscr\0"
+  "ebb\0"
+  "htm\0"
+  "htm-nosc\0"
+  "htm-no-suspend\0"
+  "isel\0"
+  "tar\0"
+  "vcrypto\0"
+  "arch_3_00\0"
+  "ieee128\0"
+  "darn\0"
+  "scv\0"
+  "arch_3_1\0"
+  "mma\0"
+};
+
+static const struct
+{
+  unsigned int mask;
+  bool id;
+} hwcap_tunables[] = {
+  /* AT_HWCAP tunable masks.  */
+  { PPC_FEATURE_HAS_4xxMAC, 0 },
+  { PPC_FEATURE_HAS_ALTIVEC, 0 },
+  { PPC_FEATURE_ARCH_2_05, 0 },
+  { PPC_FEATURE_ARCH_2_06, 0 },
+  { PPC_FEATURE_PSERIES_PERFMON_COMPAT, 0 },
+  { PPC_FEATURE_BOOKE, 0 },
+  { PPC_FEATURE_CELL_BE, 0 },
+  { PPC_FEATURE_HAS_DFP, 0 },
+  { PPC_FEATURE_HAS_EFP_DOUBLE, 0 },
+  { PPC_FEATURE_HAS_EFP_SINGLE, 0 },
+  { PPC_FEATURE_HAS_FPU, 0 },
+  { PPC_FEATURE_ICACHE_SNOOP, 0 },
+  { PPC_FEATURE_HAS_MMU, 0 },
+  { PPC_FEATURE_NO_TB, 0 },
+  { PPC_FEATURE_PA6T, 0 },
+  { PPC_FEATURE_POWER4, 0 },
+  { PPC_FEATURE_POWER5, 0 },
+  { PPC_FEATURE_POWER5_PLUS, 0 },
+  { PPC_FEATURE_POWER6_EXT, 0 },
+  { PPC_FEATURE_32, 0 },
+  { PPC_FEATURE_601_INSTR, 0 },
+  { PPC_FEATURE_64, 0 },
+  { PPC_FEATURE_PPC_LE, 0 },
+  { PPC_FEATURE_SMT, 0 },
+  { PPC_FEATURE_HAS_SPE, 0 },
+  { PPC_FEATURE_TRUE_LE, 0 },
+  { PPC_FEATURE_UNIFIED_CACHE, 0 },
+  { PPC_FEATURE_HAS_VSX, 0 },
+
+  /* AT_HWCAP2 tunable masks.  */
+  { PPC_FEATURE2_ARCH_2_07, 1 },
+  { PPC_FEATURE2_HAS_DSCR, 1 },
+  { PPC_FEATURE2_HAS_EBB, 1 },
+  { PPC_FEATURE2_HAS_HTM, 1 },
+  { PPC_FEATURE2_HTM_NOSC, 1 },
+  { PPC_FEATURE2_HTM_NO_SUSPEND, 1 },
+  { PPC_FEATURE2_HAS_ISEL, 1 },
+  { PPC_FEATURE2_HAS_TAR, 1 },
+  { PPC_FEATURE2_HAS_VEC_CRYPTO, 1 },
+  { PPC_FEATURE2_ARCH_3_00, 1 },
+  { PPC_FEATURE2_HAS_IEEE128, 1 },
+  { PPC_FEATURE2_DARN, 1 },
+  { PPC_FEATURE2_SCV, 1 },
+  { PPC_FEATURE2_ARCH_3_1, 1 },
+  { PPC_FEATURE2_MMA, 1 },
 };
 
 #endif /* __CPU_FEATURES_H  */
diff --git a/sysdeps/powerpc/dl-tunables.list b/sysdeps/powerpc/dl-tunables.list
index 87d6235c75..807b7f8013 100644
--- a/sysdeps/powerpc/dl-tunables.list
+++ b/sysdeps/powerpc/dl-tunables.list
@@ -24,5 +24,8 @@ glibc {
       maxval: 1
       default: 0
     }
+    hwcaps {
+      type: STRING
+    }
   }
 }
diff --git a/sysdeps/powerpc/hwcapinfo.c b/sysdeps/powerpc/hwcapinfo.c
index e26e64d99e..0652df353a 100644
--- a/sysdeps/powerpc/hwcapinfo.c
+++ b/sysdeps/powerpc/hwcapinfo.c
@@ -19,6 +19,7 @@
 #include
 #include
 #include
+#include
 
 tcbhead_t __tcb __attribute__ ((visibility ("hidden")));
 
@@ -63,6 +64,9 @@ __tcb_parse_hwcap_and_convert_at_platform (void)
   else if (h1 & PPC_FEATURE_POWER5)
     h1 |= PPC_FEATURE_POWER4;
 
+  uint64_t array_hwcaps[] = { h1, h2 };
+  init_cpu_features (&GLRO (dl_powerpc_cpu_features), array_hwcaps);
+
   /* Consolidate both HWCAP and HWCAP2 into a single doubleword so that
      we can read both in a single load later.  */
   __tcb.hwcap = (h1 << 32) | (h2 & 0xffffffff);
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
index b4f80539e7..3b9278f5e5 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include
 
 size_t
 __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
@@ -28,7 +29,8 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 {
   size_t i = max;
 
-  unsigned long int hwcap = GLRO(dl_hwcap);
+  const struct cpu_features *features = &GLRO (dl_powerpc_cpu_features);
+  unsigned long int hwcap = features->hwcap;
   /* hwcap contains only the latest supported ISA, the code checks which is
      and fills the previous supported ones.  */
   if (hwcap & PPC_FEATURE_ARCH_2_06)
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h b/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h
index 3dd00e02ee..a0bbd12012 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h
@@ -16,6 +16,7 @@
    .  */
 
 #include
+#include
 
 /* The code checks if _rtld_global_ro was realocated before trying to access
    the dl_hwcap field.  The assembly is to make the compiler not optimize the
@@ -32,11 +33,12 @@
 # define __GLRO(value) GLRO(value)
 #endif
 
-/* dl_hwcap contains only the latest supported ISA, the macro checks which is
-   and fills the previous ones.  */
+/* Get the hardware information after the tunables are applied; the macro
+   checks it and fills the previous ones.  */
 #define INIT_ARCH() \
-  unsigned long int hwcap = __GLRO(dl_hwcap); \
-  unsigned long int __attribute__((unused)) hwcap2 = __GLRO(dl_hwcap2); \
+  const struct cpu_features *features = &GLRO(dl_powerpc_cpu_features); \
+  unsigned long int hwcap = features->hwcap; \
+  unsigned long int __attribute__((unused)) hwcap2 = features->hwcap2; \
   bool __attribute__((unused)) use_cached_memopt = \
     __GLRO(dl_powerpc_cpu_features.use_cached_memopt); \
   if (hwcap & PPC_FEATURE_ARCH_2_06) \
diff --git a/sysdeps/powerpc/powerpc64/dl-machine.h b/sysdeps/powerpc/powerpc64/dl-machine.h
index 9b8943bc91..449208e86f 100644
--- a/sysdeps/powerpc/powerpc64/dl-machine.h
+++ b/sysdeps/powerpc/powerpc64/dl-machine.h
@@ -27,7 +27,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -297,7 +296,6 @@ static inline void __attribute__ ((unused))
 dl_platform_init (void)
 {
   __tcb_parse_hwcap_and_convert_at_platform ();
-  init_cpu_features (&GLRO(dl_powerpc_cpu_features));
 }
 
 #endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index ebe9434052..139b846cef 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -17,6 +17,7 @@
    .  */
 
 #include
+#include
 #include
 #include
 #include
@@ -27,9 +28,9 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
                         size_t max)
 {
   size_t i = max;
-
-  unsigned long int hwcap = GLRO(dl_hwcap);
-  unsigned long int hwcap2 = GLRO(dl_hwcap2);
+  const struct cpu_features *features = &GLRO (dl_powerpc_cpu_features);
+  unsigned long int hwcap = features->hwcap;
+  unsigned long int hwcap2 = features->hwcap2;
 #ifdef SHARED
   int cacheline_size = GLRO(dl_cache_line_size);
 #endif
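
For readers who want to try the token handling outside of glibc, the
parsing done by set_hwcaps in the patch above can be approximated with
the standalone sketch below.  It is not part of the patch: every
sketch_* name and the SKETCH_* mask values are hypothetical and stand in
for the hwcap_names/hwcap_tunables tables and the PPC_FEATURE_*
constants, and plain strlen/memcmp/strcspn are used instead of the
*_DEFAULT macros.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mask values for illustration only; the patch uses the
   PPC_FEATURE_* and PPC_FEATURE2_* constants instead.  */
#define SKETCH_ALTIVEC 0x1u
#define SKETCH_VSX     0x2u
#define SKETCH_MMA     0x1u

struct sketch_caps
{
  unsigned long int hwcap;
  unsigned long int hwcap2;
};

/* Names packed back to back, in the same order as sketch_tunables.  */
static const char sketch_names[] = "altivec\0" "vsx\0" "mma\0";

static const struct
{
  unsigned int mask;
  bool id;                      /* false: hwcap word, true: hwcap2 word.  */
} sketch_tunables[] = {
  { SKETCH_ALTIVEC, false },
  { SKETCH_VSX, false },
  { SKETCH_MMA, true },
};

/* Apply a "-xxx,yyy" style string to CAPS.  Features can be removed
   freely, but only features present on entry can be switched back on.  */
void
sketch_apply_hwcaps (const char *str, struct sketch_caps *caps)
{
  const unsigned long int orig_hwcap = caps->hwcap;
  const unsigned long int orig_hwcap2 = caps->hwcap2;
  const size_t count = sizeof (sketch_tunables) / sizeof (sketch_tunables[0]);
  const char *token = str;

  do
    {
      /* A token ends at the next comma or at the end of the string.  */
      size_t token_len = strcspn (token, ",");
      bool disable = (*token == '-');
      const char *feature = disable ? token + 1 : token;
      size_t feature_len = disable ? token_len - 1 : token_len;
      size_t offset = 0;

      for (size_t i = 0; i < count; ++i)
        {
          const char *name = sketch_names + offset;
          size_t name_len = strlen (name);
          if (name_len == feature_len
              && memcmp (feature, name, feature_len) == 0)
            {
              unsigned long int *word
                = sketch_tunables[i].id ? &caps->hwcap2 : &caps->hwcap;
              unsigned long int orig
                = sketch_tunables[i].id ? orig_hwcap2 : orig_hwcap;
              if (disable)
                *word &= ~(unsigned long int) sketch_tunables[i].mask;
              else
                *word |= orig & sketch_tunables[i].mask;
              break;
            }
          /* Step over this name and its terminating NUL.  */
          offset += name_len + 1;
        }

      /* Advance past the token and, if present, its separator.  */
      token += token_len;
      if (*token == ',')
        token++;
    }
  while (*token != '\0');
}
```

With both hypothetical hwcap bits set on entry, passing "-vsx" clears
only the vsx bit, while "-vsx,vsx" leaves the word unchanged because the
enable step is masked against the values captured before parsing.
Unknown names are skipped silently, as in the patch.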