From patchwork Fri Nov 8 14:01:38 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mahesh Bodapati
X-Patchwork-Id: 2008470
From: Mahesh Bodapati
To: libc-alpha@sourceware.org
Cc: bergner@linux.ibm.com, murphyp@linux.ibm.com, Mahesh Bodapati
Subject: [PATCH] powerpc64le: Optimized strcat for POWER10
Date: Fri, 8 Nov 2024 09:01:38 -0500
Message-ID: <20241108140138.3880456-1-bmahi496@linux.ibm.com>
X-Mailer: git-send-email 2.43.5

This patch adds an optimized strcat for POWER10.  It reuses the generic
strcat implementation in string/ and builds it on top of the optimized
strcpy (__strcpy_power9) and strlen (__strlen_power10) routines.
---
 sysdeps/powerpc/powerpc64/multiarch/Makefile  |  5 +--
 .../powerpc64/multiarch/ifunc-impl-list.c     |  5 +++
 .../powerpc64/multiarch/strcat-power10.c      | 33 +++++++++++++++++++
 sysdeps/powerpc/powerpc64/multiarch/strcat.c  | 23 +++++++++----
 4 files changed, 57 insertions(+), 9 deletions(-)
 create mode 100644 sysdeps/powerpc/powerpc64/multiarch/strcat-power10.c

diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile
index b847c19049..dc7c5b14ee 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile
@@ -34,8 +34,9 @@ ifneq (,$(filter %le,$(config-machine)))
 sysdep_routines += memchr-power10 memcmp-power10 memcpy-power10 \
                    memmove-power10 memset-power10 rawmemchr-power9 \
                    rawmemchr-power10 strcmp-power9 strcmp-power10 \
-                   strncmp-power9 strncmp-power10 strcpy-power9 stpcpy-power9 \
-                   strlen-power9 strncpy-power9 stpncpy-power9 strlen-power10
+                   strncmp-power9 strncmp-power10 strcpy-power9 strcat-power10 \
+                   stpcpy-power9 strlen-power9 strncpy-power9 stpncpy-power9 \
+                   strlen-power10
 endif
 CFLAGS-strncase-power7.c += -mcpu=power7 -funroll-loops
 CFLAGS-strncase_l-power7.c += -mcpu=power7 -funroll-loops
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index 2bb47d3527..ab9e7c6142 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -406,6 +406,11 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 
   /* Support sysdeps/powerpc/powerpc64/multiarch/strcat.c. */
   IFUNC_IMPL (i, name, strcat,
+#ifdef __LITTLE_ENDIAN__
+              IFUNC_IMPL_ADD (array, i, strcat, hwcap2 & PPC_FEATURE2_ARCH_3_1
+                              && hwcap & PPC_FEATURE_HAS_VSX,
+                              __strcat_power10)
+#endif
               IFUNC_IMPL_ADD (array, i, strcat,
                               hwcap2 & PPC_FEATURE2_ARCH_2_07
                               && hwcap & PPC_FEATURE_HAS_VSX,
diff --git a/sysdeps/powerpc/powerpc64/multiarch/strcat-power10.c b/sysdeps/powerpc/powerpc64/multiarch/strcat-power10.c
new file mode 100644
index 0000000000..8d653ab500
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/multiarch/strcat-power10.c
@@ -0,0 +1,33 @@
+/* Copyright (C) 2015-2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#ifdef __LITTLE_ENDIAN__
+#include <string.h>
+
+#define STRCAT __strcat_power10
+
+#undef libc_hidden_def
+#define libc_hidden_def(name)
+
+extern typeof (strcpy) __strcpy_power9;
+extern typeof (strlen) __strlen_power10;
+
+#define strcpy __strcpy_power9
+#define strlen __strlen_power10
+
+#include <string/strcat.c>
+#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/strcat.c b/sysdeps/powerpc/powerpc64/multiarch/strcat.c
index 27e636e0ff..3493716c3c 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/strcat.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/strcat.c
@@ -25,14 +25,23 @@
 extern __typeof (strcat) __strcat_ppc attribute_hidden;
 extern __typeof (strcat) __strcat_power7 attribute_hidden;
 extern __typeof (strcat) __strcat_power8 attribute_hidden;
+#ifdef __LITTLE_ENDIAN__
+extern __typeof (strcat) __strcat_power10 attribute_hidden;
+#endif
 # undef strcat
 
+
 libc_ifunc_redirected (__redirect_strcat, strcat,
-                       (hwcap2 & PPC_FEATURE2_ARCH_2_07
-                        && hwcap & PPC_FEATURE_HAS_VSX)
-                       ? __strcat_power8
-                       : (hwcap & PPC_FEATURE_ARCH_2_06
-                          && hwcap & PPC_FEATURE_HAS_VSX)
-                         ? __strcat_power7
-                         : __strcat_ppc);
+#ifdef __LITTLE_ENDIAN__
+                       (hwcap2 & PPC_FEATURE2_ARCH_3_1
+                        && hwcap & PPC_FEATURE_HAS_VSX)
+                       ? __strcat_power10 :
+#endif
+                       (hwcap2 & PPC_FEATURE2_ARCH_2_07
+                        && hwcap & PPC_FEATURE_HAS_VSX)
+                       ? __strcat_power8
+                       : (hwcap & PPC_FEATURE_ARCH_2_06
+                          && hwcap & PPC_FEATURE_HAS_VSX)
+                         ? __strcat_power7
+                         : __strcat_ppc);
 #endif
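
Note for readers unfamiliar with the pattern: the sketch below illustrates, under stated assumptions, the two ideas this patch combines. One is composing strcat from a strlen and a strcpy routine; the other is choosing an implementation from capability bits. It is not glibc code. All demo_* names and DEMO_* bit values are hypothetical, the stand-in routines simply forward to the C library, and the generic strcat body is assumed to have the usual "strcpy to dest + strlen (dest)" shape. In glibc the choice is made by the IFUNC machinery at load time, not by an ordinary function call as shown here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the tuned routines (__strlen_power10 and
   __strcpy_power9 in the patch); here they just forward to libc.  */
static size_t demo_strlen_fast (const char *s) { return strlen (s); }
static char *demo_strcpy_fast (char *d, const char *s) { return strcpy (d, s); }

/* Composed variant: find the end of DEST, then copy SRC there.
   Assumed to mirror the shape of the generic strcat body.  */
static char *
demo_strcat_fast (char *dest, const char *src)
{
  demo_strcpy_fast (dest + demo_strlen_fast (dest), src);
  return dest;
}

/* Baseline variant: a plain byte loop, standing in for __strcat_ppc.  */
static char *
demo_strcat_base (char *dest, const char *src)
{
  char *p = dest;
  while (*p != '\0')
    p++;
  while ((*p++ = *src++) != '\0')
    ;
  return dest;
}

/* Hypothetical capability bits; the real PPC_FEATURE* values differ.  */
#define DEMO_FEATURE2_ARCH_3_1 0x1u
#define DEMO_FEATURE_HAS_VSX   0x2u

typedef char *(*demo_strcat_fn) (char *, const char *);

/* Return the composed variant only when both capability bits are set,
   otherwise fall back to the baseline.  */
static demo_strcat_fn
demo_select_strcat (unsigned long hwcap2, unsigned long hwcap)
{
  return ((hwcap2 & DEMO_FEATURE2_ARCH_3_1) && (hwcap & DEMO_FEATURE_HAS_VSX))
         ? demo_strcat_fast
         : demo_strcat_base;
}
```

A caller would invoke demo_select_strcat once with the two capability words and then use the returned pointer for every concatenation. Both variants must produce identical results, so the selection only affects speed.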