From patchwork Fri Nov 10 17:35:19 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 1862477
From: Wilco Dijkstra
To: 'GNU C Library'
CC: Szabolcs Nagy
Subject: [PATCH 2/3] AArch64: Add memset_zva64
Date: Fri, 10 Nov 2023 17:35:19 +0000
X-BeenThere: libc-alpha@sourceware.org
List-Id: Libc-alpha mailing list

Add a specialized memset for the common ZVA size of 64.
Since the code is identical to __memset_falkor, remove the latter.

OK for commit?

Reviewed-by: Adhemerval Zanella

diff --git a/sysdeps/aarch64/memset.S b/sysdeps/aarch64/memset.S
index bf3cf85c8a95fd8c03ae13c4173fe507040ee8cd..bbfb7184c3e4277f59178ccf4f9b92814dd7a48d 100644
--- a/sysdeps/aarch64/memset.S
+++ b/sysdeps/aarch64/memset.S
@@ -101,19 +101,19 @@ L(tail64):
 	ret
 
 L(try_zva):
-#ifdef ZVA_MACRO
-	zva_macro
-#else
+#ifndef ZVA64_ONLY
 	.p2align 3
 	mrs	tmp1, dczid_el0
 	tbnz	tmp1w, 4, L(no_zva)
 	and	tmp1w, tmp1w, 15
 	cmp	tmp1w, 4	/* ZVA size is 64 bytes.  */
 	b.ne	L(zva_128)
-
+	nop
+#endif
 	/* Write the first and last 64 byte aligned block using stp rather
 	   than using DC ZVA.  This is faster on some cores.
 	 */
+	.p2align 4
 L(zva_64):
 	str	q0, [dst, 16]
 	stp	q0, q0, [dst, 32]
@@ -123,7 +123,6 @@ L(zva_64):
 	sub	count, dstend, dst	/* Count is now 128 too large.  */
 	sub	count, count, 128+64+64	/* Adjust count and bias for loop.  */
 	add	dst, dst, 128
-	nop
 1:	dc	zva, dst
 	add	dst, dst, 64
 	subs	count, count, 64
@@ -134,6 +133,7 @@ L(zva_64):
 	stp	q0, q0, [dstend, -32]
 	ret
 
+#ifndef ZVA64_ONLY
 	.p2align 3
 L(zva_128):
 	cmp	tmp1w, 5	/* ZVA size is 128 bytes.  */
diff --git a/sysdeps/aarch64/multiarch/Makefile b/sysdeps/aarch64/multiarch/Makefile
index a1a4de3cd93c48db6e47eebc9c111186efca53fb..171ca5e4cf9a87fc7df5896f21c2e5b94ea218ba 100644
--- a/sysdeps/aarch64/multiarch/Makefile
+++ b/sysdeps/aarch64/multiarch/Makefile
@@ -12,10 +12,10 @@ sysdep_routines += \
   memmove_mops \
   memset_a64fx \
   memset_emag \
-  memset_falkor \
   memset_generic \
   memset_kunpeng \
   memset_mops \
+  memset_zva64 \
   strlen_asimd \
   strlen_generic \
   # sysdep_routines
diff --git a/sysdeps/aarch64/multiarch/ifunc-impl-list.c b/sysdeps/aarch64/multiarch/ifunc-impl-list.c
index 3596d3c8d3403b4ea07d80d9a8877e2908a9883e..fdd9ea92463123df213dec27f6f0598f8ce54d6e 100644
--- a/sysdeps/aarch64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/aarch64/multiarch/ifunc-impl-list.c
@@ -54,9 +54,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 	      IFUNC_IMPL_ADD (array, i, memmove, mops, __memmove_mops)
 	      IFUNC_IMPL_ADD (array, i, memmove, 1, __memmove_generic))
   IFUNC_IMPL (i, name, memset,
-	      /* Enable this on non-falkor processors too so that other cores
-		 can do a comparative analysis with __memset_generic.  */
-	      IFUNC_IMPL_ADD (array, i, memset, (zva_size == 64), __memset_falkor)
+	      IFUNC_IMPL_ADD (array, i, memset, (zva_size == 64), __memset_zva64)
 	      IFUNC_IMPL_ADD (array, i, memset, 1, __memset_emag)
 	      IFUNC_IMPL_ADD (array, i, memset, 1, __memset_kunpeng)
 #if HAVE_AARCH64_SVE_ASM
diff --git a/sysdeps/aarch64/multiarch/memset.c b/sysdeps/aarch64/multiarch/memset.c
index 9193b197ddc3a647768184a6a639d6635cfea96e..6deb6865e5154f129922dca673cf069f72f46d79 100644
--- a/sysdeps/aarch64/multiarch/memset.c
+++ b/sysdeps/aarch64/multiarch/memset.c
@@ -28,7 +28,7 @@
 
 extern __typeof (__redirect_memset) __libc_memset;
 
-extern __typeof (__redirect_memset) __memset_falkor attribute_hidden;
+extern __typeof (__redirect_memset) __memset_zva64 attribute_hidden;
 extern __typeof (__redirect_memset) __memset_emag attribute_hidden;
 extern __typeof (__redirect_memset) __memset_kunpeng attribute_hidden;
 extern __typeof (__redirect_memset) __memset_a64fx attribute_hidden;
@@ -47,18 +47,17 @@ select_memset_ifunc (void)
     {
       if (IS_A64FX (midr) && zva_size == 256)
 	return __memset_a64fx;
-      return __memset_generic;
     }
 
   if (IS_KUNPENG920 (midr))
     return __memset_kunpeng;
 
-  if ((IS_FALKOR (midr) || IS_PHECDA (midr)) && zva_size == 64)
-    return __memset_falkor;
-
   if (IS_EMAG (midr))
     return __memset_emag;
 
+  if (zva_size == 64)
+    return __memset_zva64;
+
   return __memset_generic;
 }
 
diff --git a/sysdeps/aarch64/multiarch/memset_falkor.S b/sysdeps/aarch64/multiarch/memset_falkor.S
deleted file mode 100644
index c6946a8072ce60099f9c3da0cf4ca54785e6a520..0000000000000000000000000000000000000000
--- a/sysdeps/aarch64/multiarch/memset_falkor.S
+++ /dev/null
@@ -1,54 +0,0 @@
-/* Memset for falkor.
-   Copyright (C) 2017-2023 Free Software Foundation, Inc.
-
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library.  If not, see
-   .  */
-
-#include
-#include
-
-/* Reading dczid_el0 is expensive on falkor so move it into the ifunc
-   resolver and assume ZVA size of 64 bytes.  The IFUNC resolver takes care to
-   use this function only when ZVA is enabled.  */
-
-#if IS_IN (libc)
-.macro zva_macro
-	.p2align 4
-	/* Write the first and last 64 byte aligned block using stp rather
-	   than using DC ZVA.  This is faster on some cores.  */
-	str	q0, [dst, 16]
-	stp	q0, q0, [dst, 32]
-	bic	dst, dst, 63
-	stp	q0, q0, [dst, 64]
-	stp	q0, q0, [dst, 96]
-	sub	count, dstend, dst	/* Count is now 128 too large.  */
-	sub	count, count, 128+64+64	/* Adjust count and bias for loop.  */
-	add	dst, dst, 128
-1:	dc	zva, dst
-	add	dst, dst, 64
-	subs	count, count, 64
-	b.hi	1b
-	stp	q0, q0, [dst, 0]
-	stp	q0, q0, [dst, 32]
-	stp	q0, q0, [dstend, -64]
-	stp	q0, q0, [dstend, -32]
-	ret
-.endm
-
-# define ZVA_MACRO zva_macro
-# define MEMSET __memset_falkor
-# include
-#endif
diff --git a/sysdeps/aarch64/multiarch/memset_zva64.S b/sysdeps/aarch64/multiarch/memset_zva64.S
new file mode 100644
index 0000000000000000000000000000000000000000..13f45fd3d882c756f18a1679d758e2eb688f9c3d
--- /dev/null
+++ b/sysdeps/aarch64/multiarch/memset_zva64.S
@@ -0,0 +1,27 @@
+/* Optimized memset for zva size = 64.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   .  */
+
+#include
+
+#define ZVA64_ONLY 1
+#define MEMSET __memset_zva64
+#undef libc_hidden_builtin_def
+#define libc_hidden_builtin_def(X)
+
+#include "../memset.S"
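
Illustration only, not part of the patch: the selection order that the memset.c hunk leaves behind can be modelled in plain C. This is a simplified sketch. The enum, the string return values and the `guarded` flag are hypothetical stand-ins; in particular, the condition on the block that contains the A64FX check lies outside the hunk, so it is represented here by an assumed boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the MIDR checks used by the resolver.  */
enum cpu { CPU_OTHER, CPU_A64FX, CPU_KUNPENG920, CPU_EMAG };

/* Simplified model of select_memset_ifunc after this patch.
   GUARDED stands for the condition on the block holding the A64FX check;
   that condition is not visible in the hunk, so it is an assumption.  */
static const char *
select_memset_model (enum cpu cpu, unsigned zva_size, bool guarded)
{
  if (guarded)
    {
      if (cpu == CPU_A64FX && zva_size == 256)
        return "__memset_a64fx";
      /* The patch drops the early return of __memset_generic here,
         so control falls through to the checks below.  */
    }

  if (cpu == CPU_KUNPENG920)
    return "__memset_kunpeng";

  if (cpu == CPU_EMAG)
    return "__memset_emag";

  /* New in this patch: any remaining core with a 64-byte ZVA size.  */
  if (zva_size == 64)
    return "__memset_zva64";

  return "__memset_generic";
}
```

In this model a core that is not A64FX, Kunpeng 920 or eMAG and reports a 64-byte ZVA size gets __memset_zva64 whether or not it enters the guarded block, whereas the removed lines restricted the 64-byte variant to Falkor and Phecda.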
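
Illustration only, not part of the patch: the count and bias arithmetic of the 64-byte ZVA path can be checked with a small C model that records which bytes each store touches. This is a sketch under stated assumptions. The entry state (dst is dstin rounded down to 16, the first 16 bytes at dstin are already stored, and at least 256 bytes are being set) comes from code outside the hunks and is assumed here; array indices stand in for addresses, and all helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { GUARD = 64, MAX_LEN = 1024, BUF_SIZE = GUARD + 64 + MAX_LEN + GUARD };

/* Mark LEN bytes starting at index POS as written.  Indices stand in for
   addresses; index 0 is treated as 64-byte aligned.  */
static void
store (unsigned char *buf, size_t pos, size_t len)
{
  memset (buf + pos, 1, len);
}

/* Model of the L(zva_64) path for N >= 256 bytes starting at DSTIN.  */
static void
model_zva64 (unsigned char *buf, size_t dstin, size_t n)
{
  size_t dstend = dstin + n;
  size_t dst = dstin & ~(size_t) 15;  /* Assumed entry state.  */
  store (buf, dstin, 16);             /* Assumed: done before L(try_zva).  */

  store (buf, dst + 16, 16);          /* str q0, [dst, 16] */
  store (buf, dst + 32, 32);          /* stp q0, q0, [dst, 32] */
  dst &= ~(size_t) 63;                /* bic dst, dst, 63 */
  store (buf, dst + 64, 32);          /* stp q0, q0, [dst, 64] */
  store (buf, dst + 96, 32);          /* stp q0, q0, [dst, 96] */
  size_t count = dstend - dst;        /* Count is now 128 too large.  */
  count -= 128 + 64 + 64;             /* Adjust count and bias for loop.  */
  dst += 128;

  size_t before;
  do
    {
      store (buf, dst, 64);           /* dc zva, dst */
      dst += 64;
      before = count;
      count -= 64;                    /* subs count, count, 64 */
    }
  while (before > 64);                /* b.hi 1b (unsigned higher) */

  store (buf, dst, 32);               /* stp q0, q0, [dst, 0] */
  store (buf, dst + 32, 32);          /* stp q0, q0, [dst, 32] */
  store (buf, dstend - 64, 32);       /* stp q0, q0, [dstend, -64] */
  store (buf, dstend - 32, 32);       /* stp q0, q0, [dstend, -32] */
}

/* Return 1 if exactly the bytes in [DSTIN, DSTIN + N) were written.  */
static int
covers_exactly (size_t dstin, size_t n)
{
  unsigned char buf[BUF_SIZE];
  memset (buf, 0, sizeof buf);
  model_zva64 (buf, dstin, n);
  for (size_t i = 0; i < sizeof buf; i++)
    if (buf[i] != (i >= dstin && i < dstin + n))
      return 0;
  return 1;
}
```

Calling covers_exactly for every start offset within a 64-byte block and every length from 256 up to MAX_LEN checks that, in this model, the head stores, the DC ZVA loop and the tail stores together cover the requested range without writing outside it.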