From patchwork Fri Jul 26 09:19:57 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1965223
Date: Fri, 26 Jul 2024 10:19:57 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, ktkachov@gcc.gnu.org, richard.sandiford@arm.com
Subject: [PATCH 1/8]AArch64: Update Neoverse V2 cost model to release costs
List-Id: Gcc-patches mailing list

Hi All,

This updates the cost for Neoverse V2 to reflect the updated
Software Optimization Guide.

It also makes Cortex-X3 use the Neoverse V2 cost model.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?
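For readers checking the new SVE reduction costs: the comments in the neoversev2.h hunk describe each cost as the number of scalar operations replaced plus the extra cycles the vector reduction takes over the scalar sequence. The following is a minimal sketch of that arithmetic, assuming this reading of the comments. The names reduc_cost, reduc_case, neoversev2_sve_cases and count_mismatches are hypothetical helpers for illustration only and do not exist in GCC; the numbers are copied from the patch comments.

```c
#include <stddef.h>

/* Hypothetical helper (not a GCC function): cost of a vector reduction as
   read from the patch comments, i.e. the number of scalar operations it
   replaces plus the cycles it takes beyond the scalar sequence.  */
int
reduc_cost (int scalar_ops, int scalar_cycles, int reduc_cycles)
{
  return scalar_ops + (reduc_cycles - scalar_cycles);
}

/* One row per SVE reduction field touched by the patch.  */
struct reduc_case
{
  const char *field;
  int scalar_ops;     /* scalar ADDs/FADDs replaced */
  int scalar_cycles;  /* approximate cycles for the scalar sequence */
  int reduc_cycles;   /* cycles quoted for [SU]ADDV / FADDV */
  int patch_cost;     /* value the patch puts in the table */
};

const struct reduc_case neoversev2_sve_cases[] = {
  { "reduc_i8_cost", 15, 5, 9, 19 },
  { "reduc_i16_cost", 7, 3, 8, 12 },
  { "reduc_i32_cost", 3, 2, 6, 7 },
  { "reduc_i64_cost", 1, 1, 4, 4 },
  { "reduc_f16_cost", 7, 6, 8, 9 },
  { "reduc_f32_cost", 3, 4, 6, 5 },
  { "reduc_f64_cost", 1, 2, 4, 3 },
};

/* Count rows whose table value disagrees with the formula above.  */
size_t
count_mismatches (void)
{
  size_t n = sizeof neoversev2_sve_cases / sizeof neoversev2_sve_cases[0];
  size_t bad = 0;
  for (size_t i = 0; i < n; i++)
    {
      const struct reduc_case *c = &neoversev2_sve_cases[i];
      if (reduc_cost (c->scalar_ops, c->scalar_cycles, c->reduc_cycles)
          != c->patch_cost)
        bad++;
    }
  return bad;
}
```

With the figures quoted in the patch comments, every row reproduces the value in the patch (for example 15 + (9 - 5) = 19 for reduc_i8_cost), so count_mismatches () returns 0.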
Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-cores.def (cortex-x3): Use Neoverse-V2 costs. * config/aarch64/tuning_models/neoversev2.h: Update costs. --- -- diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def index e58bc0f27de3d60d39c02d2be2aa15570bd5db4d..34307fe0c1721dda67adab768dd22a5649687f6e 100644 --- a/gcc/config/aarch64/aarch64-cores.def +++ b/gcc/config/aarch64/aarch64-cores.def @@ -186,7 +186,7 @@ AARCH64_CORE("cortex-a720", cortexa720, cortexa57, V9_2A, (SVE2_BITPERM, MEMTA AARCH64_CORE("cortex-x2", cortexx2, cortexa57, V9A, (SVE2_BITPERM, MEMTAG, I8MM, BF16), neoversen2, 0x41, 0xd48, -1) -AARCH64_CORE("cortex-x3", cortexx3, cortexa57, V9A, (SVE2_BITPERM, MEMTAG, I8MM, BF16), neoversen2, 0x41, 0xd4e, -1) +AARCH64_CORE("cortex-x3", cortexx3, cortexa57, V9A, (SVE2_BITPERM, MEMTAG, I8MM, BF16), neoversev2, 0x41, 0xd4e, -1) AARCH64_CORE("cortex-x4", cortexx4, cortexa57, V9_2A, (SVE2_BITPERM, MEMTAG, PROFILE), neoversen2, 0x41, 0xd81, -1) diff --git a/gcc/config/aarch64/tuning_models/neoversev2.h b/gcc/config/aarch64/tuning_models/neoversev2.h index f76e4ef358f7dfb9c7d7b470ea7240eaa2120f8e..cca459e32c1384f57f8345d86b42b7814ae44115 100644 --- a/gcc/config/aarch64/tuning_models/neoversev2.h +++ b/gcc/config/aarch64/tuning_models/neoversev2.h @@ -57,13 +57,13 @@ static const advsimd_vec_cost neoversev2_advsimd_vector_cost = 2, /* ld2_st2_permute_cost */ 2, /* ld3_st3_permute_cost */ 3, /* ld4_st4_permute_cost */ - 3, /* permute_cost */ + 2, /* permute_cost */ 4, /* reduc_i8_cost */ 4, /* reduc_i16_cost */ 2, /* reduc_i32_cost */ 2, /* reduc_i64_cost */ 6, /* reduc_f16_cost */ - 3, /* reduc_f32_cost */ + 4, /* reduc_f32_cost */ 2, /* reduc_f64_cost */ 2, /* store_elt_extra_cost */ /* This value is just inherited from the Cortex-A57 table. 
*/ @@ -86,28 +86,28 @@ static const sve_vec_cost neoversev2_sve_vector_cost = { 2, /* int_stmt_cost */ 2, /* fp_stmt_cost */ - 3, /* ld2_st2_permute_cost */ + 2, /* ld2_st2_permute_cost */ 3, /* ld3_st3_permute_cost */ - 4, /* ld4_st4_permute_cost */ - 3, /* permute_cost */ + 3, /* ld4_st4_permute_cost */ + 2, /* permute_cost */ /* Theoretically, a reduction involving 15 scalar ADDs could - complete in ~3 cycles and would have a cost of 15. [SU]ADDV - completes in 11 cycles, so give it a cost of 15 + 8. */ - 21, /* reduc_i8_cost */ - /* Likewise for 7 scalar ADDs (~2 cycles) vs. 9: 7 + 7. */ - 14, /* reduc_i16_cost */ - /* Likewise for 3 scalar ADDs (~2 cycles) vs. 8: 3 + 4. */ + complete in ~5 cycles and would have a cost of 15. [SU]ADDV + completes in 9 cycles, so give it a cost of 15 + 4. */ + 19, /* reduc_i8_cost */ + /* Likewise for 7 scalar ADDs (~3 cycles) vs. 8: 7 + 5. */ + 12, /* reduc_i16_cost */ + /* Likewise for 3 scalar ADDs (~2 cycles) vs. 6: 3 + 4. */ 7, /* reduc_i32_cost */ - /* Likewise for 1 scalar ADD (~1 cycles) vs. 2: 1 + 1. */ - 2, /* reduc_i64_cost */ + /* Likewise for 1 scalar ADDs (~1 cycles) vs. 4: 1 + 3. */ + 4, /* reduc_i64_cost */ /* Theoretically, a reduction involving 7 scalar FADDs could - complete in ~6 cycles and would have a cost of 14. FADDV - completes in 8 cycles, so give it a cost of 14 + 2. */ - 16, /* reduc_f16_cost */ - /* Likewise for 3 scalar FADDs (~4 cycles) vs. 6: 6 + 2. */ - 8, /* reduc_f32_cost */ - /* Likewise for 1 scalar FADD (~2 cycles) vs. 4: 2 + 2. */ - 4, /* reduc_f64_cost */ + complete in ~6 cycles and would have a cost of 7. FADDV + completes in 8 cycles, so give it a cost of 7 + 2. */ + 9, /* reduc_f16_cost */ + /* Likewise for 3 scalar FADDs (~4 cycles) vs. 6: 3 + 2. */ + 5, /* reduc_f32_cost */ + /* Likewise for 1 scalar FADD (~2 cycles) vs. 4: 1 + 2. */ + 3, /* reduc_f64_cost */ 2, /* store_elt_extra_cost */ /* This value is just inherited from the Cortex-A57 table. 
*/ 8, /* vec_to_scalar_cost */ @@ -127,7 +127,7 @@ static const sve_vec_cost neoversev2_sve_vector_cost = /* A strided Advanced SIMD x64 load would take two parallel FP loads (8 cycles) plus an insertion (2 cycles). Assume a 64-bit SVE gather is 1 cycle more. The Advanced SIMD version is costed as 2 scalar loads - (cost 8) and a vec_construct (cost 2). Add a full vector operation + (cost 8) and a vec_construct (cost 4). Add a full vector operation (cost 2) to that, to avoid the difference being lost in rounding. There is no easy comparison between a strided Advanced SIMD x32 load @@ -165,14 +165,14 @@ static const aarch64_sve_vec_issue_info neoversev2_sve_issue_info = { { { - 3, /* loads_per_cycle */ + 3, /* loads_stores_per_cycle */ 2, /* stores_per_cycle */ 4, /* general_ops_per_cycle */ 0, /* fp_simd_load_general_ops */ 1 /* fp_simd_store_general_ops */ }, 2, /* ld2_st2_general_ops */ - 3, /* ld3_st3_general_ops */ + 2, /* ld3_st3_general_ops */ 3 /* ld4_st4_general_ops */ }, 2, /* pred_ops_per_cycle */ @@ -190,7 +190,7 @@ static const aarch64_vec_issue_info neoversev2_vec_issue_info = &neoversev2_sve_issue_info }; -/* Demeter costs for vector insn classes. */ +/* Neoversev2 costs for vector insn classes. */ static const struct cpu_vector_cost neoversev2_vector_cost = { 1, /* scalar_int_stmt_cost */ @@ -243,4 +243,4 @@ static const struct tune_params neoversev2_tunings = AARCH64_LDP_STP_POLICY_ALWAYS /* stp_policy_model. */ }; -#endif /* GCC_AARCH64_H_NEOVERSEV2. */ +#endif /* GCC_AARCH64_H_NEOVERSEV2. 
*/ \ No newline at end of file From patchwork Fri Jul 26 09:20:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 1965227 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=rZlvYpcY; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=rZlvYpcY; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WVj2K3DQSz1yY5 for ; Fri, 26 Jul 2024 19:22:17 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id ADB123870C19 for ; Fri, 26 Jul 2024 09:22:15 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR05-VI1-obe.outbound.protection.outlook.com (mail-vi1eur05on20600.outbound.protection.outlook.com [IPv6:2a01:111:f403:2613::600]) by sourceware.org (Postfix) with ESMTPS id CD38D38708EE; Fri, 26 Jul 2024 09:20:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org CD38D38708EE Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass 
smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org CD38D38708EE Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=2a01:111:f403:2613::600 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1721985638; cv=pass; b=n5UumXJcEZwjsTHkrIRp1NEQN85i5xLL7bjbIV2w64bmV29vnwNPXFypsVnX+GSJd243FYNLU1n9/dA3NtuffJ3Oug8xqkDfG2QLAuydslvP48rchP6pv+mNYtQUflSiFtln/m6gIGxe197dG42OQ3Kz7w8ekS8mFj3DRMeC+4o= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1721985638; c=relaxed/simple; bh=PmKDiVj8/r8SdzqPpknM0tBRH2KCQpsjh2kIE3B6dbI=; h=DKIM-Signature:DKIM-Signature:Date:From:To:Subject:Message-ID: MIME-Version; b=A58aeJiQovrhw4mLPR1Rb1f2BcTxwb8VsTXx1+rHVCR16ux3knd9mmo8hnb22AsVU56Gg3M+dLk42tP4LxMqR1cfsRxNYJrRPjjq/m7DDBSWBrQbuRYABboNHBdE8Odf1PQltEN+NI/FJLY4tMB9viUlKYIyNyPCGQTUvrLkUiQ= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=pass; b=Y0CJQJsV7nlEVUfGBGFPOrG4VHJnPWHnS050DDV5VQNzExPPP2dIYKD83Fi96hcfbLEoKE2zzwx9EcEz3GsytwqLrPO9rZWg8xjsK+K69KYVC2K3Pw1iluv6nJXb0sYEB3+KGT1dZuGRf3+mx0qDY5z5YvRiI27wfZhmiRxxRwQXxTHm/1dmm3G96kSRFIrJmAIeWnfEI0ml4akkMGvNfa0oi0TSHj7bh/f6irxCOiExSgPDugPKxlEQttVfckMpemQN0ZNCYDDXwIj/66jR6MQEQPNCFtCO0DmQNvzYgPgaQTfOCReVEVVR+xKI8PM6aYbDn0GQDZlS9FQmcrHoPw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=3MV+z3ICzcrtFpGpYpqrCYP16c16mQvZ7jZaEyX4nx4=; b=YpwaY8uLPK9rjiNtzwr2JB4tfHvS8HQuGe1ZxoWFoRwjbdg4kYcoa5Zh/iUhWSTaVP33Ezv/Y3jTK24QrROd+dSebFFq3UaL/n3ePvyJT1JtNHFvnBK9MIBnEjEYDX6TX6u/+ZUefnjVKmwsGSEpAJpuGluNtP2m6Fzxfq/6+iISrkB3BacEa/3u7hUBO++RmLLVLaXZUEBm9Ki5MGpXh5Ejj7PtNUHbUA2TK/zC6+EGttV8JA0cDgcYJaSK5kbgFE3ZIX9wPv+JlAoiplNuF6r7t1Lve3m2eg0Yt2wKzuvSgzjRix2JSkL8NbilFvQiPaLyOa7upVtyEaQg+h7+xQ== 
ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=arm.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=3MV+z3ICzcrtFpGpYpqrCYP16c16mQvZ7jZaEyX4nx4=; b=rZlvYpcYPgWPvJZauzNS7zD7L+JBbgflnu4ZvHLkgrw/ELWI6zUVJ8Hcez1/P9Yakhp/zW1nGUOyJcVuX7GjL6a9gZWLQ1CgVK03H/dMzEWvQ3P52fvACt+U0ewLKHX5jWz4DL6jZk7QNIxnHUfv/tOJRZOS5+U7dP3S/Lk3/PQ= Received: from AS4P192CA0023.EURP192.PROD.OUTLOOK.COM (2603:10a6:20b:5e1::9) by AS8PR08MB9694.eurprd08.prod.outlook.com (2603:10a6:20b:616::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7784.29; Fri, 26 Jul 2024 09:20:29 +0000 Received: from AM4PEPF00027A64.eurprd04.prod.outlook.com (2603:10a6:20b:5e1:cafe::61) by AS4P192CA0023.outlook.office365.com (2603:10a6:20b:5e1::9) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7784.29 via Frontend Transport; Fri, 26 Jul 2024 09:20:29 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=arm.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by AM4PEPF00027A64.mail.protection.outlook.com (10.167.16.75) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 
15.20.7784.11 via Frontend Transport; Fri, 26 Jul 2024 09:20:28 +0000 Received: ("Tessian outbound ab09e808a502:v365"); Fri, 26 Jul 2024 09:20:28 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: ef50a6174c59bb5c X-CR-MTA-TID: 64aa7808 Received: from La0a3fd64d5a3.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 995B0CBB-485C-4350-A8F0-403F3338D255.1; Fri, 26 Jul 2024 09:20:21 +0000 Received: from EUR05-VI1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id La0a3fd64d5a3.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Fri, 26 Jul 2024 09:20:21 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=MOkiHchKEmtJFSeVeyokKIYreOwR71PfX3SNYhOke4xwemLROumFy32CcVsYY3hoKou9iXsh2c7nktkxHKDllrCEftnWx3A1mgwaOFUi4ejcyOlX94MwTjgqvyyeR1j04xpbPBpspv0T6S7nuWpZtk0SLfuE20lfeYtSDIhXLyMx8uTP4i4EsiISfWSsuPi0uDcN1R61ZkilfXzxNluuetV7cUK5YCUg+sjbeLEmCp//lj3c9voor6O/i5T5IkwkKd1G6XrcLa5DXzBCeM7i83Fr2JnrQ3JIxNwpxuRTTcFjDuhZG8Z5uRUDnTZTq+xBMd2nH5C5pNytsTez09L1Dg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=3MV+z3ICzcrtFpGpYpqrCYP16c16mQvZ7jZaEyX4nx4=; b=r2yuzukNSJkTWsz5rz2XtBxlrxIL7LNXyxtE5CIP0DFTS6IWzCbiNQH2ez5zFi2mVwHLptc6cKPpiHVzkNPc1xtci2GrtEfUG6Wy2uL+bOQXWl6InCznMv8dJb9Fd7z6e25Ggz17dOrf8Hozs7iWo/e4TscVUx6aZwuErIeyUBS7Ao9cXt8AiYwTQ5RXxBhxeBWYAIlmfYfTW8pvBp2O5Scr2oEQumLryIugzzTGAFgCNTPbE5BW0BUoAL46pg7zkstRZg0TzbptaO0OlzoinAbUn2T5fku2GBasLoE2R9OHugROfhGZXY5udTvNpdMC3oWgDtSNILTpwHr0Pbm35A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=3MV+z3ICzcrtFpGpYpqrCYP16c16mQvZ7jZaEyX4nx4=; b=rZlvYpcYPgWPvJZauzNS7zD7L+JBbgflnu4ZvHLkgrw/ELWI6zUVJ8Hcez1/P9Yakhp/zW1nGUOyJcVuX7GjL6a9gZWLQ1CgVK03H/dMzEWvQ3P52fvACt+U0ewLKHX5jWz4DL6jZk7QNIxnHUfv/tOJRZOS5+U7dP3S/Lk3/PQ= Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by DB9PR08MB6652.eurprd08.prod.outlook.com (2603:10a6:10:2ab::18) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7784.29; Fri, 26 Jul 2024 09:20:19 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::89dc:c731:362b:7c69]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::89dc:c731:362b:7c69%6]) with mapi id 15.20.7784.020; Fri, 26 Jul 2024 09:20:19 +0000 Date: Fri, 26 Jul 2024 10:20:16 +0100 From: Tamar Christina To: gcc-patches@gcc.gnu.org Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, ktkachov@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH 2/8]AArch64: Add Neoverse V3 core definition and cost model Message-ID: Content-Disposition: inline In-Reply-To: X-ClientProxiedBy: LO4P265CA0277.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:37a::8) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: VI1PR08MB5325:EE_|DB9PR08MB6652:EE_|AM4PEPF00027A64:EE_|AS8PR08MB9694:EE_ X-MS-Office365-Filtering-Correlation-Id: 2d5b1452-e199-466d-ef1c-08dcad5430f0 x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0;ARA:13230040|366016|1800799024|376014; X-Microsoft-Antispam-Message-Info-Original: 
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list
Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

Hi All,

This adds a cost model and core definition for Neoverse V3.

It also makes Cortex-X4 use the Neoverse V3 cost model.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-cores.def (cortex-x4): Update.
	(neoverse-v3): New.
	* config/aarch64/aarch64-tune.md: Regenerate.
	* config/aarch64/tuning_models/neoversev3.h: New file.
	* config/aarch64/aarch64.cc: Use it.
	* doc/invoke.texi: Document it.

---

--
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 34307fe0c1721dda67adab768dd22a5649687f6e..96c74657a1991acfe86d7c61af4ccce7415fabca 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -188,13 +188,14 @@ AARCH64_CORE("cortex-x2", cortexx2, cortexa57, V9A, (SVE2_BITPERM, MEMTAG, I8M
 AARCH64_CORE("cortex-x3", cortexx3, cortexa57, V9A, (SVE2_BITPERM, MEMTAG, I8MM, BF16), neoversev2, 0x41, 0xd4e, -1)
-AARCH64_CORE("cortex-x4", cortexx4, cortexa57, V9_2A, (SVE2_BITPERM, MEMTAG, PROFILE), neoversen2, 0x41, 0xd81, -1)
+AARCH64_CORE("cortex-x4", cortexx4, cortexa57, V9_2A, (SVE2_BITPERM, MEMTAG, PROFILE), neoversev3, 0x41, 0xd81, -1)
 AARCH64_CORE("neoverse-n2", neoversen2, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversen2, 0x41, 0xd49, -1)
 AARCH64_CORE("cobalt-100", cobalt100, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversen2, 0x6d, 0xd49, -1)
 AARCH64_CORE("neoverse-v2", neoversev2, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversev2, 0x41, 0xd4f, -1)
 AARCH64_CORE("grace", grace, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, SVE2_AES, SVE2_SHA3, SVE2_SM4, PROFILE), neoversev2, 0x41, 0xd4f, -1)
+AARCH64_CORE("neoverse-v3", neoversev3, cortexa57, V9_2A, (SVE2_BITPERM, RNG, LS64, MEMTAG, PROFILE), neoversev3, 0x41, 0xd84, -1)
 AARCH64_CORE("demeter", demeter, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversev2, 0x41, 0xd4f, -1)
diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
index 719fd3dc62a5860aad3aa92785413892e46f8816..0c3339b53e425ac36387eb63a0005a25c0c064e7 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-	"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88,thunderxt88p1,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,ampere1a,ampere1b,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,cortexx1c,neoversen1,ares,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,tsv110,thunderx3t110,neoversev1,zeus,neoverse512tvb,saphira,oryon1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexa510,cortexa520,cortexa710,cortexa715,cortexa720,cortexx2,cortexx3,cortexx4,neoversen2,cobalt100,neoversev2,grace,demeter,generic,generic_armv8_a,generic_armv9_a"
+	"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88,thunderxt88p1,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,ampere1a,ampere1b,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,cortexx1c,neoversen1,ares,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,tsv110,thunderx3t110,neoversev1,zeus,neoverse512tvb,saphira,oryon1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexa510,cortexa520,cortexa710,cortexa715,cortexa720,cortexx2,cortexx3,cortexx4,neoversen2,cobalt100,neoversev2,grace,neoversev3,demeter,generic,generic_armv8_a,generic_armv9_a"
 	(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index
89eb66348f772a7e94f1acde29cd4badfd51fa3d..569d4a3d16fb9846b89ebbc895cb169a6007a24a 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -413,6 +413,7 @@ static const struct aarch64_flag_desc aarch64_tuning_flags[] = #include "tuning_models/neoverse512tvb.h" #include "tuning_models/neoversen2.h" #include "tuning_models/neoversev2.h" +#include "tuning_models/neoversev3.h" #include "tuning_models/a64fx.h" /* Support for fine-grained override of the tuning structures. */ diff --git a/gcc/config/aarch64/tuning_models/neoversev3.h b/gcc/config/aarch64/tuning_models/neoversev3.h new file mode 100644 index 0000000000000000000000000000000000000000..3daa3d2365c817d03c6c0d5e66fe832620d8fb2c --- /dev/null +++ b/gcc/config/aarch64/tuning_models/neoversev3.h @@ -0,0 +1,246 @@ +/* Tuning model description for AArch64 architecture. + Copyright (C) 2009-2024 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . 
*/ + +#ifndef GCC_AARCH64_H_NEOVERSEV3 +#define GCC_AARCH64_H_NEOVERSEV3 + +#include "generic.h" + +static const struct cpu_addrcost_table neoversev3_addrcost_table = +{ + { + 1, /* hi */ + 0, /* si */ + 0, /* di */ + 1, /* ti */ + }, + 0, /* pre_modify */ + 0, /* post_modify */ + 2, /* post_modify_ld3_st3 */ + 2, /* post_modify_ld4_st4 */ + 0, /* register_offset */ + 0, /* register_sextend */ + 0, /* register_zextend */ + 0 /* imm_offset */ +}; + +static const struct cpu_regmove_cost neoversev3_regmove_cost = +{ + 3, /* GP2GP */ + /* Spilling to int<->fp instead of memory is recommended so set + realistic costs compared to memmov_cost. */ + 5, /* GP2FP */ + 4, /* FP2GP */ + 4 /* FP2FP */ +}; + +static const advsimd_vec_cost neoversev3_advsimd_vector_cost = +{ + 2, /* int_stmt_cost */ + 2, /* fp_stmt_cost */ + 2, /* ld2_st2_permute_cost */ + 2, /* ld3_st3_permute_cost */ + 3, /* ld4_st4_permute_cost */ + 2, /* permute_cost */ + 4, /* reduc_i8_cost */ + 4, /* reduc_i16_cost */ + 2, /* reduc_i32_cost */ + 2, /* reduc_i64_cost */ + 6, /* reduc_f16_cost */ + 4, /* reduc_f32_cost */ + 2, /* reduc_f64_cost */ + 2, /* store_elt_extra_cost */ + /* This value is just inherited from the Cortex-A57 table. */ + 8, /* vec_to_scalar_cost */ + /* This depends very much on what the scalar value is and + where it comes from. E.g. some constants take two dependent + instructions or a load, while others might be moved from a GPR. + 4 seems to be a reasonable compromise in practice. */ + 4, /* scalar_to_vec_cost */ + 4, /* align_load_cost */ + 4, /* unalign_load_cost */ + /* Although stores have a latency of 2 and compete for the + vector pipes, in practice it's better not to model that. 
*/ + 1, /* unalign_store_cost */ + 1 /* store_cost */ +}; + +static const sve_vec_cost neoversev3_sve_vector_cost = +{ + { + 2, /* int_stmt_cost */ + 2, /* fp_stmt_cost */ + 2, /* ld2_st2_permute_cost */ + 3, /* ld3_st3_permute_cost */ + 3, /* ld4_st4_permute_cost */ + 2, /* permute_cost */ + /* Theoretically, a reduction involving 15 scalar ADDs could + complete in ~4 cycles and would have a cost of 15. [SU]ADDV + completes in 9 cycles, so give it a cost of 15 + 5. */ + 20, /* reduc_i8_cost */ + /* Likewise for 7 scalar ADDs (~3 cycles) vs. 8: 7 + 5. */ + 12, /* reduc_i16_cost */ + /* Likewise for 3 scalar ADDs (~2 cycles) vs. 6: 3 + 4. */ + 7, /* reduc_i32_cost */ + /* Likewise for 1 scalar ADDs (~1 cycles) vs. 2: 1 + 1. */ + 2, /* reduc_i64_cost */ + /* Theoretically, a reduction involving 7 scalar FADDs could + complete in ~6 cycles and would have a cost of 7. FADDV + completes in 8 cycles, so give it a cost of 7 + 2. */ + 9, /* reduc_f16_cost */ + /* Likewise for 3 scalar FADDs (~4 cycles) vs. 6: 3 + 2. */ + 5, /* reduc_f32_cost */ + /* Likewise for 1 scalar FADD (~2 cycles) vs. 4: 1 + 2. */ + 3, /* reduc_f64_cost */ + 2, /* store_elt_extra_cost */ + /* This value is just inherited from the Cortex-A57 table. */ + 8, /* vec_to_scalar_cost */ + /* See the comment above the Advanced SIMD versions. */ + 4, /* scalar_to_vec_cost */ + 4, /* align_load_cost */ + 4, /* unalign_load_cost */ + /* Although stores have a latency of 2 and compete for the + vector pipes, in practice it's better not to model that. */ + 1, /* unalign_store_cost */ + 1 /* store_cost */ + }, + 3, /* clast_cost */ + 10, /* fadda_f16_cost */ + 6, /* fadda_f32_cost */ + 4, /* fadda_f64_cost */ + /* A strided Advanced SIMD x64 load would take two parallel FP loads + (8 cycles) plus an insertion (2 cycles). Assume a 64-bit SVE gather + is 1 cycle more. The Advanced SIMD version is costed as 2 scalar loads + (cost 8) and a vec_construct (cost 4). 
Add a full vector operation + (cost 2) to that, to avoid the difference being lost in rounding. + + There is no easy comparison between a strided Advanced SIMD x32 load + and an SVE 32-bit gather, but cost an SVE 32-bit gather as 1 vector + operation more than a 64-bit gather. */ + 14, /* gather_load_x32_cost */ + 12, /* gather_load_x64_cost */ + 1 /* scatter_store_elt_cost */ +}; + +static const aarch64_scalar_vec_issue_info neoversev3_scalar_issue_info = +{ + 3, /* loads_stores_per_cycle */ + 2, /* stores_per_cycle */ + 8, /* general_ops_per_cycle */ + 0, /* fp_simd_load_general_ops */ + 1 /* fp_simd_store_general_ops */ +}; + +static const aarch64_advsimd_vec_issue_info neoversev3_advsimd_issue_info = +{ + { + 0, /* loads_stores_per_cycle */ + 1, /* stores_per_cycle */ + 4, /* general_ops_per_cycle */ + 0, /* fp_simd_load_general_ops */ + 1 /* fp_simd_store_general_ops */ + }, + 2, /* ld2_st2_general_ops */ + 2, /* ld3_st3_general_ops */ + 3 /* ld4_st4_general_ops */ +}; + +static const aarch64_sve_vec_issue_info neoversev3_sve_issue_info = +{ + { + { + 0, /* loads_stores_per_cycle */ + 1, /* stores_per_cycle */ + 4, /* general_ops_per_cycle */ + 0, /* fp_simd_load_general_ops */ + 1 /* fp_simd_store_general_ops */ + }, + 2, /* ld2_st2_general_ops */ + 2, /* ld3_st3_general_ops */ + 3 /* ld4_st4_general_ops */ + }, + 2, /* pred_ops_per_cycle */ + 1, /* while_pred_ops */ + 0, /* int_cmp_pred_ops */ + 0, /* fp_cmp_pred_ops */ + 1, /* gather_scatter_pair_general_ops */ + 1 /* gather_scatter_pair_pred_ops */ +}; + +static const aarch64_vec_issue_info neoversev3_vec_issue_info = +{ + &neoversev3_scalar_issue_info, + &neoversev3_advsimd_issue_info, + &neoversev3_sve_issue_info +}; + +/* Neoversev3 costs for vector insn classes. 
*/ +static const struct cpu_vector_cost neoversev3_vector_cost = +{ + 1, /* scalar_int_stmt_cost */ + 2, /* scalar_fp_stmt_cost */ + 4, /* scalar_load_cost */ + 1, /* scalar_store_cost */ + 1, /* cond_taken_branch_cost */ + 1, /* cond_not_taken_branch_cost */ + &neoversev3_advsimd_vector_cost, /* advsimd */ + &neoversev3_sve_vector_cost, /* sve */ + &neoversev3_vec_issue_info /* issue_info */ +}; + +static const struct tune_params neoversev3_tunings = +{ + &cortexa76_extra_costs, + &neoversev3_addrcost_table, + &neoversev3_regmove_cost, + &neoversev3_vector_cost, + &generic_branch_cost, + &generic_approx_modes, + SVE_128, /* sve_width */ + { 4, /* load_int. */ + 2, /* store_int. */ + 6, /* load_fp. */ + 1, /* store_fp. */ + 6, /* load_pred. */ + 2 /* store_pred. */ + }, /* memmov_cost. */ + 10, /* issue_rate */ + (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops */ + "32:16", /* function_align. */ + "4", /* jump_align. */ + "32:16", /* loop_align. */ + 4, /* int_reassoc_width. */ + 6, /* fp_reassoc_width. */ + 4, /* fma_reassoc_width. */ + 3, /* vec_reassoc_width. */ + 2, /* min_div_recip_mul_sf. */ + 2, /* min_div_recip_mul_df. */ + 0, /* max_case_values. */ + tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */ + (AARCH64_EXTRA_TUNE_CHEAP_SHIFT_EXTEND + | AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS + | AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS + | AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT + | AARCH64_EXTRA_TUNE_AVOID_PRED_RMW), /* tune_flags. */ + &generic_prefetch_tune, + AARCH64_LDP_STP_POLICY_ALWAYS, /* ldp_policy_model. */ + AARCH64_LDP_STP_POLICY_ALWAYS /* stp_policy_model. */ +}; + +#endif /* GCC_AARCH64_H_NEOVERSEV3. */ \ No newline at end of file diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 403ea9da1abd5a012d0b18849852604b10689682..ffcf4f146d92d410c6b515b3b80f07bdec1d2b55 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -21524,6 +21524,7 @@ performance of the code. 
Permissible values for this option are: @samp{oryon-1}, @samp{neoverse-512tvb}, @samp{neoverse-e1}, @samp{neoverse-n1}, @samp{neoverse-n2}, @samp{neoverse-v1}, @samp{neoverse-v2}, @samp{grace}, +@samp{neoverse-v3}, @samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan}, @samp{octeontx}, @samp{octeontx81}, @samp{octeontx83}, @samp{octeontx2}, @samp{octeontx2t98}, @samp{octeontx2t96} From patchwork Fri Jul 26 09:20:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 1965230 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=oCgjcgSx; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=oCgjcgSx; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WVj3p44P1z1yY5 for ; Fri, 26 Jul 2024 19:23:34 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C8CD0384AB70 for ; Fri, 26 Jul 2024 09:23:32 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR03-AM7-obe.outbound.protection.outlook.com (mail-am7eur03on20601.outbound.protection.outlook.com 
[IPv6:2a01:111:f403:260e::601]) by sourceware.org (Postfix) with ESMTPS id A9C62384403E; Fri, 26 Jul 2024 09:20:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A9C62384403E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org A9C62384403E Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=2a01:111:f403:260e::601 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1721985664; cv=pass; b=juVvUkAvy0ecodjXa+sBMBM85ZbG17uCf7O7LbVbSrXVxgYVZQlH7IpFrehIIioJvgxnENRL89Qeq29r6y0JxvE+r2+mL9XZXCAq74tY9DydnMaUYZiWj+G1v7yMH4iFDBnmIkLhKf9vGPxynTiCjvJBtZnK6JvDCrbDwfcq73Q= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1721985664; c=relaxed/simple; bh=oj6rS91pG7hg98hTLFMrFZjhjiJNKtEeRfBpHUxYZkg=; h=DKIM-Signature:DKIM-Signature:Date:From:To:Subject:Message-ID: MIME-Version; b=PuIWXjg+qKe5DiLO6Uff6L0MFAeyFDGidKAQc2mfOhRgZZmJjq60IsUsRUG8JGdDyrOrOd++tGYM+P1A5Fz3nqO84SdDICoeXGo+JRI0EzNV/dOGj65nae1mL7oCB0wDyB4A1HgTuCEfiIwnyAx2L9Cf0RsvaeNYupEFZNb1ZOE= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=pass; b=iR/1s0iPrxPtvNacFCjAqpVORzHx8wJ5ZCiknOFqT1c97SQ33ZAlRZP7skiilPuYIFoiDOROyJFpPOUG1TP4RKPqk/4OGz1bkBzUi/7DgapCGH5VnsICKCrGtjemx4k5963GHS6b/7pTkGIIR3rzuJCmiFWKYU3t9RNZJxXAqYHPGMptcsq2j5GgTWtDmxjLWc0THPIQWX47QuGDhQLxfEuKsLk27Ge5buzXpUEb+uSlAKDpDXrQai5KRuR2Tl4sM9RCV6jCqAmcTIepeWAZUwwZskgDMvsNifaM0nHmDX/4vxdNXHV+UA9QNLFFMWVhgnINaGuSY9wKZXx89pV+3A== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=QPn6wVkYWdoJd+K243an6MhVRjzaCwzO1u1fyqb4jQo=; 
b=TvqTd2q+/Xu8+qElhZRNzFwx+w+czYsXq2r5fUaE1YnOl3+7Ng4pcQV+co/w/QznmHuZNlB4QIXhnLT/HMA2FkrXUMTC8y9zjAvgi1l6+zcrygwvMKg7Ls8YyLmN85g2xy+CDrdPR5oKXZJCdvH+oEq1PJIFZY4QjYtGcnMct3cc+NRrUX2SV3HwxM33lgrLuz5+8HlGX8f9IHonKw2sKYe8Zjs4ClaHXn7XmlTthmQ1zQMzl+Q/2MQudXZzSWswkDgl1bpFs3YXe7qeyiFMif4wzXZIPegwxvmzoJly7XpfalgoFWWachNneAQ3KYEYuB0NyG0UnyYQyjt0rtuFoA== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=arm.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=QPn6wVkYWdoJd+K243an6MhVRjzaCwzO1u1fyqb4jQo=; b=oCgjcgSx7FWxsqfDi/i3nM9SEqZGOnytUYk94QM5XMGqYA0RLRm+QqpqHEuso7TfLNIyMWKJtjAFyv4Y3bXm0cQWRZDb2O3PXtUm11yPB8QTR7+QxRkr9Km7O0mIHUsiHpffSdZMOF4kLI3OOOX9eRNpTtjEt0Lah8U6aBsEWds= Received: from DU2PR04CA0026.eurprd04.prod.outlook.com (2603:10a6:10:3b::31) by DB9PR08MB6700.eurprd08.prod.outlook.com (2603:10a6:10:2a3::9) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7784.29; Fri, 26 Jul 2024 09:20:54 +0000 Received: from DB3PEPF0000885F.eurprd02.prod.outlook.com (2603:10a6:10:3b:cafe::43) by DU2PR04CA0026.outlook.office365.com (2603:10a6:10:3b::31) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7784.29 via Frontend Transport; Fri, 26 Jul 2024 09:20:54 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=arm.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as 
permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DB3PEPF0000885F.mail.protection.outlook.com (10.167.242.10) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.7784.11 via Frontend Transport; Fri, 26 Jul 2024 09:20:54 +0000 Received: ("Tessian outbound 93748f77c01b:v365"); Fri, 26 Jul 2024 09:20:54 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 23e3963fec902961 X-CR-MTA-TID: 64aa7808 Received: from Lb1dc94daf77d.2 by 64aa7808-outbound-1.mta.getcheckrecipient.com id E50214C5-20F2-4AF9-886B-F353AA337E8C.1; Fri, 26 Jul 2024 09:20:47 +0000 Received: from EUR05-VI1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id Lb1dc94daf77d.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Fri, 26 Jul 2024 09:20:47 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=KnFNPF/8Oggx46OinQDHFWdx/zWDXXuVadydpjo/HaXlPqP7GDCCOFMYnuQaBXavLVcwLCB717E/RZrSKJMDl7ItHE+eKzZb97AgZ1ctvCZ14vf5jYg6Q6znir+ejkvQNlTw9G759TJp5FaNLl6pkgOz+NAIc1PhMvcaC0a0G8I1+ZPcQ+6i+kLTb4BR2eid9BgYTAiAom9bEKqfxPI4M+Sh8vvwH9RxfodCmk4h6LfmE6vpzsXfT7LaLEgb/mmpmGlvfOdv83gIx5g6mK3f52L3fsAz7MqJHsoeUv+T00hxLnCNry8kBpPua2lnHTGnUXBFiN4saeEKdFu3ruCa9w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=QPn6wVkYWdoJd+K243an6MhVRjzaCwzO1u1fyqb4jQo=; 
b=a+qltu68Baa3y6iGdT+gNAvysC6/dfuD7cWP13jAhQfUsN/B+VmbDu2fatnb5d6RPxCCpM+gl10cz3EQWfI4gFxY/26woGMij4yT2ENB9s5oCqrb5EWj+FJHp31pkL+uwC/kRxRciODY6yvfocCtQ3s4vN5oN6Q8N9SmtdcLYekE8AxwWlS+yVnZ+4FNH4KuneS8eCuvIx83Rb6vVZeiRSe5DnB4VEkww4ZooDmcLChmMll8kaUCM6au4KT8KNQ1YvhPbrQGb1/w3J1DCijrf6x2knW76MzP/q2/bo8Rq4UIrxouANYJr71FbGb6HeWveFI0rf65lTRnWXoMU/1PHg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=QPn6wVkYWdoJd+K243an6MhVRjzaCwzO1u1fyqb4jQo=; b=oCgjcgSx7FWxsqfDi/i3nM9SEqZGOnytUYk94QM5XMGqYA0RLRm+QqpqHEuso7TfLNIyMWKJtjAFyv4Y3bXm0cQWRZDb2O3PXtUm11yPB8QTR7+QxRkr9Km7O0mIHUsiHpffSdZMOF4kLI3OOOX9eRNpTtjEt0Lah8U6aBsEWds= Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by GV2PR08MB8098.eurprd08.prod.outlook.com (2603:10a6:150:76::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7784.16; Fri, 26 Jul 2024 09:20:43 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::89dc:c731:362b:7c69]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::89dc:c731:362b:7c69%6]) with mapi id 15.20.7784.020; Fri, 26 Jul 2024 09:20:43 +0000 Date: Fri, 26 Jul 2024 10:20:39 +0100 From: Tamar Christina To: gcc-patches@gcc.gnu.org Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, ktkachov@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH 3/8]AArch64: Add Neoverse V3AE core definition and cost model Message-ID: Content-Disposition: inline In-Reply-To: X-ClientProxiedBy: LO4P123CA0163.GBRP123.PROD.OUTLOOK.COM (2603:10a6:600:18a::6) To VI1PR08MB5325.eurprd08.prod.outlook.com 
(2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: VI1PR08MB5325:EE_|GV2PR08MB8098:EE_|DB3PEPF0000885F:EE_|DB9PR08MB6700:EE_ X-MS-Office365-Filtering-Correlation-Id: 01c96921-5ec4-48af-13c5-08dcad544009 x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0;ARA:13230040|376014|1800799024|366016; X-Microsoft-Antispam-Message-Info-Original: wluCC1z/gNz4xBSpZK4zvqix39rMSYZ+fkwU70lVDuXqirmsICp3aPmHcp2D14LBc5oagVkAoH9HFod2H8Na5+benqeR7CmBRXKsPJl18iw/o3tjPBMfKIe+arFTQUMe3vCo2KQFOeMePXa0tHdcgMI3Q7z3R0n8UIK4f5yewvZv+pU+vf/NxQILosFKQdPD3jMvE3iNiHTJ6pC6deVEESuUMCc6YeW+gEa+Idu+0RC61xzcVchIFbDtuJdydZ0zEfmeCXw/BW/Rbnqes0IQMpag3/dqre9JzcufKvXZwkzkt1sRrqi9q2fxFZVd0ifUDewHs4trvSITSipvPcPY3EHWs5TcAnfXDbReVyuDL8ByZLu4153v8RpdUzkehevMGUsGobQUQa+5Kx4z2F6eHKUVCK9cbbxgNOpv2y63KbOEnvm5K1drBEGa3AFsotCykLIgtKUD10cj+CYKfxHsu3YSgCT6xfrX99ju0EwKtrp90mt/S8s0GK2PwoBOK6nvDYTkq1QDv1NsPvFjC/12H7Iktn7PXincSg86GCZhv2/8Y+hkgFOPspMdPr7M/mC9mRuZIjiXC2uqLAqR2/7eU0Q0zC12rcLBydF2PnMGsK8wBbAxmvEC54jAjwVbscy0mDpKrhzbX7D9yeTEJAmFCh+N1VoNlDXj/eEoqIbpZlltAqxmWrZMFrD+/+mpHxw7Ws0zKUiDIdd8XI/DHrazlFBu6vkPnV7P5ZcE/NJgun328j3QBo/+Fr5pgUZUzD7aFuBWZz0/pS9FmJnWObWeDrB22h00KSk6Ydy909iJ5MKR+FCHpfHsj9DRPiRxe6Ornn34fHKGwJnQcG14RZTAPZsXj6mgd/UjTUnCuJbc0Y8wjioSpTmTqkKEc6L6dhRhzKmcO3tZYlHeD8Qz38UljGBBjr96QMitbxlp9SxJvGgFqikKOY26RWoMkwfv0uxiUFyKlTr6mPKzN2IEoXxHcG0CBy+uT3Vt63hvMhR3grPSK9t8/w+G297IminshrKX/ntrWJuB0cUrmyK0AkGdrsTQrTsN274eqnA00sD5Q/7hgoYHX2GVGQBViet0smmA2jD8HF/nwEcZRgex2/hr4qI3RLgdaUTPNKJU3RRAZSNwa375Ik9trvterJsODqtHqyUJ56P3k9G5SUio8TCUhs7vjVqYu87Nfebzkp6xT6bwDST8KyzScyeslk2kObVQBpweO9RJvohND1OtpopIrmGB063JS6g8P9UbThjSR4vedlWdUzEpq+IrasYXMre3pRk1JW+V4tW9lan05lwh2s/wPWFl1MjQuoF5nekCeUI= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(1800799024)(366016); DIR:OUT; 
SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: GV2PR08MB8098 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-SkipListedInternetSender: ip=[2603:10a6:803:13e::17]; domain=VI1PR08MB5325.eurprd08.prod.outlook.com X-MS-Exchange-Transport-CrossTenantHeadersStripped: DB3PEPF0000885F.eurprd02.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: bf2df3da-f942-4bd1-37cc-08dcad54392b X-Microsoft-Antispam: BCL:0; ARA:13230040|1800799024|35042699022|36860700013|34020700016|82310400026|376014; X-Microsoft-Antispam-Message-Info: =?utf-8?q?Ckz1ln1zPazJApBHOHU1B2Ww4v0AH5d?= =?utf-8?q?MCtWLCZy5oowu2MOkjK2FTXpkQC7iGxMhmp6mvwdnm1TrNUwM2vNNP+XtClV2iVmF?= =?utf-8?q?J5C3NsDiEttJPMCXLtvvkgiwP7qt16CL36Y7TyIKpn0AcVIK+LQJhVRMfGALKdAVJ?= =?utf-8?q?KpQF0IxRBoiG5bjSfOnYHQdtT197r3yqqw4jZ0LhdPwMW7N0BKFfOOpsm3r3ClW0K?= =?utf-8?q?MracvYsB2SKGUOgkVmIYwKS9h+MxDF/MNzFpP+xOkisqGwYEYTH7c9WVdddzHtEce?= =?utf-8?q?apNuQXWTixm4NAHQQtoyDq/23US0MgTbgro3lNAqp6/IcUfh1lBCI6/iDsUG6w4a0?= =?utf-8?q?TE3SETrGRic4LhgRZMKkmBTTFFklss5bC4+eEci0+e1mJij6MlQ96yIlB8xScyGiJ?= =?utf-8?q?cKe9EUdZBFySNL1OR8Mm808a9pJ1MBvqpbpJvIgeAhXgn5FNL/17h2vkSgL3ViOSi?= =?utf-8?q?L/VaQqGiGOif74H5E1B+IHWo9lWUb6Tb+0PxFpvZCjs3JrbD/ttbZO6RBsnCr9EN1?= =?utf-8?q?v3s99MH2kVIi9JbNTSB3b0qh9KVyN1AbnJ8ibyG89/ETBrXASfKJkcD/zJQC54xOC?= =?utf-8?q?JB/MlsCZkJBHqp5nHSE62KKM3XSIA0/Sf1vUo4o6Ku0cx+WK18BBb3+NH3fqqqBmp?= =?utf-8?q?eLSgmSFa+txoxXGcRHjgG73mTFysDiCj/I09ccbmZNYii5FWVh04wzQ4rCDBES8R4?= =?utf-8?q?3wX7hbgNPgBrAfbww3NXBwHYuxOgr+TROM/cdLqI7v7XQGbcWR152+6u0D3htpZzR?= =?utf-8?q?a7+5rgkXqDKPAmuHnBEgnQFJeXUKLADzaand04FNQLYcMb3xvS9WcWd+HVcrvYdCe?= =?utf-8?q?ounP8VXr0Gyu8DZticx4Wv+zaR3InuLbyS1FX+MmHoXnWOek+HD1F8jx6ybQbiJqV?= =?utf-8?q?WYyU2OMeyoc3qDDK+aRW+k21AAyT3FvKAUSlLUO/cev9RrQZE0ZMbXC7HXw0dzari?= =?utf-8?q?jvqsqhxcX6aHI125MWlKToJIGsyqtMpG4AIdbqTIyCOzs5JuRqf338G+dGKL1KXlQ?= 
=?utf-8?q?mkA6AlpfMag0qE6w4E3teRQ7tYgpub2QcCcclzf789jjNEq5fvtQuEZIcfJ4k+mrW?= =?utf-8?q?UfP0IDZbxYyUbxA/2czKWEsOO8vGTMh6ImBbZlwfE+FOM1FOG2ixBCRLOBcqlOXC/?= =?utf-8?q?XFMArk5d6er2doeVE301z9ces8I0oNFvB3GncoCaHQdQtcesRGoRESVz/h6ZjO3sn?= =?utf-8?q?ZWg7LBxbvbkQVKJqtPcWoQ4XoIIrf1BhF1aWuPfeIje4vwwjd41QTxZbmlAnETv/G?= =?utf-8?q?/vfDjA66+3bNA6XZKfLqpqS6pajOHSqIMf0Xuysi5wTw4i2n3eFyOi1NqgozhkZjp?= =?utf-8?q?lsFtB+F7tqy0NYIZpT2a1XCFnzGDmAfXnQ=3D=3D?= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230040)(1800799024)(35042699022)(36860700013)(34020700016)(82310400026)(376014); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 26 Jul 2024 09:20:54.2806 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 01c96921-5ec4-48af-13c5-08dcad544009 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DB3PEPF0000885F.eurprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB9PR08MB6700 X-Spam-Status: No, score=-12.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FORGED_SPF_HELO, GIT_PATCH_0, KAM_LOTSOFHASH, KAM_SHORT, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

Hi All,

This adds a cost model and core definition for Neoverse V3AE.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-cores.def (neoverse-v3ae): New.
	* config/aarch64/aarch64-tune.md: Regenerate.
	* config/aarch64/tuning_models/neoversev3ae.h: New file.
	* config/aarch64/aarch64.cc: Use it.
	* doc/invoke.texi: Document it.

---

--
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 96c74657a1991acfe86d7c61af4ccce7415fabca..72c20b8bc22ba8db4a1e6fdb1ff623020539042b 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -196,6 +196,7 @@ AARCH64_CORE("cobalt-100", cobalt100, cortexa57, V9A, (I8MM, BF16, SVE2_BITPER
 AARCH64_CORE("neoverse-v2", neoversev2, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversev2, 0x41, 0xd4f, -1)
 AARCH64_CORE("grace", grace, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, SVE2_AES, SVE2_SHA3, SVE2_SM4, PROFILE), neoversev2, 0x41, 0xd4f, -1)
 AARCH64_CORE("neoverse-v3", neoversev3, cortexa57, V9_2A, (SVE2_BITPERM, RNG, LS64, MEMTAG, PROFILE), neoversev3, 0x41, 0xd84, -1)
+AARCH64_CORE("neoverse-v3ae", neoversev3ae, cortexa57, V9_2A, (SVE2_BITPERM, RNG, LS64, MEMTAG, PROFILE), neoversev3ae, 0x0, 0x0, -1)
 AARCH64_CORE("demeter", demeter, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversev2, 0x41, 0xd4f, -1)
diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
index 0c3339b53e425ac36387eb63a0005a25c0c064e7..b02e891086ccc60aa5eac3b28206b240ef7937e8 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-	"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88,thunderxt88p1,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,ampere1a,ampere1b,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,cortexx1c,neoversen1,ares,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,tsv110,thunderx3t110,neoversev1,zeus,neoverse512tvb,saphira,oryon1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexa510,cortexa520,cortexa710,cortexa715,cortexa720,cortexx2,cortexx3,cortexx4,neoversen2,cobalt100,neoversev2,grace,neoversev3,demeter,generic,generic_armv8_a,generic_armv9_a"
+	"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88,thunderxt88p1,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,ampere1a,ampere1b,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,cortexx1c,neoversen1,ares,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,tsv110,thunderx3t110,neoversev1,zeus,neoverse512tvb,saphira,oryon1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexa510,cortexa520,cortexa710,cortexa715,cortexa720,cortexx2,cortexx3,cortexx4,neoversen2,cobalt100,neoversev2,grace,neoversev3,neoversev3ae,demeter,generic,generic_armv8_a,generic_armv9_a"
 	(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 569d4a3d16fb9846b89ebbc895cb169a6007a24a..3b7bbf0399405b1ccda71e563de846a2161ac580 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -414,6 +414,7 @@ static const struct aarch64_flag_desc aarch64_tuning_flags[] =
 #include "tuning_models/neoversen2.h"
 #include "tuning_models/neoversev2.h"
 #include "tuning_models/neoversev3.h"
+#include "tuning_models/neoversev3ae.h"
 #include "tuning_models/a64fx.h"
 /* Support for fine-grained override of the tuning structures. */
diff --git a/gcc/config/aarch64/tuning_models/neoversev3ae.h b/gcc/config/aarch64/tuning_models/neoversev3ae.h
new file mode 100644
index 0000000000000000000000000000000000000000..29c6f22e941b26ee333c87b9fac22aea86625e97
--- /dev/null
+++ b/gcc/config/aarch64/tuning_models/neoversev3ae.h
@@ -0,0 +1,246 @@
+/* Tuning model description for AArch64 architecture.
+   Copyright (C) 2009-2024 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3. If not see
+   . */
+
+#ifndef GCC_AARCH64_H_NEOVERSEV3AE
+#define GCC_AARCH64_H_NEOVERSEV3AE
+
+#include "generic.h"
+
+static const struct cpu_addrcost_table neoversev3ae_addrcost_table =
+{
+    {
+      1, /* hi */
+      0, /* si */
+      0, /* di */
+      1, /* ti */
+    },
+  0, /* pre_modify */
+  0, /* post_modify */
+  2, /* post_modify_ld3_st3 */
+  2, /* post_modify_ld4_st4 */
+  0, /* register_offset */
+  0, /* register_sextend */
+  0, /* register_zextend */
+  0 /* imm_offset */
+};
+
+static const struct cpu_regmove_cost neoversev3ae_regmove_cost =
+{
+  3, /* GP2GP */
+  /* Spilling to int<->fp instead of memory is recommended so set
+     realistic costs compared to memmov_cost. */
+  5, /* GP2FP */
+  4, /* FP2GP */
+  4 /* FP2FP */
+};
+
+static const advsimd_vec_cost neoversev3ae_advsimd_vector_cost =
+{
+  2, /* int_stmt_cost */
+  2, /* fp_stmt_cost */
+  2, /* ld2_st2_permute_cost */
+  2, /* ld3_st3_permute_cost */
+  3, /* ld4_st4_permute_cost */
+  2, /* permute_cost */
+  4, /* reduc_i8_cost */
+  4, /* reduc_i16_cost */
+  2, /* reduc_i32_cost */
+  2, /* reduc_i64_cost */
+  6, /* reduc_f16_cost */
+  4, /* reduc_f32_cost */
+  2, /* reduc_f64_cost */
+  2, /* store_elt_extra_cost */
+  /* This value is just inherited from the Cortex-A57 table. */
+  8, /* vec_to_scalar_cost */
+  /* This depends very much on what the scalar value is and
+     where it comes from. E.g. some constants take two dependent
+     instructions or a load, while others might be moved from a GPR.
+     4 seems to be a reasonable compromise in practice. */
+  4, /* scalar_to_vec_cost */
+  4, /* align_load_cost */
+  4, /* unalign_load_cost */
+  /* Although stores have a latency of 2 and compete for the
+     vector pipes, in practice it's better not to model that. */
+  1, /* unalign_store_cost */
+  1 /* store_cost */
+};
+
+static const sve_vec_cost neoversev3ae_sve_vector_cost =
+{
+  {
+    2, /* int_stmt_cost */
+    2, /* fp_stmt_cost */
+    2, /* ld2_st2_permute_cost */
+    3, /* ld3_st3_permute_cost */
+    3, /* ld4_st4_permute_cost */
+    2, /* permute_cost */
+    /* Theoretically, a reduction involving 15 scalar ADDs could
+       complete in ~4 cycles and would have a cost of 15. [SU]ADDV
+       completes in 9 cycles, so give it a cost of 15 + 5. */
+    20, /* reduc_i8_cost */
+    /* Likewise for 7 scalar ADDs (~3 cycles) vs. 8: 7 + 5. */
+    12, /* reduc_i16_cost */
+    /* Likewise for 3 scalar ADDs (~2 cycles) vs. 6: 3 + 4. */
+    7, /* reduc_i32_cost */
+    /* Likewise for 1 scalar ADDs (~1 cycles) vs. 4: 1 + 3. */
+    4, /* reduc_i64_cost */
+    /* Theoretically, a reduction involving 7 scalar FADDs could
+       complete in ~8 cycles and would have a cost of 7. FADDV
+       completes in 8 cycles, so give it a cost of 7 + 0. */
+    7, /* reduc_f16_cost */
+    /* Likewise for 3 scalar FADDs (~4 cycles) vs. 6: 3 + 2. */
+    5, /* reduc_f32_cost */
+    /* Likewise for 1 scalar FADD (~2 cycles) vs. 4: 1 + 2. */
+    3, /* reduc_f64_cost */
+    2, /* store_elt_extra_cost */
+    /* This value is just inherited from the Cortex-A57 table. */
+    8, /* vec_to_scalar_cost */
+    /* See the comment above the Advanced SIMD versions. */
+    4, /* scalar_to_vec_cost */
+    4, /* align_load_cost */
+    4, /* unalign_load_cost */
+    /* Although stores have a latency of 2 and compete for the
+       vector pipes, in practice it's better not to model that. */
+    1, /* unalign_store_cost */
+    1 /* store_cost */
+  },
+  3, /* clast_cost */
+  10, /* fadda_f16_cost */
+  6, /* fadda_f32_cost */
+  4, /* fadda_f64_cost */
+  /* A strided Advanced SIMD x64 load would take two parallel FP loads
+     (8 cycles) plus an insertion (2 cycles). Assume a 64-bit SVE gather
+     is 1 cycle more. The Advanced SIMD version is costed as 2 scalar loads
+     (cost 8) and a vec_construct (cost 4). Add a full vector operation
+     (cost 2) to that, to avoid the difference being lost in rounding.
+
+     There is no easy comparison between a strided Advanced SIMD x32 load
+     and an SVE 32-bit gather, but cost an SVE 32-bit gather as 1 vector
+     operation more than a 64-bit gather. */
+  14, /* gather_load_x32_cost */
+  12, /* gather_load_x64_cost */
+  1 /* scatter_store_elt_cost */
+};
+
+static const aarch64_scalar_vec_issue_info neoversev3ae_scalar_issue_info =
+{
+  3, /* loads_stores_per_cycle */
+  2, /* stores_per_cycle */
+  8, /* general_ops_per_cycle */
+  0, /* fp_simd_load_general_ops */
+  1 /* fp_simd_store_general_ops */
+};
+
+static const aarch64_advsimd_vec_issue_info neoversev3ae_advsimd_issue_info =
+{
+  {
+    0, /* loads_stores_per_cycle */
+    1, /* stores_per_cycle */
+    2, /* general_ops_per_cycle */
+    0, /* fp_simd_load_general_ops */
+    1 /* fp_simd_store_general_ops */
+  },
+  2, /* ld2_st2_general_ops */
+  2, /* ld3_st3_general_ops */
+  3 /* ld4_st4_general_ops */
+};
+
+static const aarch64_sve_vec_issue_info neoversev3ae_sve_issue_info =
+{
+  {
+    {
+      0, /* loads_stores_per_cycle */
+      1, /* stores_per_cycle */
+      2, /* general_ops_per_cycle */
+      0, /* fp_simd_load_general_ops */
+      1 /* fp_simd_store_general_ops */
+    },
+    2, /* ld2_st2_general_ops */
+    2, /* ld3_st3_general_ops */
+    3 /* ld4_st4_general_ops */
+  },
+  2, /* pred_ops_per_cycle */
+  1, /* while_pred_ops */
+  0, /* int_cmp_pred_ops */
+  0, /* fp_cmp_pred_ops */
+  1, /* gather_scatter_pair_general_ops */
+  1 /* gather_scatter_pair_pred_ops */
+};
+
+static const aarch64_vec_issue_info neoversev3ae_vec_issue_info =
+{
+  &neoversev3ae_scalar_issue_info,
+  &neoversev3ae_advsimd_issue_info,
+  &neoversev3ae_sve_issue_info
+};
+
+/* Neoversev3ae costs for vector insn classes. */
+static const struct cpu_vector_cost neoversev3ae_vector_cost =
+{
+  1, /* scalar_int_stmt_cost */
+  2, /* scalar_fp_stmt_cost */
+  4, /* scalar_load_cost */
+  1, /* scalar_store_cost */
+  1, /* cond_taken_branch_cost */
+  1, /* cond_not_taken_branch_cost */
+  &neoversev3ae_advsimd_vector_cost, /* advsimd */
+  &neoversev3ae_sve_vector_cost, /* sve */
+  &neoversev3ae_vec_issue_info /* issue_info */
+};
+
+static const struct tune_params neoversev3ae_tunings =
+{
+  &cortexa76_extra_costs,
+  &neoversev3ae_addrcost_table,
+  &neoversev3ae_regmove_cost,
+  &neoversev3ae_vector_cost,
+  &generic_branch_cost,
+  &generic_approx_modes,
+  SVE_128, /* sve_width */
+  { 4, /* load_int. */
+    2, /* store_int. */
+    6, /* load_fp. */
+    1, /* store_fp. */
+    6, /* load_pred. */
+    2 /* store_pred. */
+  }, /* memmov_cost. */
+  10, /* issue_rate */
+  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops */
+  "32:16", /* function_align. */
+  "4", /* jump_align. */
+  "32:16", /* loop_align. */
+  4, /* int_reassoc_width. */
+  6, /* fp_reassoc_width. */
+  4, /* fma_reassoc_width. */
+  3, /* vec_reassoc_width. */
+  2, /* min_div_recip_mul_sf. */
+  2, /* min_div_recip_mul_df. */
+  0, /* max_case_values. */
+  tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */
+  (AARCH64_EXTRA_TUNE_CHEAP_SHIFT_EXTEND
+   | AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS
+   | AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS
+   | AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT
+   | AARCH64_EXTRA_TUNE_AVOID_PRED_RMW), /* tune_flags. */
+  &generic_prefetch_tune,
+  AARCH64_LDP_STP_POLICY_ALWAYS, /* ldp_policy_model. */
+  AARCH64_LDP_STP_POLICY_ALWAYS /* stp_policy_model. */
+};
+
+#endif /* GCC_AARCH64_H_NEOVERSEV3AE. */
\ No newline at end of file
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ffcf4f146d92d410c6b515b3b80f07bdec1d2b55..c14eaf8526028575feb40bafebb4ee29dc5b4be6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -21524,7 +21524,7 @@ performance of the code. Permissible values for this option are:
 @samp{oryon-1},
 @samp{neoverse-512tvb}, @samp{neoverse-e1}, @samp{neoverse-n1},
 @samp{neoverse-n2}, @samp{neoverse-v1}, @samp{neoverse-v2}, @samp{grace},
-@samp{neoverse-v3},
+@samp{neoverse-v3}, @samp{neoverse-v3ae},
 @samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan},
 @samp{octeontx}, @samp{octeontx81}, @samp{octeontx83},
 @samp{octeontx2}, @samp{octeontx2t98}, @samp{octeontx2t96}

From patchwork Fri Jul 26 09:20:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 1965225 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=PNqqYa17; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=PNqqYa17; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WVj1w06F9z1yY5 for ; Fri, 26 Jul 2024 19:21:56 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9782A3870C29 for ; Fri, 26 Jul 2024 09:21:53 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from
EUR03-AM7-obe.outbound.protection.outlook.com (mail-am7eur03on20601.outbound.protection.outlook.com [IPv6:2a01:111:f403:260e::601]) by sourceware.org (Postfix) with ESMTPS id C51983858408; Fri, 26 Jul 2024 09:21:19 +0000 (GMT) Date: Fri, 26 Jul 2024 10:20:57 +0100 From: Tamar Christina To: gcc-patches@gcc.gnu.org Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, ktkachov@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH 4/8]AArch64: Add Neoverse N3 and Cortex-A725 core definition and cost model Message-ID: Content-Disposition: inline In-Reply-To: MIME-Version: 1.0 X-OriginatorOrg: arm.com X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

Hi All,

This adds a cost model and core definition for Neoverse N3 and Cortex-A725.
It also makes Cortex-A725 use the Neoverse N3 cost model.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-cores.def (neoverse-n3, cortex-a725): New.
	* config/aarch64/aarch64-tune.md: Regenerate.
	* config/aarch64/tuning_models/neoversen3.h: New file.
	* config/aarch64/aarch64.cc: Use it.
	* doc/invoke.texi: Document it.

---

--
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 72c20b8bc22ba8db4a1e6fdb1ff623020539042b..d70176e86271a65a3610786064432099cd1e75ee 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -183,6 +183,7 @@ AARCH64_CORE("cortex-a710", cortexa710, cortexa57, V9A, (SVE2_BITPERM, MEMTAG,
 AARCH64_CORE("cortex-a715", cortexa715, cortexa57, V9A, (SVE2_BITPERM, MEMTAG, I8MM, BF16), neoversen2, 0x41, 0xd4d, -1)
 AARCH64_CORE("cortex-a720", cortexa720, cortexa57, V9_2A, (SVE2_BITPERM, MEMTAG, PROFILE), neoversen2, 0x41, 0xd81, -1)
+AARCH64_CORE("cortex-a725", cortexa725, cortexa57, V9_2A, (SVE2_BITPERM, MEMTAG, PROFILE), neoversen3, 0x41, 0xd87, -1)
 AARCH64_CORE("cortex-x2", cortexx2, cortexa57, V9A, (SVE2_BITPERM, MEMTAG, I8MM, BF16), neoversen2, 0x41, 0xd48, -1)
@@ -192,6 +193,7 @@ AARCH64_CORE("cortex-x4", cortexx4, cortexa57, V9_2A, (SVE2_BITPERM, MEMTAG, P
 AARCH64_CORE("neoverse-n2", neoversen2, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversen2, 0x41, 0xd49, -1)
 AARCH64_CORE("cobalt-100", cobalt100, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversen2, 0x6d, 0xd49, -1)
+AARCH64_CORE("neoverse-n3", neoversen3, cortexa57, V9_2A, (SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversen3, 0x41, 0xd8e, -1)
 AARCH64_CORE("neoverse-v2", neoversev2, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversev2, 0x41, 0xd4f, -1)
 AARCH64_CORE("grace", grace, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, SVE2_AES, SVE2_SHA3, SVE2_SM4, PROFILE), neoversev2, 0x41, 0xd4f, -1)
diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
index b02e891086ccc60aa5eac3b28206b240ef7937e8..d71c631b01c767633cc1e9c362ac51533a87c53f 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-	"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88,thunderxt88p1,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,ampere1a,ampere1b,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,cortexx1c,neoversen1,ares,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,tsv110,thunderx3t110,neoversev1,zeus,neoverse512tvb,saphira,oryon1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexa510,cortexa520,cortexa710,cortexa715,cortexa720,cortexx2,cortexx3,cortexx4,neoversen2,cobalt100,neoversev2,grace,neoversev3,neoversev3ae,demeter,generic,generic_armv8_a,generic_armv9_a"
+	"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88,thunderxt88p1,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,ampere1a,ampere1b,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,cortexx1c,neoversen1,ares,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,tsv110,thunderx3t110,neoversev1,zeus,neoverse512tvb,saphira,oryon1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexa510,cortexa520,cortexa710,cortexa715,cortexa720,cortexa725,cortexx2,cortexx3,cortexx4,neoversen2,cobalt100,neoversen3,neoversev2,grace,neoversev3,neoversev3ae,demeter,generic,generic_armv8_a,generic_armv9_a"
 	(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 3b7bbf0399405b1ccda71e563de846a2161ac580..8f4abe4d560a6b5b83667946ee3a2178cfec270a 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -412,6 +412,7 @@ static const struct aarch64_flag_desc aarch64_tuning_flags[] =
 #include "tuning_models/neoversev1.h"
 #include "tuning_models/neoverse512tvb.h"
 #include "tuning_models/neoversen2.h"
+#include "tuning_models/neoversen3.h"
 #include "tuning_models/neoversev2.h"
 #include "tuning_models/neoversev3.h"
 #include "tuning_models/neoversev3ae.h"
diff --git a/gcc/config/aarch64/tuning_models/neoversen3.h b/gcc/config/aarch64/tuning_models/neoversen3.h
new file mode 100644
index 0000000000000000000000000000000000000000..95e41b0a61326ecf4dc1ed09c8323991811b65ec
--- /dev/null
+++ b/gcc/config/aarch64/tuning_models/neoversen3.h
@@ -0,0 +1,245 @@
+/* Tuning model description for AArch64 architecture.
+   Copyright (C) 2009-2024 Free Software Foundation, Inc.
+ + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + <http://www.gnu.org/licenses/>. */ + +#ifndef GCC_AARCH64_H_NEOVERSEN3 +#define GCC_AARCH64_H_NEOVERSEN3 + +#include "generic.h" + +static const struct cpu_addrcost_table neoversen3_addrcost_table = +{ + { + 1, /* hi */ + 0, /* si */ + 0, /* di */ + 1, /* ti */ + }, + 0, /* pre_modify */ + 0, /* post_modify */ + 2, /* post_modify_ld3_st3 */ + 2, /* post_modify_ld4_st4 */ + 0, /* register_offset */ + 0, /* register_sextend */ + 0, /* register_zextend */ + 0 /* imm_offset */ +}; + +static const struct cpu_regmove_cost neoversen3_regmove_cost = +{ + 3, /* GP2GP */ + /* Spilling to int<->fp instead of memory is recommended so set + realistic costs compared to memmov_cost. */ + 5, /* GP2FP */ + 4, /* FP2GP */ + 4 /* FP2FP */ +}; + +static const advsimd_vec_cost neoversen3_advsimd_vector_cost = +{ + 2, /* int_stmt_cost */ + 2, /* fp_stmt_cost */ + 2, /* ld2_st2_permute_cost */ + 2, /* ld3_st3_permute_cost */ + 3, /* ld4_st4_permute_cost */ + 2, /* permute_cost */ + 6, /* reduc_i8_cost */ + 5, /* reduc_i16_cost */ + 3, /* reduc_i32_cost */ + 2, /* reduc_i64_cost */ + 9, /* reduc_f16_cost */ + 6, /* reduc_f32_cost */ + 3, /* reduc_f64_cost */ + 2, /* store_elt_extra_cost */ + /* This value is just inherited from the Cortex-A57 table. */ + 8, /* vec_to_scalar_cost */ + /* This depends very much on what the scalar value is and + where it comes from. E.g.
some constants take two dependent + instructions or a load, while others might be moved from a GPR. + 4 seems to be a reasonable compromise in practice. */ + 4, /* scalar_to_vec_cost */ + 4, /* align_load_cost */ + 4, /* unalign_load_cost */ + /* Although stores have a latency of 2 and compete for the + vector pipes, in practice it's better not to model that. */ + 1, /* unalign_store_cost */ + 1 /* store_cost */ +}; + +static const sve_vec_cost neoversen3_sve_vector_cost = +{ + { + 2, /* int_stmt_cost */ + 2, /* fp_stmt_cost */ + 2, /* ld2_st2_permute_cost */ + 3, /* ld3_st3_permute_cost */ + 3, /* ld4_st4_permute_cost */ + 2, /* permute_cost */ + /* Theoretically, a reduction involving 15 scalar ADDs could + complete in ~5 cycles and would have a cost of 15. [SU]ADDV + completes in 8 cycles, so give it a cost of 15 + 3. */ + 18, /* reduc_i8_cost */ + /* Likewise for 7 scalar ADDs (~3 cycles) vs. 7: 7 + 4. */ + 11, /* reduc_i16_cost */ + /* Likewise for 3 scalar ADDs (~2 cycles) vs. 4: 3 + 2. */ + 5, /* reduc_i32_cost */ + /* Likewise for 1 scalar ADDs (~1 cycles) vs. 2: 1 + 1. */ + 2, /* reduc_i64_cost */ + /* Theoretically, a reduction involving 7 scalar FADDs could + complete in ~8 cycles and would have a cost of 7. FADDV + completes in 6 cycles, so give it a cost of 7 + -2. */ + 5, /* reduc_f16_cost */ + /* Likewise for 3 scalar FADDs (~4 cycles) vs. 4: 3 + 0. */ + 3, /* reduc_f32_cost */ + /* Likewise for 1 scalar FADD (~2 cycles) vs. 2: 1 + 0. */ + 1, /* reduc_f64_cost */ + 2, /* store_elt_extra_cost */ + /* This value is just inherited from the Cortex-A57 table. */ + 8, /* vec_to_scalar_cost */ + /* See the comment above the Advanced SIMD versions. */ + 4, /* scalar_to_vec_cost */ + 4, /* align_load_cost */ + 4, /* unalign_load_cost */ + /* Although stores have a latency of 2 and compete for the + vector pipes, in practice it's better not to model that. 
*/ + 1, /* unalign_store_cost */ + 1 /* store_cost */ + }, + 3, /* clast_cost */ + 10, /* fadda_f16_cost */ + 6, /* fadda_f32_cost */ + 4, /* fadda_f64_cost */ + /* A strided Advanced SIMD x64 load would take two parallel FP loads + (8 cycles) plus an insertion (2 cycles). Assume a 64-bit SVE gather + is 1 cycle more. The Advanced SIMD version is costed as 2 scalar loads + (cost 8) and a vec_construct (cost 4). Add a full vector operation + (cost 2) to that, to avoid the difference being lost in rounding. + + There is no easy comparison between a strided Advanced SIMD x32 load + and an SVE 32-bit gather, but cost an SVE 32-bit gather as 1 vector + operation more than a 64-bit gather. */ + 14, /* gather_load_x32_cost */ + 12, /* gather_load_x64_cost */ + 1 /* scatter_store_elt_cost */ +}; + +static const aarch64_scalar_vec_issue_info neoversen3_scalar_issue_info = +{ + 3, /* loads_stores_per_cycle */ + 2, /* stores_per_cycle */ + 4, /* general_ops_per_cycle */ + 0, /* fp_simd_load_general_ops */ + 1 /* fp_simd_store_general_ops */ +}; + +static const aarch64_advsimd_vec_issue_info neoversen3_advsimd_issue_info = +{ + { + 0, /* loads_stores_per_cycle */ + 1, /* stores_per_cycle */ + 2, /* general_ops_per_cycle */ + 0, /* fp_simd_load_general_ops */ + 1 /* fp_simd_store_general_ops */ + }, + 2, /* ld2_st2_general_ops */ + 2, /* ld3_st3_general_ops */ + 3 /* ld4_st4_general_ops */ +}; + +static const aarch64_sve_vec_issue_info neoversen3_sve_issue_info = +{ + { + { + 0, /* loads_stores_per_cycle */ + 1, /* stores_per_cycle */ + 2, /* general_ops_per_cycle */ + 0, /* fp_simd_load_general_ops */ + 1 /* fp_simd_store_general_ops */ + }, + 2, /* ld2_st2_general_ops */ + 2, /* ld3_st3_general_ops */ + 3 /* ld4_st4_general_ops */ + }, + 2, /* pred_ops_per_cycle */ + 1, /* while_pred_ops */ + 0, /* int_cmp_pred_ops */ + 0, /* fp_cmp_pred_ops */ + 1, /* gather_scatter_pair_general_ops */ + 1 /* gather_scatter_pair_pred_ops */ +}; + +static const aarch64_vec_issue_info 
neoversen3_vec_issue_info = +{ + &neoversen3_scalar_issue_info, + &neoversen3_advsimd_issue_info, + &neoversen3_sve_issue_info +}; + +/* Neoversen3 costs for vector insn classes. */ +static const struct cpu_vector_cost neoversen3_vector_cost = +{ + 1, /* scalar_int_stmt_cost */ + 2, /* scalar_fp_stmt_cost */ + 4, /* scalar_load_cost */ + 1, /* scalar_store_cost */ + 1, /* cond_taken_branch_cost */ + 1, /* cond_not_taken_branch_cost */ + &neoversen3_advsimd_vector_cost, /* advsimd */ + &neoversen3_sve_vector_cost, /* sve */ + &neoversen3_vec_issue_info /* issue_info */ +}; + +static const struct tune_params neoversen3_tunings = +{ + &cortexa76_extra_costs, + &neoversen3_addrcost_table, + &neoversen3_regmove_cost, + &neoversen3_vector_cost, + &generic_branch_cost, + &generic_approx_modes, + SVE_128, /* sve_width */ + { 4, /* load_int. */ + 2, /* store_int. */ + 6, /* load_fp. */ + 1, /* store_fp. */ + 6, /* load_pred. */ + 2 /* store_pred. */ + }, /* memmov_cost. */ + 5, /* issue_rate */ + (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops */ + "32:16", /* function_align. */ + "4", /* jump_align. */ + "32:16", /* loop_align. */ + 3, /* int_reassoc_width. */ + 6, /* fp_reassoc_width. */ + 4, /* fma_reassoc_width. */ + 3, /* vec_reassoc_width. */ + 2, /* min_div_recip_mul_sf. */ + 2, /* min_div_recip_mul_df. */ + 0, /* max_case_values. */ + tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */ + (AARCH64_EXTRA_TUNE_CHEAP_SHIFT_EXTEND + | AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS + | AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS + | AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT), /* tune_flags. */ + &generic_prefetch_tune, + AARCH64_LDP_STP_POLICY_ALWAYS, /* ldp_policy_model. */ + AARCH64_LDP_STP_POLICY_ALWAYS /* stp_policy_model. */ +}; + +#endif /* GCC_AARCH64_H_NEOVERSEN3. 
*/ \ No newline at end of file diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index c14eaf8526028575feb40bafebb4ee29dc5b4be6..13a3b75aa22da99422bce1fecc17174f97e811a1 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -21524,7 +21524,8 @@ performance of the code. Permissible values for this option are: @samp{oryon-1}, @samp{neoverse-512tvb}, @samp{neoverse-e1}, @samp{neoverse-n1}, @samp{neoverse-n2}, @samp{neoverse-v1}, @samp{neoverse-v2}, @samp{grace}, -@samp{neoverse-v3}, @samp{neoverse-v3ae}, +@samp{neoverse-v3}, @samp{neoverse-v3ae}, @samp{neoverse-n3}, +@samp{cortex-a725}, @samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan}, @samp{octeontx}, @samp{octeontx81}, @samp{octeontx83}, @samp{octeontx2}, @samp{octeontx2t98}, @samp{octeontx2t96} From patchwork Fri Jul 26 09:21:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 1965231 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=D+13840x; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=D+13840x; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) 
with ESMTPS id 4WVj5N26NBz1yY5 for ; Fri, 26 Jul 2024 19:24:56 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 793C93870C1B for ; Fri, 26 Jul 2024 09:24:54 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR02-AM0-obe.outbound.protection.outlook.com (mail-am0eur02on20601.outbound.protection.outlook.com [IPv6:2a01:111:f403:2606::601]) by sourceware.org (Postfix) with ESMTPS id 58A8A3870C18; Fri, 26 Jul 2024 09:21:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 58A8A3870C18 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 58A8A3870C18 Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=2a01:111:f403:2606::601 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1721985700; cv=pass; b=wD8h/TmAWVoxCoYnQwbPWX45jBpkdnaBWFZ3vobp93ZfGwAvRjuCgm6TkjNOuv3YFS3LnDDULpMfLY3L2V42MxWpVNrEwo8dxANjGXQlRhQMgj0pwLH+mr59RWTCC+tUNSJH9ZFNjqx6tJFDN2tX1Sa5tpIMIPppmezTvEJVl0I= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1721985700; c=relaxed/simple; bh=/dZVJIuyRcz3k/AoPE6Vz5V1EuyIpmrAqgzCIetqtrk=; h=DKIM-Signature:DKIM-Signature:Date:From:To:Subject:Message-ID: MIME-Version; b=kd5UuqbYh8PZLNG3gWNbt0605JtuwhbNXHMOKAOXy6BldXd61KVzbFpLOdfmYJTry4hksFbDyaqbpTYAA1HO/2XU8Se/2XUWY96bA283O7H38SajG4q1/ouM0V0+bzVwLADxITFWpWKvHpljK7aud9ryznToGlcxhVM2BzlSNfI= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=pass; 
b=XCO5/PotwwA0GVq+yHbO4tkWK3qZdcMSeOAJCMlqbr1n/iw+q73J+o9QxUGRnP5xnqSl+HRRwO41Lf4B0cn7b/6Hh5FNgiBOZxlVbC7XmQHzKMWAMcJlViw9HTWYY9vcKs5TvDf1p2vMwL+7ZOOXIavrZbo2PPLx25BCvOrPiGpjs+ABCDJQJUijeN1P7Z240dI7lLJ65INJPxrr0oQ3EvcOPHDFERrNiJHxYYWepEOG1bw3y0ncNlHtHn35pRRe9+/YBEiZJ/vVMXuH5jhqaRJlVHVnTE4ezP8E17ooJo5psk1STcBq4emT7OQPFWjNUKYOKpcSXoiY8xXldNNPxw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=32wxkzwcdbdU13g3B98K+ci+b8JPd1tXjOjdpzhcg0I=; b=KOtsyg+C6j6DWtroUIb7OBnzSm27k2bTk8w8LFnCmHWhPERmeAar8SPKYTMmYezEdW6xZjTb4wyxwAVGp24ufY9UYheTVrhxQABCxZBxy722nk/KDKhpws5O9gpRn0H+wJDSKNXdEsSam7VfXqkW8+tN71nu1rwPjpFR9WvZ6q1TODzbeBguYf72xhEOv2gaSvPHcnGrMOuSAN74Eq6LiM0EIOushjV+5t0ZC9fGW2yT1be3jMe9GhI0zuiiUCM2hhKuM2HFBoDUVplHaohekoxJlyGCvW3PoR1+JcBiMT9Xco+BYFwHYwn4pMM/ZITCFOKmhoKWQBKTwqeU1gCRfQ== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=arm.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=32wxkzwcdbdU13g3B98K+ci+b8JPd1tXjOjdpzhcg0I=; b=D+13840x1tCFyYsMAMv8iE5NRjaGa2abjEFjeV43OueCgzWdUCNlRz0wPzmo/681utJp/wzPnbPVIgg+l+q1yMz9W/XwnY4OO/VCuUZ0/DHT3GymU5foM07khJIRziPVy37cgLn63Xs1JxjlqS8febabkyoHDItaX5YzbTfnIBc= Received: from AS9PR04CA0108.eurprd04.prod.outlook.com (2603:10a6:20b:50e::25) by AS2PR08MB9893.eurprd08.prod.outlook.com (2603:10a6:20b:5f3::22) with Microsoft SMTP Server (version=TLS1_2, 
cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7784.16; Fri, 26 Jul 2024 09:21:30 +0000 Received: from AMS0EPF000001A2.eurprd05.prod.outlook.com (2603:10a6:20b:50e:cafe::8d) by AS9PR04CA0108.outlook.office365.com (2603:10a6:20b:50e::25) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7784.28 via Frontend Transport; Fri, 26 Jul 2024 09:21:30 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=arm.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by AMS0EPF000001A2.mail.protection.outlook.com (10.167.16.235) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.7784.11 via Frontend Transport; Fri, 26 Jul 2024 09:21:30 +0000 Received: ("Tessian outbound 5cfbd73e165d:v365"); Fri, 26 Jul 2024 09:21:29 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 216399fe0045f1b7 X-CR-MTA-TID: 64aa7808 Received: from L1a89b88da814.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id FB0BB838-F249-4A2B-B919-A5250B3266BE.1; Fri, 26 Jul 2024 09:21:18 +0000 Received: from EUR05-VI1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id L1a89b88da814.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Fri, 26 Jul 2024 09:21:18 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; 
b=pt3h9xeorGpR4Q06Uk+H461G+A+PH9kequTh504N0vsoisB6LwdTcI4xvfCSkTGfABwM+5dHXG1OrzamHhDQ5ziWEII8nINeDVzkE3GP/f1dS0A0yXI/MvMpH3z3iViinbFNMA8qdu7sTpGdl++UCoXH35x2FkM3bL1nKhO7e4QqOAYtVgziZXi3pcgsGt5oUEopoc9QSH2Hv60toKTKVnEQrxDm8I7RBka3WhrLNPz5KwTf3/zLiIKR0rShHYH8XgNTimFjVAKhc/f1T5Pvqfdqw3r9HnfWTEz4H3+A0ugyA3Z/VGiL8C46undwmlVL+MDLBtZHdqk4SST2VYmYEQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=32wxkzwcdbdU13g3B98K+ci+b8JPd1tXjOjdpzhcg0I=; b=t8zYX6wZMKOdOEg7ubT4i2PY5gYb9hb/8MvbxP9Lz6xd05J7LL8wgjYfpROJog6XuJdfFPx1Hsk2gOw4HVFhPWKD87kI4PpADw9h8c1sMa4N1CXt+nDdDAoeHR+ze6GS06WXdBQnfQTY9rZL9jKR22uyZr00Lvsh7avZtJPKT2iz+rBNBl3avOOQZoIMGHLIAxukvzVI56MuKWAISg27tPeM2ARVlyMbweqeRLrxVKcyYkLIrfccMuDcttvHNK8mW/kV9+iokVMkbziLDTdyqszt6MVQm1+Uclfdq9dvFfE8iWwfRdLae7AYv0jl11rQTrXeAz6Fi1a3l2s6OhLSVQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=32wxkzwcdbdU13g3B98K+ci+b8JPd1tXjOjdpzhcg0I=; b=D+13840x1tCFyYsMAMv8iE5NRjaGa2abjEFjeV43OueCgzWdUCNlRz0wPzmo/681utJp/wzPnbPVIgg+l+q1yMz9W/XwnY4OO/VCuUZ0/DHT3GymU5foM07khJIRziPVy37cgLn63Xs1JxjlqS8febabkyoHDItaX5YzbTfnIBc= Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by GV2PR08MB8098.eurprd08.prod.outlook.com (2603:10a6:150:76::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7784.16; Fri, 26 Jul 2024 09:21:14 +0000 Received: from 
VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::89dc:c731:362b:7c69]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::89dc:c731:362b:7c69%6]) with mapi id 15.20.7784.020; Fri, 26 Jul 2024 09:21:14 +0000 Date: Fri, 26 Jul 2024 10:21:11 +0100 From: Tamar Christina To: gcc-patches@gcc.gnu.org Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, ktkachov@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH 5/8]AArch64: Update Generic Armv9-a cost model to release costs Message-ID: Content-Disposition: inline In-Reply-To: X-ClientProxiedBy: LO4P265CA0282.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:37a::14) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: VI1PR08MB5325:EE_|GV2PR08MB8098:EE_|AMS0EPF000001A2:EE_|AS2PR08MB9893:EE_ X-MS-Office365-Filtering-Correlation-Id: ed5aaa89-dc21-4646-713b-08dcad545564 x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0;ARA:13230040|376014|1800799024|366016; X-Microsoft-Antispam-Message-Info-Original: 
bdWpWrbhEISt49j6bZxIyVTEfVYpeL2NeZJ2MwhOwqZx7H22B+DGriVaGjvEKHfxPhM6PTB/6mVdaD2Dc4uGqxbcN3fG9tC8t2XXzaa1dvH5tm5RtJsAvv7bNGSGsi5HgWOTtHM7hYwBg7dIpuJIXrNBkBCXPGnGNwYRucapieA2i4WA/gJPNn0ZpMThCDZJkXUCz6TrE8pS/BY2J1zXpKhdk4FMru51WKpENJoCDWWP6cyN522MlrZiLHCE5vmj4+dlAajBsnraqrAk+Dk8xYh3Yn/WuMOUPObORnXELwh07eVc+V3aj+sRgDl+HxPpnllumdkJefKuko4mxDFHBDOtfI10eG936juuOyDUoQrNUeo8f8CVb+ICdrQ7so2DE5HefUfftVYXAsqpRCaAlEHe3niz12g8LacrSf6wm6pLqOFWYLpbW/Jmw51LjUYZX3/DGwgCMbtoMFHFaYLqh+5maLondJR5W958qCbuOAu4FDBVQ0Ii2jGIvCtlyl82XeaMzNxBTQwWprBYBPTKQoyT4i75oBdpEmpIHOFE+/7QTB3jtvLSgW8ykiHEcq0llmRbMRej3tWGOXpWzdwWEBx0eFmJwkSqgTdUQKWJ41hYT66H3CBgKy+ZDzIB6S5V8vQp4N7Fk+BWrBYwkAvIVArk0tvExcHDxphWmLmj+kmwg/Vn5QvLQQq9YvY0DG+uSrejv3d4IPbq/XSk2TMeRGKM3HE2V/CbN1uenoF60X8hr6G13FKzgQzTdfggJNteU3U0j8Cok3cD5STwHrTS2QcO3NVbD+sIRU1Z36jslHopehUOLSxYhs+5bdkflaHcebWNhgiQPE72uUs5/3miGm6ZRQYyP1MKf13R2nsHbyTfLFO8SCQPslB6UYYEg0paFM8/hop8GrVi9mi+XNy4zWfnJ7pGVIn68FRoyfvNM8DauKjRHGjwycsxcicHevSdgKBF5paXPF48a/3mgEiNjjDprqlx9qiCMwadtmX3Xp3n9kb/CX8MY2OtwJFAlgfF57JBwtIPwJLL4BlMeXNBQ2Im1JGix3SSGIqMvtpQZ5+Zmcduq50jk1oeSFjAb2K6BAp0OLPAvDId1dZio1WbFJw5hDaoUlzCoV4shT5T7AJ+rexxw2p2LYCB3BL29s6lyc2KTCb7/WDPObmeLJTCKE3XVy5BVAt6vyStbZ51eJvVtE777jrLLxTFZ06sG94hCf9X/LGI9O9kn5LZKgHLehrz5P4T/c1Q/DggOz5czbAGL8Jyz7t3pzWatLJSnTcTD65q4XGRm0Rn/VzvusZvPxUXn6BMzz6MH/6CMIyKsNCEtrr/NH4BSTImgg5WbCRdifDe2avhC7OBBsdaVGI4Gw== X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(1800799024)(366016); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: GV2PR08MB8098 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-SkipListedInternetSender: ip=[2603:10a6:803:13e::17]; domain=VI1PR08MB5325.eurprd08.prod.outlook.com X-MS-Exchange-Transport-CrossTenantHeadersStripped: 
AMS0EPF000001A2.eurprd05.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: 07115a8e-5690-476a-6f56-08dcad544bc9 X-Microsoft-Antispam: BCL:0; ARA:13230040|82310400026|1800799024|376014|36860700013|35042699022|34020700016; X-Microsoft-Antispam-Message-Info: =?utf-8?q?5X12bmcYKcR3xeVR4D7U5a7tAC2v7C4?= =?utf-8?q?zgLvE1npPjXhft5pKXl3/F5XbZq9lCDVw4hd0j4m1AZUys5hebEbN3LRkjN4v1Q2Y?= =?utf-8?q?7aaIwg7lnKvIe0XOTl2KQEP7qkib/wQUkk+d/xjyOpw6FZeDS9I/Zi+vr5T3ckOle?= =?utf-8?q?7myBZbCESNFTf+Xy+tV7s/aaUpR5pzObtiLag865SRoPEpbS7o31+08zX7vrmo3kT?= =?utf-8?q?XOpUTindN9pudXMkKart91JVGgSN0aZifYHCWPurzpoMn+lrk4Myo42eGHSat9woS?= =?utf-8?q?aYG2/5Y7Uw+9naTFFRbBMNGPV03KqY5onXHbxlIf6NZoHdMijz3N4atr7AAW1aPYc?= =?utf-8?q?tyjK3Qm6WplJ3D5rx0yWJ4SLmnqlb/QUXiGQGvWKC756iioAujb2alS2nx84CV7pj?= =?utf-8?q?NVmtO2WmNg7lD8D5nHe+jvaWDoRE7m6bZ5os6faC1ZtRGz8EtOZr28lSAee/c8FFz?= =?utf-8?q?7jEKxaevn0qxRkbfofcQOaYH1waSeNWDErgcPQICHyuWrWmwjGyflH6m0jfsKqFnR?= =?utf-8?q?prPqUZiyPAGUeZrpwquJ58Yg+DeigNRfN/Kmi8rxTgOHbN7XokzmbrjOeSPj0+KFC?= =?utf-8?q?FSMZrW6ikwl4wacrVFi89mD+30XuS9rS7Ppi4rVTyEwLLNK5SlTgHtup3oWlOp1w2?= =?utf-8?q?ooqEk7YsaTj+b8RQNT3Vn4OiPsmHjKkvdXtzYUJqkdsunIG0ezjufIRSlpLxKP/WM?= =?utf-8?q?1ymTlg2WkNZKQRGMZsZa6Ec7DE7sbRxIz6gjM9YI2cUw0Hkbe8wv36/g6gqvdAZFg?= =?utf-8?q?bQMa9on+BA5I3MyH+52/Bgo5L2NyJVJhVpvPxLZHZr8n3uQguvmAiDNpRPPGMO8I9?= =?utf-8?q?dNJGNsFD3L3f+wL3fu/VQwDY3t1VOSC81eXcj5sOy3aeSTuoEdKAoir9rFZv76fP4?= =?utf-8?q?cDQE09L6ZiYCg7afgHH0VeAWcnuU0vSfxwWDpY6x1LlmO1ABqR6nQ4Hb5QgFauo4b?= =?utf-8?q?Ihzoebo4/DR+e9Mvhamd64pUNVwVhBvKimzrUrzxcEsoGXvZWzsxvxJ946jVFnOSR?= =?utf-8?q?NPOJGvEq9r5SWhOR4qyUp6rXmPSn/imArOEIR9dnhBhaAgJjBs5dKlj8flntyCTq4?= =?utf-8?q?7oTSnU3+3HqP8nkVo216EgTEgllMQu1jMNWtSUhjXO2AVYDBh25St2R7yScxOkrIn?= =?utf-8?q?V3Lj/4xWD4fS0KVj9BuOn6B3TZaIwjJnth0jowTbqLYnF+jABwCdyZC2T1q/04DzK?= =?utf-8?q?YJ8LFW/2pDL/6oeSj4JjJtqd+yxtvLPw4HDKlENdJ1amDFhspIoPfT4v32PsVie4m?= =?utf-8?q?294TTnMoZQzdeuVhsJ1t52uXPLjyinQYfCgnAEliu8mk576YlAhTGIrLHtZDxh9Y6?= 
=?utf-8?q?40F01uUK/sye+bIk+yKRoEji4GzGjcmj6Mu7ezjQ4HIWYWwExay6w9G0nI/iO0qYX?= =?utf-8?q?wyUsTVQjJRC?= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230040)(82310400026)(1800799024)(376014)(36860700013)(35042699022)(34020700016); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 26 Jul 2024 09:21:30.0334 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: ed5aaa89-dc21-4646-713b-08dcad545564 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: AMS0EPF000001A2.eurprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS2PR08MB9893 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FORGED_SPF_HELO, GIT_PATCH_0, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Hi All, this updates the costs for generic-armv9-a based on the updated costs for Neoverse V2 and Neoverse N2. Bootstrapped and regtested on aarch64-none-linux-gnu with no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/tuning_models/generic_armv9_a.h: Update costs.
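Note on the SVE reduc_*_cost values: the comments in the diff appear to follow one rule, namely the number of scalar operations plus the difference between the reduction instruction's latency and the cycles the scalar sequence would need. The sketch below is only my reading of those comments, not code from GCC; reduc_cost and check_generic_armv9_a_reduc_costs are hypothetical names, and the inputs are the figures quoted in the updated generic_armv9_a.h comments.

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC.  It restates the rule implied by
   the reduc_*_cost comments: start from the number of scalar operations
   and adjust by how much slower (or faster) the reduction instruction is
   than the equivalent scalar sequence.  */
static int
reduc_cost (int scalar_ops, int scalar_cycles, int reduc_cycles)
{
  return scalar_ops + (reduc_cycles - scalar_cycles);
}

/* Assumed inputs: the operation counts and cycle figures quoted in the
   updated comments of generic_armv9_a_sve_vector_cost.  */
static void
check_generic_armv9_a_reduc_costs (void)
{
  assert (reduc_cost (15, 5, 9) == 19);	/* reduc_i8_cost */
  assert (reduc_cost (7, 3, 8) == 12);	/* reduc_i16_cost */
  assert (reduc_cost (3, 2, 6) == 7);	/* reduc_i32_cost */
  assert (reduc_cost (1, 1, 4) == 4);	/* reduc_i64_cost */
  assert (reduc_cost (7, 8, 8) == 7);	/* reduc_f16_cost */
  assert (reduc_cost (3, 4, 6) == 5);	/* reduc_f32_cost */
  assert (reduc_cost (1, 2, 4) == 3);	/* reduc_f64_cost */
}
```

Calling check_generic_armv9_a_reduc_costs () from a test driver confirms that each new table entry matches the arithmetic in its comment. A negative adjustment is possible when the reduction instruction is faster than the scalar sequence.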
--- -- diff --git a/gcc/config/aarch64/tuning_models/generic_armv9_a.h b/gcc/config/aarch64/tuning_models/generic_armv9_a.h index 0a08c4b4347332d85e15bece30859129feb2d492..b39a0c73db910888168790888d24ddf4406bf1ee 100644 --- a/gcc/config/aarch64/tuning_models/generic_armv9_a.h +++ b/gcc/config/aarch64/tuning_models/generic_armv9_a.h @@ -58,7 +58,7 @@ static const advsimd_vec_cost generic_armv9_a_advsimd_vector_cost = 2, /* ld2_st2_permute_cost */ 2, /* ld3_st3_permute_cost */ 3, /* ld4_st4_permute_cost */ - 3, /* permute_cost */ + 2, /* permute_cost */ 4, /* reduc_i8_cost */ 4, /* reduc_i16_cost */ 2, /* reduc_i32_cost */ @@ -87,28 +87,28 @@ static const sve_vec_cost generic_armv9_a_sve_vector_cost = { 2, /* int_stmt_cost */ 2, /* fp_stmt_cost */ - 3, /* ld2_st2_permute_cost */ - 4, /* ld3_st3_permute_cost */ - 4, /* ld4_st4_permute_cost */ - 3, /* permute_cost */ + 2, /* ld2_st2_permute_cost */ + 3, /* ld3_st3_permute_cost */ + 3, /* ld4_st4_permute_cost */ + 2, /* permute_cost */ /* Theoretically, a reduction involving 15 scalar ADDs could complete in ~5 cycles and would have a cost of 15. [SU]ADDV - completes in 11 cycles, so give it a cost of 15 + 6. */ - 21, /* reduc_i8_cost */ - /* Likewise for 7 scalar ADDs (~3 cycles) vs. 9: 7 + 6. */ - 13, /* reduc_i16_cost */ - /* Likewise for 3 scalar ADDs (~2 cycles) vs. 8: 3 + 6. */ - 9, /* reduc_i32_cost */ - /* Likewise for 1 scalar ADD (~1 cycles) vs. 2: 1 + 1. */ - 2, /* reduc_i64_cost */ + completes in 9 cycles, so give it a cost of 15 + 4. */ + 19, /* reduc_i8_cost */ + /* Likewise for 7 scalar ADDs (~3 cycles) vs. 8: 7 + 5. */ + 12, /* reduc_i16_cost */ + /* Likewise for 3 scalar ADDs (~2 cycles) vs. 6: 3 + 4. */ + 7, /* reduc_i32_cost */ + /* Likewise for 1 scalar ADDs (~1 cycles) vs. 4: 1 + 3. */ + 4, /* reduc_i64_cost */ /* Theoretically, a reduction involving 7 scalar FADDs could - complete in ~8 cycles and would have a cost of 14. FADDV - completes in 6 cycles, so give it a cost of 14 - 2. 
*/ - 12, /* reduc_f16_cost */ - /* Likewise for 3 scalar FADDs (~4 cycles) vs. 4: 6 - 0. */ - 6, /* reduc_f32_cost */ - /* Likewise for 1 scalar FADD (~2 cycles) vs. 2: 2 - 0. */ - 2, /* reduc_f64_cost */ + complete in ~8 cycles and would have a cost of 7. FADDV + completes in 8 cycles, so give it a cost of 7 + 0. */ + 7, /* reduc_f16_cost */ + /* Likewise for 3 scalar FADDs (~4 cycles) vs. 6: 3 + 2. */ + 5, /* reduc_f32_cost */ + /* Likewise for 1 scalar FADD (~2 cycles) vs. 4: 1 + 2. */ + 3, /* reduc_f64_cost */ 2, /* store_elt_extra_cost */ /* This value is just inherited from the Cortex-A57 table. */ 8, /* vec_to_scalar_cost */ @@ -128,7 +128,7 @@ static const sve_vec_cost generic_armv9_a_sve_vector_cost = /* A strided Advanced SIMD x64 load would take two parallel FP loads (8 cycles) plus an insertion (2 cycles). Assume a 64-bit SVE gather is 1 cycle more. The Advanced SIMD version is costed as 2 scalar loads - (cost 8) and a vec_construct (cost 2). Add a full vector operation + (cost 8) and a vec_construct (cost 4). Add a full vector operation (cost 2) to that, to avoid the difference being lost in rounding. There is no easy comparison between a strided Advanced SIMD x32 load @@ -166,14 +166,14 @@ static const aarch64_sve_vec_issue_info generic_armv9_a_sve_issue_info = { { { - 3, /* loads_per_cycle */ + 3, /* loads_stores_per_cycle */ 2, /* stores_per_cycle */ 2, /* general_ops_per_cycle */ 0, /* fp_simd_load_general_ops */ 1 /* fp_simd_store_general_ops */ }, 2, /* ld2_st2_general_ops */ - 3, /* ld3_st3_general_ops */ + 2, /* ld3_st3_general_ops */ 3 /* ld4_st4_general_ops */ }, 2, /* pred_ops_per_cycle */ @@ -191,7 +191,7 @@ static const aarch64_vec_issue_info generic_armv9_a_vec_issue_info = &generic_armv9_a_sve_issue_info }; -/* Neoverse N2 costs for vector insn classes. */ +/* Generic_armv9_a costs for vector insn classes. 
*/ static const struct cpu_vector_cost generic_armv9_a_vector_cost = { 1, /* scalar_int_stmt_cost */ @@ -228,7 +228,7 @@ static const struct tune_params generic_armv9_a_tunings = "32:16", /* loop_align. */ 2, /* int_reassoc_width. */ 4, /* fp_reassoc_width. */ - 1, /* fma_reassoc_width. */ + 2, /* fma_reassoc_width. */ 2, /* vec_reassoc_width. */ 2, /* min_div_recip_mul_sf. */ 2, /* min_div_recip_mul_df. */ From patchwork Fri Jul 26 09:21:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 1965226 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=b0B7nQN2; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=b0B7nQN2; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WVj2F0PsSz1yY5 for ; Fri, 26 Jul 2024 19:22:13 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 34E9B386C5B5 for ; Fri, 26 Jul 2024 09:22:11 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR02-VI1-obe.outbound.protection.outlook.com 
(mail-vi1eur02on20600.outbound.protection.outlook.com [IPv6:2a01:111:f403:2607::600]) by sourceware.org (Postfix) with ESMTPS id C722D387093D; Fri, 26 Jul 2024 09:21:43 +0000 (GMT)
Date: Fri, 26 Jul 2024 10:21:25 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, ktkachov@gcc.gnu.org, richard.sandiford@arm.com
Subject: [PATCH 6/8]AArch64: Update Neoverse N2 cost model to release costs
Content-Disposition: inline
MIME-Version: 1.0
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches
mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

Hi All,

This updates the cost for Neoverse N2 to reflect the updated Software Optimization Guide.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/tuning_models/neoversen2.h: Update costs.

---
--
diff --git a/gcc/config/aarch64/tuning_models/neoversen2.h b/gcc/config/aarch64/tuning_models/neoversen2.h
index be9a48ac3adc097f967c217fe09dcac194d7d14f..3430eb9c06819e00ab38966bb960bd6525ff2b5c 100644
--- a/gcc/config/aarch64/tuning_models/neoversen2.h
+++ b/gcc/config/aarch64/tuning_models/neoversen2.h
@@ -57,7 +57,7 @@ static const advsimd_vec_cost neoversen2_advsimd_vector_cost =
   2, /* ld2_st2_permute_cost */
   2, /* ld3_st3_permute_cost */
   3, /* ld4_st4_permute_cost */
-  3, /* permute_cost */
+  2, /* permute_cost */
   4, /* reduc_i8_cost */
   4, /* reduc_i16_cost */
   2, /* reduc_i32_cost */
@@ -86,28 +86,28 @@ static const sve_vec_cost neoversen2_sve_vector_cost =
 {
   2, /* int_stmt_cost */
   2, /* fp_stmt_cost */
-  3, /* ld2_st2_permute_cost */
-  4, /* ld3_st3_permute_cost */
-  4, /* ld4_st4_permute_cost */
-  3, /* permute_cost */
+  2, /* ld2_st2_permute_cost */
+  3, /* ld3_st3_permute_cost */
+  3, /* ld4_st4_permute_cost */
+  2, /* permute_cost */
   /* Theoretically, a reduction involving 15 scalar ADDs could
      complete in ~5 cycles and would have a cost of 15. [SU]ADDV
-     completes in 11 cycles, so give it a cost of 15 + 6. */
-  21, /* reduc_i8_cost */
-  /* Likewise for 7 scalar ADDs (~3 cycles) vs. 9: 7 + 6. */
-  13, /* reduc_i16_cost */
-  /* Likewise for 3 scalar ADDs (~2 cycles) vs. 8: 3 + 6. */
-  9, /* reduc_i32_cost */
-  /* Likewise for 1 scalar ADD (~1 cycles) vs. 2: 1 + 1. */
-  2, /* reduc_i64_cost */
+     completes in 9 cycles, so give it a cost of 15 + 4. */
+  19, /* reduc_i8_cost */
+  /* Likewise for 7 scalar ADDs (~3 cycles) vs. 8: 7 + 5. */
+  12, /* reduc_i16_cost */
+  /* Likewise for 3 scalar ADDs (~2 cycles) vs. 6: 3 + 4. */
+  7, /* reduc_i32_cost */
+  /* Likewise for 1 scalar ADDs (~1 cycles) vs. 4: 1 + 3. */
+  4, /* reduc_i64_cost */
   /* Theoretically, a reduction involving 7 scalar FADDs could
-     complete in ~8 cycles and would have a cost of 14. FADDV
-     completes in 6 cycles, so give it a cost of 14 - 2. */
-  12, /* reduc_f16_cost */
-  /* Likewise for 3 scalar FADDs (~4 cycles) vs. 4: 6 - 0. */
-  6, /* reduc_f32_cost */
-  /* Likewise for 1 scalar FADD (~2 cycles) vs. 2: 2 - 0. */
-  2, /* reduc_f64_cost */
+     complete in ~8 cycles and would have a cost of 7. FADDV
+     completes in 6 cycles, so give it a cost of 7 + -2. */
+  5, /* reduc_f16_cost */
+  /* Likewise for 3 scalar FADDs (~4 cycles) vs. 4: 3 + 0. */
+  3, /* reduc_f32_cost */
+  /* Likewise for 1 scalar FADD (~2 cycles) vs. 2: 1 + 0. */
+  1, /* reduc_f64_cost */
   2, /* store_elt_extra_cost */
   /* This value is just inherited from the Cortex-A57 table. */
   8, /* vec_to_scalar_cost */
@@ -127,7 +127,7 @@ static const sve_vec_cost neoversen2_sve_vector_cost =
   /* A strided Advanced SIMD x64 load would take two parallel FP loads
      (8 cycles) plus an insertion (2 cycles). Assume a 64-bit SVE gather
      is 1 cycle more. The Advanced SIMD version is costed as 2 scalar loads
-     (cost 8) and a vec_construct (cost 2). Add a full vector operation
+     (cost 8) and a vec_construct (cost 4). Add a full vector operation
      (cost 2) to that, to avoid the difference being lost in rounding.
 
      There is no easy comparison between a strided Advanced SIMD x32 load
@@ -165,14 +165,14 @@ static const aarch64_sve_vec_issue_info neoversen2_sve_issue_info =
 {
   {
     {
-      3, /* loads_per_cycle */
+      3, /* loads_stores_per_cycle */
       2, /* stores_per_cycle */
       2, /* general_ops_per_cycle */
       0, /* fp_simd_load_general_ops */
       1 /* fp_simd_store_general_ops */
     },
     2, /* ld2_st2_general_ops */
-    3, /* ld3_st3_general_ops */
+    2, /* ld3_st3_general_ops */
     3 /* ld4_st4_general_ops */
   },
   2, /* pred_ops_per_cycle */
@@ -190,7 +190,7 @@ static const aarch64_vec_issue_info neoversen2_vec_issue_info =
   &neoversen2_sve_issue_info
 };
 
-/* Neoverse N2 costs for vector insn classes. */
+/* Neoversen2 costs for vector insn classes. */
 static const struct cpu_vector_cost neoversen2_vector_cost =
 {
   1, /* scalar_int_stmt_cost */
@@ -220,7 +220,7 @@ static const struct tune_params neoversen2_tunings =
     6, /* load_pred. */
     1 /* store_pred. */
   }, /* memmov_cost. */
-  3, /* issue_rate */
+  5, /* issue_rate */
   (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops */
   "32:16", /* function_align. */
   "4", /* jump_align. */
@@ -243,4 +243,4 @@ static const struct tune_params neoversen2_tunings =
   AARCH64_LDP_STP_POLICY_ALWAYS /* stp_policy_model. */
 };
 
-#endif /* GCC_AARCH64_H_NEOVERSEN2. */
+#endif /* GCC_AARCH64_H_NEOVERSEN2. */
\ No newline at end of file

From patchwork Fri Jul 26 09:21:41 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1965229
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=MYF1deqq; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=MYF1deqq; dkim-atps=neutral
Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org)
Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WVj3b5MPTz1yY5 for ; Fri, 26 Jul 2024 19:23:23 +1000 (AEST)
Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id F2A3B3870C25 for ; Fri, 26 Jul 2024 09:23:21 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from EUR05-DB8-obe.outbound.protection.outlook.com (mail-db8eur05on20600.outbound.protection.outlook.com [IPv6:2a01:111:f400:7e1a::600]) by sourceware.org (Postfix) with ESMTPS id 2D3733870C2B; Fri, 26 Jul 2024 09:22:05 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2D3733870C2B
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com
ARC-Filter: OpenARC
Filter v1.0.0 sourceware.org 2D3733870C2B
Date: Fri, 26 Jul 2024 10:21:41 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, ktkachov@gcc.gnu.org, richard.sandiford@arm.com
Subject: [PATCH 7/8]AArch64: Add Cortex-X925 core definition and cost model
Content-Disposition: inline
MIME-Version: 1.0
X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230040)(34020700016)(36860700013)(35042699022)(1800799024)(376014)(82310400026); DIR:OUT; SFP:1101;
X-OriginatorOrg: arm.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 26 Jul 2024 09:22:01.6533 (UTC)
X-MS-Exchange-CrossTenant-Network-Message-Id: 09b6fc21-1f16-4c8c-6a82-08dcad546844
X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com]
X-MS-Exchange-CrossTenant-AuthSource: DB1PEPF000509FB.eurprd03.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS2PR08MB9643
X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FORGED_SPF_HELO, GIT_PATCH_0, KAM_LOTSOFHASH, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

Hi All,

This adds a cost model and core definition for Cortex-X925.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-cores.def (cortex-x925): New.
	* config/aarch64/aarch64-tune.md: Regenerate.
	* config/aarch64/tuning_models/cortexx925.h: New file.
	* config/aarch64/aarch64.cc: Use it.
	* doc/invoke.texi: Document it.

---
--
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index d70176e86271a65a3610786064432099cd1e75ee..131dbd731a7026c3e02182ae27f6567f9cb719b2 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -190,6 +190,7 @@ AARCH64_CORE("cortex-x2", cortexx2, cortexa57, V9A, (SVE2_BITPERM, MEMTAG, I8M
 AARCH64_CORE("cortex-x3", cortexx3, cortexa57, V9A, (SVE2_BITPERM, MEMTAG, I8MM, BF16), neoversev2, 0x41, 0xd4e, -1)
 
 AARCH64_CORE("cortex-x4", cortexx4, cortexa57, V9_2A, (SVE2_BITPERM, MEMTAG, PROFILE), neoversev3, 0x41, 0xd81, -1)
+AARCH64_CORE("cortex-x925", cortexx925, cortexa57, V9_2A, (SVE2_BITPERM, MEMTAG, PROFILE), cortexx925, 0x41, 0xd85, -1)
 
 AARCH64_CORE("neoverse-n2", neoversen2, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversen2, 0x41, 0xd49, -1)
 AARCH64_CORE("cobalt-100", cobalt100, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversen2, 0x6d, 0xd49, -1)
diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
index d71c631b01c767633cc1e9c362ac51533a87c53f..4fce0c507f6c0605f8846abe03d7d77641395c5b 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-	"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88,thunderxt88p1,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,ampere1a,ampere1b,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,cortexx1c,neoversen1,ares,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,tsv110,thunderx3t110,neoversev1,zeus,neoverse512tvb,saphira,oryon1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexa510,cortexa520,cortexa710,cortexa715,cortexa720,cortexa725,cortexx2,cortexx3,cortexx4,neoversen2,cobalt100,neoversen3,neoversev2,grace,neoversev3,neoversev3ae,demeter,generic,generic_armv8_a,generic_armv9_a"
+	"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88,thunderxt88p1,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,ampere1a,ampere1b,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,cortexx1c,neoversen1,ares,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,tsv110,thunderx3t110,neoversev1,zeus,neoverse512tvb,saphira,oryon1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexa510,cortexa520,cortexa710,cortexa715,cortexa720,cortexa725,cortexx2,cortexx3,cortexx4,cortexx925,neoversen2,cobalt100,neoversen3,neoversev2,grace,neoversev3,neoversev3ae,demeter,generic,generic_armv8_a,generic_armv9_a"
 	(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 8f4abe4d560a6b5b83667946ee3a2178cfec270a..eafa377cb095f49408d8a926fb49ce13e2155ba2 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -395,6 +395,7 @@ static const struct aarch64_flag_desc aarch64_tuning_flags[] =
 #include "tuning_models/cortexa57.h"
 #include "tuning_models/cortexa72.h"
 #include "tuning_models/cortexa73.h"
+#include "tuning_models/cortexx925.h"
 #include "tuning_models/exynosm1.h"
 #include "tuning_models/thunderxt88.h"
 #include "tuning_models/thunderx.h"
diff --git a/gcc/config/aarch64/tuning_models/cortexx925.h b/gcc/config/aarch64/tuning_models/cortexx925.h
new file mode 100644
index 0000000000000000000000000000000000000000..fb95e87526985b02410d54a5a3ec8539c1b0ba6d
--- /dev/null
+++ b/gcc/config/aarch64/tuning_models/cortexx925.h
@@ -0,0 +1,246 @@
+/* Tuning model description for AArch64 architecture.
+   Copyright (C) 2009-2024 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3. If not see
+   . */
+
+#ifndef GCC_AARCH64_H_CORTEXX925
+#define GCC_AARCH64_H_CORTEXX925
+
+#include "generic.h"
+
+static const struct cpu_addrcost_table cortexx925_addrcost_table =
+{
+    {
+      1, /* hi */
+      0, /* si */
+      0, /* di */
+      1, /* ti */
+    },
+  0, /* pre_modify */
+  0, /* post_modify */
+  2, /* post_modify_ld3_st3 */
+  2, /* post_modify_ld4_st4 */
+  0, /* register_offset */
+  0, /* register_sextend */
+  0, /* register_zextend */
+  0 /* imm_offset */
+};
+
+static const struct cpu_regmove_cost cortexx925_regmove_cost =
+{
+  3, /* GP2GP */
+  /* Spilling to int<->fp instead of memory is recommended so set
+     realistic costs compared to memmov_cost. */
+  5, /* GP2FP */
+  4, /* FP2GP */
+  4 /* FP2FP */
+};
+
+static const advsimd_vec_cost cortexx925_advsimd_vector_cost =
+{
+  2, /* int_stmt_cost */
+  2, /* fp_stmt_cost */
+  2, /* ld2_st2_permute_cost */
+  2, /* ld3_st3_permute_cost */
+  3, /* ld4_st4_permute_cost */
+  2, /* permute_cost */
+  4, /* reduc_i8_cost */
+  4, /* reduc_i16_cost */
+  2, /* reduc_i32_cost */
+  2, /* reduc_i64_cost */
+  6, /* reduc_f16_cost */
+  4, /* reduc_f32_cost */
+  2, /* reduc_f64_cost */
+  2, /* store_elt_extra_cost */
+  /* This value is just inherited from the Cortex-A57 table. */
+  8, /* vec_to_scalar_cost */
+  /* This depends very much on what the scalar value is and
+     where it comes from. E.g. some constants take two dependent
+     instructions or a load, while others might be moved from a GPR.
+     4 seems to be a reasonable compromise in practice. */
+  4, /* scalar_to_vec_cost */
+  4, /* align_load_cost */
+  4, /* unalign_load_cost */
+  /* Although stores have a latency of 2 and compete for the
+     vector pipes, in practice it's better not to model that.
*/ + 1, /* unalign_store_cost */ + 1 /* store_cost */ +}; + +static const sve_vec_cost cortexx925_sve_vector_cost = +{ + { + 2, /* int_stmt_cost */ + 2, /* fp_stmt_cost */ + 2, /* ld2_st2_permute_cost */ + 3, /* ld3_st3_permute_cost */ + 3, /* ld4_st4_permute_cost */ + 2, /* permute_cost */ + /* Theoretically, a reduction involving 15 scalar ADDs could + complete in ~4 cycles and would have a cost of 15. [SU]ADDV + completes in 9 cycles, so give it a cost of 15 + 5. */ + 20, /* reduc_i8_cost */ + /* Likewise for 7 scalar ADDs (~3 cycles) vs. 8: 7 + 5. */ + 12, /* reduc_i16_cost */ + /* Likewise for 3 scalar ADDs (~2 cycles) vs. 6: 3 + 4. */ + 7, /* reduc_i32_cost */ + /* Likewise for 1 scalar ADDs (~1 cycles) vs. 2: 1 + 1. */ + 2, /* reduc_i64_cost */ + /* Theoretically, a reduction involving 7 scalar FADDs could + complete in ~6 cycles and would have a cost of 7. FADDV + completes in 8 cycles, so give it a cost of 7 + 2. */ + 9, /* reduc_f16_cost */ + /* Likewise for 3 scalar FADDs (~4 cycles) vs. 6: 3 + 2. */ + 5, /* reduc_f32_cost */ + /* Likewise for 1 scalar FADD (~2 cycles) vs. 4: 1 + 2. */ + 3, /* reduc_f64_cost */ + 2, /* store_elt_extra_cost */ + /* This value is just inherited from the Cortex-A57 table. */ + 8, /* vec_to_scalar_cost */ + /* See the comment above the Advanced SIMD versions. */ + 4, /* scalar_to_vec_cost */ + 4, /* align_load_cost */ + 4, /* unalign_load_cost */ + /* Although stores have a latency of 2 and compete for the + vector pipes, in practice it's better not to model that. */ + 1, /* unalign_store_cost */ + 1 /* store_cost */ + }, + 3, /* clast_cost */ + 10, /* fadda_f16_cost */ + 6, /* fadda_f32_cost */ + 4, /* fadda_f64_cost */ + /* A strided Advanced SIMD x64 load would take two parallel FP loads + (8 cycles) plus an insertion (2 cycles). Assume a 64-bit SVE gather + is 1 cycle more. The Advanced SIMD version is costed as 2 scalar loads + (cost 8) and a vec_construct (cost 4). 
Add a full vector operation + (cost 2) to that, to avoid the difference being lost in rounding. + + There is no easy comparison between a strided Advanced SIMD x32 load + and an SVE 32-bit gather, but cost an SVE 32-bit gather as 1 vector + operation more than a 64-bit gather. */ + 14, /* gather_load_x32_cost */ + 12, /* gather_load_x64_cost */ + 1 /* scatter_store_elt_cost */ +}; + +static const aarch64_scalar_vec_issue_info cortexx925_scalar_issue_info = +{ + 4, /* loads_stores_per_cycle */ + 2, /* stores_per_cycle */ + 8, /* general_ops_per_cycle */ + 0, /* fp_simd_load_general_ops */ + 1 /* fp_simd_store_general_ops */ +}; + +static const aarch64_advsimd_vec_issue_info cortexx925_advsimd_issue_info = +{ + { + 0, /* loads_stores_per_cycle */ + 1, /* stores_per_cycle */ + 6, /* general_ops_per_cycle */ + 0, /* fp_simd_load_general_ops */ + 1 /* fp_simd_store_general_ops */ + }, + 2, /* ld2_st2_general_ops */ + 2, /* ld3_st3_general_ops */ + 3 /* ld4_st4_general_ops */ +}; + +static const aarch64_sve_vec_issue_info cortexx925_sve_issue_info = +{ + { + { + 0, /* loads_stores_per_cycle */ + 1, /* stores_per_cycle */ + 6, /* general_ops_per_cycle */ + 0, /* fp_simd_load_general_ops */ + 1 /* fp_simd_store_general_ops */ + }, + 2, /* ld2_st2_general_ops */ + 2, /* ld3_st3_general_ops */ + 3 /* ld4_st4_general_ops */ + }, + 2, /* pred_ops_per_cycle */ + 1, /* while_pred_ops */ + 0, /* int_cmp_pred_ops */ + 0, /* fp_cmp_pred_ops */ + 1, /* gather_scatter_pair_general_ops */ + 1 /* gather_scatter_pair_pred_ops */ +}; + +static const aarch64_vec_issue_info cortexx925_vec_issue_info = +{ + &cortexx925_scalar_issue_info, + &cortexx925_advsimd_issue_info, + &cortexx925_sve_issue_info +}; + +/* Cortexx925 costs for vector insn classes. 
*/ +static const struct cpu_vector_cost cortexx925_vector_cost = +{ + 1, /* scalar_int_stmt_cost */ + 2, /* scalar_fp_stmt_cost */ + 4, /* scalar_load_cost */ + 1, /* scalar_store_cost */ + 1, /* cond_taken_branch_cost */ + 1, /* cond_not_taken_branch_cost */ + &cortexx925_advsimd_vector_cost, /* advsimd */ + &cortexx925_sve_vector_cost, /* sve */ + &cortexx925_vec_issue_info /* issue_info */ +}; + +static const struct tune_params cortexx925_tunings = +{ + &cortexa76_extra_costs, + &cortexx925_addrcost_table, + &cortexx925_regmove_cost, + &cortexx925_vector_cost, + &generic_branch_cost, + &generic_approx_modes, + SVE_128, /* sve_width */ + { 4, /* load_int. */ + 2, /* store_int. */ + 6, /* load_fp. */ + 1, /* store_fp. */ + 6, /* load_pred. */ + 2 /* store_pred. */ + }, /* memmov_cost. */ + 10, /* issue_rate */ + (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops */ + "32:16", /* function_align. */ + "4", /* jump_align. */ + "32:16", /* loop_align. */ + 4, /* int_reassoc_width. */ + 6, /* fp_reassoc_width. */ + 4, /* fma_reassoc_width. */ + 3, /* vec_reassoc_width. */ + 2, /* min_div_recip_mul_sf. */ + 2, /* min_div_recip_mul_df. */ + 0, /* max_case_values. */ + tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */ + (AARCH64_EXTRA_TUNE_CHEAP_SHIFT_EXTEND + | AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS + | AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS + | AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT + | AARCH64_EXTRA_TUNE_AVOID_PRED_RMW), /* tune_flags. */ + &generic_prefetch_tune, + AARCH64_LDP_STP_POLICY_ALWAYS, /* ldp_policy_model. */ + AARCH64_LDP_STP_POLICY_ALWAYS /* stp_policy_model. */ +}; + +#endif /* GCC_AARCH64_H_CORTEXX925. */ \ No newline at end of file diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 13a3b75aa22da99422bce1fecc17174f97e811a1..f15f66d6fa2fae2bbf8e9a73a2039fcf62c6c0f6 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -21525,7 +21525,7 @@ performance of the code. 
Permissible values for this option are:
 @samp{neoverse-512tvb}, @samp{neoverse-e1}, @samp{neoverse-n1},
 @samp{neoverse-n2}, @samp{neoverse-v1}, @samp{neoverse-v2}, @samp{grace},
 @samp{neoverse-v3}, @samp{neoverse-v3ae}, @samp{neoverse-n3},
-@samp{cortex-a725},
+@samp{cortex-a725}, @samp{cortex-x925},
 @samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan},
 @samp{octeontx}, @samp{octeontx81}, @samp{octeontx83},
 @samp{octeontx2}, @samp{octeontx2t98}, @samp{octeontx2t96}

From patchwork Fri Jul 26 09:21:59 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1965228
Date: Fri, 26 Jul 2024 10:21:59 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, ktkachov@gcc.gnu.org, richard.sandiford@arm.com
Subject: [PATCH 8/8]AArch64: take gather/scatter decode overhead into account

Hi All,

Gathers and scatters are not usually beneficial when the loop count is small.
This is because there is not only a cost to executing them within the loop, but
also a cost to entering a loop that contains them.

This patch models that overhead.  For generic tuning, however, we still prefer
gathers/scatters when the loop costs work out.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
This improves performance of Exchange in SPECCPU 2017 by 3% with SVE enabled.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-protos.h (struct sve_vec_cost): Add
	gather_load_x32_init_cost and gather_load_x64_init_cost.
	* config/aarch64/aarch64.cc (aarch64_vector_costs): Add
	m_sve_gather_scatter_x32 and m_sve_gather_scatter_x64.
	(aarch64_vector_costs::add_stmt_cost): Use them.
	(aarch64_vector_costs::finish_cost): Likewise.
	* config/aarch64/tuning_models/a64fx.h: Update.
	* config/aarch64/tuning_models/cortexx925.h: Update.
	* config/aarch64/tuning_models/generic.h: Update.
	* config/aarch64/tuning_models/generic_armv8_a.h: Update.
	* config/aarch64/tuning_models/generic_armv9_a.h: Update.
	* config/aarch64/tuning_models/neoverse512tvb.h: Update.
	* config/aarch64/tuning_models/neoversen2.h: Update.
	* config/aarch64/tuning_models/neoversen3.h: Update.
	* config/aarch64/tuning_models/neoversev1.h: Update.
	* config/aarch64/tuning_models/neoversev2.h: Update.
	* config/aarch64/tuning_models/neoversev3.h: Update.
	* config/aarch64/tuning_models/neoversev3ae.h: Update.
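
To make the effect of a one-off entry cost on short loops concrete, here is a minimal break-even sketch. It is not GCC's vectorizer cost model: the `LoopCosts` struct and the three helper functions are hypothetical names made up for this illustration, and the model assumes a fully predicated vector loop whose only costs are a per-iteration cost and a single entry cost.

```cpp
#include <cassert>

// Hypothetical, simplified cost model; NOT GCC's actual vectorizer costing.
struct LoopCosts {
  unsigned scalar_iter_cost;  // cost of one scalar iteration
  unsigned vector_iter_cost;  // cost of one vector iteration (e.g. one gather)
  unsigned vector_init_cost;  // one-off cost charged when the loop is entered
  unsigned vf;                // elements processed per vector iteration
};

// Total cost of running n iterations as scalar code.
inline unsigned scalar_total(const LoopCosts &c, unsigned n) {
  return c.scalar_iter_cost * n;
}

// Total cost of running n elements through the vector loop: the entry cost
// is paid once, the per-iteration cost once per (rounded-up) vector iteration.
inline unsigned vector_total(const LoopCosts &c, unsigned n) {
  assert(c.vf != 0);
  unsigned vec_iters = (n + c.vf - 1) / c.vf;
  return c.vector_init_cost + c.vector_iter_cost * vec_iters;
}

// The vector loop is chosen only if it is strictly cheaper.
inline bool vector_profitable(const LoopCosts &c, unsigned n) {
  return vector_total(c, n) < scalar_total(c, n);
}
```

Taking `LoopCosts{4, 14, 42, 4}` purely as illustrative numbers (14 and 42 are the x32 gather and init costs from the cortexx925 table in this series; the scalar cost of 4 and the factor of 4 elements per iteration are assumptions), an 8-iteration loop costs 32 as scalar code and 42 + 2 × 14 = 70 as vector code, so the scalar loop wins. At 100 iterations the vector loop costs 42 + 25 × 14 = 392 against 400 and becomes profitable. With the init cost set to 0, the 8-iteration gather loop would look profitable (28 against 32).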
---
--
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 42639e9efcf1e0f9362f759ae63a31b8eeb0d581..16eb8edab4d9fdfc6e3672c56ef5c9f6962d0c0b 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -262,6 +262,8 @@ struct sve_vec_cost : simd_vec_cost
 			  unsigned int fadda_f64_cost,
 			  unsigned int gather_load_x32_cost,
 			  unsigned int gather_load_x64_cost,
+			  unsigned int gather_load_x32_init_cost,
+			  unsigned int gather_load_x64_init_cost,
 			  unsigned int scatter_store_elt_cost)
     : simd_vec_cost (base),
       clast_cost (clast_cost),
@@ -270,6 +272,8 @@ struct sve_vec_cost : simd_vec_cost
       fadda_f64_cost (fadda_f64_cost),
       gather_load_x32_cost (gather_load_x32_cost),
       gather_load_x64_cost (gather_load_x64_cost),
+      gather_load_x32_init_cost (gather_load_x32_init_cost),
+      gather_load_x64_init_cost (gather_load_x64_init_cost),
       scatter_store_elt_cost (scatter_store_elt_cost)
   {}
 
@@ -289,6 +293,12 @@ struct sve_vec_cost : simd_vec_cost
   const int gather_load_x32_cost;
   const int gather_load_x64_cost;
 
+  /* Additional loop initialization cost of using a gather load instruction. The x32
+     value is for loads of 32-bit elements and the x64 value is for loads of
+     64-bit elements. */
+  const int gather_load_x32_init_cost;
+  const int gather_load_x64_init_cost;
+
   /* The per-element cost of a scatter store. */
   const int scatter_store_elt_cost;
 };
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index eafa377cb095f49408d8a926fb49ce13e2155ba2..1e14c3c0d24b449d404724e436ba57e1996ec062 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -16227,6 +16227,12 @@ private:
      supported by Advanced SIMD and SVE2. */
   bool m_has_avg = false;
 
+  /* This loop uses an SVE 32-bit element gather or scatter operation. */
+  bool m_sve_gather_scatter_x32 = false;
+
+  /* This loop uses an SVE 64-bit element gather or scatter operation. */
+  bool m_sve_gather_scatter_x64 = false;
+
   /* True if the vector body contains a store to a decl and if the
      function is known to have a vld1 from the same decl.
 
@@ -17291,6 +17297,17 @@ aarch64_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
       stmt_cost = aarch64_detect_vector_stmt_subtype (m_vinfo, kind,
 						      stmt_info, vectype,
 						      where, stmt_cost);
+
+      /* Check if we've seen an SVE gather/scatter operation and which size. */
+      if (kind == scalar_load
+	  && aarch64_sve_mode_p (TYPE_MODE (vectype))
+	  && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_GATHER_SCATTER)
+	{
+	  if (GET_MODE_UNIT_BITSIZE (TYPE_MODE (vectype)) == 64)
+	    m_sve_gather_scatter_x64 = true;
+	  else
+	    m_sve_gather_scatter_x32 = true;
+	}
     }
 
   /* Do any SVE-specific adjustments to the cost. */
@@ -17676,6 +17693,18 @@ aarch64_vector_costs::finish_cost (const vector_costs *uncast_scalar_costs)
       m_costs[vect_body] = adjust_body_cost (loop_vinfo, scalar_costs,
 					     m_costs[vect_body]);
       m_suggested_unroll_factor = determine_suggested_unroll_factor ();
+
+      /* For gather and scatters there's an additional overhead for the first
+	 iteration. For low count loops they're not beneficial so model the
+	 overhead as loop prologue costs. */
+      if (m_sve_gather_scatter_x32 || m_sve_gather_scatter_x64)
+	{
+	  const sve_vec_cost *sve_costs = aarch64_tune_params.vec_costs->sve;
+	  if (m_sve_gather_scatter_x32)
+	    m_costs[vect_prologue] += sve_costs->gather_load_x32_init_cost;
+	  else
+	    m_costs[vect_prologue] += sve_costs->gather_load_x64_init_cost;
+	}
     }
 
   /* Apply the heuristic described above m_stp_sequence_cost. Prefer
diff --git a/gcc/config/aarch64/tuning_models/a64fx.h b/gcc/config/aarch64/tuning_models/a64fx.h
index 6091289d4c3c66f01d7e4dbf97a85c1f8c40bb0b..378a1b3889ee265859786c1ff6525fce2305b615 100644
--- a/gcc/config/aarch64/tuning_models/a64fx.h
+++ b/gcc/config/aarch64/tuning_models/a64fx.h
@@ -104,6 +104,8 @@ static const sve_vec_cost a64fx_sve_vector_cost =
   13, /* fadda_f64_cost */
   64, /* gather_load_x32_cost */
   32, /* gather_load_x64_cost */
+  0, /* gather_load_x32_init_cost */
+  0, /* gather_load_x64_init_cost */
   1 /* scatter_store_elt_cost */
 };
 
diff --git a/gcc/config/aarch64/tuning_models/cortexx925.h b/gcc/config/aarch64/tuning_models/cortexx925.h
index fb95e87526985b02410d54a5a3ec8539c1b0ba6d..c4206018a3ff707f89ff3300700ec7dc2a5bc6b0 100644
--- a/gcc/config/aarch64/tuning_models/cortexx925.h
+++ b/gcc/config/aarch64/tuning_models/cortexx925.h
@@ -135,6 +135,8 @@ static const sve_vec_cost cortexx925_sve_vector_cost =
      operation more than a 64-bit gather. */
   14, /* gather_load_x32_cost */
   12, /* gather_load_x64_cost */
+  42, /* gather_load_x32_init_cost */
+  24, /* gather_load_x64_init_cost */
   1 /* scatter_store_elt_cost */
 };
 
diff --git a/gcc/config/aarch64/tuning_models/generic.h b/gcc/config/aarch64/tuning_models/generic.h
index 2b1f68b3052117814161a32f426422736ad6462b..101969bdbb9ccf7eafbd9a1cd6e25f0b584fb261 100644
--- a/gcc/config/aarch64/tuning_models/generic.h
+++ b/gcc/config/aarch64/tuning_models/generic.h
@@ -105,6 +105,8 @@ static const sve_vec_cost generic_sve_vector_cost =
   2, /* fadda_f64_cost */
   4, /* gather_load_x32_cost */
   2, /* gather_load_x64_cost */
+  12, /* gather_load_x32_init_cost */
+  4, /* gather_load_x64_init_cost */
   1 /* scatter_store_elt_cost */
 };
 
diff --git a/gcc/config/aarch64/tuning_models/generic_armv8_a.h b/gcc/config/aarch64/tuning_models/generic_armv8_a.h
index b38b9a8c5cad7d12aa38afdb610a14a25e755010..b5088afe068aa4be7f9dd614cfdd2a51fa96e524 100644
--- a/gcc/config/aarch64/tuning_models/generic_armv8_a.h
+++ b/gcc/config/aarch64/tuning_models/generic_armv8_a.h
@@ -106,6 +106,8 @@ static const sve_vec_cost generic_armv8_a_sve_vector_cost =
   2, /* fadda_f64_cost */
   4, /* gather_load_x32_cost */
   2, /* gather_load_x64_cost */
+  12, /* gather_load_x32_init_cost */
+  4, /* gather_load_x64_init_cost */
   1 /* scatter_store_elt_cost */
 };
 
diff --git a/gcc/config/aarch64/tuning_models/generic_armv9_a.h b/gcc/config/aarch64/tuning_models/generic_armv9_a.h
index b39a0c73db910888168790888d24ddf4406bf1ee..fd72de542862909ccb9a9260a16bb01935d97f36 100644
--- a/gcc/config/aarch64/tuning_models/generic_armv9_a.h
+++ b/gcc/config/aarch64/tuning_models/generic_armv9_a.h
@@ -136,6 +136,8 @@ static const sve_vec_cost generic_armv9_a_sve_vector_cost =
      operation more than a 64-bit gather. */
   14, /* gather_load_x32_cost */
   12, /* gather_load_x64_cost */
+  42, /* gather_load_x32_init_cost */
+  24, /* gather_load_x64_init_cost */
   3 /* scatter_store_elt_cost */
 };
 
diff --git a/gcc/config/aarch64/tuning_models/neoverse512tvb.h b/gcc/config/aarch64/tuning_models/neoverse512tvb.h
index 825c6a64990b72cda3641737957dc94d75db1509..d2a0b647791de8fca6d7684849d2ab1e9104b045 100644
--- a/gcc/config/aarch64/tuning_models/neoverse512tvb.h
+++ b/gcc/config/aarch64/tuning_models/neoverse512tvb.h
@@ -79,6 +79,8 @@ static const sve_vec_cost neoverse512tvb_sve_vector_cost =
      operation more than a 64-bit gather. */
   14, /* gather_load_x32_cost */
   12, /* gather_load_x64_cost */
+  42, /* gather_load_x32_init_cost */
+  24, /* gather_load_x64_init_cost */
   3 /* scatter_store_elt_cost */
 };
 
diff --git a/gcc/config/aarch64/tuning_models/neoversen2.h b/gcc/config/aarch64/tuning_models/neoversen2.h
index 3430eb9c06819e00ab38966bb960bd6525ff2b5c..00d2c12e739ffd371dd4720826894e980d577ca7 100644
--- a/gcc/config/aarch64/tuning_models/neoversen2.h
+++ b/gcc/config/aarch64/tuning_models/neoversen2.h
@@ -135,6 +135,8 @@ static const sve_vec_cost neoversen2_sve_vector_cost =
      operation more than a 64-bit gather. */
   14, /* gather_load_x32_cost */
   12, /* gather_load_x64_cost */
+  42, /* gather_load_x32_init_cost */
+  24, /* gather_load_x64_init_cost */
   3 /* scatter_store_elt_cost */
 };
 
diff --git a/gcc/config/aarch64/tuning_models/neoversen3.h b/gcc/config/aarch64/tuning_models/neoversen3.h
index 7438e39a4bbe43de624b63fdd20d3fde9dfb6fc9..fc4333ffdeaef0115ac162e2da9d8d548bacf576 100644
--- a/gcc/config/aarch64/tuning_models/neoversen3.h
+++ b/gcc/config/aarch64/tuning_models/neoversen3.h
@@ -135,6 +135,8 @@ static const sve_vec_cost neoversen3_sve_vector_cost =
      operation more than a 64-bit gather. */
   14, /* gather_load_x32_cost */
   12, /* gather_load_x64_cost */
+  42, /* gather_load_x32_init_cost */
+  24, /* gather_load_x64_init_cost */
   1 /* scatter_store_elt_cost */
 };
 
diff --git a/gcc/config/aarch64/tuning_models/neoversev1.h b/gcc/config/aarch64/tuning_models/neoversev1.h
index 0fc41ce6a41b3135fa06d2bda1f517fdf4f8dbcf..705ed025730f6683109a4796c6eefa55b437cec9 100644
--- a/gcc/config/aarch64/tuning_models/neoversev1.h
+++ b/gcc/config/aarch64/tuning_models/neoversev1.h
@@ -126,6 +126,8 @@ static const sve_vec_cost neoversev1_sve_vector_cost =
   8, /* fadda_f64_cost */
   32, /* gather_load_x32_cost */
   16, /* gather_load_x64_cost */
+  96, /* gather_load_x32_init_cost */
+  32, /* gather_load_x64_init_cost */
   3 /* scatter_store_elt_cost */
 };
 
diff --git a/gcc/config/aarch64/tuning_models/neoversev2.h b/gcc/config/aarch64/tuning_models/neoversev2.h
index cca459e32c1384f57f8345d86b42b7814ae44115..680feeb9e4ee7bf21d5a258d83e522e079fdc156 100644
--- a/gcc/config/aarch64/tuning_models/neoversev2.h
+++ b/gcc/config/aarch64/tuning_models/neoversev2.h
@@ -135,6 +135,8 @@ static const sve_vec_cost neoversev2_sve_vector_cost =
      operation more than a 64-bit gather. */
   14, /* gather_load_x32_cost */
   12, /* gather_load_x64_cost */
+  42, /* gather_load_x32_init_cost */
+  24, /* gather_load_x64_init_cost */
   3 /* scatter_store_elt_cost */
 };
 
diff --git a/gcc/config/aarch64/tuning_models/neoversev3.h b/gcc/config/aarch64/tuning_models/neoversev3.h
index 3daa3d2365c817d03c6c0d5e66fe832620d8fb2c..812c6ad304e8d4c503dcd444437bf6528d6f3176 100644
--- a/gcc/config/aarch64/tuning_models/neoversev3.h
+++ b/gcc/config/aarch64/tuning_models/neoversev3.h
@@ -135,6 +135,8 @@ static const sve_vec_cost neoversev3_sve_vector_cost =
      operation more than a 64-bit gather. */
   14, /* gather_load_x32_cost */
   12, /* gather_load_x64_cost */
+  42, /* gather_load_x32_init_cost */
+  24, /* gather_load_x64_init_cost */
   1 /* scatter_store_elt_cost */
 };
 
diff --git a/gcc/config/aarch64/tuning_models/neoversev3ae.h b/gcc/config/aarch64/tuning_models/neoversev3ae.h
index 29c6f22e941b26ee333c87b9fac22aea86625e97..280b5abb27d3c9f404d5f96f14d0cba1e13b9bd1 100644
--- a/gcc/config/aarch64/tuning_models/neoversev3ae.h
+++ b/gcc/config/aarch64/tuning_models/neoversev3ae.h
@@ -135,6 +135,8 @@ static const sve_vec_cost neoversev3ae_sve_vector_cost =
      operation more than a 64-bit gather. */
   14, /* gather_load_x32_cost */
   12, /* gather_load_x64_cost */
+  42, /* gather_load_x32_init_cost */
+  24, /* gather_load_x64_init_cost */
   1 /* scatter_store_elt_cost */
 };