From patchwork Fri Jul 26 09:20:16 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1965227
Date: Fri, 26 Jul 2024 10:20:16 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, ktkachov@gcc.gnu.org, richard.sandiford@arm.com
Subject: [PATCH 2/8]AArch64: Add Neoverse V3 core definition and cost model
List-Id: Gcc-patches mailing list

Hi All,

This adds a cost model and core definition for Neoverse V3.

It also makes Cortex-X4 use the Neoverse V3 cost model.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-cores.def (cortex-x4): Update.
	(neoverse-v3): New.
	* config/aarch64/aarch64-tune.md: Regenerate.
	* config/aarch64/tuning_models/neoversev3.h: New file.
	* config/aarch64/aarch64.cc: Use it.
	* doc/invoke.texi: Document it.

---

--
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 34307fe0c1721dda67adab768dd22a5649687f6e..96c74657a1991acfe86d7c61af4ccce7415fabca 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -188,13 +188,14 @@ AARCH64_CORE("cortex-x2", cortexx2, cortexa57, V9A, (SVE2_BITPERM, MEMTAG, I8M
 AARCH64_CORE("cortex-x3", cortexx3, cortexa57, V9A, (SVE2_BITPERM, MEMTAG, I8MM, BF16), neoversev2, 0x41, 0xd4e, -1)
-AARCH64_CORE("cortex-x4", cortexx4, cortexa57, V9_2A, (SVE2_BITPERM, MEMTAG, PROFILE), neoversen2, 0x41, 0xd81, -1)
+AARCH64_CORE("cortex-x4", cortexx4, cortexa57, V9_2A, (SVE2_BITPERM, MEMTAG, PROFILE), neoversev3, 0x41, 0xd81, -1)
 AARCH64_CORE("neoverse-n2", neoversen2, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversen2, 0x41, 0xd49, -1)
 AARCH64_CORE("cobalt-100", cobalt100, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversen2, 0x6d, 0xd49, -1)
 AARCH64_CORE("neoverse-v2", neoversev2, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversev2, 0x41, 0xd4f, -1)
 AARCH64_CORE("grace", grace, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, SVE2_AES, SVE2_SHA3, SVE2_SM4, PROFILE), neoversev2, 0x41, 0xd4f, -1)
+AARCH64_CORE("neoverse-v3", neoversev3, cortexa57, V9_2A, (SVE2_BITPERM, RNG, LS64, MEMTAG, PROFILE), neoversev3, 0x41, 0xd84, -1)
 AARCH64_CORE("demeter", demeter, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG, MEMTAG, PROFILE), neoversev2, 0x41, 0xd4f, -1)
diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
index 719fd3dc62a5860aad3aa92785413892e46f8816..0c3339b53e425ac36387eb63a0005a25c0c064e7 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-	"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88,thunderxt88p1,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,ampere1a,ampere1b,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,cortexx1c,neoversen1,ares,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,tsv110,thunderx3t110,neoversev1,zeus,neoverse512tvb,saphira,oryon1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexa510,cortexa520,cortexa710,cortexa715,cortexa720,cortexx2,cortexx3,cortexx4,neoversen2,cobalt100,neoversev2,grace,demeter,generic,generic_armv8_a,generic_armv9_a"
+	"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88,thunderxt88p1,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,ampere1a,ampere1b,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,cortexx1c,neoversen1,ares,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,tsv110,thunderx3t110,neoversev1,zeus,neoverse512tvb,saphira,oryon1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexa510,cortexa520,cortexa710,cortexa715,cortexa720,cortexx2,cortexx3,cortexx4,neoversen2,cobalt100,neoversev2,grace,neoversev3,demeter,generic,generic_armv8_a,generic_armv9_a"
 	(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 89eb66348f772a7e94f1acde29cd4badfd51fa3d..569d4a3d16fb9846b89ebbc895cb169a6007a24a 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -413,6 +413,7 @@ static const struct aarch64_flag_desc aarch64_tuning_flags[] =
 #include "tuning_models/neoverse512tvb.h"
 #include "tuning_models/neoversen2.h"
 #include "tuning_models/neoversev2.h"
+#include "tuning_models/neoversev3.h"
 #include "tuning_models/a64fx.h"
 /* Support for fine-grained override of the tuning structures. */
diff --git a/gcc/config/aarch64/tuning_models/neoversev3.h b/gcc/config/aarch64/tuning_models/neoversev3.h
new file mode 100644
index 0000000000000000000000000000000000000000..3daa3d2365c817d03c6c0d5e66fe832620d8fb2c
--- /dev/null
+++ b/gcc/config/aarch64/tuning_models/neoversev3.h
@@ -0,0 +1,246 @@
+/* Tuning model description for AArch64 architecture.
+   Copyright (C) 2009-2024 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3. If not see
+   <http://www.gnu.org/licenses/>. */
+
+#ifndef GCC_AARCH64_H_NEOVERSEV3
+#define GCC_AARCH64_H_NEOVERSEV3
+
+#include "generic.h"
+
+static const struct cpu_addrcost_table neoversev3_addrcost_table =
+{
+  {
+    1, /* hi */
+    0, /* si */
+    0, /* di */
+    1, /* ti */
+  },
+  0, /* pre_modify */
+  0, /* post_modify */
+  2, /* post_modify_ld3_st3 */
+  2, /* post_modify_ld4_st4 */
+  0, /* register_offset */
+  0, /* register_sextend */
+  0, /* register_zextend */
+  0 /* imm_offset */
+};
+
+static const struct cpu_regmove_cost neoversev3_regmove_cost =
+{
+  3, /* GP2GP */
+  /* Spilling to int<->fp instead of memory is recommended so set
+     realistic costs compared to memmov_cost. */
+  5, /* GP2FP */
+  4, /* FP2GP */
+  4 /* FP2FP */
+};
+
+static const advsimd_vec_cost neoversev3_advsimd_vector_cost =
+{
+  2, /* int_stmt_cost */
+  2, /* fp_stmt_cost */
+  2, /* ld2_st2_permute_cost */
+  2, /* ld3_st3_permute_cost */
+  3, /* ld4_st4_permute_cost */
+  2, /* permute_cost */
+  4, /* reduc_i8_cost */
+  4, /* reduc_i16_cost */
+  2, /* reduc_i32_cost */
+  2, /* reduc_i64_cost */
+  6, /* reduc_f16_cost */
+  4, /* reduc_f32_cost */
+  2, /* reduc_f64_cost */
+  2, /* store_elt_extra_cost */
+  /* This value is just inherited from the Cortex-A57 table. */
+  8, /* vec_to_scalar_cost */
+  /* This depends very much on what the scalar value is and
+     where it comes from. E.g. some constants take two dependent
+     instructions or a load, while others might be moved from a GPR.
+     4 seems to be a reasonable compromise in practice. */
+  4, /* scalar_to_vec_cost */
+  4, /* align_load_cost */
+  4, /* unalign_load_cost */
+  /* Although stores have a latency of 2 and compete for the
+     vector pipes, in practice it's better not to model that. */
+  1, /* unalign_store_cost */
+  1 /* store_cost */
+};
+
+static const sve_vec_cost neoversev3_sve_vector_cost =
+{
+  {
+    2, /* int_stmt_cost */
+    2, /* fp_stmt_cost */
+    2, /* ld2_st2_permute_cost */
+    3, /* ld3_st3_permute_cost */
+    3, /* ld4_st4_permute_cost */
+    2, /* permute_cost */
+    /* Theoretically, a reduction involving 15 scalar ADDs could
+       complete in ~4 cycles and would have a cost of 15. [SU]ADDV
+       completes in 9 cycles, so give it a cost of 15 + 5. */
+    20, /* reduc_i8_cost */
+    /* Likewise for 7 scalar ADDs (~3 cycles) vs. 8: 7 + 5. */
+    12, /* reduc_i16_cost */
+    /* Likewise for 3 scalar ADDs (~2 cycles) vs. 6: 3 + 4. */
+    7, /* reduc_i32_cost */
+    /* Likewise for 1 scalar ADDs (~1 cycles) vs. 2: 1 + 1. */
+    2, /* reduc_i64_cost */
+    /* Theoretically, a reduction involving 7 scalar FADDs could
+       complete in ~6 cycles and would have a cost of 7. FADDV
+       completes in 8 cycles, so give it a cost of 7 + 2. */
+    9, /* reduc_f16_cost */
+    /* Likewise for 3 scalar FADDs (~4 cycles) vs. 6: 3 + 2. */
+    5, /* reduc_f32_cost */
+    /* Likewise for 1 scalar FADD (~2 cycles) vs. 4: 1 + 2. */
+    3, /* reduc_f64_cost */
+    2, /* store_elt_extra_cost */
+    /* This value is just inherited from the Cortex-A57 table. */
+    8, /* vec_to_scalar_cost */
+    /* See the comment above the Advanced SIMD versions. */
+    4, /* scalar_to_vec_cost */
+    4, /* align_load_cost */
+    4, /* unalign_load_cost */
+    /* Although stores have a latency of 2 and compete for the
+       vector pipes, in practice it's better not to model that. */
+    1, /* unalign_store_cost */
+    1 /* store_cost */
+  },
+  3, /* clast_cost */
+  10, /* fadda_f16_cost */
+  6, /* fadda_f32_cost */
+  4, /* fadda_f64_cost */
+  /* A strided Advanced SIMD x64 load would take two parallel FP loads
+     (8 cycles) plus an insertion (2 cycles). Assume a 64-bit SVE gather
+     is 1 cycle more. The Advanced SIMD version is costed as 2 scalar loads
+     (cost 8) and a vec_construct (cost 4). Add a full vector operation
+     (cost 2) to that, to avoid the difference being lost in rounding.
+
+     There is no easy comparison between a strided Advanced SIMD x32 load
+     and an SVE 32-bit gather, but cost an SVE 32-bit gather as 1 vector
+     operation more than a 64-bit gather. */
+  14, /* gather_load_x32_cost */
+  12, /* gather_load_x64_cost */
+  1 /* scatter_store_elt_cost */
+};
+
+static const aarch64_scalar_vec_issue_info neoversev3_scalar_issue_info =
+{
+  3, /* loads_stores_per_cycle */
+  2, /* stores_per_cycle */
+  8, /* general_ops_per_cycle */
+  0, /* fp_simd_load_general_ops */
+  1 /* fp_simd_store_general_ops */
+};
+
+static const aarch64_advsimd_vec_issue_info neoversev3_advsimd_issue_info =
+{
+  {
+    0, /* loads_stores_per_cycle */
+    1, /* stores_per_cycle */
+    4, /* general_ops_per_cycle */
+    0, /* fp_simd_load_general_ops */
+    1 /* fp_simd_store_general_ops */
+  },
+  2, /* ld2_st2_general_ops */
+  2, /* ld3_st3_general_ops */
+  3 /* ld4_st4_general_ops */
+};
+
+static const aarch64_sve_vec_issue_info neoversev3_sve_issue_info =
+{
+  {
+    {
+      0, /* loads_stores_per_cycle */
+      1, /* stores_per_cycle */
+      4, /* general_ops_per_cycle */
+      0, /* fp_simd_load_general_ops */
+      1 /* fp_simd_store_general_ops */
+    },
+    2, /* ld2_st2_general_ops */
+    2, /* ld3_st3_general_ops */
+    3 /* ld4_st4_general_ops */
+  },
+  2, /* pred_ops_per_cycle */
+  1, /* while_pred_ops */
+  0, /* int_cmp_pred_ops */
+  0, /* fp_cmp_pred_ops */
+  1, /* gather_scatter_pair_general_ops */
+  1 /* gather_scatter_pair_pred_ops */
+};
+
+static const aarch64_vec_issue_info neoversev3_vec_issue_info =
+{
+  &neoversev3_scalar_issue_info,
+  &neoversev3_advsimd_issue_info,
+  &neoversev3_sve_issue_info
+};
+
+/* Neoversev3 costs for vector insn classes. */
+static const struct cpu_vector_cost neoversev3_vector_cost =
+{
+  1, /* scalar_int_stmt_cost */
+  2, /* scalar_fp_stmt_cost */
+  4, /* scalar_load_cost */
+  1, /* scalar_store_cost */
+  1, /* cond_taken_branch_cost */
+  1, /* cond_not_taken_branch_cost */
+  &neoversev3_advsimd_vector_cost, /* advsimd */
+  &neoversev3_sve_vector_cost, /* sve */
+  &neoversev3_vec_issue_info /* issue_info */
+};
+
+static const struct tune_params neoversev3_tunings =
+{
+  &cortexa76_extra_costs,
+  &neoversev3_addrcost_table,
+  &neoversev3_regmove_cost,
+  &neoversev3_vector_cost,
+  &generic_branch_cost,
+  &generic_approx_modes,
+  SVE_128, /* sve_width */
+  { 4, /* load_int. */
+    2, /* store_int. */
+    6, /* load_fp. */
+    1, /* store_fp. */
+    6, /* load_pred. */
+    2 /* store_pred. */
+  }, /* memmov_cost. */
+  10, /* issue_rate */
+  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops */
+  "32:16", /* function_align. */
+  "4", /* jump_align. */
+  "32:16", /* loop_align. */
+  4, /* int_reassoc_width. */
+  6, /* fp_reassoc_width. */
+  4, /* fma_reassoc_width. */
+  3, /* vec_reassoc_width. */
+  2, /* min_div_recip_mul_sf. */
+  2, /* min_div_recip_mul_df. */
+  0, /* max_case_values. */
+  tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */
+  (AARCH64_EXTRA_TUNE_CHEAP_SHIFT_EXTEND
+   | AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS
+   | AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS
+   | AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT
+   | AARCH64_EXTRA_TUNE_AVOID_PRED_RMW), /* tune_flags. */
+  &generic_prefetch_tune,
+  AARCH64_LDP_STP_POLICY_ALWAYS, /* ldp_policy_model. */
+  AARCH64_LDP_STP_POLICY_ALWAYS /* stp_policy_model. */
+};
+
+#endif /* GCC_AARCH64_H_NEOVERSEV3. */
\ No newline at end of file
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 403ea9da1abd5a012d0b18849852604b10689682..ffcf4f146d92d410c6b515b3b80f07bdec1d2b55 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -21524,6 +21524,7 @@ performance of the code. Permissible values for this option are:
 @samp{oryon-1},
 @samp{neoverse-512tvb}, @samp{neoverse-e1}, @samp{neoverse-n1},
 @samp{neoverse-n2}, @samp{neoverse-v1}, @samp{neoverse-v2}, @samp{grace},
+@samp{neoverse-v3},
 @samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan},
 @samp{octeontx}, @samp{octeontx81}, @samp{octeontx83},
 @samp{octeontx2}, @samp{octeontx2t98}, @samp{octeontx2t96}