From patchwork Thu Oct 3 09:51:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saurabh Jha X-Patchwork-Id: 1992273 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=BcA/S9AZ; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=BcA/S9AZ; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XK6Rs5TzQz1xt1 for ; Thu, 3 Oct 2024 19:52:57 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9A913384A82F for ; Thu, 3 Oct 2024 09:52:55 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR03-AM7-obe.outbound.protection.outlook.com (mail-am7eur03on20600.outbound.protection.outlook.com [IPv6:2a01:111:f403:260e::600]) by sourceware.org (Postfix) with ESMTPS id 977B1386544D for ; Thu, 3 Oct 2024 09:52:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 977B1386544D Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 977B1386544D Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=2a01:111:f403:260e::600 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1727949156; cv=pass; b=PjmcAEJVaUCGvbo/o+7j+/w+U8AzDYhB7IP9iAsnjIJAr0h4vyWasik3Aqa6YqC4HL2ZPS9o0/GrJXj0HJPGYaeBGkuZZsVKOdS5BY+x2y5Vao8TGwwV/pNmcLE2XOBInY/NJ3PMrHhwhIusuQZeVqTZoWIV3u+RlA+AByAEcu4= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1727949156; c=relaxed/simple; bh=9F0C8t741KiWikytl6O9LSRWID7EirhadpaWCcTGiJ4=; h=DKIM-Signature:DKIM-Signature:From:To:Subject:Date:Message-ID: MIME-Version; b=PcwBuGsmI810Z2Qv4VI9PvJ3VCATjVfVky2sY86XrdWzqXVkrLee7HSm57p8BzSX2OlGpd+5VGk9llyzSDlDWuhpVq9els19y5VUVJ2YnGO//wjWBcXD3f4+IeWdoDSW2MPjE51eZjaip77xX+MrKWQv/owow6mVey4WHfyfASA= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=pass; b=dDZTvelmVk0R+3PNrg3h5YEUkYhUedMp5GI0BZZfwkNZi0Z6Ce7G0LQt83E6NYX6ABRzacYGmMNEdD56ZDTebypB3LmgLpy72BVJR5icy2q8PnDz6KhKFWI8AWe9+JG8Mr7HiIYo5iloBkCyD2TR5+RiknolybFEedottglnCW97XvgH1CcyR4Zq6jIHE6RANnvuXJRUZCvuIOYVb5AxDbtxwWcaKiu7KJHaMxpPxwEk27OkLBLIfjq2V7nI+UIPaO4ZaozLtYyntyVw1TfN8TNDdyf30yI9y2tAZHC02rMrvyGe/qhdOO0E2E7bU/gIHRQopcsDUWV6Dw3aIcaygA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ByqbyexuLBBPuTT36A5kiRwaWNu53fpN/aGRjisxS9Q=; b=CH7dbmj05J/8XqmhwxWBjSqR2DJMj2Xn9YYocbfA1HFmBVfDy52kp2/O8PJs4C8ucZ/FQvRbRlN5355ZSbizt8i9t1OZacnM9ZneTg9NhgkoUWY/n1j0iTrb2HnC+ZBsbsQPji5mXlmvdOtj8N9WtrxuQYg+Tim3k0mOJNA6uDuKyDYh1ZxtV5afz2Q/voDyVcvMqBvh6YUfjMfSs6+lM23Cw6lRNOu1xJFaadf8oQOki05JHzG8UidIChk8tYU116dY1/uT+hzBKSsvCZGY6ozgNn/CUAnlvuiAvvo9DhrWON4kuHmNnXQtB1wnV0goBCIZG5v8Wcurid4c2ShJKw== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=arm.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ByqbyexuLBBPuTT36A5kiRwaWNu53fpN/aGRjisxS9Q=; b=BcA/S9AZW1H87Z54wTVNFmlumChByjftnnr8WaEdap8CI341hrxaO/VrULiIy8elAJR2EZk9QlETyzX+pBEvGrAi307oKujYocy5zsA4h0PytPkg0IgeSMbWmwZ/Dzwaj+jHnvOlw5nxavRfzoVqskPuAEPwYZdGMnhzuvO1+OY= Received: from DB8PR04CA0016.eurprd04.prod.outlook.com (2603:10a6:10:110::26) by DB9PR08MB9515.eurprd08.prod.outlook.com (2603:10a6:10:453::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.18; Thu, 3 Oct 2024 09:52:28 +0000 Received: from DU2PEPF0001E9BF.eurprd03.prod.outlook.com (2603:10a6:10:110:cafe::c0) by DB8PR04CA0016.outlook.office365.com (2603:10a6:10:110::26) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.17 via Frontend Transport; Thu, 3 Oct 2024 09:52:28 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=arm.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DU2PEPF0001E9BF.mail.protection.outlook.com (10.167.8.68) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8026.11 via Frontend Transport; Thu, 3 Oct 2024 09:52:27 +0000 Received: ("Tessian outbound 10d5cea79515:v473"); Thu, 03 Oct 2024 09:52:27 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: b5527604e0b17bf7 X-TessianGatewayMetadata: z0EaD/EUcwwnGiPA2/G9MsiRDVJFa1+QWBh1QaZWllT/0++RNnf5j77Pvb+F159coI4rlxWdVQlKSjBSgA21GNgCBueA6ctpdnBndZnfWJEEQJGZYE94D7nI49m0HSkMWwO5b9DY3c6KgfTUMSFPSt6yKuwm6YyXYFfsTavdXCk= X-CR-MTA-TID: 64aa7808 Received: from Lbd7ab37ff161.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id C9A8258E-7CED-4568-9F12-C041D7E5FD33.1; Thu, 03 Oct 2024 09:52:15 +0000 Received: from EUR05-AM6-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id Lbd7ab37ff161.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Thu, 03 Oct 2024 09:52:15 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=ogeBqgwQTOAx1fvSGTXp6FfFZ33GsMHtgAlI67DLPVOSSxt/qFgO6ACAizwDGGWWrbY1PUv5+sXPvjOUnFVbp1AXetlteyiQiYx5+1mw/ZMOaLDq+l65MNHk8esF59Kh7/kieiYa3J0b7z/E33OlP3XQUjSOYTYhD/PXjMktbfuWDLRbnJwuBeVoKskJfPj3Ie5a+O8O2dp3jqw9wUC5qH0KivR1C0YFkXsQM6dsnjEM8cAbanvDnuo0PE2uXmFWIjqrrREfxymTrQ7gVLEQsFbAj2zMJoaQ+a608+VHzOBYarzwtx8v5AsUYUPQ+k3PBqV9ztxfiovanE+jJPEgUw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ByqbyexuLBBPuTT36A5kiRwaWNu53fpN/aGRjisxS9Q=; b=kCtt4OnhN8YeigippfH8rQrTM03PRXtpScgNWFOjNSVX+kHIfatKiPqeJqhV5hAinwCqEBsgywCr9yMxjWt6EHunGNndDMvT51JZsKH625nMmw7Y7LBv6kBWJtY8KxNZah/a1IazM/o5pARSwgWXiI4udLv495Pd/tX8DLqGpfr83yr0Xu9mt3L7e8FWhv9iotuM4pt571QCTYvmuiiLFueHc9siAEm3jlkRUqYRl2NjjNiBh8COjWAY2vHTGen5h2e5zA0m5dBtYm2ljHPD8Z7ExCy9xAcV+wB4L7A2VWl2qH3917/yLIDWTdeavYScqR1746K1URFOxRUjO+/EyQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 40.67.248.234) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ByqbyexuLBBPuTT36A5kiRwaWNu53fpN/aGRjisxS9Q=; b=BcA/S9AZW1H87Z54wTVNFmlumChByjftnnr8WaEdap8CI341hrxaO/VrULiIy8elAJR2EZk9QlETyzX+pBEvGrAi307oKujYocy5zsA4h0PytPkg0IgeSMbWmwZ/Dzwaj+jHnvOlw5nxavRfzoVqskPuAEPwYZdGMnhzuvO1+OY= Received: from AM6P195CA0018.EURP195.PROD.OUTLOOK.COM (2603:10a6:209:81::31) by DB5PR08MB10311.eurprd08.prod.outlook.com (2603:10a6:10:4a5::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.16; Thu, 3 Oct 2024 09:52:12 +0000 Received: from AMS0EPF00000191.eurprd05.prod.outlook.com (2603:10a6:209:81:cafe::fc) by AM6P195CA0018.outlook.office365.com (2603:10a6:209:81::31) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.16 via Frontend Transport; Thu, 3 Oct 2024 09:52:12 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 40.67.248.234) smtp.mailfrom=arm.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 40.67.248.234 as permitted sender) receiver=protection.outlook.com; client-ip=40.67.248.234; helo=nebula.arm.com; pr=C Received: from nebula.arm.com (40.67.248.234) by AMS0EPF00000191.mail.protection.outlook.com (10.167.16.216) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.7918.13 via Frontend Transport; Thu, 3 Oct 2024 09:52:12 +0000 Received: from AZ-NEU-EX06.Arm.com (10.240.25.134) by AZ-NEU-EX03.Arm.com (10.251.24.31) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Thu, 3 Oct 2024 09:52:08 +0000 Received: from AZ-NEU-EX03.Arm.com (10.251.24.31) by AZ-NEU-EX06.Arm.com (10.240.25.134) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Thu, 3 Oct 2024 09:52:07 +0000 Received: from e130340.cambridge.arm.com (10.2.80.47) by mail.arm.com (10.251.24.31) with Microsoft SMTP Server id 15.1.2507.39 via Frontend Transport; Thu, 3 Oct 2024 09:52:07 +0000 From: To: CC: , , Saurabh Jha Subject: [PATCH v4 2/2] aarch64: Add codegen support for SVE2 faminmax Date: Thu, 3 Oct 2024 10:51:57 +0100 Message-ID: <20241003095157.1390838-3-saurabh.jha@arm.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20241003095157.1390838-1-saurabh.jha@arm.com> References: <20241003095157.1390838-1-saurabh.jha@arm.com> MIME-Version: 1.0 X-EOPAttributedMessage: 1 X-MS-TrafficTypeDiagnostic: AMS0EPF00000191:EE_|DB5PR08MB10311:EE_|DU2PEPF0001E9BF:EE_|DB9PR08MB9515:EE_ X-MS-Office365-Filtering-Correlation-Id: 0f71b858-038b-435a-427a-08dce391171a x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; ARA:13230040|82310400026|1800799024|36860700013|376014; X-Microsoft-Antispam-Message-Info-Original: gOfic+eogX1uUP94gl5FcmQKfsYJTm1bXfLZzXxgvqn8hnzOEbWwiimWuejFMt+ZqJrHKEfJaHiRbcJPducPfL/e7iZDMQJNkubGm0raqg+f/FdgBhx/kBeuaplQurGBIu/5nOYXTSdycFw6lVXsvWIL1VkqWSaWx/C6e2r8wn+h8eXfsAla25IAW491Eg1MYRbu7lNdSYpzf90cxdODVuAogmvBlRcAP/syYww2f8kwSospqTHtGKzi8GmbVxJ2ZFbjwUC6xhLTup+XvKU0+Lz5ch07bT9+7Lf/eX/bxfDVb5lwYpVdZk4jMrLe1ZNIbhsbQuGP0sbI7Y5Ij+HwkQoHDGu7ZSLqBdZV17bLo9tgNXwx7tomFqKCwdtm7qHHCZMHXkXv3ewMQ+MVNhyHchB9sFNHJ5ZM3AXZTYr4OmIXt06D4P/YzgouxV6JdOgpEyw/RBnqIaPj0Z7p2REas+38GAJWuex2nG3Od/UPHg4UVAZyQHJOZ8v2iNThTEKBS8/whZfcul/WbKg5OkLjGNKaN/PtIuulNsf8kWKGUyEPKwL6D1DhrT6wi2mxVCtj2DyNbHOLuNfrSmTSSaFWTvN4/GaY8q1aYIb02Vzn6M938EVB6YfXeS5TLy31xLoumrniN0H1nHP36CEWuaWLvQMvhFpy7CioO0q1ZyCetaZiwwmASPAKkV1osK4g2nD1Njqtusf9HQEn3X5PWqGBRH/CJEniK/RkdqoegMOt2CR1lz0iWAAET3oV4Z4yqQlnr0qmijSfdc3nQFzvRJuzIyWWFwBDn0g8hZbVRgyYyLZbLJscyUxnEHC1+TdKuW3ca018vb1haXFgGyWR9pcsG1NSHHZmLc0cx3uXGpZIz+f1rwYWkRJ3TFWslVfvR9y9O5wElm5j9h9kY/d8YpYPSkjsnimEISppmSVUWtLLBB/3O1wADh6+0UjJtfv0TKMgDh8AkuohzRkw4yD/G92O8fOklQDz/IaJYYkAUQ/f3AIKXnjT9gxDYo4afdubXokxHaUcYqwtQycUQcwOFRMsr2QAEjmPsGm+AEbdi0QoiaGw22+LphuejmDUPltgXuoi/kAEaCshYNe7sxAcUk7qCZfdGhkJNFFRyNLiqLJVJDQRet/J3ObJaOTd9jvlGun0aE8jN30jflpcxbLyZBgPOs7hngIHYmSgTIscEW95hEVbQeZHTgJ4fS7OBsLFXH/h66wxWnJePNMMUBUocKyrLIGj+cnabFSjY3h0JWt415t9deNWNemDaqzfYnjY8Q0o6ysPEQQUP3JTdQJNYTMx4N4f3h+YDWcIpi+9xeIrZxN1PMvsrGjSQls6kDK77im9mQSCrylSPaJTHxi1aD/K6w== X-Forefront-Antispam-Report-Untrusted: CIP:40.67.248.234; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:nebula.arm.com; PTR:InfoDomainNonexistent; CAT:NONE; SFS:(13230040)(82310400026)(1800799024)(36860700013)(376014); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB5PR08MB10311 X-MS-Exchange-SkipListedInternetSender: ip=[2603:10a6:209:81::31]; domain=AM6P195CA0018.EURP195.PROD.OUTLOOK.COM X-MS-Exchange-Transport-CrossTenantHeadersStripped: DU2PEPF0001E9BF.eurprd03.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: 354a67ca-3efd-436f-d4d0-08dce3910e19 X-Microsoft-Antispam: BCL:0; ARA:13230040|36860700013|35042699022|1800799024|82310400026|376014; X-Microsoft-Antispam-Message-Info: =?utf-8?q?oY8EeBJTJYjSIXDks9OaY7kv/k1X4Ck?= =?utf-8?q?ckhYiX5FzDJzOioNadtYttK2lzKThWk9Y/5JLD1gMe2D8CEwDWBuAC64YqNO50TPv?= =?utf-8?q?Ila9w4bG3Gqk3J3VgOZ/2GTCp2nH2N6Sz9hmjjkJHd5mjBHIU/egab8yozYuvJ/Kn?= =?utf-8?q?mjSqmgjwoXng7gzm2dsSg9L/Zn3mGRhPF2ehjTuS2FomNqEw2OY/qNE2hmLTPWyrz?= =?utf-8?q?EhwBNrsvj4eW/GSu6GiePL2FRBkiERsjthIMmaiyfRbsRPtaMCrKaFnwpGSVOv0Bg?= =?utf-8?q?TMAVjlKmscOFFFuJ8nWKlVKgTfW99H3xp6Kb/EdeINohTPEhTVtIDDWirymycG7Mx?= =?utf-8?q?KrUrY2Va44voS8ak0n/ua6lxxlbAPi0b18IE7r2De/GZfnhgH09yFLs82Jynx3oEf?= =?utf-8?q?apfaq7cRElF8X0o33FNc/p21gtx/mgK5bC55YZ7YTpOxV/aHOiFqeRSGzGg9dwhbe?= =?utf-8?q?2f/DPcB3lTKXJ3i0RE4rDvvvjUGPOC/TCG8+kkOAwKvZons7HYfSKNwNlvxYN5hDw?= =?utf-8?q?NARvboka/JmKokwqVA7TE9aHgAIEVX/sMmZ8MrIn9lPyPrMnpJQs1KtuZdCKUXROJ?= =?utf-8?q?S/Qjgh/8qdtN/CHcfIV1NvQ49dsYny9bUeMRmdZCi3lwsLkYNja+60aIZsXk18Css?= =?utf-8?q?2hAckseMF/Rcnh0P5bbaTw/vaRBYnxKaY3vY3aYaszY/+6N3YFdZ2qU84v0K+3m5x?= =?utf-8?q?kUrMMgS6xsPvapDSymAzVqZHBVD8Lf7LWR2kEZz/J0OVvPGIT0jyU8oenh4b0bB86?= =?utf-8?q?9XUz18CiIMUfij606YlZE6PWJ3Y2uXEULRQa670e9R4+iyqAmfv4Dx2udBKfMQpiz?= =?utf-8?q?mLdIPsJ4nBdeU7xl4S4p4isRvZjT8kH2auABwjmMC48ArhFKIRW81Aw8HUtgsV98X?= =?utf-8?q?vBrwUMwXrvhPyhrIRY92YBlhlDGaKDm15rpDQNiq/GKOfP4RvlvsCgIoNikeIcmln?= =?utf-8?q?Ei7MVdrxG1r+fJq1F2Cwq/d3Wbd1o131UiFb13aIVngAoaBTSFvsAqSOCtmAV+uqI?= =?utf-8?q?gIYYXtFqSBsc49SoLihaQHHtqs2zD+0TruwPtZbx1aULA5bTPPJmjKYYQLBqUlk4p?= =?utf-8?q?s9FfpEXievyKBiSJQoO2mRXEkYqvnwx0zDIXcarzHVMBmeD5OwPwtSRCr0uIAuba1?= =?utf-8?q?bkn1HrLi7YSatd/hWW7RjOZJs+PSba0otmXMWA1xJ91y3EuCIvlfDJ6NqEMYroG4m?= =?utf-8?q?tYu83XsLo+HBW0pl5QhTVUHdNkZ1h/HoeCDYimxbj98pN68Lj96TGhonsjTi7vSEN?= =?utf-8?q?HFsao81J3Jf+ujWNRkyIP/Jn6DgiKZt1nFA=3D=3D?= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230040)(36860700013)(35042699022)(1800799024)(82310400026)(376014); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 03 Oct 2024 09:52:27.6502 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 0f71b858-038b-435a-427a-08dce391171a X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DU2PEPF0001E9BF.eurprd03.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB9PR08MB9515 X-Spam-Status: No, score=-11.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FORGED_SPF_HELO, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_SHORT, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation for famax and famin in terms of existing unspecs. With this patch: 1. famax can be expressed as taking UNSPEC_COND_SMAX of the two operands and then taking absolute value of their result. 2. famin can be expressed as taking UNSPEC_COND_SMIN of the two operands and then taking absolute value of their result. This fusion of operators is only possible when -march=armv9-a+faminmax+sve flags are passed. We also need to pass -ffast-math flag; this is what enables compiler to use UNSPEC_COND_SMAX and UNSPEC_COND_SMIN. This code generation is only available on -O2 or -O3 as that is when auto-vectorization is enabled. gcc/ChangeLog: * config/aarch64/aarch64-sve2.md (*aarch64_pred_faminmax_fused): Instruction pattern for faminmax codegen. * config/aarch64/iterators.md: Iterator and attribute for faminmax codegen. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/faminmax_1.c: New test. * gcc.target/aarch64/sve/faminmax_2.c: New test. --- gcc/config/aarch64/aarch64-sve2.md | 37 ++++++++++++ gcc/config/aarch64/iterators.md | 6 ++ .../gcc.target/aarch64/sve/faminmax_1.c | 44 ++++++++++++++ .../gcc.target/aarch64/sve/faminmax_2.c | 60 +++++++++++++++++++ 4 files changed, 147 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md index 725092cc95f..5f2697c3179 100644 --- a/gcc/config/aarch64/aarch64-sve2.md +++ b/gcc/config/aarch64/aarch64-sve2.md @@ -2467,6 +2467,43 @@ [(set_attr "movprfx" "yes")] ) +;; ------------------------------------------------------------------------- +;; -- [FP] Absolute maximum and minimum +;; ------------------------------------------------------------------------- +;; Includes: +;; - FAMAX +;; - FAMIN +;; ------------------------------------------------------------------------- +;; Predicated floating-point absolute maximum and minimum. +(define_insn_and_rewrite "*aarch64_pred_faminmax_fused" + [(set (match_operand:SVE_FULL_F 0 "register_operand") + (unspec:SVE_FULL_F + [(match_operand: 1 "register_operand") + (match_operand:SI 4 "aarch64_sve_gp_strictness") + (unspec:SVE_FULL_F + [(match_operand 5) + (const_int SVE_RELAXED_GP) + (match_operand:SVE_FULL_F 2 "register_operand")] + UNSPEC_COND_FABS) + (unspec:SVE_FULL_F + [(match_operand 6) + (const_int SVE_RELAXED_GP) + (match_operand:SVE_FULL_F 3 "register_operand")] + UNSPEC_COND_FABS)] + SVE_COND_SMAXMIN))] + "TARGET_SVE_FAMINMAX" + {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ] + [ w , Upl , %0 , w ; * ] \t%0., %1/m, %0., %3. + [ ?&w , Upl , w , w ; yes ] movprfx\t%0, %2\;\t%0., %1/m, %0., %3. + } + "&& (!rtx_equal_p (operands[1], operands[5]) + || !rtx_equal_p (operands[1], operands[6]))" + { + operands[5] = copy_rtx (operands[1]); + operands[6] = copy_rtx (operands[1]); + } +) + ;; ========================================================================= ;; == Complex arithmetic ;; ========================================================================= diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index c06f8c2c90f..8b18682c341 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -3143,6 +3143,9 @@ UNSPEC_COND_FMIN UNSPEC_COND_FMINNM]) +(define_int_iterator SVE_COND_SMAXMIN [UNSPEC_COND_SMAX + UNSPEC_COND_SMIN]) + (define_int_iterator SVE_COND_FP_TERNARY [UNSPEC_COND_FMLA UNSPEC_COND_FMLS UNSPEC_COND_FNMLA @@ -4503,6 +4506,9 @@ (define_int_iterator FAMINMAX_UNS [UNSPEC_FAMAX UNSPEC_FAMIN]) +(define_int_attr faminmax_cond_uns_op + [(UNSPEC_COND_SMAX "famax") (UNSPEC_COND_SMIN "famin")]) + (define_int_attr faminmax_uns_op [(UNSPEC_FAMAX "famax") (UNSPEC_FAMIN "famin")]) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c new file mode 100644 index 00000000000..3b65ccea065 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O3 -ffast-math" } */ + +#include "arm_sve.h" + +#pragma GCC target "+sve+faminmax" + +#define TEST_FAMAX(TYPE) \ + void fn_famax_##TYPE (TYPE * restrict a, \ + TYPE * restrict b, \ + TYPE * restrict c, \ + int n) { \ + for (int i = 0; i < n; i++) { \ + TYPE temp1 = __builtin_fabs (a[i]); \ + TYPE temp2 = __builtin_fabs (b[i]); \ + c[i] = __builtin_fmax (temp1, temp2); \ + } \ + } \ + +#define TEST_FAMIN(TYPE) \ + void fn_famin_##TYPE (TYPE * restrict a, \ + TYPE * restrict b, \ + TYPE * restrict c, \ + int n) { \ + for (int i = 0; i < n; i++) { \ + TYPE temp1 = __builtin_fabs (a[i]); \ + TYPE temp2 = __builtin_fabs (b[i]); \ + c[i] = __builtin_fmin (temp1, temp2); \ + } \ + } \ + +TEST_FAMAX (float16_t) +TEST_FAMAX (float32_t) +TEST_FAMAX (float64_t) +TEST_FAMIN (float16_t) +TEST_FAMIN (float32_t) +TEST_FAMIN (float64_t) + +/* { dg-final { scan-assembler-times {\tfamax\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamax\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamax\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamin\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamin\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamin\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c new file mode 100644 index 00000000000..d80f6eca8f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c @@ -0,0 +1,60 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O3 -ffast-math" } */ + +#include "arm_sve.h" + +#pragma GCC target "+sve+faminmax" + +#define TEST_WITH_SVMAX(TYPE) \ + TYPE fn_fmax_##TYPE (TYPE x, TYPE y) { \ + svbool_t pg = svptrue_b8(); \ + return svmax_x(pg, svabs_x(pg, x), svabs_x(pg, y)); \ + } \ + +#define TEST_WITH_SVMAXNM(TYPE) \ + TYPE fn_fmaxnm_##TYPE (TYPE x, TYPE y) { \ + svbool_t pg = svptrue_b8(); \ + return svmaxnm_x(pg, svabs_x(pg, x), svabs_x(pg, y)); \ + } \ + +#define TEST_WITH_SVMIN(TYPE) \ + TYPE fn_fmin_##TYPE (TYPE x, TYPE y) { \ + svbool_t pg = svptrue_b8(); \ + return svmin_x(pg, svabs_x(pg, x), svabs_x(pg, y)); \ + } \ + +#define TEST_WITH_SVMINNM(TYPE) \ + TYPE fn_fminnm_##TYPE (TYPE x, TYPE y) { \ + svbool_t pg = svptrue_b8(); \ + return svminnm_x(pg, svabs_x(pg, x), svabs_x(pg, y)); \ + } \ + +TEST_WITH_SVMAX (svfloat16_t) +TEST_WITH_SVMAX (svfloat32_t) +TEST_WITH_SVMAX (svfloat64_t) + +TEST_WITH_SVMAXNM (svfloat16_t) +TEST_WITH_SVMAXNM (svfloat32_t) +TEST_WITH_SVMAXNM (svfloat64_t) + +TEST_WITH_SVMIN (svfloat16_t) +TEST_WITH_SVMIN (svfloat32_t) +TEST_WITH_SVMIN (svfloat64_t) + +TEST_WITH_SVMINNM (svfloat16_t) +TEST_WITH_SVMINNM (svfloat32_t) +TEST_WITH_SVMINNM (svfloat64_t) + +/* { dg-final { scan-assembler-not {\tfamax\t} } } */ +/* { dg-final { scan-assembler-not {\tfamin\t} } } */ + +/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h\n} 8 } } */ +/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 8 } } */ +/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 8 } } */ + +/* { dg-final { scan-assembler-times {\tfmax\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmax\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmax\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmin\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmin\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmin\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */