From patchwork Wed Oct 2 15:50:53 2024
X-Patchwork-Submitter: Saurabh Jha
X-Patchwork-Id: 1992074
Subject: [PATCH v3 2/2] aarch64: Add codegen support for SVE2 faminmax
Date: Wed, 2 Oct 2024 16:50:53 +0100
Message-ID: <20241002155053.1343957-3-saurabh.jha@arm.com>
In-Reply-To: <20241002155053.1343957-1-saurabh.jha@arm.com>
References: <20241002155053.1343957-1-saurabh.jha@arm.com>
X-Mailer: git-send-email 2.46.1
X-BeenThere: gcc-patches@gcc.gnu.org
List-Id: Gcc-patches mailing list

The AArch64 FEAT_FAMINMAX extension introduces instructions for
computing the floating-point absolute maximum and minimum of two
vectors element-wise.

This patch adds code generation for famax and famin in terms of
existing unspecs. With this patch:
1. famax can be expressed as taking the absolute value
   (UNSPEC_COND_FABS) of each of the two operands and then taking
   UNSPEC_COND_SMAX of the results.
2. famin can be expressed as taking the absolute value
   (UNSPEC_COND_FABS) of each of the two operands and then taking
   UNSPEC_COND_SMIN of the results.

This fusion of operators is only possible when the
-march=armv9-a+faminmax+sve flags are passed.
We also need to pass the -ffast-math flag; this is what enables the
compiler to use UNSPEC_COND_SMAX and UNSPEC_COND_SMIN.

This code generation is only available at -O2 or -O3, as that is when
auto-vectorization is enabled.

gcc/ChangeLog:

	* config/aarch64/aarch64-sve2.md
	(*aarch64_pred_faminmax_fused): Instruction pattern for faminmax
	codegen.
	* config/aarch64/iterators.md: Iterator and attribute for
	faminmax codegen.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/faminmax_1.c: New test.
	* gcc.target/aarch64/sve/faminmax_2.c: New test.
---
 gcc/config/aarch64/aarch64-sve2.md            | 37 +++++++++++
 gcc/config/aarch64/iterators.md               |  6 ++
 .../gcc.target/aarch64/sve/faminmax_1.c       | 45 ++++++++++++++
 .../gcc.target/aarch64/sve/faminmax_2.c       | 61 +++++++++++++++++++
 4 files changed, 149 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c

diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md
index 725092cc95f..5f2697c3179 100644
--- a/gcc/config/aarch64/aarch64-sve2.md
+++ b/gcc/config/aarch64/aarch64-sve2.md
@@ -2467,6 +2467,43 @@
   [(set_attr "movprfx" "yes")]
 )
 
+;; -------------------------------------------------------------------------
+;; -- [FP] Absolute maximum and minimum
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - FAMAX
+;; - FAMIN
+;; -------------------------------------------------------------------------
+;; Predicated floating-point absolute maximum and minimum.
+(define_insn_and_rewrite "*aarch64_pred_faminmax_fused"
+  [(set (match_operand:SVE_FULL_F 0 "register_operand")
+        (unspec:SVE_FULL_F
+          [(match_operand:<VPRED> 1 "register_operand")
+           (match_operand:SI 4 "aarch64_sve_gp_strictness")
+           (unspec:SVE_FULL_F
+             [(match_operand 5)
+              (const_int SVE_RELAXED_GP)
+              (match_operand:SVE_FULL_F 2 "register_operand")]
+             UNSPEC_COND_FABS)
+           (unspec:SVE_FULL_F
+             [(match_operand 6)
+              (const_int SVE_RELAXED_GP)
+              (match_operand:SVE_FULL_F 3 "register_operand")]
+             UNSPEC_COND_FABS)]
+          SVE_COND_SMAXMIN))]
+  "TARGET_SVE_FAMINMAX"
+  {@ [ cons: =0 , 1   , 2  , 3 ; attrs: movprfx ]
+     [ w        , Upl , %0 , w ; *              ] <faminmax_cond_uns_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
+     [ ?&w      , Upl , w  , w ; yes            ] movprfx\t%0, %2\;<faminmax_cond_uns_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
+  }
+  "&& (!rtx_equal_p (operands[1], operands[5])
+       || !rtx_equal_p (operands[1], operands[6]))"
+  {
+    operands[5] = copy_rtx (operands[1]);
+    operands[6] = copy_rtx (operands[1]);
+  }
+)
+
 ;; =========================================================================
 ;; == Complex arithmetic
 ;; =========================================================================
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index c06f8c2c90f..8b18682c341 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -3143,6 +3143,9 @@
                                          UNSPEC_COND_FMIN
                                          UNSPEC_COND_FMINNM])
 
+(define_int_iterator SVE_COND_SMAXMIN [UNSPEC_COND_SMAX
+                                       UNSPEC_COND_SMIN])
+
 (define_int_iterator SVE_COND_FP_TERNARY [UNSPEC_COND_FMLA
                                           UNSPEC_COND_FMLS
                                           UNSPEC_COND_FNMLA
@@ -4503,6 +4506,9 @@
 
 (define_int_iterator FAMINMAX_UNS [UNSPEC_FAMAX UNSPEC_FAMIN])
 
+(define_int_attr faminmax_cond_uns_op
+  [(UNSPEC_COND_SMAX "famax") (UNSPEC_COND_SMIN "famin")])
+
 (define_int_attr faminmax_uns_op
   [(UNSPEC_FAMAX "famax") (UNSPEC_FAMIN "famin")])
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c
new file mode 100644
index 00000000000..d54f5d99b5e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3 -ffast-math" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_sve.h"
+
+#pragma GCC target "+sve+faminmax"
+
+#define TEST_FAMAX(TYPE)                                \
+  void fn_famax_##TYPE (TYPE * restrict a,              \
+                        TYPE * restrict b,              \
+                        TYPE * restrict c,              \
+                        int n) {                        \
+    for (int i = 0; i < n; i++) {                       \
+      TYPE temp1 = __builtin_fabs (a[i]);               \
+      TYPE temp2 = __builtin_fabs (b[i]);               \
+      c[i] = __builtin_fmax (temp1, temp2);             \
+    }                                                   \
+  }                                                     \
+
+#define TEST_FAMIN(TYPE)                                \
+  void fn_famin_##TYPE (TYPE * restrict a,              \
+                        TYPE * restrict b,              \
+                        TYPE * restrict c,              \
+                        int n) {                        \
+    for (int i = 0; i < n; i++) {                       \
+      TYPE temp1 = __builtin_fabs (a[i]);               \
+      TYPE temp2 = __builtin_fabs (b[i]);               \
+      c[i] = __builtin_fmin (temp1, temp2);             \
+    }                                                   \
+  }                                                     \
+
+TEST_FAMAX (float16_t)
+TEST_FAMAX (float32_t)
+TEST_FAMAX (float64_t)
+TEST_FAMIN (float16_t)
+TEST_FAMIN (float32_t)
+TEST_FAMIN (float64_t)
+
+/* { dg-final { scan-assembler-times {\tfamax\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamax\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamax\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamin\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamin\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamin\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c
new file mode 100644
index 00000000000..29e12450831
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c
@@ -0,0 +1,61 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3 -ffast-math" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_sve.h"
+
+#pragma GCC target "+sve+faminmax"
+
+#define TEST_WITH_SVMAX(TYPE)                                   \
+  TYPE fn_fmax_##TYPE (TYPE x, TYPE y) {                        \
+    svbool_t pg = svptrue_b8();                                 \
+    return svmax_x(pg, svabs_x(pg, x), svabs_x(pg, y));         \
+  }                                                             \
+
+#define TEST_WITH_SVMAXNM(TYPE)                                 \
+  TYPE fn_fmaxnm_##TYPE (TYPE x, TYPE y) {                      \
+    svbool_t pg = svptrue_b8();                                 \
+    return svmaxnm_x(pg, svabs_x(pg, x), svabs_x(pg, y));       \
+  }                                                             \
+
+#define TEST_WITH_SVMIN(TYPE)                                   \
+  TYPE fn_fmin_##TYPE (TYPE x, TYPE y) {                        \
+    svbool_t pg = svptrue_b8();                                 \
+    return svmin_x(pg, svabs_x(pg, x), svabs_x(pg, y));         \
+  }                                                             \
+
+#define TEST_WITH_SVMINNM(TYPE)                                 \
+  TYPE fn_fminnm_##TYPE (TYPE x, TYPE y) {                      \
+    svbool_t pg = svptrue_b8();                                 \
+    return svminnm_x(pg, svabs_x(pg, x), svabs_x(pg, y));       \
+  }                                                             \
+
+TEST_WITH_SVMAX (svfloat16_t)
+TEST_WITH_SVMAX (svfloat32_t)
+TEST_WITH_SVMAX (svfloat64_t)
+
+TEST_WITH_SVMAXNM (svfloat16_t)
+TEST_WITH_SVMAXNM (svfloat32_t)
+TEST_WITH_SVMAXNM (svfloat64_t)
+
+TEST_WITH_SVMIN (svfloat16_t)
+TEST_WITH_SVMIN (svfloat32_t)
+TEST_WITH_SVMIN (svfloat64_t)
+
+TEST_WITH_SVMINNM (svfloat16_t)
+TEST_WITH_SVMINNM (svfloat32_t)
+TEST_WITH_SVMINNM (svfloat64_t)
+
+/* { dg-final { scan-assembler-not {\tfamax\t} } } */
+/* { dg-final { scan-assembler-not {\tfamin\t} } } */
+
+/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h\n} 8 } } */
+/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 8 } } */
+/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 8 } } */
+
+/* { dg-final { scan-assembler-times {\tfmax\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmax\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmax\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmin\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmin\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmin\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
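
Reviewer note, not part of the patch: going by the RTL pattern in this patch, which wraps each input in UNSPEC_COND_FABS before applying SVE_COND_SMAXMIN, the fused operation appears to compute the maximum (or minimum) of the two absolute values per element. The portable scalar sketch below models that reading. The names abs_f, famax_ref and famin_ref are hypothetical and appear nowhere in GCC, and the plain comparisons deliberately ignore NaNs and signed zeros, which is only an approximation of what -ffast-math permits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: absolute value without libm.  NaNs and signed
   zeros are ignored, as a rough stand-in for -ffast-math assumptions.  */
static float abs_f (float x)
{
  return x < 0.0f ? -x : x;
}

/* Scalar model of the famax loop: c[i] = max (|a[i]|, |b[i]|).  */
void famax_ref (const float *a, const float *b, float *c, size_t n)
{
  assert (a && b && c);
  for (size_t i = 0; i < n; i++)
    {
      float x = abs_f (a[i]);
      float y = abs_f (b[i]);
      c[i] = x > y ? x : y;
    }
}

/* Scalar model of the famin loop: c[i] = min (|a[i]|, |b[i]|).  */
void famin_ref (const float *a, const float *b, float *c, size_t n)
{
  assert (a && b && c);
  for (size_t i = 0; i < n; i++)
    {
      float x = abs_f (a[i]);
      float y = abs_f (b[i]);
      c[i] = x < y ? x : y;
    }
}
```

These loops have the same shape as fn_famax_##TYPE and fn_famin_##TYPE in faminmax_1.c, so they can serve as a host-side reference when checking results from a target that executes the fused instructions.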