From patchwork Fri Sep 13 09:06:54 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Saurabh Jha
X-Patchwork-Id: 1985073
Subject: [PATCH 1/2] aarch64: Add SVE2 faminmax intrinsics
Date: Fri, 13 Sep 2024 10:06:54 +0100
Message-ID: <20240913090655.1551666-2-saurabh.jha@arm.com>
X-Mailer: git-send-email 2.43.2
In-Reply-To: <20240913090655.1551666-1-saurabh.jha@arm.com>
References: <20240913090655.1551666-1-saurabh.jha@arm.com>
List-Id: Gcc-patches mailing list

The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and
mandatory from Armv9.5-a.  It introduces instructions that compute,
element-wise, the maximum and minimum of the absolute values of two
floating-point vectors.

This patch introduces SVE2 faminmax intrinsics.  The intrinsics of this
extension are implemented as the following builtin functions:
* sva[max|min]_[m|x|z]
* sva[max|min]_[f16|f32|f64]_[m|x|z]
* sva[max|min]_n_[f16|f32|f64]_[m|x|z]

gcc/ChangeLog:

	* config/aarch64/aarch64-sve-builtins-base.cc
	(svamax): Absolute maximum declaration.
	(svamin): Absolute minimum declaration.
	* config/aarch64/aarch64-sve-builtins-base.def
	(svamax): Absolute maximum declaration.
	(svamin): Absolute minimum declaration.
	* config/aarch64/aarch64-sve-builtins-base.h: Declare function
	bases for the new intrinsics.
	* config/aarch64/aarch64.h
	(TARGET_SVE_FAMINMAX): New flag for SVE2 faminmax.
	* config/aarch64/iterators.md: New unspecs, iterators, and attrs
	for the new intrinsics.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/aminmax.h: New test.
	* gcc.target/aarch64/sve2/acle/asm/amax_f16.c: New test.
	* gcc.target/aarch64/sve2/acle/asm/amax_f32.c: New test.
	* gcc.target/aarch64/sve2/acle/asm/amax_f64.c: New test.
	* gcc.target/aarch64/sve2/acle/asm/amin_f16.c: New test.
	* gcc.target/aarch64/sve2/acle/asm/amin_f32.c: New test.
	* gcc.target/aarch64/sve2/acle/asm/amin_f64.c: New test.
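
Reviewer note (illustrative only, not part of the patch): the sketch
below is a scalar model of what the new intrinsics are expected to
compute, assuming the usual ACLE predication rules (_m keeps the first
operand in inactive lanes, _z zeroes them, _x leaves them unspecified).
All names starting with ref_ or REF_ are hypothetical helpers introduced
here for illustration; NaN and signed-zero behaviour is deliberately not
modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar helpers, not part of the patch.  */
static float ref_abs (float x) { return x < 0 ? -x : x; }

/* One lane of FAMAX/FAMIN: maximum/minimum of the absolute values.
   NaN propagation and the sign of zero are not modelled.  */
static float
ref_amax (float a, float b)
{
  float x = ref_abs (a), y = ref_abs (b);
  return x > y ? x : y;
}

static float
ref_amin (float a, float b)
{
  float x = ref_abs (a), y = ref_abs (b);
  return x < y ? x : y;
}

/* Predication suffix of the modelled intrinsic.  The _x form is left
   out because its inactive lanes are unspecified.  */
enum ref_pred { REF_PRED_M, REF_PRED_Z };

/* Model of svamax_f32_m / svamax_f32_z over N lanes: active lanes get
   the absolute maximum, inactive lanes keep OP1 (_m) or become zero
   (_z).  */
static void
ref_svamax (enum ref_pred pred, const bool *pg, const float *op1,
	    const float *op2, float *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (pg[i])
      out[i] = ref_amax (op1[i], op2[i]);
    else
      out[i] = pred == REF_PRED_M ? op1[i] : 0.0f;
}
```

With a predicate of {true, false}, op1 = {-1, -5} and op2 = {-4, 9},
the _m model gives {4, -5} and the _z model gives {4, 0}.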
---
 .../aarch64/aarch64-sve-builtins-base.cc      |   4 +
 .../aarch64/aarch64-sve-builtins-base.def     |   5 +
 .../aarch64/aarch64-sve-builtins-base.h       |   2 +
 gcc/config/aarch64/aarch64.h                  |   1 +
 gcc/config/aarch64/iterators.md               |  18 +-
 gcc/testsuite/gcc.target/aarch64/aminmax.h    |  13 ++
 .../aarch64/sve2/acle/asm/amax_f16.c          | 155 ++++++++++++++++++
 .../aarch64/sve2/acle/asm/amax_f32.c          | 155 ++++++++++++++++++
 .../aarch64/sve2/acle/asm/amax_f64.c          | 155 ++++++++++++++++++
 .../aarch64/sve2/acle/asm/amin_f16.c          | 155 ++++++++++++++++++
 .../aarch64/sve2/acle/asm/amin_f32.c          | 155 ++++++++++++++++++
 .../aarch64/sve2/acle/asm/amin_f64.c          | 155 ++++++++++++++++++
 12 files changed, 972 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aminmax.h
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 8f781e26cc8..80c67715fd7 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -3044,6 +3044,10 @@ FUNCTION (svadrb, svadr_bhwd_impl, (0))
 FUNCTION (svadrd, svadr_bhwd_impl, (3))
 FUNCTION (svadrh, svadr_bhwd_impl, (1))
 FUNCTION (svadrw, svadr_bhwd_impl, (2))
+FUNCTION (svamax, cond_or_uncond_unspec_function, (UNSPEC_COND_FAMAX,
+						   UNSPEC_FAMAX))
+FUNCTION (svamin, cond_or_uncond_unspec_function, (UNSPEC_COND_FAMIN,
+						   UNSPEC_FAMIN))
 FUNCTION (svand, rtx_code_function, (AND, AND))
 FUNCTION (svandv, reduction, (UNSPEC_ANDV))
 FUNCTION (svasr, rtx_code_function, (ASHIFTRT, ASHIFTRT))
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.def b/gcc/config/aarch64/aarch64-sve-builtins-base.def
index 65fcba91586..95e04e4393d 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.def
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.def
@@ -379,3 +379,8 @@ DEF_SVE_FUNCTION (svzip2q, binary, all_data, none)
 DEF_SVE_FUNCTION (svld1ro, load_replicate, all_data, implicit)
 DEF_SVE_FUNCTION (svmmla, mmla, d_float, none)
 #undef REQUIRED_EXTENSIONS
+
+#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_FAMINMAX
+DEF_SVE_FUNCTION (svamax, binary_opt_single_n, all_float, mxz)
+DEF_SVE_FUNCTION (svamin, binary_opt_single_n, all_float, mxz)
+#undef REQUIRED_EXTENSIONS
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.h b/gcc/config/aarch64/aarch64-sve-builtins-base.h
index 5bbf3569c4b..978cf7013f9 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.h
@@ -37,6 +37,8 @@ namespace aarch64_sve
     extern const function_base *const svadrd;
     extern const function_base *const svadrh;
     extern const function_base *const svadrw;
+    extern const function_base *const svamax;
+    extern const function_base *const svamin;
     extern const function_base *const svand;
     extern const function_base *const svandv;
     extern const function_base *const svasr;
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index de14f57071a..e9730b8c36a 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -460,6 +460,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED
 /* Floating Point Absolute Maximum/Minimum extension instructions are
    enabled through +faminmax.  */
 #define TARGET_FAMINMAX AARCH64_HAVE_ISA (FAMINMAX)
+#define TARGET_SVE_FAMINMAX (TARGET_SVE && TARGET_FAMINMAX)
 
 /* Prefer different predicate registers for the output of a predicated
    operation over re-using an existing input predicate.  */
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index c2fcd18306e..b993ac9a7f6 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -841,6 +841,8 @@
     UNSPEC_COND_CMPNE_WIDE ; Used in aarch64-sve.md.
     UNSPEC_COND_FABS	; Used in aarch64-sve.md.
     UNSPEC_COND_FADD	; Used in aarch64-sve.md.
+    UNSPEC_COND_FAMAX	; Used in aarch64-sve.md.
+    UNSPEC_COND_FAMIN	; Used in aarch64-sve.md.
     UNSPEC_COND_FCADD90	; Used in aarch64-sve.md.
     UNSPEC_COND_FCADD270 ; Used in aarch64-sve.md.
     UNSPEC_COND_FCMEQ	; Used in aarch64-sve.md.
@@ -3082,6 +3084,8 @@
 (define_int_iterator SVE_COND_ICVTF [UNSPEC_COND_SCVTF UNSPEC_COND_UCVTF])
 
 (define_int_iterator SVE_COND_FP_BINARY [UNSPEC_COND_FADD
+					 (UNSPEC_COND_FAMAX "TARGET_SVE_FAMINMAX")
+					 (UNSPEC_COND_FAMIN "TARGET_SVE_FAMINMAX")
 					 UNSPEC_COND_FDIV
 					 UNSPEC_COND_FMAX
 					 UNSPEC_COND_FMAXNM
@@ -3114,7 +3118,9 @@
 					    UNSPEC_COND_FMINNM
 					    UNSPEC_COND_FMUL])
 
-(define_int_iterator SVE_COND_FP_BINARY_REG [UNSPEC_COND_FDIV
+(define_int_iterator SVE_COND_FP_BINARY_REG [(UNSPEC_COND_FAMAX "TARGET_SVE_FAMINMAX")
+					     (UNSPEC_COND_FAMIN "TARGET_SVE_FAMINMAX")
+					     UNSPEC_COND_FDIV
 					     UNSPEC_COND_FMULX])
 
 (define_int_iterator SVE_COND_FCADD [UNSPEC_COND_FCADD90
@@ -3694,6 +3700,8 @@
 			(UNSPEC_ZIP2Q "zip2q")
 			(UNSPEC_COND_FABS "abs")
 			(UNSPEC_COND_FADD "add")
+			(UNSPEC_COND_FAMAX "famax")
+			(UNSPEC_COND_FAMIN "famin")
 			(UNSPEC_COND_FCADD90 "cadd90")
 			(UNSPEC_COND_FCADD270 "cadd270")
 			(UNSPEC_COND_FCMLA "fcmla")
@@ -4230,6 +4238,8 @@
 			    (UNSPEC_FTSSEL "ftssel")
 			    (UNSPEC_COND_FABS "fabs")
 			    (UNSPEC_COND_FADD "fadd")
+			    (UNSPEC_COND_FAMAX "famax")
+			    (UNSPEC_COND_FAMIN "famin")
 			    (UNSPEC_COND_FCVTLT "fcvtlt")
 			    (UNSPEC_COND_FCVTX "fcvtx")
 			    (UNSPEC_COND_FDIV "fdiv")
@@ -4254,6 +4264,8 @@
 			    (UNSPEC_COND_FSUB "fsub")])
 
 (define_int_attr sve_fp_op_rev [(UNSPEC_COND_FADD "fadd")
+				(UNSPEC_COND_FAMAX "famax")
+				(UNSPEC_COND_FAMIN "famin")
 				(UNSPEC_COND_FDIV "fdivr")
 				(UNSPEC_COND_FMAX "fmax")
 				(UNSPEC_COND_FMAXNM "fmaxnm")
@@ -4390,6 +4402,8 @@
 ;; 3 pattern.
 (define_int_attr sve_pred_fp_rhs1_operand
   [(UNSPEC_COND_FADD "register_operand")
+   (UNSPEC_COND_FAMAX "register_operand")
+   (UNSPEC_COND_FAMIN "register_operand")
    (UNSPEC_COND_FDIV "register_operand")
    (UNSPEC_COND_FMAX "register_operand")
    (UNSPEC_COND_FMAXNM "register_operand")
@@ -4403,6 +4417,8 @@
 ;; 3 pattern.
 (define_int_attr sve_pred_fp_rhs2_operand
   [(UNSPEC_COND_FADD "aarch64_sve_float_arith_with_sub_operand")
+   (UNSPEC_COND_FAMAX "aarch64_sve_float_maxmin_operand")
+   (UNSPEC_COND_FAMIN "aarch64_sve_float_maxmin_operand")
    (UNSPEC_COND_FDIV "register_operand")
    (UNSPEC_COND_FMAX "aarch64_sve_float_maxmin_operand")
    (UNSPEC_COND_FMAXNM "aarch64_sve_float_maxmin_operand")
diff --git a/gcc/testsuite/gcc.target/aarch64/aminmax.h b/gcc/testsuite/gcc.target/aarch64/aminmax.h
new file mode 100644
index 00000000000..e901da84165
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/aminmax.h
@@ -0,0 +1,13 @@
+#ifdef AMINMAX_IDIOM
+
+#define TEST1(TYPE)				\
+__attribute__((noipa))				\
+void fn_##TYPE (TYPE * restrict a,		\
+		TYPE * restrict b,		\
+		TYPE * restrict out) {		\
+  for (int i = 0; i < N; i++) {			\
+    TYPE diff = b[i] - a[i];			\
+    out[i] = diff > 0 ? diff : -diff;		\
+} }
+
+#endif
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c
new file mode 100644
index 00000000000..2646f29e60c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c
@@ -0,0 +1,155 @@
+/* { dg-additional-options "-O3 -march=armv9-a+sve+faminmax" } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** amax_f16_m_tied1:
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_m_tied1, svfloat16_t,
+		z0 = svamax_f16_m (p0, z0, z1),
+		z0 = svamax_m (p0, z0, z1))
+
+/*
+** amax_f16_m_tied2:
+**	mov	z31\.d, z0\.d
+**	movprfx	z0, z1
+**	famax	z0\.h, p0/m, z0\.h, z31\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_m_tied2, svfloat16_t,
+		z0 = svamax_f16_m (p0, z1, z0),
+		z0 = svamax_m (p0, z1, z0))
+
+/*
+** amax_f16_m_untied:
+**	movprfx	z0, z1
+**	famax	z0\.h, p0/m, z0\.h, z2\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_m_untied, svfloat16_t,
+		z0 = svamax_f16_m (p0, z1, z2),
+		z0 = svamax_m (p0, z1, z2))
+
+/*
+** amax_f16_x_tied1:
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_x_tied1, svfloat16_t,
+		z0 = svamax_f16_x (p0, z0, z1),
+		z0 = svamax_x (p0, z0, z1))
+
+/*
+** amax_f16_x_tied2:
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_x_tied2, svfloat16_t,
+		z0 = svamax_f16_x (p0, z1, z0),
+		z0 = svamax_x (p0, z1, z0))
+
+/*
+** amax_f16_x_untied:
+**	movprfx	z0, z1
+**	famax	z0\.h, p0/m, z0\.h, z2\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_x_untied, svfloat16_t,
+		z0 = svamax_f16_x (p0, z1, z2),
+		z0 = svamax_x (p0, z1, z2))
+
+/*
+** amax_f16_z_tied1:
+**	movprfx	z0\.h, p0/z, z0\.h
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_z_tied1, svfloat16_t,
+		z0 = svamax_f16_z (p0, z0, z1),
+		z0 = svamax_z (p0, z0, z1))
+
+/*
+** amax_f16_z_tied2:
+**	movprfx	z0\.h, p0/z, z0\.h
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_z_tied2, svfloat16_t,
+		z0 = svamax_f16_z (p0, z1, z0),
+		z0 = svamax_z (p0, z1, z0))
+
+/*
+** amax_f16_z_untied:
+**	movprfx	z0\.h, p0/z, z1\.h
+**	famax	z0\.h, p0/m, z0\.h, z2\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_z_untied, svfloat16_t,
+		z0 = svamax_f16_z (p0, z1, z2),
+		z0 = svamax_z (p0, z1, z2))
+
+/*
+** amax_n_f16_m_tied1:
+**	mov	z7\.h, h7
+**	famax	z0\.h, p0/m, z0\.h, z7\.h
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f16_m_tied1, svfloat16_t, svfloat16_t, float16_t,
+	      z0 = svamax_n_f16_m (p0, z0, d7),
+	      z0 = svamax_m (p0, z0, d7))
+
+/*
+** amax_n_f16_m_untied:
+**	mov	z7\.h, h7
+**	movprfx	z0, z4
+**	famax	z0\.h, p0/m, z0\.h, z7\.h
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f16_m_untied, svfloat16_t, svfloat16_t, float16_t,
+	      z0 = svamax_n_f16_m (p0, z4, d7),
+	      z0 = svamax_m (p0, z4, d7))
+
+/*
+** amax_n_f16_x_tied1:
+**	mov	z7\.h, h7
+**	famax	z0\.h, p0/m, z0\.h, z7\.h
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f16_x_tied1, svfloat16_t, svfloat16_t, float16_t,
+	      z0 = svamax_n_f16_x (p0, z0, d7),
+	      z0 = svamax_x (p0, z0, d7))
+
+/*
+** amax_n_f16_x_untied:
+**	mov	z0\.h, h7
+**	famax	z0\.h, p0/m, z0\.h, z4\.h
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f16_x_untied, svfloat16_t, svfloat16_t, float16_t,
+	      z0 = svamax_n_f16_x (p0, z4, d7),
+	      z0 = svamax_x (p0, z4, d7))
+
+/*
+** amax_n_f16_z_tied1:
+**	mov	z7\.h, h7
+**	movprfx	z0\.h, p0/z, z0\.h
+**	famax	z0\.h, p0/m, z0\.h, z7\.h
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f16_z_tied1, svfloat16_t, svfloat16_t, float16_t,
+	      z0 = svamax_n_f16_z (p0, z0, d7),
+	      z0 = svamax_z (p0, z0, d7))
+
+/*
+** amax_n_f16_z_untied:
+**	mov	z7\.h, h7
+**	movprfx	z0\.h, p0/z, z4\.h
+**	famax	z0\.h, p0/m, z0\.h, z7\.h
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f16_z_untied, svfloat16_t, svfloat16_t, float16_t,
+	      z0 = svamax_n_f16_z (p0, z4, d7),
+	      z0 = svamax_z (p0, z4, d7))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c
new file mode 100644
index 00000000000..5b5fd2076f1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c
@@ -0,0 +1,155 @@
+/* { dg-additional-options "-O3 -march=armv9-a+sve+faminmax" } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** amax_f32_m_tied1:
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_m_tied1, svfloat32_t,
+		z0 = svamax_f32_m (p0, z0, z1),
+		z0 = svamax_m (p0, z0, z1))
+
+/*
+** amax_f32_m_tied2:
+**	mov	z31\.d, z0\.d
+**	movprfx	z0, z1
+**	famax	z0\.s, p0/m, z0\.s, z31\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_m_tied2, svfloat32_t,
+		z0 = svamax_f32_m (p0, z1, z0),
+		z0 = svamax_m (p0, z1, z0))
+
+/*
+** amax_f32_m_untied:
+**	movprfx	z0, z1
+**	famax	z0\.s, p0/m, z0\.s, z2\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_m_untied, svfloat32_t,
+		z0 = svamax_f32_m (p0, z1, z2),
+		z0 = svamax_m (p0, z1, z2))
+
+/*
+** amax_f32_x_tied1:
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_x_tied1, svfloat32_t,
+		z0 = svamax_f32_x (p0, z0, z1),
+		z0 = svamax_x (p0, z0, z1))
+
+/*
+** amax_f32_x_tied2:
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_x_tied2, svfloat32_t,
+		z0 = svamax_f32_x (p0, z1, z0),
+		z0 = svamax_x (p0, z1, z0))
+
+/*
+** amax_f32_x_untied:
+**	movprfx	z0, z1
+**	famax	z0\.s, p0/m, z0\.s, z2\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_x_untied, svfloat32_t,
+		z0 = svamax_f32_x (p0, z1, z2),
+		z0 = svamax_x (p0, z1, z2))
+
+/*
+** amax_f32_z_tied1:
+**	movprfx	z0\.s, p0/z, z0\.s
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_z_tied1, svfloat32_t,
+		z0 = svamax_f32_z (p0, z0, z1),
+		z0 = svamax_z (p0, z0, z1))
+
+/*
+** amax_f32_z_tied2:
+**	movprfx	z0\.s, p0/z, z0\.s
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_z_tied2, svfloat32_t,
+		z0 = svamax_f32_z (p0, z1, z0),
+		z0 = svamax_z (p0, z1, z0))
+
+/*
+** amax_f32_z_untied:
+**	movprfx	z0\.s, p0/z, z1\.s
+**	famax	z0\.s, p0/m, z0\.s, z2\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_z_untied, svfloat32_t,
+		z0 = svamax_f32_z (p0, z1, z2),
+		z0 = svamax_z (p0, z1, z2))
+
+/*
+** amax_n_f32_m_tied1:
+**	mov	z7\.s, s7
+**	famax	z0\.s, p0/m, z0\.s, z7\.s
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f32_m_tied1, svfloat32_t, svfloat32_t, float32_t,
+	      z0 = svamax_n_f32_m (p0, z0, d7),
+	      z0 = svamax_m (p0, z0, d7))
+
+/*
+** amax_n_f32_m_untied:
+**	mov	z7\.s, s7
+**	movprfx	z0, z4
+**	famax	z0\.s, p0/m, z0\.s, z7\.s
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f32_m_untied, svfloat32_t, svfloat32_t, float32_t,
+	      z0 = svamax_n_f32_m (p0, z4, d7),
+	      z0 = svamax_m (p0, z4, d7))
+
+/*
+** amax_n_f32_x_tied1:
+**	mov	z7\.s, s7
+**	famax	z0\.s, p0/m, z0\.s, z7\.s
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f32_x_tied1, svfloat32_t, svfloat32_t, float32_t,
+	      z0 = svamax_n_f32_x (p0, z0, d7),
+	      z0 = svamax_x (p0, z0, d7))
+
+/*
+** amax_n_f32_x_untied:
+**	mov	z0\.s, s7
+**	famax	z0\.s, p0/m, z0\.s, z4\.s
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f32_x_untied, svfloat32_t, svfloat32_t, float32_t,
+	      z0 = svamax_n_f32_x (p0, z4, d7),
+	      z0 = svamax_x (p0, z4, d7))
+
+/*
+** amax_n_f32_z_tied1:
+**	mov	z7\.s, s7
+**	movprfx	z0\.s, p0/z, z0\.s
+**	famax	z0\.s, p0/m, z0\.s, z7\.s
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f32_z_tied1, svfloat32_t, svfloat32_t, float32_t,
+	      z0 = svamax_n_f32_z (p0, z0, d7),
+	      z0 = svamax_z (p0, z0, d7))
+
+/*
+** amax_n_f32_z_untied:
+**	mov	z7\.s, s7
+**	movprfx	z0\.s, p0/z, z4\.s
+**	famax	z0\.s, p0/m, z0\.s, z7\.s
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f32_z_untied, svfloat32_t, svfloat32_t, float32_t,
+	      z0 = svamax_n_f32_z (p0, z4, d7),
+	      z0 = svamax_z (p0, z4, d7))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c
new file mode 100644
index 00000000000..4a13111dd0d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c
@@ -0,0 +1,155 @@
+/* { dg-additional-options "-O3 -march=armv9-a+sve+faminmax" } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** amax_f64_m_tied1:
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_m_tied1, svfloat64_t,
+		z0 = svamax_f64_m (p0, z0, z1),
+		z0 = svamax_m (p0, z0, z1))
+
+/*
+** amax_f64_m_tied2:
+**	mov	z31\.d, z0\.d
+**	movprfx	z0, z1
+**	famax	z0\.d, p0/m, z0\.d, z31\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_m_tied2, svfloat64_t,
+		z0 = svamax_f64_m (p0, z1, z0),
+		z0 = svamax_m (p0, z1, z0))
+
+/*
+** amax_f64_m_untied:
+**	movprfx	z0, z1
+**	famax	z0\.d, p0/m, z0\.d, z2\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_m_untied, svfloat64_t,
+		z0 = svamax_f64_m (p0, z1, z2),
+		z0 = svamax_m (p0, z1, z2))
+
+/*
+** amax_f64_x_tied1:
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_x_tied1, svfloat64_t,
+		z0 = svamax_f64_x (p0, z0, z1),
+		z0 = svamax_x (p0, z0, z1))
+
+/*
+** amax_f64_x_tied2:
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_x_tied2, svfloat64_t,
+		z0 = svamax_f64_x (p0, z1, z0),
+		z0 = svamax_x (p0, z1, z0))
+
+/*
+** amax_f64_x_untied:
+**	movprfx	z0, z1
+**	famax	z0\.d, p0/m, z0\.d, z2\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_x_untied, svfloat64_t,
+		z0 = svamax_f64_x (p0, z1, z2),
+		z0 = svamax_x (p0, z1, z2))
+
+/*
+** amax_f64_z_tied1:
+**	movprfx	z0\.d, p0/z, z0\.d
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_z_tied1, svfloat64_t,
+		z0 = svamax_f64_z (p0, z0, z1),
+		z0 = svamax_z (p0, z0, z1))
+
+/*
+** amax_f64_z_tied2:
+**	movprfx	z0\.d, p0/z, z0\.d
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_z_tied2, svfloat64_t,
+		z0 = svamax_f64_z (p0, z1, z0),
+		z0 = svamax_z (p0, z1, z0))
+
+/*
+** amax_f64_z_untied:
+**	movprfx	z0\.d, p0/z, z1\.d
+**	famax	z0\.d, p0/m, z0\.d, z2\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_z_untied, svfloat64_t,
+		z0 = svamax_f64_z (p0, z1, z2),
+		z0 = svamax_z (p0, z1, z2))
+
+/*
+** amax_n_f64_m_tied1:
+**	mov	z7\.d, d7
+**	famax	z0\.d, p0/m, z0\.d, z7\.d
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f64_m_tied1, svfloat64_t, svfloat64_t, float64_t,
+	      z0 = svamax_n_f64_m (p0, z0, d7),
+	      z0 = svamax_m (p0, z0, d7))
+
+/*
+** amax_n_f64_m_untied:
+**	mov	z7\.d, d7
+**	movprfx	z0, z4
+**	famax	z0\.d, p0/m, z0\.d, z7\.d
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f64_m_untied, svfloat64_t, svfloat64_t, float64_t,
+	      z0 = svamax_n_f64_m (p0, z4, d7),
+	      z0 = svamax_m (p0, z4, d7))
+
+/*
+** amax_n_f64_x_tied1:
+**	mov	z7\.d, d7
+**	famax	z0\.d, p0/m, z0\.d, z7\.d
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f64_x_tied1, svfloat64_t, svfloat64_t, float64_t,
+	      z0 = svamax_n_f64_x (p0, z0, d7),
+	      z0 = svamax_x (p0, z0, d7))
+
+/*
+** amax_n_f64_x_untied:
+**	mov	z0\.d, d7
+**	famax	z0\.d, p0/m, z0\.d, z4\.d
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f64_x_untied, svfloat64_t, svfloat64_t, float64_t,
+	      z0 = svamax_n_f64_x (p0, z4, d7),
+	      z0 = svamax_x (p0, z4, d7))
+
+/*
+** amax_n_f64_z_tied1:
+**	mov	z7\.d, d7
+**	movprfx	z0\.d, p0/z, z0\.d
+**	famax	z0\.d, p0/m, z0\.d, z7\.d
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f64_z_tied1, svfloat64_t, svfloat64_t, float64_t,
+	      z0 = svamax_n_f64_z (p0, z0, d7),
+	      z0 = svamax_z (p0, z0, d7))
+
+/*
+** amax_n_f64_z_untied:
+**	mov	z7\.d, d7
+**	movprfx	z0\.d, p0/z, z4\.d
+**	famax	z0\.d, p0/m, z0\.d, z7\.d
+**	ret
+*/
+TEST_DUAL_ZD (amax_n_f64_z_untied, svfloat64_t, svfloat64_t, float64_t,
+	      z0 = svamax_n_f64_z (p0, z4, d7),
+	      z0 = svamax_z (p0, z4, d7))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c
new file mode 100644
index 00000000000..e53253e0cbe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c
@@ -0,0 +1,155 @@
+/* { dg-additional-options "-O3 -march=armv9-a+sve+faminmax" } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** amin_f16_m_tied1:
+**	famin	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z
(amin_f16_m_tied1, svfloat16_t, + z0 = svamin_f16_m (p0, z0, z1), + z0 = svamin_m (p0, z0, z1)) + +/* +** amin_f16_m_tied2: +** mov z31\.d, z0\.d +** movprfx z0, z1 +** famin z0\.h, p0/m, z0\.h, z31\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_m_tied2, svfloat16_t, + z0 = svamin_f16_m (p0, z1, z0), + z0 = svamin_m (p0, z1, z0)) + +/* +** amin_f16_m_untied: +** movprfx z0, z1 +** famin z0\.h, p0/m, z0\.h, z2\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_m_untied, svfloat16_t, + z0 = svamin_f16_m (p0, z1, z2), + z0 = svamin_m (p0, z1, z2)) + +/* +** amin_f16_x_tied1: +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_x_tied1, svfloat16_t, + z0 = svamin_f16_x (p0, z0, z1), + z0 = svamin_x (p0, z0, z1)) + +/* +** amin_f16_x_tied2: +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_x_tied2, svfloat16_t, + z0 = svamin_f16_x (p0, z1, z0), + z0 = svamin_x (p0, z1, z0)) + +/* +** amin_f16_x_untied: +** movprfx z0, z1 +** famin z0\.h, p0/m, z0\.h, z2\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_x_untied, svfloat16_t, + z0 = svamin_f16_x (p0, z1, z2), + z0 = svamin_x (p0, z1, z2)) + +/* +** amin_f16_z_tied1: +** movprfx z0\.h, p0/z, z0\.h +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_z_tied1, svfloat16_t, + z0 = svamin_f16_z (p0, z0, z1), + z0 = svamin_z (p0, z0, z1)) + +/* +** amin_f16_z_tied2: +** movprfx z0\.h, p0/z, z0\.h +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_z_tied2, svfloat16_t, + z0 = svamin_f16_z (p0, z1, z0), + z0 = svamin_z (p0, z1, z0)) + +/* +** amin_f16_z_untied: +** movprfx z0\.h, p0/z, z1\.h +** famin z0\.h, p0/m, z0\.h, z2\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_z_untied, svfloat16_t, + z0 = svamin_f16_z (p0, z1, z2), + z0 = svamin_z (p0, z1, z2)) + +/* +** amin_n_f16_m_tied1: +** mov z7\.h, h7 +** famin z0\.h, p0/m, z0\.h, z7\.h +** ret +*/ +TEST_DUAL_ZD (amin_n_f16_m_tied1, svfloat16_t, svfloat16_t, float16_t, + z0 = svamin_n_f16_m (p0, z0, d7), + z0 = svamin_m 
(p0, z0, d7)) + +/* +** amin_n_f16_m_untied: +** mov z7\.h, h7 +** movprfx z0, z4 +** famin z0\.h, p0/m, z0\.h, z7\.h +** ret +*/ +TEST_DUAL_ZD (amin_n_f16_m_untied, svfloat16_t, svfloat16_t, float16_t, + z0 = svamin_n_f16_m (p0, z4, d7), + z0 = svamin_m (p0, z4, d7)) + +/* +** amin_n_f16_x_tied1: +** mov z7\.h, h7 +** famin z0\.h, p0/m, z0\.h, z7\.h +** ret +*/ +TEST_DUAL_ZD (amin_n_f16_x_tied1, svfloat16_t, svfloat16_t, float16_t, + z0 = svamin_n_f16_x (p0, z0, d7), + z0 = svamin_x (p0, z0, d7)) + +/* +** amin_n_f16_x_untied: +** mov z0\.h, h7 +** famin z0\.h, p0/m, z0\.h, z4\.h +** ret +*/ +TEST_DUAL_ZD (amin_n_f16_x_untied, svfloat16_t, svfloat16_t, float16_t, + z0 = svamin_n_f16_x (p0, z4, d7), + z0 = svamin_x (p0, z4, d7)) + +/* +** amin_n_f16_z_tied1: +** mov z7\.h, h7 +** movprfx z0\.h, p0/z, z0\.h +** famin z0\.h, p0/m, z0\.h, z7\.h +** ret +*/ +TEST_DUAL_ZD (amin_n_f16_z_tied1, svfloat16_t, svfloat16_t, float16_t, + z0 = svamin_n_f16_z (p0, z0, d7), + z0 = svamin_z (p0, z0, d7)) + +/* +** amin_n_f16_z_untied: +** mov z7\.h, h7 +** movprfx z0\.h, p0/z, z4\.h +** famin z0\.h, p0/m, z0\.h, z7\.h +** ret +*/ +TEST_DUAL_ZD (amin_n_f16_z_untied, svfloat16_t, svfloat16_t, float16_t, + z0 = svamin_n_f16_z (p0, z4, d7), + z0 = svamin_z (p0, z4, d7)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c new file mode 100644 index 00000000000..9ea9efbe8de --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c @@ -0,0 +1,155 @@ +/* { dg-additional-options "-O3 -march=armv9-a+sve+faminmax" } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** amin_f32_m_tied1: +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_m_tied1, svfloat32_t, + z0 = svamin_f32_m (p0, z0, z1), + z0 = svamin_m (p0, z0, z1)) + +/* +** amin_f32_m_tied2: +** mov z31\.d, z0\.d +** movprfx z0, z1 +** famin z0\.s, p0/m, 
z0\.s, z31\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_m_tied2, svfloat32_t, + z0 = svamin_f32_m (p0, z1, z0), + z0 = svamin_m (p0, z1, z0)) + +/* +** amin_f32_m_untied: +** movprfx z0, z1 +** famin z0\.s, p0/m, z0\.s, z2\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_m_untied, svfloat32_t, + z0 = svamin_f32_m (p0, z1, z2), + z0 = svamin_m (p0, z1, z2)) + +/* +** amin_f32_x_tied1: +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_x_tied1, svfloat32_t, + z0 = svamin_f32_x (p0, z0, z1), + z0 = svamin_x (p0, z0, z1)) + +/* +** amin_f32_x_tied2: +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_x_tied2, svfloat32_t, + z0 = svamin_f32_x (p0, z1, z0), + z0 = svamin_x (p0, z1, z0)) + +/* +** amin_f32_x_untied: +** movprfx z0, z1 +** famin z0\.s, p0/m, z0\.s, z2\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_x_untied, svfloat32_t, + z0 = svamin_f32_x (p0, z1, z2), + z0 = svamin_x (p0, z1, z2)) + +/* +** amin_f32_z_tied1: +** movprfx z0\.s, p0/z, z0\.s +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_z_tied1, svfloat32_t, + z0 = svamin_f32_z (p0, z0, z1), + z0 = svamin_z (p0, z0, z1)) + +/* +** amin_f32_z_tied2: +** movprfx z0\.s, p0/z, z0\.s +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_z_tied2, svfloat32_t, + z0 = svamin_f32_z (p0, z1, z0), + z0 = svamin_z (p0, z1, z0)) + +/* +** amin_f32_z_untied: +** movprfx z0\.s, p0/z, z1\.s +** famin z0\.s, p0/m, z0\.s, z2\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_z_untied, svfloat32_t, + z0 = svamin_f32_z (p0, z1, z2), + z0 = svamin_z (p0, z1, z2)) + +/* +** amin_n_f32_m_tied1: +** mov z7\.s, s7 +** famin z0\.s, p0/m, z0\.s, z7\.s +** ret +*/ +TEST_DUAL_ZD (amin_n_f32_m_tied1, svfloat32_t, svfloat32_t, float32_t, + z0 = svamin_n_f32_m (p0, z0, d7), + z0 = svamin_m (p0, z0, d7)) + +/* +** amin_n_f32_m_untied: +** mov z7\.s, s7 +** movprfx z0, z4 +** famin z0\.s, p0/m, z0\.s, z7\.s +** ret +*/ +TEST_DUAL_ZD (amin_n_f32_m_untied, svfloat32_t, 
svfloat32_t, float32_t, + z0 = svamin_n_f32_m (p0, z4, d7), + z0 = svamin_m (p0, z4, d7)) + +/* +** amin_n_f32_x_tied1: +** mov z7\.s, s7 +** famin z0\.s, p0/m, z0\.s, z7\.s +** ret +*/ +TEST_DUAL_ZD (amin_n_f32_x_tied1, svfloat32_t, svfloat32_t, float32_t, + z0 = svamin_n_f32_x (p0, z0, d7), + z0 = svamin_x (p0, z0, d7)) + +/* +** amin_n_f32_x_untied: +** mov z0\.s, s7 +** famin z0\.s, p0/m, z0\.s, z4\.s +** ret +*/ +TEST_DUAL_ZD (amin_n_f32_x_untied, svfloat32_t, svfloat32_t, float32_t, + z0 = svamin_n_f32_x (p0, z4, d7), + z0 = svamin_x (p0, z4, d7)) + +/* +** amin_n_f32_z_tied1: +** mov z7\.s, s7 +** movprfx z0\.s, p0/z, z0\.s +** famin z0\.s, p0/m, z0\.s, z7\.s +** ret +*/ +TEST_DUAL_ZD (amin_n_f32_z_tied1, svfloat32_t, svfloat32_t, float32_t, + z0 = svamin_n_f32_z (p0, z0, d7), + z0 = svamin_z (p0, z0, d7)) + +/* +** amin_n_f32_z_untied: +** mov z7\.s, s7 +** movprfx z0\.s, p0/z, z4\.s +** famin z0\.s, p0/m, z0\.s, z7\.s +** ret +*/ +TEST_DUAL_ZD (amin_n_f32_z_untied, svfloat32_t, svfloat32_t, float32_t, + z0 = svamin_n_f32_z (p0, z4, d7), + z0 = svamin_z (p0, z4, d7)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c new file mode 100644 index 00000000000..2570c3d0275 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c @@ -0,0 +1,155 @@ +/* { dg-additional-options "-O3 -march=armv9-a+sve+faminmax" } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** amin_f64_m_tied1: +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_m_tied1, svfloat64_t, + z0 = svamin_f64_m (p0, z0, z1), + z0 = svamin_m (p0, z0, z1)) + +/* +** amin_f64_m_tied2: +** mov z31\.d, z0\.d +** movprfx z0, z1 +** famin z0\.d, p0/m, z0\.d, z31\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_m_tied2, svfloat64_t, + z0 = svamin_f64_m (p0, z1, z0), + z0 = svamin_m (p0, z1, z0)) + +/* +** amin_f64_m_untied: +** movprfx z0, 
z1 +** famin z0\.d, p0/m, z0\.d, z2\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_m_untied, svfloat64_t, + z0 = svamin_f64_m (p0, z1, z2), + z0 = svamin_m (p0, z1, z2)) + +/* +** amin_f64_x_tied1: +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_x_tied1, svfloat64_t, + z0 = svamin_f64_x (p0, z0, z1), + z0 = svamin_x (p0, z0, z1)) + +/* +** amin_f64_x_tied2: +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_x_tied2, svfloat64_t, + z0 = svamin_f64_x (p0, z1, z0), + z0 = svamin_x (p0, z1, z0)) + +/* +** amin_f64_x_untied: +** movprfx z0, z1 +** famin z0\.d, p0/m, z0\.d, z2\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_x_untied, svfloat64_t, + z0 = svamin_f64_x (p0, z1, z2), + z0 = svamin_x (p0, z1, z2)) + +/* +** amin_f64_z_tied1: +** movprfx z0\.d, p0/z, z0\.d +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_z_tied1, svfloat64_t, + z0 = svamin_f64_z (p0, z0, z1), + z0 = svamin_z (p0, z0, z1)) + +/* +** amin_f64_z_tied2: +** movprfx z0\.d, p0/z, z0\.d +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_z_tied2, svfloat64_t, + z0 = svamin_f64_z (p0, z1, z0), + z0 = svamin_z (p0, z1, z0)) + +/* +** amin_f64_z_untied: +** movprfx z0\.d, p0/z, z1\.d +** famin z0\.d, p0/m, z0\.d, z2\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_z_untied, svfloat64_t, + z0 = svamin_f64_z (p0, z1, z2), + z0 = svamin_z (p0, z1, z2)) + +/* +** amin_n_f64_m_tied1: +** mov z7\.d, d7 +** famin z0\.d, p0/m, z0\.d, z7\.d +** ret +*/ +TEST_DUAL_ZD (amin_n_f64_m_tied1, svfloat64_t, svfloat64_t, float64_t, + z0 = svamin_n_f64_m (p0, z0, d7), + z0 = svamin_m (p0, z0, d7)) + +/* +** amin_n_f64_m_untied: +** mov z7\.d, d7 +** movprfx z0, z4 +** famin z0\.d, p0/m, z0\.d, z7\.d +** ret +*/ +TEST_DUAL_ZD (amin_n_f64_m_untied, svfloat64_t, svfloat64_t, float64_t, + z0 = svamin_n_f64_m (p0, z4, d7), + z0 = svamin_m (p0, z4, d7)) + +/* +** amin_n_f64_x_tied1: +** mov z7\.d, d7 +** famin z0\.d, p0/m, z0\.d, z7\.d +** ret +*/ 
+TEST_DUAL_ZD (amin_n_f64_x_tied1, svfloat64_t, svfloat64_t, float64_t, + z0 = svamin_n_f64_x (p0, z0, d7), + z0 = svamin_x (p0, z0, d7)) + +/* +** amin_n_f64_x_untied: +** mov z0\.d, d7 +** famin z0\.d, p0/m, z0\.d, z4\.d +** ret +*/ +TEST_DUAL_ZD (amin_n_f64_x_untied, svfloat64_t, svfloat64_t, float64_t, + z0 = svamin_n_f64_x (p0, z4, d7), + z0 = svamin_x (p0, z4, d7)) + +/* +** amin_n_f64_z_tied1: +** mov z7\.d, d7 +** movprfx z0\.d, p0/z, z0\.d +** famin z0\.d, p0/m, z0\.d, z7\.d +** ret +*/ +TEST_DUAL_ZD (amin_n_f64_z_tied1, svfloat64_t, svfloat64_t, float64_t, + z0 = svamin_n_f64_z (p0, z0, d7), + z0 = svamin_z (p0, z0, d7)) + +/* +** amin_n_f64_z_untied: +** mov z7\.d, d7 +** movprfx z0\.d, p0/z, z4\.d +** famin z0\.d, p0/m, z0\.d, z7\.d +** ret +*/ +TEST_DUAL_ZD (amin_n_f64_z_untied, svfloat64_t, svfloat64_t, float64_t, + z0 = svamin_n_f64_z (p0, z4, d7), + z0 = svamin_z (p0, z4, d7)) From patchwork Fri Sep 13 09:06:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saurabh Jha X-Patchwork-Id: 1985074 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=DJ6jbgzh; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=DJ6jbgzh; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature 
ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4X4pQj6lSJz1y2H for ; Fri, 13 Sep 2024 19:09:15 +1000 (AEST) From: To: CC: , , Saurabh Jha Subject: [PATCH 2/2] aarch64: Add codegen support for SVE2 faminmax Date: Fri, 13 Sep 2024 10:06:55 +0100 Message-ID: <20240913090655.1551666-3-saurabh.jha@arm.com> X-Mailer: git-send-email 2.43.2 In-Reply-To: <20240913090655.1551666-1-saurabh.jha@arm.com> References: <20240913090655.1551666-1-saurabh.jha@arm.com> MIME-Version: 1.0
The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and mandatory from Armv9.5-a. It introduces instructions that compute the element-wise floating-point absolute maximum and minimum of two vectors. This patch adds code generation for famax and famin in terms of existing unspecs. With this patch: 1. famax can be expressed as taking the absolute values of the two operands and then taking fmax/fmaxnm of the results. 2. famin can be expressed as taking the absolute values of the two operands and then taking fmin/fminnm of the results. This fusion of operations is only possible when the -march=armv9-a+faminmax+sve flag is passed. This code generation is only available at -O2 or -O3, as that is when auto-vectorization is enabled. gcc/ChangeLog: * config/aarch64/aarch64-sve.md (*aarch64_pred_faminmax_fused): Instruction pattern for faminmax codegen. * config/aarch64/iterators.md: Attribute for faminmax codegen. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/faminmax.c: New test. --- gcc/config/aarch64/aarch64-sve.md | 29 +++++++ gcc/config/aarch64/iterators.md | 6 ++ .../gcc.target/aarch64/sve/faminmax.c | 85 +++++++++++++++++++ 3 files changed, 120 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/faminmax.c diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index a5cd42be9d5..feb6438efde 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -11111,3 +11111,32 @@ return "sel\t%0.<Vetype>, %3, %2.<Vetype>, %1.<Vetype>"; } ) + +;; ------------------------------------------------------------------------- +;; -- [FP] Absolute maximum and minimum +;; ------------------------------------------------------------------------- +;; Includes: +;; - FAMAX +;; - FAMIN +;; ------------------------------------------------------------------------- + +;; Predicated floating-point absolute maximum and minimum.
+(define_insn "*aarch64_pred_faminmax_fused" + [(set (match_operand:SVE_FULL_F 0 "register_operand" "=w") + (unspec:SVE_FULL_F + [(match_operand: 1 "register_operand" "Upl") + (match_operand:SI 4 "aarch64_sve_gp_strictness" "w") + (unspec:SVE_FULL_F + [(match_operand 5) + (const_int SVE_RELAXED_GP) + (match_operand:SVE_FULL_F 2 "register_operand" "w")] + UNSPEC_COND_FABS) + (unspec:SVE_FULL_F + [(match_operand 6) + (const_int SVE_RELAXED_GP) + (match_operand:SVE_FULL_F 3 "register_operand" "w")] + UNSPEC_COND_FABS)] + SVE_COND_FP_MAXMIN))] + "TARGET_SVE_FAMINMAX" + "\t%0., %1/m, %0., %3." +) diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index b993ac9a7f6..5bdf1970f92 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -4489,5 +4489,11 @@ (define_int_attr faminmax_uns_op [(UNSPEC_FAMAX "famax") (UNSPEC_FAMIN "famin")]) +(define_int_attr faminmax_cond_uns_op + [(UNSPEC_COND_FMAX "famax") + (UNSPEC_COND_FMAXNM "famax") + (UNSPEC_COND_FMIN "famin") + (UNSPEC_COND_FMINNM "famin")]) + (define_code_attr faminmax_op [(smax "famax") (smin "famin")]) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/faminmax.c b/gcc/testsuite/gcc.target/aarch64/sve/faminmax.c new file mode 100644 index 00000000000..b70e19fa276 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/faminmax.c @@ -0,0 +1,85 @@ +/* { dg-do assemble} */ +/* { dg-additional-options "-O3 -ffast-math -march=armv9-a+sve+faminmax" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "arm_sve.h" + +#pragma GCC target "+sve" + +#define TEST_FAMAX(TYPE) \ + void fn_famax_##TYPE (TYPE * restrict a, \ + TYPE * restrict b, \ + TYPE * restrict c, \ + int n) { \ + for (int i = 0; i < n; i++) { \ + TYPE temp1 = __builtin_fabs (a[i]); \ + TYPE temp2 = __builtin_fabs (b[i]); \ + c[i] = __builtin_fmax (temp1, temp2); \ + } \ + } \ + +#define TEST_FAMIN(TYPE) \ + void fn_famin_##TYPE (TYPE * restrict a, \ + TYPE * restrict b, \ + TYPE * 
restrict c, \ + int n) { \ + for (int i = 0; i < n; i++) { \ + TYPE temp1 = __builtin_fabs (a[i]); \ + TYPE temp2 = __builtin_fabs (b[i]); \ + c[i] = __builtin_fmin (temp1, temp2); \ + } \ + } \ + +/* +** fn_famax_float16_t: +** ... +** famax z31.h, p6/m, z31.h, z30.h +** ... +** ret +*/ +TEST_FAMAX (float16_t) + +/* +** fn_famax_float32_t: +** ... +** famax z31.s, p6/m, z31.s, z30.s +** ... +** ret +*/ +TEST_FAMAX (float32_t) + +/* +** fn_famax_float64_t: +** ... +** famax z31.d, p6/m, z31.d, z30.d +** ... +** ret +*/ +TEST_FAMAX (float64_t) + +/* +** fn_famin_float16_t: +** ... +** famin z31.h, p6/m, z31.h, z30.h +** ... +** ret +*/ +TEST_FAMIN (float16_t) + +/* +** fn_famin_float32_t: +** ... +** famin z31.s, p6/m, z31.s, z30.s +** ... +** ret +*/ +TEST_FAMIN (float32_t) + +/* +** fn_famin_float64_t: +** ... +** famin z31.d, p6/m, z31.d, z30.d +** ... +** ret +*/ +TEST_FAMIN (float64_t)