From patchwork Thu Oct 3 09:51:56 2024
From: Saurabh Jha <saurabh.jha@arm.com>
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v4 1/2] aarch64: Add SVE2 faminmax intrinsics
Date: Thu, 3 Oct 2024 10:51:56 +0100
Message-ID: <20241003095157.1390838-2-saurabh.jha@arm.com>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20241003095157.1390838-1-saurabh.jha@arm.com>
References: <20241003095157.1390838-1-saurabh.jha@arm.com>
MIME-Version: 1.0
The AArch64 FEAT_FAMINMAX extension introduces instructions that compute
the element-wise floating-point absolute maximum and minimum of two
vectors.  This patch adds the corresponding SVE2 faminmax intrinsics,
implemented as the following builtin functions:

* sva[max|min]_[m|x|z]
* sva[max|min]_[f16|f32|f64]_[m|x|z]
* sva[max|min]_n_[f16|f32|f64]_[m|x|z]
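For reference, here is a minimal usage sketch of the new intrinsics
(illustration only, not part of the patch); it assumes a compiler with
this series applied.  Per the FEAT_FAMINMAX description, famax and famin
compute, per element, roughly the maximum and minimum of the operands'
absolute values:

  #include <arm_sve.h>

  #pragma GCC target "+sve+faminmax"

  /* Merging form (_m): inactive lanes keep the corresponding lane of A.  */
  svfloat32_t
  use_amax (svbool_t pg, svfloat32_t a, svfloat32_t b)
  {
    return svamax_m (pg, a, b);		/* roughly max (|a[i]|, |b[i]|) */
  }

  /* _n form takes a scalar second operand; the "don't care" form (_x)
     leaves inactive lanes unspecified.  */
  svfloat16_t
  use_amin (svbool_t pg, svfloat16_t a, __fp16 b)
  {
    return svamin_n_f16_x (pg, a, b);	/* roughly min (|a[i]|, |b|) */
  }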
gcc/ChangeLog:

	* config/aarch64/aarch64-sve-builtins-base.cc (svamax): Absolute
	maximum declaration.
	(svamin): Absolute minimum declaration.
	* config/aarch64/aarch64-sve-builtins-base.def
	(REQUIRED_EXTENSIONS): Add faminmax intrinsics behind a flag.
	(svamax): Absolute maximum declaration.
	(svamin): Absolute minimum declaration.
	* config/aarch64/aarch64-sve-builtins-base.h: Declare function
	bases for the new intrinsics.
	* config/aarch64/aarch64.h (TARGET_SVE_FAMINMAX): New flag for
	SVE2 faminmax.
	* config/aarch64/iterators.md: New unspecs, iterators, and attrs
	for the new intrinsics.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve2/acle/asm/amax_f16.c: New test.
	* gcc.target/aarch64/sve2/acle/asm/amax_f32.c: New test.
	* gcc.target/aarch64/sve2/acle/asm/amax_f64.c: New test.
	* gcc.target/aarch64/sve2/acle/asm/amin_f16.c: New test.
	* gcc.target/aarch64/sve2/acle/asm/amin_f32.c: New test.
	* gcc.target/aarch64/sve2/acle/asm/amin_f64.c: New test.
---
 .../aarch64/aarch64-sve-builtins-base.cc      |   4 +
 .../aarch64/aarch64-sve-builtins-base.def     |   5 +
 .../aarch64/aarch64-sve-builtins-base.h       |   2 +
 gcc/config/aarch64/aarch64.h                  |   1 +
 gcc/config/aarch64/iterators.md               |  18 +-
 .../aarch64/sve2/acle/asm/amax_f16.c          | 312 ++++++++++++++++++
 .../aarch64/sve2/acle/asm/amax_f32.c          | 312 ++++++++++++++++++
 .../aarch64/sve2/acle/asm/amax_f64.c          | 312 ++++++++++++++++++
 .../aarch64/sve2/acle/asm/amin_f16.c          | 311 +++++++++++++++++
 .../aarch64/sve2/acle/asm/amin_f32.c          | 312 ++++++++++++++++++
 .../aarch64/sve2/acle/asm/amin_f64.c          | 312 ++++++++++++++++++
 11 files changed, 1900 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 4b33585d981..b189818d643 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -3071,6 +3071,10 @@ FUNCTION (svadrb, svadr_bhwd_impl, (0))
 FUNCTION (svadrd, svadr_bhwd_impl, (3))
 FUNCTION (svadrh, svadr_bhwd_impl, (1))
 FUNCTION (svadrw, svadr_bhwd_impl, (2))
+FUNCTION (svamax, cond_or_uncond_unspec_function,
+	  (UNSPEC_COND_FAMAX, UNSPEC_FAMAX))
+FUNCTION (svamin, cond_or_uncond_unspec_function,
+	  (UNSPEC_COND_FAMIN, UNSPEC_FAMIN))
 FUNCTION (svand, rtx_code_function, (AND, AND))
 FUNCTION (svandv, reduction, (UNSPEC_ANDV))
 FUNCTION (svasr, rtx_code_function, (ASHIFTRT, ASHIFTRT))
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.def b/gcc/config/aarch64/aarch64-sve-builtins-base.def
index 65fcba91586..95e04e4393d 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.def
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.def
@@ -379,3 +379,8 @@ DEF_SVE_FUNCTION (svzip2q, binary, all_data, none)
 DEF_SVE_FUNCTION (svld1ro, load_replicate, all_data, implicit)
 DEF_SVE_FUNCTION (svmmla, mmla, d_float, none)
 #undef REQUIRED_EXTENSIONS
+
+#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_FAMINMAX
+DEF_SVE_FUNCTION (svamax, binary_opt_single_n, all_float, mxz)
+DEF_SVE_FUNCTION (svamin, binary_opt_single_n, all_float, mxz)
+#undef REQUIRED_EXTENSIONS
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.h b/gcc/config/aarch64/aarch64-sve-builtins-base.h
index 5bbf3569c4b..978cf7013f9 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.h
@@ -37,6 +37,8 @@ namespace aarch64_sve
   extern const function_base *const svadrd;
   extern const function_base *const svadrh;
   extern const function_base *const svadrw;
+  extern const function_base *const svamax;
+  extern const function_base *const svamin;
   extern const function_base *const svand;
   extern const function_base *const svandv;
   extern const function_base *const svasr;
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index ec8fde783b3..34f56a4b869 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -470,6 +470,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED
 /* Floating Point Absolute Maximum/Minimum extension instructions are
    enabled through +faminmax.  */
 #define TARGET_FAMINMAX AARCH64_HAVE_ISA (FAMINMAX)
+#define TARGET_SVE_FAMINMAX (TARGET_SVE && TARGET_FAMINMAX)
 
 /* Prefer different predicate registers for the output of a predicated
    operation over re-using an existing input predicate.  */
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 0836dee61c9..c06f8c2c90f 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -841,6 +841,8 @@
     UNSPEC_COND_CMPNE_WIDE	; Used in aarch64-sve.md.
     UNSPEC_COND_FABS	; Used in aarch64-sve.md.
     UNSPEC_COND_FADD	; Used in aarch64-sve.md.
+    UNSPEC_COND_FAMAX	; Used in aarch64-sve.md.
+    UNSPEC_COND_FAMIN	; Used in aarch64-sve.md.
     UNSPEC_COND_FCADD90	; Used in aarch64-sve.md.
     UNSPEC_COND_FCADD270	; Used in aarch64-sve.md.
     UNSPEC_COND_FCMEQ	; Used in aarch64-sve.md.
@@ -3085,6 +3087,8 @@
 
 (define_int_iterator SVE_COND_FP_BINARY
   [UNSPEC_COND_FADD
+   (UNSPEC_COND_FAMAX "TARGET_SVE_FAMINMAX")
+   (UNSPEC_COND_FAMIN "TARGET_SVE_FAMINMAX")
    UNSPEC_COND_FDIV
    UNSPEC_COND_FMAX
    UNSPEC_COND_FMAXNM
@@ -3124,7 +3128,9 @@
    UNSPEC_COND_SMIN])
 
 (define_int_iterator SVE_COND_FP_BINARY_REG
-  [UNSPEC_COND_FDIV
+  [(UNSPEC_COND_FAMAX "TARGET_SVE_FAMINMAX")
+   (UNSPEC_COND_FAMIN "TARGET_SVE_FAMINMAX")
+   UNSPEC_COND_FDIV
    UNSPEC_COND_FMULX
    UNSPEC_COND_SMAX
    UNSPEC_COND_SMIN])
@@ -3701,6 +3707,8 @@
 			(UNSPEC_ZIP2Q "zip2q")
 			(UNSPEC_COND_FABS "abs")
 			(UNSPEC_COND_FADD "add")
+			(UNSPEC_COND_FAMAX "famax")
+			(UNSPEC_COND_FAMIN "famin")
 			(UNSPEC_COND_FCADD90 "cadd90")
 			(UNSPEC_COND_FCADD270 "cadd270")
 			(UNSPEC_COND_FCMLA "fcmla")
@@ -4237,6 +4245,8 @@
 			   (UNSPEC_FTSSEL "ftssel")
 			   (UNSPEC_COND_FABS "fabs")
 			   (UNSPEC_COND_FADD "fadd")
+			   (UNSPEC_COND_FAMAX "famax")
+			   (UNSPEC_COND_FAMIN "famin")
 			   (UNSPEC_COND_FCVTLT "fcvtlt")
 			   (UNSPEC_COND_FCVTX "fcvtx")
 			   (UNSPEC_COND_FDIV "fdiv")
@@ -4263,6 +4273,8 @@
 			       (UNSPEC_COND_SMIN "fminnm")])
 
 (define_int_attr sve_fp_op_rev [(UNSPEC_COND_FADD "fadd")
+				(UNSPEC_COND_FAMAX "famax")
+				(UNSPEC_COND_FAMIN "famin")
 				(UNSPEC_COND_FDIV "fdivr")
 				(UNSPEC_COND_FMAX "fmax")
 				(UNSPEC_COND_FMAXNM "fmaxnm")
@@ -4401,6 +4413,8 @@
 ;; 3 pattern.
 (define_int_attr sve_pred_fp_rhs1_operand
   [(UNSPEC_COND_FADD "register_operand")
+   (UNSPEC_COND_FAMAX "register_operand")
+   (UNSPEC_COND_FAMIN "register_operand")
    (UNSPEC_COND_FDIV "register_operand")
    (UNSPEC_COND_FMAX "register_operand")
    (UNSPEC_COND_FMAXNM "register_operand")
@@ -4416,6 +4430,8 @@
 ;; 3 pattern.
 (define_int_attr sve_pred_fp_rhs2_operand
   [(UNSPEC_COND_FADD "aarch64_sve_float_arith_with_sub_operand")
+   (UNSPEC_COND_FAMAX "aarch64_sve_float_maxmin_operand")
+   (UNSPEC_COND_FAMIN "aarch64_sve_float_maxmin_operand")
    (UNSPEC_COND_FDIV "register_operand")
    (UNSPEC_COND_FMAX "aarch64_sve_float_maxmin_operand")
    (UNSPEC_COND_FMAXNM "aarch64_sve_float_maxmin_operand")
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c
new file mode 100644
index 00000000000..de4a6f8efaa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c
@@ -0,0 +1,312 @@
+/* { dg-do compile } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+sve+faminmax"
+
+/*
+** amax_f16_m_tied1:
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_m_tied1, svfloat16_t,
+		z0 = svamax_f16_m (p0, z0, z1),
+		z0 = svamax_m (p0, z0, z1))
+
+/*
+** amax_f16_m_tied2:
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z1
+**	famax	z0\.h, p0/m, z0\.h, \1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_m_tied2, svfloat16_t,
+		z0 = svamax_f16_m (p0, z1, z0),
+		z0 = svamax_m (p0, z1, z0))
+
+/*
+** amax_f16_m_untied:
+**	movprfx	z0, z1
+**	famax	z0\.h, p0/m, z0\.h, z2\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_m_untied, svfloat16_t,
+		z0 = svamax_f16_m (p0, z1, z2),
+		z0 = svamax_m (p0, z1, z2))
+
+/*
+** amax_h4_f16_m_tied1:
+**	mov	(z[0-9]+\.h), h4
+**	famax	z0\.h, p0/m, z0\.h, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_h4_f16_m_tied1, svfloat16_t, __fp16,
+		 z0 = svamax_n_f16_m (p0, z0, d4),
+		 z0 = svamax_m (p0, z0, d4))
+
+/*
+** amax_h4_f16_m_untied:
+**	mov	(z[0-9]+\.h), h4
+**	movprfx	z0, z1
+**	famax	z0\.h, p0/m, z0\.h, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_h4_f16_m_untied, svfloat16_t, __fp16,
+		 z0 = svamax_n_f16_m (p0, z1, d4),
+		 z0 = svamax_m (p0, z1, d4))
+
+/*
+** amax_2_f16_m:
+**	fmov	(z[0-9]+\.h), #2\.0(?:e\+0)?
+**	famax	z0\.h, p0/m, z0\.h, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amax_2_f16_m, svfloat16_t,
+		z0 = svamax_n_f16_m (p0, z0, 2),
+		z0 = svamax_m (p0, z0, 2))
+
+/*
+** amax_f16_z_tied1:
+**	movprfx	z0\.h, p0/z, z0\.h
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_z_tied1, svfloat16_t,
+		z0 = svamax_f16_z (p0, z0, z1),
+		z0 = svamax_z (p0, z0, z1))
+
+/*
+** amax_f16_z_tied2:
+**	movprfx	z0\.h, p0/z, z0\.h
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_z_tied2, svfloat16_t,
+		z0 = svamax_f16_z (p0, z1, z0),
+		z0 = svamax_z (p0, z1, z0))
+
+/*
+** amax_f16_z_untied:
+** (
+**	movprfx	z0\.h, p0/z, z1\.h
+**	famax	z0\.h, p0/m, z0\.h, z2\.h
+** |
+**	movprfx	z0\.h, p0/z, z2\.h
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+** )
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_z_untied, svfloat16_t,
+		z0 = svamax_f16_z (p0, z1, z2),
+		z0 = svamax_z (p0, z1, z2))
+
+/*
+** amax_h4_f16_z_tied1:
+**	mov	(z[0-9]+\.h), h4
+**	movprfx	z0\.h, p0/z, z0\.h
+**	famax	z0\.h, p0/m, z0\.h, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_h4_f16_z_tied1, svfloat16_t, __fp16,
+		 z0 = svamax_n_f16_z (p0, z0, d4),
+		 z0 = svamax_z (p0, z0, d4))
+
+/*
+** amax_h4_f16_z_untied:
+**	mov	(z[0-9]+\.h), h4
+** (
+**	movprfx	z0\.h, p0/z, z1\.h
+**	famax	z0\.h, p0/m, z0\.h, \1
+** |
+**	movprfx	z0\.h, p0/z, \1
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+** )
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_h4_f16_z_untied, svfloat16_t, __fp16,
+		 z0 = svamax_n_f16_z (p0, z1, d4),
+		 z0 = svamax_z (p0, z1, d4))
+
+/*
+** amax_2_f16_z:
+**	fmov	(z[0-9]+\.h), #2\.0(?:e\+0)?
+**	movprfx	z0\.h, p0/z, z0\.h
+**	famax	z0\.h, p0/m, z0\.h, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amax_2_f16_z, svfloat16_t,
+		z0 = svamax_n_f16_z (p0, z0, 2),
+		z0 = svamax_z (p0, z0, 2))
+
+/*
+** amax_f16_x_tied1:
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_x_tied1, svfloat16_t,
+		z0 = svamax_f16_x (p0, z0, z1),
+		z0 = svamax_x (p0, z0, z1))
+
+/*
+** amax_f16_x_tied2:
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_x_tied2, svfloat16_t,
+		z0 = svamax_f16_x (p0, z1, z0),
+		z0 = svamax_x (p0, z1, z0))
+
+/*
+** amax_f16_x_untied:
+** (
+**	movprfx	z0, z1
+**	famax	z0\.h, p0/m, z0\.h, z2\.h
+** |
+**	movprfx	z0, z2
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+** )
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f16_x_untied, svfloat16_t,
+		z0 = svamax_f16_x (p0, z1, z2),
+		z0 = svamax_x (p0, z1, z2))
+
+/*
+** amax_h4_f16_x_tied1:
+**	mov	(z[0-9]+\.h), h4
+**	famax	z0\.h, p0/m, z0\.h, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_h4_f16_x_tied1, svfloat16_t, __fp16,
+		 z0 = svamax_n_f16_x (p0, z0, d4),
+		 z0 = svamax_x (p0, z0, d4))
+
+/*
+** amax_h4_f16_x_untied:
+**	mov	z0\.h, h4
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_h4_f16_x_untied, svfloat16_t, __fp16,
+		 z0 = svamax_n_f16_x (p0, z1, d4),
+		 z0 = svamax_x (p0, z1, d4))
+
+/*
+** amax_2_f16_x_tied1:
+**	fmov	(z[0-9]+\.h), #2\.0(?:e\+0)?
+**	famax	z0\.h, p0/m, z0\.h, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amax_2_f16_x_tied1, svfloat16_t,
+		z0 = svamax_n_f16_x (p0, z0, 2),
+		z0 = svamax_x (p0, z0, 2))
+
+/*
+** amax_2_f16_x_untied:
+**	fmov	z0\.h, #2\.0(?:e\+0)?
+**	famax	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amax_2_f16_x_untied, svfloat16_t,
+		z0 = svamax_n_f16_x (p0, z1, 2),
+		z0 = svamax_x (p0, z1, 2))
+
+/*
+** ptrue_amax_f16_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_f16_x_tied1, svfloat16_t,
+		z0 = svamax_f16_x (svptrue_b16 (), z0, z1),
+		z0 = svamax_x (svptrue_b16 (), z0, z1))
+
+/*
+** ptrue_amax_f16_x_tied2:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_f16_x_tied2, svfloat16_t,
+		z0 = svamax_f16_x (svptrue_b16 (), z1, z0),
+		z0 = svamax_x (svptrue_b16 (), z1, z0))
+
+/*
+** ptrue_amax_f16_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_f16_x_untied, svfloat16_t,
+		z0 = svamax_f16_x (svptrue_b16 (), z1, z2),
+		z0 = svamax_x (svptrue_b16 (), z1, z2))
+
+/*
+** ptrue_amax_0_f16_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_0_f16_x_tied1, svfloat16_t,
+		z0 = svamax_n_f16_x (svptrue_b16 (), z0, 0),
+		z0 = svamax_x (svptrue_b16 (), z0, 0))
+
+/*
+** ptrue_amax_0_f16_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_0_f16_x_untied, svfloat16_t,
+		z0 = svamax_n_f16_x (svptrue_b16 (), z1, 0),
+		z0 = svamax_x (svptrue_b16 (), z1, 0))
+
+/*
+** ptrue_amax_1_f16_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_1_f16_x_tied1, svfloat16_t,
+		z0 = svamax_n_f16_x (svptrue_b16 (), z0, 1),
+		z0 = svamax_x (svptrue_b16 (), z0, 1))
+
+/*
+** ptrue_amax_1_f16_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_1_f16_x_untied, svfloat16_t,
+		z0 = svamax_n_f16_x (svptrue_b16 (), z1, 1),
+		z0 = svamax_x (svptrue_b16 (), z1, 1))
+
+/*
+** ptrue_amax_2_f16_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_2_f16_x_tied1, svfloat16_t,
+		z0 = svamax_n_f16_x (svptrue_b16 (), z0, 2),
+		z0 = svamax_x (svptrue_b16 (), z0, 2))
+
+/*
+** ptrue_amax_2_f16_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_2_f16_x_untied, svfloat16_t,
+		z0 = svamax_n_f16_x (svptrue_b16 (), z1, 2),
+		z0 = svamax_x (svptrue_b16 (), z1, 2))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c
new file mode 100644
index 00000000000..24280724c95
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c
@@ -0,0 +1,312 @@
+/* { dg-do compile } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+sve+faminmax"
+
+/*
+** amax_f32_m_tied1:
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_m_tied1, svfloat32_t,
+		z0 = svamax_f32_m (p0, z0, z1),
+		z0 = svamax_m (p0, z0, z1))
+
+/*
+** amax_f32_m_tied2:
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z1
+**	famax	z0\.s, p0/m, z0\.s, \1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_m_tied2, svfloat32_t,
+		z0 = svamax_f32_m (p0, z1, z0),
+		z0 = svamax_m (p0, z1, z0))
+
+/*
+** amax_f32_m_untied:
+**	movprfx	z0, z1
+**	famax	z0\.s, p0/m, z0\.s, z2\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_m_untied, svfloat32_t,
+		z0 = svamax_f32_m (p0, z1, z2),
+		z0 = svamax_m (p0, z1, z2))
+
+/*
+** amax_s4_f32_m_tied1:
+**	mov	(z[0-9]+\.s), s4
+**	famax	z0\.s, p0/m, z0\.s, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_s4_f32_m_tied1, svfloat32_t, float,
+		 z0 = svamax_n_f32_m (p0, z0, d4),
+		 z0 = svamax_m (p0, z0, d4))
+
+/*
+** amax_s4_f32_m_untied:
+**	mov	(z[0-9]+\.s), s4
+**	movprfx	z0, z1
+**	famax	z0\.s, p0/m, z0\.s, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_s4_f32_m_untied, svfloat32_t, float,
+		 z0 = svamax_n_f32_m (p0, z1, d4),
+		 z0 = svamax_m (p0, z1, d4))
+
+/*
+** amax_2_f32_m:
+**	fmov	(z[0-9]+\.s), #2\.0(?:e\+0)?
+**	famax	z0\.s, p0/m, z0\.s, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amax_2_f32_m, svfloat32_t,
+		z0 = svamax_n_f32_m (p0, z0, 2),
+		z0 = svamax_m (p0, z0, 2))
+
+/*
+** amax_f32_z_tied1:
+**	movprfx	z0\.s, p0/z, z0\.s
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_z_tied1, svfloat32_t,
+		z0 = svamax_f32_z (p0, z0, z1),
+		z0 = svamax_z (p0, z0, z1))
+
+/*
+** amax_f32_z_tied2:
+**	movprfx	z0\.s, p0/z, z0\.s
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_z_tied2, svfloat32_t,
+		z0 = svamax_f32_z (p0, z1, z0),
+		z0 = svamax_z (p0, z1, z0))
+
+/*
+** amax_f32_z_untied:
+** (
+**	movprfx	z0\.s, p0/z, z1\.s
+**	famax	z0\.s, p0/m, z0\.s, z2\.s
+** |
+**	movprfx	z0\.s, p0/z, z2\.s
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+** )
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_z_untied, svfloat32_t,
+		z0 = svamax_f32_z (p0, z1, z2),
+		z0 = svamax_z (p0, z1, z2))
+
+/*
+** amax_s4_f32_z_tied1:
+**	mov	(z[0-9]+\.s), s4
+**	movprfx	z0\.s, p0/z, z0\.s
+**	famax	z0\.s, p0/m, z0\.s, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_s4_f32_z_tied1, svfloat32_t, float,
+		 z0 = svamax_n_f32_z (p0, z0, d4),
+		 z0 = svamax_z (p0, z0, d4))
+
+/*
+** amax_s4_f32_z_untied:
+**	mov	(z[0-9]+\.s), s4
+** (
+**	movprfx	z0\.s, p0/z, z1\.s
+**	famax	z0\.s, p0/m, z0\.s, \1
+** |
+**	movprfx	z0\.s, p0/z, \1
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+** )
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_s4_f32_z_untied, svfloat32_t, float,
+		 z0 = svamax_n_f32_z (p0, z1, d4),
+		 z0 = svamax_z (p0, z1, d4))
+
+/*
+** amax_2_f32_z:
+**	fmov	(z[0-9]+\.s), #2\.0(?:e\+0)?
+**	movprfx	z0\.s, p0/z, z0\.s
+**	famax	z0\.s, p0/m, z0\.s, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amax_2_f32_z, svfloat32_t,
+		z0 = svamax_n_f32_z (p0, z0, 2),
+		z0 = svamax_z (p0, z0, 2))
+
+/*
+** amax_f32_x_tied1:
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_x_tied1, svfloat32_t,
+		z0 = svamax_f32_x (p0, z0, z1),
+		z0 = svamax_x (p0, z0, z1))
+
+/*
+** amax_f32_x_tied2:
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_x_tied2, svfloat32_t,
+		z0 = svamax_f32_x (p0, z1, z0),
+		z0 = svamax_x (p0, z1, z0))
+
+/*
+** amax_f32_x_untied:
+** (
+**	movprfx	z0, z1
+**	famax	z0\.s, p0/m, z0\.s, z2\.s
+** |
+**	movprfx	z0, z2
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+** )
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f32_x_untied, svfloat32_t,
+		z0 = svamax_f32_x (p0, z1, z2),
+		z0 = svamax_x (p0, z1, z2))
+
+/*
+** amax_s4_f32_x_tied1:
+**	mov	(z[0-9]+\.s), s4
+**	famax	z0\.s, p0/m, z0\.s, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_s4_f32_x_tied1, svfloat32_t, float,
+		 z0 = svamax_n_f32_x (p0, z0, d4),
+		 z0 = svamax_x (p0, z0, d4))
+
+/*
+** amax_s4_f32_x_untied:
+**	mov	z0\.s, s4
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_s4_f32_x_untied, svfloat32_t, float,
+		 z0 = svamax_n_f32_x (p0, z1, d4),
+		 z0 = svamax_x (p0, z1, d4))
+
+/*
+** amax_2_f32_x_tied1:
+**	fmov	(z[0-9]+\.s), #2\.0(?:e\+0)?
+**	famax	z0\.s, p0/m, z0\.s, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amax_2_f32_x_tied1, svfloat32_t,
+		z0 = svamax_n_f32_x (p0, z0, 2),
+		z0 = svamax_x (p0, z0, 2))
+
+/*
+** amax_2_f32_x_untied:
+**	fmov	z0\.s, #2\.0(?:e\+0)?
+**	famax	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amax_2_f32_x_untied, svfloat32_t,
+		z0 = svamax_n_f32_x (p0, z1, 2),
+		z0 = svamax_x (p0, z1, 2))
+
+/*
+** ptrue_amax_f32_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_f32_x_tied1, svfloat32_t,
+		z0 = svamax_f32_x (svptrue_b32 (), z0, z1),
+		z0 = svamax_x (svptrue_b32 (), z0, z1))
+
+/*
+** ptrue_amax_f32_x_tied2:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_f32_x_tied2, svfloat32_t,
+		z0 = svamax_f32_x (svptrue_b32 (), z1, z0),
+		z0 = svamax_x (svptrue_b32 (), z1, z0))
+
+/*
+** ptrue_amax_f32_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_f32_x_untied, svfloat32_t,
+		z0 = svamax_f32_x (svptrue_b32 (), z1, z2),
+		z0 = svamax_x (svptrue_b32 (), z1, z2))
+
+/*
+** ptrue_amax_0_f32_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_0_f32_x_tied1, svfloat32_t,
+		z0 = svamax_n_f32_x (svptrue_b32 (), z0, 0),
+		z0 = svamax_x (svptrue_b32 (), z0, 0))
+
+/*
+** ptrue_amax_0_f32_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_0_f32_x_untied, svfloat32_t,
+		z0 = svamax_n_f32_x (svptrue_b32 (), z1, 0),
+		z0 = svamax_x (svptrue_b32 (), z1, 0))
+
+/*
+** ptrue_amax_1_f32_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_1_f32_x_tied1, svfloat32_t,
+		z0 = svamax_n_f32_x (svptrue_b32 (), z0, 1),
+		z0 = svamax_x (svptrue_b32 (), z0, 1))
+
+/*
+** ptrue_amax_1_f32_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_1_f32_x_untied, svfloat32_t,
+		z0 = svamax_n_f32_x (svptrue_b32 (), z1, 1),
+		z0 = svamax_x (svptrue_b32 (), z1, 1))
+
+/*
+** ptrue_amax_2_f32_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_2_f32_x_tied1, svfloat32_t,
+		z0 = svamax_n_f32_x (svptrue_b32 (), z0, 2),
+		z0 = svamax_x (svptrue_b32 (), z0, 2))
+
+/*
+** ptrue_amax_2_f32_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_2_f32_x_untied, svfloat32_t,
+		z0 = svamax_n_f32_x (svptrue_b32 (), z1, 2),
+		z0 = svamax_x (svptrue_b32 (), z1, 2))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c
new file mode 100644
index 00000000000..5b73db45d8b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c
@@ -0,0 +1,312 @@
+/* { dg-do compile } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+sve+faminmax"
+
+/*
+** amax_f64_m_tied1:
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_m_tied1, svfloat64_t,
+		z0 = svamax_f64_m (p0, z0, z1),
+		z0 = svamax_m (p0, z0, z1))
+
+/*
+** amax_f64_m_tied2:
+**	mov	(z[0-9]+\.d), z0\.d
+**	movprfx	z0, z1
+**	famax	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_m_tied2, svfloat64_t,
+		z0 = svamax_f64_m (p0, z1, z0),
+		z0 = svamax_m (p0, z1, z0))
+
+/*
+** amax_f64_m_untied:
+**	movprfx	z0, z1
+**	famax	z0\.d, p0/m, z0\.d, z2\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_m_untied, svfloat64_t,
+		z0 = svamax_f64_m (p0, z1, z2),
+		z0 = svamax_m (p0, z1, z2))
+
+/*
+** amax_d4_f64_m_tied1:
+**	mov	(z[0-9]+\.d), d4
+**	famax	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_d4_f64_m_tied1, svfloat64_t, double,
+		 z0 = svamax_n_f64_m (p0, z0, d4),
+		 z0 = svamax_m (p0, z0, d4))
+
+/*
+** amax_d4_f64_m_untied:
+**	mov	(z[0-9]+\.d), d4
+**	movprfx	z0, z1
+**	famax	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_d4_f64_m_untied, svfloat64_t, double,
+		 z0 = svamax_n_f64_m (p0, z1, d4),
+		 z0 = svamax_m (p0, z1, d4))
+
+/*
+** amax_2_f64_m:
+**	fmov	(z[0-9]+\.d), #2\.0(?:e\+0)?
+**	famax	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amax_2_f64_m, svfloat64_t,
+		z0 = svamax_n_f64_m (p0, z0, 2),
+		z0 = svamax_m (p0, z0, 2))
+
+/*
+** amax_f64_z_tied1:
+**	movprfx	z0\.d, p0/z, z0\.d
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_z_tied1, svfloat64_t,
+		z0 = svamax_f64_z (p0, z0, z1),
+		z0 = svamax_z (p0, z0, z1))
+
+/*
+** amax_f64_z_tied2:
+**	movprfx	z0\.d, p0/z, z0\.d
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_z_tied2, svfloat64_t,
+		z0 = svamax_f64_z (p0, z1, z0),
+		z0 = svamax_z (p0, z1, z0))
+
+/*
+** amax_f64_z_untied:
+** (
+**	movprfx	z0\.d, p0/z, z1\.d
+**	famax	z0\.d, p0/m, z0\.d, z2\.d
+** |
+**	movprfx	z0\.d, p0/z, z2\.d
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+** )
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_z_untied, svfloat64_t,
+		z0 = svamax_f64_z (p0, z1, z2),
+		z0 = svamax_z (p0, z1, z2))
+
+/*
+** amax_d4_f64_z_tied1:
+**	mov	(z[0-9]+\.d), d4
+**	movprfx	z0\.d, p0/z, z0\.d
+**	famax	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_d4_f64_z_tied1, svfloat64_t, double,
+		 z0 = svamax_n_f64_z (p0, z0, d4),
+		 z0 = svamax_z (p0, z0, d4))
+
+/*
+** amax_d4_f64_z_untied:
+**	mov	(z[0-9]+\.d), d4
+** (
+**	movprfx	z0\.d, p0/z, z1\.d
+**	famax	z0\.d, p0/m, z0\.d, \1
+** |
+**	movprfx	z0\.d, p0/z, \1
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+** )
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_d4_f64_z_untied, svfloat64_t, double,
+		 z0 = svamax_n_f64_z (p0, z1, d4),
+		 z0 = svamax_z (p0, z1, d4))
+
+/*
+** amax_2_f64_z:
+**	fmov	(z[0-9]+\.d), #2\.0(?:e\+0)?
+**	movprfx	z0\.d, p0/z, z0\.d
+**	famax	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amax_2_f64_z, svfloat64_t,
+		z0 = svamax_n_f64_z (p0, z0, 2),
+		z0 = svamax_z (p0, z0, 2))
+
+/*
+** amax_f64_x_tied1:
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_x_tied1, svfloat64_t,
+		z0 = svamax_f64_x (p0, z0, z1),
+		z0 = svamax_x (p0, z0, z1))
+
+/*
+** amax_f64_x_tied2:
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_x_tied2, svfloat64_t,
+		z0 = svamax_f64_x (p0, z1, z0),
+		z0 = svamax_x (p0, z1, z0))
+
+/*
+** amax_f64_x_untied:
+** (
+**	movprfx	z0, z1
+**	famax	z0\.d, p0/m, z0\.d, z2\.d
+** |
+**	movprfx	z0, z2
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+** )
+**	ret
+*/
+TEST_UNIFORM_Z (amax_f64_x_untied, svfloat64_t,
+		z0 = svamax_f64_x (p0, z1, z2),
+		z0 = svamax_x (p0, z1, z2))
+
+/*
+** amax_d4_f64_x_tied1:
+**	mov	(z[0-9]+\.d), d4
+**	famax	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_d4_f64_x_tied1, svfloat64_t, double,
+		 z0 = svamax_n_f64_x (p0, z0, d4),
+		 z0 = svamax_x (p0, z0, d4))
+
+/*
+** amax_d4_f64_x_untied:
+**	mov	z0\.d, d4
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_ZD (amax_d4_f64_x_untied, svfloat64_t, double,
+		 z0 = svamax_n_f64_x (p0, z1, d4),
+		 z0 = svamax_x (p0, z1, d4))
+
+/*
+** amax_2_f64_x_tied1:
+**	fmov	(z[0-9]+\.d), #2\.0(?:e\+0)?
+**	famax	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amax_2_f64_x_tied1, svfloat64_t,
+		z0 = svamax_n_f64_x (p0, z0, 2),
+		z0 = svamax_x (p0, z0, 2))
+
+/*
+** amax_2_f64_x_untied:
+**	fmov	z0\.d, #2\.0(?:e\+0)?
+**	famax	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amax_2_f64_x_untied, svfloat64_t,
+		z0 = svamax_n_f64_x (p0, z1, 2),
+		z0 = svamax_x (p0, z1, 2))
+
+/*
+** ptrue_amax_f64_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_f64_x_tied1, svfloat64_t,
+		z0 = svamax_f64_x (svptrue_b64 (), z0, z1),
+		z0 = svamax_x (svptrue_b64 (), z0, z1))
+
+/*
+** ptrue_amax_f64_x_tied2:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_f64_x_tied2, svfloat64_t,
+		z0 = svamax_f64_x (svptrue_b64 (), z1, z0),
+		z0 = svamax_x (svptrue_b64 (), z1, z0))
+
+/*
+** ptrue_amax_f64_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_f64_x_untied, svfloat64_t,
+		z0 = svamax_f64_x (svptrue_b64 (), z1, z2),
+		z0 = svamax_x (svptrue_b64 (), z1, z2))
+
+/*
+** ptrue_amax_0_f64_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_0_f64_x_tied1, svfloat64_t,
+		z0 = svamax_n_f64_x (svptrue_b64 (), z0, 0),
+		z0 = svamax_x (svptrue_b64 (), z0, 0))
+
+/*
+** ptrue_amax_0_f64_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_0_f64_x_untied, svfloat64_t,
+		z0 = svamax_n_f64_x (svptrue_b64 (), z1, 0),
+		z0 = svamax_x (svptrue_b64 (), z1, 0))
+
+/*
+** ptrue_amax_1_f64_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_1_f64_x_tied1, svfloat64_t,
+		z0 = svamax_n_f64_x (svptrue_b64 (), z0, 1),
+		z0 = svamax_x (svptrue_b64 (), z0, 1))
+
+/*
+** ptrue_amax_1_f64_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_1_f64_x_untied, svfloat64_t,
+		z0 = svamax_n_f64_x (svptrue_b64 (), z1, 1),
+		z0 = svamax_x (svptrue_b64 (), z1, 1))
+
+/*
+** ptrue_amax_2_f64_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_2_f64_x_tied1, svfloat64_t,
+		z0 = svamax_n_f64_x (svptrue_b64 (), z0, 2),
+		z0 = svamax_x (svptrue_b64 (), z0, 2))
+
+/*
+** ptrue_amax_2_f64_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amax_2_f64_x_untied, svfloat64_t,
+		z0 = svamax_n_f64_x (svptrue_b64 (), z1, 2),
+		z0 = svamax_x (svptrue_b64 (), z1, 2))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c
new file mode 100644
index 00000000000..bb3f20db93d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c
@@ -0,0 +1,311 @@
+/* { dg-do compile } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+sve+faminmax"
+
+/*
+** amin_f16_m_tied1:
+**	famin	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f16_m_tied1, svfloat16_t,
+		z0 = svamin_f16_m (p0, z0, z1),
+		z0 = svamin_m (p0, z0, z1))
+
+/*
+** amin_f16_m_tied2:
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z1
+**	famin	z0\.h, p0/m, z0\.h, \1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f16_m_tied2, svfloat16_t,
+		z0 = svamin_f16_m (p0, z1, z0),
+		z0 = svamin_m (p0, z1, z0))
+
+/*
+** amin_f16_m_untied:
+**	movprfx	z0, z1
+**	famin	z0\.h, p0/m, z0\.h, z2\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f16_m_untied, svfloat16_t,
+		z0 = svamin_f16_m (p0, z1, z2),
+		z0 = svamin_m (p0, z1, z2))
+
+/*
+** amin_h4_f16_m_tied1:
+**	mov	(z[0-9]+\.h), h4
+**	famin	z0\.h, p0/m, z0\.h, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_h4_f16_m_tied1, svfloat16_t, __fp16,
+		 z0 = svamin_n_f16_m (p0, z0, d4),
+		 z0 = svamin_m (p0, z0, d4))
+
+/*
+** amin_h4_f16_m_untied:
+**	mov	(z[0-9]+\.h), h4
+**	movprfx	z0, z1
+**	famin	z0\.h, p0/m, z0\.h, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_h4_f16_m_untied, svfloat16_t, __fp16,
+		 z0 = svamin_n_f16_m (p0, z1, d4),
+		 z0 = svamin_m (p0, z1, d4))
+
+/*
+** amin_2_f16_m:
+**	fmov	(z[0-9]+\.h), #2\.0(?:e\+0)?
+**	famin	z0\.h, p0/m, z0\.h, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amin_2_f16_m, svfloat16_t,
+		z0 = svamin_n_f16_m (p0, z0, 2),
+		z0 = svamin_m (p0, z0, 2))
+
+/*
+** amin_f16_z_tied1:
+**	movprfx	z0\.h, p0/z, z0\.h
+**	famin	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f16_z_tied1, svfloat16_t,
+		z0 = svamin_f16_z (p0, z0, z1),
+		z0 = svamin_z (p0, z0, z1))
+
+/*
+** amin_f16_z_tied2:
+**	movprfx	z0\.h, p0/z, z0\.h
+**	famin	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f16_z_tied2, svfloat16_t,
+		z0 = svamin_f16_z (p0, z1, z0),
+		z0 = svamin_z (p0, z1, z0))
+
+/*
+** amin_f16_z_untied:
+** (
+**	movprfx	z0\.h, p0/z, z1\.h
+**	famin	z0\.h, p0/m, z0\.h, z2\.h
+** |
+**	movprfx	z0\.h, p0/z, z2\.h
+**	famin	z0\.h, p0/m, z0\.h, z1\.h
+** )
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f16_z_untied, svfloat16_t,
+		z0 = svamin_f16_z (p0, z1, z2),
+		z0 = svamin_z (p0, z1, z2))
+
+/*
+** amin_h4_f16_z_tied1:
+**	mov	(z[0-9]+\.h), h4
+**	movprfx	z0\.h, p0/z, z0\.h
+**	famin	z0\.h, p0/m, z0\.h, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_h4_f16_z_tied1, svfloat16_t, __fp16,
+		 z0 = svamin_n_f16_z (p0, z0, d4),
+		 z0 = svamin_z (p0, z0, d4))
+
+/*
+** amin_h4_f16_z_untied:
+**	mov	(z[0-9]+\.h), h4
+** (
+**	movprfx	z0\.h, p0/z, z1\.h
+**	famin	z0\.h, p0/m, z0\.h, \1
+** |
+**	movprfx	z0\.h, p0/z, \1
+**	famin	z0\.h, p0/m, z0\.h, z1\.h
+** )
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_h4_f16_z_untied, svfloat16_t, __fp16,
+		 z0 = svamin_n_f16_z (p0, z1, d4),
+		 z0 = svamin_z (p0, z1, d4))
+
+/*
+** amin_2_f16_z:
+**	fmov	(z[0-9]+\.h), #2\.0(?:e\+0)?
+**	movprfx	z0\.h, p0/z, z0\.h
+**	famin	z0\.h, p0/m, z0\.h, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amin_2_f16_z, svfloat16_t,
+		z0 = svamin_n_f16_z (p0, z0, 2),
+		z0 = svamin_z (p0, z0, 2))
+
+/*
+** amin_f16_x_tied1:
+**	famin	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f16_x_tied1, svfloat16_t,
+		z0 = svamin_f16_x (p0, z0, z1),
+		z0 = svamin_x (p0, z0, z1))
+
+/*
+** amin_f16_x_tied2:
+**	famin	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f16_x_tied2, svfloat16_t,
+		z0 = svamin_f16_x (p0, z1, z0),
+		z0 = svamin_x (p0, z1, z0))
+
+/*
+** amin_f16_x_untied:
+** (
+**	movprfx	z0, z1
+**	famin	z0\.h, p0/m, z0\.h, z2\.h
+** |
+**	movprfx	z0, z2
+**	famin	z0\.h, p0/m, z0\.h, z1\.h
+** )
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f16_x_untied, svfloat16_t,
+		z0 = svamin_f16_x (p0, z1, z2),
+		z0 = svamin_x (p0, z1, z2))
+
+/*
+** amin_h4_f16_x_tied1:
+**	mov	(z[0-9]+\.h), h4
+**	famin	z0\.h, p0/m, z0\.h, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_h4_f16_x_tied1, svfloat16_t, __fp16,
+		 z0 = svamin_n_f16_x (p0, z0, d4),
+		 z0 = svamin_x (p0, z0, d4))
+
+/*
+** amin_h4_f16_x_untied:
+**	mov	z0\.h, h4
+**	famin	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_h4_f16_x_untied, svfloat16_t, __fp16,
+		 z0 = svamin_n_f16_x (p0, z1, d4),
+		 z0 = svamin_x (p0, z1, d4))
+/*
+** amin_2_f16_x_tied1:
+**	fmov	(z[0-9]+\.h), #2\.0(?:e\+0)?
+**	famin	z0\.h, p0/m, z0\.h, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amin_2_f16_x_tied1, svfloat16_t,
+		z0 = svamin_n_f16_x (p0, z0, 2),
+		z0 = svamin_x (p0, z0, 2))
+
+/*
+** amin_2_f16_x_untied:
+**	fmov	z0\.h, #2\.0(?:e\+0)?
+**	famin	z0\.h, p0/m, z0\.h, z1\.h
+**	ret
+*/
+TEST_UNIFORM_Z (amin_2_f16_x_untied, svfloat16_t,
+		z0 = svamin_n_f16_x (p0, z1, 2),
+		z0 = svamin_x (p0, z1, 2))
+
+/*
+** ptrue_amin_f16_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_f16_x_tied1, svfloat16_t,
+		z0 = svamin_f16_x (svptrue_b16 (), z0, z1),
+		z0 = svamin_x (svptrue_b16 (), z0, z1))
+
+/*
+** ptrue_amin_f16_x_tied2:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_f16_x_tied2, svfloat16_t,
+		z0 = svamin_f16_x (svptrue_b16 (), z1, z0),
+		z0 = svamin_x (svptrue_b16 (), z1, z0))
+
+/*
+** ptrue_amin_f16_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_f16_x_untied, svfloat16_t,
+		z0 = svamin_f16_x (svptrue_b16 (), z1, z2),
+		z0 = svamin_x (svptrue_b16 (), z1, z2))
+
+/*
+** ptrue_amin_0_f16_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_0_f16_x_tied1, svfloat16_t,
+		z0 = svamin_n_f16_x (svptrue_b16 (), z0, 0),
+		z0 = svamin_x (svptrue_b16 (), z0, 0))
+
+/*
+** ptrue_amin_0_f16_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_0_f16_x_untied, svfloat16_t,
+		z0 = svamin_n_f16_x (svptrue_b16 (), z1, 0),
+		z0 = svamin_x (svptrue_b16 (), z1, 0))
+
+/*
+** ptrue_amin_1_f16_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_1_f16_x_tied1, svfloat16_t,
+		z0 = svamin_n_f16_x (svptrue_b16 (), z0, 1),
+		z0 = svamin_x (svptrue_b16 (), z0, 1))
+
+/*
+** ptrue_amin_1_f16_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_1_f16_x_untied, svfloat16_t,
+		z0 = svamin_n_f16_x (svptrue_b16 (), z1, 1),
+		z0 = svamin_x (svptrue_b16 (), z1, 1))
+
+/*
+** ptrue_amin_2_f16_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_2_f16_x_tied1, svfloat16_t,
+		z0 = svamin_n_f16_x (svptrue_b16 (), z0, 2),
+		z0 = svamin_x (svptrue_b16 (), z0, 2))
+
+/*
+** ptrue_amin_2_f16_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_2_f16_x_untied, svfloat16_t,
+		z0 = svamin_n_f16_x (svptrue_b16 (), z1, 2),
+		z0 = svamin_x (svptrue_b16 (), z1, 2))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c
new file mode 100644
index 00000000000..704f5d62c59
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c
@@ -0,0 +1,312 @@
+/* { dg-do compile } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+sve+faminmax"
+
+/*
+** amin_f32_m_tied1:
+**	famin	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f32_m_tied1, svfloat32_t,
+		z0 = svamin_f32_m (p0, z0, z1),
+		z0 = svamin_m (p0, z0, z1))
+
+/*
+** amin_f32_m_tied2:
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z1
+**	famin	z0\.s, p0/m, z0\.s, \1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f32_m_tied2, svfloat32_t,
+		z0 = svamin_f32_m (p0, z1, z0),
+		z0 = svamin_m (p0, z1, z0))
+
+/*
+** amin_f32_m_untied:
+**	movprfx	z0, z1
+**	famin	z0\.s, p0/m, z0\.s, z2\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f32_m_untied, svfloat32_t,
+		z0 = svamin_f32_m (p0, z1, z2),
+		z0 = svamin_m (p0, z1, z2))
+
+/*
+** amin_s4_f32_m_tied1:
+**	mov	(z[0-9]+\.s), s4
+**	famin	z0\.s, p0/m, z0\.s, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_s4_f32_m_tied1, svfloat32_t, float,
+		 z0 = svamin_n_f32_m (p0, z0, d4),
+		 z0 = svamin_m (p0, z0, d4))
+
+/*
+** amin_s4_f32_m_untied:
+**	mov	(z[0-9]+\.s), s4
+**	movprfx	z0, z1
+**	famin	z0\.s, p0/m, z0\.s, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_s4_f32_m_untied, svfloat32_t, float,
+		 z0 = svamin_n_f32_m (p0, z1, d4),
+		 z0 = svamin_m (p0, z1, d4))
+
+/*
+** amin_2_f32_m:
+**	fmov	(z[0-9]+\.s), #2\.0(?:e\+0)?
+**	famin	z0\.s, p0/m, z0\.s, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amin_2_f32_m, svfloat32_t,
+		z0 = svamin_n_f32_m (p0, z0, 2),
+		z0 = svamin_m (p0, z0, 2))
+
+/*
+** amin_f32_z_tied1:
+**	movprfx	z0\.s, p0/z, z0\.s
+**	famin	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f32_z_tied1, svfloat32_t,
+		z0 = svamin_f32_z (p0, z0, z1),
+		z0 = svamin_z (p0, z0, z1))
+
+/*
+** amin_f32_z_tied2:
+**	movprfx	z0\.s, p0/z, z0\.s
+**	famin	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f32_z_tied2, svfloat32_t,
+		z0 = svamin_f32_z (p0, z1, z0),
+		z0 = svamin_z (p0, z1, z0))
+
+/*
+** amin_f32_z_untied:
+** (
+**	movprfx	z0\.s, p0/z, z1\.s
+**	famin	z0\.s, p0/m, z0\.s, z2\.s
+** |
+**	movprfx	z0\.s, p0/z, z2\.s
+**	famin	z0\.s, p0/m, z0\.s, z1\.s
+** )
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f32_z_untied, svfloat32_t,
+		z0 = svamin_f32_z (p0, z1, z2),
+		z0 = svamin_z (p0, z1, z2))
+
+/*
+** amin_s4_f32_z_tied1:
+**	mov	(z[0-9]+\.s), s4
+**	movprfx	z0\.s, p0/z, z0\.s
+**	famin	z0\.s, p0/m, z0\.s, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_s4_f32_z_tied1, svfloat32_t, float,
+		 z0 = svamin_n_f32_z (p0, z0, d4),
+		 z0 = svamin_z (p0, z0, d4))
+
+/*
+** amin_s4_f32_z_untied:
+**	mov	(z[0-9]+\.s), s4
+** (
+**	movprfx	z0\.s, p0/z, z1\.s
+**	famin	z0\.s, p0/m, z0\.s, \1
+** |
+**	movprfx	z0\.s, p0/z, \1
+**	famin	z0\.s, p0/m, z0\.s, z1\.s
+** )
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_s4_f32_z_untied, svfloat32_t, float,
+		 z0 = svamin_n_f32_z (p0, z1, d4),
+		 z0 = svamin_z (p0, z1, d4))
+
+/*
+** amin_2_f32_z:
+**	fmov	(z[0-9]+\.s), #2\.0(?:e\+0)?
+**	movprfx	z0\.s, p0/z, z0\.s
+**	famin	z0\.s, p0/m, z0\.s, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amin_2_f32_z, svfloat32_t,
+		z0 = svamin_n_f32_z (p0, z0, 2),
+		z0 = svamin_z (p0, z0, 2))
+
+/*
+** amin_f32_x_tied1:
+**	famin	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f32_x_tied1, svfloat32_t,
+		z0 = svamin_f32_x (p0, z0, z1),
+		z0 = svamin_x (p0, z0, z1))
+
+/*
+** amin_f32_x_tied2:
+**	famin	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f32_x_tied2, svfloat32_t,
+		z0 = svamin_f32_x (p0, z1, z0),
+		z0 = svamin_x (p0, z1, z0))
+
+/*
+** amin_f32_x_untied:
+** (
+**	movprfx	z0, z1
+**	famin	z0\.s, p0/m, z0\.s, z2\.s
+** |
+**	movprfx	z0, z2
+**	famin	z0\.s, p0/m, z0\.s, z1\.s
+** )
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f32_x_untied, svfloat32_t,
+		z0 = svamin_f32_x (p0, z1, z2),
+		z0 = svamin_x (p0, z1, z2))
+
+/*
+** amin_s4_f32_x_tied1:
+**	mov	(z[0-9]+\.s), s4
+**	famin	z0\.s, p0/m, z0\.s, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_s4_f32_x_tied1, svfloat32_t, float,
+		 z0 = svamin_n_f32_x (p0, z0, d4),
+		 z0 = svamin_x (p0, z0, d4))
+
+/*
+** amin_s4_f32_x_untied:
+**	mov	z0\.s, s4
+**	famin	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_s4_f32_x_untied, svfloat32_t, float,
+		 z0 = svamin_n_f32_x (p0, z1, d4),
+		 z0 = svamin_x (p0, z1, d4))
+
+/*
+** amin_2_f32_x_tied1:
+**	fmov	(z[0-9]+\.s), #2\.0(?:e\+0)?
+**	famin	z0\.s, p0/m, z0\.s, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amin_2_f32_x_tied1, svfloat32_t,
+		z0 = svamin_n_f32_x (p0, z0, 2),
+		z0 = svamin_x (p0, z0, 2))
+
+/*
+** amin_2_f32_x_untied:
+**	fmov	z0\.s, #2\.0(?:e\+0)?
+**	famin	z0\.s, p0/m, z0\.s, z1\.s
+**	ret
+*/
+TEST_UNIFORM_Z (amin_2_f32_x_untied, svfloat32_t,
+		z0 = svamin_n_f32_x (p0, z1, 2),
+		z0 = svamin_x (p0, z1, 2))
+
+/*
+** ptrue_amin_f32_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_f32_x_tied1, svfloat32_t,
+		z0 = svamin_f32_x (svptrue_b32 (), z0, z1),
+		z0 = svamin_x (svptrue_b32 (), z0, z1))
+
+/*
+** ptrue_amin_f32_x_tied2:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_f32_x_tied2, svfloat32_t,
+		z0 = svamin_f32_x (svptrue_b32 (), z1, z0),
+		z0 = svamin_x (svptrue_b32 (), z1, z0))
+
+/*
+** ptrue_amin_f32_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_f32_x_untied, svfloat32_t,
+		z0 = svamin_f32_x (svptrue_b32 (), z1, z2),
+		z0 = svamin_x (svptrue_b32 (), z1, z2))
+
+/*
+** ptrue_amin_0_f32_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_0_f32_x_tied1, svfloat32_t,
+		z0 = svamin_n_f32_x (svptrue_b32 (), z0, 0),
+		z0 = svamin_x (svptrue_b32 (), z0, 0))
+
+/*
+** ptrue_amin_0_f32_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_0_f32_x_untied, svfloat32_t,
+		z0 = svamin_n_f32_x (svptrue_b32 (), z1, 0),
+		z0 = svamin_x (svptrue_b32 (), z1, 0))
+
+/*
+** ptrue_amin_1_f32_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_1_f32_x_tied1, svfloat32_t,
+		z0 = svamin_n_f32_x (svptrue_b32 (), z0, 1),
+		z0 = svamin_x (svptrue_b32 (), z0, 1))
+
+/*
+** ptrue_amin_1_f32_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_1_f32_x_untied, svfloat32_t,
+		z0 = svamin_n_f32_x (svptrue_b32 (), z1, 1),
+		z0 = svamin_x (svptrue_b32 (), z1, 1))
+
+/*
+** ptrue_amin_2_f32_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_2_f32_x_tied1, svfloat32_t,
+		z0 = svamin_n_f32_x (svptrue_b32 (), z0, 2),
+		z0 = svamin_x (svptrue_b32 (), z0, 2))
+
+/*
+** ptrue_amin_2_f32_x_untied:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+**	ret
+*/
+TEST_UNIFORM_Z (ptrue_amin_2_f32_x_untied, svfloat32_t,
+		z0 = svamin_n_f32_x (svptrue_b32 (), z1, 2),
+		z0 = svamin_x (svptrue_b32 (), z1, 2))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c
new file mode 100644
index 00000000000..d2880d8507c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c
@@ -0,0 +1,312 @@
+/* { dg-do compile } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+sve+faminmax"
+
+/*
+** amin_f64_m_tied1:
+**	famin	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f64_m_tied1, svfloat64_t,
+		z0 = svamin_f64_m (p0, z0, z1),
+		z0 = svamin_m (p0, z0, z1))
+
+/*
+** amin_f64_m_tied2:
+**	mov	(z[0-9]+\.d), z0\.d
+**	movprfx	z0, z1
+**	famin	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f64_m_tied2, svfloat64_t,
+		z0 = svamin_f64_m (p0, z1, z0),
+		z0 = svamin_m (p0, z1, z0))
+
+/*
+** amin_f64_m_untied:
+**	movprfx	z0, z1
+**	famin	z0\.d, p0/m, z0\.d, z2\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f64_m_untied, svfloat64_t,
+		z0 = svamin_f64_m (p0, z1, z2),
+		z0 = svamin_m (p0, z1, z2))
+
+/*
+** amin_d4_f64_m_tied1:
+**	mov	(z[0-9]+\.d), d4
+**	famin	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_d4_f64_m_tied1, svfloat64_t, double,
+		 z0 = svamin_n_f64_m (p0, z0, d4),
+		 z0 = svamin_m (p0, z0, d4))
+
+/*
+** amin_d4_f64_m_untied:
+**	mov	(z[0-9]+\.d), d4
+**	movprfx	z0, z1
+**	famin	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_d4_f64_m_untied, svfloat64_t, double,
+		 z0 = svamin_n_f64_m (p0, z1, d4),
+		 z0 = svamin_m (p0, z1, d4))
+
+/*
+** amin_2_f64_m:
+**	fmov	(z[0-9]+\.d), #2\.0(?:e\+0)?
+**	famin	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amin_2_f64_m, svfloat64_t,
+		z0 = svamin_n_f64_m (p0, z0, 2),
+		z0 = svamin_m (p0, z0, 2))
+
+/*
+** amin_f64_z_tied1:
+**	movprfx	z0\.d, p0/z, z0\.d
+**	famin	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f64_z_tied1, svfloat64_t,
+		z0 = svamin_f64_z (p0, z0, z1),
+		z0 = svamin_z (p0, z0, z1))
+
+/*
+** amin_f64_z_tied2:
+**	movprfx	z0\.d, p0/z, z0\.d
+**	famin	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f64_z_tied2, svfloat64_t,
+		z0 = svamin_f64_z (p0, z1, z0),
+		z0 = svamin_z (p0, z1, z0))
+
+/*
+** amin_f64_z_untied:
+** (
+**	movprfx	z0\.d, p0/z, z1\.d
+**	famin	z0\.d, p0/m, z0\.d, z2\.d
+** |
+**	movprfx	z0\.d, p0/z, z2\.d
+**	famin	z0\.d, p0/m, z0\.d, z1\.d
+** )
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f64_z_untied, svfloat64_t,
+		z0 = svamin_f64_z (p0, z1, z2),
+		z0 = svamin_z (p0, z1, z2))
+
+/*
+** amin_d4_f64_z_tied1:
+**	mov	(z[0-9]+\.d), d4
+**	movprfx	z0\.d, p0/z, z0\.d
+**	famin	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_d4_f64_z_tied1, svfloat64_t, double,
+		 z0 = svamin_n_f64_z (p0, z0, d4),
+		 z0 = svamin_z (p0, z0, d4))
+
+/*
+** amin_d4_f64_z_untied:
+**	mov	(z[0-9]+\.d), d4
+** (
+**	movprfx	z0\.d, p0/z, z1\.d
+**	famin	z0\.d, p0/m, z0\.d, \1
+** |
+**	movprfx	z0\.d, p0/z, \1
+**	famin	z0\.d, p0/m, z0\.d, z1\.d
+** )
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_d4_f64_z_untied, svfloat64_t, double,
+		 z0 = svamin_n_f64_z (p0, z1, d4),
+		 z0 = svamin_z (p0, z1, d4))
+
+/*
+** amin_2_f64_z:
+**	fmov	(z[0-9]+\.d), #2\.0(?:e\+0)?
+**	movprfx	z0\.d, p0/z, z0\.d
+**	famin	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amin_2_f64_z, svfloat64_t,
+		z0 = svamin_n_f64_z (p0, z0, 2),
+		z0 = svamin_z (p0, z0, 2))
+
+/*
+** amin_f64_x_tied1:
+**	famin	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f64_x_tied1, svfloat64_t,
+		z0 = svamin_f64_x (p0, z0, z1),
+		z0 = svamin_x (p0, z0, z1))
+
+/*
+** amin_f64_x_tied2:
+**	famin	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f64_x_tied2, svfloat64_t,
+		z0 = svamin_f64_x (p0, z1, z0),
+		z0 = svamin_x (p0, z1, z0))
+
+/*
+** amin_f64_x_untied:
+** (
+**	movprfx	z0, z1
+**	famin	z0\.d, p0/m, z0\.d, z2\.d
+** |
+**	movprfx	z0, z2
+**	famin	z0\.d, p0/m, z0\.d, z1\.d
+** )
+**	ret
+*/
+TEST_UNIFORM_Z (amin_f64_x_untied, svfloat64_t,
+		z0 = svamin_f64_x (p0, z1, z2),
+		z0 = svamin_x (p0, z1, z2))
+
+/*
+** amin_d4_f64_x_tied1:
+**	mov	(z[0-9]+\.d), d4
+**	famin	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_d4_f64_x_tied1, svfloat64_t, double,
+		 z0 = svamin_n_f64_x (p0, z0, d4),
+		 z0 = svamin_x (p0, z0, d4))
+
+/*
+** amin_d4_f64_x_untied:
+**	mov	z0\.d, d4
+**	famin	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_ZD (amin_d4_f64_x_untied, svfloat64_t, double,
+		 z0 = svamin_n_f64_x (p0, z1, d4),
+		 z0 = svamin_x (p0, z1, d4))
+
+/*
+** amin_2_f64_x_tied1:
+**	fmov	(z[0-9]+\.d), #2\.0(?:e\+0)?
+**	famin	z0\.d, p0/m, z0\.d, \1
+**	ret
+*/
+TEST_UNIFORM_Z (amin_2_f64_x_tied1, svfloat64_t,
+		z0 = svamin_n_f64_x (p0, z0, 2),
+		z0 = svamin_x (p0, z0, 2))
+
+/*
+** amin_2_f64_x_untied:
+**	fmov	z0\.d, #2\.0(?:e\+0)?
+**	famin	z0\.d, p0/m, z0\.d, z1\.d
+**	ret
+*/
+TEST_UNIFORM_Z (amin_2_f64_x_untied, svfloat64_t,
+		z0 = svamin_n_f64_x (p0, z1, 2),
+		z0 = svamin_x (p0, z1, 2))
+
+/*
+** ptrue_amin_f64_x_tied1:
+**	...
+**	ptrue	p[0-9]+\.b[^\n]*
+**	...
+** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f64_x_tied1, svfloat64_t, + z0 = svamin_f64_x (svptrue_b64 (), z0, z1), + z0 = svamin_x (svptrue_b64 (), z0, z1)) + +/* +** ptrue_amin_f64_x_tied2: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f64_x_tied2, svfloat64_t, + z0 = svamin_f64_x (svptrue_b64 (), z1, z0), + z0 = svamin_x (svptrue_b64 (), z1, z0)) + +/* +** ptrue_amin_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f64_x_untied, svfloat64_t, + z0 = svamin_f64_x (svptrue_b64 (), z1, z2), + z0 = svamin_x (svptrue_b64 (), z1, z2)) + +/* +** ptrue_amin_0_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_0_f64_x_tied1, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z0, 0), + z0 = svamin_x (svptrue_b64 (), z0, 0)) + +/* +** ptrue_amin_0_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_0_f64_x_untied, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z1, 0), + z0 = svamin_x (svptrue_b64 (), z1, 0)) + +/* +** ptrue_amin_1_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_1_f64_x_tied1, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z0, 1), + z0 = svamin_x (svptrue_b64 (), z0, 1)) + +/* +** ptrue_amin_1_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_1_f64_x_untied, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z1, 1), + z0 = svamin_x (svptrue_b64 (), z1, 1)) + +/* +** ptrue_amin_2_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_2_f64_x_tied1, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z0, 2), + z0 = svamin_x (svptrue_b64 (), z0, 2)) + +/* +** ptrue_amin_2_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_2_f64_x_untied, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z1, 2), + z0 = svamin_x (svptrue_b64 (), z1, 2))
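
---
For readers unfamiliar with the instruction these tests exercise: FAMIN (from FEAT_FAMINMAX) computes the element-wise minimum of absolute values, r[i] = min(|a[i]|, |b[i]|), for active lanes. Below is a minimal usage sketch of the intrinsic under test; it is illustrative only and not part of the patch, and the wrapper name abs_min_f32 is hypothetical.

    #include <arm_sve.h>

    /* Element-wise minimum of absolute values for active lanes:
       r[i] = min(|a[i]|, |b[i]|).  Built with +sve+faminmax, this is
       expected to map to a single FAMIN instruction, which is what
       the check-function-bodies patterns above verify.  */
    svfloat32_t
    abs_min_f32 (svbool_t pg, svfloat32_t a, svfloat32_t b)
    {
      return svamin_f32_x (pg, a, b);
    }

The _x (don't-care) form is used here because, as the amin_f32_x_tied* tests show, it allows the compiler to pick whichever operand order avoids a move; the _m and _z forms tested above additionally pin down the behaviour of inactive lanes.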