From patchwork Thu Oct 3 09:51:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saurabh Jha X-Patchwork-Id: 1992272 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=E6PL2Uhe; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=E6PL2Uhe; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XK6Rp1X1Lz1xt1 for ; Thu, 3 Oct 2024 19:52:54 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1A1CB384A81B for ; Thu, 3 Oct 2024 09:52:52 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR03-AM7-obe.outbound.protection.outlook.com (mail-am7eur03on20601.outbound.protection.outlook.com [IPv6:2a01:111:f403:260e::601]) by sourceware.org (Postfix) with ESMTPS id 8FFA9385C6CC for ; Thu, 3 Oct 2024 09:52:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8FFA9385C6CC Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 8FFA9385C6CC Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=2a01:111:f403:260e::601 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1727949151; cv=pass; b=b7+BlOVudEk/qMyXGi4o2NgSHb0fH7OXrpNf3W7sFa0z99sSUKPXvJSYp/3PY7GcKG27buTHXYYtyUHi8EPc/Ay8Vkx//6Yyopj0Z6P0IREv1nd9+qXxam6TwtBmmV4t83hrymapyk2hve7PcwGrPi46RmjPg1/37YwoIze2inc= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1727949151; c=relaxed/simple; bh=ubvqqrdt2sJZ8Qvk8Fe8mLw+61ZKtaEpr1g8pFz862M=; h=DKIM-Signature:DKIM-Signature:From:To:Subject:Date:Message-ID: MIME-Version; b=fTUr1E51pFU0uILlxYd6xtNLhgArP+OAJDYqYqc5ZYLGQo2+Imfj6/qZyrAYaFEDMZfwAxcoyt42AjYTmVTNdQDSSRJe640xeXDLqBfenHOmq3uqFzsib6eQ70OxUNCtNSeuFTCQVyjy9iSeCH1GzXaraygHudl7DNUWqJOmkmU= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=pass; b=EwTb2Qm2zB9J2WEHxpIl/Bh4pO1bnBVHX6ppZlM+T5wgsYPWv2M1JbG9Fp6qiPxq01GxNPh9uyiYskRSQu0pU1cFKhmuyPEGgNXNAvQG4GuTMeyhTUAUvxEHXR/vVkxQISTo5juaYMUTVvS5ZWIW9lSo+sfZ674XqhKcTg6sg0QkCsk/WtfTHUQsRqFQhnvztlLG/PLY71hnxFITbpiNRYH8dJByuV4q6RPTw2+1//AKHBLgE4S/QPiwhCPH6ZJgILQv6znYvvuRIoWhJH9if6uVgfYfMIbOS3o5uuN4diDWbc/SNeXI7HJFWNZMXMq7QS0UadK6b/fh6tDvHp3BFQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=0X3tEFiD0WUngOHZmiK0lnZS+DmybTslSt8r6TG+YLs=; b=O/p9eJ4e8Luz982xskzA2BQybf/pI0jj27zMDqVbp8zi5og02HuuysDHiSveERguS9RGcOmRUxOSwLPx1U8P14r4QD/lvRnLStu9u/roKBx1W5zOp9LVCdxbYbLURwWsDGugxJiVuxCLglUSIBZe8vsB/EvaeoSB8HE9M/+4gyY3ehbMWvMCNWyhZMHAkczP/r3LJEUh+u+5SNyxDZrUEVDc+1ESaw34G/g07g1PUmxpsRS0gsVY2gaXkMuqMl6d2n2Mz6Q1iWhWwJNMsy/0YXCNS+m8se3y/qBBuFyMduzZsjguEoe/1Xqgv63Aj40l259CiBC2jobfRSlkcYw6Pg== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=arm.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=0X3tEFiD0WUngOHZmiK0lnZS+DmybTslSt8r6TG+YLs=; b=E6PL2UhePHzNRJGt93bthix3FKtbCoaxuMCbTwIDhFYOT6i/Fxa2QxowxklrVitRG4gHbRTR3cnm7fJ9BsmiJhL4hz2Wbtb005UzQoHEocVXrgLEQhZAo3JoHqdJhNmCfavjLBvBSJrtzcfD9EyjOw8HQeb7Zq5zovClEjU4AEQ= Received: from AS4P191CA0019.EURP191.PROD.OUTLOOK.COM (2603:10a6:20b:5d9::9) by AS8PR08MB9815.eurprd08.prod.outlook.com (2603:10a6:20b:614::9) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.16; Thu, 3 Oct 2024 09:52:19 +0000 Received: from AMS1EPF00000048.eurprd04.prod.outlook.com (2603:10a6:20b:5d9:cafe::dd) by AS4P191CA0019.outlook.office365.com (2603:10a6:20b:5d9::9) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.17 via Frontend Transport; Thu, 3 Oct 2024 09:52:19 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=arm.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by AMS1EPF00000048.mail.protection.outlook.com (10.167.16.132) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8026.11 via Frontend Transport; Thu, 3 Oct 2024 09:52:18 +0000 Received: ("Tessian outbound 5b65fbeb7e07:v473"); Thu, 03 Oct 2024 09:52:18 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 07c47f646febfe93 X-TessianGatewayMetadata: JUYgZuQNKXlOaxkt+nirL2F1ggZva4jdfyxCGHbvSZMgacNzP+1rK0qoVkQffvFHdKmF28hfnqS6OvjJysFpuzs/RgKwVsRStuubH1L0/xOE2U1wKGGhptQBI0RDm02Ehw1pF3QhSeAaV6C35sbrEvEe4F5nLPi1mfglf/2/968= X-CR-MTA-TID: 64aa7808 Received: from Lf87a8acaa8bb.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id A2D5E0FD-51E0-4CB8-9E15-17FCDC6830D3.1; Thu, 03 Oct 2024 09:52:12 +0000 Received: from EUR03-AM7-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id Lf87a8acaa8bb.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Thu, 03 Oct 2024 09:52:12 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=c0ixvfdNGU7k9kapSGwbSHMv/YD5m4msWyiwZXzT2a7aui2dtE2LzX528hV0hglfcs+NXweOOi4vFklMbAgTJk8JMQok84RuCRC31qC5JJKDGWj9D4UTAOcJBJYr5+08zBo2rV7oOrsMmzwNcHL6mmMEqOY2tajfWBrIol6XRlJoL9DoK/TnNMTi6aearToYCJC5AUMbu2aqx7tvzHTLutJO1hUMPNqk5HByBSTkTZLcvpDBEZHU2pK4ak4O1mtZGoDZJqD+wmDNpuQec5F+p2EhDmLKC0b7grqczSWJ04P26qDGmi83FdWCKLcgaIVFN2CcyjikKgb5E+5+k6NT7A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=0X3tEFiD0WUngOHZmiK0lnZS+DmybTslSt8r6TG+YLs=; b=SIfTXjBS6mWd5275rDO7cA1tkdhTANqjT9jO1oJiqfZisFMjUc+f/JcTPxGXfZXAm2f5l7b1dpCIDN6C6pC7zPwsxuKmK/xdGqMc33SKGWzSibpTk/rT5yeAjPhohbXyj/pwzGgApB/Ebr1wFBIyKisOwLhF1QnaaEjQtrFAkLbutKGg51b9qUHQcsiyYYigX6RjB6RjzpNMRyTNfqZ7gdavd7IZnvDA4Y95KsKngowRF0gV1h4dtXL6MJ6lbbXuaBSZwFC4aumNZHijctso7EUzfS86Qou/zZ8GPDAIXKAcWBCplHi5EeH9XpwaTKVoi4DHKxZTzfCYC8Kl4vOcNw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 40.67.248.234) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=0X3tEFiD0WUngOHZmiK0lnZS+DmybTslSt8r6TG+YLs=; b=E6PL2UhePHzNRJGt93bthix3FKtbCoaxuMCbTwIDhFYOT6i/Fxa2QxowxklrVitRG4gHbRTR3cnm7fJ9BsmiJhL4hz2Wbtb005UzQoHEocVXrgLEQhZAo3JoHqdJhNmCfavjLBvBSJrtzcfD9EyjOw8HQeb7Zq5zovClEjU4AEQ= Received: from AM6P195CA0036.EURP195.PROD.OUTLOOK.COM (2603:10a6:209:81::49) by PAVPR08MB9772.eurprd08.prod.outlook.com (2603:10a6:102:2f8::9) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.16; Thu, 3 Oct 2024 09:52:08 +0000 Received: from AMS0EPF00000191.eurprd05.prod.outlook.com (2603:10a6:209:81:cafe::c7) by AM6P195CA0036.outlook.office365.com (2603:10a6:209:81::49) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.16 via Frontend Transport; Thu, 3 Oct 2024 09:52:06 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 40.67.248.234) smtp.mailfrom=arm.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 40.67.248.234 as permitted sender) receiver=protection.outlook.com; client-ip=40.67.248.234; helo=nebula.arm.com; pr=C Received: from nebula.arm.com (40.67.248.234) by AMS0EPF00000191.mail.protection.outlook.com (10.167.16.216) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.7918.13 via Frontend Transport; Thu, 3 Oct 2024 09:52:05 +0000 Received: from AZ-NEU-EXJ01.Arm.com (10.240.25.132) by AZ-NEU-EX03.Arm.com (10.251.24.31) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Thu, 3 Oct 2024 09:52:04 +0000 Received: from AZ-NEU-EX03.Arm.com (10.251.24.31) by AZ-NEU-EXJ01.Arm.com (10.240.25.132) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Thu, 3 Oct 2024 09:52:03 +0000 Received: from e130340.cambridge.arm.com (10.2.80.47) by mail.arm.com (10.251.24.31) with Microsoft SMTP Server id 15.1.2507.39 via Frontend Transport; Thu, 3 Oct 2024 09:52:03 +0000 From: To: CC: , , Saurabh Jha Subject: [PATCH v4 1/2] aarch64: Add SVE2 faminmax intrinsics Date: Thu, 3 Oct 2024 10:51:56 +0100 Message-ID: <20241003095157.1390838-2-saurabh.jha@arm.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20241003095157.1390838-1-saurabh.jha@arm.com> References: <20241003095157.1390838-1-saurabh.jha@arm.com> MIME-Version: 1.0 X-EOPAttributedMessage: 1 X-MS-TrafficTypeDiagnostic: AMS0EPF00000191:EE_|PAVPR08MB9772:EE_|AMS1EPF00000048:EE_|AS8PR08MB9815:EE_ X-MS-Office365-Filtering-Correlation-Id: 1e505ebb-d370-4c54-57c1-08dce39111d5 x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; ARA:13230040|376014|82310400026|1800799024|36860700013; X-Microsoft-Antispam-Message-Info-Original: /w7nJH2kQfkx5/wn95dm3Ak56G8sCPqxEq4AHIgQzRHt+7euuHVW+mPlgd5AyW02yKgHlftDhd0McIF/YFpC/opeA8qiJ4nuhGWDxSJyRMBC7+QS8PXozNm47GvnWtQthM70SjVV5HNPSLVLkiiKvuGUIQjjXiIT4lQZO/E4Hq5In8FwLMeBA4SUm4onZNj2oNVXTgzTBRo2R7HVLn62fsF9XM/7s0YgAn6H9VWq5vPTf0TtMkbzXT5SjaR0C/NP8VnB+xaAKmR8CqLMZO7dysf1RSdwL/0S6j496wZekb7Mba/OFY6tnT28BIHR0TXdVUDVaUAPYIQf7JJNd1+Pw7bBogeU3yOa0yPLuh5yZys8b5k5/TD0M4/3aHNb3whxW9K9Xj77t1Z6Ih6LfQxzJE/3h1K5lV+B/WHIf0ZzYLPZ0BXoDX5cMj5yR2I8ZGegvrpPaLiKeybKwhg792LH8IL3X3FteOFLJA/22kKYpmfA2/8zwzxGU83qSuZ0t4knDVpfp66APrCPjWqkinAKj9x0tL2uFce1KO38NIt6RE2K11HIYalrCTwlvC0+chnLOIPm9B/GyzDiS8TpSdRooSnIj1XiAhl8N5gFctN0JY9/yspRXUnSjlMcw0rbiPYjBfWKK4KOL76NI1m6irqGC9Cylh3LvhdA3CMuHzPHpS7ae53jJPZUaSQ8y2DNdEw7DVSlN9gNWfcHPNpk+vihDnJYEw10bZBXDs/rbg+SRvttybkTNLZgq5SGFTuaEeIdCWC6klqPrf39sI/uYE2/Th7P9k+cwh3oKpk6LVLka05iRZXuc1gelbQMl++EkJB57iY9m15Nxzi37DI8FqbsnT+3MfmMveeJxS5SoTyX8Ub5S4tYGqmOglPvNoj7bVlj8Ep15Az0KltB0Q8UIk7zv2EkzrMvsOTPkjyyZWfZU5NUlPM7sQC2veOdHK9Zs9PIufnIVgqxqijGVaXM+FpUjCzaqZqcw8JtpZeRz3umlHSFWUPhqSnULc155xMPE8SQdirD0e2ib97DLh1R2HF8YVrdi6H0wQlYHOOCkXxoWSgge/LZ8YE0HgV4uDEOFC59emOJ9O3IN9z4f8abtrzvydt197AemxiNvnDjUoF1qxmpCfT/mPQ/OH4aGkIrl0wOp+3PFyepo1u7giDpLVRfNUk92+uoz6O3iJgOFnoGNRo4EYMEUuKZrQiPf8GEdLQMY3f7qTscR2QLlFeOm2W8CnsBVrAR+efmWjVju2wF+L4F2SlK0GacM79Ew6c5LrXNYgQ15FpFn8i1KywrPqOhSTZOhlhIyQl4Vo00j9zBc3vY3YBsyOr00aMpenpzKquq2CR98jzEpsKoGXI0fgh3Fg== X-Forefront-Antispam-Report-Untrusted: CIP:40.67.248.234; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:nebula.arm.com; PTR:InfoDomainNonexistent; CAT:NONE; SFS:(13230040)(376014)(82310400026)(1800799024)(36860700013); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: PAVPR08MB9772 X-MS-Exchange-SkipListedInternetSender: ip=[2603:10a6:209:81::49]; domain=AM6P195CA0036.EURP195.PROD.OUTLOOK.COM X-MS-Exchange-Transport-CrossTenantHeadersStripped: AMS1EPF00000048.eurprd04.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: 24dfc1d2-641f-42a8-3fb6-08dce39109f7 X-Microsoft-Antispam: BCL:0; ARA:13230040|82310400026|376014|35042699022|36860700013|1800799024; X-Microsoft-Antispam-Message-Info: =?utf-8?q?rQ4uTE/t92sbCoOeToLB4pC34gyhAwK?= =?utf-8?q?lFGaHq4mfPJWud/7iFmT+jk0pQcGJ0UnixKdfsdIC+kdzg8IjMt1OdHS7YSTec/uw?= =?utf-8?q?+YbZndF6ESPDK8YuaYYofKxkpHL5HNdY42kiNG7XbuEPwCFt/YysCGl7vmuti1IwW?= =?utf-8?q?a3+3EEKby7/SeebGfeMZGXgTG8ue7Z1p56gCNZFWTNRT7IELM3bEDR9s1px6yjTgT?= =?utf-8?q?JxYcUCs5eYRAQwe6NdEkudztvwEL4ZTmkgJnKeE3H9x1U2z3B1JibUExmxBK71xsx?= =?utf-8?q?3RIE6kwG+9fBZtCsnUdDwSzJyupCSvOgxcIe0Nr0lhR+zaXKzAQCh1PBUPdheEV+6?= =?utf-8?q?P68XyYIpTQtVxGJub8OEG52csEQjBVUfs5QeIMIzYF/Ffe0bLXsoLYH28VKtQekwa?= =?utf-8?q?GY/4L7pOd8LWENS+K2Lnku/FZjPj/vTdTVuiMXpZRFHseXP+VxY+eZGs1VzGGwy+f?= =?utf-8?q?GQio+aF7LjjGqnog/vdPabzwaSNK6Asijul8DUxCApLFhU2NSBLeth783+7l9hj+H?= =?utf-8?q?BtG05b4mFars5H+Sx9p7zdVA13Lc91WIBigo9g4H3QZU4CaWwzyq1nZ34mYjQ1A2g?= =?utf-8?q?XHkmTRGlCXHfNX/aoOKVzwwObuwvJkaHJXOfwHEf65up8iIFkvbAvM/A4S9guim7g?= =?utf-8?q?/Dw+qY8wc/o1FPsuNp0MbIXxWg8f3sMZRc08fjfHxJC4+b50ErbiqpcPE7M+DNG4U?= =?utf-8?q?GE7LXMtxHwYCB+pd3oSuZ7SlmKU9ABaazor/63ekWZzO6IOFGWrFiiNN2M9x89eEq?= =?utf-8?q?GNz5ujwlk0dQw67+ZDiqCcu10nvre208pzyZ9Eo5L7l2JfUYtsYPoj5QZC7D3GgFT?= =?utf-8?q?GpukD1jVmyJk+I0dZENWuFdaGrtrjMMeBp6Z9MsXunM5SVAUgki2lXqqbb9LNTXyJ?= =?utf-8?q?TdrTrmExtGXiKo8b7/qTqXPlH5e6V8GHY+PaK22hN8zZdV6KrywdShEbyDjs/MM4N?= =?utf-8?q?3LuR0QabHwxE064wiXjx3dYEfri9kYsHY2cfHsxa6waYZY81rdIBRfK9iDE6nyjc6?= =?utf-8?q?dY3RAV90nIZzpdgMvEUgmxeUxBrU6u7cWLa8AgW8FljnWBgJxJsJIRDU+BLuWBjUp?= =?utf-8?q?aoT+QEAe4hyAJBTcAXHP2ee/+W8RI3367kIvbYsm1bKH8v0dZqGuDMy812NboJJyH?= =?utf-8?q?S1jD90CNSRSLkf2XAxgw4D621RW7yL/7IYThkp3oBHO2tryQB0FP7YUm/SEUASJCs?= =?utf-8?q?d4shJuQgUOKx0Ipbuh0GknXJ4bYC6sZsEPwApjPsqCAqa1iWu5aNRTTEVHGSxmjMG?= =?utf-8?q?CoaGM7MbYSo8DnERuc6nEWgy7i3SCAtoH5g=3D=3D?= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230040)(82310400026)(376014)(35042699022)(36860700013)(1800799024); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 03 Oct 2024 09:52:18.7943 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 1e505ebb-d370-4c54-57c1-08dce39111d5 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: AMS1EPF00000048.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS8PR08MB9815 X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FORGED_SPF_HELO, GIT_PATCH_0, KAM_SHORT, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces SVE2 faminmax intrinsics. The intrinsics of this extension are implemented as the following builtin functions: * sva[max|min]_[m|x|z] * sva[max|min]_[f16|f32|f64]_[m|x|z] * sva[max|min]_n_[f16|f32|f64]_[m|x|z] gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins-base.cc (svamax): Absolute maximum declaration. (svamin): Absolute minimum declaration. * config/aarch64/aarch64-sve-builtins-base.def (REQUIRED_EXTENSIONS): Add faminmax intrinsics behind a flag. (svamax): Absolute maximum declaration. (svamin): Absolute minimum declaration. * config/aarch64/aarch64-sve-builtins-base.h: Declaring function bases for the new intrinsics. * config/aarch64/aarch64.h (TARGET_SVE_FAMINMAX): New flag for SVE2 faminmax. * config/aarch64/iterators.md: New unspecs, iterators, and attrs for the new intrinsics. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve2/acle/asm/amax_f16.c: New test. * gcc.target/aarch64/sve2/acle/asm/amax_f32.c: New test. * gcc.target/aarch64/sve2/acle/asm/amax_f64.c: New test. * gcc.target/aarch64/sve2/acle/asm/amin_f16.c: New test. * gcc.target/aarch64/sve2/acle/asm/amin_f32.c: New test. * gcc.target/aarch64/sve2/acle/asm/amin_f64.c: New test. --- .../aarch64/aarch64-sve-builtins-base.cc | 4 + .../aarch64/aarch64-sve-builtins-base.def | 5 + .../aarch64/aarch64-sve-builtins-base.h | 2 + gcc/config/aarch64/aarch64.h | 1 + gcc/config/aarch64/iterators.md | 18 +- .../aarch64/sve2/acle/asm/amax_f16.c | 312 ++++++++++++++++++ .../aarch64/sve2/acle/asm/amax_f32.c | 312 ++++++++++++++++++ .../aarch64/sve2/acle/asm/amax_f64.c | 312 ++++++++++++++++++ .../aarch64/sve2/acle/asm/amin_f16.c | 311 +++++++++++++++++ .../aarch64/sve2/acle/asm/amin_f32.c | 312 ++++++++++++++++++ .../aarch64/sve2/acle/asm/amin_f64.c | 312 ++++++++++++++++++ 11 files changed, 1900 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc index 4b33585d981..b189818d643 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc @@ -3071,6 +3071,10 @@ FUNCTION (svadrb, svadr_bhwd_impl, (0)) FUNCTION (svadrd, svadr_bhwd_impl, (3)) FUNCTION (svadrh, svadr_bhwd_impl, (1)) FUNCTION (svadrw, svadr_bhwd_impl, (2)) +FUNCTION (svamax, cond_or_uncond_unspec_function, + (UNSPEC_COND_FAMAX, UNSPEC_FAMAX)) +FUNCTION (svamin, cond_or_uncond_unspec_function, + (UNSPEC_COND_FAMIN, UNSPEC_FAMIN)) FUNCTION (svand, rtx_code_function, (AND, AND)) FUNCTION (svandv, reduction, (UNSPEC_ANDV)) FUNCTION (svasr, rtx_code_function, (ASHIFTRT, ASHIFTRT)) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.def b/gcc/config/aarch64/aarch64-sve-builtins-base.def index 65fcba91586..95e04e4393d 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.def @@ -379,3 +379,8 @@ DEF_SVE_FUNCTION (svzip2q, binary, all_data, none) DEF_SVE_FUNCTION (svld1ro, load_replicate, all_data, implicit) DEF_SVE_FUNCTION (svmmla, mmla, d_float, none) #undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_FAMINMAX +DEF_SVE_FUNCTION (svamax, binary_opt_single_n, all_float, mxz) +DEF_SVE_FUNCTION (svamin, binary_opt_single_n, all_float, mxz) +#undef REQUIRED_EXTENSIONS diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.h b/gcc/config/aarch64/aarch64-sve-builtins-base.h index 5bbf3569c4b..978cf7013f9 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.h @@ -37,6 +37,8 @@ namespace aarch64_sve extern const function_base *const svadrd; extern const function_base *const svadrh; extern const function_base *const svadrw; + extern const function_base *const svamax; + extern const function_base *const svamin; extern const function_base *const svand; extern const function_base *const svandv; extern const function_base *const svasr; diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index ec8fde783b3..34f56a4b869 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -470,6 +470,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED /* Floating Point Absolute Maximum/Minimum extension instructions are enabled through +faminmax. */ #define TARGET_FAMINMAX AARCH64_HAVE_ISA (FAMINMAX) +#define TARGET_SVE_FAMINMAX (TARGET_SVE && TARGET_FAMINMAX) /* Prefer different predicate registers for the output of a predicated operation over re-using an existing input predicate. */ diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 0836dee61c9..c06f8c2c90f 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -841,6 +841,8 @@ UNSPEC_COND_CMPNE_WIDE ; Used in aarch64-sve.md. UNSPEC_COND_FABS ; Used in aarch64-sve.md. UNSPEC_COND_FADD ; Used in aarch64-sve.md. + UNSPEC_COND_FAMAX ; Used in aarch64-sve.md. + UNSPEC_COND_FAMIN ; Used in aarch64-sve.md. UNSPEC_COND_FCADD90 ; Used in aarch64-sve.md. UNSPEC_COND_FCADD270 ; Used in aarch64-sve.md. UNSPEC_COND_FCMEQ ; Used in aarch64-sve.md. @@ -3085,6 +3087,8 @@ (define_int_iterator SVE_COND_FP_BINARY [UNSPEC_COND_FADD + (UNSPEC_COND_FAMAX "TARGET_SVE_FAMINMAX") + (UNSPEC_COND_FAMIN "TARGET_SVE_FAMINMAX") UNSPEC_COND_FDIV UNSPEC_COND_FMAX UNSPEC_COND_FMAXNM @@ -3124,7 +3128,9 @@ UNSPEC_COND_SMIN]) (define_int_iterator SVE_COND_FP_BINARY_REG - [UNSPEC_COND_FDIV + [(UNSPEC_COND_FAMAX "TARGET_SVE_FAMINMAX") + (UNSPEC_COND_FAMIN "TARGET_SVE_FAMINMAX") + UNSPEC_COND_FDIV UNSPEC_COND_FMULX UNSPEC_COND_SMAX UNSPEC_COND_SMIN]) @@ -3701,6 +3707,8 @@ (UNSPEC_ZIP2Q "zip2q") (UNSPEC_COND_FABS "abs") (UNSPEC_COND_FADD "add") + (UNSPEC_COND_FAMAX "famax") + (UNSPEC_COND_FAMIN "famin") (UNSPEC_COND_FCADD90 "cadd90") (UNSPEC_COND_FCADD270 "cadd270") (UNSPEC_COND_FCMLA "fcmla") @@ -4237,6 +4245,8 @@ (UNSPEC_FTSSEL "ftssel") (UNSPEC_COND_FABS "fabs") (UNSPEC_COND_FADD "fadd") + (UNSPEC_COND_FAMAX "famax") + (UNSPEC_COND_FAMIN "famin") (UNSPEC_COND_FCVTLT "fcvtlt") (UNSPEC_COND_FCVTX "fcvtx") (UNSPEC_COND_FDIV "fdiv") @@ -4263,6 +4273,8 @@ (UNSPEC_COND_SMIN "fminnm")]) (define_int_attr sve_fp_op_rev [(UNSPEC_COND_FADD "fadd") + (UNSPEC_COND_FAMAX "famax") + (UNSPEC_COND_FAMIN "famin") (UNSPEC_COND_FDIV "fdivr") (UNSPEC_COND_FMAX "fmax") (UNSPEC_COND_FMAXNM "fmaxnm") @@ -4401,6 +4413,8 @@ ;; 3 pattern. (define_int_attr sve_pred_fp_rhs1_operand [(UNSPEC_COND_FADD "register_operand") + (UNSPEC_COND_FAMAX "register_operand") + (UNSPEC_COND_FAMIN "register_operand") (UNSPEC_COND_FDIV "register_operand") (UNSPEC_COND_FMAX "register_operand") (UNSPEC_COND_FMAXNM "register_operand") @@ -4416,6 +4430,8 @@ ;; 3 pattern. (define_int_attr sve_pred_fp_rhs2_operand [(UNSPEC_COND_FADD "aarch64_sve_float_arith_with_sub_operand") + (UNSPEC_COND_FAMAX "aarch64_sve_float_maxmin_operand") + (UNSPEC_COND_FAMIN "aarch64_sve_float_maxmin_operand") (UNSPEC_COND_FDIV "register_operand") (UNSPEC_COND_FMAX "aarch64_sve_float_maxmin_operand") (UNSPEC_COND_FMAXNM "aarch64_sve_float_maxmin_operand") diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c new file mode 100644 index 00000000000..de4a6f8efaa --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c @@ -0,0 +1,312 @@ +/* { dg-do compile } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +#pragma GCC target "+sve+faminmax" + +/* +** amax_f16_m_tied1: +** famax z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amax_f16_m_tied1, svfloat16_t, + z0 = svamax_f16_m (p0, z0, z1), + z0 = svamax_m (p0, z0, z1)) + +/* +** amax_f16_m_tied2: +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z1 +** famax z0\.h, p0/m, z0\.h, \1\.h +** ret +*/ +TEST_UNIFORM_Z (amax_f16_m_tied2, svfloat16_t, + z0 = svamax_f16_m (p0, z1, z0), + z0 = svamax_m (p0, z1, z0)) + +/* +** amax_f16_m_untied: +** movprfx z0, z1 +** famax z0\.h, p0/m, z0\.h, z2\.h +** ret +*/ +TEST_UNIFORM_Z (amax_f16_m_untied, svfloat16_t, + z0 = svamax_f16_m (p0, z1, z2), + z0 = svamax_m (p0, z1, z2)) + +/* +** amax_h4_f16_m_tied1: +** mov (z[0-9]+\.h), h4 +** famax z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_h4_f16_m_tied1, svfloat16_t, __fp16, + z0 = svamax_n_f16_m (p0, z0, d4), + z0 = svamax_m (p0, z0, d4)) + +/* +** amax_h4_f16_m_untied: +** mov (z[0-9]+\.h), h4 +** movprfx z0, z1 +** famax z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_h4_f16_m_untied, svfloat16_t, __fp16, + z0 = svamax_n_f16_m (p0, z1, d4), + z0 = svamax_m (p0, z1, d4)) + +/* +** amax_2_f16_m: +** fmov (z[0-9]+\.h), #2\.0(?:e\+0)? +** famax z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f16_m, svfloat16_t, + z0 = svamax_n_f16_m (p0, z0, 2), + z0 = svamax_m (p0, z0, 2)) + +/* +** amax_f16_z_tied1: +** movprfx z0\.h, p0/z, z0\.h +** famax z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amax_f16_z_tied1, svfloat16_t, + z0 = svamax_f16_z (p0, z0, z1), + z0 = svamax_z (p0, z0, z1)) + +/* +** amax_f16_z_tied2: +** movprfx z0\.h, p0/z, z0\.h +** famax z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amax_f16_z_tied2, svfloat16_t, + z0 = svamax_f16_z (p0, z1, z0), + z0 = svamax_z (p0, z1, z0)) + +/* +** amax_f16_z_untied: +** ( +** movprfx z0\.h, p0/z, z1\.h +** famax z0\.h, p0/m, z0\.h, z2\.h +** | +** movprfx z0\.h, p0/z, z2\.h +** famax z0\.h, p0/m, z0\.h, z1\.h +** ) +** ret +*/ +TEST_UNIFORM_Z (amax_f16_z_untied, svfloat16_t, + z0 = svamax_f16_z (p0, z1, z2), + z0 = svamax_z (p0, z1, z2)) + +/* +** amax_h4_f16_z_tied1: +** mov (z[0-9]+\.h), h4 +** movprfx z0\.h, p0/z, z0\.h +** famax z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_h4_f16_z_tied1, svfloat16_t, __fp16, + z0 = svamax_n_f16_z (p0, z0, d4), + z0 = svamax_z (p0, z0, d4)) + +/* +** amax_h4_f16_z_untied: +** mov (z[0-9]+\.h), h4 +** ( +** movprfx z0\.h, p0/z, z1\.h +** famax z0\.h, p0/m, z0\.h, \1 +** | +** movprfx z0\.h, p0/z, \1 +** famax z0\.h, p0/m, z0\.h, z1\.h +** ) +** ret +*/ +TEST_UNIFORM_ZD (amax_h4_f16_z_untied, svfloat16_t, __fp16, + z0 = svamax_n_f16_z (p0, z1, d4), + z0 = svamax_z (p0, z1, d4)) + +/* +** amax_2_f16_z: +** fmov (z[0-9]+\.h), #2\.0(?:e\+0)? +** movprfx z0\.h, p0/z, z0\.h +** famax z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f16_z, svfloat16_t, + z0 = svamax_n_f16_z (p0, z0, 2), + z0 = svamax_z (p0, z0, 2)) + +/* +** amax_f16_x_tied1: +** famax z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amax_f16_x_tied1, svfloat16_t, + z0 = svamax_f16_x (p0, z0, z1), + z0 = svamax_x (p0, z0, z1)) + +/* +** amax_f16_x_tied2: +** famax z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amax_f16_x_tied2, svfloat16_t, + z0 = svamax_f16_x (p0, z1, z0), + z0 = svamax_x (p0, z1, z0)) + +/* +** amax_f16_x_untied: +** ( +** movprfx z0, z1 +** famax z0\.h, p0/m, z0\.h, z2\.h +** | +** movprfx z0, z2 +** famax z0\.h, p0/m, z0\.h, z1\.h +** ) +** ret +*/ +TEST_UNIFORM_Z (amax_f16_x_untied, svfloat16_t, + z0 = svamax_f16_x (p0, z1, z2), + z0 = svamax_x (p0, z1, z2)) + +/* +** amax_h4_f16_x_tied1: +** mov (z[0-9]+\.h), h4 +** famax z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_h4_f16_x_tied1, svfloat16_t, __fp16, + z0 = svamax_n_f16_x (p0, z0, d4), + z0 = svamax_x (p0, z0, d4)) + +/* +** amax_h4_f16_x_untied: +** mov z0\.h, h4 +** famax z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZD (amax_h4_f16_x_untied, svfloat16_t, __fp16, + z0 = svamax_n_f16_x (p0, z1, d4), + z0 = svamax_x (p0, z1, d4)) + +/* +** amax_2_f16_x_tied1: +** fmov (z[0-9]+\.h), #2\.0(?:e\+0)? +** famax z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f16_x_tied1, svfloat16_t, + z0 = svamax_n_f16_x (p0, z0, 2), + z0 = svamax_x (p0, z0, 2)) + +/* +** amax_2_f16_x_untied: +** fmov z0\.h, #2\.0(?:e\+0)? +** famax z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amax_2_f16_x_untied, svfloat16_t, + z0 = svamax_n_f16_x (p0, z1, 2), + z0 = svamax_x (p0, z1, 2)) + +/* +** ptrue_amax_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f16_x_tied1, svfloat16_t, + z0 = svamax_f16_x (svptrue_b16 (), z0, z1), + z0 = svamax_x (svptrue_b16 (), z0, z1)) + +/* +** ptrue_amax_f16_x_tied2: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f16_x_tied2, svfloat16_t, + z0 = svamax_f16_x (svptrue_b16 (), z1, z0), + z0 = svamax_x (svptrue_b16 (), z1, z0)) + +/* +** ptrue_amax_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f16_x_untied, svfloat16_t, + z0 = svamax_f16_x (svptrue_b16 (), z1, z2), + z0 = svamax_x (svptrue_b16 (), z1, z2)) + +/* +** ptrue_amax_0_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_0_f16_x_tied1, svfloat16_t, + z0 = svamax_n_f16_x (svptrue_b16 (), z0, 0), + z0 = svamax_x (svptrue_b16 (), z0, 0)) + +/* +** ptrue_amax_0_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_0_f16_x_untied, svfloat16_t, + z0 = svamax_n_f16_x (svptrue_b16 (), z1, 0), + z0 = svamax_x (svptrue_b16 (), z1, 0)) + +/* +** ptrue_amax_1_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_1_f16_x_tied1, svfloat16_t, + z0 = svamax_n_f16_x (svptrue_b16 (), z0, 1), + z0 = svamax_x (svptrue_b16 (), z0, 1)) + +/* +** ptrue_amax_1_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_1_f16_x_untied, svfloat16_t, + z0 = svamax_n_f16_x (svptrue_b16 (), z1, 1), + z0 = svamax_x (svptrue_b16 (), z1, 1)) + +/* +** ptrue_amax_2_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_2_f16_x_tied1, svfloat16_t, + z0 = svamax_n_f16_x (svptrue_b16 (), z0, 2), + z0 = svamax_x (svptrue_b16 (), z0, 2)) + +/* +** ptrue_amax_2_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_2_f16_x_untied, svfloat16_t, + z0 = svamax_n_f16_x (svptrue_b16 (), z1, 2), + z0 = svamax_x (svptrue_b16 (), z1, 2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c new file mode 100644 index 00000000000..24280724c95 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c @@ -0,0 +1,312 @@ +/* { dg-do compile } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +#pragma GCC target "+sve+faminmax" + +/* +** amax_f32_m_tied1: +** famax z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amax_f32_m_tied1, svfloat32_t, + z0 = svamax_f32_m (p0, z0, z1), + z0 = svamax_m (p0, z0, z1)) + +/* +** amax_f32_m_tied2: +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z1 +** famax z0\.s, p0/m, z0\.s, \1\.s +** ret +*/ +TEST_UNIFORM_Z (amax_f32_m_tied2, svfloat32_t, + z0 = svamax_f32_m (p0, z1, z0), + z0 = svamax_m (p0, z1, z0)) + +/* +** amax_f32_m_untied: +** movprfx z0, z1 +** famax z0\.s, p0/m, z0\.s, z2\.s +** ret +*/ +TEST_UNIFORM_Z (amax_f32_m_untied, svfloat32_t, + z0 = svamax_f32_m (p0, z1, z2), + z0 = svamax_m (p0, z1, z2)) + +/* +** amax_s4_f32_m_tied1: +** mov (z[0-9]+\.s), s4 +** famax z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_s4_f32_m_tied1, svfloat32_t, float, + z0 = svamax_n_f32_m (p0, z0, d4), + z0 = svamax_m (p0, z0, d4)) + +/* +** amax_s4_f32_m_untied: +** mov (z[0-9]+\.s), s4 +** movprfx z0, z1 +** famax z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_s4_f32_m_untied, svfloat32_t, float, + z0 = svamax_n_f32_m (p0, z1, d4), + z0 = svamax_m (p0, z1, d4)) + +/* +** amax_2_f32_m: +** fmov (z[0-9]+\.s), #2\.0(?:e\+0)? +** famax z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f32_m, svfloat32_t, + z0 = svamax_n_f32_m (p0, z0, 2), + z0 = svamax_m (p0, z0, 2)) + +/* +** amax_f32_z_tied1: +** movprfx z0\.s, p0/z, z0\.s +** famax z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amax_f32_z_tied1, svfloat32_t, + z0 = svamax_f32_z (p0, z0, z1), + z0 = svamax_z (p0, z0, z1)) + +/* +** amax_f32_z_tied2: +** movprfx z0\.s, p0/z, z0\.s +** famax z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amax_f32_z_tied2, svfloat32_t, + z0 = svamax_f32_z (p0, z1, z0), + z0 = svamax_z (p0, z1, z0)) + +/* +** amax_f32_z_untied: +** ( +** movprfx z0\.s, p0/z, z1\.s +** famax z0\.s, p0/m, z0\.s, z2\.s +** | +** movprfx z0\.s, p0/z, z2\.s +** famax z0\.s, p0/m, z0\.s, z1\.s +** ) +** ret +*/ +TEST_UNIFORM_Z (amax_f32_z_untied, svfloat32_t, + z0 = svamax_f32_z (p0, z1, z2), + z0 = svamax_z (p0, z1, z2)) + +/* +** amax_s4_f32_z_tied1: +** mov (z[0-9]+\.s), s4 +** movprfx z0\.s, p0/z, z0\.s +** famax z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_s4_f32_z_tied1, svfloat32_t, float, + z0 = svamax_n_f32_z (p0, z0, d4), + z0 = svamax_z (p0, z0, d4)) + +/* +** amax_s4_f32_z_untied: +** mov (z[0-9]+\.s), s4 +** ( +** movprfx z0\.s, p0/z, z1\.s +** famax z0\.s, p0/m, z0\.s, \1 +** | +** movprfx z0\.s, p0/z, \1 +** famax z0\.s, p0/m, z0\.s, z1\.s +** ) +** ret +*/ +TEST_UNIFORM_ZD (amax_s4_f32_z_untied, svfloat32_t, float, + z0 = svamax_n_f32_z (p0, z1, d4), + z0 = svamax_z (p0, z1, d4)) + +/* +** amax_2_f32_z: +** fmov (z[0-9]+\.s), #2\.0(?:e\+0)? +** movprfx z0\.s, p0/z, z0\.s +** famax z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f32_z, svfloat32_t, + z0 = svamax_n_f32_z (p0, z0, 2), + z0 = svamax_z (p0, z0, 2)) + +/* +** amax_f32_x_tied1: +** famax z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amax_f32_x_tied1, svfloat32_t, + z0 = svamax_f32_x (p0, z0, z1), + z0 = svamax_x (p0, z0, z1)) + +/* +** amax_f32_x_tied2: +** famax z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amax_f32_x_tied2, svfloat32_t, + z0 = svamax_f32_x (p0, z1, z0), + z0 = svamax_x (p0, z1, z0)) + +/* +** amax_f32_x_untied: +** ( +** movprfx z0, z1 +** famax z0\.s, p0/m, z0\.s, z2\.s +** | +** movprfx z0, z2 +** famax z0\.s, p0/m, z0\.s, z1\.s +** ) +** ret +*/ +TEST_UNIFORM_Z (amax_f32_x_untied, svfloat32_t, + z0 = svamax_f32_x (p0, z1, z2), + z0 = svamax_x (p0, z1, z2)) + +/* +** amax_s4_f32_x_tied1: +** mov (z[0-9]+\.s), s4 +** famax z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_s4_f32_x_tied1, svfloat32_t, float, + z0 = svamax_n_f32_x (p0, z0, d4), + z0 = svamax_x (p0, z0, d4)) + +/* +** amax_s4_f32_x_untied: +** mov z0\.s, s4 +** famax z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_ZD (amax_s4_f32_x_untied, svfloat32_t, float, + z0 = svamax_n_f32_x (p0, z1, d4), + z0 = svamax_x (p0, z1, d4)) + +/* +** amax_2_f32_x_tied1: +** fmov (z[0-9]+\.s), #2\.0(?:e\+0)? +** famax z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f32_x_tied1, svfloat32_t, + z0 = svamax_n_f32_x (p0, z0, 2), + z0 = svamax_x (p0, z0, 2)) + +/* +** amax_2_f32_x_untied: +** fmov z0\.s, #2\.0(?:e\+0)? +** famax z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amax_2_f32_x_untied, svfloat32_t, + z0 = svamax_n_f32_x (p0, z1, 2), + z0 = svamax_x (p0, z1, 2)) + +/* +** ptrue_amax_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f32_x_tied1, svfloat32_t, + z0 = svamax_f32_x (svptrue_b32 (), z0, z1), + z0 = svamax_x (svptrue_b32 (), z0, z1)) + +/* +** ptrue_amax_f32_x_tied2: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f32_x_tied2, svfloat32_t, + z0 = svamax_f32_x (svptrue_b32 (), z1, z0), + z0 = svamax_x (svptrue_b32 (), z1, z0)) + +/* +** ptrue_amax_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f32_x_untied, svfloat32_t, + z0 = svamax_f32_x (svptrue_b32 (), z1, z2), + z0 = svamax_x (svptrue_b32 (), z1, z2)) + +/* +** ptrue_amax_0_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_0_f32_x_tied1, svfloat32_t, + z0 = svamax_n_f32_x (svptrue_b32 (), z0, 0), + z0 = svamax_x (svptrue_b32 (), z0, 0)) + +/* +** ptrue_amax_0_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_0_f32_x_untied, svfloat32_t, + z0 = svamax_n_f32_x (svptrue_b32 (), z1, 0), + z0 = svamax_x (svptrue_b32 (), z1, 0)) + +/* +** ptrue_amax_1_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_1_f32_x_tied1, svfloat32_t, + z0 = svamax_n_f32_x (svptrue_b32 (), z0, 1), + z0 = svamax_x (svptrue_b32 (), z0, 1)) + +/* +** ptrue_amax_1_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_1_f32_x_untied, svfloat32_t, + z0 = svamax_n_f32_x (svptrue_b32 (), z1, 1), + z0 = svamax_x (svptrue_b32 (), z1, 1)) + +/* +** ptrue_amax_2_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_2_f32_x_tied1, svfloat32_t, + z0 = svamax_n_f32_x (svptrue_b32 (), z0, 2), + z0 = svamax_x (svptrue_b32 (), z0, 2)) + +/* +** ptrue_amax_2_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_2_f32_x_untied, svfloat32_t, + z0 = svamax_n_f32_x (svptrue_b32 (), z1, 2), + z0 = svamax_x (svptrue_b32 (), z1, 2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c new file mode 100644 index 00000000000..5b73db45d8b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c @@ -0,0 +1,312 @@ +/* { dg-do compile } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +#pragma GCC target "+sve+faminmax" + +/* +** amax_f64_m_tied1: +** famax z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amax_f64_m_tied1, svfloat64_t, + z0 = svamax_f64_m (p0, z0, z1), + z0 = svamax_m (p0, z0, z1)) + +/* +** amax_f64_m_tied2: +** mov (z[0-9]+\.d), z0\.d +** movprfx z0, z1 +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_f64_m_tied2, svfloat64_t, + z0 = svamax_f64_m (p0, z1, z0), + z0 = svamax_m (p0, z1, z0)) + +/* +** amax_f64_m_untied: +** movprfx z0, z1 +** famax z0\.d, p0/m, z0\.d, z2\.d +** ret +*/ +TEST_UNIFORM_Z (amax_f64_m_untied, svfloat64_t, + z0 = svamax_f64_m (p0, z1, z2), + z0 = svamax_m (p0, z1, z2)) + +/* +** amax_d4_f64_m_tied1: +** mov (z[0-9]+\.d), d4 +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_d4_f64_m_tied1, svfloat64_t, double, + z0 = svamax_n_f64_m (p0, z0, d4), + z0 = svamax_m (p0, z0, d4)) + +/* +** amax_d4_f64_m_untied: +** mov (z[0-9]+\.d), d4 +** movprfx z0, z1 +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_d4_f64_m_untied, svfloat64_t, double, + z0 = svamax_n_f64_m (p0, z1, d4), + z0 = svamax_m (p0, z1, d4)) + +/* +** amax_2_f64_m: +** fmov (z[0-9]+\.d), #2\.0(?:e\+0)? +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f64_m, svfloat64_t, + z0 = svamax_n_f64_m (p0, z0, 2), + z0 = svamax_m (p0, z0, 2)) + +/* +** amax_f64_z_tied1: +** movprfx z0\.d, p0/z, z0\.d +** famax z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amax_f64_z_tied1, svfloat64_t, + z0 = svamax_f64_z (p0, z0, z1), + z0 = svamax_z (p0, z0, z1)) + +/* +** amax_f64_z_tied2: +** movprfx z0\.d, p0/z, z0\.d +** famax z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amax_f64_z_tied2, svfloat64_t, + z0 = svamax_f64_z (p0, z1, z0), + z0 = svamax_z (p0, z1, z0)) + +/* +** amax_f64_z_untied: +** ( +** movprfx z0\.d, p0/z, z1\.d +** famax z0\.d, p0/m, z0\.d, z2\.d +** | +** movprfx z0\.d, p0/z, z2\.d +** famax z0\.d, p0/m, z0\.d, z1\.d +** ) +** ret +*/ +TEST_UNIFORM_Z (amax_f64_z_untied, svfloat64_t, + z0 = svamax_f64_z (p0, z1, z2), + z0 = svamax_z (p0, z1, z2)) + +/* +** amax_d4_f64_z_tied1: +** mov (z[0-9]+\.d), d4 +** movprfx z0\.d, p0/z, z0\.d +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_d4_f64_z_tied1, svfloat64_t, double, + z0 = svamax_n_f64_z (p0, z0, d4), + z0 = svamax_z (p0, z0, d4)) + +/* +** amax_d4_f64_z_untied: +** mov (z[0-9]+\.d), d4 +** ( +** movprfx z0\.d, p0/z, z1\.d +** famax z0\.d, p0/m, z0\.d, \1 +** | +** movprfx z0\.d, p0/z, \1 +** famax z0\.d, p0/m, z0\.d, z1\.d +** ) +** ret +*/ +TEST_UNIFORM_ZD (amax_d4_f64_z_untied, svfloat64_t, double, + z0 = svamax_n_f64_z (p0, z1, d4), + z0 = svamax_z (p0, z1, d4)) + +/* +** amax_2_f64_z: +** fmov (z[0-9]+\.d), #2\.0(?:e\+0)? +** movprfx z0\.d, p0/z, z0\.d +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f64_z, svfloat64_t, + z0 = svamax_n_f64_z (p0, z0, 2), + z0 = svamax_z (p0, z0, 2)) + +/* +** amax_f64_x_tied1: +** famax z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amax_f64_x_tied1, svfloat64_t, + z0 = svamax_f64_x (p0, z0, z1), + z0 = svamax_x (p0, z0, z1)) + +/* +** amax_f64_x_tied2: +** famax z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amax_f64_x_tied2, svfloat64_t, + z0 = svamax_f64_x (p0, z1, z0), + z0 = svamax_x (p0, z1, z0)) + +/* +** amax_f64_x_untied: +** ( +** movprfx z0, z1 +** famax z0\.d, p0/m, z0\.d, z2\.d +** | +** movprfx z0, z2 +** famax z0\.d, p0/m, z0\.d, z1\.d +** ) +** ret +*/ +TEST_UNIFORM_Z (amax_f64_x_untied, svfloat64_t, + z0 = svamax_f64_x (p0, z1, z2), + z0 = svamax_x (p0, z1, z2)) + +/* +** amax_d4_f64_x_tied1: +** mov (z[0-9]+\.d), d4 +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_d4_f64_x_tied1, svfloat64_t, double, + z0 = svamax_n_f64_x (p0, z0, d4), + z0 = svamax_x (p0, z0, d4)) + +/* +** amax_d4_f64_x_untied: +** mov z0\.d, d4 +** famax z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_ZD (amax_d4_f64_x_untied, svfloat64_t, double, + z0 = svamax_n_f64_x (p0, z1, d4), + z0 = svamax_x (p0, z1, d4)) + +/* +** amax_2_f64_x_tied1: +** fmov (z[0-9]+\.d), #2\.0(?:e\+0)? +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f64_x_tied1, svfloat64_t, + z0 = svamax_n_f64_x (p0, z0, 2), + z0 = svamax_x (p0, z0, 2)) + +/* +** amax_2_f64_x_untied: +** fmov z0\.d, #2\.0(?:e\+0)? +** famax z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amax_2_f64_x_untied, svfloat64_t, + z0 = svamax_n_f64_x (p0, z1, 2), + z0 = svamax_x (p0, z1, 2)) + +/* +** ptrue_amax_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f64_x_tied1, svfloat64_t, + z0 = svamax_f64_x (svptrue_b64 (), z0, z1), + z0 = svamax_x (svptrue_b64 (), z0, z1)) + +/* +** ptrue_amax_f64_x_tied2: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f64_x_tied2, svfloat64_t, + z0 = svamax_f64_x (svptrue_b64 (), z1, z0), + z0 = svamax_x (svptrue_b64 (), z1, z0)) + +/* +** ptrue_amax_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f64_x_untied, svfloat64_t, + z0 = svamax_f64_x (svptrue_b64 (), z1, z2), + z0 = svamax_x (svptrue_b64 (), z1, z2)) + +/* +** ptrue_amax_0_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_0_f64_x_tied1, svfloat64_t, + z0 = svamax_n_f64_x (svptrue_b64 (), z0, 0), + z0 = svamax_x (svptrue_b64 (), z0, 0)) + +/* +** ptrue_amax_0_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_0_f64_x_untied, svfloat64_t, + z0 = svamax_n_f64_x (svptrue_b64 (), z1, 0), + z0 = svamax_x (svptrue_b64 (), z1, 0)) + +/* +** ptrue_amax_1_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_1_f64_x_tied1, svfloat64_t, + z0 = svamax_n_f64_x (svptrue_b64 (), z0, 1), + z0 = svamax_x (svptrue_b64 (), z0, 1)) + +/* +** ptrue_amax_1_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_1_f64_x_untied, svfloat64_t, + z0 = svamax_n_f64_x (svptrue_b64 (), z1, 1), + z0 = svamax_x (svptrue_b64 (), z1, 1)) + +/* +** ptrue_amax_2_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_2_f64_x_tied1, svfloat64_t, + z0 = svamax_n_f64_x (svptrue_b64 (), z0, 2), + z0 = svamax_x (svptrue_b64 (), z0, 2)) + +/* +** ptrue_amax_2_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_2_f64_x_untied, svfloat64_t, + z0 = svamax_n_f64_x (svptrue_b64 (), z1, 2), + z0 = svamax_x (svptrue_b64 (), z1, 2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c new file mode 100644 index 00000000000..bb3f20db93d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c @@ -0,0 +1,311 @@ +/* { dg-do compile } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +#pragma GCC target "+sve+faminmax" + +/* +** amin_f16_m_tied1: +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_m_tied1, svfloat16_t, + z0 = svamin_f16_m (p0, z0, z1), + z0 = svamin_m (p0, z0, z1)) + +/* +** amin_f16_m_tied2: +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z1 +** famin z0\.h, p0/m, z0\.h, \1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_m_tied2, svfloat16_t, + z0 = svamin_f16_m (p0, z1, z0), + z0 = svamin_m (p0, z1, z0)) + +/* +** amin_f16_m_untied: +** movprfx z0, z1 +** famin z0\.h, p0/m, z0\.h, z2\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_m_untied, svfloat16_t, + z0 = svamin_f16_m (p0, z1, z2), + z0 = svamin_m (p0, z1, z2)) + +/* +** amin_h4_f16_m_tied1: +** mov (z[0-9]+\.h), h4 +** famin z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_h4_f16_m_tied1, svfloat16_t, __fp16, + z0 = svamin_n_f16_m (p0, z0, d4), + z0 = svamin_m (p0, z0, d4)) + +/* +** amin_h4_f16_m_untied: +** mov (z[0-9]+\.h), h4 +** movprfx z0, z1 +** famin z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_h4_f16_m_untied, svfloat16_t, __fp16, + z0 = svamin_n_f16_m (p0, z1, d4), + z0 = svamin_m (p0, z1, d4)) + +/* +** amin_2_f16_m: +** fmov (z[0-9]+\.h), #2\.0(?:e\+0)? +** famin z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f16_m, svfloat16_t, + z0 = svamin_n_f16_m (p0, z0, 2), + z0 = svamin_m (p0, z0, 2)) + +/* +** amin_f16_z_tied1: +** movprfx z0\.h, p0/z, z0\.h +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_z_tied1, svfloat16_t, + z0 = svamin_f16_z (p0, z0, z1), + z0 = svamin_z (p0, z0, z1)) + +/* +** amin_f16_z_tied2: +** movprfx z0\.h, p0/z, z0\.h +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_z_tied2, svfloat16_t, + z0 = svamin_f16_z (p0, z1, z0), + z0 = svamin_z (p0, z1, z0)) + +/* +** amin_f16_z_untied: +** ( +** movprfx z0\.h, p0/z, z1\.h +** famin z0\.h, p0/m, z0\.h, z2\.h +** | +** movprfx z0\.h, p0/z, z2\.h +** famin z0\.h, p0/m, z0\.h, z1\.h +** ) +** ret +*/ +TEST_UNIFORM_Z (amin_f16_z_untied, svfloat16_t, + z0 = svamin_f16_z (p0, z1, z2), + z0 = svamin_z (p0, z1, z2)) + +/* +** amin_h4_f16_z_tied1: +** mov (z[0-9]+\.h), h4 +** movprfx z0\.h, p0/z, z0\.h +** famin z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_h4_f16_z_tied1, svfloat16_t, __fp16, + z0 = svamin_n_f16_z (p0, z0, d4), + z0 = svamin_z (p0, z0, d4)) + +/* +** amin_h4_f16_z_untied: +** mov (z[0-9]+\.h), h4 +** ( +** movprfx z0\.h, p0/z, z1\.h +** famin z0\.h, p0/m, z0\.h, \1 +** | +** movprfx z0\.h, p0/z, \1 +** famin z0\.h, p0/m, z0\.h, z1\.h +** ) +** ret +*/ +TEST_UNIFORM_ZD (amin_h4_f16_z_untied, svfloat16_t, __fp16, + z0 = svamin_n_f16_z (p0, z1, d4), + z0 = svamin_z (p0, z1, d4)) + +/* +** amin_2_f16_z: +** fmov (z[0-9]+\.h), #2\.0(?:e\+0)? +** movprfx z0\.h, p0/z, z0\.h +** famin z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f16_z, svfloat16_t, + z0 = svamin_n_f16_z (p0, z0, 2), + z0 = svamin_z (p0, z0, 2)) + +/* +** amin_f16_x_tied1: +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_x_tied1, svfloat16_t, + z0 = svamin_f16_x (p0, z0, z1), + z0 = svamin_x (p0, z0, z1)) + +/* +** amin_f16_x_tied2: +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_x_tied2, svfloat16_t, + z0 = svamin_f16_x (p0, z1, z0), + z0 = svamin_x (p0, z1, z0)) + +/* +** amin_f16_x_untied: +** ( +** movprfx z0, z1 +** famin z0\.h, p0/m, z0\.h, z2\.h +** | +** movprfx z0, z2 +** famin z0\.h, p0/m, z0\.h, z1\.h +** ) +** ret +*/ +TEST_UNIFORM_Z (amin_f16_x_untied, svfloat16_t, + z0 = svamin_f16_x (p0, z1, z2), + z0 = svamin_x (p0, z1, z2)) + +/* +** amin_h4_f16_x_tied1: +** mov (z[0-9]+\.h), h4 +** famin z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_h4_f16_x_tied1, svfloat16_t, __fp16, + z0 = svamin_n_f16_x (p0, z0, d4), + z0 = svamin_x (p0, z0, d4)) + +/* +** amin_h4_f16_x_untied: +** mov z0\.h, h4 +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZD (amin_h4_f16_x_untied, svfloat16_t, __fp16, + z0 = svamin_n_f16_x (p0, z1, d4), + z0 = svamin_x (p0, z1, d4)) +/* +** amin_2_f16_x_tied1: +** fmov (z[0-9]+\.h), #2\.0(?:e\+0)? +** famin z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f16_x_tied1, svfloat16_t, + z0 = svamin_n_f16_x (p0, z0, 2), + z0 = svamin_x (p0, z0, 2)) + +/* +** amin_2_f16_x_untied: +** fmov z0\.h, #2\.0(?:e\+0)? +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_2_f16_x_untied, svfloat16_t, + z0 = svamin_n_f16_x (p0, z1, 2), + z0 = svamin_x (p0, z1, 2)) + +/* +** ptrue_amin_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f16_x_tied1, svfloat16_t, + z0 = svamin_f16_x (svptrue_b16 (), z0, z1), + z0 = svamin_x (svptrue_b16 (), z0, z1)) + +/* +** ptrue_amin_f16_x_tied2: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f16_x_tied2, svfloat16_t, + z0 = svamin_f16_x (svptrue_b16 (), z1, z0), + z0 = svamin_x (svptrue_b16 (), z1, z0)) + +/* +** ptrue_amin_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f16_x_untied, svfloat16_t, + z0 = svamin_f16_x (svptrue_b16 (), z1, z2), + z0 = svamin_x (svptrue_b16 (), z1, z2)) + +/* +** ptrue_amin_0_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_0_f16_x_tied1, svfloat16_t, + z0 = svamin_n_f16_x (svptrue_b16 (), z0, 0), + z0 = svamin_x (svptrue_b16 (), z0, 0)) + +/* +** ptrue_amin_0_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_0_f16_x_untied, svfloat16_t, + z0 = svamin_n_f16_x (svptrue_b16 (), z1, 0), + z0 = svamin_x (svptrue_b16 (), z1, 0)) + +/* +** ptrue_amin_1_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_1_f16_x_tied1, svfloat16_t, + z0 = svamin_n_f16_x (svptrue_b16 (), z0, 1), + z0 = svamin_x (svptrue_b16 (), z0, 1)) + +/* +** ptrue_amin_1_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_1_f16_x_untied, svfloat16_t, + z0 = svamin_n_f16_x (svptrue_b16 (), z1, 1), + z0 = svamin_x (svptrue_b16 (), z1, 1)) + +/* +** ptrue_amin_2_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_2_f16_x_tied1, svfloat16_t, + z0 = svamin_n_f16_x (svptrue_b16 (), z0, 2), + z0 = svamin_x (svptrue_b16 (), z0, 2)) + +/* +** ptrue_amin_2_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_2_f16_x_untied, svfloat16_t, + z0 = svamin_n_f16_x (svptrue_b16 (), z1, 2), + z0 = svamin_x (svptrue_b16 (), z1, 2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c new file mode 100644 index 00000000000..704f5d62c59 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c @@ -0,0 +1,312 @@ +/* { dg-do compile } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +#pragma GCC target "+sve+faminmax" + +/* +** amin_f32_m_tied1: +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_m_tied1, svfloat32_t, + z0 = svamin_f32_m (p0, z0, z1), + z0 = svamin_m (p0, z0, z1)) + +/* +** amin_f32_m_tied2: +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z1 +** famin z0\.s, p0/m, z0\.s, \1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_m_tied2, svfloat32_t, + z0 = svamin_f32_m (p0, z1, z0), + z0 = svamin_m (p0, z1, z0)) + +/* +** amin_f32_m_untied: +** movprfx z0, z1 +** famin z0\.s, p0/m, z0\.s, z2\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_m_untied, svfloat32_t, + z0 = svamin_f32_m (p0, z1, z2), + z0 = svamin_m (p0, z1, z2)) + +/* +** amin_s4_f32_m_tied1: +** mov (z[0-9]+\.s), s4 +** famin z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_s4_f32_m_tied1, svfloat32_t, float, + z0 = svamin_n_f32_m (p0, z0, d4), + z0 = svamin_m (p0, z0, d4)) + +/* +** amin_s4_f32_m_untied: +** mov (z[0-9]+\.s), s4 +** movprfx z0, z1 +** famin z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_s4_f32_m_untied, svfloat32_t, float, + z0 = svamin_n_f32_m (p0, z1, d4), + z0 = svamin_m (p0, z1, d4)) + +/* +** amin_2_f32_m: +** fmov (z[0-9]+\.s), #2\.0(?:e\+0)? +** famin z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f32_m, svfloat32_t, + z0 = svamin_n_f32_m (p0, z0, 2), + z0 = svamin_m (p0, z0, 2)) + +/* +** amin_f32_z_tied1: +** movprfx z0\.s, p0/z, z0\.s +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_z_tied1, svfloat32_t, + z0 = svamin_f32_z (p0, z0, z1), + z0 = svamin_z (p0, z0, z1)) + +/* +** amin_f32_z_tied2: +** movprfx z0\.s, p0/z, z0\.s +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_z_tied2, svfloat32_t, + z0 = svamin_f32_z (p0, z1, z0), + z0 = svamin_z (p0, z1, z0)) + +/* +** amin_f32_z_untied: +** ( +** movprfx z0\.s, p0/z, z1\.s +** famin z0\.s, p0/m, z0\.s, z2\.s +** | +** movprfx z0\.s, p0/z, z2\.s +** famin z0\.s, p0/m, z0\.s, z1\.s +** ) +** ret +*/ +TEST_UNIFORM_Z (amin_f32_z_untied, svfloat32_t, + z0 = svamin_f32_z (p0, z1, z2), + z0 = svamin_z (p0, z1, z2)) + +/* +** amin_s4_f32_z_tied1: +** mov (z[0-9]+\.s), s4 +** movprfx z0\.s, p0/z, z0\.s +** famin z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_s4_f32_z_tied1, svfloat32_t, float, + z0 = svamin_n_f32_z (p0, z0, d4), + z0 = svamin_z (p0, z0, d4)) + +/* +** amin_s4_f32_z_untied: +** mov (z[0-9]+\.s), s4 +** ( +** movprfx z0\.s, p0/z, z1\.s +** famin z0\.s, p0/m, z0\.s, \1 +** | +** movprfx z0\.s, p0/z, \1 +** famin z0\.s, p0/m, z0\.s, z1\.s +** ) +** ret +*/ +TEST_UNIFORM_ZD (amin_s4_f32_z_untied, svfloat32_t, float, + z0 = svamin_n_f32_z (p0, z1, d4), + z0 = svamin_z (p0, z1, d4)) + +/* +** amin_2_f32_z: +** fmov (z[0-9]+\.s), #2\.0(?:e\+0)? +** movprfx z0\.s, p0/z, z0\.s +** famin z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f32_z, svfloat32_t, + z0 = svamin_n_f32_z (p0, z0, 2), + z0 = svamin_z (p0, z0, 2)) + +/* +** amin_f32_x_tied1: +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_x_tied1, svfloat32_t, + z0 = svamin_f32_x (p0, z0, z1), + z0 = svamin_x (p0, z0, z1)) + +/* +** amin_f32_x_tied2: +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_x_tied2, svfloat32_t, + z0 = svamin_f32_x (p0, z1, z0), + z0 = svamin_x (p0, z1, z0)) + +/* +** amin_f32_x_untied: +** ( +** movprfx z0, z1 +** famin z0\.s, p0/m, z0\.s, z2\.s +** | +** movprfx z0, z2 +** famin z0\.s, p0/m, z0\.s, z1\.s +** ) +** ret +*/ +TEST_UNIFORM_Z (amin_f32_x_untied, svfloat32_t, + z0 = svamin_f32_x (p0, z1, z2), + z0 = svamin_x (p0, z1, z2)) + +/* +** amin_s4_f32_x_tied1: +** mov (z[0-9]+\.s), s4 +** famin z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_s4_f32_x_tied1, svfloat32_t, float, + z0 = svamin_n_f32_x (p0, z0, d4), + z0 = svamin_x (p0, z0, d4)) + +/* +** amin_s4_f32_x_untied: +** mov z0\.s, s4 +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_ZD (amin_s4_f32_x_untied, svfloat32_t, float, + z0 = svamin_n_f32_x (p0, z1, d4), + z0 = svamin_x (p0, z1, d4)) + +/* +** amin_2_f32_x_tied1: +** fmov (z[0-9]+\.s), #2\.0(?:e\+0)? +** famin z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f32_x_tied1, svfloat32_t, + z0 = svamin_n_f32_x (p0, z0, 2), + z0 = svamin_x (p0, z0, 2)) + +/* +** amin_2_f32_x_untied: +** fmov z0\.s, #2\.0(?:e\+0)? +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_2_f32_x_untied, svfloat32_t, + z0 = svamin_n_f32_x (p0, z1, 2), + z0 = svamin_x (p0, z1, 2)) + +/* +** ptrue_amin_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f32_x_tied1, svfloat32_t, + z0 = svamin_f32_x (svptrue_b32 (), z0, z1), + z0 = svamin_x (svptrue_b32 (), z0, z1)) + +/* +** ptrue_amin_f32_x_tied2: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f32_x_tied2, svfloat32_t, + z0 = svamin_f32_x (svptrue_b32 (), z1, z0), + z0 = svamin_x (svptrue_b32 (), z1, z0)) + +/* +** ptrue_amin_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f32_x_untied, svfloat32_t, + z0 = svamin_f32_x (svptrue_b32 (), z1, z2), + z0 = svamin_x (svptrue_b32 (), z1, z2)) + +/* +** ptrue_amin_0_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_0_f32_x_tied1, svfloat32_t, + z0 = svamin_n_f32_x (svptrue_b32 (), z0, 0), + z0 = svamin_x (svptrue_b32 (), z0, 0)) + +/* +** ptrue_amin_0_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_0_f32_x_untied, svfloat32_t, + z0 = svamin_n_f32_x (svptrue_b32 (), z1, 0), + z0 = svamin_x (svptrue_b32 (), z1, 0)) + +/* +** ptrue_amin_1_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_1_f32_x_tied1, svfloat32_t, + z0 = svamin_n_f32_x (svptrue_b32 (), z0, 1), + z0 = svamin_x (svptrue_b32 (), z0, 1)) + +/* +** ptrue_amin_1_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_1_f32_x_untied, svfloat32_t, + z0 = svamin_n_f32_x (svptrue_b32 (), z1, 1), + z0 = svamin_x (svptrue_b32 (), z1, 1)) + +/* +** ptrue_amin_2_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_2_f32_x_tied1, svfloat32_t, + z0 = svamin_n_f32_x (svptrue_b32 (), z0, 2), + z0 = svamin_x (svptrue_b32 (), z0, 2)) + +/* +** ptrue_amin_2_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_2_f32_x_untied, svfloat32_t, + z0 = svamin_n_f32_x (svptrue_b32 (), z1, 2), + z0 = svamin_x (svptrue_b32 (), z1, 2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c new file mode 100644 index 00000000000..d2880d8507c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c @@ -0,0 +1,312 @@ +/* { dg-do compile } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +#pragma GCC target "+sve+faminmax" + +/* +** amin_f64_m_tied1: +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_m_tied1, svfloat64_t, + z0 = svamin_f64_m (p0, z0, z1), + z0 = svamin_m (p0, z0, z1)) + +/* +** amin_f64_m_tied2: +** mov (z[0-9]+\.d), z0\.d +** movprfx z0, z1 +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_f64_m_tied2, svfloat64_t, + z0 = svamin_f64_m (p0, z1, z0), + z0 = svamin_m (p0, z1, z0)) + +/* +** amin_f64_m_untied: +** movprfx z0, z1 +** famin z0\.d, p0/m, z0\.d, z2\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_m_untied, svfloat64_t, + z0 = svamin_f64_m (p0, z1, z2), + z0 = svamin_m (p0, z1, z2)) + +/* +** amin_d4_f64_m_tied1: +** mov (z[0-9]+\.d), d4 +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_d4_f64_m_tied1, svfloat64_t, double, + z0 = svamin_n_f64_m (p0, z0, d4), + z0 = svamin_m (p0, z0, d4)) + +/* +** amin_d4_f64_m_untied: +** mov (z[0-9]+\.d), d4 +** movprfx z0, z1 +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_d4_f64_m_untied, svfloat64_t, double, + z0 = svamin_n_f64_m (p0, z1, d4), + z0 = svamin_m (p0, z1, d4)) + +/* +** amin_2_f64_m: +** fmov (z[0-9]+\.d), #2\.0(?:e\+0)? +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f64_m, svfloat64_t, + z0 = svamin_n_f64_m (p0, z0, 2), + z0 = svamin_m (p0, z0, 2)) + +/* +** amin_f64_z_tied1: +** movprfx z0\.d, p0/z, z0\.d +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_z_tied1, svfloat64_t, + z0 = svamin_f64_z (p0, z0, z1), + z0 = svamin_z (p0, z0, z1)) + +/* +** amin_f64_z_tied2: +** movprfx z0\.d, p0/z, z0\.d +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_z_tied2, svfloat64_t, + z0 = svamin_f64_z (p0, z1, z0), + z0 = svamin_z (p0, z1, z0)) + +/* +** amin_f64_z_untied: +** ( +** movprfx z0\.d, p0/z, z1\.d +** famin z0\.d, p0/m, z0\.d, z2\.d +** | +** movprfx z0\.d, p0/z, z2\.d +** famin z0\.d, p0/m, z0\.d, z1\.d +** ) +** ret +*/ +TEST_UNIFORM_Z (amin_f64_z_untied, svfloat64_t, + z0 = svamin_f64_z (p0, z1, z2), + z0 = svamin_z (p0, z1, z2)) + +/* +** amin_d4_f64_z_tied1: +** mov (z[0-9]+\.d), d4 +** movprfx z0\.d, p0/z, z0\.d +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_d4_f64_z_tied1, svfloat64_t, double, + z0 = svamin_n_f64_z (p0, z0, d4), + z0 = svamin_z (p0, z0, d4)) + +/* +** amin_d4_f64_z_untied: +** mov (z[0-9]+\.d), d4 +** ( +** movprfx z0\.d, p0/z, z1\.d +** famin z0\.d, p0/m, z0\.d, \1 +** | +** movprfx z0\.d, p0/z, \1 +** famin z0\.d, p0/m, z0\.d, z1\.d +** ) +** ret +*/ +TEST_UNIFORM_ZD (amin_d4_f64_z_untied, svfloat64_t, double, + z0 = svamin_n_f64_z (p0, z1, d4), + z0 = svamin_z (p0, z1, d4)) + +/* +** amin_2_f64_z: +** fmov (z[0-9]+\.d), #2\.0(?:e\+0)? +** movprfx z0\.d, p0/z, z0\.d +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f64_z, svfloat64_t, + z0 = svamin_n_f64_z (p0, z0, 2), + z0 = svamin_z (p0, z0, 2)) + +/* +** amin_f64_x_tied1: +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_x_tied1, svfloat64_t, + z0 = svamin_f64_x (p0, z0, z1), + z0 = svamin_x (p0, z0, z1)) + +/* +** amin_f64_x_tied2: +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_x_tied2, svfloat64_t, + z0 = svamin_f64_x (p0, z1, z0), + z0 = svamin_x (p0, z1, z0)) + +/* +** amin_f64_x_untied: +** ( +** movprfx z0, z1 +** famin z0\.d, p0/m, z0\.d, z2\.d +** | +** movprfx z0, z2 +** famin z0\.d, p0/m, z0\.d, z1\.d +** ) +** ret +*/ +TEST_UNIFORM_Z (amin_f64_x_untied, svfloat64_t, + z0 = svamin_f64_x (p0, z1, z2), + z0 = svamin_x (p0, z1, z2)) + +/* +** amin_d4_f64_x_tied1: +** mov (z[0-9]+\.d), d4 +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_d4_f64_x_tied1, svfloat64_t, double, + z0 = svamin_n_f64_x (p0, z0, d4), + z0 = svamin_x (p0, z0, d4)) + +/* +** amin_d4_f64_x_untied: +** mov z0\.d, d4 +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_ZD (amin_d4_f64_x_untied, svfloat64_t, double, + z0 = svamin_n_f64_x (p0, z1, d4), + z0 = svamin_x (p0, z1, d4)) + +/* +** amin_2_f64_x_tied1: +** fmov (z[0-9]+\.d), #2\.0(?:e\+0)? +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f64_x_tied1, svfloat64_t, + z0 = svamin_n_f64_x (p0, z0, 2), + z0 = svamin_x (p0, z0, 2)) + +/* +** amin_2_f64_x_untied: +** fmov z0\.d, #2\.0(?:e\+0)? +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_2_f64_x_untied, svfloat64_t, + z0 = svamin_n_f64_x (p0, z1, 2), + z0 = svamin_x (p0, z1, 2)) + +/* +** ptrue_amin_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f64_x_tied1, svfloat64_t, + z0 = svamin_f64_x (svptrue_b64 (), z0, z1), + z0 = svamin_x (svptrue_b64 (), z0, z1)) + +/* +** ptrue_amin_f64_x_tied2: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f64_x_tied2, svfloat64_t, + z0 = svamin_f64_x (svptrue_b64 (), z1, z0), + z0 = svamin_x (svptrue_b64 (), z1, z0)) + +/* +** ptrue_amin_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f64_x_untied, svfloat64_t, + z0 = svamin_f64_x (svptrue_b64 (), z1, z2), + z0 = svamin_x (svptrue_b64 (), z1, z2)) + +/* +** ptrue_amin_0_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_0_f64_x_tied1, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z0, 0), + z0 = svamin_x (svptrue_b64 (), z0, 0)) + +/* +** ptrue_amin_0_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_0_f64_x_untied, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z1, 0), + z0 = svamin_x (svptrue_b64 (), z1, 0)) + +/* +** ptrue_amin_1_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_1_f64_x_tied1, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z0, 1), + z0 = svamin_x (svptrue_b64 (), z0, 1)) + +/* +** ptrue_amin_1_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_1_f64_x_untied, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z1, 1), + z0 = svamin_x (svptrue_b64 (), z1, 1)) + +/* +** ptrue_amin_2_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_2_f64_x_tied1, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z0, 2), + z0 = svamin_x (svptrue_b64 (), z0, 2)) + +/* +** ptrue_amin_2_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_2_f64_x_untied, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z1, 2), + z0 = svamin_x (svptrue_b64 (), z1, 2)) From patchwork Thu Oct 3 09:51:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saurabh Jha X-Patchwork-Id: 1992273 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=BcA/S9AZ; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=BcA/S9AZ; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XK6Rs5TzQz1xt1 for ; Thu, 3 Oct 2024 19:52:57 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9A913384A82F for ; Thu, 3 Oct 2024 09:52:55 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR03-AM7-obe.outbound.protection.outlook.com (mail-am7eur03on20600.outbound.protection.outlook.com [IPv6:2a01:111:f403:260e::600]) by sourceware.org (Postfix) with ESMTPS id 977B1386544D for ; Thu, 3 Oct 2024 09:52:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 977B1386544D Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 977B1386544D Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=2a01:111:f403:260e::600 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1727949156; cv=pass; b=PjmcAEJVaUCGvbo/o+7j+/w+U8AzDYhB7IP9iAsnjIJAr0h4vyWasik3Aqa6YqC4HL2ZPS9o0/GrJXj0HJPGYaeBGkuZZsVKOdS5BY+x2y5Vao8TGwwV/pNmcLE2XOBInY/NJ3PMrHhwhIusuQZeVqTZoWIV3u+RlA+AByAEcu4= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1727949156; c=relaxed/simple; bh=9F0C8t741KiWikytl6O9LSRWID7EirhadpaWCcTGiJ4=; h=DKIM-Signature:DKIM-Signature:From:To:Subject:Date:Message-ID: MIME-Version; b=PcwBuGsmI810Z2Qv4VI9PvJ3VCATjVfVky2sY86XrdWzqXVkrLee7HSm57p8BzSX2OlGpd+5VGk9llyzSDlDWuhpVq9els19y5VUVJ2YnGO//wjWBcXD3f4+IeWdoDSW2MPjE51eZjaip77xX+MrKWQv/owow6mVey4WHfyfASA= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=pass; b=dDZTvelmVk0R+3PNrg3h5YEUkYhUedMp5GI0BZZfwkNZi0Z6Ce7G0LQt83E6NYX6ABRzacYGmMNEdD56ZDTebypB3LmgLpy72BVJR5icy2q8PnDz6KhKFWI8AWe9+JG8Mr7HiIYo5iloBkCyD2TR5+RiknolybFEedottglnCW97XvgH1CcyR4Zq6jIHE6RANnvuXJRUZCvuIOYVb5AxDbtxwWcaKiu7KJHaMxpPxwEk27OkLBLIfjq2V7nI+UIPaO4ZaozLtYyntyVw1TfN8TNDdyf30yI9y2tAZHC02rMrvyGe/qhdOO0E2E7bU/gIHRQopcsDUWV6Dw3aIcaygA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ByqbyexuLBBPuTT36A5kiRwaWNu53fpN/aGRjisxS9Q=; b=CH7dbmj05J/8XqmhwxWBjSqR2DJMj2Xn9YYocbfA1HFmBVfDy52kp2/O8PJs4C8ucZ/FQvRbRlN5355ZSbizt8i9t1OZacnM9ZneTg9NhgkoUWY/n1j0iTrb2HnC+ZBsbsQPji5mXlmvdOtj8N9WtrxuQYg+Tim3k0mOJNA6uDuKyDYh1ZxtV5afz2Q/voDyVcvMqBvh6YUfjMfSs6+lM23Cw6lRNOu1xJFaadf8oQOki05JHzG8UidIChk8tYU116dY1/uT+hzBKSsvCZGY6ozgNn/CUAnlvuiAvvo9DhrWON4kuHmNnXQtB1wnV0goBCIZG5v8Wcurid4c2ShJKw== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=arm.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ByqbyexuLBBPuTT36A5kiRwaWNu53fpN/aGRjisxS9Q=; b=BcA/S9AZW1H87Z54wTVNFmlumChByjftnnr8WaEdap8CI341hrxaO/VrULiIy8elAJR2EZk9QlETyzX+pBEvGrAi307oKujYocy5zsA4h0PytPkg0IgeSMbWmwZ/Dzwaj+jHnvOlw5nxavRfzoVqskPuAEPwYZdGMnhzuvO1+OY= Received: from DB8PR04CA0016.eurprd04.prod.outlook.com (2603:10a6:10:110::26) by DB9PR08MB9515.eurprd08.prod.outlook.com (2603:10a6:10:453::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.18; Thu, 3 Oct 2024 09:52:28 +0000 Received: from DU2PEPF0001E9BF.eurprd03.prod.outlook.com (2603:10a6:10:110:cafe::c0) by DB8PR04CA0016.outlook.office365.com (2603:10a6:10:110::26) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.17 via Frontend Transport; Thu, 3 Oct 2024 09:52:28 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=arm.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DU2PEPF0001E9BF.mail.protection.outlook.com (10.167.8.68) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8026.11 via Frontend Transport; Thu, 3 Oct 2024 09:52:27 +0000 Received: ("Tessian outbound 10d5cea79515:v473"); Thu, 03 Oct 2024 09:52:27 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: b5527604e0b17bf7 X-TessianGatewayMetadata: z0EaD/EUcwwnGiPA2/G9MsiRDVJFa1+QWBh1QaZWllT/0++RNnf5j77Pvb+F159coI4rlxWdVQlKSjBSgA21GNgCBueA6ctpdnBndZnfWJEEQJGZYE94D7nI49m0HSkMWwO5b9DY3c6KgfTUMSFPSt6yKuwm6YyXYFfsTavdXCk= X-CR-MTA-TID: 64aa7808 Received: from Lbd7ab37ff161.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id C9A8258E-7CED-4568-9F12-C041D7E5FD33.1; Thu, 03 Oct 2024 09:52:15 +0000 Received: from EUR05-AM6-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id Lbd7ab37ff161.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Thu, 03 Oct 2024 09:52:15 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=ogeBqgwQTOAx1fvSGTXp6FfFZ33GsMHtgAlI67DLPVOSSxt/qFgO6ACAizwDGGWWrbY1PUv5+sXPvjOUnFVbp1AXetlteyiQiYx5+1mw/ZMOaLDq+l65MNHk8esF59Kh7/kieiYa3J0b7z/E33OlP3XQUjSOYTYhD/PXjMktbfuWDLRbnJwuBeVoKskJfPj3Ie5a+O8O2dp3jqw9wUC5qH0KivR1C0YFkXsQM6dsnjEM8cAbanvDnuo0PE2uXmFWIjqrrREfxymTrQ7gVLEQsFbAj2zMJoaQ+a608+VHzOBYarzwtx8v5AsUYUPQ+k3PBqV9ztxfiovanE+jJPEgUw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ByqbyexuLBBPuTT36A5kiRwaWNu53fpN/aGRjisxS9Q=; b=kCtt4OnhN8YeigippfH8rQrTM03PRXtpScgNWFOjNSVX+kHIfatKiPqeJqhV5hAinwCqEBsgywCr9yMxjWt6EHunGNndDMvT51JZsKH625nMmw7Y7LBv6kBWJtY8KxNZah/a1IazM/o5pARSwgWXiI4udLv495Pd/tX8DLqGpfr83yr0Xu9mt3L7e8FWhv9iotuM4pt571QCTYvmuiiLFueHc9siAEm3jlkRUqYRl2NjjNiBh8COjWAY2vHTGen5h2e5zA0m5dBtYm2ljHPD8Z7ExCy9xAcV+wB4L7A2VWl2qH3917/yLIDWTdeavYScqR1746K1URFOxRUjO+/EyQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 40.67.248.234) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ByqbyexuLBBPuTT36A5kiRwaWNu53fpN/aGRjisxS9Q=; b=BcA/S9AZW1H87Z54wTVNFmlumChByjftnnr8WaEdap8CI341hrxaO/VrULiIy8elAJR2EZk9QlETyzX+pBEvGrAi307oKujYocy5zsA4h0PytPkg0IgeSMbWmwZ/Dzwaj+jHnvOlw5nxavRfzoVqskPuAEPwYZdGMnhzuvO1+OY= Received: from AM6P195CA0018.EURP195.PROD.OUTLOOK.COM (2603:10a6:209:81::31) by DB5PR08MB10311.eurprd08.prod.outlook.com (2603:10a6:10:4a5::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.16; Thu, 3 Oct 2024 09:52:12 +0000 Received: from AMS0EPF00000191.eurprd05.prod.outlook.com (2603:10a6:209:81:cafe::fc) by AM6P195CA0018.outlook.office365.com (2603:10a6:209:81::31) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.16 via Frontend Transport; Thu, 3 Oct 2024 09:52:12 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 40.67.248.234) smtp.mailfrom=arm.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 40.67.248.234 as permitted sender) receiver=protection.outlook.com; client-ip=40.67.248.234; helo=nebula.arm.com; pr=C Received: from nebula.arm.com (40.67.248.234) by AMS0EPF00000191.mail.protection.outlook.com (10.167.16.216) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.7918.13 via Frontend Transport; Thu, 3 Oct 2024 09:52:12 +0000 Received: from AZ-NEU-EX06.Arm.com (10.240.25.134) by AZ-NEU-EX03.Arm.com (10.251.24.31) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Thu, 3 Oct 2024 09:52:08 +0000 Received: from AZ-NEU-EX03.Arm.com (10.251.24.31) by AZ-NEU-EX06.Arm.com (10.240.25.134) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Thu, 3 Oct 2024 09:52:07 +0000 Received: from e130340.cambridge.arm.com (10.2.80.47) by mail.arm.com (10.251.24.31) with Microsoft SMTP Server id 15.1.2507.39 via Frontend Transport; Thu, 3 Oct 2024 09:52:07 +0000 From: To: CC: , , Saurabh Jha Subject: [PATCH v4 2/2] aarch64: Add codegen support for SVE2 faminmax Date: Thu, 3 Oct 2024 10:51:57 +0100 Message-ID: <20241003095157.1390838-3-saurabh.jha@arm.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20241003095157.1390838-1-saurabh.jha@arm.com> References: <20241003095157.1390838-1-saurabh.jha@arm.com> MIME-Version: 1.0 X-EOPAttributedMessage: 1 X-MS-TrafficTypeDiagnostic: AMS0EPF00000191:EE_|DB5PR08MB10311:EE_|DU2PEPF0001E9BF:EE_|DB9PR08MB9515:EE_ X-MS-Office365-Filtering-Correlation-Id: 0f71b858-038b-435a-427a-08dce391171a x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; ARA:13230040|82310400026|1800799024|36860700013|376014; X-Microsoft-Antispam-Message-Info-Original: gOfic+eogX1uUP94gl5FcmQKfsYJTm1bXfLZzXxgvqn8hnzOEbWwiimWuejFMt+ZqJrHKEfJaHiRbcJPducPfL/e7iZDMQJNkubGm0raqg+f/FdgBhx/kBeuaplQurGBIu/5nOYXTSdycFw6lVXsvWIL1VkqWSaWx/C6e2r8wn+h8eXfsAla25IAW491Eg1MYRbu7lNdSYpzf90cxdODVuAogmvBlRcAP/syYww2f8kwSospqTHtGKzi8GmbVxJ2ZFbjwUC6xhLTup+XvKU0+Lz5ch07bT9+7Lf/eX/bxfDVb5lwYpVdZk4jMrLe1ZNIbhsbQuGP0sbI7Y5Ij+HwkQoHDGu7ZSLqBdZV17bLo9tgNXwx7tomFqKCwdtm7qHHCZMHXkXv3ewMQ+MVNhyHchB9sFNHJ5ZM3AXZTYr4OmIXt06D4P/YzgouxV6JdOgpEyw/RBnqIaPj0Z7p2REas+38GAJWuex2nG3Od/UPHg4UVAZyQHJOZ8v2iNThTEKBS8/whZfcul/WbKg5OkLjGNKaN/PtIuulNsf8kWKGUyEPKwL6D1DhrT6wi2mxVCtj2DyNbHOLuNfrSmTSSaFWTvN4/GaY8q1aYIb02Vzn6M938EVB6YfXeS5TLy31xLoumrniN0H1nHP36CEWuaWLvQMvhFpy7CioO0q1ZyCetaZiwwmASPAKkV1osK4g2nD1Njqtusf9HQEn3X5PWqGBRH/CJEniK/RkdqoegMOt2CR1lz0iWAAET3oV4Z4yqQlnr0qmijSfdc3nQFzvRJuzIyWWFwBDn0g8hZbVRgyYyLZbLJscyUxnEHC1+TdKuW3ca018vb1haXFgGyWR9pcsG1NSHHZmLc0cx3uXGpZIz+f1rwYWkRJ3TFWslVfvR9y9O5wElm5j9h9kY/d8YpYPSkjsnimEISppmSVUWtLLBB/3O1wADh6+0UjJtfv0TKMgDh8AkuohzRkw4yD/G92O8fOklQDz/IaJYYkAUQ/f3AIKXnjT9gxDYo4afdubXokxHaUcYqwtQycUQcwOFRMsr2QAEjmPsGm+AEbdi0QoiaGw22+LphuejmDUPltgXuoi/kAEaCshYNe7sxAcUk7qCZfdGhkJNFFRyNLiqLJVJDQRet/J3ObJaOTd9jvlGun0aE8jN30jflpcxbLyZBgPOs7hngIHYmSgTIscEW95hEVbQeZHTgJ4fS7OBsLFXH/h66wxWnJePNMMUBUocKyrLIGj+cnabFSjY3h0JWt415t9deNWNemDaqzfYnjY8Q0o6ysPEQQUP3JTdQJNYTMx4N4f3h+YDWcIpi+9xeIrZxN1PMvsrGjSQls6kDK77im9mQSCrylSPaJTHxi1aD/K6w== X-Forefront-Antispam-Report-Untrusted: CIP:40.67.248.234; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:nebula.arm.com; PTR:InfoDomainNonexistent; CAT:NONE; SFS:(13230040)(82310400026)(1800799024)(36860700013)(376014); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB5PR08MB10311 X-MS-Exchange-SkipListedInternetSender: ip=[2603:10a6:209:81::31]; domain=AM6P195CA0018.EURP195.PROD.OUTLOOK.COM X-MS-Exchange-Transport-CrossTenantHeadersStripped: DU2PEPF0001E9BF.eurprd03.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: 354a67ca-3efd-436f-d4d0-08dce3910e19 X-Microsoft-Antispam: BCL:0; ARA:13230040|36860700013|35042699022|1800799024|82310400026|376014; X-Microsoft-Antispam-Message-Info: =?utf-8?q?oY8EeBJTJYjSIXDks9OaY7kv/k1X4Ck?= =?utf-8?q?ckhYiX5FzDJzOioNadtYttK2lzKThWk9Y/5JLD1gMe2D8CEwDWBuAC64YqNO50TPv?= =?utf-8?q?Ila9w4bG3Gqk3J3VgOZ/2GTCp2nH2N6Sz9hmjjkJHd5mjBHIU/egab8yozYuvJ/Kn?= =?utf-8?q?mjSqmgjwoXng7gzm2dsSg9L/Zn3mGRhPF2ehjTuS2FomNqEw2OY/qNE2hmLTPWyrz?= =?utf-8?q?EhwBNrsvj4eW/GSu6GiePL2FRBkiERsjthIMmaiyfRbsRPtaMCrKaFnwpGSVOv0Bg?= =?utf-8?q?TMAVjlKmscOFFFuJ8nWKlVKgTfW99H3xp6Kb/EdeINohTPEhTVtIDDWirymycG7Mx?= =?utf-8?q?KrUrY2Va44voS8ak0n/ua6lxxlbAPi0b18IE7r2De/GZfnhgH09yFLs82Jynx3oEf?= =?utf-8?q?apfaq7cRElF8X0o33FNc/p21gtx/mgK5bC55YZ7YTpOxV/aHOiFqeRSGzGg9dwhbe?= =?utf-8?q?2f/DPcB3lTKXJ3i0RE4rDvvvjUGPOC/TCG8+kkOAwKvZons7HYfSKNwNlvxYN5hDw?= =?utf-8?q?NARvboka/JmKokwqVA7TE9aHgAIEVX/sMmZ8MrIn9lPyPrMnpJQs1KtuZdCKUXROJ?= =?utf-8?q?S/Qjgh/8qdtN/CHcfIV1NvQ49dsYny9bUeMRmdZCi3lwsLkYNja+60aIZsXk18Css?= =?utf-8?q?2hAckseMF/Rcnh0P5bbaTw/vaRBYnxKaY3vY3aYaszY/+6N3YFdZ2qU84v0K+3m5x?= =?utf-8?q?kUrMMgS6xsPvapDSymAzVqZHBVD8Lf7LWR2kEZz/J0OVvPGIT0jyU8oenh4b0bB86?= =?utf-8?q?9XUz18CiIMUfij606YlZE6PWJ3Y2uXEULRQa670e9R4+iyqAmfv4Dx2udBKfMQpiz?= =?utf-8?q?mLdIPsJ4nBdeU7xl4S4p4isRvZjT8kH2auABwjmMC48ArhFKIRW81Aw8HUtgsV98X?= =?utf-8?q?vBrwUMwXrvhPyhrIRY92YBlhlDGaKDm15rpDQNiq/GKOfP4RvlvsCgIoNikeIcmln?= =?utf-8?q?Ei7MVdrxG1r+fJq1F2Cwq/d3Wbd1o131UiFb13aIVngAoaBTSFvsAqSOCtmAV+uqI?= =?utf-8?q?gIYYXtFqSBsc49SoLihaQHHtqs2zD+0TruwPtZbx1aULA5bTPPJmjKYYQLBqUlk4p?= =?utf-8?q?s9FfpEXievyKBiSJQoO2mRXEkYqvnwx0zDIXcarzHVMBmeD5OwPwtSRCr0uIAuba1?= =?utf-8?q?bkn1HrLi7YSatd/hWW7RjOZJs+PSba0otmXMWA1xJ91y3EuCIvlfDJ6NqEMYroG4m?= =?utf-8?q?tYu83XsLo+HBW0pl5QhTVUHdNkZ1h/HoeCDYimxbj98pN68Lj96TGhonsjTi7vSEN?= =?utf-8?q?HFsao81J3Jf+ujWNRkyIP/Jn6DgiKZt1nFA=3D=3D?= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230040)(36860700013)(35042699022)(1800799024)(82310400026)(376014); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 03 Oct 2024 09:52:27.6502 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 0f71b858-038b-435a-427a-08dce391171a X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DU2PEPF0001E9BF.eurprd03.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB9PR08MB9515 X-Spam-Status: No, score=-11.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FORGED_SPF_HELO, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_SHORT, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation for famax and famin in terms of existing unspecs. With this patch: 1. famax can be expressed as taking UNSPEC_COND_SMAX of the two operands and then taking absolute value of their result. 2. famin can be expressed as taking UNSPEC_COND_SMIN of the two operands and then taking absolute value of their result. This fusion of operators is only possible when -march=armv9-a+faminmax+sve flags are passed. We also need to pass -ffast-math flag; this is what enables compiler to use UNSPEC_COND_SMAX and UNSPEC_COND_SMIN. This code generation is only available on -O2 or -O3 as that is when auto-vectorization is enabled. gcc/ChangeLog: * config/aarch64/aarch64-sve2.md (*aarch64_pred_faminmax_fused): Instruction pattern for faminmax codegen. * config/aarch64/iterators.md: Iterator and attribute for faminmax codegen. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/faminmax_1.c: New test. * gcc.target/aarch64/sve/faminmax_2.c: New test. --- gcc/config/aarch64/aarch64-sve2.md | 37 ++++++++++++ gcc/config/aarch64/iterators.md | 6 ++ .../gcc.target/aarch64/sve/faminmax_1.c | 44 ++++++++++++++ .../gcc.target/aarch64/sve/faminmax_2.c | 60 +++++++++++++++++++ 4 files changed, 147 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md index 725092cc95f..5f2697c3179 100644 --- a/gcc/config/aarch64/aarch64-sve2.md +++ b/gcc/config/aarch64/aarch64-sve2.md @@ -2467,6 +2467,43 @@ [(set_attr "movprfx" "yes")] ) +;; ------------------------------------------------------------------------- +;; -- [FP] Absolute maximum and minimum +;; ------------------------------------------------------------------------- +;; Includes: +;; - FAMAX +;; - FAMIN +;; ------------------------------------------------------------------------- +;; Predicated floating-point absolute maximum and minimum. +(define_insn_and_rewrite "*aarch64_pred_faminmax_fused" + [(set (match_operand:SVE_FULL_F 0 "register_operand") + (unspec:SVE_FULL_F + [(match_operand: 1 "register_operand") + (match_operand:SI 4 "aarch64_sve_gp_strictness") + (unspec:SVE_FULL_F + [(match_operand 5) + (const_int SVE_RELAXED_GP) + (match_operand:SVE_FULL_F 2 "register_operand")] + UNSPEC_COND_FABS) + (unspec:SVE_FULL_F + [(match_operand 6) + (const_int SVE_RELAXED_GP) + (match_operand:SVE_FULL_F 3 "register_operand")] + UNSPEC_COND_FABS)] + SVE_COND_SMAXMIN))] + "TARGET_SVE_FAMINMAX" + {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ] + [ w , Upl , %0 , w ; * ] \t%0., %1/m, %0., %3. + [ ?&w , Upl , w , w ; yes ] movprfx\t%0, %2\;\t%0., %1/m, %0., %3. + } + "&& (!rtx_equal_p (operands[1], operands[5]) + || !rtx_equal_p (operands[1], operands[6]))" + { + operands[5] = copy_rtx (operands[1]); + operands[6] = copy_rtx (operands[1]); + } +) + ;; ========================================================================= ;; == Complex arithmetic ;; ========================================================================= diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index c06f8c2c90f..8b18682c341 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -3143,6 +3143,9 @@ UNSPEC_COND_FMIN UNSPEC_COND_FMINNM]) +(define_int_iterator SVE_COND_SMAXMIN [UNSPEC_COND_SMAX + UNSPEC_COND_SMIN]) + (define_int_iterator SVE_COND_FP_TERNARY [UNSPEC_COND_FMLA UNSPEC_COND_FMLS UNSPEC_COND_FNMLA @@ -4503,6 +4506,9 @@ (define_int_iterator FAMINMAX_UNS [UNSPEC_FAMAX UNSPEC_FAMIN]) +(define_int_attr faminmax_cond_uns_op + [(UNSPEC_COND_SMAX "famax") (UNSPEC_COND_SMIN "famin")]) + (define_int_attr faminmax_uns_op [(UNSPEC_FAMAX "famax") (UNSPEC_FAMIN "famin")]) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c new file mode 100644 index 00000000000..3b65ccea065 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O3 -ffast-math" } */ + +#include "arm_sve.h" + +#pragma GCC target "+sve+faminmax" + +#define TEST_FAMAX(TYPE) \ + void fn_famax_##TYPE (TYPE * restrict a, \ + TYPE * restrict b, \ + TYPE * restrict c, \ + int n) { \ + for (int i = 0; i < n; i++) { \ + TYPE temp1 = __builtin_fabs (a[i]); \ + TYPE temp2 = __builtin_fabs (b[i]); \ + c[i] = __builtin_fmax (temp1, temp2); \ + } \ + } \ + +#define TEST_FAMIN(TYPE) \ + void fn_famin_##TYPE (TYPE * restrict a, \ + TYPE * restrict b, \ + TYPE * restrict c, \ + int n) { \ + for (int i = 0; i < n; i++) { \ + TYPE temp1 = __builtin_fabs (a[i]); \ + TYPE temp2 = __builtin_fabs (b[i]); \ + c[i] = __builtin_fmin (temp1, temp2); \ + } \ + } \ + +TEST_FAMAX (float16_t) +TEST_FAMAX (float32_t) +TEST_FAMAX (float64_t) +TEST_FAMIN (float16_t) +TEST_FAMIN (float32_t) +TEST_FAMIN (float64_t) + +/* { dg-final { scan-assembler-times {\tfamax\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamax\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamax\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamin\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamin\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamin\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c new file mode 100644 index 00000000000..d80f6eca8f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c @@ -0,0 +1,60 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O3 -ffast-math" } */ + +#include "arm_sve.h" + +#pragma GCC target "+sve+faminmax" + +#define TEST_WITH_SVMAX(TYPE) \ + TYPE fn_fmax_##TYPE (TYPE x, TYPE y) { \ + svbool_t pg = svptrue_b8(); \ + return svmax_x(pg, svabs_x(pg, x), svabs_x(pg, y)); \ + } \ + +#define TEST_WITH_SVMAXNM(TYPE) \ + TYPE fn_fmaxnm_##TYPE (TYPE x, TYPE y) { \ + svbool_t pg = svptrue_b8(); \ + return svmaxnm_x(pg, svabs_x(pg, x), svabs_x(pg, y)); \ + } \ + +#define TEST_WITH_SVMIN(TYPE) \ + TYPE fn_fmin_##TYPE (TYPE x, TYPE y) { \ + svbool_t pg = svptrue_b8(); \ + return svmin_x(pg, svabs_x(pg, x), svabs_x(pg, y)); \ + } \ + +#define TEST_WITH_SVMINNM(TYPE) \ + TYPE fn_fminnm_##TYPE (TYPE x, TYPE y) { \ + svbool_t pg = svptrue_b8(); \ + return svminnm_x(pg, svabs_x(pg, x), svabs_x(pg, y)); \ + } \ + +TEST_WITH_SVMAX (svfloat16_t) +TEST_WITH_SVMAX (svfloat32_t) +TEST_WITH_SVMAX (svfloat64_t) + +TEST_WITH_SVMAXNM (svfloat16_t) +TEST_WITH_SVMAXNM (svfloat32_t) +TEST_WITH_SVMAXNM (svfloat64_t) + +TEST_WITH_SVMIN (svfloat16_t) +TEST_WITH_SVMIN (svfloat32_t) +TEST_WITH_SVMIN (svfloat64_t) + +TEST_WITH_SVMINNM (svfloat16_t) +TEST_WITH_SVMINNM (svfloat32_t) +TEST_WITH_SVMINNM (svfloat64_t) + +/* { dg-final { scan-assembler-not {\tfamax\t} } } */ +/* { dg-final { scan-assembler-not {\tfamin\t} } } */ + +/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h\n} 8 } } */ +/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 8 } } */ +/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 8 } } */ + +/* { dg-final { scan-assembler-times {\tfmax\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmax\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmax\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmin\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmin\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmin\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */