From patchwork Wed Oct 2 15:50:52 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saurabh Jha X-Patchwork-Id: 1992075 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=nwcXhAvk; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=nwcXhAvk; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XJfTc5Ty9z1xt1 for ; Thu, 3 Oct 2024 01:52:52 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 819A1385B532 for ; Wed, 2 Oct 2024 15:52:50 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR02-DB5-obe.outbound.protection.outlook.com (mail-db5eur02on20627.outbound.protection.outlook.com [IPv6:2a01:111:f403:2608::627]) by sourceware.org (Postfix) with ESMTPS id 37FFB3858D29 for ; Wed, 2 Oct 2024 15:51:25 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 37FFB3858D29 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 37FFB3858D29 Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=2a01:111:f403:2608::627 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1727884291; cv=pass; b=McOV94xVzPzC2zdGgp824H1G68RX7LcPIOKfJqBeg7sS7E8owxXA7tpuVpmbENmsQ8nT8tmiCMX9pfUHXAArFY+A4zw8jOcMvp7ZE4jO1TjlUxQw5RpLEQOGKBoKydYpWlMsO/6CBSpD1PYg8cCKxJbKLMEZzSOorPjMsAZ6GDI= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1727884291; c=relaxed/simple; bh=4vwbX1X4v6GiE+2I4CKYrtBMbwCAVgmFxpHg182P0a8=; h=DKIM-Signature:DKIM-Signature:From:To:Subject:Date:Message-ID: MIME-Version; b=ZE83fbfmN0fDhDX76NndXBxLg0ix8JJrfiEE3+m4wGUQlneJuB45p5jdGmH7hhV1BV+ZApEv0JfynqK3SgXzSO6leHkFMmVgQdVJYCziMORDLX4ScrXjzjt8JpMmkU35+WFxwmBFVpHM230luvGYs/FE44UndniTk7XYIa49ckU= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=pass; b=UqRf4SVWvYnKggw7gcQ5jql2eNTHMjUcxgv+UznB95Hbp0vG5+DcLzHkLTumxXM+yHM9TJR2DM3wLnluhWSv90y5zrsCVWkj27Ewsg+yrvE0vEd47piqnkWEdHjNTmryh7xajB7txoLZf8HM/OzQWeBIrNwWp+bQK2O1kPwpAjPO2Q2lw7YfTCirOAv88urFVyn3oQRD0bPEinHzWqQuyMDPKyQOsynbPrx1krQOG6QkHp1ieYNdf1GpwC33pl2hPh1Nfo+gZVfSxp4KDgLOgUWA2RwjdWMvZj5NztsVlIVv4PwqAo/EzA6/aC6E5FtK8BbX8UyI5+5PWCZ4FSgsQw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=N8Qh00xlhGngRC+puuEKweJ+6G4E3xlKy0mzzBSAP68=; b=amLZgxnBwghTRaoBU7xhrxCEfbPUm6bhwIjmX/ruO0b664kZEXNy0r04MoEg4d/sJB0iFWupLUZ+Q8A6t+CKtplLO6j/CDgqoe3XfGRp5OPMs3yxMjPWwu6xbzfs7pU7wbJNfD9NL4N4RNSB2UgZnaFZh8llNA+D78JVo4hZT2l5cjmcNgBPvlzyjwegPbtVDTNnmV9gHu8c8sCC86ygh7lqDYlv1owzHRKw7uP/1LAe47usrbzixS3e+4yArbhP2oGabjtnS9AxzDzfcQkBTou9cTCuomg/Z/hHXB1TiviXWxRt6cEX8EBUdRVzc9+GXvh8EwxCReiab2DBF5JOUQ== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=arm.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=N8Qh00xlhGngRC+puuEKweJ+6G4E3xlKy0mzzBSAP68=; b=nwcXhAvkn7Roahq1y4yJVLiDErAXqC3JG5kkJMzKfFsvcxiUWabwow713W2OZ3mRypYYfnce1clzZ2Cc+UDT4l0ZELjUNZgmhct/NXf9Qf9vwKT7/RdmwbzoJd8ufsrIxa0jpVbFnryXXA3K5VbWRsu9Wd1TmI0RbRAtXPiB/mM= Received: from AM0PR02CA0212.eurprd02.prod.outlook.com (2603:10a6:20b:28f::19) by VE1PR08MB5823.eurprd08.prod.outlook.com (2603:10a6:800:1a5::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7939.10; Wed, 2 Oct 2024 15:51:18 +0000 Received: from AM3PEPF0000A796.eurprd04.prod.outlook.com (2603:10a6:20b:28f:cafe::85) by AM0PR02CA0212.outlook.office365.com (2603:10a6:20b:28f::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.16 via Frontend Transport; Wed, 2 Oct 2024 15:51:18 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=arm.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by AM3PEPF0000A796.mail.protection.outlook.com (10.167.16.101) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8026.11 via Frontend Transport; Wed, 2 Oct 2024 15:51:17 +0000 Received: ("Tessian outbound 1cf41b4bd505:v473"); Wed, 02 Oct 2024 15:51:16 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 3e5b649de0fb9dd3 X-TessianGatewayMetadata: /sor5O6jzjTn+kCkcU1217qHbQNJCy/43PZ41HtE3KTOD73AcUI60vzXQ9hiRbSGkaZRBC5jgRboydab7Mli1EdzpL7yy4FFy+JGdPp1dcUdMpaiNwglsvihCIGAv3K/orQ+0Y0HF0RNYWdU0rMvIuqHpkZLH7wzyYUjrLmNjI8= X-CR-MTA-TID: 64aa7808 Received: from L35dc9cc34ffd.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id A784EBDD-A143-46CB-A4A4-365CFA377F28.1; Wed, 02 Oct 2024 15:51:10 +0000 Received: from EUR02-VI1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id L35dc9cc34ffd.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Wed, 02 Oct 2024 15:51:10 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=TtibBs/owpzkMFfCR0WaES+JPosoLkyMvN4+J2A5nZTUpGjFAXD7NE6AL2w4n1EYUir+11PG8I/Af741P22q1OHaap/PPv87xmlfY82DFA6EV+x2YpPZ87b3WXUbHoCV31vFiDEHS8sQ4Dq9YHYGQCxPFoExz3yzSUNLwSyCigvGkHe1ByPF07HmRdURhGAkSVadSg2NdmQC+qFljgOmeYn5HzMzVtNJVuh+/jb+CRgPJgVIsbG/Xp6aR2TzNsSN7myb5a7t60VSiXn6s1JdCBAZpIIPxgBm95om5SwCmZDxm2855boolW7i6a8wQROgFewzfEuvlk+agB1dNavpJQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=N8Qh00xlhGngRC+puuEKweJ+6G4E3xlKy0mzzBSAP68=; b=fs7CkB90H4VDADoXDC5WnEoedcnDRmNtisKK+gC9BUWhWPArlEUyYXNQN9z/EiMo7u/hGeCbiuEFh6VKWy32cLYM381l300iOY1P0Odbw402kQFXRLQmk/ZcMF41iPBHfl8/4FvkXXOoZfy2TvbtVFn4xSEq5tEKYYTgOpaqB2u1Bd/rmLpiXGZAe+ZlUjkguqXxazX4TT9sGFzTYMXS6e4U4Z9tf/yHFpOyniBLaY0scp71WfWITLnQBAn5U0rOQ5MbxWB3uEmwHLVC+jKjKYnA6J7T51vflpbVjgLegHGROG7u5/BxdEReOOBVzk+6KPb4EKseUw8qzDFAehI5uA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 40.67.248.234) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=N8Qh00xlhGngRC+puuEKweJ+6G4E3xlKy0mzzBSAP68=; b=nwcXhAvkn7Roahq1y4yJVLiDErAXqC3JG5kkJMzKfFsvcxiUWabwow713W2OZ3mRypYYfnce1clzZ2Cc+UDT4l0ZELjUNZgmhct/NXf9Qf9vwKT7/RdmwbzoJd8ufsrIxa0jpVbFnryXXA3K5VbWRsu9Wd1TmI0RbRAtXPiB/mM= Received: from DB6PR0301CA0094.eurprd03.prod.outlook.com (2603:10a6:6:30::41) by VI1PR08MB5328.eurprd08.prod.outlook.com (2603:10a6:803:13a::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.16; Wed, 2 Oct 2024 15:51:06 +0000 Received: from DU6PEPF0000B622.eurprd02.prod.outlook.com (2603:10a6:6:30:cafe::b4) by DB6PR0301CA0094.outlook.office365.com (2603:10a6:6:30::41) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.16 via Frontend Transport; Wed, 2 Oct 2024 15:51:06 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 40.67.248.234) smtp.mailfrom=arm.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 40.67.248.234 as permitted sender) receiver=protection.outlook.com; client-ip=40.67.248.234; helo=nebula.arm.com; pr=C Received: from nebula.arm.com (40.67.248.234) by DU6PEPF0000B622.mail.protection.outlook.com (10.167.8.139) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.8026.11 via Frontend Transport; Wed, 2 Oct 2024 15:51:06 +0000 Received: from AZ-NEU-EX02.Emea.Arm.com (10.251.26.5) by AZ-NEU-EX03.Arm.com (10.251.24.31) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Wed, 2 Oct 2024 15:51:05 +0000 Received: from AZ-NEU-EX03.Arm.com (10.251.24.31) by AZ-NEU-EX02.Emea.Arm.com (10.251.26.5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Wed, 2 Oct 2024 15:51:05 +0000 Received: from e130340.cambridge.arm.com (10.2.80.47) by mail.arm.com (10.251.24.31) with Microsoft SMTP Server id 15.1.2507.39 via Frontend Transport; Wed, 2 Oct 2024 15:51:05 +0000 From: To: CC: , , Saurabh Jha Subject: [PATCH v3 1/2] aarch64: Add SVE2 faminmax intrinsics Date: Wed, 2 Oct 2024 16:50:52 +0100 Message-ID: <20241002155053.1343957-2-saurabh.jha@arm.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20241002155053.1343957-1-saurabh.jha@arm.com> References: <20241002155053.1343957-1-saurabh.jha@arm.com> MIME-Version: 1.0 X-EOPAttributedMessage: 1 X-MS-TrafficTypeDiagnostic: DU6PEPF0000B622:EE_|VI1PR08MB5328:EE_|AM3PEPF0000A796:EE_|VE1PR08MB5823:EE_ X-MS-Office365-Filtering-Correlation-Id: ac075f75-bd6e-44c3-a91f-08dce2fa0d4d x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; ARA:13230040|376014|36860700013|1800799024|82310400026; X-Microsoft-Antispam-Message-Info-Original: 1C7e3DL2gBO9arCEHyOoT1bHYovBRawfBevivWeoE75ENCWM+JI7pEl4j/+Qe2sKz7/5q2J9DtBbE85eqJSgAdrcNf2xuD9ZQCVZaja20nNfURrerVoB81k2a0z/YByH5jg0wMYDMVC24nUo9sv6uWL2HcFXw/H3/d58hf3AwCRfisnhG44V+q2icF01n+2EDcOsx2EkAT6PjPdOTpoqAIQY8UA2NyHB+W1cdkfiVoPf0ZPxWXCqppUkO6jq11Fk9N2Ro9KrBrcyhq3BHQppoCvq8tZUmZNOu6zEW4+2Rrq3bIgU0WW9swHGxxJYUTaXdtZeGU1ef24N/sIwl2eEAnt2IPbqnZuCpRLmAOVzfWShGB4TPmhAlOqAbeKnMm1MkGk/5zmyEMvMq/Vy0rff9O8AWuZGYBz9zXj+5g/2boGYaM097jRIRVSZ7DqnZHOiLMrr206XVhPx87sGd217FNwDh+gQHnLBXSijJtdYOoHSOEunldB9SnR/LRqZGYkM6Y76a5cFtco2JMhUnG19JZN8NMCEqjLbh0qp78wKF9O0SkANA0XUkOW25lgaWjiVJvbZaZ0YgDZXqkXPwxRLWKYgFjp7DDWPIM5ng+pn+wrMBZTlRJPCFK64dF8jx26Ts+hm0TzwgFUgV7J3yFTfGhzPcIoqi8H2McmcNAvLQKYeet/rhVyFgCeiHZXukeUoWlaruyTkpkaA48gu00EsciBmjAXySVW7aelGzjUGM26+3lMLZskXRH9V6Afl99fEtrR3iC9Q+dWz+IRs6KEAm7SroE6SmxzbjG96KiR7jHsMKYeGsEJJShy9Tu0mgoOZEQ4y7IiAmNQ5K53elG/PG2++c6dipYceUhEE0f/mLEPnrFwef4KW8XBCFeuLsbeW6olK+bH41BsOwmtw1g2hgBoQ+MknnNdewWdAvz0nvQJVIwJQPOcLiz5ebbpYQtJ+eU5CU0A+4UyZZLH4F0lnX7xAu2eR5g/+uq9x0F/bLIJaAckHEEIr36jPh8Iqq8f7P/+R6irEw0RRjRHBcMqpr2C8QOKdyVuoJrpObCWScyRwTmndveBew0EDHqdR5v7hSPkeF9QlrV+aq7fLgUsQBbhYiPjVxl4PPEaOBHoAklvZlNq/aEYGJ1WvpJ9PU/FTaE9K2f/gxUi4pAOFjXvhrVr+Cf3KRwCQ516fYdP8Omq3uqPoIE7jBhZGBI4Li9nz448vxD1gji9BsXbXka+mUx4AMpQid9HZFkI/ARiqX5Ht3axNr1Awiu6Kk+SAqtPZIH0p6VgWGdFXZ9ZP/i6/2NPyPEMvX1XhW8MgSBdsjxViABP12Lm24pubcdP1FWSSEr1igP8On0wnQEwWBebM1Q== X-Forefront-Antispam-Report-Untrusted: CIP:40.67.248.234; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:nebula.arm.com; PTR:InfoDomainNonexistent; CAT:NONE; SFS:(13230040)(376014)(36860700013)(1800799024)(82310400026); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI1PR08MB5328 X-MS-Exchange-SkipListedInternetSender: ip=[2603:10a6:6:30::41]; domain=DB6PR0301CA0094.eurprd03.prod.outlook.com X-MS-Exchange-Transport-CrossTenantHeadersStripped: AM3PEPF0000A796.eurprd04.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: b0fc1ba0-d595-4c9b-9885-08dce2fa06f6 X-Microsoft-Antispam: BCL:0; ARA:13230040|82310400026|376014|36860700013|35042699022|1800799024; X-Microsoft-Antispam-Message-Info: =?utf-8?q?c9/+FjaLi/WB5G9lgGN/UEQLK3/Cp2z?= =?utf-8?q?214TsC15js/rBlrH0W62OqRJlFR+uv/8ZqmHtkMHepOq00v2Xj9NguT7jAtXzqG2h?= =?utf-8?q?AR5L7bxQxHvF7DMwnhs4PvuOfGOCsKjIl2gASUCQXPv4App+T4GvgshcelcXhXnOA?= =?utf-8?q?ddihnzdQvoAiIbAm5l/ec0mrV4msuqYSPNhaUMWW2xmE9DxLDm1KuqnByvvkGLX7P?= =?utf-8?q?4qAN2216fMd64OziEs4Y4OLcR2Q+H8BWSx4Q1Dy1H7SEZzudkPVFCYuV74jXbi0qH?= =?utf-8?q?N0nQ15zUXKpWsIoWmNY9H3rQxSxLAYCytBF9Df7N0LOLyZj/ddYKrWZai7iNUQj7Y?= =?utf-8?q?YTfr5vWu1bBqbYAmQBQWOrHyClodD/vEeJqGelU9LbLc1VxNL4UtOE2v58tNMYc7r?= =?utf-8?q?uo2F4bhmb1MEIucyMIyOrJnSJx7ebs4CCrVV1pRTaW3gWXm50BsCigE3GW40tZ2x9?= =?utf-8?q?8ovaELaLpCsC/VQo1h37WvGo/CI3otUdEgNkXcHrQrFOmgqI6ejIwSRdrsWvdy/AJ?= =?utf-8?q?AkwZhrQyzhdwDYftLR9hj2eKExOTSGGTvpj81TlhgvqBeeWUflDYCNPdzh8cQvW7V?= =?utf-8?q?bFGrIFpikuzgzCDI5O/JojDb/7r5O2I8hx0FhNFbqJ96fTvDPBzfzaWsvlanPpK01?= =?utf-8?q?+0CjoxcZHQhHx2G5RukvG3c920AeZJD9DN31LrSpzt0bk1jTgqQ5Tnp5dQc4lAFoH?= =?utf-8?q?iMvyazi+Sy9rk2w3d+1DUy6o16nHMTYG8t20XNc25O6AbzzT2s8TLROZeJNDDNe0U?= =?utf-8?q?LMpvcLkNtbn6AlwMsm3JJS6ClklhR2J9TORu0eFHD/Hz8+5K/8XVfbPOkdXwMLRDk?= =?utf-8?q?GIw0JVXPLCwTIL1ClSSi2FjV84gB1Yd8q6MBnzu48PlZj7nAozpiaRmnHDC0xU+hg?= =?utf-8?q?W5anlE2yVPRPsSoqCtzqOzh4i+P2GhbrrNrmpMSL5o5Op6IBHTSf/+zkTJ1fmYjLB?= =?utf-8?q?UpaVAnzPLDKqQVoIytefMTa/OxGh8UHPQe6buI/m54UIPZXnntqSwM0bUei4IHFER?= =?utf-8?q?wbtg7u25N2pf7Hip9fWwHO00V/5c6rGqfSSXKt0app3NbBvmWlCqdiCP6CEAxI/ax?= =?utf-8?q?1Mpo6szRE69LCQQ1vHMOoGyLzLkPvP++bQNSFcSH8bd7IoQvYNaEbaEcBjpWSLxL4?= =?utf-8?q?tks2haXfFHtL3XvDCiWm61w8Wo2+uvzHOixZE/ckMYB1Ck5J9NDsfnIZPjMp6rYwn?= =?utf-8?q?IFRkk6/Ep/lTu/S6uXUHLl4UFuUJC4inPo+4RGbLVOjqWSpHzuc5glkpHE+Vwzmhi?= =?utf-8?q?guESk9cgLCeISxnLh2sxYojD3GPKN9EqI8w=3D=3D?= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230040)(82310400026)(376014)(36860700013)(35042699022)(1800799024); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 02 Oct 2024 15:51:17.1412 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: ac075f75-bd6e-44c3-a91f-08dce2fa0d4d X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: AM3PEPF0000A796.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: VE1PR08MB5823 X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FORGED_SPF_HELO, GIT_PATCH_0, KAM_SHORT, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch introduces SVE2 faminmax intrinsics. The intrinsics of this extension are implemented as the following builtin functions: * sva[max|min]_[m|x|z] * sva[max|min]_[f16|f32|f64]_[m|x|z] * sva[max|min]_n_[f16|f32|f64]_[m|x|z] gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins-base.cc (svamax): Absolute maximum declaration. (svamin): Absolute minimum declaration. * config/aarch64/aarch64-sve-builtins-base.def (REQUIRED_EXTENSIONS): Add faminmax intrinsics behind a flag. (svamax): Absolute maximum declaration. (svamin): Absolute minimum declaration. * config/aarch64/aarch64-sve-builtins-base.h: Declaring function bases for the new intrinsics. * config/aarch64/aarch64.h (TARGET_SVE_FAMINMAX): New flag for SVE2 faminmax. * config/aarch64/iterators.md: New unspecs, iterators, and attrs for the new intrinsics. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve2/acle/asm/amax_f16.c: New test. * gcc.target/aarch64/sve2/acle/asm/amax_f32.c: New test. * gcc.target/aarch64/sve2/acle/asm/amax_f64.c: New test. * gcc.target/aarch64/sve2/acle/asm/amin_f16.c: New test. * gcc.target/aarch64/sve2/acle/asm/amin_f32.c: New test. * gcc.target/aarch64/sve2/acle/asm/amin_f64.c: New test. --- .../aarch64/aarch64-sve-builtins-base.cc | 4 + .../aarch64/aarch64-sve-builtins-base.def | 5 + .../aarch64/aarch64-sve-builtins-base.h | 2 + gcc/config/aarch64/aarch64.h | 1 + gcc/config/aarch64/iterators.md | 18 +- .../aarch64/sve2/acle/asm/amax_f16.c | 312 ++++++++++++++++++ .../aarch64/sve2/acle/asm/amax_f32.c | 312 ++++++++++++++++++ .../aarch64/sve2/acle/asm/amax_f64.c | 312 ++++++++++++++++++ .../aarch64/sve2/acle/asm/amin_f16.c | 311 +++++++++++++++++ .../aarch64/sve2/acle/asm/amin_f32.c | 312 ++++++++++++++++++ .../aarch64/sve2/acle/asm/amin_f64.c | 312 ++++++++++++++++++ 11 files changed, 1900 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc index 4b33585d981..b189818d643 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc @@ -3071,6 +3071,10 @@ FUNCTION (svadrb, svadr_bhwd_impl, (0)) FUNCTION (svadrd, svadr_bhwd_impl, (3)) FUNCTION (svadrh, svadr_bhwd_impl, (1)) FUNCTION (svadrw, svadr_bhwd_impl, (2)) +FUNCTION (svamax, cond_or_uncond_unspec_function, + (UNSPEC_COND_FAMAX, UNSPEC_FAMAX)) +FUNCTION (svamin, cond_or_uncond_unspec_function, + (UNSPEC_COND_FAMIN, UNSPEC_FAMIN)) FUNCTION (svand, rtx_code_function, (AND, AND)) FUNCTION (svandv, reduction, (UNSPEC_ANDV)) FUNCTION (svasr, rtx_code_function, (ASHIFTRT, ASHIFTRT)) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.def b/gcc/config/aarch64/aarch64-sve-builtins-base.def index 65fcba91586..95e04e4393d 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.def @@ -379,3 +379,8 @@ DEF_SVE_FUNCTION (svzip2q, binary, all_data, none) DEF_SVE_FUNCTION (svld1ro, load_replicate, all_data, implicit) DEF_SVE_FUNCTION (svmmla, mmla, d_float, none) #undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_FAMINMAX +DEF_SVE_FUNCTION (svamax, binary_opt_single_n, all_float, mxz) +DEF_SVE_FUNCTION (svamin, binary_opt_single_n, all_float, mxz) +#undef REQUIRED_EXTENSIONS diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.h b/gcc/config/aarch64/aarch64-sve-builtins-base.h index 5bbf3569c4b..978cf7013f9 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.h @@ -37,6 +37,8 @@ namespace aarch64_sve extern const function_base *const svadrd; extern const function_base *const svadrh; extern const function_base *const svadrw; + extern const function_base *const svamax; + extern const function_base *const svamin; extern const function_base *const svand; extern const function_base *const svandv; extern const function_base *const svasr; diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index ec8fde783b3..34f56a4b869 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -470,6 +470,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED /* Floating Point Absolute Maximum/Minimum extension instructions are enabled through +faminmax. */ #define TARGET_FAMINMAX AARCH64_HAVE_ISA (FAMINMAX) +#define TARGET_SVE_FAMINMAX (TARGET_SVE && TARGET_FAMINMAX) /* Prefer different predicate registers for the output of a predicated operation over re-using an existing input predicate. */ diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 0836dee61c9..c06f8c2c90f 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -841,6 +841,8 @@ UNSPEC_COND_CMPNE_WIDE ; Used in aarch64-sve.md. UNSPEC_COND_FABS ; Used in aarch64-sve.md. UNSPEC_COND_FADD ; Used in aarch64-sve.md. + UNSPEC_COND_FAMAX ; Used in aarch64-sve.md. + UNSPEC_COND_FAMIN ; Used in aarch64-sve.md. UNSPEC_COND_FCADD90 ; Used in aarch64-sve.md. UNSPEC_COND_FCADD270 ; Used in aarch64-sve.md. UNSPEC_COND_FCMEQ ; Used in aarch64-sve.md. @@ -3085,6 +3087,8 @@ (define_int_iterator SVE_COND_FP_BINARY [UNSPEC_COND_FADD + (UNSPEC_COND_FAMAX "TARGET_SVE_FAMINMAX") + (UNSPEC_COND_FAMIN "TARGET_SVE_FAMINMAX") UNSPEC_COND_FDIV UNSPEC_COND_FMAX UNSPEC_COND_FMAXNM @@ -3124,7 +3128,9 @@ UNSPEC_COND_SMIN]) (define_int_iterator SVE_COND_FP_BINARY_REG - [UNSPEC_COND_FDIV + [(UNSPEC_COND_FAMAX "TARGET_SVE_FAMINMAX") + (UNSPEC_COND_FAMIN "TARGET_SVE_FAMINMAX") + UNSPEC_COND_FDIV UNSPEC_COND_FMULX UNSPEC_COND_SMAX UNSPEC_COND_SMIN]) @@ -3701,6 +3707,8 @@ (UNSPEC_ZIP2Q "zip2q") (UNSPEC_COND_FABS "abs") (UNSPEC_COND_FADD "add") + (UNSPEC_COND_FAMAX "famax") + (UNSPEC_COND_FAMIN "famin") (UNSPEC_COND_FCADD90 "cadd90") (UNSPEC_COND_FCADD270 "cadd270") (UNSPEC_COND_FCMLA "fcmla") @@ -4237,6 +4245,8 @@ (UNSPEC_FTSSEL "ftssel") (UNSPEC_COND_FABS "fabs") (UNSPEC_COND_FADD "fadd") + (UNSPEC_COND_FAMAX "famax") + (UNSPEC_COND_FAMIN "famin") (UNSPEC_COND_FCVTLT "fcvtlt") (UNSPEC_COND_FCVTX "fcvtx") (UNSPEC_COND_FDIV "fdiv") @@ -4263,6 +4273,8 @@ (UNSPEC_COND_SMIN "fminnm")]) (define_int_attr sve_fp_op_rev [(UNSPEC_COND_FADD "fadd") + (UNSPEC_COND_FAMAX "famax") + (UNSPEC_COND_FAMIN "famin") (UNSPEC_COND_FDIV "fdivr") (UNSPEC_COND_FMAX "fmax") (UNSPEC_COND_FMAXNM "fmaxnm") @@ -4401,6 +4413,8 @@ ;; 3 pattern. (define_int_attr sve_pred_fp_rhs1_operand [(UNSPEC_COND_FADD "register_operand") + (UNSPEC_COND_FAMAX "register_operand") + (UNSPEC_COND_FAMIN "register_operand") (UNSPEC_COND_FDIV "register_operand") (UNSPEC_COND_FMAX "register_operand") (UNSPEC_COND_FMAXNM "register_operand") @@ -4416,6 +4430,8 @@ ;; 3 pattern. (define_int_attr sve_pred_fp_rhs2_operand [(UNSPEC_COND_FADD "aarch64_sve_float_arith_with_sub_operand") + (UNSPEC_COND_FAMAX "aarch64_sve_float_maxmin_operand") + (UNSPEC_COND_FAMIN "aarch64_sve_float_maxmin_operand") (UNSPEC_COND_FDIV "register_operand") (UNSPEC_COND_FMAX "aarch64_sve_float_maxmin_operand") (UNSPEC_COND_FMAXNM "aarch64_sve_float_maxmin_operand") diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c new file mode 100644 index 00000000000..de4a6f8efaa --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f16.c @@ -0,0 +1,312 @@ +/* { dg-do compile } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +#pragma GCC target "+sve+faminmax" + +/* +** amax_f16_m_tied1: +** famax z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amax_f16_m_tied1, svfloat16_t, + z0 = svamax_f16_m (p0, z0, z1), + z0 = svamax_m (p0, z0, z1)) + +/* +** amax_f16_m_tied2: +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z1 +** famax z0\.h, p0/m, z0\.h, \1\.h +** ret +*/ +TEST_UNIFORM_Z (amax_f16_m_tied2, svfloat16_t, + z0 = svamax_f16_m (p0, z1, z0), + z0 = svamax_m (p0, z1, z0)) + +/* +** amax_f16_m_untied: +** movprfx z0, z1 +** famax z0\.h, p0/m, z0\.h, z2\.h +** ret +*/ +TEST_UNIFORM_Z (amax_f16_m_untied, svfloat16_t, + z0 = svamax_f16_m (p0, z1, z2), + z0 = svamax_m (p0, z1, z2)) + +/* +** amax_h4_f16_m_tied1: +** mov (z[0-9]+\.h), h4 +** famax z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_h4_f16_m_tied1, svfloat16_t, __fp16, + z0 = svamax_n_f16_m (p0, z0, d4), + z0 = svamax_m (p0, z0, d4)) + +/* +** amax_h4_f16_m_untied: +** mov (z[0-9]+\.h), h4 +** movprfx z0, z1 +** famax z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_h4_f16_m_untied, svfloat16_t, __fp16, + z0 = svamax_n_f16_m (p0, z1, d4), + z0 = svamax_m (p0, z1, d4)) + +/* +** amax_2_f16_m: +** fmov (z[0-9]+\.h), #2\.0(?:e\+0)? +** famax z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f16_m, svfloat16_t, + z0 = svamax_n_f16_m (p0, z0, 2), + z0 = svamax_m (p0, z0, 2)) + +/* +** amax_f16_z_tied1: +** movprfx z0\.h, p0/z, z0\.h +** famax z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amax_f16_z_tied1, svfloat16_t, + z0 = svamax_f16_z (p0, z0, z1), + z0 = svamax_z (p0, z0, z1)) + +/* +** amax_f16_z_tied2: +** movprfx z0\.h, p0/z, z0\.h +** famax z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amax_f16_z_tied2, svfloat16_t, + z0 = svamax_f16_z (p0, z1, z0), + z0 = svamax_z (p0, z1, z0)) + +/* +** amax_f16_z_untied: +** ( +** movprfx z0\.h, p0/z, z1\.h +** famax z0\.h, p0/m, z0\.h, z2\.h +** | +** movprfx z0\.h, p0/z, z2\.h +** famax z0\.h, p0/m, z0\.h, z1\.h +** ) +** ret +*/ +TEST_UNIFORM_Z (amax_f16_z_untied, svfloat16_t, + z0 = svamax_f16_z (p0, z1, z2), + z0 = svamax_z (p0, z1, z2)) + +/* +** amax_h4_f16_z_tied1: +** mov (z[0-9]+\.h), h4 +** movprfx z0\.h, p0/z, z0\.h +** famax z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_h4_f16_z_tied1, svfloat16_t, __fp16, + z0 = svamax_n_f16_z (p0, z0, d4), + z0 = svamax_z (p0, z0, d4)) + +/* +** amax_h4_f16_z_untied: +** mov (z[0-9]+\.h), h4 +** ( +** movprfx z0\.h, p0/z, z1\.h +** famax z0\.h, p0/m, z0\.h, \1 +** | +** movprfx z0\.h, p0/z, \1 +** famax z0\.h, p0/m, z0\.h, z1\.h +** ) +** ret +*/ +TEST_UNIFORM_ZD (amax_h4_f16_z_untied, svfloat16_t, __fp16, + z0 = svamax_n_f16_z (p0, z1, d4), + z0 = svamax_z (p0, z1, d4)) + +/* +** amax_2_f16_z: +** fmov (z[0-9]+\.h), #2\.0(?:e\+0)? +** movprfx z0\.h, p0/z, z0\.h +** famax z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f16_z, svfloat16_t, + z0 = svamax_n_f16_z (p0, z0, 2), + z0 = svamax_z (p0, z0, 2)) + +/* +** amax_f16_x_tied1: +** famax z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amax_f16_x_tied1, svfloat16_t, + z0 = svamax_f16_x (p0, z0, z1), + z0 = svamax_x (p0, z0, z1)) + +/* +** amax_f16_x_tied2: +** famax z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amax_f16_x_tied2, svfloat16_t, + z0 = svamax_f16_x (p0, z1, z0), + z0 = svamax_x (p0, z1, z0)) + +/* +** amax_f16_x_untied: +** ( +** movprfx z0, z1 +** famax z0\.h, p0/m, z0\.h, z2\.h +** | +** movprfx z0, z2 +** famax z0\.h, p0/m, z0\.h, z1\.h +** ) +** ret +*/ +TEST_UNIFORM_Z (amax_f16_x_untied, svfloat16_t, + z0 = svamax_f16_x (p0, z1, z2), + z0 = svamax_x (p0, z1, z2)) + +/* +** amax_h4_f16_x_tied1: +** mov (z[0-9]+\.h), h4 +** famax z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_h4_f16_x_tied1, svfloat16_t, __fp16, + z0 = svamax_n_f16_x (p0, z0, d4), + z0 = svamax_x (p0, z0, d4)) + +/* +** amax_h4_f16_x_untied: +** mov z0\.h, h4 +** famax z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZD (amax_h4_f16_x_untied, svfloat16_t, __fp16, + z0 = svamax_n_f16_x (p0, z1, d4), + z0 = svamax_x (p0, z1, d4)) + +/* +** amax_2_f16_x_tied1: +** fmov (z[0-9]+\.h), #2\.0(?:e\+0)? +** famax z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f16_x_tied1, svfloat16_t, + z0 = svamax_n_f16_x (p0, z0, 2), + z0 = svamax_x (p0, z0, 2)) + +/* +** amax_2_f16_x_untied: +** fmov z0\.h, #2\.0(?:e\+0)? +** famax z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amax_2_f16_x_untied, svfloat16_t, + z0 = svamax_n_f16_x (p0, z1, 2), + z0 = svamax_x (p0, z1, 2)) + +/* +** ptrue_amax_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f16_x_tied1, svfloat16_t, + z0 = svamax_f16_x (svptrue_b16 (), z0, z1), + z0 = svamax_x (svptrue_b16 (), z0, z1)) + +/* +** ptrue_amax_f16_x_tied2: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f16_x_tied2, svfloat16_t, + z0 = svamax_f16_x (svptrue_b16 (), z1, z0), + z0 = svamax_x (svptrue_b16 (), z1, z0)) + +/* +** ptrue_amax_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f16_x_untied, svfloat16_t, + z0 = svamax_f16_x (svptrue_b16 (), z1, z2), + z0 = svamax_x (svptrue_b16 (), z1, z2)) + +/* +** ptrue_amax_0_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_0_f16_x_tied1, svfloat16_t, + z0 = svamax_n_f16_x (svptrue_b16 (), z0, 0), + z0 = svamax_x (svptrue_b16 (), z0, 0)) + +/* +** ptrue_amax_0_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_0_f16_x_untied, svfloat16_t, + z0 = svamax_n_f16_x (svptrue_b16 (), z1, 0), + z0 = svamax_x (svptrue_b16 (), z1, 0)) + +/* +** ptrue_amax_1_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_1_f16_x_tied1, svfloat16_t, + z0 = svamax_n_f16_x (svptrue_b16 (), z0, 1), + z0 = svamax_x (svptrue_b16 (), z0, 1)) + +/* +** ptrue_amax_1_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_1_f16_x_untied, svfloat16_t, + z0 = svamax_n_f16_x (svptrue_b16 (), z1, 1), + z0 = svamax_x (svptrue_b16 (), z1, 1)) + +/* +** ptrue_amax_2_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_2_f16_x_tied1, svfloat16_t, + z0 = svamax_n_f16_x (svptrue_b16 (), z0, 2), + z0 = svamax_x (svptrue_b16 (), z0, 2)) + +/* +** ptrue_amax_2_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_2_f16_x_untied, svfloat16_t, + z0 = svamax_n_f16_x (svptrue_b16 (), z1, 2), + z0 = svamax_x (svptrue_b16 (), z1, 2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c new file mode 100644 index 00000000000..24280724c95 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f32.c @@ -0,0 +1,312 @@ +/* { dg-do compile } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +#pragma GCC target "+sve+faminmax" + +/* +** amax_f32_m_tied1: +** famax z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amax_f32_m_tied1, svfloat32_t, + z0 = svamax_f32_m (p0, z0, z1), + z0 = svamax_m (p0, z0, z1)) + +/* +** amax_f32_m_tied2: +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z1 +** famax z0\.s, p0/m, z0\.s, \1\.s +** ret +*/ +TEST_UNIFORM_Z (amax_f32_m_tied2, svfloat32_t, + z0 = svamax_f32_m (p0, z1, z0), + z0 = svamax_m (p0, z1, z0)) + +/* +** amax_f32_m_untied: +** movprfx z0, z1 +** famax z0\.s, p0/m, z0\.s, z2\.s +** ret +*/ +TEST_UNIFORM_Z (amax_f32_m_untied, svfloat32_t, + z0 = svamax_f32_m (p0, z1, z2), + z0 = svamax_m (p0, z1, z2)) + +/* +** amax_s4_f32_m_tied1: +** mov (z[0-9]+\.s), s4 +** famax z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_s4_f32_m_tied1, svfloat32_t, float, + z0 = svamax_n_f32_m (p0, z0, d4), + z0 = svamax_m (p0, z0, d4)) + +/* +** amax_s4_f32_m_untied: +** mov (z[0-9]+\.s), s4 +** movprfx z0, z1 +** famax z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_s4_f32_m_untied, svfloat32_t, float, + z0 = svamax_n_f32_m (p0, z1, d4), + z0 = svamax_m (p0, z1, d4)) + +/* +** amax_2_f32_m: +** fmov (z[0-9]+\.s), #2\.0(?:e\+0)? +** famax z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f32_m, svfloat32_t, + z0 = svamax_n_f32_m (p0, z0, 2), + z0 = svamax_m (p0, z0, 2)) + +/* +** amax_f32_z_tied1: +** movprfx z0\.s, p0/z, z0\.s +** famax z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amax_f32_z_tied1, svfloat32_t, + z0 = svamax_f32_z (p0, z0, z1), + z0 = svamax_z (p0, z0, z1)) + +/* +** amax_f32_z_tied2: +** movprfx z0\.s, p0/z, z0\.s +** famax z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amax_f32_z_tied2, svfloat32_t, + z0 = svamax_f32_z (p0, z1, z0), + z0 = svamax_z (p0, z1, z0)) + +/* +** amax_f32_z_untied: +** ( +** movprfx z0\.s, p0/z, z1\.s +** famax z0\.s, p0/m, z0\.s, z2\.s +** | +** movprfx z0\.s, p0/z, z2\.s +** famax z0\.s, p0/m, z0\.s, z1\.s +** ) +** ret +*/ +TEST_UNIFORM_Z (amax_f32_z_untied, svfloat32_t, + z0 = svamax_f32_z (p0, z1, z2), + z0 = svamax_z (p0, z1, z2)) + +/* +** amax_s4_f32_z_tied1: +** mov (z[0-9]+\.s), s4 +** movprfx z0\.s, p0/z, z0\.s +** famax z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_s4_f32_z_tied1, svfloat32_t, float, + z0 = svamax_n_f32_z (p0, z0, d4), + z0 = svamax_z (p0, z0, d4)) + +/* +** amax_s4_f32_z_untied: +** mov (z[0-9]+\.s), s4 +** ( +** movprfx z0\.s, p0/z, z1\.s +** famax z0\.s, p0/m, z0\.s, \1 +** | +** movprfx z0\.s, p0/z, \1 +** famax z0\.s, p0/m, z0\.s, z1\.s +** ) +** ret +*/ +TEST_UNIFORM_ZD (amax_s4_f32_z_untied, svfloat32_t, float, + z0 = svamax_n_f32_z (p0, z1, d4), + z0 = svamax_z (p0, z1, d4)) + +/* +** amax_2_f32_z: +** fmov (z[0-9]+\.s), #2\.0(?:e\+0)? +** movprfx z0\.s, p0/z, z0\.s +** famax z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f32_z, svfloat32_t, + z0 = svamax_n_f32_z (p0, z0, 2), + z0 = svamax_z (p0, z0, 2)) + +/* +** amax_f32_x_tied1: +** famax z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amax_f32_x_tied1, svfloat32_t, + z0 = svamax_f32_x (p0, z0, z1), + z0 = svamax_x (p0, z0, z1)) + +/* +** amax_f32_x_tied2: +** famax z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amax_f32_x_tied2, svfloat32_t, + z0 = svamax_f32_x (p0, z1, z0), + z0 = svamax_x (p0, z1, z0)) + +/* +** amax_f32_x_untied: +** ( +** movprfx z0, z1 +** famax z0\.s, p0/m, z0\.s, z2\.s +** | +** movprfx z0, z2 +** famax z0\.s, p0/m, z0\.s, z1\.s +** ) +** ret +*/ +TEST_UNIFORM_Z (amax_f32_x_untied, svfloat32_t, + z0 = svamax_f32_x (p0, z1, z2), + z0 = svamax_x (p0, z1, z2)) + +/* +** amax_s4_f32_x_tied1: +** mov (z[0-9]+\.s), s4 +** famax z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_s4_f32_x_tied1, svfloat32_t, float, + z0 = svamax_n_f32_x (p0, z0, d4), + z0 = svamax_x (p0, z0, d4)) + +/* +** amax_s4_f32_x_untied: +** mov z0\.s, s4 +** famax z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_ZD (amax_s4_f32_x_untied, svfloat32_t, float, + z0 = svamax_n_f32_x (p0, z1, d4), + z0 = svamax_x (p0, z1, d4)) + +/* +** amax_2_f32_x_tied1: +** fmov (z[0-9]+\.s), #2\.0(?:e\+0)? +** famax z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f32_x_tied1, svfloat32_t, + z0 = svamax_n_f32_x (p0, z0, 2), + z0 = svamax_x (p0, z0, 2)) + +/* +** amax_2_f32_x_untied: +** fmov z0\.s, #2\.0(?:e\+0)? +** famax z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amax_2_f32_x_untied, svfloat32_t, + z0 = svamax_n_f32_x (p0, z1, 2), + z0 = svamax_x (p0, z1, 2)) + +/* +** ptrue_amax_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f32_x_tied1, svfloat32_t, + z0 = svamax_f32_x (svptrue_b32 (), z0, z1), + z0 = svamax_x (svptrue_b32 (), z0, z1)) + +/* +** ptrue_amax_f32_x_tied2: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f32_x_tied2, svfloat32_t, + z0 = svamax_f32_x (svptrue_b32 (), z1, z0), + z0 = svamax_x (svptrue_b32 (), z1, z0)) + +/* +** ptrue_amax_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f32_x_untied, svfloat32_t, + z0 = svamax_f32_x (svptrue_b32 (), z1, z2), + z0 = svamax_x (svptrue_b32 (), z1, z2)) + +/* +** ptrue_amax_0_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_0_f32_x_tied1, svfloat32_t, + z0 = svamax_n_f32_x (svptrue_b32 (), z0, 0), + z0 = svamax_x (svptrue_b32 (), z0, 0)) + +/* +** ptrue_amax_0_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_0_f32_x_untied, svfloat32_t, + z0 = svamax_n_f32_x (svptrue_b32 (), z1, 0), + z0 = svamax_x (svptrue_b32 (), z1, 0)) + +/* +** ptrue_amax_1_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_1_f32_x_tied1, svfloat32_t, + z0 = svamax_n_f32_x (svptrue_b32 (), z0, 1), + z0 = svamax_x (svptrue_b32 (), z0, 1)) + +/* +** ptrue_amax_1_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_1_f32_x_untied, svfloat32_t, + z0 = svamax_n_f32_x (svptrue_b32 (), z1, 1), + z0 = svamax_x (svptrue_b32 (), z1, 1)) + +/* +** ptrue_amax_2_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_2_f32_x_tied1, svfloat32_t, + z0 = svamax_n_f32_x (svptrue_b32 (), z0, 2), + z0 = svamax_x (svptrue_b32 (), z0, 2)) + +/* +** ptrue_amax_2_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_2_f32_x_untied, svfloat32_t, + z0 = svamax_n_f32_x (svptrue_b32 (), z1, 2), + z0 = svamax_x (svptrue_b32 (), z1, 2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c new file mode 100644 index 00000000000..5b73db45d8b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amax_f64.c @@ -0,0 +1,312 @@ +/* { dg-do compile } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +#pragma GCC target "+sve+faminmax" + +/* +** amax_f64_m_tied1: +** famax z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amax_f64_m_tied1, svfloat64_t, + z0 = svamax_f64_m (p0, z0, z1), + z0 = svamax_m (p0, z0, z1)) + +/* +** amax_f64_m_tied2: +** mov (z[0-9]+\.d), z0\.d +** movprfx z0, z1 +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_f64_m_tied2, svfloat64_t, + z0 = svamax_f64_m (p0, z1, z0), + z0 = svamax_m (p0, z1, z0)) + +/* +** amax_f64_m_untied: +** movprfx z0, z1 +** famax z0\.d, p0/m, z0\.d, z2\.d +** ret +*/ +TEST_UNIFORM_Z (amax_f64_m_untied, svfloat64_t, + z0 = svamax_f64_m (p0, z1, z2), + z0 = svamax_m (p0, z1, z2)) + +/* +** amax_d4_f64_m_tied1: +** mov (z[0-9]+\.d), d4 +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_d4_f64_m_tied1, svfloat64_t, double, + z0 = svamax_n_f64_m (p0, z0, d4), + z0 = svamax_m (p0, z0, d4)) + +/* +** amax_d4_f64_m_untied: +** mov (z[0-9]+\.d), d4 +** movprfx z0, z1 +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_d4_f64_m_untied, svfloat64_t, double, + z0 = svamax_n_f64_m (p0, z1, d4), + z0 = svamax_m (p0, z1, d4)) + +/* +** amax_2_f64_m: +** fmov (z[0-9]+\.d), #2\.0(?:e\+0)? +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f64_m, svfloat64_t, + z0 = svamax_n_f64_m (p0, z0, 2), + z0 = svamax_m (p0, z0, 2)) + +/* +** amax_f64_z_tied1: +** movprfx z0\.d, p0/z, z0\.d +** famax z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amax_f64_z_tied1, svfloat64_t, + z0 = svamax_f64_z (p0, z0, z1), + z0 = svamax_z (p0, z0, z1)) + +/* +** amax_f64_z_tied2: +** movprfx z0\.d, p0/z, z0\.d +** famax z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amax_f64_z_tied2, svfloat64_t, + z0 = svamax_f64_z (p0, z1, z0), + z0 = svamax_z (p0, z1, z0)) + +/* +** amax_f64_z_untied: +** ( +** movprfx z0\.d, p0/z, z1\.d +** famax z0\.d, p0/m, z0\.d, z2\.d +** | +** movprfx z0\.d, p0/z, z2\.d +** famax z0\.d, p0/m, z0\.d, z1\.d +** ) +** ret +*/ +TEST_UNIFORM_Z (amax_f64_z_untied, svfloat64_t, + z0 = svamax_f64_z (p0, z1, z2), + z0 = svamax_z (p0, z1, z2)) + +/* +** amax_d4_f64_z_tied1: +** mov (z[0-9]+\.d), d4 +** movprfx z0\.d, p0/z, z0\.d +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_d4_f64_z_tied1, svfloat64_t, double, + z0 = svamax_n_f64_z (p0, z0, d4), + z0 = svamax_z (p0, z0, d4)) + +/* +** amax_d4_f64_z_untied: +** mov (z[0-9]+\.d), d4 +** ( +** movprfx z0\.d, p0/z, z1\.d +** famax z0\.d, p0/m, z0\.d, \1 +** | +** movprfx z0\.d, p0/z, \1 +** famax z0\.d, p0/m, z0\.d, z1\.d +** ) +** ret +*/ +TEST_UNIFORM_ZD (amax_d4_f64_z_untied, svfloat64_t, double, + z0 = svamax_n_f64_z (p0, z1, d4), + z0 = svamax_z (p0, z1, d4)) + +/* +** amax_2_f64_z: +** fmov (z[0-9]+\.d), #2\.0(?:e\+0)? +** movprfx z0\.d, p0/z, z0\.d +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f64_z, svfloat64_t, + z0 = svamax_n_f64_z (p0, z0, 2), + z0 = svamax_z (p0, z0, 2)) + +/* +** amax_f64_x_tied1: +** famax z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amax_f64_x_tied1, svfloat64_t, + z0 = svamax_f64_x (p0, z0, z1), + z0 = svamax_x (p0, z0, z1)) + +/* +** amax_f64_x_tied2: +** famax z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amax_f64_x_tied2, svfloat64_t, + z0 = svamax_f64_x (p0, z1, z0), + z0 = svamax_x (p0, z1, z0)) + +/* +** amax_f64_x_untied: +** ( +** movprfx z0, z1 +** famax z0\.d, p0/m, z0\.d, z2\.d +** | +** movprfx z0, z2 +** famax z0\.d, p0/m, z0\.d, z1\.d +** ) +** ret +*/ +TEST_UNIFORM_Z (amax_f64_x_untied, svfloat64_t, + z0 = svamax_f64_x (p0, z1, z2), + z0 = svamax_x (p0, z1, z2)) + +/* +** amax_d4_f64_x_tied1: +** mov (z[0-9]+\.d), d4 +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amax_d4_f64_x_tied1, svfloat64_t, double, + z0 = svamax_n_f64_x (p0, z0, d4), + z0 = svamax_x (p0, z0, d4)) + +/* +** amax_d4_f64_x_untied: +** mov z0\.d, d4 +** famax z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_ZD (amax_d4_f64_x_untied, svfloat64_t, double, + z0 = svamax_n_f64_x (p0, z1, d4), + z0 = svamax_x (p0, z1, d4)) + +/* +** amax_2_f64_x_tied1: +** fmov (z[0-9]+\.d), #2\.0(?:e\+0)? +** famax z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amax_2_f64_x_tied1, svfloat64_t, + z0 = svamax_n_f64_x (p0, z0, 2), + z0 = svamax_x (p0, z0, 2)) + +/* +** amax_2_f64_x_untied: +** fmov z0\.d, #2\.0(?:e\+0)? +** famax z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amax_2_f64_x_untied, svfloat64_t, + z0 = svamax_n_f64_x (p0, z1, 2), + z0 = svamax_x (p0, z1, 2)) + +/* +** ptrue_amax_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f64_x_tied1, svfloat64_t, + z0 = svamax_f64_x (svptrue_b64 (), z0, z1), + z0 = svamax_x (svptrue_b64 (), z0, z1)) + +/* +** ptrue_amax_f64_x_tied2: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f64_x_tied2, svfloat64_t, + z0 = svamax_f64_x (svptrue_b64 (), z1, z0), + z0 = svamax_x (svptrue_b64 (), z1, z0)) + +/* +** ptrue_amax_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_f64_x_untied, svfloat64_t, + z0 = svamax_f64_x (svptrue_b64 (), z1, z2), + z0 = svamax_x (svptrue_b64 (), z1, z2)) + +/* +** ptrue_amax_0_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_0_f64_x_tied1, svfloat64_t, + z0 = svamax_n_f64_x (svptrue_b64 (), z0, 0), + z0 = svamax_x (svptrue_b64 (), z0, 0)) + +/* +** ptrue_amax_0_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_0_f64_x_untied, svfloat64_t, + z0 = svamax_n_f64_x (svptrue_b64 (), z1, 0), + z0 = svamax_x (svptrue_b64 (), z1, 0)) + +/* +** ptrue_amax_1_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_1_f64_x_tied1, svfloat64_t, + z0 = svamax_n_f64_x (svptrue_b64 (), z0, 1), + z0 = svamax_x (svptrue_b64 (), z0, 1)) + +/* +** ptrue_amax_1_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_1_f64_x_untied, svfloat64_t, + z0 = svamax_n_f64_x (svptrue_b64 (), z1, 1), + z0 = svamax_x (svptrue_b64 (), z1, 1)) + +/* +** ptrue_amax_2_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_2_f64_x_tied1, svfloat64_t, + z0 = svamax_n_f64_x (svptrue_b64 (), z0, 2), + z0 = svamax_x (svptrue_b64 (), z0, 2)) + +/* +** ptrue_amax_2_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amax_2_f64_x_untied, svfloat64_t, + z0 = svamax_n_f64_x (svptrue_b64 (), z1, 2), + z0 = svamax_x (svptrue_b64 (), z1, 2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c new file mode 100644 index 00000000000..bb3f20db93d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f16.c @@ -0,0 +1,311 @@ +/* { dg-do compile } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +#pragma GCC target "+sve+faminmax" + +/* +** amin_f16_m_tied1: +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_m_tied1, svfloat16_t, + z0 = svamin_f16_m (p0, z0, z1), + z0 = svamin_m (p0, z0, z1)) + +/* +** amin_f16_m_tied2: +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z1 +** famin z0\.h, p0/m, z0\.h, \1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_m_tied2, svfloat16_t, + z0 = svamin_f16_m (p0, z1, z0), + z0 = svamin_m (p0, z1, z0)) + +/* +** amin_f16_m_untied: +** movprfx z0, z1 +** famin z0\.h, p0/m, z0\.h, z2\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_m_untied, svfloat16_t, + z0 = svamin_f16_m (p0, z1, z2), + z0 = svamin_m (p0, z1, z2)) + +/* +** amin_h4_f16_m_tied1: +** mov (z[0-9]+\.h), h4 +** famin z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_h4_f16_m_tied1, svfloat16_t, __fp16, + z0 = svamin_n_f16_m (p0, z0, d4), + z0 = svamin_m (p0, z0, d4)) + +/* +** amin_h4_f16_m_untied: +** mov (z[0-9]+\.h), h4 +** movprfx z0, z1 +** famin z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_h4_f16_m_untied, svfloat16_t, __fp16, + z0 = svamin_n_f16_m (p0, z1, d4), + z0 = svamin_m (p0, z1, d4)) + +/* +** amin_2_f16_m: +** fmov (z[0-9]+\.h), #2\.0(?:e\+0)? +** famin z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f16_m, svfloat16_t, + z0 = svamin_n_f16_m (p0, z0, 2), + z0 = svamin_m (p0, z0, 2)) + +/* +** amin_f16_z_tied1: +** movprfx z0\.h, p0/z, z0\.h +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_z_tied1, svfloat16_t, + z0 = svamin_f16_z (p0, z0, z1), + z0 = svamin_z (p0, z0, z1)) + +/* +** amin_f16_z_tied2: +** movprfx z0\.h, p0/z, z0\.h +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_z_tied2, svfloat16_t, + z0 = svamin_f16_z (p0, z1, z0), + z0 = svamin_z (p0, z1, z0)) + +/* +** amin_f16_z_untied: +** ( +** movprfx z0\.h, p0/z, z1\.h +** famin z0\.h, p0/m, z0\.h, z2\.h +** | +** movprfx z0\.h, p0/z, z2\.h +** famin z0\.h, p0/m, z0\.h, z1\.h +** ) +** ret +*/ +TEST_UNIFORM_Z (amin_f16_z_untied, svfloat16_t, + z0 = svamin_f16_z (p0, z1, z2), + z0 = svamin_z (p0, z1, z2)) + +/* +** amin_h4_f16_z_tied1: +** mov (z[0-9]+\.h), h4 +** movprfx z0\.h, p0/z, z0\.h +** famin z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_h4_f16_z_tied1, svfloat16_t, __fp16, + z0 = svamin_n_f16_z (p0, z0, d4), + z0 = svamin_z (p0, z0, d4)) + +/* +** amin_h4_f16_z_untied: +** mov (z[0-9]+\.h), h4 +** ( +** movprfx z0\.h, p0/z, z1\.h +** famin z0\.h, p0/m, z0\.h, \1 +** | +** movprfx z0\.h, p0/z, \1 +** famin z0\.h, p0/m, z0\.h, z1\.h +** ) +** ret +*/ +TEST_UNIFORM_ZD (amin_h4_f16_z_untied, svfloat16_t, __fp16, + z0 = svamin_n_f16_z (p0, z1, d4), + z0 = svamin_z (p0, z1, d4)) + +/* +** amin_2_f16_z: +** fmov (z[0-9]+\.h), #2\.0(?:e\+0)? +** movprfx z0\.h, p0/z, z0\.h +** famin z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f16_z, svfloat16_t, + z0 = svamin_n_f16_z (p0, z0, 2), + z0 = svamin_z (p0, z0, 2)) + +/* +** amin_f16_x_tied1: +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_x_tied1, svfloat16_t, + z0 = svamin_f16_x (p0, z0, z1), + z0 = svamin_x (p0, z0, z1)) + +/* +** amin_f16_x_tied2: +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_f16_x_tied2, svfloat16_t, + z0 = svamin_f16_x (p0, z1, z0), + z0 = svamin_x (p0, z1, z0)) + +/* +** amin_f16_x_untied: +** ( +** movprfx z0, z1 +** famin z0\.h, p0/m, z0\.h, z2\.h +** | +** movprfx z0, z2 +** famin z0\.h, p0/m, z0\.h, z1\.h +** ) +** ret +*/ +TEST_UNIFORM_Z (amin_f16_x_untied, svfloat16_t, + z0 = svamin_f16_x (p0, z1, z2), + z0 = svamin_x (p0, z1, z2)) + +/* +** amin_h4_f16_x_tied1: +** mov (z[0-9]+\.h), h4 +** famin z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_h4_f16_x_tied1, svfloat16_t, __fp16, + z0 = svamin_n_f16_x (p0, z0, d4), + z0 = svamin_x (p0, z0, d4)) + +/* +** amin_h4_f16_x_untied: +** mov z0\.h, h4 +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZD (amin_h4_f16_x_untied, svfloat16_t, __fp16, + z0 = svamin_n_f16_x (p0, z1, d4), + z0 = svamin_x (p0, z1, d4)) +/* +** amin_2_f16_x_tied1: +** fmov (z[0-9]+\.h), #2\.0(?:e\+0)? +** famin z0\.h, p0/m, z0\.h, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f16_x_tied1, svfloat16_t, + z0 = svamin_n_f16_x (p0, z0, 2), + z0 = svamin_x (p0, z0, 2)) + +/* +** amin_2_f16_x_untied: +** fmov z0\.h, #2\.0(?:e\+0)? +** famin z0\.h, p0/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_Z (amin_2_f16_x_untied, svfloat16_t, + z0 = svamin_n_f16_x (p0, z1, 2), + z0 = svamin_x (p0, z1, 2)) + +/* +** ptrue_amin_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f16_x_tied1, svfloat16_t, + z0 = svamin_f16_x (svptrue_b16 (), z0, z1), + z0 = svamin_x (svptrue_b16 (), z0, z1)) + +/* +** ptrue_amin_f16_x_tied2: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f16_x_tied2, svfloat16_t, + z0 = svamin_f16_x (svptrue_b16 (), z1, z0), + z0 = svamin_x (svptrue_b16 (), z1, z0)) + +/* +** ptrue_amin_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f16_x_untied, svfloat16_t, + z0 = svamin_f16_x (svptrue_b16 (), z1, z2), + z0 = svamin_x (svptrue_b16 (), z1, z2)) + +/* +** ptrue_amin_0_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_0_f16_x_tied1, svfloat16_t, + z0 = svamin_n_f16_x (svptrue_b16 (), z0, 0), + z0 = svamin_x (svptrue_b16 (), z0, 0)) + +/* +** ptrue_amin_0_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_0_f16_x_untied, svfloat16_t, + z0 = svamin_n_f16_x (svptrue_b16 (), z1, 0), + z0 = svamin_x (svptrue_b16 (), z1, 0)) + +/* +** ptrue_amin_1_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_1_f16_x_tied1, svfloat16_t, + z0 = svamin_n_f16_x (svptrue_b16 (), z0, 1), + z0 = svamin_x (svptrue_b16 (), z0, 1)) + +/* +** ptrue_amin_1_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_1_f16_x_untied, svfloat16_t, + z0 = svamin_n_f16_x (svptrue_b16 (), z1, 1), + z0 = svamin_x (svptrue_b16 (), z1, 1)) + +/* +** ptrue_amin_2_f16_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_2_f16_x_tied1, svfloat16_t, + z0 = svamin_n_f16_x (svptrue_b16 (), z0, 2), + z0 = svamin_x (svptrue_b16 (), z0, 2)) + +/* +** ptrue_amin_2_f16_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_2_f16_x_untied, svfloat16_t, + z0 = svamin_n_f16_x (svptrue_b16 (), z1, 2), + z0 = svamin_x (svptrue_b16 (), z1, 2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c new file mode 100644 index 00000000000..704f5d62c59 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f32.c @@ -0,0 +1,312 @@ +/* { dg-do compile } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +#pragma GCC target "+sve+faminmax" + +/* +** amin_f32_m_tied1: +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_m_tied1, svfloat32_t, + z0 = svamin_f32_m (p0, z0, z1), + z0 = svamin_m (p0, z0, z1)) + +/* +** amin_f32_m_tied2: +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z1 +** famin z0\.s, p0/m, z0\.s, \1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_m_tied2, svfloat32_t, + z0 = svamin_f32_m (p0, z1, z0), + z0 = svamin_m (p0, z1, z0)) + +/* +** amin_f32_m_untied: +** movprfx z0, z1 +** famin z0\.s, p0/m, z0\.s, z2\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_m_untied, svfloat32_t, + z0 = svamin_f32_m (p0, z1, z2), + z0 = svamin_m (p0, z1, z2)) + +/* +** amin_s4_f32_m_tied1: +** mov (z[0-9]+\.s), s4 +** famin z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_s4_f32_m_tied1, svfloat32_t, float, + z0 = svamin_n_f32_m (p0, z0, d4), + z0 = svamin_m (p0, z0, d4)) + +/* +** amin_s4_f32_m_untied: +** mov (z[0-9]+\.s), s4 +** movprfx z0, z1 +** famin z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_s4_f32_m_untied, svfloat32_t, float, + z0 = svamin_n_f32_m (p0, z1, d4), + z0 = svamin_m (p0, z1, d4)) + +/* +** amin_2_f32_m: +** fmov (z[0-9]+\.s), #2\.0(?:e\+0)? +** famin z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f32_m, svfloat32_t, + z0 = svamin_n_f32_m (p0, z0, 2), + z0 = svamin_m (p0, z0, 2)) + +/* +** amin_f32_z_tied1: +** movprfx z0\.s, p0/z, z0\.s +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_z_tied1, svfloat32_t, + z0 = svamin_f32_z (p0, z0, z1), + z0 = svamin_z (p0, z0, z1)) + +/* +** amin_f32_z_tied2: +** movprfx z0\.s, p0/z, z0\.s +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_z_tied2, svfloat32_t, + z0 = svamin_f32_z (p0, z1, z0), + z0 = svamin_z (p0, z1, z0)) + +/* +** amin_f32_z_untied: +** ( +** movprfx z0\.s, p0/z, z1\.s +** famin z0\.s, p0/m, z0\.s, z2\.s +** | +** movprfx z0\.s, p0/z, z2\.s +** famin z0\.s, p0/m, z0\.s, z1\.s +** ) +** ret +*/ +TEST_UNIFORM_Z (amin_f32_z_untied, svfloat32_t, + z0 = svamin_f32_z (p0, z1, z2), + z0 = svamin_z (p0, z1, z2)) + +/* +** amin_s4_f32_z_tied1: +** mov (z[0-9]+\.s), s4 +** movprfx z0\.s, p0/z, z0\.s +** famin z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_s4_f32_z_tied1, svfloat32_t, float, + z0 = svamin_n_f32_z (p0, z0, d4), + z0 = svamin_z (p0, z0, d4)) + +/* +** amin_s4_f32_z_untied: +** mov (z[0-9]+\.s), s4 +** ( +** movprfx z0\.s, p0/z, z1\.s +** famin z0\.s, p0/m, z0\.s, \1 +** | +** movprfx z0\.s, p0/z, \1 +** famin z0\.s, p0/m, z0\.s, z1\.s +** ) +** ret +*/ +TEST_UNIFORM_ZD (amin_s4_f32_z_untied, svfloat32_t, float, + z0 = svamin_n_f32_z (p0, z1, d4), + z0 = svamin_z (p0, z1, d4)) + +/* +** amin_2_f32_z: +** fmov (z[0-9]+\.s), #2\.0(?:e\+0)? +** movprfx z0\.s, p0/z, z0\.s +** famin z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f32_z, svfloat32_t, + z0 = svamin_n_f32_z (p0, z0, 2), + z0 = svamin_z (p0, z0, 2)) + +/* +** amin_f32_x_tied1: +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_x_tied1, svfloat32_t, + z0 = svamin_f32_x (p0, z0, z1), + z0 = svamin_x (p0, z0, z1)) + +/* +** amin_f32_x_tied2: +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_f32_x_tied2, svfloat32_t, + z0 = svamin_f32_x (p0, z1, z0), + z0 = svamin_x (p0, z1, z0)) + +/* +** amin_f32_x_untied: +** ( +** movprfx z0, z1 +** famin z0\.s, p0/m, z0\.s, z2\.s +** | +** movprfx z0, z2 +** famin z0\.s, p0/m, z0\.s, z1\.s +** ) +** ret +*/ +TEST_UNIFORM_Z (amin_f32_x_untied, svfloat32_t, + z0 = svamin_f32_x (p0, z1, z2), + z0 = svamin_x (p0, z1, z2)) + +/* +** amin_s4_f32_x_tied1: +** mov (z[0-9]+\.s), s4 +** famin z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_s4_f32_x_tied1, svfloat32_t, float, + z0 = svamin_n_f32_x (p0, z0, d4), + z0 = svamin_x (p0, z0, d4)) + +/* +** amin_s4_f32_x_untied: +** mov z0\.s, s4 +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_ZD (amin_s4_f32_x_untied, svfloat32_t, float, + z0 = svamin_n_f32_x (p0, z1, d4), + z0 = svamin_x (p0, z1, d4)) + +/* +** amin_2_f32_x_tied1: +** fmov (z[0-9]+\.s), #2\.0(?:e\+0)? +** famin z0\.s, p0/m, z0\.s, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f32_x_tied1, svfloat32_t, + z0 = svamin_n_f32_x (p0, z0, 2), + z0 = svamin_x (p0, z0, 2)) + +/* +** amin_2_f32_x_untied: +** fmov z0\.s, #2\.0(?:e\+0)? +** famin z0\.s, p0/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_Z (amin_2_f32_x_untied, svfloat32_t, + z0 = svamin_n_f32_x (p0, z1, 2), + z0 = svamin_x (p0, z1, 2)) + +/* +** ptrue_amin_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f32_x_tied1, svfloat32_t, + z0 = svamin_f32_x (svptrue_b32 (), z0, z1), + z0 = svamin_x (svptrue_b32 (), z0, z1)) + +/* +** ptrue_amin_f32_x_tied2: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f32_x_tied2, svfloat32_t, + z0 = svamin_f32_x (svptrue_b32 (), z1, z0), + z0 = svamin_x (svptrue_b32 (), z1, z0)) + +/* +** ptrue_amin_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f32_x_untied, svfloat32_t, + z0 = svamin_f32_x (svptrue_b32 (), z1, z2), + z0 = svamin_x (svptrue_b32 (), z1, z2)) + +/* +** ptrue_amin_0_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_0_f32_x_tied1, svfloat32_t, + z0 = svamin_n_f32_x (svptrue_b32 (), z0, 0), + z0 = svamin_x (svptrue_b32 (), z0, 0)) + +/* +** ptrue_amin_0_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_0_f32_x_untied, svfloat32_t, + z0 = svamin_n_f32_x (svptrue_b32 (), z1, 0), + z0 = svamin_x (svptrue_b32 (), z1, 0)) + +/* +** ptrue_amin_1_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_1_f32_x_tied1, svfloat32_t, + z0 = svamin_n_f32_x (svptrue_b32 (), z0, 1), + z0 = svamin_x (svptrue_b32 (), z0, 1)) + +/* +** ptrue_amin_1_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_1_f32_x_untied, svfloat32_t, + z0 = svamin_n_f32_x (svptrue_b32 (), z1, 1), + z0 = svamin_x (svptrue_b32 (), z1, 1)) + +/* +** ptrue_amin_2_f32_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_2_f32_x_tied1, svfloat32_t, + z0 = svamin_n_f32_x (svptrue_b32 (), z0, 2), + z0 = svamin_x (svptrue_b32 (), z0, 2)) + +/* +** ptrue_amin_2_f32_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_2_f32_x_untied, svfloat32_t, + z0 = svamin_n_f32_x (svptrue_b32 (), z1, 2), + z0 = svamin_x (svptrue_b32 (), z1, 2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c new file mode 100644 index 00000000000..d2880d8507c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/amin_f64.c @@ -0,0 +1,312 @@ +/* { dg-do compile } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +#pragma GCC target "+sve+faminmax" + +/* +** amin_f64_m_tied1: +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_m_tied1, svfloat64_t, + z0 = svamin_f64_m (p0, z0, z1), + z0 = svamin_m (p0, z0, z1)) + +/* +** amin_f64_m_tied2: +** mov (z[0-9]+\.d), z0\.d +** movprfx z0, z1 +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_f64_m_tied2, svfloat64_t, + z0 = svamin_f64_m (p0, z1, z0), + z0 = svamin_m (p0, z1, z0)) + +/* +** amin_f64_m_untied: +** movprfx z0, z1 +** famin z0\.d, p0/m, z0\.d, z2\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_m_untied, svfloat64_t, + z0 = svamin_f64_m (p0, z1, z2), + z0 = svamin_m (p0, z1, z2)) + +/* +** amin_d4_f64_m_tied1: +** mov (z[0-9]+\.d), d4 +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_d4_f64_m_tied1, svfloat64_t, double, + z0 = svamin_n_f64_m (p0, z0, d4), + z0 = svamin_m (p0, z0, d4)) + +/* +** amin_d4_f64_m_untied: +** mov (z[0-9]+\.d), d4 +** movprfx z0, z1 +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_d4_f64_m_untied, svfloat64_t, double, + z0 = svamin_n_f64_m (p0, z1, d4), + z0 = svamin_m (p0, z1, d4)) + +/* +** amin_2_f64_m: +** fmov (z[0-9]+\.d), #2\.0(?:e\+0)? +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f64_m, svfloat64_t, + z0 = svamin_n_f64_m (p0, z0, 2), + z0 = svamin_m (p0, z0, 2)) + +/* +** amin_f64_z_tied1: +** movprfx z0\.d, p0/z, z0\.d +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_z_tied1, svfloat64_t, + z0 = svamin_f64_z (p0, z0, z1), + z0 = svamin_z (p0, z0, z1)) + +/* +** amin_f64_z_tied2: +** movprfx z0\.d, p0/z, z0\.d +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_z_tied2, svfloat64_t, + z0 = svamin_f64_z (p0, z1, z0), + z0 = svamin_z (p0, z1, z0)) + +/* +** amin_f64_z_untied: +** ( +** movprfx z0\.d, p0/z, z1\.d +** famin z0\.d, p0/m, z0\.d, z2\.d +** | +** movprfx z0\.d, p0/z, z2\.d +** famin z0\.d, p0/m, z0\.d, z1\.d +** ) +** ret +*/ +TEST_UNIFORM_Z (amin_f64_z_untied, svfloat64_t, + z0 = svamin_f64_z (p0, z1, z2), + z0 = svamin_z (p0, z1, z2)) + +/* +** amin_d4_f64_z_tied1: +** mov (z[0-9]+\.d), d4 +** movprfx z0\.d, p0/z, z0\.d +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_d4_f64_z_tied1, svfloat64_t, double, + z0 = svamin_n_f64_z (p0, z0, d4), + z0 = svamin_z (p0, z0, d4)) + +/* +** amin_d4_f64_z_untied: +** mov (z[0-9]+\.d), d4 +** ( +** movprfx z0\.d, p0/z, z1\.d +** famin z0\.d, p0/m, z0\.d, \1 +** | +** movprfx z0\.d, p0/z, \1 +** famin z0\.d, p0/m, z0\.d, z1\.d +** ) +** ret +*/ +TEST_UNIFORM_ZD (amin_d4_f64_z_untied, svfloat64_t, double, + z0 = svamin_n_f64_z (p0, z1, d4), + z0 = svamin_z (p0, z1, d4)) + +/* +** amin_2_f64_z: +** fmov (z[0-9]+\.d), #2\.0(?:e\+0)? +** movprfx z0\.d, p0/z, z0\.d +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f64_z, svfloat64_t, + z0 = svamin_n_f64_z (p0, z0, 2), + z0 = svamin_z (p0, z0, 2)) + +/* +** amin_f64_x_tied1: +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_x_tied1, svfloat64_t, + z0 = svamin_f64_x (p0, z0, z1), + z0 = svamin_x (p0, z0, z1)) + +/* +** amin_f64_x_tied2: +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_f64_x_tied2, svfloat64_t, + z0 = svamin_f64_x (p0, z1, z0), + z0 = svamin_x (p0, z1, z0)) + +/* +** amin_f64_x_untied: +** ( +** movprfx z0, z1 +** famin z0\.d, p0/m, z0\.d, z2\.d +** | +** movprfx z0, z2 +** famin z0\.d, p0/m, z0\.d, z1\.d +** ) +** ret +*/ +TEST_UNIFORM_Z (amin_f64_x_untied, svfloat64_t, + z0 = svamin_f64_x (p0, z1, z2), + z0 = svamin_x (p0, z1, z2)) + +/* +** amin_d4_f64_x_tied1: +** mov (z[0-9]+\.d), d4 +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_ZD (amin_d4_f64_x_tied1, svfloat64_t, double, + z0 = svamin_n_f64_x (p0, z0, d4), + z0 = svamin_x (p0, z0, d4)) + +/* +** amin_d4_f64_x_untied: +** mov z0\.d, d4 +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_ZD (amin_d4_f64_x_untied, svfloat64_t, double, + z0 = svamin_n_f64_x (p0, z1, d4), + z0 = svamin_x (p0, z1, d4)) + +/* +** amin_2_f64_x_tied1: +** fmov (z[0-9]+\.d), #2\.0(?:e\+0)? +** famin z0\.d, p0/m, z0\.d, \1 +** ret +*/ +TEST_UNIFORM_Z (amin_2_f64_x_tied1, svfloat64_t, + z0 = svamin_n_f64_x (p0, z0, 2), + z0 = svamin_x (p0, z0, 2)) + +/* +** amin_2_f64_x_untied: +** fmov z0\.d, #2\.0(?:e\+0)? +** famin z0\.d, p0/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_Z (amin_2_f64_x_untied, svfloat64_t, + z0 = svamin_n_f64_x (p0, z1, 2), + z0 = svamin_x (p0, z1, 2)) + +/* +** ptrue_amin_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f64_x_tied1, svfloat64_t, + z0 = svamin_f64_x (svptrue_b64 (), z0, z1), + z0 = svamin_x (svptrue_b64 (), z0, z1)) + +/* +** ptrue_amin_f64_x_tied2: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f64_x_tied2, svfloat64_t, + z0 = svamin_f64_x (svptrue_b64 (), z1, z0), + z0 = svamin_x (svptrue_b64 (), z1, z0)) + +/* +** ptrue_amin_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_f64_x_untied, svfloat64_t, + z0 = svamin_f64_x (svptrue_b64 (), z1, z2), + z0 = svamin_x (svptrue_b64 (), z1, z2)) + +/* +** ptrue_amin_0_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_0_f64_x_tied1, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z0, 0), + z0 = svamin_x (svptrue_b64 (), z0, 0)) + +/* +** ptrue_amin_0_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_0_f64_x_untied, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z1, 0), + z0 = svamin_x (svptrue_b64 (), z1, 0)) + +/* +** ptrue_amin_1_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_1_f64_x_tied1, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z0, 1), + z0 = svamin_x (svptrue_b64 (), z0, 1)) + +/* +** ptrue_amin_1_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_1_f64_x_untied, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z1, 1), + z0 = svamin_x (svptrue_b64 (), z1, 1)) + +/* +** ptrue_amin_2_f64_x_tied1: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_2_f64_x_tied1, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z0, 2), + z0 = svamin_x (svptrue_b64 (), z0, 2)) + +/* +** ptrue_amin_2_f64_x_untied: +** ... +** ptrue p[0-9]+\.b[^\n]* +** ... +** ret +*/ +TEST_UNIFORM_Z (ptrue_amin_2_f64_x_untied, svfloat64_t, + z0 = svamin_n_f64_x (svptrue_b64 (), z1, 2), + z0 = svamin_x (svptrue_b64 (), z1, 2)) From patchwork Wed Oct 2 15:50:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saurabh Jha X-Patchwork-Id: 1992074 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=arDi6/yN; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=arDi6/yN; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XJfSd3VyTz1xtr for ; Thu, 3 Oct 2024 01:52:01 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 950D2385DDCE for ; Wed, 2 Oct 2024 15:51:59 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR03-DBA-obe.outbound.protection.outlook.com (mail-dbaeur03on2062b.outbound.protection.outlook.com [IPv6:2a01:111:f403:260d::62b]) by sourceware.org (Postfix) with ESMTPS id EB991385B532 for ; Wed, 2 Oct 2024 15:51:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org EB991385B532 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org EB991385B532 Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=2a01:111:f403:260d::62b ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1727884297; cv=pass; b=tCJ9I0w3KQdPKkL8rJCymSHczxCWZ7bp1x1UZzhCYO67aLufioiw0KiVHSAats2MUbYdyeZIe5frfW8kWJ8Iy6jX9wNzK5Uo8uRCmfbIx7KCUzOwZHOe5R/iNBtpJ+ScUdexYtFh80a8L4nC7iXUaRMjTM/XTwVPUoW3vxdbmlA= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1727884297; c=relaxed/simple; bh=xSxLHCWtcCUrYsW0CUtElnmEGUyBZgQ+udbexxoFzTE=; h=DKIM-Signature:DKIM-Signature:From:To:Subject:Date:Message-ID: MIME-Version; b=r9EtBx9wEMBZ+Leg+9SNGieQ0qT9P2axZ4fEI4zkwXjzCXkki1n8MgxH9VgJwOBCUT2Bv828aWu72quMo6oPOrifMSjpWimOaZ9sTImeiEQOAhbceWCHqFV6VnXyclP5pKhUtHcplM/FFYZ118NMrzFo/EIZhiPFOZGiZGxMa6c= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=pass; b=v0q+i33IotiPSHVWebjBIAiFVfI0vm8uct0igIqPoEhwZxXofQENnGOXpCLkO3vISrszFjVpWDWFg/d9G1rwYc0HvZFzcBiOp+1ZencjllQ4tATtCaY9cpsBGUsjKxaHvwZ5VUI7qIivxP3mwjSVm5YMyYjpwFX8bosiGNIbe79pkv9OxfdjWIDJXQBLe3MJbc1kkwmLaJ7U4Suh2CpHHmYqyh+TLcFSJEF0MTObW+aXex7trtpLPtw5FRpMBICS4PkD2Goapll31zsIX4kmOx+8+SW7MMxZ1v5Zz1oKIKtk55nZdWXzZ9hl9DxqXcppcZOyxACugttCa6X3tDUEMw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=dBucHM00i/c6T5keA/5hOTGHq4Oo4qwmyBj0YhAO8Fs=; b=ALoZLWokKHfDNMcipd27UDutKiJloGQADN5DvhZAgKgIhX/7GHn+IxugcN7bn05ytPuSIwvRuXeAdc4nJCA68MyVQx6CxH4WGbm+llieaoMycpswtTA6mtXWNStWm4w1ayCbbuZ/rXJ0ceL2a6jJIraO+rEYlVdU2zWcrK1hB+4NBrBnQPV/OMOT4wJWp6Oxj0ovZbuP5wG/6oiu0k+3HVoF9IY6FmogiOBu4X7HQ421MGkh7xp4C1kL/OrM4h40bJu90ecHpRCbYqJsTlVxuAP9TlglbpodpgTCpOIEKmgUhdjvza+LcpooWVTvWxxlW32mUVlJqY1WxN4FKX36zw== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=arm.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=dBucHM00i/c6T5keA/5hOTGHq4Oo4qwmyBj0YhAO8Fs=; b=arDi6/yNUUrZy3Pfe36ppKBjGrzbkhwj8wJpGfZfQiA1k/IOkh2efBBzzwp7T4YBL0SayIAtScwR4IKVBhCqABK/+RNY4AbgJ/zEt90BshQufd6o3vfTzizgodLaZrZzPdB/9gP6fDLN0jx8+OwVqJMQotKXWRYaHGiRBiU6JDs= Received: from AM6PR10CA0073.EURPRD10.PROD.OUTLOOK.COM (2603:10a6:209:8c::14) by AM0PR08MB5364.eurprd08.prod.outlook.com (2603:10a6:208:186::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.15; Wed, 2 Oct 2024 15:51:28 +0000 Received: from AM2PEPF0001C715.eurprd05.prod.outlook.com (2603:10a6:209:8c:cafe::dd) by AM6PR10CA0073.outlook.office365.com (2603:10a6:209:8c::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.15 via Frontend Transport; Wed, 2 Oct 2024 15:51:28 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=arm.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by AM2PEPF0001C715.mail.protection.outlook.com (10.167.16.185) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.7918.13 via Frontend Transport; Wed, 2 Oct 2024 15:51:28 +0000 Received: ("Tessian outbound 5b65fbeb7e07:v473"); Wed, 02 Oct 2024 15:51:27 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: d9c4ca46e0faf46f X-TessianGatewayMetadata: HPSqWijq9uSgXD8OskFSxY4JSxTGCpw4hz/3TxLTFaxocXqwTKMSPLM5qan0EPhscQRmvxIbLD7n8M6QBvYBuaejOv+Hdg34wJTVcs5PrJzn0VTYJldUC8kPo9m/pEXV6vo5h7kKCYhvtH+YnZVJ+7t6sQjYEqkpQF2Vy/N82ms= X-CR-MTA-TID: 64aa7808 Received: from L87d5cab7bf47.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id CA405355-F8C8-43DD-BB20-F13AC43DA6DD.1; Wed, 02 Oct 2024 15:51:16 +0000 Received: from EUR05-VI1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id L87d5cab7bf47.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Wed, 02 Oct 2024 15:51:16 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=Ja+mOx/ileGN61OYQIxVB5PL/+HSXuNrqsyTGejB4t1mhyc6Ef7O2L/esZhk4QNDHihz1vQXnxu0TTFLntLX78YlI0VxEwl3+Rl3rCy5A7w4dTxTfQKfPpI1LdMv1sNO1MVXwGZxEv1aALtn1hK87xX5INskF/aEDCSdj1VqyeaiUAojaf5/BuplaajgvMsy4GKWkRZiR6Zi2V3hGUlvm/xSe/W8OnAQMvOkstGcQl17Fxblw+u/ykLgJquOmrZ+iLNFUex36KTndLAKkAqrb1goNGfo9SWTfbOokHGfBlnB2woqORV4acNzYU1qkS4Zx4iNOtFzgndtnfTTuOq6Vg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=dBucHM00i/c6T5keA/5hOTGHq4Oo4qwmyBj0YhAO8Fs=; b=PLQP3cAE8MiDJR/JF566paAwrAwTIeBsG3FWhwQBjbRaPMlAUopYR1mtWuxITajAJdQTP74wDjh1MFGW+yUErRxr8+CYWIDyo6ayGJy/sT+gJHPqsvJwgXtXgz/4q64cqyOqswN/9c0t3GmGQt6baTblkFrS1+03W1Ze2249ZwjbHDgKvkhx0gY/X1q7yV2i8LHWwoneDLwAes8CTuMsr5BY/yIa59xFRuwVS6p0el9ESVPaPSE4gsWsgVA0sj7dHU8EZw5KAEuqvAoRTXxACqAMnNPhzehmjFMt74O0JvxxlJWl7eUmMGopF7wsUal11OpDqPScPwGTJlSKOmgaIg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 40.67.248.234) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=dBucHM00i/c6T5keA/5hOTGHq4Oo4qwmyBj0YhAO8Fs=; b=arDi6/yNUUrZy3Pfe36ppKBjGrzbkhwj8wJpGfZfQiA1k/IOkh2efBBzzwp7T4YBL0SayIAtScwR4IKVBhCqABK/+RNY4AbgJ/zEt90BshQufd6o3vfTzizgodLaZrZzPdB/9gP6fDLN0jx8+OwVqJMQotKXWRYaHGiRBiU6JDs= Received: from AM0P190CA0024.EURP190.PROD.OUTLOOK.COM (2603:10a6:208:190::34) by AM8PR08MB6340.eurprd08.prod.outlook.com (2603:10a6:20b:368::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.16; Wed, 2 Oct 2024 15:51:12 +0000 Received: from AMS1EPF0000004A.eurprd04.prod.outlook.com (2603:10a6:208:190:cafe::79) by AM0P190CA0024.outlook.office365.com (2603:10a6:208:190::34) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8026.16 via Frontend Transport; Wed, 2 Oct 2024 15:51:12 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 40.67.248.234) smtp.mailfrom=arm.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 40.67.248.234 as permitted sender) receiver=protection.outlook.com; client-ip=40.67.248.234; helo=nebula.arm.com; pr=C Received: from nebula.arm.com (40.67.248.234) by AMS1EPF0000004A.mail.protection.outlook.com (10.167.16.134) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.8026.11 via Frontend Transport; Wed, 2 Oct 2024 15:51:12 +0000 Received: from AZ-NEU-EXJ01.Arm.com (10.240.25.132) by AZ-NEU-EX04.Arm.com (10.251.24.32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Wed, 2 Oct 2024 15:51:12 +0000 Received: from AZ-NEU-EX03.Arm.com (10.251.24.31) by AZ-NEU-EXJ01.Arm.com (10.240.25.132) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Wed, 2 Oct 2024 15:51:11 +0000 Received: from e130340.cambridge.arm.com (10.2.80.47) by mail.arm.com (10.251.24.31) with Microsoft SMTP Server id 15.1.2507.39 via Frontend Transport; Wed, 2 Oct 2024 15:51:11 +0000 From: To: CC: , , Saurabh Jha Subject: [PATCH v3 2/2] aarch64: Add codegen support for SVE2 faminmax Date: Wed, 2 Oct 2024 16:50:53 +0100 Message-ID: <20241002155053.1343957-3-saurabh.jha@arm.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20241002155053.1343957-1-saurabh.jha@arm.com> References: <20241002155053.1343957-1-saurabh.jha@arm.com> MIME-Version: 1.0 X-EOPAttributedMessage: 1 X-MS-TrafficTypeDiagnostic: AMS1EPF0000004A:EE_|AM8PR08MB6340:EE_|AM2PEPF0001C715:EE_|AM0PR08MB5364:EE_ X-MS-Office365-Filtering-Correlation-Id: 2158865b-2423-494b-0612-08dce2fa13ce x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; ARA:13230040|82310400026|1800799024|36860700013|376014; X-Microsoft-Antispam-Message-Info-Original: SOp6xiolBxLlbxlh5g/YeLJZZTIFYK2FgGT8CzZuyW9Jl/ka/qqAIyCbtjpM8uSpMLdtFIoN/cGSPz7NCE2O8sEoRfoJqs4v+6WFOdQ9hQcjk0kz2lyCOpBeuiM1K8mMYobrIQ/dxWxiLoobshwgM/CehFnANdajhXDNM4s3r+U7JDnPXeIXqHVBjk6cWciWiERHzdlm80nfEmAR9Yy/1iDVlGvIYHhQNik6mAfYDnphx1KmKx5dvG16S+sgYG29XbRrFT1DBbgBWXoZAo/lmsx9fFwG+8qA4coJbZMStq1L6sSAsXXPIz7gXGpYQELtv4DVn3oluqItaGq0UMBx53sxSeCV34f6xcUutBTj/qYXTU9NF+monHeZhaqG4RJGfGFAYog3cIRTDJJOfOPtuhMQ8iO1mK1hxjGilU+sYW7bpX/2yP8EN6dKTb7RodA+FwR5KcpZpw12fYtPHZaJcOGX8N9r6euvKjBZKNeJmfNNd0z/lF2nx6DqDXa2usk3e7Wd3d2JRqo/vemPJZcn9E1V5ekruZyJMB6vHmhvh3ajD2IC81RfM+2/5ten4GBgAgDsch3oPp7RvvoFIfMLpl40a0rgnXEYDw3XpWDf99lHMwzt4iaxeY0vXlKrGR2TVJmR35xnB4Yu9p7pjYwQvWzkALEfRx2lV7PjoD7GlaTFAj7mFETjiBBNVSScAI/CFWz9Kuat2GdbQgOVzRLUEpSbrUh0LljeAJMHiBJwUXOa8DZ+vzzPuIRTZvEcEpeg/0rjTqn9VTHLZV3H/d8dyjpWp78sZfg+FXF5MFf12UeXymo7Ho942qjo58J2STHqVpW0SiwnAYsTQRc/fC3kNMLnGfVJ3RolaICQ/QXHHw+F9/A3LZgXGtBsSGU4w9JtroONt7atlFlZMa9ChFl+Hj34FpMteY3flJw+x6Py3Y/TAxvXQDnkmok2AcRbW93K3noh/CFnZUEX4pTMLd+xG0pq/sTV8qcWmuQEhov0qa3XUAlG76HfcQIl6GoU9O+EBpl/HCdmbdxeMwMeOdLCYij4Qst/AxQJWLRLg/m3CLGqLb6tqcTmXrvOhu4ArH7sISMnI5JG3Hn9gg+5YToOraOOY1/pJzQ9f+MWIFMEWBajsjMGN6yLUmxkhGxWLFY1rKzwQiVKqO7DQ9paeqVhjkCU/GaVehIWeVpNwjvi2SCCpmIO4vZmzDkperR/OPbNWktqB8TqKgPEgO3KgmtHkHCdjSQaPNEbj6Fa3lj+N/Ey7iz9cPJJxh8jit0UyyKGJnJWjEalqZNrhQdZPb0KPCjUYEzA/PjwinUjSsJT+bI/by3nhAVYAEOO2XTnTstALtEw4dH+KUb9R2B1C0yS3A== X-Forefront-Antispam-Report-Untrusted: CIP:40.67.248.234; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:nebula.arm.com; PTR:InfoDomainNonexistent; CAT:NONE; SFS:(13230040)(82310400026)(1800799024)(36860700013)(376014); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM8PR08MB6340 X-MS-Exchange-SkipListedInternetSender: ip=[2603:10a6:208:190::34]; domain=AM0P190CA0024.EURP190.PROD.OUTLOOK.COM X-MS-Exchange-Transport-CrossTenantHeadersStripped: AM2PEPF0001C715.eurprd05.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: af37a85d-b083-4c8c-916f-08dce2fa0aa9 X-Microsoft-Antispam: BCL:0; ARA:13230040|35042699022|82310400026|36860700013|376014|1800799024; X-Microsoft-Antispam-Message-Info: =?utf-8?q?btLeDyLA/I0tGj5SOQglQy1uq6nnfnZ?= =?utf-8?q?/4Hdo28aDjYvRS0Omx4VoOn4rwIiFT81AhLK1pd2gD7cMmawzg2WKhj6ZC6bKh9zT?= =?utf-8?q?lXeGa/5MWTIRhI/vPpc3g3EdwW5EdFlPGT0ZXBglQll9GR8IStAwxa1fs1T44HCTc?= =?utf-8?q?/XOb+0cHLlCTs6PwIKADRW/nsaakBFwfaRto3uRFDCE+Kjpt7+VeZz1vdyIzhn4Il?= =?utf-8?q?cxDiP1UQCBjAddm/iwpgFEGq9kvpibTG5zmrsPUKQHNhkkvfC5hcUDz3xAwrdSldB?= =?utf-8?q?wwn3MjVjTuLagancRW3xDdjQIjHZd7L3zw84iEYhai4XPbfQ4p1PgHFG/CbOPmUc0?= =?utf-8?q?WjBX5qb8aeVQCvQd7nPLTIckhLHZjcPxt2Z2n96s0BQBPDqRlITSjE38MLtwnyNhI?= =?utf-8?q?oxD0TgIjc9VMCI8azQtZc45zHWB078r50jO6UT7tW7wEn3musk+bCvKYGNLp6X8Ym?= =?utf-8?q?HwE/ZoGbx3Ebe8Q5FUwjyL5UEPb2UStrvax09UjPjyYfpU+Vc64V3mTut39H1bHjh?= =?utf-8?q?YUMUmIpJpj/cCtIe9pfbBHI7kc6t/XPP8ZF+RBc7LFBbTFhZWSVVqjq+Yvhg5J7CK?= =?utf-8?q?iVXRIDnVTFMOor6oHM2xqcdfXfMkxGbH3fLDOhfbkPXWozVKn9VqfnEHyqoCNh+L9?= =?utf-8?q?geJLTov9iG4eso7d8ZMNfMh9xWDz00VYrgXEwUqSAR+iGtk74q9b46YrFcYjbE1Ny?= =?utf-8?q?HOaH3eai7SvXdvyOgqwzK9oWOsh7aR8HoHlQh8pbtCN/3f/Ne4HZkBzrsudcZG6Lu?= =?utf-8?q?P4bI6NTuodDUl6JjHQTjjJChYysDcfT9/WCJJPmzXJrgEk5KAHNTHOyPoPK42pAiz?= =?utf-8?q?XUWmpvXZfPIt8ygBavO/q+zbKyFQfu5yT5J8dwYdrF+vwA0ja2yP0Au2aL3hOPljR?= =?utf-8?q?mOdvjbtxlA9u1u1t3Wr+70f2yArTj0hqgImEqcaDXonZAyZcRh96eVvUqoAjHDQrT?= =?utf-8?q?+dlJI2Fcp1QwlfV7MqaydOvkMpxqm2b8PtEfK31RkUE0zsNrVKJmWECiHughztRss?= =?utf-8?q?Tp163b9jwTPVaiAq4m9WE55fA/BMegfpgUFjDL463roqJAxkkhBsjAhw27I/pg9L0?= =?utf-8?q?3OB3ufiTWIwQcNVUqKCjZKMrlyDInWJyz9nKCZEc/4G1dZfpzZ5sWA4vlkp17kvuz?= =?utf-8?q?C1D3R4Qo7KF607rakH/nfTQAVVfyZ004if6Y0Xa4uVpDUH3jdwiJ0utgk6wbQ3kIK?= =?utf-8?q?HLtTUM7COR3UdJpMCY8aqjHp9PScQXq33hcQLztLuZGo2Wte4iOPYL/FkEGciuXP6?= =?utf-8?q?VN47U17juXIqqI+uJZ+Se0umAdy2tQeaVkA=3D=3D?= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230040)(35042699022)(82310400026)(36860700013)(376014)(1800799024); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 02 Oct 2024 15:51:28.0957 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 2158865b-2423-494b-0612-08dce2fa13ce X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: AM2PEPF0001C715.eurprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM0PR08MB5364 X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org The AArch64 FEAT_FAMINMAX extension introduces instructions for computing the floating point absolute maximum and minimum of the two vectors element-wise. This patch adds code generation for famax and famin in terms of existing unspecs. With this patch: 1. famax can be expressed as taking UNSPEC_COND_SMAX of the two operands and then taking absolute value of their result. 2. famin can be expressed as taking UNSPEC_COND_SMIN of the two operands and then taking absolute value of their result. This fusion of operators is only possible when -march=armv9-a+faminmax+sve flags are passed. We also need to pass -ffast-math flag; this is what enables compiler to use UNSPEC_COND_SMAX and UNSPEC_COND_SMIN. This code generation is only available on -O2 or -O3 as that is when auto-vectorization is enabled. gcc/ChangeLog: * config/aarch64/aarch64-sve2.md (*aarch64_pred_faminmax_fused): Instruction pattern for faminmax codegen. * config/aarch64/iterators.md: Iterator and attribute for faminmax codegen. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/faminmax_1.c: New test. * gcc.target/aarch64/sve/faminmax_2.c: New test. --- gcc/config/aarch64/aarch64-sve2.md | 37 +++++++++++ gcc/config/aarch64/iterators.md | 6 ++ .../gcc.target/aarch64/sve/faminmax_1.c | 45 ++++++++++++++ .../gcc.target/aarch64/sve/faminmax_2.c | 61 +++++++++++++++++++ 4 files changed, 149 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md index 725092cc95f..5f2697c3179 100644 --- a/gcc/config/aarch64/aarch64-sve2.md +++ b/gcc/config/aarch64/aarch64-sve2.md @@ -2467,6 +2467,43 @@ [(set_attr "movprfx" "yes")] ) +;; ------------------------------------------------------------------------- +;; -- [FP] Absolute maximum and minimum +;; ------------------------------------------------------------------------- +;; Includes: +;; - FAMAX +;; - FAMIN +;; ------------------------------------------------------------------------- +;; Predicated floating-point absolute maximum and minimum. +(define_insn_and_rewrite "*aarch64_pred_faminmax_fused" + [(set (match_operand:SVE_FULL_F 0 "register_operand") + (unspec:SVE_FULL_F + [(match_operand: 1 "register_operand") + (match_operand:SI 4 "aarch64_sve_gp_strictness") + (unspec:SVE_FULL_F + [(match_operand 5) + (const_int SVE_RELAXED_GP) + (match_operand:SVE_FULL_F 2 "register_operand")] + UNSPEC_COND_FABS) + (unspec:SVE_FULL_F + [(match_operand 6) + (const_int SVE_RELAXED_GP) + (match_operand:SVE_FULL_F 3 "register_operand")] + UNSPEC_COND_FABS)] + SVE_COND_SMAXMIN))] + "TARGET_SVE_FAMINMAX" + {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ] + [ w , Upl , %0 , w ; * ] \t%0., %1/m, %0., %3. + [ ?&w , Upl , w , w ; yes ] movprfx\t%0, %2\;\t%0., %1/m, %0., %3. + } + "&& (!rtx_equal_p (operands[1], operands[5]) + || !rtx_equal_p (operands[1], operands[6]))" + { + operands[5] = copy_rtx (operands[1]); + operands[6] = copy_rtx (operands[1]); + } +) + ;; ========================================================================= ;; == Complex arithmetic ;; ========================================================================= diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index c06f8c2c90f..8b18682c341 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -3143,6 +3143,9 @@ UNSPEC_COND_FMIN UNSPEC_COND_FMINNM]) +(define_int_iterator SVE_COND_SMAXMIN [UNSPEC_COND_SMAX + UNSPEC_COND_SMIN]) + (define_int_iterator SVE_COND_FP_TERNARY [UNSPEC_COND_FMLA UNSPEC_COND_FMLS UNSPEC_COND_FNMLA @@ -4503,6 +4506,9 @@ (define_int_iterator FAMINMAX_UNS [UNSPEC_FAMAX UNSPEC_FAMIN]) +(define_int_attr faminmax_cond_uns_op + [(UNSPEC_COND_SMAX "famax") (UNSPEC_COND_SMIN "famin")]) + (define_int_attr faminmax_uns_op [(UNSPEC_FAMAX "famax") (UNSPEC_FAMIN "famin")]) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c new file mode 100644 index 00000000000..d54f5d99b5e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c @@ -0,0 +1,45 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O3 -ffast-math" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "arm_sve.h" + +#pragma GCC target "+sve+faminmax" + +#define TEST_FAMAX(TYPE) \ + void fn_famax_##TYPE (TYPE * restrict a, \ + TYPE * restrict b, \ + TYPE * restrict c, \ + int n) { \ + for (int i = 0; i < n; i++) { \ + TYPE temp1 = __builtin_fabs (a[i]); \ + TYPE temp2 = __builtin_fabs (b[i]); \ + c[i] = __builtin_fmax (temp1, temp2); \ + } \ + } \ + +#define TEST_FAMIN(TYPE) \ + void fn_famin_##TYPE (TYPE * restrict a, \ + TYPE * restrict b, \ + TYPE * restrict c, \ + int n) { \ + for (int i = 0; i < n; i++) { \ + TYPE temp1 = __builtin_fabs (a[i]); \ + TYPE temp2 = __builtin_fabs (b[i]); \ + c[i] = __builtin_fmin (temp1, temp2); \ + } \ + } \ + +TEST_FAMAX (float16_t) +TEST_FAMAX (float32_t) +TEST_FAMAX (float64_t) +TEST_FAMIN (float16_t) +TEST_FAMIN (float32_t) +TEST_FAMIN (float64_t) + +/* { dg-final { scan-assembler-times {\tfamax\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamax\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamax\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamin\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamin\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfamin\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c new file mode 100644 index 00000000000..29e12450831 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c @@ -0,0 +1,61 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O3 -ffast-math" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "arm_sve.h" + +#pragma GCC target "+sve+faminmax" + +#define TEST_WITH_SVMAX(TYPE) \ + TYPE fn_fmax_##TYPE (TYPE x, TYPE y) { \ + svbool_t pg = svptrue_b8(); \ + return svmax_x(pg, svabs_x(pg, x), svabs_x(pg, y)); \ + } \ + +#define TEST_WITH_SVMAXNM(TYPE) \ + TYPE fn_fmaxnm_##TYPE (TYPE x, TYPE y) { \ + svbool_t pg = svptrue_b8(); \ + return svmaxnm_x(pg, svabs_x(pg, x), svabs_x(pg, y)); \ + } \ + +#define TEST_WITH_SVMIN(TYPE) \ + TYPE fn_fmin_##TYPE (TYPE x, TYPE y) { \ + svbool_t pg = svptrue_b8(); \ + return svmin_x(pg, svabs_x(pg, x), svabs_x(pg, y)); \ + } \ + +#define TEST_WITH_SVMINNM(TYPE) \ + TYPE fn_fminnm_##TYPE (TYPE x, TYPE y) { \ + svbool_t pg = svptrue_b8(); \ + return svminnm_x(pg, svabs_x(pg, x), svabs_x(pg, y)); \ + } \ + +TEST_WITH_SVMAX (svfloat16_t) +TEST_WITH_SVMAX (svfloat32_t) +TEST_WITH_SVMAX (svfloat64_t) + +TEST_WITH_SVMAXNM (svfloat16_t) +TEST_WITH_SVMAXNM (svfloat32_t) +TEST_WITH_SVMAXNM (svfloat64_t) + +TEST_WITH_SVMIN (svfloat16_t) +TEST_WITH_SVMIN (svfloat32_t) +TEST_WITH_SVMIN (svfloat64_t) + +TEST_WITH_SVMINNM (svfloat16_t) +TEST_WITH_SVMINNM (svfloat32_t) +TEST_WITH_SVMINNM (svfloat64_t) + +/* { dg-final { scan-assembler-not {\tfamax\t} } } */ +/* { dg-final { scan-assembler-not {\tfamin\t} } } */ + +/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h\n} 8 } } */ +/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 8 } } */ +/* { dg-final { scan-assembler-times {\tfabs\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 8 } } */ + +/* { dg-final { scan-assembler-times {\tfmax\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmax\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmax\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmin\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmin\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */ +/* { dg-final { scan-assembler-times {\tfmin\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */