From patchwork Wed Aug 28 09:22:12 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Saurabh Jha
X-Patchwork-Id: 1977698
X-Original-To: gcc-patches@gcc.gnu.org
From:
To:
CC: , , Saurabh Jha
Subject: [PATCH v5 2/2] aarch64: Add codegen support for AdvSIMD faminmax
Date: Wed, 28 Aug 2024 10:22:12 +0100
Message-ID: <20240828092212.3995835-3-saurabh.jha@arm.com>
X-Mailer: git-send-email 2.43.2
In-Reply-To: <20240828092212.3995835-1-saurabh.jha@arm.com>
References: <20240828092212.3995835-1-saurabh.jha@arm.com>
X-BeenThere: gcc-patches@gcc.gnu.org
Precedence: list
List-Id: Gcc-patches mailing list

The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and
mandatory from Armv9.5-a. It introduces instructions for computing the
floating point absolute maximum and minimum of two vectors
element-wise.

This patch adds code generation support for famax and famin in terms
of existing RTL operators.

famax/famin is equivalent to first taking abs of the operands and then
taking smax/smin on the results of abs.

	famax/famin (a, b) = smax/smin (abs (a), abs (b))

This fusion of operators is only possible when the faminmax extension
is enabled, for example with -march=armv9-a+faminmax. The -ffast-math
flag is also required; without it, a statement like

	c[i] = __builtin_fmaxf16 (a[i], b[i]);

is RTL expanded to UNSPEC_FMAXNM instead of smax (likewise for smin).

This code generation is only available at -O2 or -O3, as that is when
auto-vectorization is enabled.

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md
	(*aarch64_faminmax_fused): Instruction pattern for faminmax
	codegen.
	* config/aarch64/iterators.md: Attribute for faminmax codegen.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/simd/faminmax-codegen-no-flag.c: New test.
	* gcc.target/aarch64/simd/faminmax-codegen.c: New test.
---
 gcc/config/aarch64/aarch64-simd.md            |  10 +
 gcc/config/aarch64/iterators.md               |   3 +
 .../aarch64/simd/faminmax-codegen-no-flag.c   | 217 ++++++++++++++++++
 .../aarch64/simd/faminmax-codegen.c           | 197 ++++++++++++++++
 4 files changed, 427 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen-no-flag.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 7542c81ed91..8973cade488 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -9921,3 +9921,13 @@
   "<faminmax_uns_op>\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
   [(set_attr "type" "neon_fp_aminmax")]
 )
+
+(define_insn "*aarch64_faminmax_fused"
+  [(set (match_operand:VHSDF 0 "register_operand" "=w")
+	(FMAXMIN:VHSDF
+	  (abs:VHSDF (match_operand:VHSDF 1 "register_operand" "w"))
+	  (abs:VHSDF (match_operand:VHSDF 2 "register_operand" "w"))))]
+  "TARGET_FAMINMAX"
+  "<faminmax_op>\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
+  [(set_attr "type" "neon_fp_aminmax")]
+)
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 17ac5e073aa..c2fcd18306e 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -4472,3 +4472,6 @@
 
 (define_int_attr faminmax_uns_op
   [(UNSPEC_FAMAX "famax") (UNSPEC_FAMIN "famin")])
+
+(define_code_attr faminmax_op
+  [(smax "famax") (smin "famin")])
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen-no-flag.c b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen-no-flag.c
new file mode 100644
index 00000000000..d77f5a5d19f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen-no-flag.c
@@ -0,0 +1,217 @@
+/* { dg-do assemble} */
+/* { dg-additional-options "-O3 -ffast-math -march=armv9-a" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_neon.h"
+
+#pragma GCC target "+nosve"
+
+/*
+** test_vamax_f16:
+**	fabs	v1.4h, v1.4h
+**	fabs	v0.4h, v0.4h
+**	fmaxnm	v0.4h, v0.4h, v1.4h
+**	ret
+*/
+float16x4_t
+test_vamax_f16 (float16x4_t a, float16x4_t b)
+{
+  int i;
+  float16x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fmaxf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamaxq_f16:
+**	fabs	v1.8h, v1.8h
+**	fabs	v0.8h, v0.8h
+**	fmaxnm	v0.8h, v0.8h, v1.8h
+**	ret
+*/
+float16x8_t
+test_vamaxq_f16 (float16x8_t a, float16x8_t b)
+{
+  int i;
+  float16x8_t c;
+
+  for (i = 0; i < 8; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fmaxf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamax_f32:
+**	fabs	v1.2s, v1.2s
+**	fabs	v0.2s, v0.2s
+**	fmaxnm	v0.2s, v0.2s, v1.2s
+**	ret
+*/
+float32x2_t
+test_vamax_f32 (float32x2_t a, float32x2_t b)
+{
+  int i;
+  float32x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fmaxf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamaxq_f32:
+**	fabs	v1.4s, v1.4s
+**	fabs	v0.4s, v0.4s
+**	fmaxnm	v0.4s, v0.4s, v1.4s
+**	ret
+*/
+float32x4_t
+test_vamaxq_f32 (float32x4_t a, float32x4_t b)
+{
+  int i;
+  float32x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fmaxf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamaxq_f64:
+**	fabs	v1.2d, v1.2d
+**	fabs	v0.2d, v0.2d
+**	fmaxnm	v0.2d, v0.2d, v1.2d
+**	ret
+*/
+float64x2_t
+test_vamaxq_f64 (float64x2_t a, float64x2_t b)
+{
+  int i;
+  float64x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf64 (a[i]);
+    b[i] = __builtin_fabsf64 (b[i]);
+    c[i] = __builtin_fmaxf64 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamin_f16:
+**	fabs	v1.4h, v1.4h
+**	fabs	v0.4h, v0.4h
+**	fminnm	v0.4h, v0.4h, v1.4h
+**	ret
+*/
+float16x4_t
+test_vamin_f16 (float16x4_t a, float16x4_t b)
+{
+  int i;
+  float16x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fminf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vaminq_f16:
+**	fabs	v1.8h, v1.8h
+**	fabs	v0.8h, v0.8h
+**	fminnm	v0.8h, v0.8h, v1.8h
+**	ret
+*/
+float16x8_t
+test_vaminq_f16 (float16x8_t a, float16x8_t b)
+{
+  int i;
+  float16x8_t c;
+
+  for (i = 0; i < 8; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fminf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamin_f32:
+**	fabs	v1.2s, v1.2s
+**	fabs	v0.2s, v0.2s
+**	fminnm	v0.2s, v0.2s, v1.2s
+**	ret
+*/
+float32x2_t
+test_vamin_f32 (float32x2_t a, float32x2_t b)
+{
+  int i;
+  float32x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fminf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vaminq_f32:
+**	fabs	v1.4s, v1.4s
+**	fabs	v0.4s, v0.4s
+**	fminnm	v0.4s, v0.4s, v1.4s
+**	ret
+*/
+float32x4_t
+test_vaminq_f32 (float32x4_t a, float32x4_t b)
+{
+  int i;
+  float32x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fminf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vaminq_f64:
+**	fabs	v1.2d, v1.2d
+**	fabs	v0.2d, v0.2d
+**	fminnm	v0.2d, v0.2d, v1.2d
+**	ret
+*/
+float64x2_t
+test_vaminq_f64 (float64x2_t a, float64x2_t b)
+{
+  int i;
+  float64x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf64 (a[i]);
+    b[i] = __builtin_fabsf64 (b[i]);
+    c[i] = __builtin_fminf64 (a[i], b[i]);
+  }
+  return c;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen.c b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen.c
new file mode 100644
index 00000000000..971386c0bf0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen.c
@@ -0,0 +1,197 @@
+/* { dg-do assemble} */
+/* { dg-additional-options "-O2 -ffast-math -march=armv9-a+faminmax" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_neon.h"
+
+#pragma GCC target "+nosve"
+
+/*
+** test_vamax_f16:
+**	famax	v0.4h, v1.4h, v0.4h
+**	ret
+*/
+float16x4_t
+test_vamax_f16 (float16x4_t a, float16x4_t b)
+{
+  int i;
+  float16x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fmaxf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamaxq_f16:
+**	famax	v0.8h, v1.8h, v0.8h
+**	ret
+*/
+float16x8_t
+test_vamaxq_f16 (float16x8_t a, float16x8_t b)
+{
+  int i;
+  float16x8_t c;
+
+  for (i = 0; i < 8; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fmaxf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamax_f32:
+**	famax	v0.2s, v1.2s, v0.2s
+**	ret
+*/
+float32x2_t
+test_vamax_f32 (float32x2_t a, float32x2_t b)
+{
+  int i;
+  float32x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fmaxf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamaxq_f32:
+**	famax	v0.4s, v1.4s, v0.4s
+**	ret
+*/
+float32x4_t
+test_vamaxq_f32 (float32x4_t a, float32x4_t b)
+{
+  int i;
+  float32x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fmaxf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamaxq_f64:
+**	famax	v0.2d, v1.2d, v0.2d
+**	ret
+*/
+float64x2_t
+test_vamaxq_f64 (float64x2_t a, float64x2_t b)
+{
+  int i;
+  float64x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf64 (a[i]);
+    b[i] = __builtin_fabsf64 (b[i]);
+    c[i] = __builtin_fmaxf64 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamin_f16:
+**	famin	v0.4h, v1.4h, v0.4h
+**	ret
+*/
+float16x4_t
+test_vamin_f16 (float16x4_t a, float16x4_t b)
+{
+  int i;
+  float16x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fminf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vaminq_f16:
+**	famin	v0.8h, v1.8h, v0.8h
+**	ret
+*/
+float16x8_t
+test_vaminq_f16 (float16x8_t a, float16x8_t b)
+{
+  int i;
+  float16x8_t c;
+
+  for (i = 0; i < 8; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fminf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamin_f32:
+**	famin	v0.2s, v1.2s, v0.2s
+**	ret
+*/
+float32x2_t
+test_vamin_f32 (float32x2_t a, float32x2_t b)
+{
+  int i;
+  float32x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fminf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vaminq_f32:
+**	famin	v0.4s, v1.4s, v0.4s
+**	ret
+*/
+float32x4_t
+test_vaminq_f32 (float32x4_t a, float32x4_t b)
+{
+  int i;
+  float32x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fminf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vaminq_f64:
+**	famin	v0.2d, v1.2d, v0.2d
+**	ret
+*/
+float64x2_t
+test_vaminq_f64 (float64x2_t a, float64x2_t b)
+{
+  int i;
+  float64x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf64 (a[i]);
+    b[i] = __builtin_fabsf64 (b[i]);
+    c[i] = __builtin_fminf64 (a[i], b[i]);
+  }
+  return c;
+}
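
For readers who want the per-lane semantics spelled out, the identity from the patch description, famax/famin (a, b) = smax/smin (abs (a), abs (b)), can be sketched as a scalar reference model in plain C. This is only an illustration and not part of the patch: the names ref_abs, ref_famax, ref_famin, ref_famax_n and ref_selfcheck are hypothetical, and NaN and signed-zero corner cases are deliberately ignored, on the assumption that this matches what -ffast-math permits the compiler to disregard.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the patch): absolute value of one lane,
   ignoring NaN and signed zero.  */
static float
ref_abs (float x)
{
  return x < 0.0f ? -x : x;
}

/* Hypothetical scalar model of one famax lane: smax (abs (a), abs (b)).  */
static float
ref_famax (float a, float b)
{
  float x = ref_abs (a), y = ref_abs (b);
  return x > y ? x : y;
}

/* Hypothetical scalar model of one famin lane: smin (abs (a), abs (b)).  */
static float
ref_famin (float a, float b)
{
  float x = ref_abs (a), y = ref_abs (b);
  return x < y ? x : y;
}

/* Element-wise form, shaped like the loops in the new tests.  */
static void
ref_famax_n (const float *a, const float *b, float *c, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    c[i] = ref_famax (a[i], b[i]);
}

/* Self-check using exactly representable inputs.  */
static void
ref_selfcheck (void)
{
  const float a[4] = { -3.0f, 1.5f, -0.25f, 8.0f };
  const float b[4] = { 2.0f, -4.0f, 0.5f, -8.0f };
  float c[4];

  ref_famax_n (a, b, c, 4);
  assert (c[0] == 3.0f && c[1] == 4.0f && c[2] == 0.5f && c[3] == 8.0f);
  assert (ref_famin (-3.0f, 2.0f) == 2.0f);
}
```

Calling ref_selfcheck () from a small main is enough to exercise the model. The comparisons are written out by hand rather than through libm so that the sketch has no link-time dependencies.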