From patchwork Fri Aug 30 11:16:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Saurabh Jha
X-Patchwork-Id: 1978937
From:
To:
CC: , , Saurabh Jha
Subject: [PATCH v7 1/2] aarch64: Add AdvSIMD faminmax intrinsics
Date: Fri, 30 Aug 2024 12:16:25 +0100
Message-ID: <20240830111626.70300-2-saurabh.jha@arm.com>
X-Mailer: git-send-email 2.43.2
In-Reply-To: <20240830111626.70300-1-saurabh.jha@arm.com>
References: <20240830111626.70300-1-saurabh.jha@arm.com>
List-Id: Gcc-patches mailing list

The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and
mandatory from Armv9.5-a. It introduces instructions for computing the
element-wise maximum and minimum of the absolute values of two
floating-point vectors.

This patch introduces AdvSIMD faminmax intrinsics.
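As a rough illustration of the semantics described here, a scalar
reference model of the two operations might look like the C sketch
below. The helper names ref_famax_f32 and ref_famin_f32 are hypothetical
and not part of this patch, and the sketch assumes NaN-free inputs, so
it says nothing about how the instructions treat NaNs.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical scalar model, not part of the patch:
   dst[i] = max (|a[i]|, |b[i]|).  Assumes no NaN inputs.  */
static void
ref_famax_f32 (float *dst, const float *a, const float *b, size_t n)
{
  assert (n == 0 || (dst && a && b));
  for (size_t i = 0; i < n; ++i)
    {
      float x = fabsf (a[i]);
      float y = fabsf (b[i]);
      dst[i] = x > y ? x : y;
    }
}

/* Hypothetical scalar model, not part of the patch:
   dst[i] = min (|a[i]|, |b[i]|).  Assumes no NaN inputs.  */
static void
ref_famin_f32 (float *dst, const float *a, const float *b, size_t n)
{
  assert (n == 0 || (dst && a && b));
  for (size_t i = 0; i < n; ++i)
    {
      float x = fabsf (a[i]);
      float y = fabsf (b[i]);
      dst[i] = x < y ? x : y;
    }
}
```

A model like this could serve as an oracle when checking the vector
intrinsics lane by lane on hardware or under emulation.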
The intrinsics of this extension are implemented as the following
builtin functions:
* vamax_f16
* vamaxq_f16
* vamax_f32
* vamaxq_f32
* vamaxq_f64
* vamin_f16
* vaminq_f16
* vamin_f32
* vaminq_f32
* vaminq_f64

We are defining a new way to add AArch64 AdvSIMD intrinsics by listing
all the intrinsics in a .def file and then using that .def file to
initialise various data structures. This would lead to more concise code
and easier addition of the new AdvSIMD intrinsics in future. The
faminmax intrinsics are defined using the new approach.

gcc/ChangeLog:

	* config/aarch64/aarch64-builtins.cc (ENTRY): Macro to parse the
	contents of aarch64-simd-pragma-builtins.def.
	(enum aarch64_builtins): New enum values for faminmax builtins
	via aarch64-simd-pragma-builtins.def.
	(struct aarch64_pragma_builtins_data): Struct to hold data from
	aarch64-simd-pragma-builtins.def.
	(aarch64_init_pragma_builtins): New function to define pragma
	builtins.
	(aarch64_get_pragma_builtin): New function to get a row of
	aarch64_pragma_builtins, given code.
	(handle_arm_neon_h): Modify to call
	aarch64_init_pragma_builtins.
	(aarch64_general_check_builtin_call): Modify to check whether
	required flag is being used for pragma builtins.
	(aarch64_expand_pragma_builtin): New function to emit
	instructions of pragma builtins.
	(aarch64_general_expand_builtin): Modify to call
	aarch64_expand_pragma_builtin.
	* config/aarch64/aarch64-option-extensions.def
	(AARCH64_OPT_EXTENSION): Introduce new flag for this
	extension.
	* config/aarch64/aarch64-simd.md (@aarch64_): Instruction
	pattern for faminmax intrinsics.
	* config/aarch64/aarch64.h (TARGET_FAMINMAX): Introduce new
	flag for this extension.
	* config/aarch64/iterators.md: New iterators and unspecs.
	* config/arm/types.md: Introduce neon_fp_aminmax attributes.
	* doc/invoke.texi: Document extension in AArch64 Options.
	* config/aarch64/aarch64-simd-pragma-builtins.def: New file to
	list pragma builtins.
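The .def-driven scheme described in the commit message follows the
X-macro pattern: one list of ENTRY rows is expanded several times under
different definitions of ENTRY. A minimal self-contained sketch in C is
given below. It assumes an inline list macro, DEMO_BUILTINS, as a
stand-in for the #include of the .def file, and every demo_* and DEMO_*
name is hypothetical rather than taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the .def file: one ENTRY row per builtin.  */
#define DEMO_BUILTINS \
  ENTRY (demo_amax, 1) \
  ENTRY (demo_amin, 2)

/* First expansion: one enum value per row, bracketed by START/END.  */
#define ENTRY(N, U) DEMO_##N,
enum demo_code { DEMO_START, DEMO_BUILTINS DEMO_END };
#undef ENTRY

/* Second expansion: one table row per entry, in the same order.  */
struct demo_data
{
  const char *name;
  int unspec;
};

#define ENTRY(N, U) { #N, U },
static const struct demo_data demo_table[] = { DEMO_BUILTINS };
#undef ENTRY

/* Map an enum code back to its table row, or NULL if out of range.  */
static const struct demo_data *
demo_lookup (int code)
{
  if (!(code > DEMO_START && code < DEMO_END))
    return NULL;

  size_t idx = (size_t) (code - (DEMO_START + 1));
  assert (idx < sizeof demo_table / sizeof demo_table[0]);
  return &demo_table[idx];
}
```

Because both expansions walk the same list, the enum values and the
table rows cannot drift out of order, which is what makes the
"code - (START + 1)" indexing safe.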
gcc/testsuite/ChangeLog: * gcc.target/aarch64/simd/faminmax-builtins-no-flag.c: New test. * gcc.target/aarch64/simd/faminmax-builtins.c: New test. --- gcc/config/aarch64/aarch64-builtins.cc | 84 +++++++++++++ .../aarch64/aarch64-option-extensions.def | 2 + .../aarch64/aarch64-simd-pragma-builtins.def | 31 +++++ gcc/config/aarch64/aarch64-simd.md | 11 ++ gcc/config/aarch64/aarch64.h | 4 + gcc/config/aarch64/iterators.md | 9 ++ gcc/config/arm/types.md | 6 + gcc/doc/invoke.texi | 2 + .../aarch64/simd/faminmax-builtins-no-flag.c | 10 ++ .../aarch64/simd/faminmax-builtins.c | 115 ++++++++++++++++++ 10 files changed, 274 insertions(+) create mode 100644 gcc/config/aarch64/aarch64-simd-pragma-builtins.def create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins-no-flag.c create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins.c diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc index eb878b933fe..a4905dd0aae 100644 --- a/gcc/config/aarch64/aarch64-builtins.cc +++ b/gcc/config/aarch64/aarch64-builtins.cc @@ -757,6 +757,10 @@ typedef struct #define VAR1(T, N, MAP, FLAG, A) \ AARCH64_SIMD_BUILTIN_##T##_##N##A, +#undef ENTRY +#define ENTRY(N, M, U, F) \ + AARCH64_##N, + enum aarch64_builtins { AARCH64_BUILTIN_MIN, @@ -829,6 +833,10 @@ enum aarch64_builtins AARCH64_RBIT, AARCH64_RBITL, AARCH64_RBITLL, + /* Pragma builtins. */ + AARCH64_PRAGMA_BUILTIN_START, +#include "aarch64-simd-pragma-builtins.def" + AARCH64_PRAGMA_BUILTIN_END, /* System register builtins. 
*/ AARCH64_RSR, AARCH64_RSRP, @@ -947,6 +955,7 @@ const char *aarch64_scalar_builtin_types[] = { extern GTY(()) aarch64_simd_type_info aarch64_simd_types[]; +#undef ENTRY #define ENTRY(E, M, Q, G) \ {E, "__" #E, #G "__" #E, NULL_TREE, NULL_TREE, E_##M##mode, qualifier_##Q}, struct aarch64_simd_type_info aarch64_simd_types [] = { @@ -1547,6 +1556,50 @@ aarch64_init_simd_builtin_functions (bool called_from_pragma) } } +/* Initialize pragma builtins. */ + +struct aarch64_pragma_builtins_data +{ + const char *name; + machine_mode mode; + int unspec; + aarch64_feature_flags required_extensions; +}; + +#undef ENTRY +#define ENTRY(N, M, U, F) \ + {#N, E_##M##mode, U, F}, + +static aarch64_pragma_builtins_data aarch64_pragma_builtins[] = { +#include "aarch64-simd-pragma-builtins.def" +}; + +static void +aarch64_init_pragma_builtins () +{ + for (size_t i = 0; i < ARRAY_SIZE (aarch64_pragma_builtins); ++i) + { + auto data = aarch64_pragma_builtins[i]; + auto type = aarch64_simd_builtin_type (data.mode, qualifier_none); + auto fntype = build_function_type_list (type, type, type, NULL_TREE); + auto code = AARCH64_PRAGMA_BUILTIN_START + i + 1; + const char *name = data.name; + aarch64_builtin_decls[code] + = aarch64_general_simulate_builtin (name, fntype, code); + } +} + +static const aarch64_pragma_builtins_data * +aarch64_get_pragma_builtin (int code) +{ + if (!(code > AARCH64_PRAGMA_BUILTIN_START + && code < AARCH64_PRAGMA_BUILTIN_END)) + return NULL; + + auto idx = code - (AARCH64_PRAGMA_BUILTIN_START + 1); + return &aarch64_pragma_builtins[idx]; +} + /* Register the tuple type that contains NUM_VECTORS of the AdvSIMD type indexed by TYPE_INDEX. 
*/ static void @@ -1640,6 +1693,7 @@ handle_arm_neon_h (void) aarch64_init_simd_builtin_functions (true); aarch64_init_simd_intrinsics (); + aarch64_init_pragma_builtins (); } static void @@ -2326,6 +2380,12 @@ aarch64_general_check_builtin_call (location_t location, vec, return aarch64_check_required_extensions (location, decl, AARCH64_FL_MEMTAG); + if (auto builtin_data = aarch64_get_pragma_builtin (code)) + { + auto flags = builtin_data->required_extensions; + return aarch64_check_required_extensions (location, decl, flags); + } + return true; } @@ -3189,6 +3249,27 @@ aarch64_expand_builtin_data_intrinsic (unsigned int fcode, tree exp, rtx target) return ops[0].value; } +static rtx +aarch64_expand_pragma_builtin (unsigned int fcode, tree exp, rtx target) +{ + auto builtins_data + = aarch64_pragma_builtins[fcode - (AARCH64_PRAGMA_BUILTIN_START + 1)]; + + expand_operand ops[3]; + auto mode = builtins_data.mode; + auto op1 = expand_normal (CALL_EXPR_ARG (exp, 0)); + auto op2 = expand_normal (CALL_EXPR_ARG (exp, 1)); + create_output_operand (&ops[0], target, mode); + create_input_operand (&ops[1], op1, mode); + create_input_operand (&ops[2], op2, mode); + + auto unspec = builtins_data.unspec; + auto icode = code_for_aarch64 (unspec, mode); + expand_insn (icode, 3, ops); + + return target; +} + /* Expand an expression EXP as fpsr or fpcr setter (depending on UNSPEC) using MODE. 
*/ static void @@ -3368,6 +3449,9 @@ aarch64_general_expand_builtin (unsigned int fcode, tree exp, rtx target, if (fcode >= AARCH64_REV16 && fcode <= AARCH64_RBITLL) return aarch64_expand_builtin_data_intrinsic (fcode, exp, target); + if (fcode > AARCH64_PRAGMA_BUILTIN_START + && fcode < AARCH64_PRAGMA_BUILTIN_END) + return aarch64_expand_pragma_builtin (fcode, exp, target); gcc_unreachable (); } diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def index 6998627f377..8279f5a76ea 100644 --- a/gcc/config/aarch64/aarch64-option-extensions.def +++ b/gcc/config/aarch64/aarch64-option-extensions.def @@ -234,6 +234,8 @@ AARCH64_OPT_EXTENSION("gcs", GCS, (), (), (), "gcs") AARCH64_OPT_EXTENSION("fp8", FP8, (SIMD), (), (), "fp8") +AARCH64_OPT_EXTENSION("faminmax", FAMINMAX, (SIMD), (), (), "faminmax") + #undef AARCH64_OPT_FMV_EXTENSION #undef AARCH64_OPT_EXTENSION #undef AARCH64_FMV_FEATURE diff --git a/gcc/config/aarch64/aarch64-simd-pragma-builtins.def b/gcc/config/aarch64/aarch64-simd-pragma-builtins.def new file mode 100644 index 00000000000..be7029c4424 --- /dev/null +++ b/gcc/config/aarch64/aarch64-simd-pragma-builtins.def @@ -0,0 +1,31 @@ +/* AArch64 SIMD pragma builtins + Copyright (C) 2024 Free Software Foundation, Inc. + Contributed by ARM Ltd. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . 
*/ + + // faminmax + ENTRY (vamax_f16, V4HF, UNSPEC_FAMAX, AARCH64_FL_FAMINMAX) + ENTRY (vamaxq_f16, V8HF, UNSPEC_FAMAX, AARCH64_FL_FAMINMAX) + ENTRY (vamax_f32, V2SF, UNSPEC_FAMAX, AARCH64_FL_FAMINMAX) + ENTRY (vamaxq_f32, V4SF, UNSPEC_FAMAX, AARCH64_FL_FAMINMAX) + ENTRY (vamaxq_f64, V2DF, UNSPEC_FAMAX, AARCH64_FL_FAMINMAX) + ENTRY (vamin_f16, V4HF, UNSPEC_FAMIN, AARCH64_FL_FAMINMAX) + ENTRY (vaminq_f16, V8HF, UNSPEC_FAMIN, AARCH64_FL_FAMINMAX) + ENTRY (vamin_f32, V2SF, UNSPEC_FAMIN, AARCH64_FL_FAMINMAX) + ENTRY (vaminq_f32, V4SF, UNSPEC_FAMIN, AARCH64_FL_FAMINMAX) + ENTRY (vaminq_f64, V2DF, UNSPEC_FAMIN, AARCH64_FL_FAMINMAX) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 23c03a96371..7542c81ed91 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -9910,3 +9910,14 @@ "shl\\t%d0, %d1, #16" [(set_attr "type" "neon_shift_imm")] ) + +;; faminmax +(define_insn "@aarch64_" + [(set (match_operand:VHSDF 0 "register_operand" "=w") + (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand" "w") + (match_operand:VHSDF 2 "register_operand" "w")] + FAMINMAX_UNS))] + "TARGET_FAMINMAX" + "\t%0., %1., %2." + [(set_attr "type" "neon_fp_aminmax")] +) diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 2dfb999bea5..de14f57071a 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -457,6 +457,10 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED enabled through +gcs. */ #define TARGET_GCS AARCH64_HAVE_ISA (GCS) +/* Floating Point Absolute Maximum/Minimum extension instructions are + enabled through +faminmax. */ +#define TARGET_FAMINMAX AARCH64_HAVE_ISA (FAMINMAX) + /* Prefer different predicate registers for the output of a predicated operation over re-using an existing input predicate. 
*/ #define TARGET_SVE_PRED_CLOBBER (TARGET_SVE \ diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 20a318e023b..17ac5e073aa 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -1057,6 +1057,8 @@ UNSPEC_BFCVTN2 ; Used in aarch64-simd.md. UNSPEC_BFCVT ; Used in aarch64-simd.md. UNSPEC_FCVTXN ; Used in aarch64-simd.md. + UNSPEC_FAMAX ; Used in aarch64-simd.md. + UNSPEC_FAMIN ; Used in aarch64-simd.md. ;; All used in aarch64-sve2.md UNSPEC_FCVTN @@ -4463,3 +4465,10 @@ (UNSPECV_SET_FPCR "fpcr")]) (define_int_attr bits_etype [(8 "b") (16 "h") (32 "s") (64 "d")]) + +;; Iterators and attributes for faminmax + +(define_int_iterator FAMINMAX_UNS [UNSPEC_FAMAX UNSPEC_FAMIN]) + +(define_int_attr faminmax_uns_op + [(UNSPEC_FAMAX "famax") (UNSPEC_FAMIN "famin")]) diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md index 9527bdb9e87..d8de9dbc9d1 100644 --- a/gcc/config/arm/types.md +++ b/gcc/config/arm/types.md @@ -492,6 +492,8 @@ ; neon_fp_reduc_minmax_s_q ; neon_fp_reduc_minmax_d ; neon_fp_reduc_minmax_d_q +; neon_fp_aminmax +; neon_fp_aminmax_q ; neon_fp_cvt_narrow_s_q ; neon_fp_cvt_narrow_d_q ; neon_fp_cvt_widen_h @@ -1044,6 +1046,8 @@ neon_fp_reduc_minmax_d,\ neon_fp_reduc_minmax_d_q,\ \ + neon_fp_aminmax,\ + neon_fp_aminmax_q,\ neon_fp_cvt_narrow_s_q,\ neon_fp_cvt_narrow_d_q,\ neon_fp_cvt_widen_h,\ @@ -1264,6 +1268,8 @@ neon_fp_reduc_add_d_q, neon_fp_reduc_minmax_s, neon_fp_reduc_minmax_s_q, neon_fp_reduc_minmax_d,\ neon_fp_reduc_minmax_d_q,\ + neon_fp_aminmax, neon_fp_aminmax_q,\ + neon_fp_aminmax, neon_fp_aminmax_q,\ neon_fp_cvt_narrow_s_q, neon_fp_cvt_narrow_d_q,\ neon_fp_cvt_widen_h, neon_fp_cvt_widen_s, neon_fp_to_int_s,\ neon_fp_to_int_s_q, neon_int_to_fp_s, neon_int_to_fp_s_q,\ diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 32b772d2a8a..2c509f62d98 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -21865,6 +21865,8 @@ Enable support for Armv8.9-a/9.4-a 
translation hardening extension. Enable the RCpc3 (Release Consistency) extension. @item fp8 Enable the fp8 (8-bit floating point) extension. +@item faminmax +Enable the Floating Point Absolute Maximum/Minimum extension. @end table diff --git a/gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins-no-flag.c b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins-no-flag.c new file mode 100644 index 00000000000..63ed1508c23 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins-no-flag.c @@ -0,0 +1,10 @@ +/* { dg-do assemble} */ +/* { dg-additional-options "-march=armv9-a" } */ + +#include "arm_neon.h" + +void +test (float32x4_t a, float32x4_t b) +{ + vamaxq_f32 (a, b); /* { dg-error {ACLE function 'vamaxq_f32' requires ISA extension 'faminmax'} } */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins.c b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins.c new file mode 100644 index 00000000000..7e4f3eba81a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-builtins.c @@ -0,0 +1,115 @@ +/* { dg-do assemble} */ +/* { dg-additional-options "-O3 -march=armv9-a+faminmax" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "arm_neon.h" + +/* +** test_vamax_f16: +** famax v0.4h, v0.4h, v1.4h +** ret +*/ +float16x4_t +test_vamax_f16 (float16x4_t a, float16x4_t b) +{ + return vamax_f16 (a, b); +} + +/* +** test_vamaxq_f16: +** famax v0.8h, v0.8h, v1.8h +** ret +*/ +float16x8_t +test_vamaxq_f16 (float16x8_t a, float16x8_t b) +{ + return vamaxq_f16 (a, b); +} + +/* +** test_vamax_f32: +** famax v0.2s, v0.2s, v1.2s +** ret +*/ +float32x2_t +test_vamax_f32 (float32x2_t a, float32x2_t b) +{ + return vamax_f32 (a, b); +} + +/* +** test_vamaxq_f32: +** famax v0.4s, v0.4s, v1.4s +** ret +*/ +float32x4_t +test_vamaxq_f32 (float32x4_t a, float32x4_t b) +{ + return vamaxq_f32 (a, b); +} + +/* +** test_vamaxq_f64: +** famax v0.2d, v0.2d, v1.2d +** ret +*/ +float64x2_t +test_vamaxq_f64 (float64x2_t a, 
float64x2_t b) +{ + return vamaxq_f64 (a, b); +} + +/* +** test_vamin_f16: +** famin v0.4h, v0.4h, v1.4h +** ret +*/ +float16x4_t +test_vamin_f16 (float16x4_t a, float16x4_t b) +{ + return vamin_f16 (a, b); +} + +/* +** test_vaminq_f16: +** famin v0.8h, v0.8h, v1.8h +** ret +*/ +float16x8_t +test_vaminq_f16 (float16x8_t a, float16x8_t b) +{ + return vaminq_f16 (a, b); +} + +/* +** test_vamin_f32: +** famin v0.2s, v0.2s, v1.2s +** ret +*/ +float32x2_t +test_vamin_f32 (float32x2_t a, float32x2_t b) +{ + return vamin_f32 (a, b); +} + +/* +** test_vaminq_f32: +** famin v0.4s, v0.4s, v1.4s +** ret +*/ +float32x4_t +test_vaminq_f32 (float32x4_t a, float32x4_t b) +{ + return vaminq_f32 (a, b); +} + +/* +** test_vaminq_f64: +** famin v0.2d, v0.2d, v1.2d +** ret +*/ +float64x2_t +test_vaminq_f64 (float64x2_t a, float64x2_t b) +{ + return vaminq_f64 (a, b); +} From patchwork Fri Aug 30 11:16:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saurabh Jha X-Patchwork-Id: 1978939 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=O9FXM5HG; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=O9FXM5HG; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) 
CC: Saurabh Jha
Subject: [PATCH v7 2/2] aarch64: Add codegen support for AdvSIMD faminmax
Date: Fri, 30 Aug 2024 12:16:26 +0100
Message-ID: <20240830111626.70300-3-saurabh.jha@arm.com>
X-Mailer: git-send-email 2.43.2
In-Reply-To: <20240830111626.70300-1-saurabh.jha@arm.com>
References: <20240830111626.70300-1-saurabh.jha@arm.com>
List-Id: Gcc-patches mailing list

The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and
mandatory from Armv9.5-a. It introduces instructions for computing the
element-wise floating point absolute maximum and minimum of two
vectors.

This patch adds code generation support for famax and famin in terms
of existing RTL operators.

famax/famin is equivalent to first taking abs of the operands and then
taking smax/smin of the results:

  famax/famin (a, b) = smax/smin (abs (a), abs (b))

This fusion of operators is only possible when the faminmax extension
is enabled, for example with -march=armv9-a+faminmax. The -ffast-math
flag is also required; without it, a statement like

  c[i] = __builtin_fmaxf16 (a[i], b[i]);

is expanded to RTL as UNSPEC_FMAXNM instead of smax (likewise for
smin), so the fused pattern does not match.

This code generation is only available at -O2 or -O3, as that is when
auto-vectorization is enabled.

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md
	(*aarch64_faminmax_fused): Instruction pattern for faminmax
	codegen.
	* config/aarch64/iterators.md: Attribute for faminmax codegen.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/simd/faminmax-codegen-no-flag.c: New test.
	* gcc.target/aarch64/simd/faminmax-codegen.c: New test.
---
 gcc/config/aarch64/aarch64-simd.md            |  10 +
 gcc/config/aarch64/iterators.md               |   3 +
 .../aarch64/simd/faminmax-codegen-no-flag.c   | 217 ++++++++++++++++++
 .../aarch64/simd/faminmax-codegen.c           | 197 ++++++++++++++++
 4 files changed, 427 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen-no-flag.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 7542c81ed91..8973cade488 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -9921,3 +9921,13 @@
   "<faminmax_uns_op>\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
   [(set_attr "type" "neon_fp_aminmax")]
 )
+
+(define_insn "*aarch64_faminmax_fused"
+  [(set (match_operand:VHSDF 0 "register_operand" "=w")
+        (FMAXMIN:VHSDF
+          (abs:VHSDF (match_operand:VHSDF 1 "register_operand" "w"))
+          (abs:VHSDF (match_operand:VHSDF 2 "register_operand" "w"))))]
+  "TARGET_FAMINMAX"
+  "<faminmax_op>\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
+  [(set_attr "type" "neon_fp_aminmax")]
+)
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 17ac5e073aa..c2fcd18306e 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -4472,3 +4472,6 @@
 (define_int_attr faminmax_uns_op
   [(UNSPEC_FAMAX "famax") (UNSPEC_FAMIN "famin")])
+
+(define_code_attr faminmax_op
+  [(smax "famax") (smin "famin")])
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen-no-flag.c b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen-no-flag.c
new file mode 100644
index 00000000000..d77f5a5d19f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen-no-flag.c
@@ -0,0 +1,217 @@
+/* { dg-do assemble} */
+/* { dg-additional-options "-O3 -ffast-math -march=armv9-a" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_neon.h"
+
+#pragma GCC target "+nosve"
+
+/*
+** test_vamax_f16:
+** fabs v1.4h, v1.4h
+** fabs v0.4h, v0.4h
+** fmaxnm v0.4h, v0.4h, v1.4h
+** ret
+*/
+float16x4_t
+test_vamax_f16 (float16x4_t a, float16x4_t b)
+{
+  int i;
+  float16x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fmaxf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamaxq_f16:
+** fabs v1.8h, v1.8h
+** fabs v0.8h, v0.8h
+** fmaxnm v0.8h, v0.8h, v1.8h
+** ret
+*/
+float16x8_t
+test_vamaxq_f16 (float16x8_t a, float16x8_t b)
+{
+  int i;
+  float16x8_t c;
+
+  for (i = 0; i < 8; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fmaxf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamax_f32:
+** fabs v1.2s, v1.2s
+** fabs v0.2s, v0.2s
+** fmaxnm v0.2s, v0.2s, v1.2s
+** ret
+*/
+float32x2_t
+test_vamax_f32 (float32x2_t a, float32x2_t b)
+{
+  int i;
+  float32x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fmaxf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamaxq_f32:
+** fabs v1.4s, v1.4s
+** fabs v0.4s, v0.4s
+** fmaxnm v0.4s, v0.4s, v1.4s
+** ret
+*/
+float32x4_t
+test_vamaxq_f32 (float32x4_t a, float32x4_t b)
+{
+  int i;
+  float32x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fmaxf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamaxq_f64:
+** fabs v1.2d, v1.2d
+** fabs v0.2d, v0.2d
+** fmaxnm v0.2d, v0.2d, v1.2d
+** ret
+*/
+float64x2_t
+test_vamaxq_f64 (float64x2_t a, float64x2_t b)
+{
+  int i;
+  float64x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf64 (a[i]);
+    b[i] = __builtin_fabsf64 (b[i]);
+    c[i] = __builtin_fmaxf64 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamin_f16:
+** fabs v1.4h, v1.4h
+** fabs v0.4h, v0.4h
+** fminnm v0.4h, v0.4h, v1.4h
+** ret
+*/
+float16x4_t
+test_vamin_f16 (float16x4_t a, float16x4_t b)
+{
+  int i;
+  float16x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fminf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vaminq_f16:
+** fabs v1.8h, v1.8h
+** fabs v0.8h, v0.8h
+** fminnm v0.8h, v0.8h, v1.8h
+** ret
+*/
+float16x8_t
+test_vaminq_f16 (float16x8_t a, float16x8_t b)
+{
+  int i;
+  float16x8_t c;
+
+  for (i = 0; i < 8; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fminf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamin_f32:
+** fabs v1.2s, v1.2s
+** fabs v0.2s, v0.2s
+** fminnm v0.2s, v0.2s, v1.2s
+** ret
+*/
+float32x2_t
+test_vamin_f32 (float32x2_t a, float32x2_t b)
+{
+  int i;
+  float32x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fminf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vaminq_f32:
+** fabs v1.4s, v1.4s
+** fabs v0.4s, v0.4s
+** fminnm v0.4s, v0.4s, v1.4s
+** ret
+*/
+float32x4_t
+test_vaminq_f32 (float32x4_t a, float32x4_t b)
+{
+  int i;
+  float32x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fminf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vaminq_f64:
+** fabs v1.2d, v1.2d
+** fabs v0.2d, v0.2d
+** fminnm v0.2d, v0.2d, v1.2d
+** ret
+*/
+float64x2_t
+test_vaminq_f64 (float64x2_t a, float64x2_t b)
+{
+  int i;
+  float64x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf64 (a[i]);
+    b[i] = __builtin_fabsf64 (b[i]);
+    c[i] = __builtin_fminf64 (a[i], b[i]);
+  }
+  return c;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen.c b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen.c
new file mode 100644
index 00000000000..971386c0bf0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen.c
@@ -0,0 +1,197 @@
+/* { dg-do assemble} */
+/* { dg-additional-options "-O2 -ffast-math -march=armv9-a+faminmax" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_neon.h"
+
+#pragma GCC target "+nosve"
+
+/*
+** test_vamax_f16:
+** famax v0.4h, v1.4h, v0.4h
+** ret
+*/
+float16x4_t
+test_vamax_f16 (float16x4_t a, float16x4_t b)
+{
+  int i;
+  float16x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fmaxf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamaxq_f16:
+** famax v0.8h, v1.8h, v0.8h
+** ret
+*/
+float16x8_t
+test_vamaxq_f16 (float16x8_t a, float16x8_t b)
+{
+  int i;
+  float16x8_t c;
+
+  for (i = 0; i < 8; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fmaxf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamax_f32:
+** famax v0.2s, v1.2s, v0.2s
+** ret
+*/
+float32x2_t
+test_vamax_f32 (float32x2_t a, float32x2_t b)
+{
+  int i;
+  float32x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fmaxf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamaxq_f32:
+** famax v0.4s, v1.4s, v0.4s
+** ret
+*/
+float32x4_t
+test_vamaxq_f32 (float32x4_t a, float32x4_t b)
+{
+  int i;
+  float32x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fmaxf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamaxq_f64:
+** famax v0.2d, v1.2d, v0.2d
+** ret
+*/
+float64x2_t
+test_vamaxq_f64 (float64x2_t a, float64x2_t b)
+{
+  int i;
+  float64x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf64 (a[i]);
+    b[i] = __builtin_fabsf64 (b[i]);
+    c[i] = __builtin_fmaxf64 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamin_f16:
+** famin v0.4h, v1.4h, v0.4h
+** ret
+*/
+float16x4_t
+test_vamin_f16 (float16x4_t a, float16x4_t b)
+{
+  int i;
+  float16x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fminf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vaminq_f16:
+** famin v0.8h, v1.8h, v0.8h
+** ret
+*/
+float16x8_t
+test_vaminq_f16 (float16x8_t a, float16x8_t b)
+{
+  int i;
+  float16x8_t c;
+
+  for (i = 0; i < 8; ++i) {
+    a[i] = __builtin_fabsf16 (a[i]);
+    b[i] = __builtin_fabsf16 (b[i]);
+    c[i] = __builtin_fminf16 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vamin_f32:
+** famin v0.2s, v1.2s, v0.2s
+** ret
+*/
+float32x2_t
+test_vamin_f32 (float32x2_t a, float32x2_t b)
+{
+  int i;
+  float32x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fminf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vaminq_f32:
+** famin v0.4s, v1.4s, v0.4s
+** ret
+*/
+float32x4_t
+test_vaminq_f32 (float32x4_t a, float32x4_t b)
+{
+  int i;
+  float32x4_t c;
+
+  for (i = 0; i < 4; ++i) {
+    a[i] = __builtin_fabsf32 (a[i]);
+    b[i] = __builtin_fabsf32 (b[i]);
+    c[i] = __builtin_fminf32 (a[i], b[i]);
+  }
+  return c;
+}
+
+/*
+** test_vaminq_f64:
+** famin v0.2d, v1.2d, v0.2d
+** ret
+*/
+float64x2_t
+test_vaminq_f64 (float64x2_t a, float64x2_t b)
+{
+  int i;
+  float64x2_t c;
+
+  for (i = 0; i < 2; ++i) {
+    a[i] = __builtin_fabsf64 (a[i]);
+    b[i] = __builtin_fabsf64 (b[i]);
+    c[i] = __builtin_fminf64 (a[i], b[i]);
+  }
+  return c;
+}
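
The identity in the commit message, famax/famin (a, b) = smax/smin (abs (a), abs (b)), can be illustrated with a portable scalar reference. This is only a sketch and not part of the patch: the names abs_lane, ref_famax_lane, ref_famin_lane, ref_famax and ref_famin are hypothetical helpers, and NaN and signed-zero behaviour is deliberately ignored, in line with the -ffast-math requirement described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: absolute value of one lane.  Signed zero is
   not normalised, which is acceptable under fast-math assumptions.  */
float
abs_lane (float x)
{
  return x < 0.0f ? -x : x;
}

/* One lane of the fused operation: smax (abs (a), abs (b)).  */
float
ref_famax_lane (float a, float b)
{
  float x = abs_lane (a), y = abs_lane (b);
  return x > y ? x : y;
}

/* One lane of the fused operation: smin (abs (a), abs (b)).  */
float
ref_famin_lane (float a, float b)
{
  float x = abs_lane (a), y = abs_lane (b);
  return x < y ? x : y;
}

/* Element-wise absolute maximum over N lanes, the same loop shape
   that the codegen tests rely on the vectorizer to recognise.  */
void
ref_famax (float *dst, const float *a, const float *b, size_t n)
{
  assert (dst && a && b);
  for (size_t i = 0; i < n; ++i)
    dst[i] = ref_famax_lane (a[i], b[i]);
}

/* Element-wise absolute minimum over N lanes.  */
void
ref_famin (float *dst, const float *a, const float *b, size_t n)
{
  assert (dst && a && b);
  for (size_t i = 0; i < n; ++i)
    dst[i] = ref_famin_lane (a[i], b[i]);
}
```

A reference like this could be used to cross-check results from the intrinsics or from vectorized loops on hardware that implements FEAT_FAMINMAX; it makes no claim about how the instructions treat NaNs.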