From patchwork Tue Oct 1 12:09:33 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Saurabh Jha
X-Patchwork-Id: 1991443
From:
To:
CC: , , Saurabh Jha
Subject: [PATCH v2 3/3] aarch64: Add codegen support for SVE2 faminmax
Date: Tue, 1 Oct 2024 13:09:33 +0100
Message-ID: <20241001120933.1269122-4-saurabh.jha@arm.com>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20241001120933.1269122-1-saurabh.jha@arm.com>
References: <20241001120933.1269122-1-saurabh.jha@arm.com>
X-BeenThere: gcc-patches@gcc.gnu.org
List-Id: Gcc-patches mailing list

The AArch64 FEAT_FAMINMAX extension introduces instructions for
computing the floating-point absolute maximum and minimum of two
vectors element-wise.

This patch adds code generation for famax and famin in terms of existing
unspecs. With this patch:
1. famax is matched as UNSPEC_COND_SMAX applied to the absolute values
   (UNSPEC_COND_FABS) of the two operands.
2.
famin is matched as UNSPEC_COND_SMIN applied to the absolute values
   (UNSPEC_COND_FABS) of the two operands.

This fusion of operators is only possible when the faminmax and sve
features are enabled, for example with -march=armv9-a+faminmax+sve.
The -ffast-math flag is also required; it is what allows the compiler
to use UNSPEC_COND_SMAX and UNSPEC_COND_SMIN.

This code generation is only available at -O2 or -O3, as that is when
auto-vectorization is enabled.

gcc/ChangeLog:

	* config/aarch64/aarch64-sve2.md
	(*aarch64_pred_faminmax_fused): Instruction pattern for faminmax
	codegen.
	* config/aarch64/iterators.md: Iterator and attribute for
	faminmax codegen.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/faminmax_1.c: New test.
	* gcc.target/aarch64/sve/faminmax_2.c: New test.
---
 gcc/config/aarch64/aarch64-sve2.md            |  31 ++++
 gcc/config/aarch64/iterators.md               |   6 +
 .../gcc.target/aarch64/sve/faminmax_1.c       |  85 ++++++++++
 .../gcc.target/aarch64/sve/faminmax_2.c       | 154 ++++++++++++++++++
 4 files changed, 276 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c

diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md
index 972b03a4fef..6a8e940e16d 100644
--- a/gcc/config/aarch64/aarch64-sve2.md
+++ b/gcc/config/aarch64/aarch64-sve2.md
@@ -2467,6 +2467,37 @@
   [(set_attr "movprfx" "yes")]
 )
 
+;; -------------------------------------------------------------------------
+;; -- [FP] Absolute maximum and minimum
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - FAMAX
+;; - FAMIN
+;; -------------------------------------------------------------------------
+;; Predicated floating-point absolute maximum and minimum.
+(define_insn "*aarch64_pred_faminmax_fused"
+  [(set (match_operand:SVE_FULL_F 0 "register_operand")
+        (unspec:SVE_FULL_F
+          [(match_operand:<VPRED> 1 "register_operand")
+           (match_operand:SI 4 "aarch64_sve_gp_strictness")
+           (unspec:SVE_FULL_F
+             [(match_operand 5)
+              (const_int SVE_RELAXED_GP)
+              (match_operand:SVE_FULL_F 2 "register_operand")]
+             UNSPEC_COND_FABS)
+           (unspec:SVE_FULL_F
+             [(match_operand 6)
+              (const_int SVE_RELAXED_GP)
+              (match_operand:SVE_FULL_F 3 "register_operand")]
+             UNSPEC_COND_FABS)]
+          SVE_COND_FP_SMAXMIN))]
+  "TARGET_SVE_FAMINMAX"
+  {@ [ cons: =0 , 1   , 2  , 3 ; attrs: movprfx ]
+     [ w        , Upl , %0 , w ; *              ] <faminmax_cond_uns_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
+     [ ?&w      , Upl , w  , w ; yes            ] movprfx\t%0, %2\;<faminmax_cond_uns_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
+  }
+)
+
 ;; =========================================================================
 ;; == Complex arithmetic
 ;; =========================================================================
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index d3a457fc6d9..e9adb4209da 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -3143,6 +3143,9 @@
                                          UNSPEC_COND_FMIN
                                          UNSPEC_COND_FMINNM])
 
+(define_int_iterator SVE_COND_FP_SMAXMIN [UNSPEC_COND_SMAX
+                                          UNSPEC_COND_SMIN])
+
 (define_int_iterator SVE_COND_FP_TERNARY [UNSPEC_COND_FMLA
                                           UNSPEC_COND_FMLS
                                           UNSPEC_COND_FNMLA
@@ -4503,6 +4506,9 @@
 
 (define_int_iterator FAMINMAX_UNS [UNSPEC_FAMAX UNSPEC_FAMIN])
 
+(define_int_attr faminmax_cond_uns_op
+  [(UNSPEC_COND_SMAX "famax") (UNSPEC_COND_SMIN "famin")])
+
 (define_int_attr faminmax_uns_op
   [(UNSPEC_FAMAX "famax") (UNSPEC_FAMIN "famin")])
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c
new file mode 100644
index 00000000000..bdf077ab2f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_1.c
@@ -0,0 +1,85 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3 -ffast-math" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_sve.h"
+
+#pragma GCC target "+sve+faminmax"
+
+#define TEST_FAMAX(TYPE)                          \
+  void fn_famax_##TYPE (TYPE * restrict a,        \
+                        TYPE * restrict b,        \
+                        TYPE * restrict c,        \
+                        int n) {                  \
+    for (int i = 0; i < n; i++) {                 \
+      TYPE temp1 = __builtin_fabs (a[i]);         \
+      TYPE temp2 = __builtin_fabs (b[i]);         \
+      c[i] = __builtin_fmax (temp1, temp2);       \
+    }                                             \
+  }                                               \
+
+#define TEST_FAMIN(TYPE)                          \
+  void fn_famin_##TYPE (TYPE * restrict a,        \
+                        TYPE * restrict b,        \
+                        TYPE * restrict c,        \
+                        int n) {                  \
+    for (int i = 0; i < n; i++) {                 \
+      TYPE temp1 = __builtin_fabs (a[i]);         \
+      TYPE temp2 = __builtin_fabs (b[i]);         \
+      c[i] = __builtin_fmin (temp1, temp2);       \
+    }                                             \
+  }                                               \
+
+/*
+** fn_famax_float16_t:
+**	...
+**	famax	z30.h, p6/m, z30.h, z31.h
+**	...
+**	ret
+*/
+TEST_FAMAX (float16_t)
+
+/*
+** fn_famax_float32_t:
+**	...
+**	famax	z30.s, p6/m, z30.s, z31.s
+**	...
+**	ret
+*/
+TEST_FAMAX (float32_t)
+
+/*
+** fn_famax_float64_t:
+**	...
+**	famax	z30.d, p6/m, z30.d, z31.d
+**	...
+**	ret
+*/
+TEST_FAMAX (float64_t)
+
+/*
+** fn_famin_float16_t:
+**	...
+**	famin	z30.h, p6/m, z30.h, z31.h
+**	...
+**	ret
+*/
+TEST_FAMIN (float16_t)
+
+/*
+** fn_famin_float32_t:
+**	...
+**	famin	z30.s, p6/m, z30.s, z31.s
+**	...
+**	ret
+*/
+TEST_FAMIN (float32_t)
+
+/*
+** fn_famin_float64_t:
+**	...
+**	famin	z30.d, p6/m, z30.d, z31.d
+**	...
+**	ret
+*/
+TEST_FAMIN (float64_t)
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c
new file mode 100644
index 00000000000..26396979389
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/faminmax_2.c
@@ -0,0 +1,154 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3 -ffast-math" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_sve.h"
+
+#pragma GCC target "+sve+faminmax"
+
+#define TEST_WITH_SVMAX(TYPE)                                 \
+  TYPE fn_fmax_##TYPE (TYPE x, TYPE y) {                      \
+    svbool_t pg = svptrue_b8();                               \
+    return svmax_x(pg, svabs_x(pg, x), svabs_x(pg, y));       \
+  }                                                           \
+
+#define TEST_WITH_SVMAXNM(TYPE)                               \
+  TYPE fn_fmaxnm_##TYPE (TYPE x, TYPE y) {                    \
+    svbool_t pg = svptrue_b8();                               \
+    return svmaxnm_x(pg, svabs_x(pg, x), svabs_x(pg, y));     \
+  }                                                           \
+
+#define TEST_WITH_SVMIN(TYPE)                                 \
+  TYPE fn_fmin_##TYPE (TYPE x, TYPE y) {                      \
+    svbool_t pg = svptrue_b8();                               \
+    return svmin_x(pg, svabs_x(pg, x), svabs_x(pg, y));       \
+  }                                                           \
+
+#define TEST_WITH_SVMINNM(TYPE)                               \
+  TYPE fn_fminnm_##TYPE (TYPE x, TYPE y) {                    \
+    svbool_t pg = svptrue_b8();                               \
+    return svminnm_x(pg, svabs_x(pg, x), svabs_x(pg, y));     \
+  }                                                           \
+
+/*
+** fn_fmax_svfloat16_t:
+**	ptrue	p3.b, all
+**	fabs	z0.h, p3/m, z0.h
+**	fabs	z1.h, p3/m, z1.h
+**	fmax	z0.h, p3/m, z0.h, z1.h
+**	ret
+*/
+TEST_WITH_SVMAX (svfloat16_t)
+
+/*
+** fn_fmax_svfloat32_t:
+**	ptrue	p3.b, all
+**	fabs	z0.s, p3/m, z0.s
+**	fabs	z1.s, p3/m, z1.s
+**	fmax	z0.s, p3/m, z0.s, z1.s
+**	ret
+*/
+TEST_WITH_SVMAX (svfloat32_t)
+
+/*
+** fn_fmax_svfloat64_t:
+**	ptrue	p3.b, all
+**	fabs	z0.d, p3/m, z0.d
+**	fabs	z1.d, p3/m, z1.d
+**	fmax	z0.d, p3/m, z0.d, z1.d
+**	ret
+*/
+TEST_WITH_SVMAX (svfloat64_t)
+
+/*
+** fn_fmaxnm_svfloat16_t:
+**	ptrue	p3.b, all
+**	fabs	z0.h, p3/m, z0.h
+**	fabs	z1.h, p3/m, z1.h
+**	fmaxnm	z0.h, p3/m, z0.h, z1.h
+**	ret
+*/
+TEST_WITH_SVMAXNM (svfloat16_t)
+
+/*
+** fn_fmaxnm_svfloat32_t:
+**	ptrue	p3.b, all
+**	fabs	z0.s, p3/m, z0.s
+**	fabs	z1.s, p3/m, z1.s
+**	fmaxnm	z0.s, p3/m, z0.s, z1.s
+**	ret
+*/
+TEST_WITH_SVMAXNM (svfloat32_t)
+
+/*
+** fn_fmaxnm_svfloat64_t:
+**	ptrue	p3.b, all
+**	fabs	z0.d, p3/m, z0.d
+**	fabs	z1.d, p3/m, z1.d
+**	fmaxnm	z0.d, p3/m, z0.d, z1.d
+**	ret
+*/
+TEST_WITH_SVMAXNM (svfloat64_t)
+
+/*
+** fn_fmin_svfloat16_t:
+**	ptrue	p3.b, all
+**	fabs	z0.h, p3/m, z0.h
+**	fabs	z1.h, p3/m, z1.h
+**	fmin	z0.h, p3/m, z0.h, z1.h
+**	ret
+*/
+TEST_WITH_SVMIN (svfloat16_t)
+
+/*
+** fn_fmin_svfloat32_t:
+**	ptrue	p3.b, all
+**	fabs	z0.s, p3/m, z0.s
+**	fabs	z1.s, p3/m, z1.s
+**	fmin	z0.s, p3/m, z0.s, z1.s
+**	ret
+*/
+TEST_WITH_SVMIN (svfloat32_t)
+
+/*
+** fn_fmin_svfloat64_t:
+**	ptrue	p3.b, all
+**	fabs	z0.d, p3/m, z0.d
+**	fabs	z1.d, p3/m, z1.d
+**	fmin	z0.d, p3/m, z0.d, z1.d
+**	ret
+*/
+TEST_WITH_SVMIN (svfloat64_t)
+
+/*
+** fn_fminnm_svfloat16_t:
+**	ptrue	p3.b, all
+**	fabs	z0.h, p3/m, z0.h
+**	fabs	z1.h, p3/m, z1.h
+**	fminnm	z0.h, p3/m, z0.h, z1.h
+**	ret
+*/
+TEST_WITH_SVMINNM (svfloat16_t)
+
+/*
+** fn_fminnm_svfloat32_t:
+**	ptrue	p3.b, all
+**	fabs	z0.s, p3/m, z0.s
+**	fabs	z1.s, p3/m, z1.s
+**	fminnm	z0.s, p3/m, z0.s, z1.s
+**	ret
+*/
+TEST_WITH_SVMINNM (svfloat32_t)
+
+/*
+** fn_fminnm_svfloat64_t:
+**	ptrue	p3.b, all
+**	fabs	z0.d, p3/m, z0.d
+**	fabs	z1.d, p3/m, z1.d
+**	fminnm	z0.d, p3/m, z0.d, z1.d
+**	ret
+*/
+TEST_WITH_SVMINNM (svfloat64_t)
+
+/* { dg-final { scan-assembler-not {\tfamax\t} } } */
+/* { dg-final { scan-assembler-not {\tfamin\t} } } */
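
For reference, the per-element operation that the fused pattern targets can be
sketched in plain C. This is an illustrative model only and is not part of the
patch: the names abs_ref, famax_ref, famin_ref, famax_loop_ref and
check_faminmax_ref are hypothetical, and the model deliberately ignores NaNs
and signed zeros, which is roughly the latitude -ffast-math grants. It shows
that the absolute value is taken of each operand first and the maximum or
minimum is applied afterwards, the same shape as the loops in faminmax_1.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: |x| without libm.  Ignores NaNs and signed
   zeros, an assumption in the spirit of -ffast-math.  */
double abs_ref (double x)
{
  return x < 0 ? -x : x;
}

/* Reference model for one FAMAX lane: the larger of |a| and |b|.  */
double famax_ref (double a, double b)
{
  double x = abs_ref (a);
  double y = abs_ref (b);
  return x > y ? x : y;
}

/* Reference model for one FAMIN lane: the smaller of |a| and |b|.  */
double famin_ref (double a, double b)
{
  double x = abs_ref (a);
  double y = abs_ref (b);
  return x < y ? x : y;
}

/* Element-wise form, mirroring the loop shape used in faminmax_1.c.  */
void famax_loop_ref (const double *restrict a, const double *restrict b,
                     double *restrict c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    c[i] = famax_ref (a[i], b[i]);
}

/* Small self-check of the scalar models.  */
void check_faminmax_ref (void)
{
  assert (famax_ref (-3.0, 2.0) == 3.0);
  assert (famin_ref (-3.0, 2.0) == 2.0);
}
```

Note that taking the maximum first and the absolute value afterwards would give
a different answer for inputs such as (-3.0, 2.0): max is 2.0, whereas
famax_ref returns 3.0.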