From patchwork Tue Jul 30 09:10:31 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 1966409
From: Christophe Lyon
To: , , ,
CC: Christophe Lyon
Subject: [PATCH v3] arm: [MVE intrinsics] Improve vdupq_n implementation
Date: Tue, 30 Jul 2024 11:10:31 +0200
Message-ID: <20240730091031.59487-1-christophe.lyon@arm.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <7735bc94-1b2c-4bcb-9436-7341ea4b011f@arm.com>
References: <7735bc94-1b2c-4bcb-9436-7341ea4b011f@arm.com>
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com]
X-MS-Exchange-CrossTenant-AuthSource: DB1PEPF000509F7.eurprd02.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: DU0PR08MB7637
X-Spam-Status: No, score=-11.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FORGED_SPF_HELO, GIT_PATCH_0, KAM_SHORT, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

Hi,

v3 of patch 2/2 uses your suggested fix about using extra_cost as an
adjustment. I did not introduce the ARM_INSN_COST macro you suggested
because it seems there's only a handful (maybe two) of cases where it
could be used, and I thought it wouldn't make the code really easier
to understand.

Since you already approved patch 1/2, I'm not reposting it.

Thanks,

Christophe

This patch makes the non-predicated vdupq_n MVE intrinsics use
vec_duplicate rather than an unspec. This enables the compiler to
generate better code sequences (for instance using vmov when
possible).

The patch renames the existing mve_vdup pattern into
@mve_vdupq_n, and removes the now useless
@mve_q_n_f and @mve_q_n_ ones.

As a side-effect, it needs to update the mve_unpredicated_insn
predicates in @mve_q_m_n_ and @mve_q_m_n_f.

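As a rough illustration of what vec_duplicate expresses, the portable C
sketch below models the lane-wise result of a non-predicated vdupq_n:
every lane receives the same scalar. The type model_int32x4_t and the
functions model_vdupq_n_s32 and model_vdupq_n_s32_check are hypothetical
stand-ins made up for this note; they are not the arm_mve.h definitions
and say nothing about how GCC represents the operation internally.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4 x 32-bit vector used only for this model; it is not
   the int32x4_t type provided by arm_mve.h.  */
typedef struct
{
  int32_t lane[4];
} model_int32x4_t;

/* Model of a broadcast: copy the scalar A into every lane, which is
   the behaviour the RTL code vec_duplicate describes.  */
model_int32x4_t
model_vdupq_n_s32 (int32_t a)
{
  model_int32x4_t r;
  for (int i = 0; i < 4; i++)
    r.lane[i] = a;
  return r;
}

/* Small self-check: all lanes must hold the broadcast value.  */
void
model_vdupq_n_s32_check (void)
{
  model_int32x4_t v = model_vdupq_n_s32 (1);
  for (int i = 0; i < 4; i++)
    assert (v.lane[i] == 1);
}
```

Because a broadcast of a known constant is fully described by this
lane-wise rule, a compiler that sees it in generic form (rather than as
an opaque unspec) has the information it needs to pick a different
instruction for constant operands.
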
Using vec_duplicate means the compiler is now able to use vmov in the
tests with an immediate argument in vdupq_n_[su]{8,16,32}.c:
	vmov.i8 q0,#0x1

However, this is only possible when the immediate has a suitable value
(MVE encoding constraints, see imm_for_neon_mov_operand predicate).

Provided we adjust the cost computations in arm_rtx_costs_internal(),
when the immediate does not meet the vmov constraints, we now generate:
	mov r0, #imm
	vdup.xx q0,r0

or
	ldr r0, .L4
	vdup.32 q0,r0
in the f32 case (with 1.1 as immediate).

Without the cost adjustment, we would generate:
	vldr.64 d0, .L4
	vldr.64 d1, .L4+8
and an associated literal pool entry.

Regarding the testsuite updates:
--------------------------------
* The signed versions of vdupq_* tests lack a version with an immediate
  argument. This patch adds them, similar to what we already have for
  vdupq_n_u*.c tests.

* Code generation for different immediate values is checked with the
  new tests this patch introduces. Note there's no need for s8/u8
  tests because 8-bit immediates always comply with
  imm_for_neon_mov_operand.

* We can remove xfail from vcmp*f tests since we now generate:
	movw r3, #15462
	vcmp.f16 eq, q0, r3

  instead of the previous:
	vldr.64 d6, .L5
	vldr.64 d7, .L5+8
	vcmp.f16 eq, q0, q3

Tested on arm-linux-gnueabihf and arm-none-eabi with no regression.

2024-07-02  Jolen Li
	    Christophe Lyon

	gcc/
	* config/arm/arm-mve-builtins-base.cc (vdupq_impl): New class.
	(vdupq): Use new implementation.
	* config/arm/arm.cc (arm_rtx_costs_internal): Handle HFmode
	for CONST_DOUBLE. Update costing for CONST_VECTOR.
	* config/arm/arm_mve_builtins.def: Merge vdupq_n_f, vdupq_n_s
	and vdupq_n_u into vdupq_n.
	* config/arm/mve.md (mve_vdup): Rename into ...
	(@mve_vdupq_n): ... this.
	(@mve_q_n_f): Delete.
	(@mve_q_n_): Delete.
	(@mve_q_m_n_): Update mve_unpredicated_insn attribute.
	(@mve_q_m_n_f): Likewise.

	gcc/testsuite/
	* gcc.target/arm/mve/intrinsics/vdupq_n_u8.c (foo1): Update expected code.
	* gcc.target/arm/mve/intrinsics/vdupq_n_u16.c (foo1): Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_n_u32.c (foo1): Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_n_s8.c: Add test with immediate argument.
	* gcc.target/arm/mve/intrinsics/vdupq_n_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_n_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_n_f16.c (foo1): Update expected code.
	* gcc.target/arm/mve/intrinsics/vdupq_n_f32.c (foo1): Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_m_n_s16.c: Add test with immediate argument.
	* gcc.target/arm/mve/intrinsics/vdupq_m_n_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_m_n_s8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_x_n_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_x_n_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_x_n_s8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_n_f32-2.c: New test.
	* gcc.target/arm/mve/intrinsics/vdupq_n_s16-2.c: New test.
	* gcc.target/arm/mve/intrinsics/vdupq_n_s32-2.c: New test.
	* gcc.target/arm/mve/intrinsics/vdupq_n_u16-2.c: New test.
	* gcc.target/arm/mve/intrinsics/vdupq_n_u32-2.c: New test.
	* gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16.c: Remove xfail.
	* gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpgeq_n_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpgeq_n_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpleq_n_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpleq_n_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpltq_n_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpltq_n_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpneq_n_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpneq_n_f32.c: Likewise.
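
For readers following the CONST_VECTOR costing change in
arm_rtx_costs_internal(), here is a minimal standalone sketch of the
decision it makes. It is only a model under stated assumptions: the
function model_const_vector_cost and its parameters are hypothetical,
valid_move_immediate stands for the whole existing condition (the
NEON/MVE target checks combined with simd_immediate_valid_for_move),
and loadd_extra_cost stands for extra_cost->ldst.loadd. The scaling
macro mirrors GCC's COSTS_N_INSNS, which multiplies by 4.

```c
#include <assert.h>
#include <stdbool.h>

/* Same scaling as GCC's COSTS_N_INSNS macro (N instructions -> N * 4).  */
#define MODEL_COSTS_N_INSNS(n) ((n) * 4)

/* Hypothetical model of the CONST_VECTOR case:
   - a constant that is valid as a move immediate costs one insn;
   - otherwise, on MVE, it costs two insns (two vldr.64), plus twice
     the per-load extra cost when optimizing for speed;
   - otherwise it keeps the previous flat cost of four insns.  */
int
model_const_vector_cost (bool valid_move_immediate, bool have_mve,
                         bool speed_p, int loadd_extra_cost)
{
  int cost;

  assert (loadd_extra_cost >= 0);

  if (valid_move_immediate)
    cost = MODEL_COSTS_N_INSNS (1);
  else if (have_mve)
    {
      cost = MODEL_COSTS_N_INSNS (2);
      if (speed_p)
        cost += loadd_extra_cost * 2;
    }
  else
    cost = MODEL_COSTS_N_INSNS (4);

  return cost;
}
```

With a made-up loadd_extra_cost of 3, a constant that is not a valid
move immediate costs 14 on MVE when optimizing for speed and 8 when
optimizing for size, versus 4 for an encodable immediate. The real
values depend on the tuning tables of the selected CPU.
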
--- gcc/config/arm/arm-mve-builtins-base.cc | 55 ++++++++++++++++++- gcc/config/arm/arm.cc | 10 +++- gcc/config/arm/arm_mve_builtins.def | 4 +- gcc/config/arm/mve.md | 41 +++----------- .../arm/mve/intrinsics/vcmpeqq_n_f16.c | 4 +- .../arm/mve/intrinsics/vcmpeqq_n_f32.c | 4 +- .../arm/mve/intrinsics/vcmpgeq_n_f16.c | 4 +- .../arm/mve/intrinsics/vcmpgeq_n_f32.c | 4 +- .../arm/mve/intrinsics/vcmpgtq_n_f16.c | 2 +- .../arm/mve/intrinsics/vcmpgtq_n_f32.c | 4 +- .../arm/mve/intrinsics/vcmpleq_n_f16.c | 4 +- .../arm/mve/intrinsics/vcmpleq_n_f32.c | 4 +- .../arm/mve/intrinsics/vcmpltq_n_f16.c | 4 +- .../arm/mve/intrinsics/vcmpltq_n_f32.c | 4 +- .../arm/mve/intrinsics/vcmpneq_n_f16.c | 4 +- .../arm/mve/intrinsics/vcmpneq_n_f32.c | 4 +- .../arm/mve/intrinsics/vdupq_m_n_s16.c | 18 +++++- .../arm/mve/intrinsics/vdupq_m_n_s32.c | 18 +++++- .../arm/mve/intrinsics/vdupq_m_n_s8.c | 18 +++++- .../arm/mve/intrinsics/vdupq_n_f16.c | 3 +- .../arm/mve/intrinsics/vdupq_n_f32-2.c | 29 ++++++++++ .../arm/mve/intrinsics/vdupq_n_f32.c | 5 +- .../arm/mve/intrinsics/vdupq_n_s16-2.c | 30 ++++++++++ .../arm/mve/intrinsics/vdupq_n_s16.c | 14 ++++- .../arm/mve/intrinsics/vdupq_n_s32-2.c | 30 ++++++++++ .../arm/mve/intrinsics/vdupq_n_s32.c | 14 ++++- .../arm/mve/intrinsics/vdupq_n_s8.c | 14 ++++- .../arm/mve/intrinsics/vdupq_n_u16-2.c | 30 ++++++++++ .../arm/mve/intrinsics/vdupq_n_u16.c | 4 +- .../arm/mve/intrinsics/vdupq_n_u32-2.c | 30 ++++++++++ .../arm/mve/intrinsics/vdupq_n_u32.c | 4 +- .../arm/mve/intrinsics/vdupq_n_u8.c | 4 +- .../arm/mve/intrinsics/vdupq_x_n_s16.c | 18 +++++- .../arm/mve/intrinsics/vdupq_x_n_s32.c | 18 +++++- .../arm/mve/intrinsics/vdupq_x_n_s8.c | 18 +++++- 35 files changed, 394 insertions(+), 81 deletions(-) create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32-2.c create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16-2.c create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32-2.c create mode 100644 
gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16-2.c create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32-2.c diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index e0ae593a6c0..be0f9c26c83 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -39,6 +39,59 @@ using namespace arm_mve; namespace { +/* Implements vdup_* intrinsics. */ +class vdupq_impl : public quiet +{ +public: + CONSTEXPR vdupq_impl (int unspec_for_m_n_sint, + int unspec_for_m_n_uint, + int unspec_for_m_n_fp) + : m_unspec_for_m_n_sint (unspec_for_m_n_sint), + m_unspec_for_m_n_uint (unspec_for_m_n_uint), + m_unspec_for_m_n_fp (unspec_for_m_n_fp) + {} + int m_unspec_for_m_n_sint; + int m_unspec_for_m_n_uint; + int m_unspec_for_m_n_fp; + + rtx expand (function_expander &e) const override + { + gcc_assert (e.mode_suffix_id == MODE_n); + + insn_code code; + machine_mode mode = e.vector_mode (0); + + switch (e.pred) + { + case PRED_none: + /* No predicate, _n suffix. */ + code = code_for_mve_vdupq_n (mode); + return e.use_exact_insn (code); + + case PRED_m: + case PRED_x: + /* "m" or "x" predicate, _n suffix. */ + if (e.type_suffix (0).integer_p) + if (e.type_suffix (0).unsigned_p) + code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, + m_unspec_for_m_n_uint, mode); + else + code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, + m_unspec_for_m_n_sint, mode); + else + code = code_for_mve_q_m_n_f (m_unspec_for_m_n_fp, mode); + + if (e.pred == PRED_m) + return e.use_cond_insn (code, 0); + else + return e.use_pred_x_insn (code); + + default: + gcc_unreachable (); + } + } +}; + /* Implements vreinterpretq_* intrinsics. 
*/ class vreinterpretq_impl : public quiet { @@ -339,7 +392,7 @@ FUNCTION (vcmpltq, unspec_based_mve_function_exact_insn_vcmp, (LT, UNKNOWN, LT, FUNCTION (vcmpcsq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GEU, UNKNOWN, UNKNOWN, VCMPCSQ_M_U, UNKNOWN, UNKNOWN, VCMPCSQ_M_N_U, UNKNOWN)) FUNCTION (vcmphiq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GTU, UNKNOWN, UNKNOWN, VCMPHIQ_M_U, UNKNOWN, UNKNOWN, VCMPHIQ_M_N_U, UNKNOWN)) FUNCTION_WITHOUT_M_N (vcreateq, VCREATEQ) -FUNCTION_ONLY_N (vdupq, VDUPQ) +FUNCTION (vdupq, vdupq_impl, (VDUPQ_M_N_S, VDUPQ_M_N_U, VDUPQ_M_N_F)) FUNCTION_WITH_RTX_M (veorq, XOR, VEORQ) FUNCTION (vfmaq, unspec_mve_function_exact_insn, (-1, -1, VFMAQ_F, -1, -1, VFMAQ_N_F, -1, -1, VFMAQ_M_F, -1, -1, VFMAQ_M_N_F)) FUNCTION (vfmasq, unspec_mve_function_exact_insn, (-1, -1, -1, -1, -1, VFMASQ_N_F, -1, -1, -1, -1, -1, VFMASQ_M_N_F)) diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc index 93993d95eb9..759e38c4dda 100644 --- a/gcc/config/arm/arm.cc +++ b/gcc/config/arm/arm.cc @@ -11911,7 +11911,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code, case CONST_DOUBLE: if (TARGET_HARD_FLOAT && GET_MODE_CLASS (mode) == MODE_FLOAT - && (mode == SFmode || !TARGET_VFP_SINGLE)) + && (mode == SFmode || mode == HFmode || !TARGET_VFP_SINGLE)) { if (vfp3_const_double_rtx (x)) { @@ -11936,12 +11936,18 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code, return true; case CONST_VECTOR: - /* Fixme. */ if (((TARGET_NEON && TARGET_HARD_FLOAT && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE (mode))) || TARGET_HAVE_MVE) && simd_immediate_valid_for_move (x, mode, NULL, NULL)) *cost = COSTS_N_INSNS (1); + else if (TARGET_HAVE_MVE) + { + /* 128-bit vector requires two vldr.64 on MVE. 
*/ + *cost = COSTS_N_INSNS (2); + if (speed_p) + *cost += extra_cost->ldst.loadd * 2; + } else *cost = COSTS_N_INSNS (4); return true; diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def index f141aab816c..dd99a90b952 100644 --- a/gcc/config/arm/arm_mve_builtins.def +++ b/gcc/config/arm/arm_mve_builtins.def @@ -27,7 +27,7 @@ VAR2 (UNOP_NONE_NONE, vrndmq_f, v8hf, v4sf) VAR2 (UNOP_NONE_NONE, vrndaq_f, v8hf, v4sf) VAR2 (UNOP_NONE_NONE, vrev64q_f, v8hf, v4sf) VAR2 (UNOP_NONE_NONE, vnegq_f, v8hf, v4sf) -VAR2 (UNOP_NONE_NONE, vdupq_n_f, v8hf, v4sf) +VAR5 (UNOP_NONE_NONE, vdupq_n, v8hf, v4sf, v16qi, v8hi, v4si) VAR2 (UNOP_NONE_NONE, vabsq_f, v8hf, v4sf) VAR1 (UNOP_NONE_NONE, vrev32q_f, v8hf) VAR1 (UNOP_NONE_NONE, vcvttq_f32_f16, v4sf) @@ -39,7 +39,6 @@ VAR3 (UNOP_SNONE_SNONE, vqnegq_s, v16qi, v8hi, v4si) VAR3 (UNOP_SNONE_SNONE, vqabsq_s, v16qi, v8hi, v4si) VAR3 (UNOP_SNONE_SNONE, vnegq_s, v16qi, v8hi, v4si) VAR3 (UNOP_SNONE_SNONE, vmvnq_s, v16qi, v8hi, v4si) -VAR3 (UNOP_SNONE_SNONE, vdupq_n_s, v16qi, v8hi, v4si) VAR3 (UNOP_SNONE_SNONE, vclzq_s, v16qi, v8hi, v4si) VAR3 (UNOP_SNONE_SNONE, vclsq_s, v16qi, v8hi, v4si) VAR3 (UNOP_SNONE_SNONE, vaddvq_s, v16qi, v8hi, v4si) @@ -57,7 +56,6 @@ VAR1 (UNOP_SNONE_SNONE, vrev16q_s, v16qi) VAR1 (UNOP_SNONE_SNONE, vaddlvq_s, v4si) VAR3 (UNOP_UNONE_UNONE, vrev64q_u, v16qi, v8hi, v4si) VAR3 (UNOP_UNONE_UNONE, vmvnq_u, v16qi, v8hi, v4si) -VAR3 (UNOP_UNONE_UNONE, vdupq_n_u, v16qi, v8hi, v4si) VAR3 (UNOP_UNONE_UNONE, vclzq_u, v16qi, v8hi, v4si) VAR3 (UNOP_UNONE_UNONE, vaddvq_u, v16qi, v8hi, v4si) VAR2 (UNOP_UNONE_UNONE, vrev32q_u, v16qi, v8hi) diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index afe5fba698c..9fcc1242206 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -94,13 +94,16 @@ (define_insn "mve_mov" (set_attr "thumb2_pool_range" "*,*,*,*,1018,*,*,*") (set_attr "neg_pool_range" "*,*,*,*,996,*,*,*")]) -(define_insn "mve_vdup" +;; +;; [vdupq_n_u, vdupq_n_s, vdupq_n_f] +;; 
+(define_insn "@mve_vdupq_n" [(set (match_operand:MVE_VLD_ST 0 "s_register_operand" "=w") (vec_duplicate:MVE_VLD_ST (match_operand: 1 "s_register_operand" "r")))] "TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT" "vdup.\t%q0, %1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vdup")) + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vdupq_n")) (set_attr "length" "4") (set_attr "type" "mve_move")]) @@ -188,21 +191,6 @@ (define_insn "mve_vq_f" (set_attr "type" "mve_move") ]) -;; -;; [vdupq_n_f]) -;; -(define_insn "@mve_q_n_f" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "=w") - (unspec:MVE_0 [(match_operand: 1 "s_register_operand" "r")] - MVE_FP_N_VDUPQ_ONLY)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - ".%#\t%q0, %1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_f")) - (set_attr "type" "mve_move") -]) - ;; ;; [vrev32q_f]) ;; @@ -328,21 +316,6 @@ (define_expand "mve_vmvnq_s" "TARGET_HAVE_MVE" ) -;; -;; [vdupq_n_u, vdupq_n_s]) -;; -(define_insn "@mve_q_n_" - [ - (set (match_operand:MVE_2 0 "s_register_operand" "=w") - (unspec:MVE_2 [(match_operand: 1 "s_register_operand" "r")] - VDUPQ_N)) - ] - "TARGET_HAVE_MVE" - ".%#\t%q0, %1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_")) - (set_attr "type" "mve_move") -]) - ;; ;; [vclzq_u, vclzq_s]) ;; @@ -1903,7 +1876,7 @@ (define_insn "@mve_q_m_n_" ] "TARGET_HAVE_MVE" "vpst\;t.%#\t%q0, %2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_")) + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n")) (set_attr "type" "mve_move") (set_attr "length""8")]) @@ -2317,7 +2290,7 @@ (define_insn "@mve_q_m_n_f" ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" "vpst\;t.%#\t%q0, %2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_f")) + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n")) (set_attr "type" "mve_move") (set_attr "length""8")]) diff --git 
a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16.c index 2f84d751c53..335e511b17b 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16.c @@ -39,7 +39,7 @@ foo1 (float16x8_t a, float16_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f16 eq, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... @@ -56,4 +56,4 @@ foo2 (float16x8_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32.c index 6cfe7338fce..e5c16be65e3 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32.c @@ -39,7 +39,7 @@ foo1 (float32x4_t a, float32_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f32 eq, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... @@ -56,4 +56,4 @@ foo2 (float32x4_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f16.c index 978bd7d4b52..47d54863a85 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f16.c @@ -39,7 +39,7 @@ foo1 (float16x8_t a, float16_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f16 ge, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... 
@@ -56,4 +56,4 @@ foo2 (float16x8_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f32.c index 66b6d8b0056..1b775eaf8a0 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f32.c @@ -39,7 +39,7 @@ foo1 (float32x4_t a, float32_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f32 ge, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... @@ -56,4 +56,4 @@ foo2 (float32x4_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16.c index 9c5f1f2f5c8..89d8e2b9109 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16.c @@ -39,7 +39,7 @@ foo1 (float16x8_t a, float16_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f16 gt, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32.c index 2723aa7f98f..a5510e852d5 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32.c @@ -39,7 +39,7 @@ foo1 (float32x4_t a, float32_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f32 gt, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... 
@@ -56,4 +56,4 @@ foo2 (float32x4_t a)
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f16.c
index 1d1f4bf0e58..c94b3119d59 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f16.c
@@ -39,7 +39,7 @@ foo1 (float16x8_t a, float16_t b)
 }

 /*
-**foo2: { xfail *-*-* }
+**foo2:
 ** ...
 ** vcmp.f16 le, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
 ** ...
@@ -56,4 +56,4 @@ foo2 (float16x8_t a)
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f32.c
index bf77a808064..80e2cfa1079 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f32.c
@@ -39,7 +39,7 @@ foo1 (float32x4_t a, float32_t b)
 }

 /*
-**foo2: { xfail *-*-* }
+**foo2:
 ** ...
 ** vcmp.f32 le, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
 ** ...
@@ -56,4 +56,4 @@ foo2 (float32x4_t a)
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f16.c
index f9f091cd9b3..c3a106455cb 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f16.c
@@ -39,7 +39,7 @@ foo1 (float16x8_t a, float16_t b)
 }

 /*
-**foo2: { xfail *-*-* }
+**foo2:
 ** ...
 ** vcmp.f16 lt, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
 ** ...
@@ -56,4 +56,4 @@ foo2 (float16x8_t a)
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f32.c
index d22ea1aca30..b485f75e769 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f32.c
@@ -39,7 +39,7 @@ foo1 (float32x4_t a, float32_t b)
 }

 /*
-**foo2: { xfail *-*-* }
+**foo2:
 ** ...
 ** vcmp.f32 lt, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
 ** ...
@@ -56,4 +56,4 @@ foo2 (float32x4_t a)
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f16.c
index 83beca964d6..1156caafda1 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f16.c
@@ -39,7 +39,7 @@ foo1 (float16x8_t a, float16_t b)
 }

 /*
-**foo2: { xfail *-*-* }
+**foo2:
 ** ...
 ** vcmp.f16 ne, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
 ** ...
@@ -56,4 +56,4 @@ foo2 (float16x8_t a)
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f32.c
index abe1abfed2a..c3ffbd1335f 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f32.c
@@ -39,7 +39,7 @@ foo1 (float32x4_t a, float32_t b)
 }

 /*
-**foo2: { xfail *-*-* }
+**foo2:
 ** ...
 ** vcmp.f32 ne, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
 ** ...
@@ -56,4 +56,4 @@ foo2 (float32x4_t a)
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s16.c
index bf05c73fc1d..dbbf8540681 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s16.c
@@ -42,8 +42,24 @@ foo1 (int16x8_t inactive, int16_t a, mve_pred16_t p)
   return vdupq_m (inactive, a, p);
 }

+/*
+**foo2:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vdupt.16 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+*/
+int16x8_t
+foo2 (int16x8_t inactive, mve_pred16_t p)
+{
+  return vdupq_m (inactive, 1, p);
+}
+
 #ifdef __cplusplus
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s32.c
index 71789bb620e..613b5d30fb3 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s32.c
@@ -42,8 +42,24 @@ foo1 (int32x4_t inactive, int32_t a, mve_pred16_t p)
   return vdupq_m (inactive, a, p);
 }

+/*
+**foo2:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vdupt.32 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+*/
+int32x4_t
+foo2 (int32x4_t inactive, mve_pred16_t p)
+{
+  return vdupq_m (inactive, 1, p);
+}
+
 #ifdef __cplusplus
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s8.c
index 48c4fbd1f82..a1ff48e94e5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s8.c
@@ -42,8 +42,24 @@ foo1 (int8x16_t inactive, int8_t a, mve_pred16_t p)
   return vdupq_m (inactive, a, p);
 }

+/*
+**foo2:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vdupt.8 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+*/
+int8x16_t
+foo2 (int8x16_t inactive, mve_pred16_t p)
+{
+  return vdupq_m (inactive, 1, p);
+}
+
 #ifdef __cplusplus
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f16.c
index 44112190fb8..f9aae2fd120 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f16.c
@@ -24,6 +24,7 @@ foo (float16_t a)
 /*
 **foo1:
 ** ...
+** movw r[0-9]+, #15462
 ** vdup.16 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
 ** ...
 */
@@ -37,4 +38,4 @@ foo1 ()
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32-2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32-2.c
new file mode 100644
index 00000000000..a4b0022cdfc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32-2.c
@@ -0,0 +1,29 @@
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_mve.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+  /* Test with a constant that fits in vmov. */
+/*
+**foo1:
+** ...
+** vmov.f32 q[0-9]+, #0.0 .*
+** ...
+*/
+float32x4_t
+foo1 ()
+{
+  return vdupq_n_f32 (0);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32.c
index 059e3e42dd0..afa78129291 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32.c
@@ -24,7 +24,8 @@ foo (float32_t a)
 /*
 **foo1:
 ** ...
-** vdup.32 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ldr r[0-9]+, .L.*
+** vdup.32 q0, r[0-9]+
 ** ...
 */
 float32x4_t
@@ -37,4 +38,4 @@ foo1 ()
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16-2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16-2.c
new file mode 100644
index 00000000000..3fedbb100b6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16-2.c
@@ -0,0 +1,30 @@
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_mve.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+  /* Test with a constant that does not fit in vmov. */
+/*
+**foo1:
+** ...
+** mov r[0-9]+, #1000(?: @.*|)
+** vdup.16 q[0-9]+, r[0-9]+
+** ...
+*/
+int16x8_t
+foo1 ()
+{
+  return vdupq_n_s16 (1000);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16.c
index d8ba299cb15..f2746075a3b 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16.c
@@ -21,8 +21,20 @@ foo (int16_t a)
   return vdupq_n_s16 (a);
 }

+/*
+**foo1:
+** ...
+** vmov.i16 q[0-9]+, (#0x1) (?:@.*|)
+** ...
+*/
+int16x8_t
+foo1 ()
+{
+  return vdupq_n_s16 (1);
+}
+
 #ifdef __cplusplus
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32-2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32-2.c
new file mode 100644
index 00000000000..da8adeeeb7b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32-2.c
@@ -0,0 +1,30 @@
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_mve.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+  /* Test with a constant that does not fit in vmov. */
+/*
+**foo1:
+** ...
+** mov r[0-9]+, #1000
+** vdup.32 q0, r[0-9]+
+** ...
+*/
+int32x4_t
+foo1 ()
+{
+  return vdupq_n_s32 (1000);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32.c
index a81c6d1e220..7f75eca2ad2 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32.c
@@ -21,8 +21,20 @@ foo (int32_t a)
   return vdupq_n_s32 (a);
 }

+/*
+**foo1:
+** ...
+** vmov.i32 q[0-9]+, (#0x1) (?:@.*|)
+** ...
+*/
+int32x4_t
+foo1 ()
+{
+  return vdupq_n_s32 (1);
+}
+
 #ifdef __cplusplus
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s8.c
index b0bac4fce89..454ff5abac2 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s8.c
@@ -21,8 +21,20 @@ foo (int8_t a)
   return vdupq_n_s8 (a);
 }

+/*
+**foo1:
+** ...
+** vmov.i8 q[0-9]+, (#0x1) (?:@.*|)
+** ...
+*/
+int8x16_t
+foo1 ()
+{
+  return vdupq_n_s8 (1);
+}
+
 #ifdef __cplusplus
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16-2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16-2.c
new file mode 100644
index 00000000000..accd7fac130
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16-2.c
@@ -0,0 +1,30 @@
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_mve.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+  /* Test with a constant that does not fit in vmov. */
+/*
+**foo1:
+** ...
+** mov r[0-9]+, #1000(?: @.*|)
+** vdup.16 q[0-9]+, r[0-9]+
+** ...
+*/
+uint16x8_t
+foo1 ()
+{
+  return vdupq_n_u16 (1000);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16.c
index 55e0a601110..4accb6480dd 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16.c
@@ -24,7 +24,7 @@ foo (uint16_t a)
 /*
 **foo1:
 ** ...
-** vdup.16 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
+** vmov.i16 q[0-9]+, (#0x1) (?:@.*|)
 ** ...
 */
 uint16x8_t
@@ -37,4 +37,4 @@ foo1 ()
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32-2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32-2.c
new file mode 100644
index 00000000000..c97cea186ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32-2.c
@@ -0,0 +1,30 @@
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_mve.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+  /* Test with a constant that does not fit in vmov. */
+/*
+**foo1:
+** ...
+** mov r[0-9]+, #1000
+** vdup.32 q0, r[0-9]+
+** ...
+*/
+uint32x4_t
+foo1 ()
+{
+  return vdupq_n_u32 (1000);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32.c
index bf73bc17fc7..d08a94c7a16 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32.c
@@ -24,7 +24,7 @@ foo (uint32_t a)
 /*
 **foo1:
 ** ...
-** vdup.32 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
+** vmov.i32 q[0-9]+, (#0x1) (?:@.*|)
 ** ...
 */
 uint32x4_t
@@ -37,4 +37,4 @@ foo1 ()
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u8.c
index 48cbdb2a1da..f1fcd4acaa3 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u8.c
@@ -24,7 +24,7 @@ foo (uint8_t a)
 /*
 **foo1:
 ** ...
-** vdup.8 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
+** vmov.i8 q[0-9]+, (#0x1) (?:@.*|)
 ** ...
 */
 uint8x16_t
@@ -37,4 +37,4 @@ foo1 ()
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s16.c
index 6756502ab21..9dcfe4e0376 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s16.c
@@ -25,8 +25,24 @@ foo (int16_t a, mve_pred16_t p)
   return vdupq_x_n_s16 (a, p);
 }

+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vdupt.16 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+*/
+int16x8_t
+foo1 (mve_pred16_t p)
+{
+  return vdupq_x_n_s16 (1, p);
+}
+
 #ifdef __cplusplus
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s32.c
index b04afb3834b..eacdb2e454f 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s32.c
@@ -25,8 +25,24 @@ foo (int32_t a, mve_pred16_t p)
   return vdupq_x_n_s32 (a, p);
 }

+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vdupt.32 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+*/
+int32x4_t
+foo1 (mve_pred16_t p)
+{
+  return vdupq_x_n_s32 (1, p);
+}
+
 #ifdef __cplusplus
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s8.c
index b23facd5e94..8951f7475f5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s8.c
@@ -25,8 +25,24 @@ foo (int8_t a, mve_pred16_t p)
   return vdupq_x_n_s8 (a, p);
 }

+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vdupt.8 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+*/
+int8x16_t
+foo1 (mve_pred16_t p)
+{
+  return vdupq_x_n_s8 (1, p);
+}
+
 #ifdef __cplusplus
 }
 #endif

-/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */