From patchwork Thu Jul 11 09:36:11 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 1959200
From: Christophe Lyon
CC: Christophe Lyon
Subject: [PATCH v2 2/2] arm: [MVE intrinsics] Improve vdupq_n implementation
Date: Thu, 11 Jul 2024 11:36:11 +0200
Message-ID: <20240711093611.179956-2-christophe.lyon@arm.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240711093611.179956-1-christophe.lyon@arm.com>
References: <20240711093611.179956-1-christophe.lyon@arm.com>
X-BeenThere: gcc-patches@gcc.gnu.org
List-Id: Gcc-patches mailing list

This patch makes the non-predicated vdupq_n MVE intrinsics use
vec_duplicate rather than an unspec.  This enables the compiler to
generate better code sequences (for instance using vmov when
possible).

The patch renames the existing mve_vdup pattern into @mve_vdupq_n, and
removes the now useless @mve_q_n_f and @mve_q_n_ ones.

As a side-effect, it needs to update the mve_unpredicated_insn
attributes in @mve_q_m_n_ and @mve_q_m_n_f.

Using vec_duplicate means the compiler is now able to use vmov in the
tests with an immediate argument in vdupq_n_[su]{8,16,32}.c:
	vmov.i8 q0,#0x1

However, this is only possible when the immediate has a suitable value
(MVE encoding constraints, see imm_for_neon_mov_operand predicate).

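To illustrate what "a suitable value" means, here is a minimal sketch in C,
assuming only the plain VMOV immediate forms for 8-bit and 16-bit elements
(any byte for .i8; 0x00XY or 0xXY00 for .i16).  It is not GCC's
simd_immediate_valid_for_move(): it ignores the VMVN, 32-bit, 64-bit and
floating-point forms, and the helper names vmov_ok_u8, vmov_ok_u16 and
vmov_model_self_check are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of GCC): every 8-bit element value fits
   the vmov.i8 form, which is why no extra s8/u8 tests are needed.  */
bool
vmov_ok_u8 (uint8_t value)
{
  (void) value;
  return true;
}

/* Hypothetical helper (not part of GCC): simplified check for a 16-bit
   element replicated across the vector.  Accepted forms in this model:
     0x00XY  (vmov.i16, byte in the low half)
     0xXY00  (vmov.i16, byte in the high half)
     0xXYXY  (both bytes equal, so the vector is a vmov.i8 splat)
   VMVN forms are deliberately left out of the model.  */
bool
vmov_ok_u16 (uint16_t value)
{
  uint8_t lo = (uint8_t) (value & 0xff);
  uint8_t hi = (uint8_t) (value >> 8);

  return hi == 0 || lo == 0 || lo == hi;
}

/* Sanity checks for the model above.  */
void
vmov_model_self_check (void)
{
  assert (vmov_ok_u16 (0x0001));   /* 0x00XY form.  */
  assert (vmov_ok_u16 (0x0100));   /* 0xXY00 form.  */
  assert (vmov_ok_u16 (0x0101));   /* Same byte twice.  */
  assert (!vmov_ok_u16 (0x1234));  /* Two different non-zero bytes.  */
}
```

In this model, a value such as 0x1234 is rejected, which corresponds to the
case where the scalar is first moved to a core register and then
duplicated with vdup.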
Provided we adjust the cost computations in arm_rtx_costs_internal(),
when the immediate does not meet the vmov constraints, we now generate:
	mov	r0, #imm
	vdup.xx	q0,r0

or
	ldr	r0, .L4
	vdup.32	q0,r0
in the f32 case (with 1.1 as immediate).

Without the cost adjustment, we would generate:
	vldr.64	d0, .L4
	vldr.64	d1, .L4+8
and an associated literal pool entry.

Regarding the testsuite updates:
--------------------------------
* The signed versions of vdupq_* tests lack a version with an
  immediate argument.  This patch adds them, similar to what we already
  have for vdupq_n_u*.c tests.

* Code generation for different immediate values is checked with the
  new tests this patch introduces.  Note there's no need for s8/u8
  tests because 8-bit immediates always comply with
  imm_for_neon_mov_operand.

* We can remove xfail from vcmp*f tests since we now generate:
	movw r3, #15462
	vcmp.f16 eq, q0, r3

  instead of the previous:
	vldr.64 d6, .L5
	vldr.64 d7, .L5+8
	vcmp.f16 eq, q0, q3

Changes v1->v2:
* Dropped change to cost computation for Neon, and associated
  testcase updates (crypto-vsha1*)

* Updated expected regexp in vdupq_n_[su]16-2.c to account for
  different assembly comments (none for arm-none-eabi, '@ movhi' for
  arm-linux-gnueabihf)

Tested on arm-linux-gnueabihf and arm-none-eabi with no regression.

2024-07-02  Jolen Li
	    Christophe Lyon

	gcc/
	* config/arm/arm-mve-builtins-base.cc (vdupq_impl): New class.
	(vdupq): Use new implementation.
	* config/arm/arm.cc (arm_rtx_costs_internal): Handle HFmode
	for CONST_DOUBLE.  Update costing for CONST_VECTOR.
	* config/arm/arm_mve_builtins.def: Merge vdupq_n_f, vdupq_n_s
	and vdupq_n_u into vdupq_n.
	* config/arm/mve.md (mve_vdup): Rename into ...
	(@mve_vdupq_n): ... this.
	(@mve_q_n_f): Delete.
	(@mve_q_n_): Delete.
	(@mve_q_m_n_): Update mve_unpredicated_insn attribute.
	(@mve_q_m_n_f): Likewise.

	gcc/testsuite/
	* gcc.target/arm/mve/intrinsics/vdupq_n_u8.c (foo1): Update
	expected code.
	* gcc.target/arm/mve/intrinsics/vdupq_n_u16.c (foo1): Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_n_u32.c (foo1): Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_n_s8.c: Add test with
	immediate argument.
	* gcc.target/arm/mve/intrinsics/vdupq_n_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_n_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_n_f16.c (foo1): Update
	expected code.
	* gcc.target/arm/mve/intrinsics/vdupq_n_f32.c (foo1): Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_m_n_s16.c: Add test with
	immediate argument.
	* gcc.target/arm/mve/intrinsics/vdupq_m_n_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_m_n_s8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_x_n_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_x_n_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_x_n_s8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdupq_n_f32-2.c: New test.
	* gcc.target/arm/mve/intrinsics/vdupq_n_s16-2.c: New test.
	* gcc.target/arm/mve/intrinsics/vdupq_n_s32-2.c: New test.
	* gcc.target/arm/mve/intrinsics/vdupq_n_u16-2.c: New test.
	* gcc.target/arm/mve/intrinsics/vdupq_n_u32-2.c: New test.
	* gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16.c: Remove xfail.
	* gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpgeq_n_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpgeq_n_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpleq_n_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpleq_n_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpltq_n_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpltq_n_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpneq_n_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vcmpneq_n_f32.c: Likewise.
---
 gcc/config/arm/arm-mve-builtins-base.cc | 55 ++++++++++++++++++-
 gcc/config/arm/arm.cc | 6 +-
 gcc/config/arm/arm_mve_builtins.def | 4 +-
 gcc/config/arm/mve.md | 41 +++-----------
 .../arm/mve/intrinsics/vcmpeqq_n_f16.c | 4 +-
 .../arm/mve/intrinsics/vcmpeqq_n_f32.c | 4 +-
 .../arm/mve/intrinsics/vcmpgeq_n_f16.c | 4 +-
 .../arm/mve/intrinsics/vcmpgeq_n_f32.c | 4 +-
 .../arm/mve/intrinsics/vcmpgtq_n_f16.c | 2 +-
 .../arm/mve/intrinsics/vcmpgtq_n_f32.c | 4 +-
 .../arm/mve/intrinsics/vcmpleq_n_f16.c | 4 +-
 .../arm/mve/intrinsics/vcmpleq_n_f32.c | 4 +-
 .../arm/mve/intrinsics/vcmpltq_n_f16.c | 4 +-
 .../arm/mve/intrinsics/vcmpltq_n_f32.c | 4 +-
 .../arm/mve/intrinsics/vcmpneq_n_f16.c | 4 +-
 .../arm/mve/intrinsics/vcmpneq_n_f32.c | 4 +-
 .../arm/mve/intrinsics/vdupq_m_n_s16.c | 18 +++++-
 .../arm/mve/intrinsics/vdupq_m_n_s32.c | 18 +++++-
 .../arm/mve/intrinsics/vdupq_m_n_s8.c | 18 +++++-
 .../arm/mve/intrinsics/vdupq_n_f16.c | 3 +-
 .../arm/mve/intrinsics/vdupq_n_f32-2.c | 29 ++++++++++
 .../arm/mve/intrinsics/vdupq_n_f32.c | 5 +-
 .../arm/mve/intrinsics/vdupq_n_s16-2.c | 30 ++++++++++
 .../arm/mve/intrinsics/vdupq_n_s16.c | 14 ++++-
 .../arm/mve/intrinsics/vdupq_n_s32-2.c | 30 ++++++++++
 .../arm/mve/intrinsics/vdupq_n_s32.c | 14 ++++-
 .../arm/mve/intrinsics/vdupq_n_s8.c | 14 ++++-
 .../arm/mve/intrinsics/vdupq_n_u16-2.c | 30 ++++++++++
 .../arm/mve/intrinsics/vdupq_n_u16.c | 4 +-
 .../arm/mve/intrinsics/vdupq_n_u32-2.c | 30 ++++++++++
 .../arm/mve/intrinsics/vdupq_n_u32.c | 4 +-
 .../arm/mve/intrinsics/vdupq_n_u8.c | 4 +-
 .../arm/mve/intrinsics/vdupq_x_n_s16.c | 18 +++++-
 .../arm/mve/intrinsics/vdupq_x_n_s32.c | 18 +++++-
 .../arm/mve/intrinsics/vdupq_x_n_s8.c | 18 +++++-
 35 files changed, 390 insertions(+), 81 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32-2.c
 create mode 100644
gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16-2.c create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32-2.c diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index e0ae593a6c0..be0f9c26c83 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -39,6 +39,59 @@ using namespace arm_mve; namespace { +/* Implements vdup_* intrinsics. */ +class vdupq_impl : public quiet +{ +public: + CONSTEXPR vdupq_impl (int unspec_for_m_n_sint, + int unspec_for_m_n_uint, + int unspec_for_m_n_fp) + : m_unspec_for_m_n_sint (unspec_for_m_n_sint), + m_unspec_for_m_n_uint (unspec_for_m_n_uint), + m_unspec_for_m_n_fp (unspec_for_m_n_fp) + {} + int m_unspec_for_m_n_sint; + int m_unspec_for_m_n_uint; + int m_unspec_for_m_n_fp; + + rtx expand (function_expander &e) const override + { + gcc_assert (e.mode_suffix_id == MODE_n); + + insn_code code; + machine_mode mode = e.vector_mode (0); + + switch (e.pred) + { + case PRED_none: + /* No predicate, _n suffix. */ + code = code_for_mve_vdupq_n (mode); + return e.use_exact_insn (code); + + case PRED_m: + case PRED_x: + /* "m" or "x" predicate, _n suffix. */ + if (e.type_suffix (0).integer_p) + if (e.type_suffix (0).unsigned_p) + code = code_for_mve_q_m_n (m_unspec_for_m_n_uint, + m_unspec_for_m_n_uint, mode); + else + code = code_for_mve_q_m_n (m_unspec_for_m_n_sint, + m_unspec_for_m_n_sint, mode); + else + code = code_for_mve_q_m_n_f (m_unspec_for_m_n_fp, mode); + + if (e.pred == PRED_m) + return e.use_cond_insn (code, 0); + else + return e.use_pred_x_insn (code); + + default: + gcc_unreachable (); + } + } +}; + /* Implements vreinterpretq_* intrinsics. 
*/ class vreinterpretq_impl : public quiet { @@ -339,7 +392,7 @@ FUNCTION (vcmpltq, unspec_based_mve_function_exact_insn_vcmp, (LT, UNKNOWN, LT, FUNCTION (vcmpcsq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GEU, UNKNOWN, UNKNOWN, VCMPCSQ_M_U, UNKNOWN, UNKNOWN, VCMPCSQ_M_N_U, UNKNOWN)) FUNCTION (vcmphiq, unspec_based_mve_function_exact_insn_vcmp, (UNKNOWN, GTU, UNKNOWN, UNKNOWN, VCMPHIQ_M_U, UNKNOWN, UNKNOWN, VCMPHIQ_M_N_U, UNKNOWN)) FUNCTION_WITHOUT_M_N (vcreateq, VCREATEQ) -FUNCTION_ONLY_N (vdupq, VDUPQ) +FUNCTION (vdupq, vdupq_impl, (VDUPQ_M_N_S, VDUPQ_M_N_U, VDUPQ_M_N_F)) FUNCTION_WITH_RTX_M (veorq, XOR, VEORQ) FUNCTION (vfmaq, unspec_mve_function_exact_insn, (-1, -1, VFMAQ_F, -1, -1, VFMAQ_N_F, -1, -1, VFMAQ_M_F, -1, -1, VFMAQ_M_N_F)) FUNCTION (vfmasq, unspec_mve_function_exact_insn, (-1, -1, -1, -1, -1, VFMASQ_N_F, -1, -1, -1, -1, -1, VFMASQ_M_N_F)) diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc index 93993d95eb9..4faeaf85cd1 100644 --- a/gcc/config/arm/arm.cc +++ b/gcc/config/arm/arm.cc @@ -11911,7 +11911,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code, case CONST_DOUBLE: if (TARGET_HARD_FLOAT && GET_MODE_CLASS (mode) == MODE_FLOAT - && (mode == SFmode || !TARGET_VFP_SINGLE)) + && (mode == SFmode || mode == HFmode || !TARGET_VFP_SINGLE)) { if (vfp3_const_double_rtx (x)) { @@ -11936,12 +11936,14 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code, return true; case CONST_VECTOR: - /* Fixme. */ if (((TARGET_NEON && TARGET_HARD_FLOAT && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE (mode))) || TARGET_HAVE_MVE) && simd_immediate_valid_for_move (x, mode, NULL, NULL)) *cost = COSTS_N_INSNS (1); + else if (TARGET_HAVE_MVE) + /* 128-bit vector requires two vldr.64 on MVE. 
*/ + *cost = extra_cost->ldst.loadd * 4; else *cost = COSTS_N_INSNS (4); return true; diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def index f141aab816c..dd99a90b952 100644 --- a/gcc/config/arm/arm_mve_builtins.def +++ b/gcc/config/arm/arm_mve_builtins.def @@ -27,7 +27,7 @@ VAR2 (UNOP_NONE_NONE, vrndmq_f, v8hf, v4sf) VAR2 (UNOP_NONE_NONE, vrndaq_f, v8hf, v4sf) VAR2 (UNOP_NONE_NONE, vrev64q_f, v8hf, v4sf) VAR2 (UNOP_NONE_NONE, vnegq_f, v8hf, v4sf) -VAR2 (UNOP_NONE_NONE, vdupq_n_f, v8hf, v4sf) +VAR5 (UNOP_NONE_NONE, vdupq_n, v8hf, v4sf, v16qi, v8hi, v4si) VAR2 (UNOP_NONE_NONE, vabsq_f, v8hf, v4sf) VAR1 (UNOP_NONE_NONE, vrev32q_f, v8hf) VAR1 (UNOP_NONE_NONE, vcvttq_f32_f16, v4sf) @@ -39,7 +39,6 @@ VAR3 (UNOP_SNONE_SNONE, vqnegq_s, v16qi, v8hi, v4si) VAR3 (UNOP_SNONE_SNONE, vqabsq_s, v16qi, v8hi, v4si) VAR3 (UNOP_SNONE_SNONE, vnegq_s, v16qi, v8hi, v4si) VAR3 (UNOP_SNONE_SNONE, vmvnq_s, v16qi, v8hi, v4si) -VAR3 (UNOP_SNONE_SNONE, vdupq_n_s, v16qi, v8hi, v4si) VAR3 (UNOP_SNONE_SNONE, vclzq_s, v16qi, v8hi, v4si) VAR3 (UNOP_SNONE_SNONE, vclsq_s, v16qi, v8hi, v4si) VAR3 (UNOP_SNONE_SNONE, vaddvq_s, v16qi, v8hi, v4si) @@ -57,7 +56,6 @@ VAR1 (UNOP_SNONE_SNONE, vrev16q_s, v16qi) VAR1 (UNOP_SNONE_SNONE, vaddlvq_s, v4si) VAR3 (UNOP_UNONE_UNONE, vrev64q_u, v16qi, v8hi, v4si) VAR3 (UNOP_UNONE_UNONE, vmvnq_u, v16qi, v8hi, v4si) -VAR3 (UNOP_UNONE_UNONE, vdupq_n_u, v16qi, v8hi, v4si) VAR3 (UNOP_UNONE_UNONE, vclzq_u, v16qi, v8hi, v4si) VAR3 (UNOP_UNONE_UNONE, vaddvq_u, v16qi, v8hi, v4si) VAR2 (UNOP_UNONE_UNONE, vrev32q_u, v16qi, v8hi) diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index afe5fba698c..9fcc1242206 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -94,13 +94,16 @@ (define_insn "mve_mov" (set_attr "thumb2_pool_range" "*,*,*,*,1018,*,*,*") (set_attr "neg_pool_range" "*,*,*,*,996,*,*,*")]) -(define_insn "mve_vdup" +;; +;; [vdupq_n_u, vdupq_n_s, vdupq_n_f] +;; +(define_insn "@mve_vdupq_n" [(set 
(match_operand:MVE_VLD_ST 0 "s_register_operand" "=w") (vec_duplicate:MVE_VLD_ST (match_operand: 1 "s_register_operand" "r")))] "TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT" "vdup.\t%q0, %1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vdup")) + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_vdupq_n")) (set_attr "length" "4") (set_attr "type" "mve_move")]) @@ -188,21 +191,6 @@ (define_insn "mve_vq_f" (set_attr "type" "mve_move") ]) -;; -;; [vdupq_n_f]) -;; -(define_insn "@mve_q_n_f" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "=w") - (unspec:MVE_0 [(match_operand: 1 "s_register_operand" "r")] - MVE_FP_N_VDUPQ_ONLY)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - ".%#\t%q0, %1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_f")) - (set_attr "type" "mve_move") -]) - ;; ;; [vrev32q_f]) ;; @@ -328,21 +316,6 @@ (define_expand "mve_vmvnq_s" "TARGET_HAVE_MVE" ) -;; -;; [vdupq_n_u, vdupq_n_s]) -;; -(define_insn "@mve_q_n_" - [ - (set (match_operand:MVE_2 0 "s_register_operand" "=w") - (unspec:MVE_2 [(match_operand: 1 "s_register_operand" "r")] - VDUPQ_N)) - ] - "TARGET_HAVE_MVE" - ".%#\t%q0, %1" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_")) - (set_attr "type" "mve_move") -]) - ;; ;; [vclzq_u, vclzq_s]) ;; @@ -1903,7 +1876,7 @@ (define_insn "@mve_q_m_n_" ] "TARGET_HAVE_MVE" "vpst\;t.%#\t%q0, %2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_")) + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n")) (set_attr "type" "mve_move") (set_attr "length""8")]) @@ -2317,7 +2290,7 @@ (define_insn "@mve_q_m_n_f" ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" "vpst\;t.%#\t%q0, %2" - [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n_f")) + [(set (attr "mve_unpredicated_insn") (symbol_ref "CODE_FOR_mve_q_n")) (set_attr "type" "mve_move") (set_attr "length""8")]) diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16.c index 2f84d751c53..335e511b17b 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16.c @@ -39,7 +39,7 @@ foo1 (float16x8_t a, float16_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f16 eq, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... @@ -56,4 +56,4 @@ foo2 (float16x8_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32.c index 6cfe7338fce..e5c16be65e3 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32.c @@ -39,7 +39,7 @@ foo1 (float32x4_t a, float32_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f32 eq, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... @@ -56,4 +56,4 @@ foo2 (float32x4_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f16.c index 978bd7d4b52..47d54863a85 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f16.c @@ -39,7 +39,7 @@ foo1 (float16x8_t a, float16_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f16 ge, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... 
@@ -56,4 +56,4 @@ foo2 (float16x8_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f32.c index 66b6d8b0056..1b775eaf8a0 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f32.c @@ -39,7 +39,7 @@ foo1 (float32x4_t a, float32_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f32 ge, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... @@ -56,4 +56,4 @@ foo2 (float32x4_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16.c index 9c5f1f2f5c8..89d8e2b9109 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16.c @@ -39,7 +39,7 @@ foo1 (float16x8_t a, float16_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f16 gt, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32.c index 2723aa7f98f..a5510e852d5 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32.c @@ -39,7 +39,7 @@ foo1 (float32x4_t a, float32_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f32 gt, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... 
@@ -56,4 +56,4 @@ foo2 (float32x4_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f16.c index 1d1f4bf0e58..c94b3119d59 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f16.c @@ -39,7 +39,7 @@ foo1 (float16x8_t a, float16_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f16 le, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... @@ -56,4 +56,4 @@ foo2 (float16x8_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f32.c index bf77a808064..80e2cfa1079 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f32.c @@ -39,7 +39,7 @@ foo1 (float32x4_t a, float32_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f32 le, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... @@ -56,4 +56,4 @@ foo2 (float32x4_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f16.c index f9f091cd9b3..c3a106455cb 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f16.c @@ -39,7 +39,7 @@ foo1 (float16x8_t a, float16_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f16 lt, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... 
@@ -56,4 +56,4 @@ foo2 (float16x8_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f32.c index d22ea1aca30..b485f75e769 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f32.c @@ -39,7 +39,7 @@ foo1 (float32x4_t a, float32_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f32 lt, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... @@ -56,4 +56,4 @@ foo2 (float32x4_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f16.c index 83beca964d6..1156caafda1 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f16.c @@ -39,7 +39,7 @@ foo1 (float16x8_t a, float16_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f16 ne, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... @@ -56,4 +56,4 @@ foo2 (float16x8_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f32.c index abe1abfed2a..c3ffbd1335f 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f32.c @@ -39,7 +39,7 @@ foo1 (float32x4_t a, float32_t b) } /* -**foo2: { xfail *-*-* } +**foo2: ** ... ** vcmp.f32 ne, q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... 
@@ -56,4 +56,4 @@ foo2 (float32x4_t a) } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s16.c index bf05c73fc1d..dbbf8540681 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s16.c @@ -42,8 +42,24 @@ foo1 (int16x8_t inactive, int16_t a, mve_pred16_t p) return vdupq_m (inactive, a, p); } +/* +**foo2: +** ... +** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) +** ... +** vpst(?: @.*|) +** ... +** vdupt.16 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) +** ... +*/ +int16x8_t +foo2 (int16x8_t inactive, mve_pred16_t p) +{ + return vdupq_m (inactive, 1, p); +} + #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s32.c index 71789bb620e..613b5d30fb3 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s32.c @@ -42,8 +42,24 @@ foo1 (int32x4_t inactive, int32_t a, mve_pred16_t p) return vdupq_m (inactive, a, p); } +/* +**foo2: +** ... +** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) +** ... +** vpst(?: @.*|) +** ... +** vdupt.32 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) +** ... 
+*/ +int32x4_t +foo2 (int32x4_t inactive, mve_pred16_t p) +{ + return vdupq_m (inactive, 1, p); +} + #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s8.c index 48c4fbd1f82..a1ff48e94e5 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_s8.c @@ -42,8 +42,24 @@ foo1 (int8x16_t inactive, int8_t a, mve_pred16_t p) return vdupq_m (inactive, a, p); } +/* +**foo2: +** ... +** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) +** ... +** vpst(?: @.*|) +** ... +** vdupt.8 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) +** ... +*/ +int8x16_t +foo2 (int8x16_t inactive, mve_pred16_t p) +{ + return vdupq_m (inactive, 1, p); +} + #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f16.c index 44112190fb8..f9aae2fd120 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f16.c @@ -24,6 +24,7 @@ foo (float16_t a) /* **foo1: ** ... +** movw r[0-9]+, #15462 ** vdup.16 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) ** ... 
*/ @@ -37,4 +38,4 @@ foo1 () } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32-2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32-2.c new file mode 100644 index 00000000000..a4b0022cdfc --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32-2.c @@ -0,0 +1,29 @@ +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */ +/* { dg-add-options arm_v8_1m_mve_fp } */ +/* { dg-additional-options "-O2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "arm_mve.h" + +#ifdef __cplusplus +extern "C" { +#endif + + /* Test with a constant that fits in vmov. */ +/* +**foo1: +** ... +** vmov.f32 q[0-9]+, #0.0 .* +** ... +*/ +float32x4_t +foo1 () +{ + return vdupq_n_f32 (0); +} + +#ifdef __cplusplus +} +#endif + +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32.c index 059e3e42dd0..afa78129291 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_f32.c @@ -24,7 +24,8 @@ foo (float32_t a) /* **foo1: ** ... -** vdup.32 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) +** ldr r[0-9]+, .L.* +** vdup.32 q0, r[0-9]+ ** ... 
*/ float32x4_t @@ -37,4 +38,4 @@ foo1 () } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16-2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16-2.c new file mode 100644 index 00000000000..3fedbb100b6 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16-2.c @@ -0,0 +1,30 @@ +/* { dg-require-effective-target arm_v8_1m_mve_ok } */ +/* { dg-add-options arm_v8_1m_mve } */ +/* { dg-additional-options "-O2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "arm_mve.h" + +#ifdef __cplusplus +extern "C" { +#endif + + /* Test with a constant that does not fit in vmov. */ +/* +**foo1: +** ... +** mov r[0-9]+, #1000(?: @.*|) +** vdup.16 q[0-9]+, r[0-9]+ +** ... +*/ +int16x8_t +foo1 () +{ + return vdupq_n_s16 (1000); +} + +#ifdef __cplusplus +} +#endif + +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16.c index d8ba299cb15..f2746075a3b 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s16.c @@ -21,8 +21,20 @@ foo (int16_t a) return vdupq_n_s16 (a); } +/* +**foo1: +** ... +** vmov.i16 q[0-9]+, (#0x1) (?:@.*|) +** ... 
+*/ +int16x8_t +foo1 () +{ + return vdupq_n_s16 (1); +} + #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32-2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32-2.c new file mode 100644 index 00000000000..da8adeeeb7b --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32-2.c @@ -0,0 +1,30 @@ +/* { dg-require-effective-target arm_v8_1m_mve_ok } */ +/* { dg-add-options arm_v8_1m_mve } */ +/* { dg-additional-options "-O2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "arm_mve.h" + +#ifdef __cplusplus +extern "C" { +#endif + + /* Test with a constant that does not fit in vmov. */ +/* +**foo1: +** ... +** mov r[0-9]+, #1000 +** vdup.32 q0, r[0-9]+ +** ... +*/ +int32x4_t +foo1 () +{ + return vdupq_n_s32 (1000); +} + +#ifdef __cplusplus +} +#endif + +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32.c index a81c6d1e220..7f75eca2ad2 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s32.c @@ -21,8 +21,20 @@ foo (int32_t a) return vdupq_n_s32 (a); } +/* +**foo1: +** ... +** vmov.i32 q[0-9]+, (#0x1) (?:@.*|) +** ... 
+*/ +int32x4_t +foo1 () +{ + return vdupq_n_s32 (1); +} + #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s8.c index b0bac4fce89..454ff5abac2 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_s8.c @@ -21,8 +21,20 @@ foo (int8_t a) return vdupq_n_s8 (a); } +/* +**foo1: +** ... +** vmov.i8 q[0-9]+, (#0x1) (?:@.*|) +** ... +*/ +int8x16_t +foo1 () +{ + return vdupq_n_s8 (1); +} + #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16-2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16-2.c new file mode 100644 index 00000000000..accd7fac130 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16-2.c @@ -0,0 +1,30 @@ +/* { dg-require-effective-target arm_v8_1m_mve_ok } */ +/* { dg-add-options arm_v8_1m_mve } */ +/* { dg-additional-options "-O2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "arm_mve.h" + +#ifdef __cplusplus +extern "C" { +#endif + + /* Test with a constant that does not fit in vmov. */ +/* +**foo1: +** ... +** mov r[0-9]+, #1000(?: @.*|) +** vdup.16 q[0-9]+, r[0-9]+ +** ... 
+*/ +uint16x8_t +foo1 () +{ + return vdupq_n_u16 (1000); +} + +#ifdef __cplusplus +} +#endif + +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16.c index 55e0a601110..4accb6480dd 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u16.c @@ -24,7 +24,7 @@ foo (uint16_t a) /* **foo1: ** ... -** vdup.16 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) +** vmov.i16 q[0-9]+, (#0x1) (?:@.*|) ** ... */ uint16x8_t @@ -37,4 +37,4 @@ foo1 () } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32-2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32-2.c new file mode 100644 index 00000000000..c97cea186ef --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32-2.c @@ -0,0 +1,30 @@ +/* { dg-require-effective-target arm_v8_1m_mve_ok } */ +/* { dg-add-options arm_v8_1m_mve } */ +/* { dg-additional-options "-O2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "arm_mve.h" + +#ifdef __cplusplus +extern "C" { +#endif + + /* Test with a constant that does not fit in vmov. */ +/* +**foo1: +** ... +** mov r[0-9]+, #1000 +** vdup.32 q0, r[0-9]+ +** ... +*/ +uint32x4_t +foo1 () +{ + return vdupq_n_u32 (1000); +} + +#ifdef __cplusplus +} +#endif + +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32.c index bf73bc17fc7..d08a94c7a16 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u32.c @@ -24,7 +24,7 @@ foo (uint32_t a) /* **foo1: ** ... 
-** vdup.32 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) +** vmov.i32 q[0-9]+, (#0x1) (?:@.*|) ** ... */ uint32x4_t @@ -37,4 +37,4 @@ foo1 () } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u8.c index 48cbdb2a1da..f1fcd4acaa3 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_n_u8.c @@ -24,7 +24,7 @@ foo (uint8_t a) /* **foo1: ** ... -** vdup.8 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) +** vmov.i8 q[0-9]+, (#0x1) (?:@.*|) ** ... */ uint8x16_t @@ -37,4 +37,4 @@ foo1 () } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s16.c index 6756502ab21..9dcfe4e0376 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s16.c @@ -25,8 +25,24 @@ foo (int16_t a, mve_pred16_t p) return vdupq_x_n_s16 (a, p); } +/* +**foo1: +** ... +** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) +** ... +** vpst(?: @.*|) +** ... +** vdupt.16 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) +** ... 
+*/ +int16x8_t +foo1 (mve_pred16_t p) +{ + return vdupq_x_n_s16 (1, p); +} + #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s32.c index b04afb3834b..eacdb2e454f 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s32.c @@ -25,8 +25,24 @@ foo (int32_t a, mve_pred16_t p) return vdupq_x_n_s32 (a, p); } +/* +**foo1: +** ... +** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) +** ... +** vpst(?: @.*|) +** ... +** vdupt.32 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) +** ... +*/ +int32x4_t +foo1 (mve_pred16_t p) +{ + return vdupq_x_n_s32 (1, p); +} + #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s8.c index b23facd5e94..8951f7475f5 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s8.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_x_n_s8.c @@ -25,8 +25,24 @@ foo (int8_t a, mve_pred16_t p) return vdupq_x_n_s8 (a, p); } +/* +**foo1: +** ... +** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|) +** ... +** vpst(?: @.*|) +** ... +** vdupt.8 q[0-9]+, (?:ip|fp|r[0-9]+)(?: @.*|) +** ... +*/ +int8x16_t +foo1 (mve_pred16_t p) +{ + return vdupq_x_n_s8 (1, p); +} + #ifdef __cplusplus } #endif -/* { dg-final { scan-assembler-not "__ARM_undef" } } */ \ No newline at end of file +/* { dg-final { scan-assembler-not "__ARM_undef" } } */