From patchwork Thu Oct 31 14:09:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jennifer Schmitz X-Patchwork-Id: 2004682 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=Nvidia.com header.i=@Nvidia.com header.a=rsa-sha256 header.s=selector2 header.b=nrKo97Ia; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XfQqK4DrNz1xxp for ; Fri, 1 Nov 2024 01:09:49 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7FA463857732 for ; Thu, 31 Oct 2024 14:09:47 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from NAM02-DM3-obe.outbound.protection.outlook.com (mail-dm3nam02on2060c.outbound.protection.outlook.com [IPv6:2a01:111:f403:2405::60c]) by sourceware.org (Postfix) with ESMTPS id 975653858CD9 for ; Thu, 31 Oct 2024 14:09:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 975653858CD9 Authentication-Results: sourceware.org; dmarc=fail (p=reject dis=none) header.from=nvidia.com Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=nvidia.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 975653858CD9 Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=2a01:111:f403:2405::60c ARC-Seal: i=2; a=rsa-sha256; d=sourceware.org; s=key; t=1730383764; cv=pass; b=eMwnJX+790dGPOPvOxOe71Xt26/OAEx5aX5l4GVOwU3wjFEtoz1VjOEgJ53TWO/vQJcv2oYySAL3yWD0/kVGqjAos9Fb1RiRfz03G4JMPXFxU69291/oY/XeoHDfTVpTqPj9cfNdrdBtNi2S/u4pXF2Lgmbem62R2dHDhLc/z6k= ARC-Message-Signature: i=2; a=rsa-sha256; d=sourceware.org; s=key; t=1730383764; c=relaxed/simple; bh=x7Pd/mnmKGAUW/AgWpmn11mXrHMAwXXt8KYX9+TFMlE=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=xuR4BatmMXXmgRPLBQQ/RNhXw1eb/UWGZBqze5+LUbrzehQxTZ8V01WHhA+3nJdwQQNKmXTR3Pibh8O898RAMvV2JGmtx0wsDhorrCVvV6w1vwl1hrRcsgdCFdArgAnwsPkPuFG4lm/LAvSbEM63WajSE5AlV98YxIt8N1odD4c= ARC-Authentication-Results: i=2; server2.sourceware.org ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=L10CK056rAw7pRRMEltPUPzFLSSIOS6BnRjbPJMafZEmZnbXmI0A0NDlyHSdmNS95LlyTYODtIpgm7dTA7H45r6Oq58R/xvuQeRt7xJdxt2GHXKgp4Zd+Tb7CgooUaqST7GtXlqup9CNvZd6vj74S5ZyWBTSxRFoOulujAYRCiYh6P1096gP34dhsvEm9HGSjJrPk15db8VfuBAdlICIEE3fgaplDufzPtGoumtlJKZJyh0GndzPL9+14RoEvppDwkLx1yGZafLSNtt1qBYE+mwHL3Ssx8V9+BCN6cUIMJj0BklZvThtjpRbqQnkTAH2ZFLMJbyCl5aQhlw58TwhPA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Fl8qB8rF0mk1/iyfm4ZjKs9sYlgIQIz4JsW03eJSEVE=; b=yAQdBs6k0Oul9Ppuie22P8si1GxN+yj1+Xli/szCRHBRh7pr75lE9Y/uJHT9ZakUIXjohkDrFJNDll3a5Vo6BSbxLvS/O0UeC6mxbU3zyhQkrgIkWMJWqQN0T5HT6/SthwcHFy/YLqEjT0aNW96HdL/vKhb0LcZhjU+NE4VNpQjk5vhM+D07fUBg0JJZg09CEqJ/fEHkUNLfbTmEOyL8PB8CvaS0E/QOsBwPE+/DlnNK9YKzfXkYmMmCLiM0uK8K7d5s76AWkoo2kkJ84CXE5JjWzGWEHQ1SaT47aZiqQ8wPWhE3ndhCnQDCFipzB5PJjAsDwaf3OcreNkYLVzX0Mg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=Fl8qB8rF0mk1/iyfm4ZjKs9sYlgIQIz4JsW03eJSEVE=; b=nrKo97IaBDL7nlE2xI/AlVoe3p+BBkB71vGBBF/Y51Dd30NpVYoKXS3BOnK5N17l/RLHxWpjhXOOSSFSFjvRvQgRgRpEC8HPG/OEiTV+PIRTTVa5U+2XLSSbsmU9lICleIceyWKx2Zh/z/Z1/O7/xAb9HmNA3R/rJ5JnB0/pTixzYzHhbQ9qlERQz3cEdFIH14EKJA7JJsmJbDOOzr4tEbOv43d8mJQUnsYA+vL14m+iRGMghE23JFlsDKM8PW2wn5wRyQ8M2wiYZEM+4WC/QDZrU4AEjnc6ot2sL1oTrzw9CV/vFrTyMV1yWIzEmf5/5QeMp96+fwXbccn7IpS9qQ== Received: from CH0PR12MB5252.namprd12.prod.outlook.com (2603:10b6:610:d3::24) by MW4PR12MB7465.namprd12.prod.outlook.com (2603:10b6:303:212::18) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8093.32; Thu, 31 Oct 2024 14:09:08 +0000 Received: from CH0PR12MB5252.namprd12.prod.outlook.com ([fe80::290b:293f:5cbd:9c9d]) by CH0PR12MB5252.namprd12.prod.outlook.com ([fe80::290b:293f:5cbd:9c9d%4]) with mapi id 15.20.8114.015; Thu, 31 Oct 2024 14:09:08 +0000 From: Jennifer Schmitz To: "gcc-patches@gcc.gnu.org" CC: Richard Sandiford , Kyrylo Tkachov Subject: [PATCH] [PR106329] SVE intrinsics: Fold calls with pfalse predicate. Thread-Topic: [PATCH] [PR106329] SVE intrinsics: Fold calls with pfalse predicate. Thread-Index: AQHbK55zXRr24NrSvE+HaP5dm7cqpA== Date: Thu, 31 Oct 2024 14:09:08 +0000 Message-ID: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; x-ms-publictraffictype: Email x-ms-traffictypediagnostic: CH0PR12MB5252:EE_|MW4PR12MB7465:EE_ x-ms-office365-filtering-correlation-id: 26be637a-e86a-4ce6-3855-08dcf9b5964f x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0; ARA:13230040|366016|376014|1800799024|38070700018; x-microsoft-antispam-message-info: YwHUT1R8HljE033TxDy4WeoCwFLyv1xmggY+/x6n0MDNQDwlZ0SEfrTTx5lPQ7D1SgV+PomZXXtHnOEbU0M7+06p345oT14UuOJVEZnax2V2I/MDEiVPGwJ/VebRW6rehH1C2qi0cy8M2A0kImZH4xOrdnsAwqyusDHp5s2JMxunU2UL2sVtBJKA5QupnY9JXu65hWd9JZMkZxb8FED3QuT24RI8LPqCkfRtFb+S/Dg01FqnfFtUei2snRgnlswV7QYZHqZatstfIGxwaHM6ExhfyIBz5vEDLtjTgp5V7dFBlObTkd8t0XKsF9D7lkE7wZsOaiWgDu/pWqL/Rh2bwd60kTIh/yjDPLSmsuwf6XIIS33d9ah00sZi9NUr1rVKjjpPBy0j6QRNhUrGtK1BS1g2l1lQCH7WJD0puliUZZobW7rKHv85TAnCRsTUEun88xBLGbJrg9CyxncdYlvc0HXPWQBxN8qmDZykEtMsgjGeY4W0/rrGVHzBVzLsEXIwl2Gs8MnSuInBLaAnXE6Dgman6rcYjm5rf3itc974qW7nOcvEOk2oll50MebOq3QUMd/AsjxlYJ6Fcfwrs56wnJK+Cv2Gck0FHr//Yw3SmWWQvM+KK6QFh2GWGNd4LRlpmVYB2u/Nz60kIm6JKJ8SUoMkemzcw0q3czfO/rgfS8o/CSYa/HM2ulNxQUhsWC/AAZDZrj38uxNbeN9RtXem00/82DvdrN8DymVemJDXudQ0CSC8MzVjoGIGImPVZUxVyGD/X7K7Oo8O2P+pmtKMCEN1qwzHhdqq2stE5S4ARH7heVsqeiZlvEWKyAqmuH4wWp0hv0ChSmOM8STilqlBvzq3wPDhI9zohpoc9jXJHOozGSHALmLxPnsDmGZU2bhzQPs/1+Q/wrZh8n0DxC2VplslSO8//NvfI+kV9xLulinW7AFJYSvUgTmZVaOqaydhh7BdZkVpK1iMCoArPePNnLw6Qk3kGPkUoOU51ByKshjRp4d7o8+/i9fHr6s8fYGOXrDXK5Bn6FeBME00U4Z3a5Ts34PBhj/OsFESXnOXf0R6S5Dft9GmcHcZEe059aF7TfAcba++uw6K3IQHrPqYWkQgPa7DhIxmejQywZyKKB3hPC2K8Mc5RfwzegkDxb6zTWA44vZsuINCPoTmpXApL2cnL/NTVuS/W1LlxUdYnXDTj8OGkn9oRDiFiwBlv3eWOW30PLZXI3TRY9AhsyEj5eGxeb7JAcka/AHi9fBfX32TdPy3nrYeU0pvvgNK3QQfKpDn57rSJYfDhf/pm/CLHhsJReUa8OULWprtdj710oDk/ylSSXmksvvlwpQ79B8s0JAr4Z+ukk688MpWJCrD0PqR5xBrHGH3TmuDrr0YR3aIwHKF7mwjFoc3lbYD3cZu x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:CH0PR12MB5252.namprd12.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(366016)(376014)(1800799024)(38070700018); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata-chunkcount: 1 x-ms-exchange-antispam-messagedata-0: T1VEV4ADr+o8eqc5Lhq76FsDQgECmm1SyxnuVFpaLu4w3nGIm15nNU1TOU8mKLqgfUsSK+ZYYJRLisGMwp8qF8xR43f44ag5aMKqd/mJB4VvJN0KvANh9gamgdpESvxHoYvtbrwIHU2rqAA8v4NKabXQL65nYHK4VBPZ3DY712jRMqpsK5LN/cF6tExeWpLIsMO4cmzNXQBCtiEVInZGtSHv48X3j864sooHmkE7Q3Frq+5PNjVFNDF72OmzB7ZvYYmakehSHnnIpprOGDXE7jBE7RWeInKXMq9+AJHINnS8j9NRTOlivwl2gC+dss8oGKjUq31K8eHkFIrpJel/Gworlw28CSY53b4BYTiRjZ8NZ4yKThsMmr0WM/mg3qvI+UZR9/Ce2bC7hAFerGrUCeWDAwvTFGqHD72iREsDM8z6QePoC4bbt4zl2CQZ6PMp/o5VvZ9p3ht+wY7hm78j6sQp5OdQGmjwdv5AAqb0nDGA+uKQtoO6vCRzt4AMIFSs4U036+ZKujHkM4JWNx7eY7MVUf6Jv4DB3x1uMSLPPJjuMPmUdM9R4tehzX6vIKfyRNkYPf4l7UjGE9FbTjGkBWSgOMKhFwn4tnxF9YQi+l1zI/jurpqWc7/ZgHoIn7yZFIZgvYYBqRe/nell04NdMONPwwmnnUMYy0btBOgcX+zitgxCPPlGEDhOJopwa8TakWokasgs5YajG5zJWb6YBPTpb7pALgBxQVAoemFBuRppL02UNYz5/MoQHvo2NLWLfHm1gb2jzBk8XXL344ViCSRAaQNyyE5Pq5E12Ph2wWFcZhMLIpKSnUYEGzIx2cV4LqtzO8T0DhGkM9Inobf1HrAy62GIHsdZKcZqaf62VnueqpejAPfT/2mKo34BxibtNQG38rMDZKgNpq6s79goXeRecWiz4djvVE5ATkp2Yg0TrertRsu3Xtwk/Poj5LiqB08XfrWKgS42AU8aMFFEIEbBmgopjL/LvZaPILNv22Qhtl/cOt2ED7tLh9YUrjzrwhKJX3lYISBXlGEO4EFHd1IvdqLP0rRdtpGNRAwJsFCGpKrVyUw43c08LHT0eA3oW5m2bAhb7XA4Am0wIH+/rG7/fXloTIDjhiiPh4Dk62UiJJwLrMiXw9Fj7/6eZmYxKUT63MJ2YbrfbjDsIOeTwGZuXOfg5g4tE+yV/hCdspgjrQJH0LHxwRo9qqW37bMN/Vxjm1rioxMUoTp80qDoX2JchBeoSNilTg/TZguTr1NhA9bVT/Gyx+qSPsh/3m0vQWyKqNBzxD0cF08uPGpTIo22Q0/NCWlm2XfTfZGA10fvOhYPZc8Ee7M3myUFbUvdeQhu+GLDyd7+OTXSKzCO6Buqn5QA+3jg+1B5APY2wPkRXUKU+d6pu/rEtb4YMX1JBz4VwGOCYU+SvpWLUNX1B9k04LfWlSbyXd9GJcCgFqumNFe5ZPMJ03AoLrvIJT19xjSZHyyKJ7zoIOtgfAlhFhUmt9dXayHQRs1eOiczzj6esdPQRw8osoB2YFA7XI0WwRpNxARGu1sxx2W+7EgmT3700vbDJLT7CUonMrQGISR/qUeKqyGGrE5/P5vsu0HD MIME-Version: 1.0 X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: CH0PR12MB5252.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: 26be637a-e86a-4ce6-3855-08dcf9b5964f X-MS-Exchange-CrossTenant-originalarrivaltime: 31 Oct 2024 14:09:08.5388 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: j/gWwCwHtqw7nFOae9FjyHUEOkbZn8pj786HClQ+3jjjm6B4Tmzb58jic1zDE3JUfz9hnUaO1/YbbVp7SyHnzQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: MW4PR12MB7465 X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org If an SVE intrinsic has predicate pfalse, we can fold the call to a simplified assignment statement: For _m, _x, and implicit predication, the LHS can be assigned the operand for inactive values and for _z, we can assign a zero vector. For example, svint32_t foo (svint32_t op1, svint32_t op2) { return svadd_s32_m (svpfalse_b (), op1, op2); } can be folded to lhs <- op1, such that foo is compiled to just a RET. We implemented this optimization during gimple folding by calling a new method function_shape::fold_pfalse from gimple_folder::fold. The implementations of fold_pfalse in the function_shape subclasses define the expected behavior for a pfalse predicate for the different predications and return an appropriate gimple statement. To avoid code duplication, function_shape::fold_pfalse calls a new method gimple_folder:fold_by_pred that takes arguments of type tree for each predication and returns the new assignment statement depending on the predication. We tested the new behavior for each intrinsic with all supported predications and data types and checked the produced assembly. There is a test file for each shape subclass with scan-assembler-times tests that look for the simplified instruction sequences, such as individual RET instructions or zeroing moves. There is an additional directive counting the total number of functions in the test, which must be the sum of counts of all other directives. This is to check that all tested intrinsics were optimized. In this patch, we only implemented function_shape::fold_pfalse for binary shapes. But we plan to cover more shapes in follow-up patches, after getting feedback on this patch. The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. OK for mainline? Signed-off-by: Jennifer Schmitz gcc/ PR target/106329 * config/aarch64/aarch64-sve-builtins-shapes.cc (struct binary_def): Implement fold_pfalse to simplify function calls with predicate pfalse. (struct binary_int_opt_n_def): Likewise. (struct binary_int_opt_single_n_def): Likewise. (struct binary_opt_n_def): Likewise. (struct binary_opt_single_n_def): Likewise. (struct binary_rotate_def): Likewise. (struct binary_to_uint_def): Likewise. (struct binary_uint_opt_n_def): Likewise. (struct binary_uint64_opt_n_def): Likewise. (struct binary_wide_def): Likewise. * config/aarch64/aarch64-sve-builtins.cc (is_pfalse): New function checking whether a given tree is a pfalse predicate. (gimple_folder::arg_is_pfalse_p): New function checking that the argument at a given index is a pfalse predicate. (gimple_folder::fold_by_pred): New function that folds the call based on the predication. (gimple_folder::fold): Call function_shape::fold_pfalse. * config/aarch64/aarch64-sve-builtins.h: Declare is_pfalse, gimple_folder::fold_by_pred, and gimple_folder::arg_is_pfalse_p. gcc/testsuite/ PR target/106329 * gcc.target/aarch64/pfalse-binary_0.c: New test. * gcc.target/aarch64/sve/pfalse-binary.c: New test. * gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c: New test. * gcc.target/aarch64/sve/pfalse-binary_opt_n.c: New test. * gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c: New test. * gcc.target/aarch64/sve/pfalse-binary_rotate.c: New test. * gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c: New test. * gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c: New test. * gcc.target/aarch64/sve2/pfalse-binary.c: New test. * gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c: New test. * gcc.target/aarch64/sve2/pfalse-binary_opt_n.c: New test. * gcc.target/aarch64/sve2/pfalse-binary_to_uint.c: New test. * gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c: New test. * gcc.target/aarch64/sve2/pfalse-binary_wide.c: New test. Signed-off-by: Jennifer Schmitz Signed-off-by: Jennifer Schmitz --- .../aarch64/aarch64-sve-builtins-shapes.cc | 112 +++++++++++ gcc/config/aarch64/aarch64-sve-builtins.cc | 53 ++++++ gcc/config/aarch64/aarch64-sve-builtins.h | 9 + .../gcc.target/aarch64/pfalse-binary_0.c | 176 ++++++++++++++++++ .../gcc.target/aarch64/sve/pfalse-binary.c | 13 ++ .../aarch64/sve/pfalse-binary_int_opt_n.c | 10 + .../aarch64/sve/pfalse-binary_opt_n.c | 30 +++ .../aarch64/sve/pfalse-binary_opt_single_n.c | 13 ++ .../aarch64/sve/pfalse-binary_rotate.c | 26 +++ .../aarch64/sve/pfalse-binary_uint64_opt_n.c | 12 ++ .../aarch64/sve/pfalse-binary_uint_opt_n.c | 12 ++ .../gcc.target/aarch64/sve2/pfalse-binary.c | 13 ++ .../sve2/pfalse-binary_int_opt_single_n.c | 10 + .../aarch64/sve2/pfalse-binary_opt_n.c | 16 ++ .../aarch64/sve2/pfalse-binary_to_uint.c | 9 + .../aarch64/sve2/pfalse-binary_uint_opt_n.c | 10 + .../aarch64/sve2/pfalse-binary_wide.c | 10 + 17 files changed, 534 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/pfalse-binary_0.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_opt_n.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_rotate.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_opt_n.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_to_uint.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_wide.c diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc index f190770250f..3350ecfcca4 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc @@ -22,6 +22,10 @@ #include "coretypes.h" #include "tm.h" #include "tree.h" +#include "basic-block.h" +#include "function.h" +#include "gimple.h" +#include "gimple-iterator.h" #include "rtl.h" #include "tm_p.h" #include "memmodel.h" @@ -1244,6 +1248,17 @@ struct binary_def : public overloaded_base<0> { return r.resolve_uniform (2); } + + gimple * + fold_pfalse (gimple_folder &f) const override + { + if (!f.arg_is_pfalse_p (0)) + return NULL; + return f.fold_by_pred (gimple_call_arg (f.call, 1), + gimple_call_arg (f.call, 1), + build_zero_cst (TREE_TYPE (f.lhs)), + gimple_call_arg (f.call, 2)); + } }; SHAPE (binary) @@ -1273,6 +1288,17 @@ struct binary_int_opt_n_def : public overloaded_base<0> return r.finish_opt_n_resolution (i + 1, i, type, TYPE_signed); } + + gimple * + fold_pfalse (gimple_folder &f) const override + { + if (!f.arg_is_pfalse_p (0)) + return NULL; + return f.fold_by_pred (gimple_call_arg (f.call, 1), + gimple_call_arg (f.call, 1), + build_zero_cst (TREE_TYPE (f.lhs)), + NULL); + } }; SHAPE (binary_int_opt_n) @@ -1308,6 +1334,17 @@ struct binary_int_opt_single_n_def : public overloaded_base<0> ? r.finish_opt_n_resolution (i + 1, i, type.type, TYPE_signed) : r.finish_opt_single_resolution (i + 1, i, type, TYPE_signed)); } + + gimple * + fold_pfalse (gimple_folder &f) const override + { + if (!f.arg_is_pfalse_p (0)) + return NULL; + return f.fold_by_pred (gimple_call_arg (f.call, 1), + gimple_call_arg (f.call, 1), + build_zero_cst (TREE_TYPE (f.lhs)), + NULL); + } }; SHAPE (binary_int_opt_single_n) @@ -1512,6 +1549,17 @@ struct binary_opt_n_def : public overloaded_base<0> { return r.resolve_uniform_opt_n (2); } + + gimple * + fold_pfalse (gimple_folder &f) const override + { + if (!f.arg_is_pfalse_p (0)) + return NULL; + return f.fold_by_pred (gimple_call_arg (f.call, 1), + gimple_call_arg (f.call, 1), + build_zero_cst (TREE_TYPE (f.lhs)), + NULL); + } }; SHAPE (binary_opt_n) @@ -1547,6 +1595,17 @@ struct binary_opt_single_n_def : public overloaded_base<0> ? r.finish_opt_n_resolution (i + 1, i, type.type) : r.finish_opt_single_resolution (i + 1, i, type)); } + + gimple * + fold_pfalse (gimple_folder &f) const override + { + if (!f.arg_is_pfalse_p (0)) + return NULL; + return f.fold_by_pred (gimple_call_arg (f.call, 1), + gimple_call_arg (f.call, 1), + build_zero_cst (TREE_TYPE (f.lhs)), + NULL); + } }; SHAPE (binary_opt_single_n) @@ -1584,6 +1643,17 @@ struct binary_rotate_def : public overloaded_base<0> { return c.require_immediate_either_or (2, 90, 270); } + + gimple * + fold_pfalse (gimple_folder &f) const override + { + if (!f.arg_is_pfalse_p (0)) + return NULL; + return f.fold_by_pred (gimple_call_arg (f.call, 1), + gimple_call_arg (f.call, 1), + build_zero_cst (TREE_TYPE (f.lhs)), + NULL); + } }; SHAPE (binary_rotate) @@ -1645,6 +1715,15 @@ struct binary_to_uint_def : public overloaded_base<0> { return r.resolve_uniform (2); } + + gimple * + fold_pfalse (gimple_folder &f) const override + { + if (!f.arg_is_pfalse_p (0)) + return NULL; + return f.fold_by_pred (NULL, NULL, + build_zero_cst (TREE_TYPE (f.lhs)), NULL); + } }; SHAPE (binary_to_uint) @@ -1730,6 +1809,17 @@ struct binary_uint_opt_n_def : public overloaded_base<0> return r.finish_opt_n_resolution (i + 1, i, type, TYPE_unsigned); } + + gimple * + fold_pfalse (gimple_folder &f) const override + { + if (!f.arg_is_pfalse_p (0)) + return NULL; + return f.fold_by_pred (gimple_call_arg (f.call, 1), + gimple_call_arg (f.call, 1), + build_zero_cst (TREE_TYPE (f.lhs)), + NULL); + } }; SHAPE (binary_uint_opt_n) @@ -1787,6 +1877,17 @@ struct binary_uint64_opt_n_def : public overloaded_base<0> return r.finish_opt_n_resolution (i + 1, i, type, TYPE_unsigned, 64); } + + gimple * + fold_pfalse (gimple_folder &f) const override + { + if (!f.arg_is_pfalse_p (0)) + return NULL; + return f.fold_by_pred (gimple_call_arg (f.call, 1), + gimple_call_arg (f.call, 1), + build_zero_cst (TREE_TYPE (f.lhs)), + NULL); + } }; SHAPE (binary_uint64_opt_n) @@ -1813,6 +1914,17 @@ struct binary_wide_def : public overloaded_base<0> return r.resolve_to (r.mode_suffix_id, type); } + + gimple * + fold_pfalse (gimple_folder &f) const override + { + if (!f.arg_is_pfalse_p (0)) + return NULL; + return f.fold_by_pred (gimple_call_arg (f.call, 1), + gimple_call_arg (f.call, 1), + build_zero_cst (TREE_TYPE (f.lhs)), + NULL); + } }; SHAPE (binary_wide) diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index af6469fff71..5409fdff3e5 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -3459,6 +3459,15 @@ is_ptrue (tree v, unsigned int step) && vector_cst_all_same (v, step)); } +/* Return true if V is a constant predicate that acts as a pfalse. */ +bool +is_pfalse (tree v) +{ + return (TREE_CODE (v) == VECTOR_CST + && TYPE_MODE (TREE_TYPE (v)) == VNx16BImode + && integer_zerop (v)); +} + gimple_folder::gimple_folder (const function_instance &instance, tree fndecl, gimple_stmt_iterator *gsi_in, gcall *call_in) : function_call_info (gimple_location (call_in), instance, fndecl), @@ -3564,6 +3573,14 @@ gimple_folder::redirect_pred_x () return redirect_call (instance); } +/* Return true if the argument at position IDX is pfalse, + else return false. */ +bool +gimple_folder::arg_is_pfalse_p (unsigned int idx) +{ + return is_pfalse (gimple_call_arg (call, idx)); +} + /* Fold the call to constant VAL. */ gimple * gimple_folder::fold_to_cstu (poly_uint64 val) @@ -3666,6 +3683,39 @@ gimple_folder::fold_active_lanes_to (tree x) return gimple_build_assign (lhs, VEC_COND_EXPR, pred, x, vec_inactive); } +/* Fold call to assignment statement + lhs = new_lhs, + where new_lhs is determined by the predication. + Return the gimple statement on success, else return NULL. */ +gimple * +gimple_folder::fold_by_pred (tree m, tree x, tree z, tree implicit) +{ + tree new_lhs = NULL; + switch (pred) + { + case PRED_z: + new_lhs = z; + break; + case PRED_m: + new_lhs = m; + break; + case PRED_x: + new_lhs = x; + break; + case PRED_implicit: + new_lhs = implicit; + break; + default: + return NULL; + } + gcc_assert (new_lhs); + gimple_seq stmts = NULL; + gimple *g = gimple_build_assign (lhs, new_lhs); + gimple_seq_add_stmt_without_update (&stmts, g); + gsi_replace_with_seq_vops (gsi, stmts); + return g; +} + /* Try to fold the call. Return the new statement on success and null on failure. */ gimple * @@ -3685,6 +3735,9 @@ gimple_folder::fold () /* First try some simplifications that are common to many functions. */ if (auto *call = redirect_pred_x ()) return call; + if (pred != PRED_none) + if (auto *call = shape->fold_pfalse (*this)) + return call; return base->fold (*this); } diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index 4cdc0541bdc..4e443a8192e 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -632,12 +632,15 @@ public: gcall *redirect_call (const function_instance &); gimple *redirect_pred_x (); + bool arg_is_pfalse_p (unsigned int idx); + gimple *fold_to_cstu (poly_uint64); gimple *fold_to_pfalse (); gimple *fold_to_ptrue (); gimple *fold_to_vl_pred (unsigned int); gimple *fold_const_binary (enum tree_code); gimple *fold_active_lanes_to (tree); + gimple *fold_by_pred (tree, tree, tree, tree); gimple *fold (); @@ -796,6 +799,11 @@ public: /* Check whether the given call is semantically valid. Return true if it is, otherwise report an error and return false. */ virtual bool check (function_checker &) const { return true; } + + /* For a pfalse predicate, try to fold the given gimple call. + Return the new gimple statement on success, otherwise return null. */ + virtual gimple *fold_pfalse (gimple_folder &) const { return NULL; } + }; /* RAII class for enabling enough SVE features to define the built-in @@ -829,6 +837,7 @@ extern tree acle_svprfop; bool vector_cst_all_same (tree, unsigned int); bool is_ptrue (tree, unsigned int); +bool is_pfalse (tree); const function_instance *lookup_fndecl (tree); /* Try to find a mode with the given mode_suffix_info fields. Return the diff --git a/gcc/testsuite/gcc.target/aarch64/pfalse-binary_0.c b/gcc/testsuite/gcc.target/aarch64/pfalse-binary_0.c new file mode 100644 index 00000000000..3910ab36b6b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pfalse-binary_0.c @@ -0,0 +1,176 @@ +#include + +#define MXZ(F, RTY, TY1, TY2) \ + RTY F##_f (TY1 op1, TY2 op2) \ + { \ + return sv##F (svpfalse_b (), op1, op2); \ + } + +#define PRED_MXv(F, RTY, TYPE1, TYPE2, TY) \ + MXZ (F##_##TY##_m, sv##RTY, sv##TYPE1, sv##TYPE2) \ + MXZ (F##_##TY##_x, sv##RTY, sv##TYPE1, sv##TYPE2) + +#define PRED_Zv(F, RTY, TYPE1, TYPE2, TY) \ + MXZ (F##_##TY##_z, sv##RTY, sv##TYPE1, sv##TYPE2) + +#define PRED_MXZv(F, RTY, TYPE1, TYPE2, TY) \ + PRED_MXv (F, RTY, TYPE1, TYPE2, TY) \ + PRED_Zv (F, RTY, TYPE1, TYPE2, TY) + +#define PRED_Z(F, RTY, TYPE1, TYPE2, TY) \ + PRED_Zv (F, RTY, TYPE1, TYPE2, TY) \ + MXZ (F##_n_##TY##_z, sv##RTY, sv##TYPE1, TYPE2) + +#define PRED_MXZ(F, RTY, TYPE1, TYPE2, TY) \ + PRED_MXv (F, RTY, TYPE1, TYPE2, TY) \ + MXZ (F##_n_##TY##_m, sv##RTY, sv##TYPE1, TYPE2) \ + MXZ (F##_n_##TY##_x, sv##RTY, sv##TYPE1, TYPE2) \ + PRED_Z (F, RTY, TYPE1, TYPE2, TY) + +#define PRED_IMPLICIT(F, RTY, TYPE1, TYPE2, TY) \ + MXZ (F##_##TY, sv##RTY, sv##TYPE1, sv##TYPE2) + +#define ALL_Q_INTEGER(F, P) \ + PRED_##P (F, uint8_t, uint8_t, uint8_t, u8) \ + PRED_##P (F, int8_t, int8_t, int8_t, s8) + +#define ALL_Q_INTEGER_UINT(F, P) \ + PRED_##P (F, uint8_t, uint8_t, uint8_t, u8) \ + PRED_##P (F, int8_t, int8_t, uint8_t, s8) + +#define ALL_Q_INTEGER_INT(F, P) \ + PRED_##P (F, uint8_t, uint8_t, int8_t, u8) \ + PRED_##P (F, int8_t, int8_t, int8_t, s8) + +#define ALL_H_INTEGER(F, P) \ + PRED_##P (F, uint16_t, uint16_t, uint16_t, u16) \ + PRED_##P (F, int16_t, int16_t, int16_t, s16) + +#define ALL_H_INTEGER_UINT(F, P) \ + PRED_##P (F, uint16_t, uint16_t, uint16_t, u16) \ + PRED_##P (F, int16_t, int16_t, uint16_t, s16) + +#define ALL_H_INTEGER_INT(F, P) \ + PRED_##P (F, uint16_t, uint16_t, int16_t, u16) \ + PRED_##P (F, int16_t, int16_t, int16_t, s16) + +#define ALL_H_INTEGER_WIDE(F, P) \ + PRED_##P (F, uint16_t, uint16_t, uint8_t, u16) \ + PRED_##P (F, int16_t, int16_t, int8_t, s16) + +#define ALL_S_INTEGER(F, P) \ + PRED_##P (F, uint32_t, uint32_t, uint32_t, u32) \ + PRED_##P (F, int32_t, int32_t, int32_t, s32) + +#define ALL_S_INTEGER_UINT(F, P) \ + PRED_##P (F, uint32_t, uint32_t, uint32_t, u32) \ + PRED_##P (F, int32_t, int32_t, uint32_t, s32) + +#define ALL_S_INTEGER_INT(F, P) \ + PRED_##P (F, uint32_t, uint32_t, int32_t, u32) \ + PRED_##P (F, int32_t, int32_t, int32_t, s32) + +#define ALL_S_INTEGER_WIDE(F, P) \ + PRED_##P (F, uint32_t, uint32_t, uint16_t, u32) \ + PRED_##P (F, int32_t, int32_t, int16_t, s32) + +#define ALL_D_INTEGER(F, P) \ + PRED_##P (F, uint64_t, uint64_t, uint64_t, u64) \ + PRED_##P (F, int64_t, int64_t, int64_t, s64) + +#define ALL_D_INTEGER_UINT(F, P) \ + PRED_##P (F, uint64_t, uint64_t, uint64_t, u64) \ + PRED_##P (F, int64_t, int64_t, uint64_t, s64) + +#define ALL_D_INTEGER_INT(F, P) \ + PRED_##P (F, uint64_t, uint64_t, int64_t, u64) \ + PRED_##P (F, int64_t, int64_t, int64_t, s64) + +#define ALL_D_INTEGER_WIDE(F, P) \ + PRED_##P (F, uint64_t, uint64_t, uint32_t, u64) \ + PRED_##P (F, int64_t, int64_t, int32_t, s64) + +#define SD_INTEGER_TO_UINT(F, P) \ + PRED_##P (F, uint32_t, uint32_t, uint32_t, u32) \ + PRED_##P (F, uint64_t, uint64_t, uint64_t, u64) \ + PRED_##P (F, uint32_t, int32_t, int32_t, s32) \ + PRED_##P (F, uint64_t, int64_t, int64_t, s64) + +#define BHS_UNSIGNED_UINT64(F, P) \ + PRED_##P (F, uint8_t, uint8_t, uint64_t, u8) \ + PRED_##P (F, uint16_t, uint16_t, uint64_t, u16) \ + PRED_##P (F, uint32_t, uint32_t, uint64_t, u32) + +#define BHS_SIGNED_UINT64(F, P) \ + PRED_##P (F, int8_t, int8_t, uint64_t, s8) \ + PRED_##P (F, int16_t, int16_t, uint64_t, s16) \ + PRED_##P (F, int32_t, int32_t, uint64_t, s32) + +#define ALL_UNSIGNED_UINT(F, P) \ + PRED_##P (F, uint8_t, uint8_t, uint8_t, u8) \ + PRED_##P (F, uint16_t, uint16_t, uint16_t, u16) \ + PRED_##P (F, uint32_t, uint32_t, uint32_t, u32) \ + PRED_##P (F, uint64_t, uint64_t, uint64_t, u64) + +#define ALL_SIGNED_UINT(F, P) \ + PRED_##P (F, int8_t, int8_t, uint8_t, s8) \ + PRED_##P (F, int16_t, int16_t, uint16_t, s16) \ + PRED_##P (F, int32_t, int32_t, uint32_t, s32) \ + PRED_##P (F, int64_t, int64_t, uint64_t, s64) + +#define ALL_FLOAT(F, P) \ + PRED_##P (F, float16_t, float16_t, float16_t, f16) \ + PRED_##P (F, float32_t, float32_t, float32_t, f32) \ + PRED_##P (F, float64_t, float64_t, float64_t, f64) + +#define ALL_FLOAT_INT(F, P) \ + PRED_##P (F, float16_t, float16_t, int16_t, f16) \ + PRED_##P (F, float32_t, float32_t, int32_t, f32) \ + PRED_##P (F, float64_t, float64_t, int64_t, f64) + +#define B(F, P) \ + PRED_##P (F, bool_t, bool_t, bool_t, b) + +#define ALL_SD_INTEGER(F, P) \ + ALL_S_INTEGER (F, P) \ + ALL_D_INTEGER (F, P) + +#define HSD_INTEGER_WIDE(F, P) \ + ALL_H_INTEGER_WIDE (F, P) \ + ALL_S_INTEGER_WIDE (F, P) \ + ALL_D_INTEGER_WIDE (F, P) + +#define BHS_INTEGER_UINT64(F, P) \ + BHS_UNSIGNED_UINT64 (F, P) \ + BHS_SIGNED_UINT64 (F, P) + +#define ALL_INTEGER(F, P) \ + ALL_Q_INTEGER (F, P) \ + ALL_H_INTEGER (F, P) \ + ALL_S_INTEGER (F, P) \ + ALL_D_INTEGER (F, P) + +#define ALL_INTEGER_UINT(F, P) \ + ALL_Q_INTEGER_UINT (F, P) \ + ALL_H_INTEGER_UINT (F, P) \ + ALL_S_INTEGER_UINT (F, P) \ + ALL_D_INTEGER_UINT (F, P) + +#define ALL_INTEGER_INT(F, P) \ + ALL_Q_INTEGER_INT (F, P) \ + ALL_H_INTEGER_INT (F, P) \ + ALL_S_INTEGER_INT (F, P) \ + ALL_D_INTEGER_INT (F, P) + +#define ALL_FLOAT_AND_SD_INTEGER(F, P) \ + ALL_SD_INTEGER (F, P) \ + ALL_FLOAT (F, P) + +#define ALL_ARITH(F, P) \ + ALL_INTEGER (F, P) \ + ALL_FLOAT (F, P) + +#define ALL_DATA(F, P) \ + ALL_ARITH (F, P) \ + PRED_##P (F, bfloat16_t, bfloat16_t, bfloat16_t, bf16) + diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary.c new file mode 100644 index 00000000000..ef629413185 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.c" + +B (brkn, Zv) +B (brkpa, Zv) +B (brkpb, Zv) +ALL_DATA (splice, IMPLICIT) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tpfalse\tp0\.b\n\tret\n} 3 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmov\tz0\.d, z1\.d\n\tret\n} 12 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 15 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c new file mode 100644 index 00000000000..91d574f9249 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.c" + +ALL_FLOAT_INT (scale, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 12 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 6 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 18 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_opt_n.c new file mode 100644 index 00000000000..25c793ff40f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_opt_n.c @@ -0,0 +1,30 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.c" + +ALL_ARITH (abd, MXZ) +ALL_ARITH (add, MXZ) +ALL_INTEGER (and, MXZ) +B (and, Zv) +ALL_INTEGER (bic, MXZ) +B (bic, Zv) +ALL_FLOAT_AND_SD_INTEGER (div, MXZ) +ALL_FLOAT_AND_SD_INTEGER (divr, MXZ) +ALL_INTEGER (eor, MXZ) +B (eor, Zv) +ALL_ARITH (mul, MXZ) +ALL_INTEGER (mulh, MXZ) +ALL_FLOAT (mulx, MXZ) +B (nand, Zv) +B (nor, Zv) +B (orn, Zv) +ALL_INTEGER (orr, MXZ) +B (orr, Zv) +ALL_ARITH (sub, MXZ) +ALL_ARITH (subr, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 448 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 224 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tpfalse\tp0\.b\n\tret\n} 7 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 679 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c new file mode 100644 index 00000000000..8d187c22eec --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.c" + +ALL_ARITH (max, MXZ) +ALL_ARITH (min, MXZ) +ALL_FLOAT (maxnm, MXZ) +ALL_FLOAT (minnm, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 112 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 56 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 168 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_rotate.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_rotate.c new file mode 100644 index 00000000000..9940866d5ef --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_rotate.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include + +#define MXZ4(F, TYPE) \ + TYPE F##_f (TYPE op1, TYPE op2) \ + { \ + return sv##F (svpfalse_b (), op1, op2, 90); \ + } + +#define PRED_MXZ(F, TYPE, TY) \ + MXZ4 (F##_##TY##_m, TYPE) \ + MXZ4 (F##_##TY##_x, TYPE) \ + MXZ4 (F##_##TY##_z, TYPE) + +#define ALL_FLOAT(F, P) \ + PRED_##P (F, svfloat16_t, f16) \ + PRED_##P (F, svfloat32_t, f32) \ + PRED_##P (F, svfloat64_t, f64) + +ALL_FLOAT (cadd, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 6 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 3 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 9 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c new file mode 100644 index 00000000000..f8fd18043e6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.c" + +BHS_SIGNED_UINT64 (asr_wide, MXZ) +BHS_INTEGER_UINT64 (lsl_wide, MXZ) +BHS_UNSIGNED_UINT64 (lsr_wide, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 48 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 24 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 72 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c new file mode 100644 index 00000000000..2f1d7721bc8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.c" + +ALL_SIGNED_UINT (asr, MXZ) +ALL_INTEGER_UINT (lsl, MXZ) +ALL_UNSIGNED_UINT (lsr, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 64 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 32 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 96 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary.c new file mode 100644 index 00000000000..723fcd0a203 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.c" + +ALL_ARITH (addp, MXv) +ALL_ARITH (maxp, MXv) +ALL_FLOAT (maxnmp, MXv) +ALL_ARITH (minp, MXv) +ALL_FLOAT (minnmp, MXv) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 78 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 78 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c new file mode 100644 index 00000000000..6e8be86f9b0 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.c" + +ALL_INTEGER_INT (rshl, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 32 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 16 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 48 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_opt_n.c new file mode 100644 index 00000000000..7335a4ff011 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_opt_n.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.c" + +ALL_INTEGER (hadd, MXZ) +ALL_INTEGER (hsub, MXZ) +ALL_INTEGER (hsubr, MXZ) +ALL_INTEGER (qadd, MXZ) +ALL_INTEGER (qsub, MXZ) +ALL_INTEGER (qsubr, MXZ) +ALL_INTEGER (rhadd, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 224 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 112 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 336 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_to_uint.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_to_uint.c new file mode 100644 index 00000000000..e03e1e890f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_to_uint.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.c" + +SD_INTEGER_TO_UINT (histcnt, Zv) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 4 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 4 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c new file mode 100644 index 00000000000..2649fc01954 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.c" + +ALL_SIGNED_UINT (uqadd, MXZ) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 16 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 8 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 24 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_wide.c b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_wide.c new file mode 100644 index 00000000000..72693d01ad0 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pfalse-binary_wide.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include "../pfalse-binary_0.c" + +HSD_INTEGER_WIDE (adalp, MXZv) + +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tret\n} 12 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n\tmovi?\t[vdz]([0-9]+)\.?(?:[0-9]*[bhsd])?, #?0\n\tret\n} 6 } } */ +/* { dg-final { scan-assembler-times {\t.cfi_startproc\n} 18 } } */