From patchwork Wed Oct 16 08:32:57 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jennifer Schmitz
X-Patchwork-Id: 1997865
From: Jennifer Schmitz
To: "gcc-patches@gcc.gnu.org"
CC: Richard Sandiford, Kyrylo Tkachov
Subject: [PATCH] SVE intrinsics: Add constant folding for svindex.
Date: Wed, 16 Oct 2024 08:32:57 +0000
Message-ID: <2C4AC631-3F0B-4131-95C3-E6AA81AAB71C@nvidia.com>
List-Id: Gcc-patches mailing list

This patch folds svindex with constant arguments into a vector series.
We implemented this in svindex_impl::fold using the function build_vec_series.
For example,

svuint64_t f1 ()
{
  return svindex_u64 (10, 3);
}

compiled with -O2 -march=armv8.2-a+sve, is folded to {10, 13, 16, ...}
in the gimple pass lower.
This optimization benefits cases where svindex is used in combination with
other gimple-level optimizations.
For example,

svuint64_t f2 ()
{
  return svmul_x (svptrue_b64 (), svindex_u64 (10, 3), 5);
}

was previously compiled to

f2:
        index   z0.d, #10, #3
        mul     z0.d, z0.d, #5
        ret

Now, it is compiled to

f2:
        mov     x0, 50
        index   z0.d, x0, #15
        ret

For non-constant arguments, build_vec_series produces a VEC_SERIES_EXPR,
which is translated back at RTL level to an index instruction without codegen
changes.

We added test cases checking
- the application of the transform during gimple for constant arguments,
- the interaction with another gimple-level optimization.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz

gcc/
        * config/aarch64/aarch64-sve-builtins-base.cc
        (svindex_impl::fold): Add constant folding.

gcc/testsuite/
        * gcc.target/aarch64/sve/index_const_fold.c: New test.
---
 .../aarch64/aarch64-sve-builtins-base.cc      | 12 +++++++
 .../gcc.target/aarch64/sve/index_const_fold.c | 35 +++++++++++++++++++
 2 files changed, 47 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 1c17149e1f0..f6b1657ecbb 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -1304,6 +1304,18 @@ public:
 
 class svindex_impl : public function_base
 {
+public:
+  gimple *
+  fold (gimple_folder &f) const override
+  {
+    tree vec_type = TREE_TYPE (f.lhs);
+    tree base = gimple_call_arg (f.call, 0);
+    tree step = gimple_call_arg (f.call, 1);
+
+    return gimple_build_assign (f.lhs,
+                                build_vec_series (vec_type, base, step));
+  }
+
 public:
   rtx
   expand (function_expander &e) const override
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c b/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c
new file mode 100644
index 00000000000..f5e6c0f7a85
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#include <arm_sve.h>
+#include <stdint.h>
+
+#define INDEX_CONST(TYPE, TY)                          \
+  sv##TYPE f_##TY##_index_const ()                     \
+  {                                                    \
+    return svindex_##TY (10, 3);                       \
+  }
+
+#define MULT_INDEX(TYPE, TY)                           \
+  sv##TYPE f_##TY##_mult_index ()                      \
+  {                                                    \
+    return svmul_x (svptrue_b8 (),                     \
+                    svindex_##TY (10, 3),              \
+                    5);                                \
+  }
+
+#define ALL_TESTS(TYPE, TY)                            \
+  INDEX_CONST (TYPE, TY)                               \
+  MULT_INDEX (TYPE, TY)
+
+ALL_TESTS (uint8_t, u8)
+ALL_TESTS (uint16_t, u16)
+ALL_TESTS (uint32_t, u32)
+ALL_TESTS (uint64_t, u64)
+ALL_TESTS (int8_t, s8)
+ALL_TESTS (int16_t, s16)
+ALL_TESTS (int32_t, s32)
+ALL_TESTS (int64_t, s64)
+
+/* { dg-final { scan-tree-dump-times "return \\{ 10, 13, 16, ... \\}" 8 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return \\{ 50, 65, 80, ... \\}" 8 "optimized" } } */
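
Note on the arithmetic behind the svmul_x case in the cover text: the
following stand-alone C sketch models a series {base, base + step, ...} and
shows that multiplying every element by a constant gives another series whose
base and step are both scaled. It is only an illustration. The names series,
series_elt, series_mul and series_fill are hypothetical and are not part of
GCC or the ACLE, the model uses 64-bit unsigned wrap-around arithmetic only,
and it makes no claim about how GCC implements the fold internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a linear series {base, base + step, ...}.
   This is an illustration only, not a GCC data structure.  */
struct series
{
  uint64_t base;
  uint64_t step;
};

/* Return element I of series S, using wrap-around unsigned arithmetic.  */
uint64_t
series_elt (struct series s, size_t i)
{
  return s.base + (uint64_t) i * s.step;
}

/* Multiply every element of S by K.  Because
   (base + i * step) * k == base * k + i * (step * k),
   the result is again a series, with scaled base and step.  */
struct series
series_mul (struct series s, uint64_t k)
{
  struct series r = { s.base * k, s.step * k };
  return r;
}

/* Write the first N elements of S to OUT.  */
void
series_fill (struct series s, uint64_t *out, size_t n)
{
  assert (out != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    out[i] = series_elt (s, i);
}
```

With s = {10, 3}, series_mul (s, 5) has base 50 and step 15, which lines up
with the "mov x0, 50" and "index z0.d, x0, #15" sequence quoted in the cover
text and with the { 50, 65, 80, ... } pattern checked by the new test.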