From patchwork Wed Oct 16 15:38:16 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jennifer Schmitz
X-Patchwork-Id: 1998131
From: Jennifer Schmitz
To: "gcc-patches@gcc.gnu.org"
CC: Richard Sandiford, Kyrylo Tkachov
Subject: [PATCH] SVE intrinsics: Add constant folding for svindex.
Date: Wed, 16 Oct 2024 15:38:16 +0000
X-BeenThere: gcc-patches@gcc.gnu.org
Precedence: list
List-Id: Gcc-patches mailing list

This patch folds svindex with constant arguments into a vector series.
We implemented this in svindex_impl::fold using the function build_vec_series.
For example,
svuint64_t f1 ()
{
  return svindex_u64 (10, 3);
}
compiled with -O2 -march=armv8.2-a+sve, is folded to {10, 13, 16, ...}
in the gimple pass lower.
This optimization benefits cases where svindex is used in combination with
other gimple-level optimizations.
For example,
svuint64_t f2 ()
{
  return svmul_x (svptrue_b64 (), svindex_u64 (10, 3), 5);
}
was previously compiled to
f2:
        index   z0.d, #10, #3
        mul     z0.d, z0.d, #5
        ret
Now, it is compiled to
f2:
        mov     x0, 50
        index   z0.d, x0, #15
        ret

For non-constant arguments, build_vec_series produces a VEC_SERIES_EXPR,
which is translated back at RTL level to an index instruction without codegen
changes.

We added test cases checking
- the application of the transform during gimple for constant arguments,
- the interaction with another gimple-level optimization.

The patch was bootstrapped and regtested on aarch64-linux-gnu with no
regressions.
OK for mainline?

Signed-off-by: Jennifer Schmitz

gcc/
	* config/aarch64/aarch64-sve-builtins-base.cc
	(svindex_impl::fold): Add constant folding.

gcc/testsuite/
	* gcc.target/aarch64/sve/index_const_fold.c: New test.
---
 .../aarch64/aarch64-sve-builtins-base.cc      | 12 +++++++
 .../gcc.target/aarch64/sve/index_const_fold.c | 35 +++++++++++++++++++
 2 files changed, 47 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 1c17149e1f0..f6b1657ecbb 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -1304,6 +1304,18 @@ public:
 
 class svindex_impl : public function_base
 {
+public:
+  gimple *
+  fold (gimple_folder &f) const override
+  {
+    tree vec_type = TREE_TYPE (f.lhs);
+    tree base = gimple_call_arg (f.call, 0);
+    tree step = gimple_call_arg (f.call, 1);
+
+    return gimple_build_assign (f.lhs,
+				build_vec_series (vec_type, base, step));
+  }
+
 public:
   rtx
   expand (function_expander &e) const override
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c b/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c
new file mode 100644
index 00000000000..7abb803f58b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/index_const_fold.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#include <arm_sve.h>
+#include <stdint.h>
+
+#define INDEX_CONST(TYPE, TY)				\
+  sv##TYPE f_##TY##_index_const ()			\
+  {							\
+    return svindex_##TY (10, 3);			\
+  }
+
+#define MULT_INDEX(TYPE, TY)				\
+  sv##TYPE f_##TY##_mult_index ()			\
+  {							\
+    return svmul_x (svptrue_b8 (),			\
+		    svindex_##TY (10, 3),		\
+		    5);					\
+  }
+
+#define ALL_TESTS(TYPE, TY)				\
+  INDEX_CONST (TYPE, TY)				\
+  MULT_INDEX (TYPE, TY)
+
+ALL_TESTS (uint8_t, u8)
+ALL_TESTS (uint16_t, u16)
+ALL_TESTS (uint32_t, u32)
+ALL_TESTS (uint64_t, u64)
+ALL_TESTS (int8_t, s8)
+ALL_TESTS (int16_t, s16)
+ALL_TESTS (int32_t, s32)
+ALL_TESTS (int64_t, s64)
+
+/* { dg-final { scan-tree-dump-times "return \\{ 10, 13, 16, ... \\}" 8 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return \\{ 50, 65, 80, ... \\}" 8 "optimized" } } */
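
Note on the arithmetic behind the f2 example in the commit message: the
sketch below is a plain scalar model, not GCC code and not the SVE
intrinsics themselves. It assumes that element i of an index series is
base + i * step, and that svmul_x with an all-true predicate multiplies
every element by the scalar. The names index_series, mul_scalar,
folded_mul_index and the lanes parameter are hypothetical and exist only
for illustration. It shows why multiplying the series (10, 3) by 5 can be
rewritten as the single series (50, 15), which matches the new
"mov x0, 50; index z0.d, x0, #15" sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of an index series: element i is base + i * step.
// `lanes` is a hypothetical stand-in for the runtime vector length.
std::vector<std::uint64_t>
index_series (std::uint64_t base, std::uint64_t step, std::size_t lanes)
{
  std::vector<std::uint64_t> v (lanes);
  for (std::size_t i = 0; i < lanes; ++i)
    v[i] = base + i * step;
  return v;
}

// Element-wise multiply by a scalar, modelling a multiply under an
// all-true predicate.
std::vector<std::uint64_t>
mul_scalar (std::vector<std::uint64_t> v, std::uint64_t c)
{
  for (auto &x : v)
    x *= c;
  return v;
}

// Folded form: (base + i * step) * c == base * c + i * (step * c),
// so the product is itself an index series.
std::vector<std::uint64_t>
folded_mul_index (std::uint64_t base, std::uint64_t step,
		  std::uint64_t c, std::size_t lanes)
{
  return index_series (base * c, step * c, lanes);
}

// Checks the two forms against each other for the values used in the
// commit message (base 10, step 3, multiplier 5).
void
self_check ()
{
  const std::size_t lanes = 4;
  assert (mul_scalar (index_series (10, 3, lanes), 5)
	  == folded_mul_index (10, 3, 5, lanes));
}
```

Calling self_check () compares the unfolded and folded forms on four
lanes. The first three elements of the folded series are 50, 65 and 80,
the same values the second dg-final directive in the new test scans for.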