From patchwork Tue Nov 5 17:10:15 2024
X-Patchwork-Submitter: Jennifer Schmitz
X-Patchwork-Id: 2007020
From: Jennifer Schmitz
To: "gcc-patches@gcc.gnu.org"
CC: Richard Biener, Kyrylo Tkachov
Subject: [PATCH][RFC][PR117093] match.pd: Fold vec_perm with view_convert
Date: Tue, 5 Nov 2024 17:10:15 +0000
Message-ID: <794D9463-93D4-4EDC-AA5B-499A5557B0EF@nvidia.com>

We are working on a patch to improve the codegen for the following test case:

uint64x2_t foo (uint64x2_t r) {
    uint32x4_t a = vreinterpretq_u32_u64 (r);
    uint32_t t;
    t = a[0]; a[0] = a[1]; a[1] = t;
    t = a[2]; a[2] = a[3]; a[3] = t;
    return vreinterpretq_u64_u32 (a);
}

that GCC currently compiles to (-O1):

foo:
        mov     v31.16b, v0.16b
        ins     v0.s[0], v0.s[1]
        ins     v0.s[1], v31.s[0]
        ins     v0.s[2], v31.s[3]
        ins     v0.s[3], v31.s[2]
        ret

whereas LLVM produces the preferable sequence

foo:
        rev64   v0.4s, v0.4s
        ret

At the GIMPLE level, we currently have:

  _1 = VIEW_CONVERT_EXPR(r_3(D));
  t_4 = BIT_FIELD_REF ;
  a_5 = VEC_PERM_EXPR <_1, _1, { 1, 1, 2, 3 }>;
  a_6 = BIT_INSERT_EXPR ;
  t_7 = BIT_FIELD_REF ;
  _2 = BIT_FIELD_REF ;
  a_8 = BIT_INSERT_EXPR ;
  a_9 = BIT_INSERT_EXPR ;
  _10 = VIEW_CONVERT_EXPR(a_9);
  return _10;

whereas the desired sequence is:

  _1 = VIEW_CONVERT_EXPR(r_2(D));
  a_3 = VEC_PERM_EXPR <_1, _1, { 1, 0, 3, 2 }>;
  _4 = VIEW_CONVERT_EXPR(a_3);
  return _4;

If we remove the casts from the test case, the forwprop1 dump shows that a
series of match.pd patterns is applied (repeatedly; only the first iteration
is shown here):

Applying pattern match.pd:10881, gimple-match-1.cc:25213
Applying pattern match.pd:11099, gimple-match-1.cc:25714
Applying pattern match.pd:9549, gimple-match-1.cc:24274
gimple_simplified to a_7 = VEC_PERM_EXPR ;

The reason why these patterns cannot be applied with casts seems to be the
failing types_match (@0, @1) in the
following pattern:

/* Simplify vector inserts of other vector extracts to a permute.  */
(simplify
 (bit_insert @0 (BIT_FIELD_REF@2 @1 @rsize @rpos) @ipos)
 (if (VECTOR_TYPE_P (type)
      && (VECTOR_MODE_P (TYPE_MODE (type))
          || optimize_vectors_before_lowering_p ())
      && types_match (@0, @1)
      && types_match (TREE_TYPE (TREE_TYPE (@0)), TREE_TYPE (@2))
      && TYPE_VECTOR_SUBPARTS (type).is_constant ()
      && multiple_p (wi::to_poly_offset (@rpos),
                     wi::to_poly_offset (TYPE_SIZE (TREE_TYPE (type)))))
  (with
   {
     [...]
   }
   (if (!VECTOR_MODE_P (TYPE_MODE (type))
        || can_vec_perm_const_p (TYPE_MODE (type), TYPE_MODE (type), sel, false))
    (vec_perm @0 @1 { vec_perm_indices_to_tree
                        (build_vector_type (ssizetype, nunits), sel); })))))

The types_match fails because the following pattern has already removed the
view_convert expression, thereby changing the type of @0:

(simplify
 (BIT_FIELD_REF (view_convert @0) @1 @2)
 [...]
  (BIT_FIELD_REF @0 @1 @2)))

One attempt to make the types_match succeed was to add a single_use check on
the view_convert expression in the pattern above, which prevents that pattern
from being applied when the view_convert has more than one use. While this
fixed the test case and produced the intended instruction sequence, it caused
another test to fail, one that relies on the pattern being applied when the
view_convert expression has multiple uses (gcc.target/i386/vect-strided-3.c).

Hence the RFC: How can we make the types_match work with view_convert
expressions in the arguments?

Signed-off-by: Jennifer Schmitz
---
 gcc/match.pd                             |  7 ++++---
 gcc/testsuite/gcc.dg/tree-ssa/pr117093.c | 17 +++++++++++++++++
 2 files changed, 21 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr117093.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 9107e6a95ca..d7957177027 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -9357,9 +9357,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (BIT_FIELD_REF @0 @3 { const_binop (PLUS_EXPR, bitsizetype, @2, @4); }))
 
 (simplify
- (BIT_FIELD_REF (view_convert @0) @1 @2)
- (if (! INTEGRAL_TYPE_P (TREE_TYPE (@0))
-      || type_has_mode_precision_p (TREE_TYPE (@0)))
+ (BIT_FIELD_REF (view_convert@3 @0) @1 @2)
+ (if ((! INTEGRAL_TYPE_P (TREE_TYPE (@0))
+       || type_has_mode_precision_p (TREE_TYPE (@0)))
+      && single_use (@3))
   (BIT_FIELD_REF @0 @1 @2)))
 
 (simplify
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr117093.c b/gcc/testsuite/gcc.dg/tree-ssa/pr117093.c
new file mode 100644
index 00000000000..0fea32919dd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr117093.c
@@ -0,0 +1,17 @@
+/* { dg-final { check-function-bodies "**" "" } } */
+/* { dg-options "-O1" } */
+
+#include <arm_neon.h>
+
+/*
+** foo:
+**	rev64	v0\.4s, v0\.4s
+**	ret
+*/
+uint64x2_t foo (uint64x2_t r) {
+    uint32x4_t a = vreinterpretq_u32_u64 (r);
+    uint32_t t;
+    t = a[0]; a[0] = a[1]; a[1] = t;
+    t = a[2]; a[2] = a[3]; a[3] = t;
+    return vreinterpretq_u64_u32 (a);
+}
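
As a sanity check on the desired selector { 1, 0, 3, 2 } from the description, here is a minimal scalar sketch in portable C (no NEON needed). It is not part of the patch. The helper names vec_perm4, foo_model, foo_perm, swap_halves and check_equivalence are hypothetical and only model the semantics: the element swaps in foo, a single-input permute with that selector, and swapping the 32-bit halves of each 64-bit lane (which is what rev64 on .4s does per lane) should all agree.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of VEC_PERM_EXPR <v, v, sel> for four 32-bit
   lanes.  Both inputs are the same vector, so indices are reduced mod 4.  */
static void
vec_perm4 (const uint32_t v[4], const unsigned sel[4], uint32_t out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = v[sel[i] % 4];
}

/* Scalar model of foo from the test case: view two 64-bit lanes as four
   32-bit lanes, swap neighbouring pairs, and view the result back.  */
static void
foo_model (const uint64_t r[2], uint64_t out[2])
{
  uint32_t a[4];
  uint32_t t;
  memcpy (a, r, sizeof a);
  t = a[0]; a[0] = a[1]; a[1] = t;
  t = a[2]; a[2] = a[3]; a[3] = t;
  memcpy (out, a, sizeof a);
}

/* The same computation expressed as one permute with { 1, 0, 3, 2 }.  */
static void
foo_perm (const uint64_t r[2], uint64_t out[2])
{
  static const unsigned sel[4] = { 1, 0, 3, 2 };
  uint32_t a[4], p[4];
  memcpy (a, r, sizeof a);
  vec_perm4 (a, sel, p);
  memcpy (out, p, sizeof p);
}

/* Swap the two 32-bit halves of one 64-bit lane.  */
static uint64_t
swap_halves (uint64_t x)
{
  return (x << 32) | (x >> 32);
}

/* Assert that all three formulations agree for the given input.  */
static void
check_equivalence (const uint64_t r[2])
{
  uint64_t m[2], p[2];
  foo_model (r, m);
  foo_perm (r, p);
  for (int i = 0; i < 2; i++)
    {
      assert (m[i] == p[i]);
      assert (m[i] == swap_halves (r[i]));
    }
}
```

Calling check_equivalence on any two-element uint64_t array from a small main exercises all three formulations. The half-swap view does not depend on byte order, since exchanging the two 32-bit halves of a 64-bit value gives the same result whichever half is stored first.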