From patchwork Wed Oct 11 12:27:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?b?6ZKf5bGF5ZOy?= X-Patchwork-Id: 1846698 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4S5Bql2c6vz1yqN for ; Wed, 11 Oct 2023 23:27:45 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3C39F385696A for ; Wed, 11 Oct 2023 12:27:43 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbgbr2.qq.com (smtpbgbr2.qq.com [54.207.22.56]) by sourceware.org (Postfix) with ESMTPS id D172B385842C for ; Wed, 11 Oct 2023 12:27:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D172B385842C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp75t1697027238ti44kpeo Received: from rios-cad122.hadoop.rioslab.org ( [58.60.1.26]) by bizesmtp.qq.com (ESMTP) with id ; Wed, 11 Oct 2023 20:27:16 +0800 (CST) X-QQ-SSF: 01400000000000G0V000000A0000000 X-QQ-FEAT: +ynUkgUhZJnFwpbqh2dCsNVpoK/kLR4DqqewEZoCXmMFua2EVjo0v7z2vZWGL TmD5q5RAU/QVzNyWTO1w+VgR54t1NlBT9vv/bQoa48TDYB4rrTTwYhHHMcDqRVno4bk/qAI k3ITxfuDLZeWZeF/Tp35VutcuxCGG2cfR2nOvdI7BoCq30ZiyZMdc/ivtVFqGP6YOC+lApf cJXIBR1K0ro7lUUj0t6sijUa4H+nGq2ocLKQOA4B+XmiRQb85NkYFm3V1Ss3eiXMaSZ8taz tycv9S5pX7QUmcXS6iD8iBBXbT1dQCyCXx79VVSnp6q2NGgBT2bJ7kDyYdF2/ZfcV61FLQo GV/af3p9CPKeRh0Jh8VrSjOmM63Pe9RAX06IUWt0gczj4eNgTnWxWQ+STvybNQb7pCuneRv 6nqcYcxJhl4= X-QQ-GoodBg: 2 X-BIZMAIL-ID: 1963470913281899359 From: Juzhe-Zhong To: gcc-patches@gcc.gnu.org Cc: richard.sandiford@arm.com, rguenther@suse.de, Juzhe-Zhong Subject: [PATCH] VECT: Enhance SLP of MASK_LEN_GATHER_LOAD[PR111721] Date: Wed, 11 Oct 2023 20:27:15 +0800 Message-Id: <20231011122715.3017753-1-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.3 MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybglogicsvrgz:qybglogicsvrgz7a-one-0 X-Spam-Status: No, score=-10.3 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org This patch fixes this following FAILs in RISC-V regression: FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects scan-tree-dump vect "Loop contains only SLP stmts" FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP stmts" FAIL: gcc.dg/vect/vect-gather-3.c -flto -ffat-lto-objects scan-tree-dump vect "Loop contains only SLP stmts" FAIL: gcc.dg/vect/vect-gather-3.c scan-tree-dump vect "Loop contains only SLP stmts" The root cause of these FAIL is that GCC SLP failed on MASK_LEN_GATHER_LOAD. Since for RVV, we build MASK_LEN_GATHER_LOAD with dummy mask (-1) in tree-vect-patterns.cc if it is same situation as GATHER_LOAD (no conditional mask). So we make MASK_LEN_GATHER_LOAD leverage the flow of GATHER_LOAD if mask argument is a dummy mask. gcc/ChangeLog: * tree-vect-slp.cc (vect_get_operand_map): (vect_build_slp_tree_1): (vect_build_slp_tree_2): * tree-vect-stmts.cc (vectorizable_load): --- gcc/tree-vect-slp.cc | 18 ++++++++++++++++-- gcc/tree-vect-stmts.cc | 4 ++-- 2 files changed, 18 insertions(+), 4 deletions(-) diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc index fa098f9ff4e..712c04ec278 100644 --- a/gcc/tree-vect-slp.cc +++ b/gcc/tree-vect-slp.cc @@ -544,6 +544,17 @@ vect_get_operand_map (const gimple *stmt, unsigned char swap = 0) case IFN_MASK_GATHER_LOAD: return arg1_arg4_map; + case IFN_MASK_LEN_GATHER_LOAD: + /* In tree-vect-patterns.cc, we will have these 2 situations: + + - Unconditional gather load transforms + into MASK_LEN_GATHER_LOAD with dummy mask which is -1. + + - Conditional gather load transforms + into MASK_LEN_GATHER_LOAD with real conditional mask.*/ + return integer_minus_onep (gimple_call_arg (call, 4)) ? arg1_map + : nullptr; + case IFN_MASK_STORE: return arg3_arg2_map; @@ -1077,7 +1088,8 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap, if (cfn == CFN_MASK_LOAD || cfn == CFN_GATHER_LOAD - || cfn == CFN_MASK_GATHER_LOAD) + || cfn == CFN_MASK_GATHER_LOAD + || cfn == CFN_MASK_LEN_GATHER_LOAD) ldst_p = true; else if (cfn == CFN_MASK_STORE) { @@ -1337,6 +1349,7 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap, if (DR_IS_READ (STMT_VINFO_DATA_REF (stmt_info)) && rhs_code != CFN_GATHER_LOAD && rhs_code != CFN_MASK_GATHER_LOAD + && rhs_code != CFN_MASK_LEN_GATHER_LOAD /* Not grouped loads are handled as externals for BB vectorization. For loop vectorization we can handle splats the same we handle single element interleaving. */ @@ -1837,7 +1850,8 @@ vect_build_slp_tree_2 (vec_info *vinfo, slp_tree node, if (gcall *stmt = dyn_cast (stmt_info->stmt)) gcc_assert (gimple_call_internal_p (stmt, IFN_MASK_LOAD) || gimple_call_internal_p (stmt, IFN_GATHER_LOAD) - || gimple_call_internal_p (stmt, IFN_MASK_GATHER_LOAD)); + || gimple_call_internal_p (stmt, IFN_MASK_GATHER_LOAD) + || gimple_call_internal_p (stmt, IFN_MASK_LEN_GATHER_LOAD)); else { *max_nunits = this_max_nunits; diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index cd7c1090d88..263acf5d3cd 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -9575,9 +9575,9 @@ vectorizable_load (vec_info *vinfo, return false; mask_index = internal_fn_mask_index (ifn); - if (mask_index >= 0 && slp_node) + if (mask_index >= 0 && slp_node && internal_fn_len_index (ifn) < 0) mask_index = vect_slp_child_index_for_operand (call, mask_index); - if (mask_index >= 0 + if (mask_index >= 0 && internal_fn_len_index (ifn) < 0 && !vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_index, &mask, NULL, &mask_dt, &mask_vectype)) return false;