From patchwork Thu Jan 11 13:49:48 2024
X-Patchwork-Id: 1885616
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong
Subject: [PATCH V2] RISC-V: Adjust scalar_to_vec cost accurately
Date: Thu, 11 Jan 2024 21:49:48 +0800
Message-Id: <20240111134948.3112510-1-juzhe.zhong@rivai.ai>

1. This patch sets the scalar_to_vec cost to 2 instead of 1, since a scalar move
   instruction is slightly more costly than normal RVV instructions (e.g. vadd.vv).

2. Adjust the scalar_to_vec cost according to the splat value. For example, a
   value like 32872 needs 2 more scalar instructions, so the
   cost = 2 (scalar instructions) + 2 (scalar move).

We adjust the cost this way because the vectorized code needs these extra
instructions, whereas the scalar code does not.

After this patch, both -march=rv64gcv_zvl256b and -march=rv64gcv_zvl4096b
give optimal codegen:

	lui	a5,%hi(a)
	li	a4,19
	sb	a4,%lo(a)(a5)
	li	a0,0
	ret

	PR target/113281

gcc/ChangeLog:

	* config/riscv/riscv-vector-costs.cc (adjust_stmt_cost): Adjust scalar_to_vec cost accurately.
	(costs::add_stmt_cost): Ditto.
	* config/riscv/riscv.cc: Ditto.
	* config/riscv/t-riscv: Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr113209.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/zve32f-1.c: Ditto.
	* gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c: New test.
	* gcc.dg/vect/costmodel/riscv/rvv/pr113281-2.c: New test.

---
 gcc/config/riscv/riscv-vector-costs.cc        | 50 ++++++++++++++++++-
 gcc/config/riscv/riscv.cc                     |  4 +-
 gcc/config/riscv/t-riscv                      |  2 +-
 .../vect/costmodel/riscv/rvv/pr113281-1.c     | 18 +++++++
 .../vect/costmodel/riscv/rvv/pr113281-2.c     | 18 +++++++
 .../gcc.target/riscv/rvv/autovec/pr113209.c   |  2 +-
 .../gcc.target/riscv/rvv/autovec/zve32f-1.c   |  2 +-
 7 files changed, 90 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-2.c

diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc
index 58ec0b9b503..fc377435e53 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "backend.h"
 #include "tree-data-ref.h"
 #include "tree-ssa-loop-niter.h"
+#include "emit-rtl.h"
 
 /* This file should be included last.  */
 #include "riscv-vector-costs.h"
@@ -1055,6 +1056,50 @@ costs::better_main_loop_than_p (const vector_costs *uncast_other) const
   return vector_costs::better_main_loop_than_p (other);
 }
 
+/* Adjust vectorization cost after calling
+   targetm.vectorize.builtin_vectorization_cost.  For some statement, we would
+   like to further fine-grain tweak the cost on top of
+   targetm.vectorize.builtin_vectorization_cost handling which doesn't have any
+   information on statement operation codes etc.  */
+
+static unsigned
+adjust_stmt_cost (enum vect_cost_for_stmt kind,
+                  struct _stmt_vec_info *stmt_info, int count, int stmt_cost)
+{
+  gimple *stmt = stmt_info->stmt;
+  switch (kind)
+    {
+      case scalar_to_vec: {
+        stmt_cost *= count;
+        gcall *call = dyn_cast<gcall *> (stmt);
+        /* Adjust cost by counting the scalar value initialization.  */
+        unsigned int num
+          = call ? gimple_call_num_args (call) : gimple_num_ops (stmt);
+        unsigned int start = call ? 0 : 1;
+
+        for (unsigned int i = start; i < num; i++)
+          {
+            tree op = call ? gimple_call_arg (call, i) : gimple_op (stmt, i);
+            if (TREE_CODE (op) == INTEGER_CST)
+              {
+                HOST_WIDE_INT value = tree_fits_shwi_p (op) ? tree_to_shwi (op)
+                                                            : tree_to_uhwi (op);
+                /* We don't need to count scalar costs if it
+                   is in range of [-16, 15] since we can use
+                   vmv.v.i.  */
+                if (!IN_RANGE (value, -16, 15))
+                  stmt_cost += riscv_const_insns (gen_int_mode (value, Pmode));
+              }
+            /* TODO: We don't count CONST_POLY_INT value for now.  */
+          }
+        return stmt_cost;
+      }
+    default:
+      break;
+    }
+  return count * stmt_cost;
+}
+
 unsigned
 costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
                       stmt_vec_info stmt_info, slp_tree, tree vectype,
@@ -1082,9 +1127,12 @@ costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
          as one iteration of the VLA loop.  */
       if (where == vect_body && m_unrolled_vls_niters)
         m_unrolled_vls_stmts += count * m_unrolled_vls_niters;
+
+      if (vectype)
+        stmt_cost = adjust_stmt_cost (kind, stmt_info, count, stmt_cost);
     }
 
-  return record_stmt_cost (stmt_info, where, count * stmt_cost);
+  return record_stmt_cost (stmt_info, where, stmt_cost);
 }
 
 void
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index df9799d9c5e..a14fb36817a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -366,7 +366,7 @@ static const common_vector_cost rvv_vls_vector_cost = {
   1, /* gather_load_cost  */
   1, /* scatter_store_cost  */
   1, /* vec_to_scalar_cost  */
-  1, /* scalar_to_vec_cost  */
+  2, /* scalar_to_vec_cost  */
   1, /* permute_cost  */
   1, /* align_load_cost  */
   1, /* align_store_cost  */
@@ -382,7 +382,7 @@ static const scalable_vector_cost rvv_vla_vector_cost = {
   1, /* gather_load_cost  */
   1, /* scatter_store_cost  */
   1, /* vec_to_scalar_cost  */
-  1, /* scalar_to_vec_cost  */
+  2, /* scalar_to_vec_cost  */
   1, /* permute_cost  */
   1, /* align_load_cost  */
   1, /* align_store_cost  */
diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv
index 32de6b851c1..fb2bf1c155f 100644
--- a/gcc/config/riscv/t-riscv
+++ b/gcc/config/riscv/t-riscv
@@ -73,7 +73,7 @@ riscv-vector-costs.o: $(srcdir)/config/riscv/riscv-vector-costs.cc \
   $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TARGET_H) $(FUNCTION_H) \
   $(TREE_H) basic-block.h $(RTL_H) gimple.h targhooks.h cfgloop.h \
   fold-const.h $(TM_P_H) tree-vectorizer.h gimple-iterator.h bitmap.h \
-  ssa.h backend.h tree-data-ref.h tree-ssa-loop-niter.h \
+  ssa.h backend.h tree-data-ref.h tree-ssa-loop-niter.h emit-rtl.h \
   $(srcdir)/config/riscv/riscv-vector-costs.h
 	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
 		$(srcdir)/config/riscv/riscv-vector-costs.cc
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c
new file mode 100644
index 00000000000..fdf6ed0334b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl256b -mabi=lp64d -O3 -ftree-vectorize" } */
+
+unsigned char a;
+
+int main() {
+  short b = a = 0;
+  for (; a != 19; a++)
+    if (a)
+      b = 32872 >> a;
+
+  if (b == 0)
+    return 0;
+  else
+    return 1;
+}
+
+/* { dg-final { scan-assembler-not {vset} } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-2.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-2.c
new file mode 100644
index 00000000000..706e19116c9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl4096b -mabi=lp64d -O3 -ftree-vectorize --param=riscv-autovec-lmul=m8" } */
+
+unsigned char a;
+
+int main() {
+  short b = a = 0;
+  for (; a != 19; a++)
+    if (a)
+      b = 32872 >> a;
+
+  if (b == 0)
+    return 0;
+  else
+    return 1;
+}
+
+/* { dg-final { scan-assembler-not {vset} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113209.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113209.c
index 081ee369394..70aae151000 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113209.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113209.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv64gcv_zvl256b -mabi=lp64d -O3" } */
+/* { dg-options "-march=rv64gcv_zvl256b -mabi=lp64d -O3 -fno-vect-cost-model" } */
 
 int b, c, d, f, i, a;
 int e[1] = {0};
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-1.c
index ab57e89b1cd..3a00327dfed 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f-1.c
@@ -3,4 +3,4 @@
 
 #include "template-1.h"
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 3 "vect" } } */
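
Note on the cost arithmetic: the scalar_to_vec adjustment described in the commit message can be illustrated outside GCC with a minimal standalone sketch. This is not the code from the patch. `const_insns_sketch` is a hypothetical, simplified stand-in for riscv_const_insns that assumes the constant fits in a signed 32-bit value, and `scalar_to_vec_cost_sketch` is a hypothetical helper that handles a single splat operand rather than walking gimple operands.

```cpp
#include <cstdint>

// Hypothetical stand-in for riscv_const_insns: number of scalar instructions
// needed to load `value` into a register.  Simplifying assumption: `value`
// fits in a signed 32-bit integer, so it takes one li/addi, one lui, or
// lui + addi.  The real function handles many more cases.
constexpr unsigned const_insns_sketch(std::int64_t value) {
  if (value >= -2048 && value <= 2047)
    return 1;                      // fits a 12-bit signed immediate
  return (value & 0xfff) == 0 ? 1  // lui alone
                              : 2; // lui + addi
}

// Hypothetical helper mirroring the idea of the patch for one splat operand:
// start from base_cost * count and add the scalar setup cost unless the
// constant is in [-16, 15], where vmv.v.i can encode it directly.
constexpr unsigned scalar_to_vec_cost_sketch(unsigned base_cost,
                                             unsigned count,
                                             std::int64_t splat_value) {
  unsigned cost = base_cost * count;
  if (splat_value < -16 || splat_value > 15)
    cost += const_insns_sketch(splat_value);
  return cost;
}

// The example from the commit message: 2 (scalar move) + 2 (lui + addi).
static_assert(scalar_to_vec_cost_sketch(2, 1, 32872) == 4, "32872 costs 4");
// A small immediate needs no scalar setup.
static_assert(scalar_to_vec_cost_sketch(2, 1, 7) == 2, "7 costs 2");
```

With the base scalar_to_vec cost of 2 from this patch, the sketch gives 4 for the splat value 32872 used in the new tests, which matches the figure quoted in the commit message.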