From patchwork Thu Jan 4 08:27:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?b?6ZKf5bGF5ZOy?=
X-Patchwork-Id: 1882355
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong
Subject: [PATCH] RISC-V: Teach liveness estimation be aware of .vi variant
Date: Thu, 4 Jan 2024 16:27:26 +0800
Message-Id: <20240104082726.16368-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3
List-Id: Gcc-patches mailing list

Consider the following case:

void
f (int *restrict a, int *restrict b, int *restrict c, int *restrict d, int n)
{
  for (int i = 0; i < n; i++)
    {
      int tmp = b[i] + 15;
      int tmp2 = tmp + b[i];
      c[i] = tmp2 + b[i];
      d[i] = tmp + tmp2 + b[i];
    }
}

The current dynamic LMUL cost model chooses LMUL = 4 because it counts the
constant "15" as consuming one vector register group, which is not accurate.

Teach the dynamic LMUL cost model to be aware of the potential transformation
into .vi variant instructions, so that it can choose LMUL = 8 based on a more
accurate cost estimate.

After this patch:

f:
        ble     a4,zero,.L5
.L3:
        vsetvli a5,a4,e32,m8,ta,ma
        slli    a0,a5,2
        vle32.v v16,0(a1)
        vadd.vi v24,v16,15
        vadd.vv v8,v24,v16
        vadd.vv v0,v8,v16
        vse32.v v0,0(a2)
        vadd.vv v8,v8,v24
        vadd.vv v8,v8,v16
        vse32.v v8,0(a3)
        add     a1,a1,a0
        add     a2,a2,a0
        add     a3,a3,a0
        sub     a4,a4,a5
        bne     a4,zero,.L3
.L5:
        ret

Tested on both RV32 and RV64 with no regressions.

OK for trunk?

gcc/ChangeLog:

        * config/riscv/riscv-vector-costs.cc (variable_vectorized_p): Teach vi variant.

gcc/testsuite/ChangeLog:

        * gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-13.c: New test.
---
 gcc/config/riscv/riscv-vector-costs.cc        | 30 ++++++--
 .../costmodel/riscv/rvv/dynamic-lmul8-13.c    | 74 +++++++++++++++++++
 2 files changed, 97 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-13.c

diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc
index 21f8a81c89c..7f083b04edd 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -255,6 +255,29 @@ variable_vectorized_p (stmt_vec_info stmt_info, tree var, bool lhs_p)
             return false;
         }
     }
+  else if (is_gimple_assign (stmt))
+    {
+      tree_code tcode = gimple_assign_rhs_code (stmt);
+      /* vi variant doesn't need to allocate such statement.
+         E.g. tmp_15 = _4 + 1; will be transformed into vadd.vi
+         so the INTEGER_CST '1' doesn't need a vector register.  */
+      switch (tcode)
+        {
+        case PLUS_EXPR:
+        case BIT_IOR_EXPR:
+        case BIT_XOR_EXPR:
+        case BIT_AND_EXPR:
+          return TREE_CODE (var) != INTEGER_CST
+                 || !IN_RANGE (tree_to_shwi (var), -16, 15);
+        case MINUS_EXPR:
+          return TREE_CODE (var) != INTEGER_CST
+                 || !IN_RANGE (tree_to_shwi (var), -16, 15)
+                 || gimple_assign_rhs1 (stmt) != var;
+        default:
+          break;
+        }
+    }
+
   if (lhs_p)
     return is_gimple_reg (var)
            && (!POINTER_TYPE_P (TREE_TYPE (var))
@@ -331,13 +354,6 @@ compute_local_live_ranges (
               for (i = 0; i < gimple_num_args (stmt); i++)
                 {
                   tree var = gimple_arg (stmt, i);
-                  /* Both IMM and REG are included since a VECTOR_CST may be
-                     potentially held in a vector register.  However, it's not
-                     accurate, since a PLUS_EXPR can be vectorized into vadd.vi
-                     if IMM is -16 ~ 15.
-
-                     TODO: We may elide the cases that the unnecessary IMM in
-                     the future.  */
                   if (variable_vectorized_p (program_point.stmt_info, var,
                                              false))
                     {
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-13.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-13.c
new file mode 100644
index 00000000000..baef4e39014
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-13.c
@@ -0,0 +1,74 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize --param riscv-autovec-lmul=dynamic -fdump-tree-vect-details" } */
+
+void
+f (int *restrict a, int *restrict b, int *restrict c, int *restrict d, int n)
+{
+  for (int i = 0; i < n; i++)
+    {
+      int tmp = b[i] + 15;
+      int tmp2 = tmp + b[i];
+      c[i] = tmp2 + b[i];
+      d[i] = tmp + tmp2 + b[i];
+    }
+}
+
+void
+f2 (int *restrict a, int *restrict b, int *restrict c, int *restrict d, int n)
+{
+  for (int i = 0; i < n; i++)
+    {
+      int tmp = 15 - b[i];
+      int tmp2 = tmp * b[i];
+      c[i] = tmp2 * b[i];
+      d[i] = tmp * tmp2 * b[i];
+    }
+}
+
+void
+f3 (int *restrict a, int *restrict b, int *restrict c, int *restrict d, int n)
+{
+  for (int i = 0; i < n; i++)
+    {
+      int tmp = b[i] & 15;
+      int tmp2 = tmp * b[i];
+      c[i] = tmp2 * b[i];
+      d[i] = tmp * tmp2 * b[i];
+    }
+}
+
+void
+f4 (int *restrict a, int *restrict b, int *restrict c, int *restrict d, int n)
+{
+  for (int i = 0; i < n; i++)
+    {
+      int tmp = b[i] | 15;
+      int tmp2 = tmp * b[i];
+      c[i] = tmp2 * b[i];
+      d[i] = tmp * tmp2 * b[i];
+    }
+}
+
+void
+f5 (int *restrict a, int *restrict b, int *restrict c, int *restrict d, int n)
+{
+  for (int i = 0; i < n; i++)
+    {
+      int tmp = b[i] ^ 15;
+      int tmp2 = tmp * b[i];
+      c[i] = tmp2 * b[i];
+      d[i] = tmp * tmp2 * b[i];
+    }
+}
+
+/* { dg-final { scan-assembler-times {e32,m8} 5 } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-assembler-not {jr} } } */
+/* { dg-final { scan-assembler-not {e32,m4} } } */
+/* { dg-final { scan-assembler-not {e32,m2} } } */
+/* { dg-final { scan-assembler-not {e32,m1} } } */
+/* { dg-final { scan-assembler-times {ret} 5 } } */
+/* { dg-final { scan-tree-dump-not "Preferring smaller LMUL loop because it has unexpected spills" "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 5 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 5 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 5 "vect" } } */