diff mbox series

[2/2] RISC-V: Optimize the cost of the LO_SUM expression.

Message ID 20240916081121.29707-1-cooper.qu@linux.alibaba.com
State New
Headers show
Series [1/2] RISC-V: Fix the outer_code when calculating the cost of SET expression. | expand

Commit Message

Xianmiao Qu Sept. 16, 2024, 8:11 a.m. UTC
Currently, the cost of the LO_SUM expression is based on
the cost of calculating the first subexpression. When the
first subexpression is a register, the cost result will
be zero. It seems a bit unreasonable for a SET expression
to have a zero cost when its source is LO_SUM. Moreover,
having a cost of zero for the expression will lead the
loop invariant pass to calculate its benefits of being
moved outside the loop as zero, thus preventing the
out-of-loop placement of the loop invariant.

As an example, consider the following test case:
  long a;
  long b[];
  long *c;
  foo () {
    for (;;)
      *c = b[a];
  }

When compiling with -march=rv64gc -mabi=lp64d -Os, the following code is generated:
        .cfi_startproc
        lui     a5,%hi(c)
        ld      a4,%lo(c)(a5)
        lui     a2,%hi(b)
        lui     a1,%hi(a)
.L2:
        ld      a5,%lo(a)(a1)
        addi    a3,a2,%lo(b)
        slli    a5,a5,3
        add     a5,a5,a3
        ld      a5,0(a5)
        sd      a5,0(a4)
        j       .L2

After adjust the cost of the LO_SUM expression, the instruction addi will be
moved outside the loop:
        .cfi_startproc
        lui     a5,%hi(c)
        ld      a3,%lo(c)(a5)
        lui     a4,%hi(b)
        lui     a2,%hi(a)
        addi    a4,a4,%lo(b)
.L2:
        ld      a5,%lo(a)(a2)
        slli    a5,a5,3
        add     a5,a5,a4
        ld      a5,0(a5)
        sd      a5,0(a3)
        j       .L2

gcc/
	* config/riscv/riscv.cc (riscv_rtx_costs): Optimize
	the cost of the LO_SUM expression.
---
 gcc/config/riscv/riscv.cc | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Jeff Law Sept. 18, 2024, 1:42 p.m. UTC | #1
On 9/16/24 2:11 AM, Xianmiao Qu wrote:
> Currently, the cost of the LO_SUM expression is based on
> the cost of calculating the first subexpression.
And that's probably the real bug here.  THe first subexpression is 
typically going to be a REG and is not interesting from a costing 
standpoint.  What is far more interesting is the second subexpression 
which is typically going to be a symbolic constant which can have 
varying costs.


> 
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 1b4a5c39c667..cb9692dad715 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -3994,7 +3994,10 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN
>         return false;
>   
>       case LO_SUM:
> -      *total = set_src_cost (XEXP (x, 0), mode, speed);
> +      if (outer_code == SET && REG_P (XEXP (x, 0)))
> +	*total = COSTS_N_INSNS (1);
> +      else
> +	*total = set_src_cost (XEXP (x, 0), mode, speed);
>         return true;
So rather than what you've done, I'd revert to the original form and 
change XEXP (x, 0) to XEXP (x, 1).  Or better yet add the two costs 
together.  Something like this:

*total = (set_src_cost (XEXP (x, 0), mode, speed)
           + set_src_cost (XEXP (x, 1), mode, speed));

One of the "quirks" of the costing interfaces is they can be passed RTL 
which won't match any backend pattern and sometimes computing costs of 
such RTL is useful.  By adding the cost of the two subexpresions we 
should handle that case more gracefully.



jeff
diff mbox series

Patch

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 1b4a5c39c667..cb9692dad715 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3994,7 +3994,10 @@  riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN
       return false;
 
     case LO_SUM:
-      *total = set_src_cost (XEXP (x, 0), mode, speed);
+      if (outer_code == SET && REG_P (XEXP (x, 0)))
+	*total = COSTS_N_INSNS (1);
+      else
+	*total = set_src_cost (XEXP (x, 0), mode, speed);
       return true;
 
     case LT: