Message ID | 00b701da57c9$b7444a80$25ccdf80$@nextmovesoftware.com |
---|---|
State | New |
Headers | show |
Series | [x86_64] PR target/113690: Fix-up MULT REG_EQUAL notes in STV. | expand |
On Mon, Feb 5, 2024 at 1:24 AM Roger Sayle <roger@nextmovesoftware.com> wrote: > > > This patch fixes PR target/113690, an ICE-on-valid regression on x86_64 > that exhibits with a specific combination of command line options. The > cause is that x86's scalar-to-vector pass converts a chain of instructions > from TImode to V1TImode, but fails to appropriately update the attached > REG_EQUAL note. Given that multiplication isn't supported in V1TImode, > the REG_NOTE handling code wasn't expecting to see a MULT. Easily solved > with additional handling for other binary operators that may potentially > (in future) have an immediate constant as the second operand that needs > handling. For convenience, this code (re)factors the logic to convert > a TImode constant into a V1TImode constant vector into a subroutine and > reuses it. > > For the record, STV is actually doing something useful in this strange > testcase, GCC with -O2 -fno-dce -fno-forward-propagate > -fno-split-wide-types > -funroll-loops generates: > > foo: movl $v, %eax > pxor %xmm0, %xmm0 > movaps %xmm0, 48(%rax) > movaps %xmm0, (%rax) > movaps %xmm0, 16(%rax) > movaps %xmm0, 32(%rax) > ret > > With the addition of -mno-stv (to disable the patched code) it gives: > > foo: movl $v, %eax > movq $0, 48(%rax) > movq $0, 56(%rax) > movq $0, (%rax) > movq $0, 8(%rax) > movq $0, 16(%rax) > movq $0, 24(%rax) > movq $0, 32(%rax) > movq $0, 40(%rax) > ret > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check, both with and without --target_board=unix{-m32} > with no new failures. Ok for mainline? > > > 2024-02-05 Roger Sayle <roger@nextmovesoftware.com> > > gcc/ChangeLog > PR target/113690 > * config/i386/i386-features.cc (timode_convert_cst): New helper > function to convert a TImode CONST_SCALAR_INT_P to a V1TImode > CONST_VECTOR. > (timode_scalar_chain::convert_op): Use timode_convert_cst. > (timode_scalar_chain::convert_insn): If a REG_EQUAL note contains > a binary operator where the second operand is an immediate integer > constant, convert it to V1TImode using timode_convert_cst. > Use timode_convert_cst. > > gcc/testsuite/ChangeLog > PR target/113690 > * gcc.target/i386/pr113690.c: New test case. OK. Thanks, Uros. > > > Thanks in advance, > Roger > -- >
On Mon, Feb 5, 2024 at 9:06 AM Uros Bizjak <ubizjak@gmail.com> wrote: > > On Mon, Feb 5, 2024 at 1:24 AM Roger Sayle <roger@nextmovesoftware.com> wrote: > > > > > > This patch fixes PR target/113690, an ICE-on-valid regression on x86_64 > > that exhibits with a specific combination of command line options. The > > cause is that x86's scalar-to-vector pass converts a chain of instructions > > from TImode to V1TImode, but fails to appropriately update the attached > > REG_EQUAL note. Given that multiplication isn't supported in V1TImode, > > the REG_NOTE handling code wasn't expecting to see a MULT. Easily solved > > with additional handling for other binary operators that may potentially > > (in future) have an immediate constant as the second operand that needs > > handling. For convenience, this code (re)factors the logic to convert > > a TImode constant into a V1TImode constant vector into a subroutine and > > reuses it. > > > > For the record, STV is actually doing something useful in this strange > > testcase, GCC with -O2 -fno-dce -fno-forward-propagate > > -fno-split-wide-types > > -funroll-loops generates: > > > > foo: movl $v, %eax > > pxor %xmm0, %xmm0 > > movaps %xmm0, 48(%rax) > > movaps %xmm0, (%rax) > > movaps %xmm0, 16(%rax) > > movaps %xmm0, 32(%rax) > > ret > > > > With the addition of -mno-stv (to disable the patched code) it gives: > > > > foo: movl $v, %eax > > movq $0, 48(%rax) > > movq $0, 56(%rax) > > movq $0, (%rax) > > movq $0, 8(%rax) > > movq $0, 16(%rax) > > movq $0, 24(%rax) > > movq $0, 32(%rax) > > movq $0, 40(%rax) > > ret > > > > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > > and make -k check, both with and without --target_board=unix{-m32} > > with no new failures. Ok for mainline? > > > > > > 2024-02-05 Roger Sayle <roger@nextmovesoftware.com> > > > > gcc/ChangeLog > > PR target/113690 > > * config/i386/i386-features.cc (timode_convert_cst): New helper > > function to convert a TImode CONST_SCALAR_INT_P to a V1TImode > > CONST_VECTOR. > > (timode_scalar_chain::convert_op): Use timode_convert_cst. > > (timode_scalar_chain::convert_insn): If a REG_EQUAL note contains > > a binary operator where the second operand is an immediate integer > > constant, convert it to V1TImode using timode_convert_cst. > > Use timode_convert_cst. > > > > gcc/testsuite/ChangeLog > > PR target/113690 > > * gcc.target/i386/pr113690.c: New test case. > > OK. OTOH, how about we follow the approach from general_scalar_chain::convert_insn and just kill the note? Uros.
diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc index 4020b27..90ada7d 100644 --- a/gcc/config/i386/i386-features.cc +++ b/gcc/config/i386/i386-features.cc @@ -1749,6 +1749,19 @@ timode_scalar_chain::fix_debug_reg_uses (rtx reg) } } +/* Helper function to convert immediate constant X to V1TImode. */ +static rtx +timode_convert_cst (rtx x) +{ + /* Prefer all ones vector in case of -1. */ + if (constm1_operand (x, TImode)) + return CONSTM1_RTX (V1TImode); + + rtx *v = XALLOCAVEC (rtx, 1); + v[0] = x; + return gen_rtx_CONST_VECTOR (V1TImode, gen_rtvec_v (1, v)); +} + /* Convert operand OP in INSN from TImode to V1TImode. */ void @@ -1775,18 +1788,8 @@ timode_scalar_chain::convert_op (rtx *op, rtx_insn *insn) } else if (CONST_SCALAR_INT_P (*op)) { - rtx vec_cst; rtx tmp = gen_reg_rtx (V1TImode); - - /* Prefer all ones vector in case of -1. */ - if (constm1_operand (*op, TImode)) - vec_cst = CONSTM1_RTX (V1TImode); - else - { - rtx *v = XALLOCAVEC (rtx, 1); - v[0] = *op; - vec_cst = gen_rtx_CONST_VECTOR (V1TImode, gen_rtvec_v (1, v)); - } + rtx vec_cst = timode_convert_cst (*op); if (!standard_sse_constant_p (vec_cst, V1TImode)) { @@ -1830,12 +1833,28 @@ timode_scalar_chain::convert_insn (rtx_insn *insn) tmp = find_reg_equal_equiv_note (insn); if (tmp) { - if (GET_MODE (XEXP (tmp, 0)) == TImode) - PUT_MODE (XEXP (tmp, 0), V1TImode); - else if (CONST_SCALAR_INT_P (XEXP (tmp, 0))) - XEXP (tmp, 0) - = gen_rtx_CONST_VECTOR (V1TImode, - gen_rtvec (1, XEXP (tmp, 0))); + rtx expr = XEXP (tmp, 0); + if (GET_MODE (expr) == TImode) + { + PUT_MODE (expr, V1TImode); + switch (GET_CODE (expr)) + { + case PLUS: + case MINUS: + case MULT: + case AND: + case IOR: + case XOR: + if (CONST_SCALAR_INT_P (XEXP (expr, 1))) + XEXP (expr, 1) = timode_convert_cst (XEXP (expr, 1)); + break; + + default: + break; + } + } + else if (CONST_SCALAR_INT_P (expr)) + XEXP (tmp, 0) = timode_convert_cst (expr); } } break; @@ -1876,7 +1895,7 @@ timode_scalar_chain::convert_insn (rtx_insn *insn) } else { - src = gen_rtx_CONST_VECTOR (V1TImode, gen_rtvec (1, src)); + src = timode_convert_cst (src); src = validize_mem (force_const_mem (V1TImode, src)); use_move = MEM_P (dst); } diff --git a/gcc/testsuite/gcc.target/i386/pr113690.c b/gcc/testsuite/gcc.target/i386/pr113690.c new file mode 100644 index 0000000..23a1108 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr113690.c @@ -0,0 +1,12 @@ +/* { dg-do compile { target int128 } } */ +/* { dg-options "-O2 -fno-dce -fno-forward-propagate -fno-split-wide-types -funroll-loops" } */ +int i; +__attribute__((__vector_size__(64))) __int128 v; + +void +foo(void) +{ + v <<= 127; + __builtin_mul_overflow(0, i, &v[3]); + v *= 6; +}