PR target/78953, Fix power9 insn does not meet its constraints

Message ID 20170103214349.GA32146@ibm-tiger.the-meissners.org
State New

Commit Message

Michael Meissner Jan. 3, 2017, 9:43 p.m. UTC
In building Spec 2006 with -mcpu=power9 and -O3, two of the benchmarks (gamess
and calculix) did not build due to an "insn does not match its constraints"
error.

(insn 2674 2673 2675 37 (parallel [
            (set (reg:SI 0 0 [985])
                (vec_select:SI (reg:V4SI 32 0 [orig:378 vect__50.42 ] [378])
                    (parallel [
                            (const_int 1 [0x1])
                        ])))
            (clobber (reg:SI 31 31 [986]))
        ]) "SPOOLES/MSMD/src/MSMD_init.c":113 1184 {vsx_extract_v4si_p9}
     (expr_list:REG_UNUSED (reg:SI 31 31 [986])
        (nil)))

This insn was formed by vsx_extract_v4si_store_p9 splitting the following insn
after register allocation:

(insn 376 374 378 32 (parallel [
            (set (mem:SI (plus:DI (reg:DI 7 7 [orig:394 ivtmp.316 ] [394])
                        (const_int 112 [0x70])) [3 MEM[base: _399, offset: 112B]+0 S4 A32])
                (vec_select:SI (reg:V4SI 32 0 [orig:355 vect__50.286 ] [355])
                    (parallel [
                            (const_int 2 [0x2])
                        ])))
            (clobber (reg:SI 9 9 [675]))
            (clobber (reg:SI 10 10 [676]))
        ]) "SPOOLES/MSMD/src/MSMD_init.c":113 1191 {*vsx_extract_v4si_store_p9}
     (nil))

It split it to:

(insn 968 381 969 32 (parallel [
            (set (reg:SI 44 12 [671])
                (vec_select:SI (reg:V4SI 32 0 [orig:355 vect__50.286 ] [355])
                    (parallel [
                            (const_int 0 [0])
                        ])))
            (clobber (scratch:SI))
        ]) "SPOOLES/MSMD/src/MSMD_init.c":113 1185 {vsx_extract_v4si_p9}
     (nil))

Unfortunately, when the splitter extracts a word that is to be deposited into a
GPR, the vector must be in a traditional Altivec register.
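
To make the register restriction concrete, here is a minimal sketch in C of the
VSX register layout. It is a hypothetical model, not GCC code, and every name
in it is made up for illustration. It assumes that VSX registers 0-31 overlay
the FPRs and 32-63 overlay the Altivec registers, that the GPR forms
(vextuwlx/vextuwrx) read only Altivec registers, and that xxextractuw can read
any VSX register.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the VSX register file, not GCC code.
   VSX registers 0-31 overlay the FPRs; 32-63 overlay the Altivec VRs.  */
enum extract_dest { DEST_VSX_REG, DEST_GPR };

/* True if VSX register number VSR is one of the Altivec registers.  */
static bool
vsr_is_altivec (int vsr)
{
  return vsr >= 32 && vsr <= 63;
}

/* True if a word extract from VSR into DEST can be emitted directly.
   Assumption: the GPR form (vextuwlx/vextuwrx) only reads Altivec
   registers, while the VSX form (xxextractuw) reads any VSX register.  */
static bool
extract_source_ok (int vsr, enum extract_dest dest)
{
  if (vsr < 0 || vsr > 63)
    return false;
  return dest == DEST_GPR ? vsr_is_altivec (vsr) : true;
}

/* Small self-check of the model.  */
static void
check_model (void)
{
  assert (!extract_source_ok (0, DEST_GPR));     /* FPR source, GPR dest */
  assert (extract_source_ok (0, DEST_VSX_REG));  /* FPR source, VSX dest */
  assert (extract_source_ok (32, DEST_GPR));     /* Altivec source, GPR dest */
}
```

In this model the failing insn corresponds to the first assertion: the vector
sits in an FPR-overlapped register while the result goes to a GPR, which is
consistent with the test case comment about a traditional FPR register.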

The following patch fixes this:

[gcc]
2017-01-03  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/78953
	* config/rs6000/vsx.md (vsx_extract_<mode>_store_p9): If we are
	extracting SImode to a GPR register so that we can generate a
	store, limit the vector to be in a traditional Altivec register
	for the vextuwrx instruction.

[gcc/testsuite]
2017-01-03  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/78953
	* gcc.target/powerpc/pr78953.c: New test.

I did the usual bootstrap and make check with no regressions on a little endian
power8 system.  I also compiled the two Spec 2006 benchmarks that failed and
they now build.  Is this ok for the trunk?  It does not need to be applied to
GCC 6.x since the word extract optimization is new to GCC 7.

Comments

David Edelsohn Jan. 3, 2017, 9:56 p.m. UTC | #1
On Tue, Jan 3, 2017 at 4:43 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> In building Spec 2006 with -mcpu=power9 and -O3, two of the benchmarks (gamess
> and calculix) did not build due to an "insn does not match its constraints"
> error.
>
> (insn 2674 2673 2675 37 (parallel [
>             (set (reg:SI 0 0 [985])
>                 (vec_select:SI (reg:V4SI 32 0 [orig:378 vect__50.42 ] [378])
>                     (parallel [
>                             (const_int 1 [0x1])
>                         ])))
>             (clobber (reg:SI 31 31 [986]))
>         ]) "SPOOLES/MSMD/src/MSMD_init.c":113 1184 {vsx_extract_v4si_p9}
>      (expr_list:REG_UNUSED (reg:SI 31 31 [986])
>         (nil)))
>
> This insn was formed by vsx_extract_v4si_store_p9 splitting the following insn
> after register allocation:
>
> (insn 376 374 378 32 (parallel [
>             (set (mem:SI (plus:DI (reg:DI 7 7 [orig:394 ivtmp.316 ] [394])
>                         (const_int 112 [0x70])) [3 MEM[base: _399, offset: 112B]+0 S4 A32])
>                 (vec_select:SI (reg:V4SI 32 0 [orig:355 vect__50.286 ] [355])
>                     (parallel [
>                             (const_int 2 [0x2])
>                         ])))
>             (clobber (reg:SI 9 9 [675]))
>             (clobber (reg:SI 10 10 [676]))
>         ]) "SPOOLES/MSMD/src/MSMD_init.c":113 1191 {*vsx_extract_v4si_store_p9}
>      (nil))
>
> It split it to:
>
> (insn 968 381 969 32 (parallel [
>             (set (reg:SI 44 12 [671])
>                 (vec_select:SI (reg:V4SI 32 0 [orig:355 vect__50.286 ] [355])
>                     (parallel [
>                             (const_int 0 [0])
>                         ])))
>             (clobber (scratch:SI))
>         ]) "SPOOLES/MSMD/src/MSMD_init.c":113 1185 {vsx_extract_v4si_p9}
>      (nil))
>
> Unfortunately, when the splitter extracts a word that is to be deposited into a
> GPR, the vector must be in a traditional Altivec register.
>
> The following patch fixes this:
>
> [gcc]
> 2017-01-03  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         PR target/78953
>         * config/rs6000/vsx.md (vsx_extract_<mode>_store_p9): If we are
>         extracting SImode to a GPR register so that we can generate a
>         store, limit the vector to be in a traditional Altivec register
>         for the vextuwrx instruction.
>
> [gcc/testsuite]
> 2017-01-03  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         PR target/78953
>         * gcc.target/powerpc/pr78953.c: New test.
>
> I did the usual bootstrap and make check with no regressions on a little endian
> power8 system.  I also compiled the two Spec 2006 benchmarks that failed and
> they now build.  Is this ok for the trunk?  It does not need to be applied to
> GCC 6.x since the word extract optimization is new to GCC 7.

Okay.

Thanks, David
Patch

Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 243966)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -2628,7 +2628,7 @@  (define_insn_and_split "*vsx_extract_<mo
 (define_insn_and_split "*vsx_extract_<mode>_store_p9"
   [(set (match_operand:<VS_scalar> 0 "memory_operand" "=Z,m")
 	(vec_select:<VS_scalar>
-	 (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "<VSX_EX>,<VSX_EX>")
+	 (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "<VSX_EX>,v")
 	 (parallel [(match_operand:QI 2 "const_int_operand" "n,n")])))
    (clobber (match_scratch:<VS_scalar> 3 "=<VSX_EX>,&r"))
    (clobber (match_scratch:SI 4 "=X,&r"))]
Index: gcc/testsuite/gcc.target/powerpc/pr78953.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr78953.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr78953.c	(working copy)
@@ -0,0 +1,19 @@ 
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -mupper-regs-di" } */
+
+#include <altivec.h>
+
+/* PR 78953: mem = vec_extract (V4SI, <n>) failed if the vector was in a
+   traditional FPR register.  */
+
+void
+foo (vector int *vp, int *ip)
+{
+  vector int v = *vp;
+  __asm__ (" # fpr %x0" : "+d" (v));
+  ip[4] = vec_extract (v, 0);
+}
+
+/* { dg-final { scan-assembler "xxextractuw\|vextuw\[lr\]x" } } */