diff mbox series

[RS6000] altivec style lvx/stvx addresses vs power10

Message ID 20201023064508.GS4898@bubble.grove.modra.org
State New
Headers show
Series [RS6000] altivec style lvx/stvx addresses vs power10 | expand

Commit Message

Alan Modra Oct. 23, 2020, 6:45 a.m. UTC
gcc.target/powerpc/fold-vec-st-pixel.c and other testcases fail on
power10, generating
	addi 9,5,12
	rldicr 9,9,0,59
	stxv 34,0(9)
rather than
	addi 5,5,12
	stvx 2,0,5
for an altivec lvx/stvx style address.

The problem starts with fwprop creating
(insn 9 4 0 2 (set (mem:V8HI (and:DI (plus:DI (reg/v/f:DI 121 [ vpp ])
                    (const_int 12 [0xc]))
                (const_int -16 [0xfffffffffffffff0])) [0 MEM <vector(8) short int> [(void *)_4 & -16B]+0 S16 A128])
        (reg/v:V8HI 120 [ vp1 ])) "pixel.c":6:10 1237 {vsx_movv8hi_64bit}
which is finally thrown out as invalid by lra.  lra of course does that
by reloading the entire address.

fwprop creates the invalid address due to rs6000_legitimate_address_p
trimming off the outer AND of altivec style addresses before applying
other predicates.  address_is_prefixed then allows the inner address.

Now at the time the AND stripping was added (git commit 850e8d3d56d),
rs6000_legitimate_address looked a lot simpler.  This patch allows
through just those addresses that were legitimate in those simpler
days.  (legitimate_small_data_p, legitimate_constant_pool_address_p,
legitimate_lo_sum_address_p, and legitimate_offset_address_p did get
to look at the inside of an AND address, but I'm fairly certain
small_data, constant_pool and lo_sum addresses would never be wrapped
in an AND, and offset_address_p of that time excluded altivec modes.)

Regstrapped powerpc64le-linux power8 and power10, with tests on
powerpc64-linux -m64/-m32 still running.  OK assuming they pass?

	* config/rs6000/rs6000.c (rs6000_legitimate_address_p): Limit
	AND addressing to just lvx/stvx style addresses.

Comments

Segher Boessenkool Oct. 23, 2020, 4:31 p.m. UTC | #1
Hi!

On Fri, Oct 23, 2020 at 05:15:08PM +1030, Alan Modra wrote:
> The problem starts with fwprop creating
> (insn 9 4 0 2 (set (mem:V8HI (and:DI (plus:DI (reg/v/f:DI 121 [ vpp ])
>                     (const_int 12 [0xc]))
>                 (const_int -16 [0xfffffffffffffff0])) [0 MEM <vector(8) short int> [(void *)_4 & -16B]+0 S16 A128])
>         (reg/v:V8HI 120 [ vp1 ])) "pixel.c":6:10 1237 {vsx_movv8hi_64bit}
> which is finally thrown out as invalid by lra.  lra of course does that
> by reloading the entire address.

While it could/should just reload that 12 into a reg :-(  Could you
investigate doing that?  Is there any reason we should not?

> Now at the time the AND stripping was added (git commit 850e8d3d56d),
> rs6000_legitimate_address looked a lot simpler.  This patch allows
> through just those addresses that were legitimate in those simpler
> days.  (legitimate_small_data_p, legitimate_constant_pool_address_p,
> legitimate_lo_sum_address_p, and legitimate_offset_address_p did get
> to look at the inside of an AND address, but I'm fairly certain
> small_data, constant_pool and lo_sum addresses would never be wrapped
> in an AND, and offset_address_p of that time excluded altivec modes.)
> 
> Regstrapped powerpc64le-linux power8 and power10, with tests on
> powerpc64-linux -m64/-m32 still running.  OK assuming they pass?

I would be much happier if this was performance tested.  But, it is okay
for trunk.  Thanks!


Segher
Alan Modra Oct. 24, 2020, 4:19 a.m. UTC | #2
Hi Segher,

On Fri, Oct 23, 2020 at 11:31:02AM -0500, Segher Boessenkool wrote:
> On Fri, Oct 23, 2020 at 05:15:08PM +1030, Alan Modra wrote:
> > The problem starts with fwprop creating
> > (insn 9 4 0 2 (set (mem:V8HI (and:DI (plus:DI (reg/v/f:DI 121 [ vpp ])
> >                     (const_int 12 [0xc]))
> >                 (const_int -16 [0xfffffffffffffff0])) [0 MEM <vector(8) short int> [(void *)_4 & -16B]+0 S16 A128])
> >         (reg/v:V8HI 120 [ vp1 ])) "pixel.c":6:10 1237 {vsx_movv8hi_64bit}
> > which is finally thrown out as invalid by lra.  lra of course does that
> > by reloading the entire address.
> 
> While it could/should just reload that 12 into a reg :-(  Could you
> investigate doing that?  Is there any reason we should not?

My first thought was to write the necessary reloads too, but 
I think it's better to not generate the invalid addressing in the
first place.  Hiding the +12 in an invalid memory address means it
isn't exposed to optimization passes.
diff mbox series

Patch

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 2f0ca7af030..7fd80fb75ff 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -9128,15 +9128,21 @@  rs6000_legitimate_address_p (machine_mode mode, rtx x, bool reg_ok_strict)
   bool reg_offset_p = reg_offset_addressing_ok_p (mode);
   bool quad_offset_p = mode_supports_dq_form (mode);
 
-  /* If this is an unaligned stvx/ldvx type address, discard the outer AND.  */
+  if (TARGET_ELF && RS6000_SYMBOL_REF_TLS_P (x))
+    return 0;
+
+  /* Handle unaligned altivec lvx/stvx type addresses.  */
   if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode)
       && GET_CODE (x) == AND
       && CONST_INT_P (XEXP (x, 1))
       && INTVAL (XEXP (x, 1)) == -16)
-    x = XEXP (x, 0);
+    {
+      x = XEXP (x, 0);
+      return (legitimate_indirect_address_p (x, reg_ok_strict)
+	      || legitimate_indexed_address_p (x, reg_ok_strict)
+	      || virtual_stack_registers_memory_p (x));
+    }
 
-  if (TARGET_ELF && RS6000_SYMBOL_REF_TLS_P (x))
-    return 0;
   if (legitimate_indirect_address_p (x, reg_ok_strict))
     return 1;
   if (TARGET_UPDATE