Message ID | 20201023064508.GS4898@bubble.grove.modra.org |
---|---|
State | New |
Headers | show |
Series | [RS6000] altivec style lvx/stvx addresses vs power10 | expand |
Hi! On Fri, Oct 23, 2020 at 05:15:08PM +1030, Alan Modra wrote: > The problem starts with fwprop creating > (insn 9 4 0 2 (set (mem:V8HI (and:DI (plus:DI (reg/v/f:DI 121 [ vpp ]) > (const_int 12 [0xc])) > (const_int -16 [0xfffffffffffffff0])) [0 MEM <vector(8) short int> [(void *)_4 & -16B]+0 S16 A128]) > (reg/v:V8HI 120 [ vp1 ])) "pixel.c":6:10 1237 {vsx_movv8hi_64bit} > which is finally thrown out as invalid by lra. lra of course does that > by reloading the entire address. While it could/should just reload that 12 into a reg :-( Could you investigate doing that? Is there any reason we should not? > Now at the time the AND stripping was added (git commit 850e8d3d56d), > rs6000_legitimate_address looked a lot simpler. This patch allows > through just those addresses that were legitimate in those simpler > days. (legitimate_small_data_p, legitimate_constant_pool_address_p, > legitimate_lo_sum_address_p, and legitimate_offset_address_p did get > to look at the inside of an AND address, but I'm fairly certain > small_data, constant_pool and lo_sum addresses would never be wrapped > in an AND, and offset_address_p of that time excluded altivec modes.) > > Regstrapped powerpc64le-linux power8 and power10, with tests on > powerpc64-linux -m64/-m32 still running. OK assuming they pass? I would be much happier if this was performance tested. But, it is okay for trunk. Thanks! Segher
Hi Segher, On Fri, Oct 23, 2020 at 11:31:02AM -0500, Segher Boessenkool wrote: > On Fri, Oct 23, 2020 at 05:15:08PM +1030, Alan Modra wrote: > > The problem starts with fwprop creating > > (insn 9 4 0 2 (set (mem:V8HI (and:DI (plus:DI (reg/v/f:DI 121 [ vpp ]) > > (const_int 12 [0xc])) > > (const_int -16 [0xfffffffffffffff0])) [0 MEM <vector(8) short int> [(void *)_4 & -16B]+0 S16 A128]) > > (reg/v:V8HI 120 [ vp1 ])) "pixel.c":6:10 1237 {vsx_movv8hi_64bit} > > which is finally thrown out as invalid by lra. lra of course does that > > by reloading the entire address. > > While it could/should just reload that 12 into a reg :-( Could you > investigate doing that? Is there any reason we should not? My first thought was to write the necessary reloads too, but I think it's better to not generate the invalid addressing in the first place. Hiding the +12 in an invalid memory address means it isn't exposed to optimization passes.
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 2f0ca7af030..7fd80fb75ff 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -9128,15 +9128,21 @@ rs6000_legitimate_address_p (machine_mode mode, rtx x, bool reg_ok_strict) bool reg_offset_p = reg_offset_addressing_ok_p (mode); bool quad_offset_p = mode_supports_dq_form (mode); - /* If this is an unaligned stvx/ldvx type address, discard the outer AND. */ + if (TARGET_ELF && RS6000_SYMBOL_REF_TLS_P (x)) + return 0; + + /* Handle unaligned altivec lvx/stvx type addresses. */ if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) && GET_CODE (x) == AND && CONST_INT_P (XEXP (x, 1)) && INTVAL (XEXP (x, 1)) == -16) - x = XEXP (x, 0); + { + x = XEXP (x, 0); + return (legitimate_indirect_address_p (x, reg_ok_strict) + || legitimate_indexed_address_p (x, reg_ok_strict) + || virtual_stack_registers_memory_p (x)); + } - if (TARGET_ELF && RS6000_SYMBOL_REF_TLS_P (x)) - return 0; if (legitimate_indirect_address_p (x, reg_ok_strict)) return 1; if (TARGET_UPDATE