Message ID | alpine.LFD.2.21.2011232039080.656242@eddie.linux-mips.org |
---|---|
State | Superseded |
Headers | show |
Series | [v2,01/31] PR target/58901: reload: Handle SUBREG of MEM with a mode-dependent address | expand |
> First posted at: <https://gcc.gnu.org/ml/gcc/2014-06/msg00060.html>. > > 2020-11-24 Matt Thomas <matt@3am-software.com> > Maciej W. Rozycki <macro@linux-mips.org> > > gcc/ > PR target/58901 > * reload.c (push_reload): Also reload the inner expression of a > SUBREG for pseudos associated with a mode-dependent memory > reference. > (find_reloads): Force a reload likewise. Thanks for pursuing this. Although the duplication is unfortunate, I think that this version is more in line with the existing processing so I would go for it. You probably need a formal ack from Ulrich(?) or Jeff though.
On Tue, 24 Nov 2020, Eric Botcazou wrote: > > First posted at: <https://gcc.gnu.org/ml/gcc/2014-06/msg00060.html>. > > > > 2020-11-24 Matt Thomas <matt@3am-software.com> > > Maciej W. Rozycki <macro@linux-mips.org> > > > > gcc/ > > PR target/58901 > > * reload.c (push_reload): Also reload the inner expression of a > > SUBREG for pseudos associated with a mode-dependent memory > > reference. > > (find_reloads): Force a reload likewise. > > Thanks for pursuing this. Although the duplication is unfortunate, I think > that this version is more in line with the existing processing so I would go > for it. You probably need a formal ack from Ulrich(?) or Jeff though. This has completed verification now with good results. OK to apply then? That would let the whole series through once it's gone through final verification (now ongoing; ETC: Mon, Nov 30th). Maciej
On Thu, 26 Nov 2020, Maciej W. Rozycki wrote: > > > PR target/58901 > > > * reload.c (push_reload): Also reload the inner expression of a > > > SUBREG for pseudos associated with a mode-dependent memory > > > reference. > > > (find_reloads): Force a reload likewise. > > > > Thanks for pursuing this. Although the duplication is unfortunate, I think > > that this version is more in line with the existing processing so I would go > > for it. You probably need a formal ack from Ulrich(?) or Jeff though. > > This has completed verification now with good results. > > OK to apply then? That would let the whole series through once it's gone > through final verification (now ongoing; ETC: Mon, Nov 30th). FAOD verification included bootstraps with native `powerpc64le-linux-gnu' and `x86_64-linux-gnu' systems, a cross-build of a `vax-netbsdelf' target compiler with a `powerpc64le-linux-gnu' build/host system, and then full regression-testing of all the components built across the three setups. Maciej
On Tue, Nov 24, 2020 at 06:19:41AM +0000, Maciej W. Rozycki wrote: > NB I find the reindentation resulting in `push_reload' awful, just as I > do either version of the massive logical expression involved. Perhaps we > could factor these out into `static inline' functions sometime, and then > have them split into individual returns within? I'm generally hesitant to do a lot of changes to the reload code base at this stage. The strategy rather is to move all remaining targets over to LRA and then simply delete reload :-) Given that you're modernizing the vax target, I'm wondering if you shouldn't rather go all the way and move it over to LRA as well, then you wouldn't be affected by any remaining reload deficiencies. (The major hurdle so far was that LRA doesn't support CC0, but it looks like you're removing that anyway ...) > + && (strict_low > + || (subreg_lowpart_p (in) > + && (CONSTANT_P (SUBREG_REG (in)) > + || GET_CODE (SUBREG_REG (in)) == PLUS > + || strict_low If we do this, then that "|| strict_low" clause is now redundant. > + && REGNO (SUBREG_REG (in)) >= FIRST_PSEUDO_REGISTER > + && reg_equiv_mem (REGNO (SUBREG_REG (in))) > + && (mode_dependent_address_p > + (XEXP (reg_equiv_mem (REGNO (SUBREG_REG (in))), 0), > + MEM_ADDR_SPACE (reg_equiv_mem (REGNO (SUBREG_REG (in))))))))) I guess this is not incorrect, but I'm wondering if it would be *complete* -- there are other cases where reload replaces a pseudo with a MEM even where reg_equiv_mem is null. That said, if it fixes the test suite errors you're seeing, it would probably be OK to go with just this minimal change -- unless we can just move to LRA as mentioned above. Bye, Ulrich
On Fri, 27 Nov 2020, Ulrich Weigand wrote: > > NB I find the reindentation resulting in `push_reload' awful, just as I > > do either version of the massive logical expression involved. Perhaps we > > could factor these out into `static inline' functions sometime, and then > > have them split into individual returns within? > > I'm generally hesitant to do a lot of changes to the reload code base > at this stage. The strategy rather is to move all remaining targets > over to LRA and then simply delete reload :-) > > Given that you're modernizing the vax target, I'm wondering if you > shouldn't rather go all the way and move it over to LRA as well, > then you wouldn't be affected by any remaining reload deficiencies. > (The major hurdle so far was that LRA doesn't support CC0, but it > looks like you're removing that anyway ...) I considered moving to LRA, but decided to make one step at a time, especially given the number of issues the VAX port has been suffering from. For example there are cases evident from regression test failures where new pseudos are created past-reload. That would require tracking down, and I think switching to LRA would best be made with cleaner test results so as not to introduce another variable into the picture. So I would consider it GCC 12 material, so that we have an actual release with the VAX port converted to MODE_CC, but still using reload. I think it could make some backports easier too if NetBSD people wanted to do it. > > + && (strict_low > > + || (subreg_lowpart_p (in) > > + && (CONSTANT_P (SUBREG_REG (in)) > > + || GET_CODE (SUBREG_REG (in)) == PLUS > > + || strict_low > > If we do this, then that "|| strict_low" clause is now redundant. Oh! I noticed the redundancy (which was in the way of the extra condition I was about to add) and rewrote the expression so as to remove it, and looks like I then left the line in place. Maybe I didn't save the final change in the editor or something. Sigh! Thanks for spotting it. > > + && REGNO (SUBREG_REG (in)) >= FIRST_PSEUDO_REGISTER > > + && reg_equiv_mem (REGNO (SUBREG_REG (in))) > > + && (mode_dependent_address_p > > + (XEXP (reg_equiv_mem (REGNO (SUBREG_REG (in))), 0), > > + MEM_ADDR_SPACE (reg_equiv_mem (REGNO (SUBREG_REG (in))))))))) > > I guess this is not incorrect, but I'm wondering if it would be > *complete* -- there are other cases where reload replaces a pseudo > with a MEM even where reg_equiv_mem is null. I wasn't aware of that. What would be the usual scenario? > That said, if it fixes the test suite errors you're seeing, it would > probably be OK to go with just this minimal change -- unless we can > just move to LRA as mentioned above. I've looked through the test results and indeed these suspicious ICEs remain: .../gcc/testsuite/gcc.dg/pr83623.c:13:1: internal compiler error: in change_address_1, at emit-rtl.c:2275 .../gcc/testsuite/gcc.dg/torture/vshuf-main.inc:27:1: internal compiler error: in change_address_1, at emit-rtl.c:2275 corresponding to: FAIL: gcc.dg/pr83623.c (internal compiler error) FAIL: gcc.dg/pr83623.c (test for excess errors) FAIL: gcc.dg/torture/vshuf-v16qi.c -O2 (internal compiler error) FAIL: gcc.dg/torture/vshuf-v16qi.c -O2 (test for excess errors) I'll investigate. Maciej
On Fri, 27 Nov 2020, Maciej W. Rozycki wrote: > > That said, if it fixes the test suite errors you're seeing, it would > > probably be OK to go with just this minimal change -- unless we can > > just move to LRA as mentioned above. > > I've looked through the test results and indeed these suspicious ICEs > remain: > > .../gcc/testsuite/gcc.dg/pr83623.c:13:1: internal compiler error: in change_address_1, at emit-rtl.c:2275 > .../gcc/testsuite/gcc.dg/torture/vshuf-main.inc:27:1: internal compiler error: in change_address_1, at emit-rtl.c:2275 I've double-checked these and this: > corresponding to: > > FAIL: gcc.dg/pr83623.c (internal compiler error) > FAIL: gcc.dg/pr83623.c (test for excess errors) comes from this insn: (insn 17 14 145 (set (reg:SI 1 %r1) (zero_extract:SI (mem/c:SI (symbol_ref:SI ("x") <var_decl 0x7ffff7f80120 x>) [1 x+0 S4 A128]) (const_int 16 [0x10]) (const_int 16 [0x10]))) ".../gcc/testsuite/gcc.dg/pr83623.c":12:9 101 {*vax.md:805} (nil)) and this: > FAIL: gcc.dg/torture/vshuf-v16qi.c -O2 (internal compiler error) > FAIL: gcc.dg/torture/vshuf-v16qi.c -O2 (test for excess errors) likewise: (insn 83 82 84 (set (reg:SI 5 %r5 [84]) (zero_extract:SI (mem/c:SI (symbol_ref:SI ("b") <var_decl 0x7ffff7f801b0 b>) [0 b+0 S4 A128]) (const_int 8 [0x8]) (const_int 16 [0x10]))) ".../gcc/testsuite/gcc.dg/torture/vshuf-main.inc":28:1 101 {*vax.md:805} (nil)) So these are not related (and addressed with 22/31 BTW). I'll make the "|| strict_low" update then and push this change along with the rest once all the verification has completed, presumably this coming Monday. Thanks for your review! Maciej
On 11/27/20 12:22 PM, Maciej W. Rozycki wrote: > On Fri, 27 Nov 2020, Ulrich Weigand wrote: > >>> NB I find the reindentation resulting in `push_reload' awful, just as I >>> do either version of the massive logical expression involved. Perhaps we >>> could factor these out into `static inline' functions sometime, and then >>> have them split into individual returns within? >> I'm generally hesitant to do a lot of changes to the reload code base >> at this stage. The strategy rather is to move all remaining targets >> over to LRA and then simply delete reload :-) >> >> Given that you're modernizing the vax target, I'm wondering if you >> shouldn't rather go all the way and move it over to LRA as well, >> then you wouldn't be affected by any remaining reload deficiencies. >> (The major hurdle so far was that LRA doesn't support CC0, but it >> looks like you're removing that anyway ...) > I considered moving to LRA, but decided to make one step at a time, > especially given the number of issues the VAX port has been suffering > from. For example there are cases evident from regression test failures > where new pseudos are created past-reload. That would require tracking > down, and I think switching to LRA would best be made with cleaner test > results so as not to introduce another variable into the picture. > > So I would consider it GCC 12 material, so that we have an actual release > with the VAX port converted to MODE_CC, but still using reload. I think > it could make some backports easier too if NetBSD people wanted to do it. That was my plan on H8 as well (switch to LRA next cycle). I am aware of one reload issue that the move away from cc0 triggers, but I'm ignoring it for now. Jeff
Index: gcc/gcc/reload.c =================================================================== --- gcc.orig/gcc/reload.c +++ gcc/gcc/reload.c @@ -1043,53 +1043,73 @@ push_reload (rtx in, rtx out, rtx *inloc Also reload the inner expression if it does not require a secondary reload but the SUBREG does. - Finally, reload the inner expression if it is a register that is in + Also reload the inner expression if it is a register that is in the class whose registers cannot be referenced in a different size and M1 is not the same size as M2. If subreg_lowpart_p is false, we cannot reload just the inside since we might end up with the wrong register class. But if it is inside a STRICT_LOW_PART, we have - no choice, so we hope we do get the right register class there. */ + no choice, so we hope we do get the right register class there. + + Finally, reload the inner expression if it is a pseudo that will + become a MEM and the MEM has a mode-dependent address, as in that + case we obviously cannot change the mode of the MEM to that of the + containing SUBREG as that would change the interpretation of the + address. */ scalar_int_mode inner_mode; if (in != 0 && GET_CODE (in) == SUBREG - && (subreg_lowpart_p (in) || strict_low) && targetm.can_change_mode_class (GET_MODE (SUBREG_REG (in)), inmode, rclass) && contains_allocatable_reg_of_mode[rclass][GET_MODE (SUBREG_REG (in))] - && (CONSTANT_P (SUBREG_REG (in)) - || GET_CODE (SUBREG_REG (in)) == PLUS - || strict_low - || (((REG_P (SUBREG_REG (in)) - && REGNO (SUBREG_REG (in)) >= FIRST_PSEUDO_REGISTER) - || MEM_P (SUBREG_REG (in))) - && (paradoxical_subreg_p (inmode, GET_MODE (SUBREG_REG (in))) - || (known_le (GET_MODE_SIZE (inmode), UNITS_PER_WORD) - && is_a <scalar_int_mode> (GET_MODE (SUBREG_REG (in)), - &inner_mode) - && GET_MODE_SIZE (inner_mode) <= UNITS_PER_WORD - && paradoxical_subreg_p (inmode, inner_mode) - && LOAD_EXTEND_OP (inner_mode) != UNKNOWN) - || (WORD_REGISTER_OPERATIONS - && partial_subreg_p (inmode, GET_MODE (SUBREG_REG (in))) - && (known_equal_after_align_down - (GET_MODE_SIZE (inmode) - 1, - GET_MODE_SIZE (GET_MODE (SUBREG_REG (in))) - 1, - UNITS_PER_WORD))))) - || (REG_P (SUBREG_REG (in)) - && REGNO (SUBREG_REG (in)) < FIRST_PSEUDO_REGISTER - /* The case where out is nonzero - is handled differently in the following statement. */ - && (out == 0 || subreg_lowpart_p (in)) - && (complex_word_subreg_p (inmode, SUBREG_REG (in)) - || !targetm.hard_regno_mode_ok (subreg_regno (in), inmode))) - || (secondary_reload_class (1, rclass, inmode, in) != NO_REGS - && (secondary_reload_class (1, rclass, GET_MODE (SUBREG_REG (in)), - SUBREG_REG (in)) - == NO_REGS)) + && (strict_low + || (subreg_lowpart_p (in) + && (CONSTANT_P (SUBREG_REG (in)) + || GET_CODE (SUBREG_REG (in)) == PLUS + || strict_low + || (((REG_P (SUBREG_REG (in)) + && REGNO (SUBREG_REG (in)) >= FIRST_PSEUDO_REGISTER) + || MEM_P (SUBREG_REG (in))) + && (paradoxical_subreg_p (inmode, + GET_MODE (SUBREG_REG (in))) + || (known_le (GET_MODE_SIZE (inmode), UNITS_PER_WORD) + && is_a <scalar_int_mode> (GET_MODE (SUBREG_REG + (in)), + &inner_mode) + && GET_MODE_SIZE (inner_mode) <= UNITS_PER_WORD + && paradoxical_subreg_p (inmode, inner_mode) + && LOAD_EXTEND_OP (inner_mode) != UNKNOWN) + || (WORD_REGISTER_OPERATIONS + && partial_subreg_p (inmode, + GET_MODE (SUBREG_REG (in))) + && (known_equal_after_align_down + (GET_MODE_SIZE (inmode) - 1, + GET_MODE_SIZE (GET_MODE (SUBREG_REG + (in))) - 1, + UNITS_PER_WORD))))) + || (REG_P (SUBREG_REG (in)) + && REGNO (SUBREG_REG (in)) < FIRST_PSEUDO_REGISTER + /* The case where out is nonzero + is handled differently in the following statement. */ + && (out == 0 || subreg_lowpart_p (in)) + && (complex_word_subreg_p (inmode, SUBREG_REG (in)) + || !targetm.hard_regno_mode_ok (subreg_regno (in), + inmode))) + || (secondary_reload_class (1, rclass, inmode, in) != NO_REGS + && (secondary_reload_class (1, rclass, + GET_MODE (SUBREG_REG (in)), + SUBREG_REG (in)) + == NO_REGS)) + || (REG_P (SUBREG_REG (in)) + && REGNO (SUBREG_REG (in)) < FIRST_PSEUDO_REGISTER + && !REG_CAN_CHANGE_MODE_P (REGNO (SUBREG_REG (in)), + GET_MODE (SUBREG_REG (in)), + inmode)))) || (REG_P (SUBREG_REG (in)) - && REGNO (SUBREG_REG (in)) < FIRST_PSEUDO_REGISTER - && !REG_CAN_CHANGE_MODE_P (REGNO (SUBREG_REG (in)), - GET_MODE (SUBREG_REG (in)), inmode)))) + && REGNO (SUBREG_REG (in)) >= FIRST_PSEUDO_REGISTER + && reg_equiv_mem (REGNO (SUBREG_REG (in))) + && (mode_dependent_address_p + (XEXP (reg_equiv_mem (REGNO (SUBREG_REG (in))), 0), + MEM_ADDR_SPACE (reg_equiv_mem (REGNO (SUBREG_REG (in))))))))) { #ifdef LIMIT_RELOAD_CLASS in_subreg_loc = inloc; @@ -3157,6 +3177,19 @@ find_reloads (rtx_insn *insn, int replac && paradoxical_subreg_p (operand_mode[i], inner_mode) && LOAD_EXTEND_OP (inner_mode) != UNKNOWN))) + /* We must force a reload of a SUBREG's inner expression + if it is a pseudo that will become a MEM and the MEM + has a mode-dependent address, as in that case we + obviously cannot change the mode of the MEM to that + of the containing SUBREG as that would change the + interpretation of the address. */ + || (REG_P (operand) + && REGNO (operand) >= FIRST_PSEUDO_REGISTER + && reg_equiv_mem (REGNO (operand)) + && (mode_dependent_address_p + (XEXP (reg_equiv_mem (REGNO (operand)), 0), + (MEM_ADDR_SPACE + (reg_equiv_mem (REGNO (operand))))))) ) force_reload = 1; } Index: gcc/gcc/testsuite/gcc.c-torture/compile/pr58901-0.c =================================================================== --- /dev/null +++ gcc/gcc/testsuite/gcc.c-torture/compile/pr58901-0.c @@ -0,0 +1,17 @@ +typedef signed int __attribute__ ((mode (SI))) int_t; + +struct s +{ + int_t n; + int_t c[]; +}; + +int_t +ashlsi (int_t x, const struct s *s) +{ + int_t i; + + for (i = 0; i < s->n; i++) + x ^= 1 << s->c[i]; + return x; +} Index: gcc/gcc/testsuite/gcc.c-torture/compile/pr58901-1.c =================================================================== --- /dev/null +++ gcc/gcc/testsuite/gcc.c-torture/compile/pr58901-1.c @@ -0,0 +1,21 @@ +typedef int __attribute__ ((mode (SI))) int_t; + +struct s +{ + int_t n; + int_t m : 1; + int_t l : 31; +}; + +int_t +movdi (int_t x, const struct s *s) +{ + int_t i; + + for (i = 0; i < x; i++) + { + const struct s t = s[i]; + x += t.m ? 1 : 0; + } + return x; +}