Message ID | 20180305165139.GA27394@arm.com |
---|---|
State | New |
Headers | show |
Series | [ARM] Fix can_change_mode_class for big-endian | expand |
Ping > -----Original Message----- > From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches- > owner@gcc.gnu.org] On Behalf Of Tamar Christina > Sent: Monday, March 5, 2018 16:52 > To: gcc-patches@gcc.gnu.org > Cc: nd <nd@arm.com>; Ramana Radhakrishnan > <Ramana.Radhakrishnan@arm.com>; Richard Earnshaw > <Richard.Earnshaw@arm.com>; nickc@redhat.com; Kyrylo Tkachov > <Kyrylo.Tkachov@arm.com> > Subject: [PATCH][GCC][ARM] Fix can_change_mode_class for big-endian > > Hi All, > > Taking the subreg of a vector mode on big-endian may result in an infinite > recursion and eventually a segfault once we run out of stack space. > > As an example, taking a subreg of V4HF to SImode we end up in the following > loop on big-endian: > > #861 0x00000000008462e9 in operand_subword_force src/gcc/gcc/emit- > rtl.c:1787 > #862 0x0000000000882a90 in emit_move_multi_word > src/gcc/gcc/expr.c:3621 > #863 0x000000000087eea1 in emit_move_insn_1 src/gcc/gcc/expr.c:3698 > #864 0x000000000087f350 in emit_move_insn src/gcc/gcc/expr.c:3757 > #865 0x000000000085e326 in copy_to_reg src/gcc/gcc/explow.c:603 > #866 0x00000000008462e9 in operand_subword_force src/gcc/gcc/emit- > rtl.c:1787 > > The reason is that operand_subword_force will always fail. When the value is > in a register that can't be accessed as a multi word the code tries to create a > new psuedo register and emit the value to it. Eventually you end up in > simplify_gen_subreg which calls validate_subreg. > > validate_subreg will however always fail because of the > REG_CAN_CHANGE_MODE_P check. > > On little endian this check always returns true. On big-endian this check is > supposed to prevent values that have a size larger than word size, due to > those being stored in VFP registers. > > However we are only interested in a subreg of the vector mode, so we > should be checking the unit size, not the size of the entire mode. Doing this > fixes the problem. > > Regtested on armeb-none-eabi and no regressions. > Bootstrapped on arm-none-linux-gnueabihf and no issues. > > Ok for trunk? and for backport to GCC 7? > > Thanks, > Tamar > > gcc/ > 2018-03-05 Tamar Christina <tamar.christina@arm.com> > > PR target/84711 > * config/arm/arm.c (arm_can_change_mode_class): Use > GET_MODE_UNIT_SIZE > instead of GET_MODE_SIZE when comparing Units. > > gcc/testsuite/ > 2018-03-05 Tamar Christina <tamar.christina@arm.com> > > PR target/84711 > * gcc.target/arm/big-endian-subreg.c: New. > > --
Hi Tamar, On 05/03/18 16:51, Tamar Christina wrote: > Hi All, > > Taking the subreg of a vector mode on big-endian may result in an infinite > recursion and eventually a segfault once we run out of stack space. > > As an example, taking a subreg of V4HF to SImode we end up in the following > loop on big-endian: > > #861 0x00000000008462e9 in operand_subword_force src/gcc/gcc/emit-rtl.c:1787 > #862 0x0000000000882a90 in emit_move_multi_word src/gcc/gcc/expr.c:3621 > #863 0x000000000087eea1 in emit_move_insn_1 src/gcc/gcc/expr.c:3698 > #864 0x000000000087f350 in emit_move_insn src/gcc/gcc/expr.c:3757 > #865 0x000000000085e326 in copy_to_reg src/gcc/gcc/explow.c:603 > #866 0x00000000008462e9 in operand_subword_force src/gcc/gcc/emit-rtl.c:1787 > > The reason is that operand_subword_force will always fail. When the value is in > a register that can't be accessed as a multi word the code tries to create a new > psuedo register and emit the value to it. Eventually you end up in simplify_gen_subreg > which calls validate_subreg. > > validate_subreg will however always fail because of the REG_CAN_CHANGE_MODE_P check. > > On little endian this check always returns true. On big-endian this check is supposed > to prevent values that have a size larger than word size, due to those being stored in > VFP registers. > > However we are only interested in a subreg of the vector mode, so we should be checking > the unit size, not the size of the entire mode. Doing this fixes the problem. > > Regtested on armeb-none-eabi and no regressions. > Bootstrapped on arm-none-linux-gnueabihf and no issues. > > Ok for trunk? and for backport to GCC 7? > Ok for trunk. Please wait for a few days before backporting. Thanks, Kyrill > Thanks, > Tamar > > gcc/ > 2018-03-05 Tamar Christina <tamar.christina@arm.com> > > PR target/84711 > * config/arm/arm.c (arm_can_change_mode_class): Use GET_MODE_UNIT_SIZE > instead of GET_MODE_SIZE when comparing Units. > > gcc/testsuite/ > 2018-03-05 Tamar Christina <tamar.christina@arm.com> > > PR target/84711 > * gcc.target/arm/big-endian-subreg.c: New. > > --
On 15 March 2018 at 11:19, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: > Hi Tamar, > > > On 05/03/18 16:51, Tamar Christina wrote: >> >> Hi All, >> >> Taking the subreg of a vector mode on big-endian may result in an infinite >> recursion and eventually a segfault once we run out of stack space. >> >> As an example, taking a subreg of V4HF to SImode we end up in the >> following >> loop on big-endian: >> >> #861 0x00000000008462e9 in operand_subword_force >> src/gcc/gcc/emit-rtl.c:1787 >> #862 0x0000000000882a90 in emit_move_multi_word src/gcc/gcc/expr.c:3621 >> #863 0x000000000087eea1 in emit_move_insn_1 src/gcc/gcc/expr.c:3698 >> #864 0x000000000087f350 in emit_move_insn src/gcc/gcc/expr.c:3757 >> #865 0x000000000085e326 in copy_to_reg src/gcc/gcc/explow.c:603 >> #866 0x00000000008462e9 in operand_subword_force >> src/gcc/gcc/emit-rtl.c:1787 >> >> The reason is that operand_subword_force will always fail. When the value >> is in >> a register that can't be accessed as a multi word the code tries to create >> a new >> psuedo register and emit the value to it. Eventually you end up in >> simplify_gen_subreg >> which calls validate_subreg. >> >> validate_subreg will however always fail because of the >> REG_CAN_CHANGE_MODE_P check. >> >> On little endian this check always returns true. On big-endian this check >> is supposed >> to prevent values that have a size larger than word size, due to those >> being stored in >> VFP registers. >> >> However we are only interested in a subreg of the vector mode, so we >> should be checking >> the unit size, not the size of the entire mode. Doing this fixes the >> problem. >> >> Regtested on armeb-none-eabi and no regressions. >> Bootstrapped on arm-none-linux-gnueabihf and no issues. >> >> Ok for trunk? and for backport to GCC 7? >> > > Ok for trunk. > Please wait for a few days before backporting. > Hi Tamar, Strangely I have noticed regressions on armeb, I have updated bugzilla accordingly. Thanks, Christophe > Thanks, > Kyrill > > >> Thanks, >> Tamar >> >> gcc/ >> 2018-03-05 Tamar Christina <tamar.christina@arm.com> >> >> PR target/84711 >> * config/arm/arm.c (arm_can_change_mode_class): Use >> GET_MODE_UNIT_SIZE >> instead of GET_MODE_SIZE when comparing Units. >> >> gcc/testsuite/ >> 2018-03-05 Tamar Christina <tamar.christina@arm.com> >> >> PR target/84711 >> * gcc.target/arm/big-endian-subreg.c: New. >> >> -- > >
Sending to list Thu 03/19/2018 08:48, Tamar Christina wrote: > Hi Christophe, > > The 03/16/2018 13:50, Christophe Lyon wrote: > > On 15 March 2018 at 11:19, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: > > > Hi Tamar, > > > > > > > > >> Regtested on armeb-none-eabi and no regressions. > > >> Bootstrapped on arm-none-linux-gnueabihf and no issues. > > >> > > >> Ok for trunk? and for backport to GCC 7? > > >> > > > > > > Ok for trunk. > > > Please wait for a few days before backporting. > > > > > > > Hi Tamar, > > > > Strangely I have noticed regressions on armeb, I have updated bugzilla > > accordingly. > > Thanks, seems testsuite didn't catch it with our default options. > > This seems to be a combine issue as it's removing subregs it thinks is a > no-op, causing the vec_select not to be done. It was working before because > we were saying any subreg is invalid in arm-be, so if the code didn't get > stuck in an endless loop, it would skip the no-op check. > > I don't see anyway to fix this in the back-end alone as combine gives no way > to tell it that the instruction is not a no-op. Even if I split the instruction > early on to explicitly use a new register, combine will combine the mov and vec_select > again and come to the same conclusion. = > > So I will revert this for GCC 8 and fix it for GCC 9. Better the compiler ICE than > generate bad code. > > Thanks, > Tamar > > > > > Thanks, > > > > Christophe > > > > > Thanks, > > > Kyrill > > > > > > > > >> Thanks, > > >> Tamar > > >> > > >> gcc/ > > >> 2018-03-05 Tamar Christina <tamar.christina@arm.com> > > >> > > >> PR target/84711 > > >> * config/arm/arm.c (arm_can_change_mode_class): Use > > >> GET_MODE_UNIT_SIZE > > >> instead of GET_MODE_SIZE when comparing Units. > > >> > > >> gcc/testsuite/ > > >> 2018-03-05 Tamar Christina <tamar.christina@arm.com> > > >> > > >> PR target/84711 > > >> * gcc.target/arm/big-endian-subreg.c: New. > > >> > > >> -- > > > > > > > > -- --
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 90d62e699bce9594879be2e3016c9b36c7e064c8..703632240822e762a906570964b949c783df56f3 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -31508,8 +31508,8 @@ arm_can_change_mode_class (machine_mode from, machine_mode to, { if (TARGET_BIG_END && !(GET_MODE_SIZE (from) == 16 && GET_MODE_SIZE (to) == 8) - && (GET_MODE_SIZE (from) > UNITS_PER_WORD - || GET_MODE_SIZE (to) > UNITS_PER_WORD) + && (GET_MODE_UNIT_SIZE (from) > UNITS_PER_WORD + || GET_MODE_UNIT_SIZE (to) > UNITS_PER_WORD) && reg_classes_intersect_p (VFP_REGS, rclass)) return false; return true; diff --git a/gcc/testsuite/gcc.target/arm/big-endian-subreg.c b/gcc/testsuite/gcc.target/arm/big-endian-subreg.c new file mode 100644 index 0000000000000000000000000000000000000000..4b1ab122f349e61e296fe3f76030a7a258e55f62 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/big-endian-subreg.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-require-effective-target arm_hf_eabi } */ +/* { dg-add-options arm_neon } */ +/* { dg-additional-options "-mfp16-format=ieee -mfloat-abi=hard" } */ + +typedef __fp16 v4f16 + __attribute__ ((vector_size (8))); + +v4f16 fn1 (v4f16 p) +{ + return p; +}