Message ID | 20240617095336.871176-4-richard.sandiford@arm.com |
---|---|
State | New |
Headers | show |
Series | [1/8] Make force_subreg emit nothing on failure | expand |
On 6/17/24 3:53 AM, Richard Sandiford wrote: > This patch makes target-independent code use force_subreg instead > of simplify_gen_subreg in some places. The criteria were: > > (1) The code is obviously specific to expand (where new pseudos > can be created), or at least would be invalid to call when > !can_create_pseudo_p () and temporaries are needed. > > (2) The value is obviously an rvalue rather than an lvalue. > > (3) The offset wasn't a simple lowpart or highpart calculation; > a later patch will deal with those. > > Doing this should reduce the likelihood of bugs like PR115464 > occuring in other situations. > > gcc/ > * expmed.cc (store_bit_field_using_insv): Use force_subreg > instead of simplify_gen_subreg. > (store_bit_field_1): Likewise. > (extract_bit_field_as_subreg): Likewise. > (extract_integral_bit_field): Likewise. > (emit_store_flag_1): Likewise. > * expr.cc (convert_move): Likewise. > (convert_modes): Likewise. > (emit_group_load_1): Likewise. > (emit_group_store): Likewise. > (expand_assignment): Likewise. [ ... ] So this has triggered a failure on ft32-elf with this testcase (simplified from the testsuite): typedef _Bool bool; const bool false = 0; const bool true = 1; struct RenderBox { bool m_positioned : 1; }; typedef struct RenderBox RenderBox; void RenderBox_setStyle(RenderBox *thisin) { RenderBox *this = thisin; bool ltrue = true; this->m_positioned = ltrue; } Before this change we generated this: > (insn 13 12 14 (set (reg:QI 47) > (mem/c:QI (plus:SI (reg/f:SI 37 virtual-stack-vars) > (const_int -5 [0xfffffffffffffffb])) [1 ltrue+0 S1 A8])) "j.c":17:22 -1 > (nil)) > > (insn 14 13 15 (parallel [ > (set (zero_extract:SI (subreg:SI (reg:QI 46) 0) > (const_int 1 [0x1]) > (const_int 0 [0])) > (subreg:SI (reg:QI 47) 0)) > (clobber (scratch:SI)) > ]) "j.c":17:22 -1 > (nil)) Afterwards we generate: > (insn 13 12 14 2 (parallel [ > (set (zero_extract:SI (subreg:SI (reg:QI 46) 0) > (const_int 1 [0x1]) > (const_int 0 [0])) > (subreg:SI (mem/c:QI (plus:SI (reg/f:SI 37 virtual-stack-vars) > (const_int -5 [0xfffffffffffffffb])) [1 ltrue+0 S1 A8]) 0)) > (clobber (scratch:SI)) > ]) "j.c":17:22 -1 > (nil)) Note the (subreg (mem (...)). Probably not desirable in general, but also note the virtual-stack-vars in the memory address. The code to instantiate virtual registers doesn't handle (subreg (mem)), so we never convert that to an FP based address and we eventually fault. Should be visible with ft32-elf cross compiler. No options needed. Jeff
On Fri, Jun 21, 2024 at 1:11 PM Jeff Law <jeffreyalaw@gmail.com> wrote: > > > > On 6/17/24 3:53 AM, Richard Sandiford wrote: > > This patch makes target-independent code use force_subreg instead > > of simplify_gen_subreg in some places. The criteria were: > > > > (1) The code is obviously specific to expand (where new pseudos > > can be created), or at least would be invalid to call when > > !can_create_pseudo_p () and temporaries are needed. > > > > (2) The value is obviously an rvalue rather than an lvalue. > > > > (3) The offset wasn't a simple lowpart or highpart calculation; > > a later patch will deal with those. > > > > Doing this should reduce the likelihood of bugs like PR115464 > > occuring in other situations. > > > > gcc/ > > * expmed.cc (store_bit_field_using_insv): Use force_subreg > > instead of simplify_gen_subreg. > > (store_bit_field_1): Likewise. > > (extract_bit_field_as_subreg): Likewise. > > (extract_integral_bit_field): Likewise. > > (emit_store_flag_1): Likewise. > > * expr.cc (convert_move): Likewise. > > (convert_modes): Likewise. > > (emit_group_load_1): Likewise. > > (emit_group_store): Likewise. > > (expand_assignment): Likewise. > [ ... ] > > So this has triggered a failure on ft32-elf with this testcase > (simplified from the testsuite): > > typedef _Bool bool; > const bool false = 0; > const bool true = 1; > > struct RenderBox > { > bool m_positioned : 1; > }; > > typedef struct RenderBox RenderBox; > > > void RenderBox_setStyle(RenderBox *thisin) > { > RenderBox *this = thisin; > bool ltrue = true; > this->m_positioned = ltrue; > > } > > > > Before this change we generated this: > > > (insn 13 12 14 (set (reg:QI 47) > > (mem/c:QI (plus:SI (reg/f:SI 37 virtual-stack-vars) > > (const_int -5 [0xfffffffffffffffb])) [1 ltrue+0 S1 A8])) "j.c":17:22 -1 > > (nil)) > > > > (insn 14 13 15 (parallel [ > > (set (zero_extract:SI (subreg:SI (reg:QI 46) 0) > > (const_int 1 [0x1]) > > (const_int 0 [0])) > > (subreg:SI (reg:QI 47) 0)) > > (clobber (scratch:SI)) > > ]) "j.c":17:22 -1 > > (nil)) > > > Afterwards we generate: > > > (insn 13 12 14 2 (parallel [ > > (set (zero_extract:SI (subreg:SI (reg:QI 46) 0) > > (const_int 1 [0x1]) > > (const_int 0 [0])) > > (subreg:SI (mem/c:QI (plus:SI (reg/f:SI 37 virtual-stack-vars) > > (const_int -5 [0xfffffffffffffffb])) [1 ltrue+0 S1 A8]) 0)) > > (clobber (scratch:SI)) > > ]) "j.c":17:22 -1 > > (nil)) > > Note the (subreg (mem (...)). Probably not desirable in general, but > also note the virtual-stack-vars in the memory address. The code to > instantiate virtual registers doesn't handle (subreg (mem)), so we never > convert that to an FP based address and we eventually fault. We should really get rid of the support of `(subreg (mem))` as a valid for register_operand (recorg.cc). Two ideas on how to fix this before removing `(subreg (mem))` support from register_operand: 1) Maybe for now reject subreg of mem inside validate_subreg that have virtual-stack-vars addresses. 2) Or we add the code to instantiate virtual registers to handle (subreg (mem)). Maybe we should just bite the bullet and remove support of `(subreg (mem))` from register_operand instead of hacking around this preexisting mess. Also see https://inbox.sourceware.org/gcc-patches/485B2857.2070003@naturalbridge.com/ Which is from 2008 about this subreg of mem. Thanks, Andrew > > Should be visible with ft32-elf cross compiler. No options needed. > > Jeff > >
Jeff Law <jeffreyalaw@gmail.com> writes: > On 6/17/24 3:53 AM, Richard Sandiford wrote: >> This patch makes target-independent code use force_subreg instead >> of simplify_gen_subreg in some places. The criteria were: >> >> (1) The code is obviously specific to expand (where new pseudos >> can be created), or at least would be invalid to call when >> !can_create_pseudo_p () and temporaries are needed. >> >> (2) The value is obviously an rvalue rather than an lvalue. >> >> (3) The offset wasn't a simple lowpart or highpart calculation; >> a later patch will deal with those. >> >> Doing this should reduce the likelihood of bugs like PR115464 >> occuring in other situations. >> >> gcc/ >> * expmed.cc (store_bit_field_using_insv): Use force_subreg >> instead of simplify_gen_subreg. >> (store_bit_field_1): Likewise. >> (extract_bit_field_as_subreg): Likewise. >> (extract_integral_bit_field): Likewise. >> (emit_store_flag_1): Likewise. >> * expr.cc (convert_move): Likewise. >> (convert_modes): Likewise. >> (emit_group_load_1): Likewise. >> (emit_group_store): Likewise. >> (expand_assignment): Likewise. > [ ... ] > > So this has triggered a failure on ft32-elf with this testcase > (simplified from the testsuite): > > typedef _Bool bool; > const bool false = 0; > const bool true = 1; > > struct RenderBox > { > bool m_positioned : 1; > }; > > typedef struct RenderBox RenderBox; > > > void RenderBox_setStyle(RenderBox *thisin) > { > RenderBox *this = thisin; > bool ltrue = true; > this->m_positioned = ltrue; > > } > > > > Before this change we generated this: > >> (insn 13 12 14 (set (reg:QI 47) >> (mem/c:QI (plus:SI (reg/f:SI 37 virtual-stack-vars) >> (const_int -5 [0xfffffffffffffffb])) [1 ltrue+0 S1 A8])) "j.c":17:22 -1 >> (nil)) >> >> (insn 14 13 15 (parallel [ >> (set (zero_extract:SI (subreg:SI (reg:QI 46) 0) >> (const_int 1 [0x1]) >> (const_int 0 [0])) >> (subreg:SI (reg:QI 47) 0)) >> (clobber (scratch:SI)) >> ]) "j.c":17:22 -1 >> (nil)) > > > Afterwards we generate: > >> (insn 13 12 14 2 (parallel [ >> (set (zero_extract:SI (subreg:SI (reg:QI 46) 0) >> (const_int 1 [0x1]) >> (const_int 0 [0])) >> (subreg:SI (mem/c:QI (plus:SI (reg/f:SI 37 virtual-stack-vars) >> (const_int -5 [0xfffffffffffffffb])) [1 ltrue+0 S1 A8]) 0)) >> (clobber (scratch:SI)) >> ]) "j.c":17:22 -1 >> (nil)) > > Note the (subreg (mem (...)). Probably not desirable in general, but > also note the virtual-stack-vars in the memory address. The code to > instantiate virtual registers doesn't handle (subreg (mem)), so we never > convert that to an FP based address and we eventually fault. > > Should be visible with ft32-elf cross compiler. No options needed. Bah. Thanks for the report. I agree of course with the follow-on discussion that we should get rid of (subreg (mem)). But this was supposed to be a conservative patch. I've therefore reverted the offending part of the commit, as below. (Tested on aarch64-linux-gnu.) Richard One of the changes in g:d4047da6a070175aae7121c739d1cad6b08ff4b2 caused a regression in ft32-elf; see: https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655418.html for details. This change was different from the others in that the original call was to simplify_subreg rather than simplify_lowpart_subreg. The old code would therefore go on to do the force_reg for more cases than the new code would. gcc/ * expmed.cc (store_bit_field_using_insv): Revert earlier change to use force_subreg instead of simplify_gen_subreg. --- gcc/expmed.cc | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/gcc/expmed.cc b/gcc/expmed.cc index 3b9475f5aa0..8bbbc94a98c 100644 --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -695,7 +695,13 @@ store_bit_field_using_insv (const extraction_insn *insv, rtx op0, if we must narrow it, be sure we do it correctly. */ if (GET_MODE_SIZE (value_mode) < GET_MODE_SIZE (op_mode)) - tmp = force_subreg (op_mode, value1, value_mode, 0); + { + tmp = simplify_subreg (op_mode, value1, value_mode, 0); + if (! tmp) + tmp = simplify_gen_subreg (op_mode, + force_reg (value_mode, value1), + value_mode, 0); + } else { if (targetm.mode_rep_extended (op_mode, value_mode) != UNKNOWN)
diff --git a/gcc/expmed.cc b/gcc/expmed.cc index 9ba01695f53..1f68e7be721 100644 --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -695,13 +695,7 @@ store_bit_field_using_insv (const extraction_insn *insv, rtx op0, if we must narrow it, be sure we do it correctly. */ if (GET_MODE_SIZE (value_mode) < GET_MODE_SIZE (op_mode)) - { - tmp = simplify_subreg (op_mode, value1, value_mode, 0); - if (! tmp) - tmp = simplify_gen_subreg (op_mode, - force_reg (value_mode, value1), - value_mode, 0); - } + tmp = force_subreg (op_mode, value1, value_mode, 0); else { if (targetm.mode_rep_extended (op_mode, value_mode) != UNKNOWN) @@ -806,7 +800,7 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum, if (known_eq (bitnum, 0U) && known_eq (bitsize, GET_MODE_BITSIZE (GET_MODE (op0)))) { - sub = simplify_gen_subreg (GET_MODE (op0), value, fieldmode, 0); + sub = force_subreg (GET_MODE (op0), value, fieldmode, 0); if (sub) { if (reverse) @@ -1633,7 +1627,7 @@ extract_bit_field_as_subreg (machine_mode mode, rtx op0, && known_eq (bitsize, GET_MODE_BITSIZE (mode)) && lowpart_bit_field_p (bitnum, bitsize, op0_mode) && TRULY_NOOP_TRUNCATION_MODES_P (mode, op0_mode)) - return simplify_gen_subreg (mode, op0, op0_mode, bytenum); + return force_subreg (mode, op0, op0_mode, bytenum); return NULL_RTX; } @@ -2000,11 +1994,11 @@ extract_integral_bit_field (rtx op0, opt_scalar_int_mode op0_mode, return convert_extracted_bit_field (target, mode, tmode, unsignedp); } /* If OP0 is a hard register, copy it to a pseudo before calling - simplify_gen_subreg. */ + force_subreg. */ if (REG_P (op0) && HARD_REGISTER_P (op0)) op0 = copy_to_reg (op0); - op0 = simplify_gen_subreg (word_mode, op0, op0_mode.require (), - bitnum / BITS_PER_WORD * UNITS_PER_WORD); + op0 = force_subreg (word_mode, op0, op0_mode.require (), + bitnum / BITS_PER_WORD * UNITS_PER_WORD); op0_mode = word_mode; bitnum %= BITS_PER_WORD; } @@ -5774,8 +5768,8 @@ emit_store_flag_1 (rtx target, enum rtx_code code, rtx op0, rtx op1, /* Do a logical OR or AND of the two words and compare the result. */ - op00 = simplify_gen_subreg (word_mode, op0, int_mode, 0); - op01 = simplify_gen_subreg (word_mode, op0, int_mode, UNITS_PER_WORD); + op00 = force_subreg (word_mode, op0, int_mode, 0); + op01 = force_subreg (word_mode, op0, int_mode, UNITS_PER_WORD); tem = expand_binop (word_mode, op1 == const0_rtx ? ior_optab : and_optab, op00, op01, NULL_RTX, unsignedp, diff --git a/gcc/expr.cc b/gcc/expr.cc index 9cecc1758f5..31a7346e33f 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -301,7 +301,7 @@ convert_move (rtx to, rtx from, int unsignedp) GET_MODE_BITSIZE (to_mode))); if (VECTOR_MODE_P (to_mode)) - from = simplify_gen_subreg (to_mode, from, GET_MODE (from), 0); + from = force_subreg (to_mode, from, GET_MODE (from), 0); else to = simplify_gen_subreg (from_mode, to, GET_MODE (to), 0); @@ -935,7 +935,7 @@ convert_modes (machine_mode mode, machine_mode oldmode, rtx x, int unsignedp) { gcc_assert (known_eq (GET_MODE_BITSIZE (mode), GET_MODE_BITSIZE (oldmode))); - return simplify_gen_subreg (mode, x, oldmode, 0); + return force_subreg (mode, x, oldmode, 0); } temp = gen_reg_rtx (mode); @@ -3072,8 +3072,8 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src, tree type, } } else if (CONSTANT_P (src) && GET_MODE (dst) != BLKmode - && XVECLEN (dst, 0) > 1) - tmps[i] = simplify_gen_subreg (mode, src, GET_MODE (dst), bytepos); + && XVECLEN (dst, 0) > 1) + tmps[i] = force_subreg (mode, src, GET_MODE (dst), bytepos); else if (CONSTANT_P (src)) { if (known_eq (bytelen, ssize)) @@ -3297,7 +3297,7 @@ emit_group_store (rtx orig_dst, rtx src, tree type ATTRIBUTE_UNUSED, if (known_eq (rtx_to_poly_int64 (XEXP (XVECEXP (src, 0, start), 1)), bytepos)) { - temp = simplify_gen_subreg (outer, tmps[start], inner, 0); + temp = force_subreg (outer, tmps[start], inner, 0); if (temp) { emit_move_insn (dst, temp); @@ -3317,7 +3317,7 @@ emit_group_store (rtx orig_dst, rtx src, tree type ATTRIBUTE_UNUSED, finish - 1), 1)), bytepos)) { - temp = simplify_gen_subreg (outer, tmps[finish - 1], inner, 0); + temp = force_subreg (outer, tmps[finish - 1], inner, 0); if (temp) { emit_move_insn (dst, temp); @@ -6191,11 +6191,9 @@ expand_assignment (tree to, tree from, bool nontemporal) to_mode = GET_MODE_INNER (to_mode); machine_mode from_mode = GET_MODE_INNER (GET_MODE (result)); rtx from_real - = simplify_gen_subreg (to_mode, XEXP (result, 0), - from_mode, 0); + = force_subreg (to_mode, XEXP (result, 0), from_mode, 0); rtx from_imag - = simplify_gen_subreg (to_mode, XEXP (result, 1), - from_mode, 0); + = force_subreg (to_mode, XEXP (result, 1), from_mode, 0); if (!from_real || !from_imag) goto concat_store_slow; emit_move_insn (XEXP (to_rtx, 0), from_real); @@ -6211,8 +6209,7 @@ expand_assignment (tree to, tree from, bool nontemporal) if (MEM_P (result)) from_rtx = change_address (result, to_mode, NULL_RTX); else - from_rtx - = simplify_gen_subreg (to_mode, result, from_mode, 0); + from_rtx = force_subreg (to_mode, result, from_mode, 0); if (from_rtx) { emit_move_insn (XEXP (to_rtx, 0), @@ -6224,10 +6221,10 @@ expand_assignment (tree to, tree from, bool nontemporal) { to_mode = GET_MODE_INNER (to_mode); rtx from_real - = simplify_gen_subreg (to_mode, result, from_mode, 0); + = force_subreg (to_mode, result, from_mode, 0); rtx from_imag - = simplify_gen_subreg (to_mode, result, from_mode, - GET_MODE_SIZE (to_mode)); + = force_subreg (to_mode, result, from_mode, + GET_MODE_SIZE (to_mode)); if (!from_real || !from_imag) goto concat_store_slow; emit_move_insn (XEXP (to_rtx, 0), from_real);