diff mbox series

[3/8] Make more use of force_subreg

Message ID 20240617095336.871176-4-richard.sandiford@arm.com
State New
Headers show
Series [1/8] Make force_subreg emit nothing on failure | expand

Commit Message

Richard Sandiford June 17, 2024, 9:53 a.m. UTC
This patch makes target-independent code use force_subreg instead
of simplify_gen_subreg in some places.  The criteria were:

(1) The code is obviously specific to expand (where new pseudos
    can be created), or at least would be invalid to call when
    !can_create_pseudo_p () and temporaries are needed.

(2) The value is obviously an rvalue rather than an lvalue.

(3) The offset wasn't a simple lowpart or highpart calculation;
    a later patch will deal with those.

Doing this should reduce the likelihood of bugs like PR115464
occuring in other situations.

gcc/
	* expmed.cc (store_bit_field_using_insv): Use force_subreg
	instead of simplify_gen_subreg.
	(store_bit_field_1): Likewise.
	(extract_bit_field_as_subreg): Likewise.
	(extract_integral_bit_field): Likewise.
	(emit_store_flag_1): Likewise.
	* expr.cc (convert_move): Likewise.
	(convert_modes): Likewise.
	(emit_group_load_1): Likewise.
	(emit_group_store): Likewise.
	(expand_assignment): Likewise.
---
 gcc/expmed.cc | 22 ++++++++--------------
 gcc/expr.cc   | 27 ++++++++++++---------------
 2 files changed, 20 insertions(+), 29 deletions(-)

Comments

Jeff Law June 21, 2024, 8:10 p.m. UTC | #1
On 6/17/24 3:53 AM, Richard Sandiford wrote:
> This patch makes target-independent code use force_subreg instead
> of simplify_gen_subreg in some places.  The criteria were:
> 
> (1) The code is obviously specific to expand (where new pseudos
>      can be created), or at least would be invalid to call when
>      !can_create_pseudo_p () and temporaries are needed.
> 
> (2) The value is obviously an rvalue rather than an lvalue.
> 
> (3) The offset wasn't a simple lowpart or highpart calculation;
>      a later patch will deal with those.
> 
> Doing this should reduce the likelihood of bugs like PR115464
> occuring in other situations.
> 
> gcc/
> 	* expmed.cc (store_bit_field_using_insv): Use force_subreg
> 	instead of simplify_gen_subreg.
> 	(store_bit_field_1): Likewise.
> 	(extract_bit_field_as_subreg): Likewise.
> 	(extract_integral_bit_field): Likewise.
> 	(emit_store_flag_1): Likewise.
> 	* expr.cc (convert_move): Likewise.
> 	(convert_modes): Likewise.
> 	(emit_group_load_1): Likewise.
> 	(emit_group_store): Likewise.
> 	(expand_assignment): Likewise.
[ ... ]

So this has triggered a failure on ft32-elf with this testcase 
(simplified from the testsuite):

typedef _Bool bool;
const bool false = 0;
const bool true = 1;

struct RenderBox
{
   bool m_positioned : 1;
};

typedef struct RenderBox RenderBox;


void RenderBox_setStyle(RenderBox *thisin)
{
   RenderBox *this = thisin;
   bool ltrue = true;
   this->m_positioned = ltrue;

}



Before this change we generated this:

> (insn 13 12 14 (set (reg:QI 47)
>         (mem/c:QI (plus:SI (reg/f:SI 37 virtual-stack-vars)
>                 (const_int -5 [0xfffffffffffffffb])) [1 ltrue+0 S1 A8])) "j.c":17:22 -1
>      (nil))
> 
> (insn 14 13 15 (parallel [
>             (set (zero_extract:SI (subreg:SI (reg:QI 46) 0)
>                     (const_int 1 [0x1])
>                     (const_int 0 [0]))
>                 (subreg:SI (reg:QI 47) 0))
>             (clobber (scratch:SI))
>         ]) "j.c":17:22 -1
>      (nil))


Afterwards we generate:

> (insn 13 12 14 2 (parallel [
>             (set (zero_extract:SI (subreg:SI (reg:QI 46) 0)
>                     (const_int 1 [0x1])
>                     (const_int 0 [0]))
>                 (subreg:SI (mem/c:QI (plus:SI (reg/f:SI 37 virtual-stack-vars)
>                             (const_int -5 [0xfffffffffffffffb])) [1 ltrue+0 S1 A8]) 0))
>             (clobber (scratch:SI))
>         ]) "j.c":17:22 -1
>      (nil))

Note the (subreg (mem (...)).  Probably not desirable in general, but 
also note the virtual-stack-vars in the memory address.  The code to 
instantiate virtual registers doesn't handle (subreg (mem)), so we never 
convert that to an FP based address and we eventually fault.

Should be visible with ft32-elf cross compiler.  No options needed.

Jeff
Andrew Pinski June 22, 2024, 1:30 a.m. UTC | #2
On Fri, Jun 21, 2024 at 1:11 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 6/17/24 3:53 AM, Richard Sandiford wrote:
> > This patch makes target-independent code use force_subreg instead
> > of simplify_gen_subreg in some places.  The criteria were:
> >
> > (1) The code is obviously specific to expand (where new pseudos
> >      can be created), or at least would be invalid to call when
> >      !can_create_pseudo_p () and temporaries are needed.
> >
> > (2) The value is obviously an rvalue rather than an lvalue.
> >
> > (3) The offset wasn't a simple lowpart or highpart calculation;
> >      a later patch will deal with those.
> >
> > Doing this should reduce the likelihood of bugs like PR115464
> > occuring in other situations.
> >
> > gcc/
> >       * expmed.cc (store_bit_field_using_insv): Use force_subreg
> >       instead of simplify_gen_subreg.
> >       (store_bit_field_1): Likewise.
> >       (extract_bit_field_as_subreg): Likewise.
> >       (extract_integral_bit_field): Likewise.
> >       (emit_store_flag_1): Likewise.
> >       * expr.cc (convert_move): Likewise.
> >       (convert_modes): Likewise.
> >       (emit_group_load_1): Likewise.
> >       (emit_group_store): Likewise.
> >       (expand_assignment): Likewise.
> [ ... ]
>
> So this has triggered a failure on ft32-elf with this testcase
> (simplified from the testsuite):
>
> typedef _Bool bool;
> const bool false = 0;
> const bool true = 1;
>
> struct RenderBox
> {
>    bool m_positioned : 1;
> };
>
> typedef struct RenderBox RenderBox;
>
>
> void RenderBox_setStyle(RenderBox *thisin)
> {
>    RenderBox *this = thisin;
>    bool ltrue = true;
>    this->m_positioned = ltrue;
>
> }
>
>
>
> Before this change we generated this:
>
> > (insn 13 12 14 (set (reg:QI 47)
> >         (mem/c:QI (plus:SI (reg/f:SI 37 virtual-stack-vars)
> >                 (const_int -5 [0xfffffffffffffffb])) [1 ltrue+0 S1 A8])) "j.c":17:22 -1
> >      (nil))
> >
> > (insn 14 13 15 (parallel [
> >             (set (zero_extract:SI (subreg:SI (reg:QI 46) 0)
> >                     (const_int 1 [0x1])
> >                     (const_int 0 [0]))
> >                 (subreg:SI (reg:QI 47) 0))
> >             (clobber (scratch:SI))
> >         ]) "j.c":17:22 -1
> >      (nil))
>
>
> Afterwards we generate:
>
> > (insn 13 12 14 2 (parallel [
> >             (set (zero_extract:SI (subreg:SI (reg:QI 46) 0)
> >                     (const_int 1 [0x1])
> >                     (const_int 0 [0]))
> >                 (subreg:SI (mem/c:QI (plus:SI (reg/f:SI 37 virtual-stack-vars)
> >                             (const_int -5 [0xfffffffffffffffb])) [1 ltrue+0 S1 A8]) 0))
> >             (clobber (scratch:SI))
> >         ]) "j.c":17:22 -1
> >      (nil))
>
> Note the (subreg (mem (...)).  Probably not desirable in general, but
> also note the virtual-stack-vars in the memory address.  The code to
> instantiate virtual registers doesn't handle (subreg (mem)), so we never
> convert that to an FP based address and we eventually fault.

We should really get rid of the support of `(subreg (mem))` as a valid
for register_operand (recorg.cc).
Two ideas on how to fix this before removing `(subreg (mem))` support
from register_operand:
1) Maybe for now reject subreg of mem inside validate_subreg that have
virtual-stack-vars addresses.
2) Or we add the code to instantiate virtual registers to handle (subreg (mem)).

Maybe we should just bite the bullet and remove support of `(subreg
(mem))` from register_operand instead of hacking around this
preexisting mess.

Also see
https://inbox.sourceware.org/gcc-patches/485B2857.2070003@naturalbridge.com/

Which is from 2008 about this subreg of mem.

Thanks,
Andrew

>
> Should be visible with ft32-elf cross compiler.  No options needed.
>
> Jeff
>
>
Richard Sandiford June 25, 2024, 8:42 a.m. UTC | #3
Jeff Law <jeffreyalaw@gmail.com> writes:
> On 6/17/24 3:53 AM, Richard Sandiford wrote:
>> This patch makes target-independent code use force_subreg instead
>> of simplify_gen_subreg in some places.  The criteria were:
>> 
>> (1) The code is obviously specific to expand (where new pseudos
>>      can be created), or at least would be invalid to call when
>>      !can_create_pseudo_p () and temporaries are needed.
>> 
>> (2) The value is obviously an rvalue rather than an lvalue.
>> 
>> (3) The offset wasn't a simple lowpart or highpart calculation;
>>      a later patch will deal with those.
>> 
>> Doing this should reduce the likelihood of bugs like PR115464
>> occuring in other situations.
>> 
>> gcc/
>> 	* expmed.cc (store_bit_field_using_insv): Use force_subreg
>> 	instead of simplify_gen_subreg.
>> 	(store_bit_field_1): Likewise.
>> 	(extract_bit_field_as_subreg): Likewise.
>> 	(extract_integral_bit_field): Likewise.
>> 	(emit_store_flag_1): Likewise.
>> 	* expr.cc (convert_move): Likewise.
>> 	(convert_modes): Likewise.
>> 	(emit_group_load_1): Likewise.
>> 	(emit_group_store): Likewise.
>> 	(expand_assignment): Likewise.
> [ ... ]
>
> So this has triggered a failure on ft32-elf with this testcase 
> (simplified from the testsuite):
>
> typedef _Bool bool;
> const bool false = 0;
> const bool true = 1;
>
> struct RenderBox
> {
>    bool m_positioned : 1;
> };
>
> typedef struct RenderBox RenderBox;
>
>
> void RenderBox_setStyle(RenderBox *thisin)
> {
>    RenderBox *this = thisin;
>    bool ltrue = true;
>    this->m_positioned = ltrue;
>
> }
>
>
>
> Before this change we generated this:
>
>> (insn 13 12 14 (set (reg:QI 47)
>>         (mem/c:QI (plus:SI (reg/f:SI 37 virtual-stack-vars)
>>                 (const_int -5 [0xfffffffffffffffb])) [1 ltrue+0 S1 A8])) "j.c":17:22 -1
>>      (nil))
>> 
>> (insn 14 13 15 (parallel [
>>             (set (zero_extract:SI (subreg:SI (reg:QI 46) 0)
>>                     (const_int 1 [0x1])
>>                     (const_int 0 [0]))
>>                 (subreg:SI (reg:QI 47) 0))
>>             (clobber (scratch:SI))
>>         ]) "j.c":17:22 -1
>>      (nil))
>
>
> Afterwards we generate:
>
>> (insn 13 12 14 2 (parallel [
>>             (set (zero_extract:SI (subreg:SI (reg:QI 46) 0)
>>                     (const_int 1 [0x1])
>>                     (const_int 0 [0]))
>>                 (subreg:SI (mem/c:QI (plus:SI (reg/f:SI 37 virtual-stack-vars)
>>                             (const_int -5 [0xfffffffffffffffb])) [1 ltrue+0 S1 A8]) 0))
>>             (clobber (scratch:SI))
>>         ]) "j.c":17:22 -1
>>      (nil))
>
> Note the (subreg (mem (...)).  Probably not desirable in general, but 
> also note the virtual-stack-vars in the memory address.  The code to 
> instantiate virtual registers doesn't handle (subreg (mem)), so we never 
> convert that to an FP based address and we eventually fault.
>
> Should be visible with ft32-elf cross compiler.  No options needed.

Bah.  Thanks for the report.

I agree of course with the follow-on discussion that we should get
rid of (subreg (mem)).  But this was supposed to be a conservative
patch.  I've therefore reverted the offending part of the commit,
as below.  (Tested on aarch64-linux-gnu.)

Richard


One of the changes in g:d4047da6a070175aae7121c739d1cad6b08ff4b2
caused a regression in ft32-elf; see:

    https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655418.html

for details.  This change was different from the others in that the
original call was to simplify_subreg rather than simplify_lowpart_subreg.
The old code would therefore go on to do the force_reg for more cases
than the new code would.

gcc/
	* expmed.cc (store_bit_field_using_insv): Revert earlier change
	to use force_subreg instead of simplify_gen_subreg.
---
 gcc/expmed.cc | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index 3b9475f5aa0..8bbbc94a98c 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -695,7 +695,13 @@ store_bit_field_using_insv (const extraction_insn *insv, rtx op0,
 	     if we must narrow it, be sure we do it correctly.  */
 
 	  if (GET_MODE_SIZE (value_mode) < GET_MODE_SIZE (op_mode))
-	    tmp = force_subreg (op_mode, value1, value_mode, 0);
+	    {
+	      tmp = simplify_subreg (op_mode, value1, value_mode, 0);
+	      if (! tmp)
+		tmp = simplify_gen_subreg (op_mode,
+					   force_reg (value_mode, value1),
+					   value_mode, 0);
+	    }
 	  else
 	    {
 	      if (targetm.mode_rep_extended (op_mode, value_mode) != UNKNOWN)
diff mbox series

Patch

diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index 9ba01695f53..1f68e7be721 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -695,13 +695,7 @@  store_bit_field_using_insv (const extraction_insn *insv, rtx op0,
 	     if we must narrow it, be sure we do it correctly.  */
 
 	  if (GET_MODE_SIZE (value_mode) < GET_MODE_SIZE (op_mode))
-	    {
-	      tmp = simplify_subreg (op_mode, value1, value_mode, 0);
-	      if (! tmp)
-		tmp = simplify_gen_subreg (op_mode,
-					   force_reg (value_mode, value1),
-					   value_mode, 0);
-	    }
+	    tmp = force_subreg (op_mode, value1, value_mode, 0);
 	  else
 	    {
 	      if (targetm.mode_rep_extended (op_mode, value_mode) != UNKNOWN)
@@ -806,7 +800,7 @@  store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
       if (known_eq (bitnum, 0U)
 	  && known_eq (bitsize, GET_MODE_BITSIZE (GET_MODE (op0))))
 	{
-	  sub = simplify_gen_subreg (GET_MODE (op0), value, fieldmode, 0);
+	  sub = force_subreg (GET_MODE (op0), value, fieldmode, 0);
 	  if (sub)
 	    {
 	      if (reverse)
@@ -1633,7 +1627,7 @@  extract_bit_field_as_subreg (machine_mode mode, rtx op0,
       && known_eq (bitsize, GET_MODE_BITSIZE (mode))
       && lowpart_bit_field_p (bitnum, bitsize, op0_mode)
       && TRULY_NOOP_TRUNCATION_MODES_P (mode, op0_mode))
-    return simplify_gen_subreg (mode, op0, op0_mode, bytenum);
+    return force_subreg (mode, op0, op0_mode, bytenum);
   return NULL_RTX;
 }
 
@@ -2000,11 +1994,11 @@  extract_integral_bit_field (rtx op0, opt_scalar_int_mode op0_mode,
 	  return convert_extracted_bit_field (target, mode, tmode, unsignedp);
 	}
       /* If OP0 is a hard register, copy it to a pseudo before calling
-	 simplify_gen_subreg.  */
+	 force_subreg.  */
       if (REG_P (op0) && HARD_REGISTER_P (op0))
 	op0 = copy_to_reg (op0);
-      op0 = simplify_gen_subreg (word_mode, op0, op0_mode.require (),
-				 bitnum / BITS_PER_WORD * UNITS_PER_WORD);
+      op0 = force_subreg (word_mode, op0, op0_mode.require (),
+			  bitnum / BITS_PER_WORD * UNITS_PER_WORD);
       op0_mode = word_mode;
       bitnum %= BITS_PER_WORD;
     }
@@ -5774,8 +5768,8 @@  emit_store_flag_1 (rtx target, enum rtx_code code, rtx op0, rtx op1,
 
 	  /* Do a logical OR or AND of the two words and compare the
 	     result.  */
-	  op00 = simplify_gen_subreg (word_mode, op0, int_mode, 0);
-	  op01 = simplify_gen_subreg (word_mode, op0, int_mode, UNITS_PER_WORD);
+	  op00 = force_subreg (word_mode, op0, int_mode, 0);
+	  op01 = force_subreg (word_mode, op0, int_mode, UNITS_PER_WORD);
 	  tem = expand_binop (word_mode,
 			      op1 == const0_rtx ? ior_optab : and_optab,
 			      op00, op01, NULL_RTX, unsignedp,
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 9cecc1758f5..31a7346e33f 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -301,7 +301,7 @@  convert_move (rtx to, rtx from, int unsignedp)
 			    GET_MODE_BITSIZE (to_mode)));
 
       if (VECTOR_MODE_P (to_mode))
-	from = simplify_gen_subreg (to_mode, from, GET_MODE (from), 0);
+	from = force_subreg (to_mode, from, GET_MODE (from), 0);
       else
 	to = simplify_gen_subreg (from_mode, to, GET_MODE (to), 0);
 
@@ -935,7 +935,7 @@  convert_modes (machine_mode mode, machine_mode oldmode, rtx x, int unsignedp)
     {
       gcc_assert (known_eq (GET_MODE_BITSIZE (mode),
 			    GET_MODE_BITSIZE (oldmode)));
-      return simplify_gen_subreg (mode, x, oldmode, 0);
+      return force_subreg (mode, x, oldmode, 0);
     }
 
   temp = gen_reg_rtx (mode);
@@ -3072,8 +3072,8 @@  emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src, tree type,
 	    }
 	}
       else if (CONSTANT_P (src) && GET_MODE (dst) != BLKmode
-               && XVECLEN (dst, 0) > 1)
-        tmps[i] = simplify_gen_subreg (mode, src, GET_MODE (dst), bytepos);
+	       && XVECLEN (dst, 0) > 1)
+	tmps[i] = force_subreg (mode, src, GET_MODE (dst), bytepos);
       else if (CONSTANT_P (src))
 	{
 	  if (known_eq (bytelen, ssize))
@@ -3297,7 +3297,7 @@  emit_group_store (rtx orig_dst, rtx src, tree type ATTRIBUTE_UNUSED,
 	  if (known_eq (rtx_to_poly_int64 (XEXP (XVECEXP (src, 0, start), 1)),
 			bytepos))
 	    {
-	      temp = simplify_gen_subreg (outer, tmps[start], inner, 0);
+	      temp = force_subreg (outer, tmps[start], inner, 0);
 	      if (temp)
 		{
 		  emit_move_insn (dst, temp);
@@ -3317,7 +3317,7 @@  emit_group_store (rtx orig_dst, rtx src, tree type ATTRIBUTE_UNUSED,
 							  finish - 1), 1)),
 			bytepos))
 	    {
-	      temp = simplify_gen_subreg (outer, tmps[finish - 1], inner, 0);
+	      temp = force_subreg (outer, tmps[finish - 1], inner, 0);
 	      if (temp)
 		{
 		  emit_move_insn (dst, temp);
@@ -6191,11 +6191,9 @@  expand_assignment (tree to, tree from, bool nontemporal)
 		  to_mode = GET_MODE_INNER (to_mode);
 		  machine_mode from_mode = GET_MODE_INNER (GET_MODE (result));
 		  rtx from_real
-		    = simplify_gen_subreg (to_mode, XEXP (result, 0),
-					   from_mode, 0);
+		    = force_subreg (to_mode, XEXP (result, 0), from_mode, 0);
 		  rtx from_imag
-		    = simplify_gen_subreg (to_mode, XEXP (result, 1),
-					   from_mode, 0);
+		    = force_subreg (to_mode, XEXP (result, 1), from_mode, 0);
 		  if (!from_real || !from_imag)
 		    goto concat_store_slow;
 		  emit_move_insn (XEXP (to_rtx, 0), from_real);
@@ -6211,8 +6209,7 @@  expand_assignment (tree to, tree from, bool nontemporal)
 		  if (MEM_P (result))
 		    from_rtx = change_address (result, to_mode, NULL_RTX);
 		  else
-		    from_rtx
-		      = simplify_gen_subreg (to_mode, result, from_mode, 0);
+		    from_rtx = force_subreg (to_mode, result, from_mode, 0);
 		  if (from_rtx)
 		    {
 		      emit_move_insn (XEXP (to_rtx, 0),
@@ -6224,10 +6221,10 @@  expand_assignment (tree to, tree from, bool nontemporal)
 		    {
 		      to_mode = GET_MODE_INNER (to_mode);
 		      rtx from_real
-			= simplify_gen_subreg (to_mode, result, from_mode, 0);
+			= force_subreg (to_mode, result, from_mode, 0);
 		      rtx from_imag
-			= simplify_gen_subreg (to_mode, result, from_mode,
-					       GET_MODE_SIZE (to_mode));
+			= force_subreg (to_mode, result, from_mode,
+					GET_MODE_SIZE (to_mode));
 		      if (!from_real || !from_imag)
 			goto concat_store_slow;
 		      emit_move_insn (XEXP (to_rtx, 0), from_real);