Message ID | mptr0jgloae.fsf@arm.com |
---|---|
State | New |
Headers | show |
Series | cse: Fix handling of fake vec_select sets [PR111702] | expand |
> Am 20.12.2023 um 19:30 schrieb Richard Sandiford <richard.sandiford@arm.com>: > > If cse sees: > > (set (reg R) (const_vector [A B ...])) > > it creates fake sets of the form: > > (set R[0] A) > (set R[1] B) > ... > > (with R[n] replaced by appropriate rtl) and then adds them to the tables > in the same way as for normal sets. This allows a sequence like: > > (set (reg R2) A) > ...(reg R2)... > > to try to use R[0] instead of (reg R2). > > But the pass was taking the analogy too far, and was trying to simplify > these fake sets based on costs. That is, if there was an earlier: > > (set (reg T) A) > > the pass would go to considerable effort trying to work out whether: > > (set R[0] A) > > or: > > (set R[0] (reg T)) > > was more profitable. This included running validate*_change on the sets, > which has no meaning given that the sets are not part of the insn. > > In this example, the equivalence A == T is already known, and the > purpose of the fake sets is to add A == T == R[0]. We can do that > just as easily (or, as the PR shows, more easily) if we keep the > original form of the fake set, with A instead of T. > > The problem in the PR occurred if we had: > > (1) something that establishes an equivalence between a vector V1 of > M-bit scalar integers and a hard register H > > (2) something that establishes an equivalence between a vector V2 of > N-bit scalar integers, where N<M and where V2 contains at least 2 > instances of V1[0] > > (1) established an equivalence between V1[0] and H in M bits. > (2) then triggered a search for an equivalence of V1[0] in N bits. > This included: > > /* See if we have a CONST_INT that is already in a register in a > wider mode. */ > > which (correctly) found that the low N bits of H contain the right value. > But because it came from a wider mode, this equivalence between N-bit H > and N-bit V1[0] was not yet in the hash table. It therefore survived > the purge in: > > /* At this point, ELT, if nonzero, points to a class of expressions > equivalent to the source of this SET and SRC, SRC_EQV, SRC_FOLDED, > and SRC_RELATED, if nonzero, each contain additional equivalent > expressions. Prune these latter expressions by deleting expressions > already in the equivalence class. > > And since more than 1 set found the same N-bit equivalence between > H and V1[0], the pass tried to add it more than once. > > Things were already wrong at this stage, but an ICE was only triggered > later when trying to merge this N-bit equivalence with another one. > > We could avoid the double registration by adding: > > for (elt = classp; elt; elt = elt->next_same_value) > if (rtx_equal_p (elt->exp, x)) > return elt; > > to insert_with_costs, or by making cse_insn check whether previous > sets have recorded the same equivalence. The latter seems more > appealing from a compile-time perspective. But in this case, > doing that would be adding yet more spurious work to the handling > of fake sets. > > The handling of fake sets therefore seems like the more fundamental bug. > > While there, the patch also makes sure that we don't apply REG_EQUAL > notes to these fake sets. They only describe the "real" (first) set. Agreed and OK > gcc/ > PR rtl-optimization/111702 > * cse.cc (set::mode): Move earlier. > (set::src_in_memory, set::src_volatile): Convert to bitfields. > (set::is_fake_set): New member variable. > (add_to_set): Add an is_fake_set parameter. > (find_sets_in_insn): Update calls accordingly. > (cse_insn): Do not apply REG_EQUAL notes to fake sets. Do not > try to optimize them either, or validate changes to them. > > gcc/ > PR rtl-optimization/111702 > * gcc.dg/rtl/aarch64/pr111702.c: New test. > --- > gcc/cse.cc | 38 +++++++++++------- > gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c | 43 +++++++++++++++++++++ > 2 files changed, 67 insertions(+), 14 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c > > diff --git a/gcc/cse.cc b/gcc/cse.cc > index f9603fdfd43..9fd51ca2832 100644 > --- a/gcc/cse.cc > +++ b/gcc/cse.cc > @@ -4128,13 +4128,17 @@ struct set > unsigned dest_hash; > /* The SET_DEST, with SUBREG, etc., stripped. */ > rtx inner_dest; > + /* Original machine mode, in case it becomes a CONST_INT. */ > + ENUM_BITFIELD(machine_mode) mode : MACHINE_MODE_BITSIZE; > /* Nonzero if the SET_SRC is in memory. */ > - char src_in_memory; > + unsigned int src_in_memory : 1; > /* Nonzero if the SET_SRC contains something > whose value cannot be predicted and understood. */ > - char src_volatile; > - /* Original machine mode, in case it becomes a CONST_INT. */ > - ENUM_BITFIELD(machine_mode) mode : MACHINE_MODE_BITSIZE; > + unsigned int src_volatile : 1; > + /* Nonzero if RTL is an artifical set that has been created to describe > + part of an insn's effect. Zero means that RTL appears directly in > + the insn pattern. */ > + unsigned int is_fake_set : 1; > /* Hash value of constant equivalent for SET_SRC. */ > unsigned src_const_hash; > /* A constant equivalent for SET_SRC, if any. */ > @@ -4229,12 +4233,15 @@ try_back_substitute_reg (rtx set, rtx_insn *insn) > } > } > > -/* Add an entry containing RTL X into SETS. */ > +/* Add an entry containing RTL X into SETS. IS_FAKE_SET is true if X is > + an artifical set that has been created to describe part of an insn's > + effect. */ > static inline void > -add_to_set (vec<struct set> *sets, rtx x) > +add_to_set (vec<struct set> *sets, rtx x, bool is_fake_set) > { > struct set entry = {}; > entry.rtl = x; > + entry.is_fake_set = is_fake_set; > sets->safe_push (entry); > } > > @@ -4271,7 +4278,7 @@ find_sets_in_insn (rtx_insn *insn, vec<struct set> *psets) > && known_eq (GET_MODE_NUNITS (GET_MODE (SET_SRC (x))), 1))) > { > /* First register the vector itself. */ > - add_to_set (psets, x); > + add_to_set (psets, x, false); > rtx src = SET_SRC (x); > /* Go over the constants of the CONST_VECTOR in forward order, to > put them in the same order in the SETS array. */ > @@ -4281,11 +4288,12 @@ find_sets_in_insn (rtx_insn *insn, vec<struct set> *psets) > used to tell CSE how to get to a particular constant. */ > rtx y = simplify_gen_vec_select (SET_DEST (x), i); > gcc_assert (y); > - add_to_set (psets, gen_rtx_SET (y, CONST_VECTOR_ELT (src, i))); > + rtx set = gen_rtx_SET (y, CONST_VECTOR_ELT (src, i)); > + add_to_set (psets, set, true); > } > } > else > - add_to_set (psets, x); > + add_to_set (psets, x, false); > } > else if (GET_CODE (x) == PARALLEL) > { > @@ -4306,7 +4314,7 @@ find_sets_in_insn (rtx_insn *insn, vec<struct set> *psets) > else if (GET_CODE (SET_SRC (y)) == CALL) > ; > else > - add_to_set (psets, y); > + add_to_set (psets, y, false); > } > } > } > @@ -4616,6 +4624,7 @@ cse_insn (rtx_insn *insn) > int src_related_regcost = MAX_COST; > int src_elt_regcost = MAX_COST; > scalar_int_mode int_mode; > + bool is_fake_set = sets[i].is_fake_set; > > dest = SET_DEST (sets[i].rtl); > src = SET_SRC (sets[i].rtl); > @@ -4627,7 +4636,7 @@ cse_insn (rtx_insn *insn) > mode = GET_MODE (src) == VOIDmode ? GET_MODE (dest) : GET_MODE (src); > sets[i].mode = mode; > > - if (src_eqv) > + if (!is_fake_set && src_eqv) > { > machine_mode eqvmode = mode; > if (GET_CODE (dest) == STRICT_LOW_PART) > @@ -4648,7 +4657,7 @@ cse_insn (rtx_insn *insn) > /* If this is a STRICT_LOW_PART assignment, src_eqv corresponds to the > value of the INNER register, not the destination. So it is not > a valid substitution for the source. But save it for later. */ > - if (GET_CODE (dest) == STRICT_LOW_PART) > + if (is_fake_set || GET_CODE (dest) == STRICT_LOW_PART) > src_eqv_here = 0; > else > src_eqv_here = src_eqv; > @@ -5158,7 +5167,7 @@ cse_insn (rtx_insn *insn) > > /* Terminate loop when replacement made. This must terminate since > the current contents will be tested and will always be valid. */ > - while (1) > + while (!is_fake_set) > { > rtx trial; > > @@ -5425,7 +5434,8 @@ cse_insn (rtx_insn *insn) > with the head of the class. If we do not do this, we will have > both registers live over a portion of the basic block. This way, > their lifetimes will likely abut instead of overlapping. */ > - if (REG_P (dest) > + if (!is_fake_set > + && REG_P (dest) > && REGNO_QTY_VALID_P (REGNO (dest))) > { > int dest_q = REG_QTY (REGNO (dest)); > diff --git a/gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c b/gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c > new file mode 100644 > index 00000000000..8af2c54de3c > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c > @@ -0,0 +1,43 @@ > +/* { dg-do compile { target aarch64*-*-* } } */ > +/* { dg-options "-O2" } */ > + > +extern int data[]; > + > +void __RTL (startwith ("vregs")) foo () > +{ > + (function "foo" > + (insn-chain > + (block 2 > + (edge-from entry (flags "FALLTHRU")) > + (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK) > + (insn 4 (set (reg:V16QI <0>) > + (const_vector:V16QI [(const_int 1) (const_int 1) > + (const_int 0) (const_int 0) > + (const_int 1) (const_int 1) > + (const_int 0) (const_int 0) > + (const_int 1) (const_int 1) > + (const_int 0) (const_int 0) > + (const_int 1) (const_int 1) > + (const_int 0) (const_int 0)]))) > + (insn 5 (set (reg:V2SI v0) > + (const_vector:V2SI [(const_int 1) (const_int 0)]))) > + (insn 6 (set (reg:V16QI v1) > + (const_vector:V16QI [(const_int 0) (const_int 0) > + (const_int 1) (const_int 1) > + (const_int 0) (const_int 0) > + (const_int 1) (const_int 1) > + (const_int 0) (const_int 0) > + (const_int 1) (const_int 1) > + (const_int 0) (const_int 0) > + (const_int 1) (const_int 1)]))) > + (insn 7 (set (reg:QI x0) (subreg:QI (reg:V16QI <0>) 0)) > + (expr_list:REG_EQUAL (const_int 1) (nil))) > + (insn 8 (use (reg:V16QI <0>))) > + (insn 9 (use (reg:V2SI v0))) > + (insn 10 (use (reg:V16QI v1))) > + (insn 11 (use (reg:QI x0))) > + (edge-to exit (flags "FALLTHRU")) > + ) ;; block 2 > + ) ;; insn-chain > + ) ;; function > +} > -- > 2.25.1 >
On Thu, 21 Dec 2023 at 00:00, Richard Sandiford <richard.sandiford@arm.com> wrote: > > If cse sees: > > (set (reg R) (const_vector [A B ...])) > > it creates fake sets of the form: > > (set R[0] A) > (set R[1] B) > ... > > (with R[n] replaced by appropriate rtl) and then adds them to the tables > in the same way as for normal sets. This allows a sequence like: > > (set (reg R2) A) > ...(reg R2)... > > to try to use R[0] instead of (reg R2). > > But the pass was taking the analogy too far, and was trying to simplify > these fake sets based on costs. That is, if there was an earlier: > > (set (reg T) A) > > the pass would go to considerable effort trying to work out whether: > > (set R[0] A) > > or: > > (set R[0] (reg T)) > > was more profitable. This included running validate*_change on the sets, > which has no meaning given that the sets are not part of the insn. > > In this example, the equivalence A == T is already known, and the > purpose of the fake sets is to add A == T == R[0]. We can do that > just as easily (or, as the PR shows, more easily) if we keep the > original form of the fake set, with A instead of T. > > The problem in the PR occurred if we had: > > (1) something that establishes an equivalence between a vector V1 of > M-bit scalar integers and a hard register H > > (2) something that establishes an equivalence between a vector V2 of > N-bit scalar integers, where N<M and where V2 contains at least 2 > instances of V1[0] > > (1) established an equivalence between V1[0] and H in M bits. > (2) then triggered a search for an equivalence of V1[0] in N bits. > This included: > > /* See if we have a CONST_INT that is already in a register in a > wider mode. */ > > which (correctly) found that the low N bits of H contain the right value. > But because it came from a wider mode, this equivalence between N-bit H > and N-bit V1[0] was not yet in the hash table. It therefore survived > the purge in: > > /* At this point, ELT, if nonzero, points to a class of expressions > equivalent to the source of this SET and SRC, SRC_EQV, SRC_FOLDED, > and SRC_RELATED, if nonzero, each contain additional equivalent > expressions. Prune these latter expressions by deleting expressions > already in the equivalence class. > > And since more than 1 set found the same N-bit equivalence between > H and V1[0], the pass tried to add it more than once. > > Things were already wrong at this stage, but an ICE was only triggered > later when trying to merge this N-bit equivalence with another one. > > We could avoid the double registration by adding: > > for (elt = classp; elt; elt = elt->next_same_value) > if (rtx_equal_p (elt->exp, x)) > return elt; > > to insert_with_costs, or by making cse_insn check whether previous > sets have recorded the same equivalence. The latter seems more > appealing from a compile-time perspective. But in this case, > doing that would be adding yet more spurious work to the handling > of fake sets. > > The handling of fake sets therefore seems like the more fundamental bug. > > While there, the patch also makes sure that we don't apply REG_EQUAL > notes to these fake sets. They only describe the "real" (first) set. Hi Richard, Thanks for the detailed explanation and fix! Thanks, Prathamesh > > gcc/ > PR rtl-optimization/111702 > * cse.cc (set::mode): Move earlier. > (set::src_in_memory, set::src_volatile): Convert to bitfields. > (set::is_fake_set): New member variable. > (add_to_set): Add an is_fake_set parameter. > (find_sets_in_insn): Update calls accordingly. > (cse_insn): Do not apply REG_EQUAL notes to fake sets. Do not > try to optimize them either, or validate changes to them. > > gcc/ > PR rtl-optimization/111702 > * gcc.dg/rtl/aarch64/pr111702.c: New test. > --- > gcc/cse.cc | 38 +++++++++++------- > gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c | 43 +++++++++++++++++++++ > 2 files changed, 67 insertions(+), 14 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c > > diff --git a/gcc/cse.cc b/gcc/cse.cc > index f9603fdfd43..9fd51ca2832 100644 > --- a/gcc/cse.cc > +++ b/gcc/cse.cc > @@ -4128,13 +4128,17 @@ struct set > unsigned dest_hash; > /* The SET_DEST, with SUBREG, etc., stripped. */ > rtx inner_dest; > + /* Original machine mode, in case it becomes a CONST_INT. */ > + ENUM_BITFIELD(machine_mode) mode : MACHINE_MODE_BITSIZE; > /* Nonzero if the SET_SRC is in memory. */ > - char src_in_memory; > + unsigned int src_in_memory : 1; > /* Nonzero if the SET_SRC contains something > whose value cannot be predicted and understood. */ > - char src_volatile; > - /* Original machine mode, in case it becomes a CONST_INT. */ > - ENUM_BITFIELD(machine_mode) mode : MACHINE_MODE_BITSIZE; > + unsigned int src_volatile : 1; > + /* Nonzero if RTL is an artifical set that has been created to describe > + part of an insn's effect. Zero means that RTL appears directly in > + the insn pattern. */ > + unsigned int is_fake_set : 1; > /* Hash value of constant equivalent for SET_SRC. */ > unsigned src_const_hash; > /* A constant equivalent for SET_SRC, if any. */ > @@ -4229,12 +4233,15 @@ try_back_substitute_reg (rtx set, rtx_insn *insn) > } > } > > -/* Add an entry containing RTL X into SETS. */ > +/* Add an entry containing RTL X into SETS. IS_FAKE_SET is true if X is > + an artifical set that has been created to describe part of an insn's > + effect. */ > static inline void > -add_to_set (vec<struct set> *sets, rtx x) > +add_to_set (vec<struct set> *sets, rtx x, bool is_fake_set) > { > struct set entry = {}; > entry.rtl = x; > + entry.is_fake_set = is_fake_set; > sets->safe_push (entry); > } > > @@ -4271,7 +4278,7 @@ find_sets_in_insn (rtx_insn *insn, vec<struct set> *psets) > && known_eq (GET_MODE_NUNITS (GET_MODE (SET_SRC (x))), 1))) > { > /* First register the vector itself. */ > - add_to_set (psets, x); > + add_to_set (psets, x, false); > rtx src = SET_SRC (x); > /* Go over the constants of the CONST_VECTOR in forward order, to > put them in the same order in the SETS array. */ > @@ -4281,11 +4288,12 @@ find_sets_in_insn (rtx_insn *insn, vec<struct set> *psets) > used to tell CSE how to get to a particular constant. */ > rtx y = simplify_gen_vec_select (SET_DEST (x), i); > gcc_assert (y); > - add_to_set (psets, gen_rtx_SET (y, CONST_VECTOR_ELT (src, i))); > + rtx set = gen_rtx_SET (y, CONST_VECTOR_ELT (src, i)); > + add_to_set (psets, set, true); > } > } > else > - add_to_set (psets, x); > + add_to_set (psets, x, false); > } > else if (GET_CODE (x) == PARALLEL) > { > @@ -4306,7 +4314,7 @@ find_sets_in_insn (rtx_insn *insn, vec<struct set> *psets) > else if (GET_CODE (SET_SRC (y)) == CALL) > ; > else > - add_to_set (psets, y); > + add_to_set (psets, y, false); > } > } > } > @@ -4616,6 +4624,7 @@ cse_insn (rtx_insn *insn) > int src_related_regcost = MAX_COST; > int src_elt_regcost = MAX_COST; > scalar_int_mode int_mode; > + bool is_fake_set = sets[i].is_fake_set; > > dest = SET_DEST (sets[i].rtl); > src = SET_SRC (sets[i].rtl); > @@ -4627,7 +4636,7 @@ cse_insn (rtx_insn *insn) > mode = GET_MODE (src) == VOIDmode ? GET_MODE (dest) : GET_MODE (src); > sets[i].mode = mode; > > - if (src_eqv) > + if (!is_fake_set && src_eqv) > { > machine_mode eqvmode = mode; > if (GET_CODE (dest) == STRICT_LOW_PART) > @@ -4648,7 +4657,7 @@ cse_insn (rtx_insn *insn) > /* If this is a STRICT_LOW_PART assignment, src_eqv corresponds to the > value of the INNER register, not the destination. So it is not > a valid substitution for the source. But save it for later. */ > - if (GET_CODE (dest) == STRICT_LOW_PART) > + if (is_fake_set || GET_CODE (dest) == STRICT_LOW_PART) > src_eqv_here = 0; > else > src_eqv_here = src_eqv; > @@ -5158,7 +5167,7 @@ cse_insn (rtx_insn *insn) > > /* Terminate loop when replacement made. This must terminate since > the current contents will be tested and will always be valid. */ > - while (1) > + while (!is_fake_set) > { > rtx trial; > > @@ -5425,7 +5434,8 @@ cse_insn (rtx_insn *insn) > with the head of the class. If we do not do this, we will have > both registers live over a portion of the basic block. This way, > their lifetimes will likely abut instead of overlapping. */ > - if (REG_P (dest) > + if (!is_fake_set > + && REG_P (dest) > && REGNO_QTY_VALID_P (REGNO (dest))) > { > int dest_q = REG_QTY (REGNO (dest)); > diff --git a/gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c b/gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c > new file mode 100644 > index 00000000000..8af2c54de3c > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c > @@ -0,0 +1,43 @@ > +/* { dg-do compile { target aarch64*-*-* } } */ > +/* { dg-options "-O2" } */ > + > +extern int data[]; > + > +void __RTL (startwith ("vregs")) foo () > +{ > + (function "foo" > + (insn-chain > + (block 2 > + (edge-from entry (flags "FALLTHRU")) > + (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK) > + (insn 4 (set (reg:V16QI <0>) > + (const_vector:V16QI [(const_int 1) (const_int 1) > + (const_int 0) (const_int 0) > + (const_int 1) (const_int 1) > + (const_int 0) (const_int 0) > + (const_int 1) (const_int 1) > + (const_int 0) (const_int 0) > + (const_int 1) (const_int 1) > + (const_int 0) (const_int 0)]))) > + (insn 5 (set (reg:V2SI v0) > + (const_vector:V2SI [(const_int 1) (const_int 0)]))) > + (insn 6 (set (reg:V16QI v1) > + (const_vector:V16QI [(const_int 0) (const_int 0) > + (const_int 1) (const_int 1) > + (const_int 0) (const_int 0) > + (const_int 1) (const_int 1) > + (const_int 0) (const_int 0) > + (const_int 1) (const_int 1) > + (const_int 0) (const_int 0) > + (const_int 1) (const_int 1)]))) > + (insn 7 (set (reg:QI x0) (subreg:QI (reg:V16QI <0>) 0)) > + (expr_list:REG_EQUAL (const_int 1) (nil))) > + (insn 8 (use (reg:V16QI <0>))) > + (insn 9 (use (reg:V2SI v0))) > + (insn 10 (use (reg:V16QI v1))) > + (insn 11 (use (reg:QI x0))) > + (edge-to exit (flags "FALLTHRU")) > + ) ;; block 2 > + ) ;; insn-chain > + ) ;; function > +} > -- > 2.25.1 >
diff --git a/gcc/cse.cc b/gcc/cse.cc index f9603fdfd43..9fd51ca2832 100644 --- a/gcc/cse.cc +++ b/gcc/cse.cc @@ -4128,13 +4128,17 @@ struct set unsigned dest_hash; /* The SET_DEST, with SUBREG, etc., stripped. */ rtx inner_dest; + /* Original machine mode, in case it becomes a CONST_INT. */ + ENUM_BITFIELD(machine_mode) mode : MACHINE_MODE_BITSIZE; /* Nonzero if the SET_SRC is in memory. */ - char src_in_memory; + unsigned int src_in_memory : 1; /* Nonzero if the SET_SRC contains something whose value cannot be predicted and understood. */ - char src_volatile; - /* Original machine mode, in case it becomes a CONST_INT. */ - ENUM_BITFIELD(machine_mode) mode : MACHINE_MODE_BITSIZE; + unsigned int src_volatile : 1; + /* Nonzero if RTL is an artifical set that has been created to describe + part of an insn's effect. Zero means that RTL appears directly in + the insn pattern. */ + unsigned int is_fake_set : 1; /* Hash value of constant equivalent for SET_SRC. */ unsigned src_const_hash; /* A constant equivalent for SET_SRC, if any. */ @@ -4229,12 +4233,15 @@ try_back_substitute_reg (rtx set, rtx_insn *insn) } } -/* Add an entry containing RTL X into SETS. */ +/* Add an entry containing RTL X into SETS. IS_FAKE_SET is true if X is + an artifical set that has been created to describe part of an insn's + effect. */ static inline void -add_to_set (vec<struct set> *sets, rtx x) +add_to_set (vec<struct set> *sets, rtx x, bool is_fake_set) { struct set entry = {}; entry.rtl = x; + entry.is_fake_set = is_fake_set; sets->safe_push (entry); } @@ -4271,7 +4278,7 @@ find_sets_in_insn (rtx_insn *insn, vec<struct set> *psets) && known_eq (GET_MODE_NUNITS (GET_MODE (SET_SRC (x))), 1))) { /* First register the vector itself. */ - add_to_set (psets, x); + add_to_set (psets, x, false); rtx src = SET_SRC (x); /* Go over the constants of the CONST_VECTOR in forward order, to put them in the same order in the SETS array. */ @@ -4281,11 +4288,12 @@ find_sets_in_insn (rtx_insn *insn, vec<struct set> *psets) used to tell CSE how to get to a particular constant. */ rtx y = simplify_gen_vec_select (SET_DEST (x), i); gcc_assert (y); - add_to_set (psets, gen_rtx_SET (y, CONST_VECTOR_ELT (src, i))); + rtx set = gen_rtx_SET (y, CONST_VECTOR_ELT (src, i)); + add_to_set (psets, set, true); } } else - add_to_set (psets, x); + add_to_set (psets, x, false); } else if (GET_CODE (x) == PARALLEL) { @@ -4306,7 +4314,7 @@ find_sets_in_insn (rtx_insn *insn, vec<struct set> *psets) else if (GET_CODE (SET_SRC (y)) == CALL) ; else - add_to_set (psets, y); + add_to_set (psets, y, false); } } } @@ -4616,6 +4624,7 @@ cse_insn (rtx_insn *insn) int src_related_regcost = MAX_COST; int src_elt_regcost = MAX_COST; scalar_int_mode int_mode; + bool is_fake_set = sets[i].is_fake_set; dest = SET_DEST (sets[i].rtl); src = SET_SRC (sets[i].rtl); @@ -4627,7 +4636,7 @@ cse_insn (rtx_insn *insn) mode = GET_MODE (src) == VOIDmode ? GET_MODE (dest) : GET_MODE (src); sets[i].mode = mode; - if (src_eqv) + if (!is_fake_set && src_eqv) { machine_mode eqvmode = mode; if (GET_CODE (dest) == STRICT_LOW_PART) @@ -4648,7 +4657,7 @@ cse_insn (rtx_insn *insn) /* If this is a STRICT_LOW_PART assignment, src_eqv corresponds to the value of the INNER register, not the destination. So it is not a valid substitution for the source. But save it for later. */ - if (GET_CODE (dest) == STRICT_LOW_PART) + if (is_fake_set || GET_CODE (dest) == STRICT_LOW_PART) src_eqv_here = 0; else src_eqv_here = src_eqv; @@ -5158,7 +5167,7 @@ cse_insn (rtx_insn *insn) /* Terminate loop when replacement made. This must terminate since the current contents will be tested and will always be valid. */ - while (1) + while (!is_fake_set) { rtx trial; @@ -5425,7 +5434,8 @@ cse_insn (rtx_insn *insn) with the head of the class. If we do not do this, we will have both registers live over a portion of the basic block. This way, their lifetimes will likely abut instead of overlapping. */ - if (REG_P (dest) + if (!is_fake_set + && REG_P (dest) && REGNO_QTY_VALID_P (REGNO (dest))) { int dest_q = REG_QTY (REGNO (dest)); diff --git a/gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c b/gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c new file mode 100644 index 00000000000..8af2c54de3c --- /dev/null +++ b/gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c @@ -0,0 +1,43 @@ +/* { dg-do compile { target aarch64*-*-* } } */ +/* { dg-options "-O2" } */ + +extern int data[]; + +void __RTL (startwith ("vregs")) foo () +{ + (function "foo" + (insn-chain + (block 2 + (edge-from entry (flags "FALLTHRU")) + (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK) + (insn 4 (set (reg:V16QI <0>) + (const_vector:V16QI [(const_int 1) (const_int 1) + (const_int 0) (const_int 0) + (const_int 1) (const_int 1) + (const_int 0) (const_int 0) + (const_int 1) (const_int 1) + (const_int 0) (const_int 0) + (const_int 1) (const_int 1) + (const_int 0) (const_int 0)]))) + (insn 5 (set (reg:V2SI v0) + (const_vector:V2SI [(const_int 1) (const_int 0)]))) + (insn 6 (set (reg:V16QI v1) + (const_vector:V16QI [(const_int 0) (const_int 0) + (const_int 1) (const_int 1) + (const_int 0) (const_int 0) + (const_int 1) (const_int 1) + (const_int 0) (const_int 0) + (const_int 1) (const_int 1) + (const_int 0) (const_int 0) + (const_int 1) (const_int 1)]))) + (insn 7 (set (reg:QI x0) (subreg:QI (reg:V16QI <0>) 0)) + (expr_list:REG_EQUAL (const_int 1) (nil))) + (insn 8 (use (reg:V16QI <0>))) + (insn 9 (use (reg:V2SI v0))) + (insn 10 (use (reg:V16QI v1))) + (insn 11 (use (reg:QI x0))) + (edge-to exit (flags "FALLTHRU")) + ) ;; block 2 + ) ;; insn-chain + ) ;; function +}