Message ID | 89b612f6-f30a-6abc-1ee6-3b3810117609@redhat.com |
---|---|
State | New |
Headers | show |
Series | patch to fix PR87305 | expand |
Hi Vlad, Vladimir Makarov <vmakarov@redhat.com> writes: > The following patch fixes > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87305 > > The patch was bootstrapped and tested on x86-64 and ppc64 (be). > > Committed as rev. 267823. > > Index: ChangeLog > =================================================================== > --- ChangeLog (revision 267822) > +++ ChangeLog (working copy) > @@ -1,3 +1,12 @@ > +2019-01-10 Vladimir Makarov <vmakarov@redhat.com> > + > + PR rtl-optimization/87305 > + * lra-assigns.c > + (setup_live_pseudos_and_spill_after_risky_transforms): Check > + allocation for big endian pseudos used as paradoxical subregs and > + spill them if it is wrong. > + * lra-constraints.c (lra_constraints): Add a comment. > + > 2019-01-10 Richard Biener <rguenther@suse.de> > > PR tree-optimization/88792 > Index: lra-assigns.c > =================================================================== > --- lra-assigns.c (revision 267822) > +++ lra-assigns.c (working copy) > @@ -1146,12 +1146,12 @@ static void > setup_live_pseudos_and_spill_after_risky_transforms (bitmap > spilled_pseudo_bitmap) > { > - int p, i, j, n, regno, hard_regno; > + int p, i, j, n, regno, hard_regno, biggest_nregs, nregs_diff; > unsigned int k, conflict_regno; > poly_int64 offset; > int val; > HARD_REG_SET conflict_set; > - machine_mode mode; > + machine_mode mode, biggest_mode; > lra_live_range_t r; > bitmap_iterator bi; > int max_regno = max_reg_num (); > @@ -1166,8 +1166,26 @@ setup_live_pseudos_and_spill_after_risky > for (n = 0, i = FIRST_PSEUDO_REGISTER; i < max_regno; i++) > if ((pic_offset_table_rtx == NULL_RTX > || i != (int) REGNO (pic_offset_table_rtx)) > - && reg_renumber[i] >= 0 && lra_reg_info[i].nrefs > 0) > - sorted_pseudos[n++] = i; > + && (hard_regno = reg_renumber[i]) >= 0 && lra_reg_info[i].nrefs > 0) > + { > + biggest_mode = lra_reg_info[i].biggest_mode; > + biggest_nregs = hard_regno_nregs (hard_regno, biggest_mode); > + nregs_diff = (biggest_nregs > + - hard_regno_nregs (hard_regno, PSEUDO_REGNO_MODE (i))); > + enum reg_class rclass = lra_get_allocno_class (i); > + > + if (WORDS_BIG_ENDIAN > + && (hard_regno - nregs_diff < 0 > + || !TEST_HARD_REG_BIT (reg_class_contents[rclass], > + hard_regno - nregs_diff))) > + { > + /* Hard registers of paradoxical sub-registers are out of > + range of pseudo register class. Spill the pseudo. */ > + reg_renumber[i] = -1; > + continue; > + } I think for !WORDS_BIG_ENDIAN the equivalent problem to: || !TEST_HARD_REG_BIT (reg_class_contents[rclass], hard_regno - nregs_diff))) would be: || !TEST_HARD_REG_BIT (reg_class_contents[rclass], hard_regno + biggest_nregs - 1))) Where does that little-endian check happen? I wasn't sure why "biggest_mode goes outside the class" problems only occured here for big-endian. Thanks, Richard
On 01/11/2019 04:58 AM, Richard Sandiford wrote: > Hi Vlad, > > I think for !WORDS_BIG_ENDIAN the equivalent problem to: > > || !TEST_HARD_REG_BIT (reg_class_contents[rclass], > hard_regno - nregs_diff))) > > would be: > > || !TEST_HARD_REG_BIT (reg_class_contents[rclass], > hard_regno + biggest_nregs - 1))) > > Where does that little-endian check happen? I wasn't sure why > "biggest_mode goes outside the class" problems only occured here > for big-endian. > > Yes, you are right. Although probability of the same problem for little endian is much smaller, it is still present. I am working on it. It would be nice to avoid wrong assignments in IRA too but it is too much work. LRA will fix wrong assignments (by spilling and probably reassignment afterwards) but sometimes with worse result.
Index: ChangeLog =================================================================== --- ChangeLog (revision 267822) +++ ChangeLog (working copy) @@ -1,3 +1,12 @@ +2019-01-10 Vladimir Makarov <vmakarov@redhat.com> + + PR rtl-optimization/87305 + * lra-assigns.c + (setup_live_pseudos_and_spill_after_risky_transforms): Check + allocation for big endian pseudos used as paradoxical subregs and + spill them if it is wrong. + * lra-constraints.c (lra_constraints): Add a comment. + 2019-01-10 Richard Biener <rguenther@suse.de> PR tree-optimization/88792 Index: lra-assigns.c =================================================================== --- lra-assigns.c (revision 267822) +++ lra-assigns.c (working copy) @@ -1146,12 +1146,12 @@ static void setup_live_pseudos_and_spill_after_risky_transforms (bitmap spilled_pseudo_bitmap) { - int p, i, j, n, regno, hard_regno; + int p, i, j, n, regno, hard_regno, biggest_nregs, nregs_diff; unsigned int k, conflict_regno; poly_int64 offset; int val; HARD_REG_SET conflict_set; - machine_mode mode; + machine_mode mode, biggest_mode; lra_live_range_t r; bitmap_iterator bi; int max_regno = max_reg_num (); @@ -1166,8 +1166,26 @@ setup_live_pseudos_and_spill_after_risky for (n = 0, i = FIRST_PSEUDO_REGISTER; i < max_regno; i++) if ((pic_offset_table_rtx == NULL_RTX || i != (int) REGNO (pic_offset_table_rtx)) - && reg_renumber[i] >= 0 && lra_reg_info[i].nrefs > 0) - sorted_pseudos[n++] = i; + && (hard_regno = reg_renumber[i]) >= 0 && lra_reg_info[i].nrefs > 0) + { + biggest_mode = lra_reg_info[i].biggest_mode; + biggest_nregs = hard_regno_nregs (hard_regno, biggest_mode); + nregs_diff = (biggest_nregs + - hard_regno_nregs (hard_regno, PSEUDO_REGNO_MODE (i))); + enum reg_class rclass = lra_get_allocno_class (i); + + if (WORDS_BIG_ENDIAN + && (hard_regno - nregs_diff < 0 + || !TEST_HARD_REG_BIT (reg_class_contents[rclass], + hard_regno - nregs_diff))) + { + /* Hard registers of paradoxical sub-registers are out of + range of pseudo register class. Spill the pseudo. */ + reg_renumber[i] = -1; + continue; + } + sorted_pseudos[n++] = i; + } qsort (sorted_pseudos, n, sizeof (int), pseudo_compare_func); if (pic_offset_table_rtx != NULL_RTX && (regno = REGNO (pic_offset_table_rtx)) >= FIRST_PSEUDO_REGISTER @@ -1206,10 +1224,11 @@ setup_live_pseudos_and_spill_after_risky || hard_regno != reg_renumber[conflict_regno]) { int conflict_hard_regno = reg_renumber[conflict_regno]; - machine_mode biggest_mode = lra_reg_info[conflict_regno].biggest_mode; - int biggest_nregs = hard_regno_nregs (conflict_hard_regno, - biggest_mode); - int nregs_diff + + biggest_mode = lra_reg_info[conflict_regno].biggest_mode; + biggest_nregs = hard_regno_nregs (conflict_hard_regno, + biggest_mode); + nregs_diff = (biggest_nregs - hard_regno_nregs (conflict_hard_regno, PSEUDO_REGNO_MODE (conflict_regno))); Index: lra-constraints.c =================================================================== --- lra-constraints.c (revision 267822) +++ lra-constraints.c (working copy) @@ -4739,7 +4739,9 @@ lra_constraints (bool first_p) else /* On the first iteration we should check IRA assignment correctness. In rare cases, the assignments can be wrong as - early clobbers operands are ignored in IRA. */ + early clobbers operands are ignored in IRA or usages of + paradoxical sub-registers are not taken into account by + IRA. */ lra_risky_transformations_p = first_p; new_insn_uid_start = get_max_uid (); new_regno_start = first_p ? lra_constraint_new_regno_start : max_reg_num (); Index: testsuite/ChangeLog =================================================================== --- testsuite/ChangeLog (revision 267822) +++ testsuite/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2019-01-10 Vladimir Makarov <vmakarov@redhat.com> + + PR rtl-optimization/87305 + * gcc.target/aarch64/pr87305.c: New. + 2019-01-10 Richard Biener <rguenther@suse.de> PR tree-optimization/88792 Index: testsuite/gcc.target/aarch64/pr87305.c =================================================================== --- testsuite/gcc.target/aarch64/pr87305.c (nonexistent) +++ testsuite/gcc.target/aarch64/pr87305.c (working copy) @@ -0,0 +1,38 @@ +/* { dg-do compile } */ +/* { dg-options "-Ofast -mbig-endian -w" } */ + +int cc; + +void +rc (__int128 *oi) +{ + __int128 qz = (__int128)2 << cc; + + if (qz != 0) + { + if (cc != 0) + { + __int128 zp = 1; + + for (;;) + { + unsigned __int128 *ar = &cc; + int y5; + + if (oi != 0) + { + y3: + zp = *oi + *ar; + } + + y5 = (cc + 1) == ((*ar /= *oi) << ((zp >>= 128) / cc)); + qz += !!y5 ? 1 : qz == (*ar ^ zp + 1); + ++*oi; + } + } + else + ++qz; + } + + goto y3; +}