diff mbox series

[committed] Fix previously latent bug in reorg affecting cris port

Message ID 60fc2780-24f1-4567-a91f-7b227f299621@gmail.com
State New
Headers show
Series [committed] Fix previously latent bug in reorg affecting cris port | expand

Commit Message

Jeff Law July 3, 2024, 6:46 p.m. UTC
The late-combine patch has triggered a previously latent bug in reorg.

Basically we have a sequence like this in the middle of reorg before we 
start relaxing delay slots (cris-elf, gcc.dg/torture/pr98289.c)

> (insn 67 49 18 (sequence [
>             (jump_insn 50 49 52 (set (pc)
>                     (if_then_else (ne (reg:CC 19 ccr)
>                             (const_int 0 [0]))
>                         (label_ref:SI 30)
>                         (pc))) "j.c":10:6 discrim 1 282 {*bnecc}
>                  (expr_list:REG_DEAD (reg:CC 19 ccr)
>                     (int_list:REG_BR_PROB 7 (nil)))
>              -> 30)
>             (insn/f 52 50 18 (set (mem:SI (reg/f:SI 14 sp) [1  S4 A8])
>                     (reg:SI 16 srp)) 37 {*mov_tomemsi}
>                  (nil))
>         ]) "j.c":10:6 discrim 1 -1
>      (nil))
> 
> (note 18 67 54 [bb 3] NOTE_INSN_BASIC_BLOCK)
> 
> (note 54 18 55 NOTE_INSN_EPILOGUE_BEG)
> 
> (jump_insn 55 54 56 (return) "j.c":14:1 228 {*return_expanded}
>      (nil)
>  -> return)
> 
> (barrier 56 55 43)
> 
> (note 43 56 65 [bb 4] NOTE_INSN_BASIC_BLOCK)
> 
> (note 65 43 30 NOTE_INSN_SWITCH_TEXT_SECTIONS)
> 
> (code_label 30 65 8 5 6 (nil) [1 uses])
> 
> (note 8 30 61 [bb 5] NOTE_INSN_BASIC_BLOCK)

So at a high level the things to note are that insn 50 conditionally 
jumps around insn 55.  Second there's a SWITCH_TEXT_SECTIONS note 
between insn 50 and the target label for insn 50 (code_label 30).

reorg sees the conditional jump around the unconditional jump/return and 
will invert the jump and retarget the original jump to an appropriate 
location.  In this case generating:

> (insn 67 49 18 (sequence [
>             (jump_insn 50 49 52 (set (pc)
>                     (if_then_else (eq (reg:CC 19 ccr)
>                             (const_int 0 [0]))
>                         (label_ref:SI 68)
>                         (pc))) "j.c":10:6 discrim 1 281 {*beqcc}
>                  (expr_list:REG_DEAD (reg:CC 19 ccr)
>                     (int_list:REG_BR_PROB 1073741831 (nil)))
>              -> 68)
>             (insn/s/f 52 50 18 (set (mem:SI (reg/f:SI 14 sp) [1  S4 A8])
>                     (reg:SI 16 srp)) 37 {*mov_tomemsi}
>                  (nil))
>         ]) "j.c":10:6 discrim 1 -1
>      (nil))
> 
> (note 18 67 54 [bb 3] NOTE_INSN_BASIC_BLOCK)
> 
> (note 54 18 43 NOTE_INSN_EPILOGUE_BEG)
> 
> (note 43 54 65 [bb 4] NOTE_INSN_BASIC_BLOCK)
> 
> (note 65 43 8 NOTE_INSN_SWITCH_TEXT_SECTIONS)
> 
> (note 8 65 61 [bb 5] NOTE_INSN_BASIC_BLOCK)
[ ... ]
Where the new target of the jump is a return statement later in the IL.


Note that we now have a SWITCH_TEXT_SECTIONS note that is not 
immediately preceded by a BARRIER.  That triggers an assertion in the 
dwarf2 code.  Removal of the BARRIER is inherent in this optimization.

The fix is simple, we avoid this optimization when there's a 
SWITCH_TEXT_SECTIONS note between the conditional jump insn and its 
target.  Thankfully we already have a routine to test for this in reorg, 
so we just need to call it appropriately.  The other approach would be 
to drop the note which I considered and discarded.

We don't have great coverage for delay slot targets.  I've tested arc, 
cris, fr30, frv, h8, iq2000, microblaze, or1k, sh3  visium in my tester 
as crosses without new regressions, fixing one regression along the way. 
   Bootstrap & regression testing on sh4 and hppa will take considerably 
longer.

Pushing to the trunk momentarily.

Jeff
gcc/

	* reorg.cc (relax_delay_slots): Do not optimize a conditional
	jump around an unconditional jump/return in the presence of
	a text section switch.

Comments

Hans-Peter Nilsson July 12, 2024, 12:11 a.m. UTC | #1
> Date: Wed, 3 Jul 2024 12:46:46 -0600
> From: Jeff Law <jeffreyalaw@gmail.com>

> The late-combine patch has triggered a previously latent bug in reorg.
> 
> Basically we have a sequence like this in the middle of reorg before we 
> start relaxing delay slots (cris-elf, gcc.dg/torture/pr98289.c)

[...]

> Pushing to the trunk momentarily.

Thanks for fixing that!  One less for me to deal with!
I see it in my autotester log, but just as one of many
temporary regressions, which happen all the time.

FWIW, I have a patch-set for the (other) late-combine
regressions in gcc.target/cris, most prominently PR115883.
I just have to check one more thing...

brgds, H-P
PS. That "[committed]" part ends up as part of the committed subject.
Hans-Peter Nilsson July 15, 2024, 2:27 a.m. UTC | #2
> From: Hans-Peter Nilsson <hp@axis.com>
> Date: Fri, 12 Jul 2024 02:11:45 +0200
> 
> > Date: Wed, 3 Jul 2024 12:46:46 -0600
> > From: Jeff Law <jeffreyalaw@gmail.com>
> 
> > The late-combine patch has triggered a previously latent bug in reorg.
> > 
> > Basically we have a sequence like this in the middle of reorg before we 
> > start relaxing delay slots (cris-elf, gcc.dg/torture/pr98289.c)
> 
> [...]
> 
> > Pushing to the trunk momentarily.

JFTR, for cris-elf, this can't be blamed on (to have been
exposed by) late-combine, because this appeared with
r15-1619-g3b9b8d6cfdf593 "ira: Scale save/restore costs of
callee save registers with block frequency", even with
-fno-late-combine-instructions.

I noticed because I chased another regression, an XPASS,
happening for gcc.dg/tree-ssa/loop-1.c which was also caused
by that revision.

Regarding that commit, checking the generated code for
loop-1.c, that XPASS was reflecting a *regression*, not an
improvement.  To wit, it looks like _foo is no longer stored
in a register for cris-elf, but there's no change in the
number of saved registers.  As coremark results are the same
before/after that commit for cris-elf, I'm not going to make
a fuss; IOW, not open a PR for the regression.  (Phew, one
less rabbit-hole.  I see that patch exposed as many problems
as late-combine!)

Still, a heads-up to the author of that patch.  Maybe the
frequencies are miscalculated for that test-case.  I tried
to look at regs.h:REG_FREQ_FROM_BB, but it's a mystery to
me: its value seems to vary between 1 and 1000, that doesn't
seem right, but that macro's used all over the place.  Not
documented very much though. :(  FAOD, not blaming the author
of r15-1619-g3b9b8d6cfdf593 here.

Also FTR, I had to search a bit to find the patch submission
and review.  It's in the archives of last October:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/631849.html
as mentioned in another message.

brgds, H-P
diff mbox series

Patch

diff --git a/gcc/reorg.cc b/gcc/reorg.cc
index 99228a22c69..633099ca765 100644
--- a/gcc/reorg.cc
+++ b/gcc/reorg.cc
@@ -3409,7 +3409,8 @@  relax_delay_slots (rtx_insn *first)
 	  && next && simplejump_or_return_p (next)
 	  && (next_active_insn (as_a<rtx_insn *> (target_label))
 	      == next_active_insn (next))
-	  && no_labels_between_p (insn, next))
+	  && no_labels_between_p (insn, next)
+	  && !switch_text_sections_between_p (insn, next_active_insn (next)))
 	{
 	  rtx label = JUMP_LABEL (next);
 	  rtx old_label = JUMP_LABEL (delay_jump_insn);