diff mbox series

[RFA,PR,rtl-optimization/116136] Fix previously latent SUBREG simplification bug

Message ID 5685a3c8-e8f3-4106-b747-cd6c5831b44f@ventanamicro.com
State New
Headers show
Series [RFA,PR,rtl-optimization/116136] Fix previously latent SUBREG simplification bug | expand

Commit Message

Jeff Law July 30, 2024, 8:22 p.m. UTC
This fixes a testsuite regression seen on m68k after some of the recent 
ext-dce changes.  Ultimately Richard S and I have concluded the bug was 
a latent issue in subreg simplification.

Essentially when simplifying something like

(set (target:M1) (subreg:M1 (subreg:M2 (reg:M1) 0) 0))

Where M1 > M2.  We'd simplify to:

(set (target:M1) (reg:M1))

The problem is on a big endian target that's wrong.   Consider if M1 is 
DI and M2 is SI.    The original should extract bits 32..63 from the 
source register and store them into bits 0..31 of the target register. 
In the simplified form it's just a copy, so bits 0..63 of the source end 
up bits 0..63 of the target.

This shows up as the following regressions on the m68k:

> Tests that now fail, but worked before (3 tests):
> 
> gcc: gcc.c-torture/execute/960416-1.c   -O2  execution test
> gcc: gcc.c-torture/execute/960416-1.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
> gcc: gcc.c-torture/execute/960416-1.c   -Os  execution test



The fix is pretty trivial, instead of hardcoding "0" as the byte offset 
in the test for the simplification, instead we need to use the 
subreg_lowpart_offset.


Anyway, bootstrapped and regression tested on m68k and x86_64 and tested 
on the other embedded targets as well without regressions.  Naturally it 
fixes the regression noted above.  I haven't see other testsuite 
improvements when I spot checked some of the big endian crosses.

OK for the trunk?

Jeff
PR rtl-optimization/116136
gcc/
	* simplify-rtx.cc (simplify_context::simplify_subreg): Check
	that we're working with the lowpart offset rather than byte 0.

Comments

Richard Biener July 31, 2024, 8:34 a.m. UTC | #1
On Tue, Jul 30, 2024 at 10:24 PM Jeff Law <jlaw@ventanamicro.com> wrote:
>
>
> This fixes a testsuite regression seen on m68k after some of the recent
> ext-dce changes.  Ultimately Richard S and I have concluded the bug was
> a latent issue in subreg simplification.
>
> Essentially when simplifying something like
>
> (set (target:M1) (subreg:M1 (subreg:M2 (reg:M1) 0) 0))
>
> Where M1 > M2.  We'd simplify to:
>
> (set (target:M1) (reg:M1))
>
> The problem is on a big endian target that's wrong.   Consider if M1 is
> DI and M2 is SI.    The original should extract bits 32..63 from the
> source register and store them into bits 0..31 of the target register.
> In the simplified form it's just a copy, so bits 0..63 of the source end
> up bits 0..63 of the target.
>
> This shows up as the following regressions on the m68k:
>
> > Tests that now fail, but worked before (3 tests):
> >
> > gcc: gcc.c-torture/execute/960416-1.c   -O2  execution test
> > gcc: gcc.c-torture/execute/960416-1.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
> > gcc: gcc.c-torture/execute/960416-1.c   -Os  execution test
>
>
>
> The fix is pretty trivial, instead of hardcoding "0" as the byte offset
> in the test for the simplification, instead we need to use the
> subreg_lowpart_offset.
>
>
> Anyway, bootstrapped and regression tested on m68k and x86_64 and tested
> on the other embedded targets as well without regressions.  Naturally it
> fixes the regression noted above.  I haven't see other testsuite
> improvements when I spot checked some of the big endian crosses.
>
> OK for the trunk?

OK.

Thanks,
Richard.

> Jeff
>
diff mbox series

Patch

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index a49eefb34d4..a20a61c5ddd 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -7744,8 +7744,9 @@  simplify_context::simplify_subreg (machine_mode outermode, rtx op,
 	return NULL_RTX;
 
       if (outermode == innermostmode
-	  && known_eq (byte, 0U)
-	  && known_eq (SUBREG_BYTE (op), 0))
+	  && known_eq (byte, subreg_lowpart_offset (outermode, innermode))
+	  && known_eq (SUBREG_BYTE (op),
+		       subreg_lowpart_offset (innermode, innermostmode)))
 	return SUBREG_REG (op);
 
       /* Work out the memory offset of the final OUTERMODE value relative