Message ID | 5685a3c8-e8f3-4106-b747-cd6c5831b44f@ventanamicro.com |
---|---|
State | New |
Headers | show |
Series | [RFA,PR,rtl-optimization/116136] Fix previously latent SUBREG simplification bug | expand |
On Tue, Jul 30, 2024 at 10:24 PM Jeff Law <jlaw@ventanamicro.com> wrote: > > > This fixes a testsuite regression seen on m68k after some of the recent > ext-dce changes. Ultimately Richard S and I have concluded the bug was > a latent issue in subreg simplification. > > Essentially when simplifying something like > > (set (target:M1) (subreg:M1 (subreg:M2 (reg:M1) 0) 0)) > > Where M1 > M2. We'd simplify to: > > (set (target:M1) (reg:M1)) > > The problem is on a big endian target that's wrong. Consider if M1 is > DI and M2 is SI. The original should extract bits 32..63 from the > source register and store them into bits 0..31 of the target register. > In the simplified form it's just a copy, so bits 0..63 of the source end > up bits 0..63 of the target. > > This shows up as the following regressions on the m68k: > > > Tests that now fail, but worked before (3 tests): > > > > gcc: gcc.c-torture/execute/960416-1.c -O2 execution test > > gcc: gcc.c-torture/execute/960416-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test > > gcc: gcc.c-torture/execute/960416-1.c -Os execution test > > > > The fix is pretty trivial, instead of hardcoding "0" as the byte offset > in the test for the simplification, instead we need to use the > subreg_lowpart_offset. > > > Anyway, bootstrapped and regression tested on m68k and x86_64 and tested > on the other embedded targets as well without regressions. Naturally it > fixes the regression noted above. I haven't see other testsuite > improvements when I spot checked some of the big endian crosses. > > OK for the trunk? OK. Thanks, Richard. > Jeff >
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc index a49eefb34d4..a20a61c5ddd 100644 --- a/gcc/simplify-rtx.cc +++ b/gcc/simplify-rtx.cc @@ -7744,8 +7744,9 @@ simplify_context::simplify_subreg (machine_mode outermode, rtx op, return NULL_RTX; if (outermode == innermostmode - && known_eq (byte, 0U) - && known_eq (SUBREG_BYTE (op), 0)) + && known_eq (byte, subreg_lowpart_offset (outermode, innermode)) + && known_eq (SUBREG_BYTE (op), + subreg_lowpart_offset (innermode, innermostmode))) return SUBREG_REG (op); /* Work out the memory offset of the final OUTERMODE value relative