Message ID | ZhF/QS9e9+rbNYMT@tucnak |
---|---|
State | New |
Headers | show |
Series | s390: Fix s390_const_int_pool_entry_p and movdi peephole2 [PR114605] | expand |
On Sat, 2024-04-06 at 18:58 +0200, Jakub Jelinek wrote: > Hi! > > The following testcase is miscompiled, because we have initially > a movti which loads the 0x3f8000003f800000ULL TImode constant > from constant pool. Later on we split it into a pair of DImode > loads. Now, for the first load (why just that?, though not stage4 > material) we trigger the peephole2 which uses > s390_const_int_pool_entry_p. > That function doesn't check at all the constant pool mode though, > sees > the constant pool at that address has a CONST_INT value and just > assumes > that is the value to return, which is especially wrong for big- > endian, > if it is a DImode load from offset 0, it should be loading 0 rather > than > 0x3f8000003f800000ULL. > The following patch adds checks if we are extracing a MODE_INT mode, > if the constant pool has MODE_INT mode as well, punts if constant > pool > has smaller mode size than the extraction one (then it would be UB), > if it has the same mode as before keeps using what it did before, > if constant pool has a larger mode than the one being extracted, uses > simplify_subreg. I'd have used avoid_constant_pool_reference > instead which can handle also offsets into the constant pool > constants, > but it can't handle UNSPEC_LTREF. > > Another thing is that once that is fixed, we ICE when we extract > constant > like 0, ior insn predicate require non-0 constant. So, the patch > also > fixes the peephole2 so that if either 32-bit half is zero, it uses a > mere > load of the constant into register rather than a pair of such load > and ior. > > Bootstrapped/regtested on s390x-linux, ok for trunk? Hi Jakub, thanks for the patch, it looks good to me. Since I'm not a maintainer, we need to wait for Andreas' opinion. > > 2024-04-06 Jakub Jelinek <jakub@redhat.com> > > PR target/114605 > * config/s390/s390.cc (s390_const_int_pool_entry_p): Punt > if mem doesn't have MODE_INT mode, or pool constant doesn't > have MODE_INT mode, or if pool constant mode is smaller than > mem mode. If mem mode is different from pool constant mode, > try to simplify subreg. If that doesn't work, punt, if it > does, use the simplified constant instead of the constant > pool > constant. > * config/s390/s390.md (movdi from const pool peephole): If > either low or high 32-bit part is zero, just emit move insn > instead of move + ior. > > * gcc.dg/pr114605.c: New test.
On 4/8/24 13:43, Ilya Leoshkevich wrote: > On Sat, 2024-04-06 at 18:58 +0200, Jakub Jelinek wrote: >> Hi! >> >> The following testcase is miscompiled, because we have initially >> a movti which loads the 0x3f8000003f800000ULL TImode constant >> from constant pool. Later on we split it into a pair of DImode >> loads. Now, for the first load (why just that?, though not stage4 >> material) we trigger the peephole2 which uses >> s390_const_int_pool_entry_p. >> That function doesn't check at all the constant pool mode though, >> sees >> the constant pool at that address has a CONST_INT value and just >> assumes >> that is the value to return, which is especially wrong for big- >> endian, >> if it is a DImode load from offset 0, it should be loading 0 rather >> than >> 0x3f8000003f800000ULL. >> The following patch adds checks if we are extracing a MODE_INT mode, >> if the constant pool has MODE_INT mode as well, punts if constant >> pool >> has smaller mode size than the extraction one (then it would be UB), >> if it has the same mode as before keeps using what it did before, >> if constant pool has a larger mode than the one being extracted, uses >> simplify_subreg. I'd have used avoid_constant_pool_reference >> instead which can handle also offsets into the constant pool >> constants, >> but it can't handle UNSPEC_LTREF. >> >> Another thing is that once that is fixed, we ICE when we extract >> constant >> like 0, ior insn predicate require non-0 constant. So, the patch >> also >> fixes the peephole2 so that if either 32-bit half is zero, it uses a >> mere >> load of the constant into register rather than a pair of such load >> and ior. >> >> Bootstrapped/regtested on s390x-linux, ok for trunk? > > Hi Jakub, thanks for the patch, it looks good to me. > Since I'm not a maintainer, we need to wait for Andreas' opinion. Ok. Thank you very much Jakub for fixing this! Andreas > >> >> 2024-04-06 Jakub Jelinek <jakub@redhat.com> >> >> PR target/114605 >> * config/s390/s390.cc (s390_const_int_pool_entry_p): Punt >> if mem doesn't have MODE_INT mode, or pool constant doesn't >> have MODE_INT mode, or if pool constant mode is smaller than >> mem mode. If mem mode is different from pool constant mode, >> try to simplify subreg. If that doesn't work, punt, if it >> does, use the simplified constant instead of the constant >> pool >> constant. >> * config/s390/s390.md (movdi from const pool peephole): If >> either low or high 32-bit part is zero, just emit move insn >> instead of move + ior. >> >> * gcc.dg/pr114605.c: New test.
--- gcc/config/s390/s390.cc.jj 2024-03-14 14:07:34.088426911 +0100 +++ gcc/config/s390/s390.cc 2024-04-05 15:58:57.757057420 +0200 @@ -9984,7 +9984,7 @@ s390_const_int_pool_entry_p (rtx mem, HO - (mem (unspec [(symbol_ref) (reg)] UNSPEC_LTREF)). - (mem (symbol_ref)). */ - if (!MEM_P (mem)) + if (!MEM_P (mem) || GET_MODE_CLASS (GET_MODE (mem)) != MODE_INT) return false; rtx addr = XEXP (mem, 0); @@ -9998,9 +9998,19 @@ s390_const_int_pool_entry_p (rtx mem, HO return false; rtx val_rtx = get_pool_constant (sym); - if (!CONST_INT_P (val_rtx)) + machine_mode mode = get_pool_mode (sym); + if (!CONST_INT_P (val_rtx) + || GET_MODE_CLASS (mode) != MODE_INT + || GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (mem))) return false; + if (mode != GET_MODE (mem)) + { + val_rtx = simplify_subreg (GET_MODE (mem), val_rtx, mode, 0); + if (val_rtx == NULL_RTX || !CONST_INT_P (val_rtx)) + return false; + } + if (val != nullptr) *val = INTVAL (val_rtx); return true; --- gcc/config/s390/s390.md.jj 2024-01-03 11:51:54.638410489 +0100 +++ gcc/config/s390/s390.md 2024-04-05 16:17:17.322234553 +0200 @@ -2152,6 +2152,16 @@ (define_peephole2 gcc_assert (ok); operands[2] = GEN_INT (val & 0xFFFFFFFF00000000ULL); operands[3] = GEN_INT (val & 0x00000000FFFFFFFFULL); + if (operands[2] == const0_rtx) + { + emit_move_insn (operands[0], operands[3]); + DONE; + } + else if (operands[3] == const0_rtx) + { + emit_move_insn (operands[0], operands[2]); + DONE; + } }) ; --- gcc/testsuite/gcc.dg/pr114605.c.jj 2024-04-05 16:25:34.678505438 +0200 +++ gcc/testsuite/gcc.dg/pr114605.c 2024-04-05 16:25:10.388834268 +0200 @@ -0,0 +1,37 @@ +/* PR target/114605 */ +/* { dg-do run } */ +/* { dg-options "-O0" } */ + +typedef struct { const float *a; int b, c; float *d; } S; + +__attribute__((noipa)) void +bar (void) +{ +} + +__attribute__((noinline, optimize (2))) static void +foo (S *e) +{ + const float *f; + float *g; + float h[4] = { 0.0, 0.0, 1.0, 1.0 }; + if (!e->b) + f = h; + else + f = e->a; + g = &e->d[0]; + __builtin_memcpy (g, f, sizeof (float) * 4); + bar (); + if (!e->b) + if (g[0] != 0.0 || g[1] != 0.0 || g[2] != 1.0 || g[3] != 1.0) + __builtin_abort (); +} + +int +main () +{ + float d[4]; + S e = { .d = d }; + foo (&e); + return 0; +}