Message ID | 049c01d9fb75$47c0bfa0$d7423ee0$@nextmovesoftware.com |
---|---|
State | New |
Series | Optimize (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) as (and:SI x 1). |
On 10/10/23 06:28, Roger Sayle wrote:
>
> This patch is the middle-end piece of an improvement to PRs 101955 and
> 106245, that adds a missing simplification to the RTL optimizers.
> This transformation is to simplify (char)(x << 7) != 0 as x & 1.
> Technically, the cast can be any truncation, where shift is by one
> less than the narrower type's precision, setting the most significant
> (only) bit from the least significant bit.
>
> This transformation applies to any target, but it's easy to see
> (and add a new test case) on x86, where the following function:
>
> int f(int a) { return (a << 31) >> 31; }
>
> currently gets compiled with -O2 to:
>
> foo:    movl    %edi, %eax
>         sall    $7, %eax
>         sarb    $7, %al
>         movsbl  %al, %eax
>         ret
>
> but with this patch, we now generate the slightly simpler:
>
> foo:    movl    %edi, %eax
>         sall    $31, %eax
>         sarl    $31, %eax
>         ret
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?
>
>
> 2023-10-10  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR middle-end/101955
>         PR tree-optimization/106245
>         * simplify-rtx.cc (simplify_relational_operation_1): Simplify
>         the RTL (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) to (and:SI x 1).
>
> gcc/testsuite/ChangeLog
>         * gcc.target/i386/pr106245-1.c: New test case.

OK.  Thanks!  I must admit, I'm a bit surprised this wasn't already handled.

jeff
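The identity behind the patch, (char)(x << 7) != 0 being the same as x & 1, can be checked in plain C. The following is only a minimal sketch, not part of the patch: it assumes uint32_t and uint8_t stand in for the SI and QI modes, uses unsigned types so the shift is well defined in ISO C, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: shift left by 7, truncate to 8 bits, compare with zero.
   uint32_t/uint8_t are stand-ins (an assumption) for SImode/QImode. */
int shifted_truncated_nonzero(uint32_t x)
{
    return (uint8_t)(x << 7) != 0;
}

/* Simplified form: only bit 0 of x survives the shift and truncation. */
int low_bit(uint32_t x)
{
    return (int)(x & 1u);
}

/* Number of inputs in [0, limit) on which the two forms disagree. */
unsigned count_mismatches(uint32_t limit)
{
    unsigned mismatches = 0;
    for (uint32_t x = 0; x < limit; x++)
        if (shifted_truncated_nonzero(x) != low_bit(x))
            mismatches++;
    return mismatches;
}

/* Spot checks, including inputs with high bits set. */
void self_check(void)
{
    assert(count_mismatches(UINT32_C(1) << 16) == 0);
    assert(shifted_truncated_nonzero(UINT32_C(0x80000001)) == 1);
    assert(shifted_truncated_nonzero(UINT32_C(0xFFFFFFFE)) == 0);
}
```

Calling self_check() from a small main is enough to exercise both forms. Note that the rewrite to a bare x & 1 relies on a true comparison being represented as 1, which is why the patch guards on STORE_FLAG_VALUE == 1.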
On Tue, 10 Oct 2023, Roger Sayle wrote:
>
> This patch is the middle-end piece of an improvement to PRs 101955 and
> 106245, that adds a missing simplification to the RTL optimizers.
> This transformation is to simplify (char)(x << 7) != 0 as x & 1.

Random observation:

So, why restrict to shifts of LEN-1 and mask 1?  It's always the case that
  ((type-of-LEN)(x << S)) != 0  ===  (x & ((1 << (LEN - S)) - 1)) != 0.

E.g. (char)(x << 5) != 0 === (x & 7) != 0.

(Eventually the mask will be a constant that's too costly to compute if S
is target-dependently too small, but all else being equal avoiding shifts
seems sensible)


Ciao,
Michael.
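The generalised identity above can be checked the same way. This is a hedged sketch under stated assumptions: LEN is fixed at 8 (uint8_t as the narrow type), the shift count S is restricted to 0 <= S < LEN so the mask expression is well defined, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LEN 8  /* precision of the narrow type; uint8_t is assumed here */

/* Left-hand side: shift by s, truncate to LEN bits, compare with zero. */
int truncated_shift_nonzero(uint32_t x, unsigned s)
{
    return (uint8_t)(x << s) != 0;
}

/* Right-hand side: keep the low LEN - s bits of x, compare with zero. */
int low_bits_nonzero(uint32_t x, unsigned s)
{
    uint32_t mask = (UINT32_C(1) << (LEN - s)) - 1;
    return (x & mask) != 0;
}

/* Disagreements over all shifts 0 <= s < LEN and inputs in [0, limit). */
unsigned count_general_mismatches(uint32_t limit)
{
    unsigned mismatches = 0;
    for (unsigned s = 0; s < LEN; s++)
        for (uint32_t x = 0; x < limit; x++)
            if (truncated_shift_nonzero(x, s) != low_bits_nonzero(x, s))
                mismatches++;
    return mismatches;
}

/* The S = 5 example: only the low three bits of x can survive. */
void self_check_general(void)
{
    assert(truncated_shift_nonzero(4, 5) == low_bits_nonzero(4, 5));
    assert(truncated_shift_nonzero(8, 5) == 0 && low_bits_nonzero(8, 5) == 0);
    assert(count_general_mismatches(UINT32_C(1) << 12) == 0);
}
```

With s = LEN - 1 the mask is 1, which is the special case the patch handles.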
On 10/10/23 08:41, Michael Matz wrote:
>
> On Tue, 10 Oct 2023, Roger Sayle wrote:
>
>>
>> This patch is the middle-end piece of an improvement to PRs 101955 and
>> 106245, that adds a missing simplification to the RTL optimizers.
>> This transformation is to simplify (char)(x << 7) != 0 as x & 1.
>
> Random observation:
>
> So, why restrict to shifts of LEN-1 and mask 1?  It's always the case that
>   ((type-of-LEN)(x << S)) != 0  ===  (x & ((1 << (LEN - S)) - 1)) != 0.
>
> E.g. (char)(x << 5) != 0 === (x & 7) != 0.

Yea, it probably could be extended as a followup.

>
> (Eventually the mask will be a constant that's too costly to compute if S
> is target-dependently too small, but all else being equal avoiding shifts
> seems sensible)

Agreed, though it's nowhere near as important as it was 20+ years ago ;-)

jeff
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index bd9443d..69d8757 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -6109,6 +6109,23 @@ simplify_context::simplify_relational_operation_1 (rtx_code code,
         break;
     }
 
+  /* (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) -> (and:SI x 1).  */
+  if (code == NE
+      && op1 == const0_rtx
+      && (op0code == TRUNCATE
+          || (partial_subreg_p (op0)
+              && subreg_lowpart_p (op0)))
+      && SCALAR_INT_MODE_P (mode)
+      && STORE_FLAG_VALUE == 1)
+    {
+      rtx tmp = XEXP (op0, 0);
+      if (GET_CODE (tmp) == ASHIFT
+          && GET_MODE (tmp) == mode
+          && CONST_INT_P (XEXP (tmp, 1))
+          && is_int_mode (GET_MODE (op0), &int_mode)
+          && INTVAL (XEXP (tmp, 1)) == GET_MODE_PRECISION (int_mode) - 1)
+        return simplify_gen_binary (AND, mode, XEXP (tmp, 0), const1_rtx);
+    }
   return NULL_RTX;
 }
 
diff --git a/gcc/testsuite/gcc.target/i386/pr106245-1.c b/gcc/testsuite/gcc.target/i386/pr106245-1.c
new file mode 100644
index 0000000..a0403e9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106245-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int f(int a)
+{
+    return (a << 31) >> 31;
+}
+
+/* { dg-final { scan-assembler-not "sarb" } } */
+/* { dg-final { scan-assembler-not "movsbl" } } */
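For orientation, the test function f() sign-extends the low bit of its argument, so it returns 0 or -1. The sketch below is not part of the patch and uses hypothetical names. It assumes a 32-bit int (int32_t) and does the left shift on an unsigned value to avoid signed overflow. The conversion back to int32_t and the right shift of a negative value are implementation-defined in ISO C; GCC documents them as modular conversion and sign-extending shift, and the sketch relies on that.

```c
#include <assert.h>
#include <stdint.h>

/* Same computation as the test's f(), with the left shift done on an
   unsigned value.  Relies on GCC's documented behaviour for the cast back
   to int32_t and for >> on negative values (both implementation-defined). */
int32_t shift_pair(int32_t a)
{
    return (int32_t)((uint32_t)a << 31) >> 31;
}

/* Closed form: 0 when the low bit of a is clear, -1 when it is set. */
int32_t negated_low_bit(int32_t a)
{
    return -(a & 1);
}

/* Disagreements between the two forms for a in [lo, hi). */
unsigned count_shift_pair_mismatches(int32_t lo, int32_t hi)
{
    unsigned mismatches = 0;
    for (int32_t a = lo; a < hi; a++)
        if (shift_pair(a) != negated_low_bit(a))
            mismatches++;
    return mismatches;
}

/* Spot checks around zero and at the extremes of the range. */
void self_check_shift_pair(void)
{
    assert(shift_pair(1) == -1);
    assert(shift_pair(2) == 0);
    assert(shift_pair(INT32_MIN) == 0);
    assert(count_shift_pair_mismatches(-1000, 1000) == 0);
}
```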