Message ID | alpine.LSU.2.20.1610201615340.5714@wotan.suse.de |
---|---|
State | New |
Headers | show |
On 10/20/2016 08:20 AM, Michael Matz wrote: > Hello, > > like analyzed in the PR, combine is able to remove outer subregs that > don't do anything interesting in the context they are used > (simplify_comparison). But that currently happens outside of the loop > that retries simplifications if changes occurred. > > When we do that inside the loop as well we get secondary simplifications > that currently only happen when calling the simplifiers multiple time, > like when we start from three rather than from two instructions. So > right now we're in the curious position that more complicated code is > optimized better than simpler code and the patch fixes this. > > (FWIW: this replicates parts of rather than moves the responsible code, > because between the loop and the original place of simplification other > things happen that might itself generate subregs). > > Regstrapping on x86-64, all languages in process. Okay if that passes? > > > Ciao, > Michael. > PR missed-optimization/77881 > * combine.c (simplify_comparison): Remove useless subregs > also inside the loop, not just after it. > > testsuite/ > * gcc.target/i386/pr77881.c: New test. LGTM. I was a bit curious why you duplicated rather than factoring, but it looks like you simplified the copy a bit by not handling the paradoxical subreg case. There's an outside chance this might help a couple BZs that I've poked at in the past. I'll make a point to test them again this release cycle to see if your change improves them (I don't have the #s handy, I think they're P4/P5 regressions). jeff
Hi Michael, On Thu, Oct 20, 2016 at 04:20:09PM +0200, Michael Matz wrote: > PR missed-optimization/77881 > * combine.c (simplify_comparison): Remove useless subregs > also inside the loop, not just after it. > > testsuite/ > * gcc.target/i386/pr77881.c: New test. This isn't checked in yet, do you still want it? Thanks, Segher
diff --git a/gcc/combine.c b/gcc/combine.c index 2727683..58351ff 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -11925,6 +11925,28 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1) if (subreg_lowpart_p (op0) && GET_MODE_PRECISION (GET_MODE (SUBREG_REG (op0))) < mode_width) ; + else if (subreg_lowpart_p (op0) + && GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT + && GET_MODE_CLASS (GET_MODE (SUBREG_REG (op0))) == MODE_INT + && (code == NE || code == EQ) + && (GET_MODE_PRECISION (GET_MODE (SUBREG_REG (op0))) + <= HOST_BITS_PER_WIDE_INT) + && !paradoxical_subreg_p (op0) + && (nonzero_bits (SUBREG_REG (op0), + GET_MODE (SUBREG_REG (op0))) + & ~GET_MODE_MASK (GET_MODE (op0))) == 0) + { + /* Remove outer subregs that don't do anything. */ + tem = gen_lowpart (GET_MODE (SUBREG_REG (op0)), op1); + + if ((nonzero_bits (tem, GET_MODE (SUBREG_REG (op0))) + & ~GET_MODE_MASK (GET_MODE (op0))) == 0) + { + op0 = SUBREG_REG (op0), op1 = tem; + continue; + } + break; + } else break; diff --git a/gcc/testsuite/gcc.target/i386/pr77881.c b/gcc/testsuite/gcc.target/i386/pr77881.c new file mode 100644 index 0000000..80d143f --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr77881.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target pie } */ +/* { dg-options "-O2" } */ +extern void baz(void); +int +foo (long long int a, long long int a2, int b) +{ + if (a < 0 || b) + baz (); +} +/* { dg-final { scan-assembler "js\[ \t\]\.L" } } */ +/* { dg-final { scan-assembler "jne\[ \t\]\.L" } } */