Message ID | alpine.LNX.2.20.13.1707152000200.20069@monopod.intra.ispras.ru |
---|---|
State | New |
Headers | show |
On Sat, 15 Jul 2017, Alexander Monakov wrote: > On Thu, 13 Jul 2017, Marc Glisse wrote: >> X*big*big where abs(big*big)>abs(INT_MIN) can be optimized to 0 > > I'm not sure that would be a win, eliminating X prevents the compiler from > deducing that X must be zero (if overflow invokes undefined behavior). This issue is common to many simplifications. As far as I know, we almost never use an operation to deduce a range for its argument (the only case I can think of is dereferencing a pointer). Though this case is a bit special since we would determine a constant value, not just a range, so we could imagine walking through the uses of X that are dominated by the multiplication and replacing X by 0 there... but that's quite a bit of work for something hopefully rare. I wonder if the new VRP that's in the works changes anything there. >> the only hard case is when the product of the constants is -INT_MIN, which we >> could turn into X<<31 for instance (sadly loses range info), or (-X)*INT_MIN >> or whatever. That would make a nice follow-up, if you are interested. > > Here's a patch that combines constants, but doesn't handle this case, > (I'm not sure we want to handle it, would it be useful in practice?) Probably not worth it indeed. > and neither does substitute zero on overflow, per the above concern. > > (slsr-4.c needs to be adjusted due to new simplifications) > > * match.pd ((X * CST1) * CST2): Simplify to X * (CST1 * CST2) > if the product does not overflow. > testsuite: > * gcc.dg/tree-ssa/assoc-2.c: Enhance. > * gcc.dg/tree-ssa/slsr-4.c: Adjust. > > diff --git a/gcc/match.pd b/gcc/match.pd > index 36045f1..7f384db 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -283,6 +283,17 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > || mul != wi::min_value (TYPE_PRECISION (type), SIGNED)) > { build_zero_cst (type); }))))) > > +/* Combine successive multiplications. Similar to above, but handling > + overflow is different. */ > +(simplify > + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2) > + (with { > + bool overflow_p; > + wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p); > + } > + (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type)) I wonder if there are cases where this would cause trouble for saturating integers. The only case I can think of is when @2 is -1, but that's likely simplified to NEGATE_EXPR first. > + (mult @0 {wide_int_to_tree (type, mul); })))) > + There are a number of possible extensions (handle vectors, handle a conversion, etc) but I guess reviewers probably won't make those a requirement for this patch. Missing space before wide_int_to_tree. > /* Optimize A / A to 1.0 if we don't care about > NaNs or Infinities. */ > (simplify > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c > index a92c882..cc0e9d4 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c > @@ -5,4 +5,15 @@ int f0(int a, int b){ > return a * 33 * b * 55; > } > > -/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */ > +int f1(int a){ > + a *= 33; > + return a * 55; > +} > + > +int f2(int a, int b){ > + a *= 33; > + return a * b * 55; > +} > + > +/* { dg-final { scan-tree-dump-times "mult_expr" 7 "gimple" } } */ > +/* { dg-final { scan-tree-dump-times "mult_expr" 5 "optimized" } } */ Ah, so that's why you had -fdump-tree-optimized in the previous patch... > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c > index 17d7b4c..1e943b7 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c > @@ -23,13 +23,9 @@ f (int i) > foo (y); > } > > -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "slsr" } } */ > -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "slsr" } } */ > -/* { dg-final { scan-tree-dump-times "\\+ 20;" 1 "slsr" } } */ > +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "slsr" } } */ > /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "slsr" } } */ > -/* { dg-final { scan-tree-dump-times "\\- 16;" 1 "slsr" } } */ > /* { dg-final { scan-tree-dump-times "\\- 160" 1 "slsr" } } */ > -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "optimized" } } */ > -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "optimized" } } */ > +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "optimized" } } */ > /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "optimized" } } */ > /* { dg-final { scan-tree-dump-times "\\+ 40" 1 "optimized" } } */
On Mon, 17 Jul 2017, Marc Glisse wrote: > > +/* Combine successive multiplications. Similar to above, but handling > > + overflow is different. */ > > +(simplify > > + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2) > > + (with { > > + bool overflow_p; > > + wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p); > > + } > > + (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type)) > > I wonder if there are cases where this would cause trouble for saturating > integers. The only case I can think of is when @2 is -1, but that's likely > simplified to NEGATE_EXPR first. Ah, yes, I think if @2 is -1 or 0 then we should not attempt this transform for either saturating or sanitized types, just like in the first patch. I think wrapping the 'with' with 'if (!integer_minus_onep (@2) && !integer_zerop (@2))' works, since as you say it should become a negate/zero anyway? Alexander
diff --git a/gcc/match.pd b/gcc/match.pd index 36045f1..7f384db 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -283,6 +283,17 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) || mul != wi::min_value (TYPE_PRECISION (type), SIGNED)) { build_zero_cst (type); }))))) +/* Combine successive multiplications. Similar to above, but handling + overflow is different. */ +(simplify + (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2) + (with { + bool overflow_p; + wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), &overflow_p); + } + (if (!overflow_p || TYPE_OVERFLOW_WRAPS (type)) + (mult @0 {wide_int_to_tree (type, mul); })))) + /* Optimize A / A to 1.0 if we don't care about NaNs or Infinities. */ (simplify diff --git a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c index a92c882..cc0e9d4 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/assoc-2.c @@ -5,4 +5,15 @@ int f0(int a, int b){ return a * 33 * b * 55; } -/* { dg-final { scan-tree-dump-times "mult_expr" 2 "gimple" } } */ +int f1(int a){ + a *= 33; + return a * 55; +} + +int f2(int a, int b){ + a *= 33; + return a * b * 55; +} + +/* { dg-final { scan-tree-dump-times "mult_expr" 7 "gimple" } } */ +/* { dg-final { scan-tree-dump-times "mult_expr" 5 "optimized" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c index 17d7b4c..1e943b7 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-4.c @@ -23,13 +23,9 @@ f (int i) foo (y); } -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "slsr" } } */ -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "slsr" } } */ -/* { dg-final { scan-tree-dump-times "\\+ 20;" 1 "slsr" } } */ +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "slsr" } } */ /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "slsr" } } */ -/* { dg-final { scan-tree-dump-times "\\- 16;" 1 "slsr" } } */ /* { dg-final { scan-tree-dump-times "\\- 160" 1 "slsr" } } */ -/* { dg-final { scan-tree-dump-times "\\* 4" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "\\* 10" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "\\* 40" 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "\\+ 200" 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "\\+ 40" 1 "optimized" } } */