Message ID | 00a701d82660$a1b8ac20$e52a0460$@nextmovesoftware.com |
---|---|
State | New |
Headers | show |
Series | Improved constant folding for scalar evolution. | expand |
On Sun, Feb 20, 2022 at 2:50 PM Roger Sayle <roger@nextmovesoftware.com> wrote: > > > This patch adds a small (follow-up) optimization to chrec_apply for > linear chrecs to clean-up the final value expressions sometimes generated > by GCC's scalar evolution pass. The transformation of A+(X-1)*A into > A*X is usually unsafe with respect to overflow (see PR92712), and so > can't be performed by match.pd (or fold-const). However, during scalar > evolution's evaluation of recurrences it's known that X-1 can't be negative > (in fact X-1 is unsigned even when A is signed), hence this optimization > can be applied. Interestingly, this expression does get simplified in > later passes once the range of X-1 is bounded by VRP, but that occurs > long after the decision of whether to perform final value replacement, > which is based on the complexity of this expression. In principle A + (X-1)*A can be always simplified to (unsigned)A * (unsigned)X, but at least fold-consts fold_plusminus_mult has /* Do not resort to unsigned multiplication because we lose the no-overflow property of the expression. */ return NULL_TREE; we might want to heuristically do that anyway if the result is not a multiplication by a constant (I remember doing the above because of testsuite regressions). It might be also interesting to see whether we change back (int)((unsigned)A * (unsigned)X) to A * X when we can constrain ranges. In the specific case of the testcase below we of course only know overflow doesn't happen because it would be undefined behavior. > The motivating test case is the optimization of the loop (from comment > #7 of PR65855): > > int square(int x) { > int result = 0; > for (int i = 0; i < x; ++i) > result += x; > return result; > } > > which is currently optimized, with final value replacement to: > > final value replacement: > with expr: (int) ((unsigned int) x_3(D) + 4294967295) * x_3(D) + x_3(D) > > but with this patch it first gets simplified further: > > final value replacement: > with expr: x_3(D) * x_3(D) > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check with no new failures. Ok for mainline? OK once stage1 opens. Thanks, Richard. > > 2022-02-20 Roger Sayle <roger@nextmovesoftware.com> > > gcc/ChangeLog > * tree-chrec.cc (chrec_apply): Attempt to fold the linear chrec > "{a, +, a} (x-1)" as "a*x", as the number of loop iterations, x-1, > can't be negative. > > gcc/testsuite/ChangeLog > * gcc.dg/tree-ssa/pr65855-2.c: New test case. > > > Roger > -- >
diff --git a/gcc/tree-chrec.cc b/gcc/tree-chrec.cc index c44cea7..7321fb9 100644 --- a/gcc/tree-chrec.cc +++ b/gcc/tree-chrec.cc @@ -612,16 +612,31 @@ chrec_apply (unsigned var, case POLYNOMIAL_CHREC: if (evolution_function_is_affine_p (chrec)) { + tree chrecr = CHREC_RIGHT (chrec); if (CHREC_VARIABLE (chrec) != var) - return build_polynomial_chrec + res = build_polynomial_chrec (CHREC_VARIABLE (chrec), chrec_apply (var, CHREC_LEFT (chrec), x), - chrec_apply (var, CHREC_RIGHT (chrec), x)); + chrec_apply (var, chrecr, x)); /* "{a, +, b} (x)" -> "a + b*x". */ - x = chrec_convert_rhs (type, x, NULL); - res = chrec_fold_multiply (TREE_TYPE (x), CHREC_RIGHT (chrec), x); - res = chrec_fold_plus (type, CHREC_LEFT (chrec), res); + else if (operand_equal_p (CHREC_LEFT (chrec), chrecr) + && TREE_CODE (x) == PLUS_EXPR + && integer_all_onesp (TREE_OPERAND (x, 1))) + { + /* We know the number of iterations can't be negative. + So {a, +, a} (x-1) -> "a*x". */ + res = build_int_cst (TREE_TYPE (x), 1); + res = chrec_fold_plus (TREE_TYPE (x), x, res); + res = chrec_convert_rhs (type, res, NULL); + res = chrec_fold_multiply (type, chrecr, res); + } + else + { + res = chrec_convert_rhs (TREE_TYPE (chrecr), x, NULL); + res = chrec_fold_multiply (TREE_TYPE (chrecr), chrecr, res); + res = chrec_fold_plus (type, CHREC_LEFT (chrec), res); + } } else if (TREE_CODE (x) == INTEGER_CST && tree_int_cst_sgn (x) == 1) @@ -644,7 +659,7 @@ chrec_apply (unsigned var, if (dump_file && (dump_flags & TDF_SCEV)) { - fprintf (dump_file, " (varying_loop = %d\n", var); + fprintf (dump_file, " (varying_loop = %d", var); fprintf (dump_file, ")\n (chrec = "); print_generic_expr (dump_file, chrec); fprintf (dump_file, ")\n (x = "); diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr65855-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr65855-2.c new file mode 100644 index 0000000..d44ef51 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr65855-2.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-sccp" } */ + +int square(int x) { + int result = 0; + for (int i = 0; i < x; ++i) + result += x; + return result; +} + +/* { dg-final { scan-tree-dump " with expr: x_\[0-9\]\\(D\\) \\* x_\[0-9\]\\(D\\)" "sccp" } } */