Message ID | 5B990CF8.2020807@foss.arm.com |
---|---|
State | New |
Headers | show |
Series | [tree-ssa-mathopts] PR tree-optimization/87259: Try execute_cse_reciprocals_1 when optimize_recip_sqrt fails | expand |
On Wed, Sep 12, 2018 at 2:56 PM Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: > > Hi all, > > This PR is an SSA validation error where the definition ends up not dominating a use. > This appeared after my recip_sqrt optimisation. Though the optimisation doesn't get triggered > as there is no sqrt involved it prevents execute_cse_reciprocals_1 from running as it's currently > an either/or situation. I think execute_cse_reciprocals_1 doesn't like it when some divisions get > execute_cse_reciprocals_1 called on them and some don't. > > This patch changes optimize_recip_sqrt to return a boolean indicating success and call execute_cse_reciprocals_1 > if no transform was made so that execute_cse_reciprocals_1 has a chance to optimise it as well. > > Bootstrapped and tested on x86_64-unknown-linux-gnu. > Ok for trunk? OK. But your analysis suggests to do it the other way around - call execute_cse_reciprocals_1 and if that leaves a division at *gsi process it with optimize_recip_sqrt? Richard. > Thanks, > Kyrill > > 2018-09-12 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > PR tree-optimization/87259 > PR lto/87283 > * tree-ssa-math-opts.c (optimize_recip_sqrt): Make function return > boolean. > (pass_cse_reciprocals::execute): Try execute_cse_reciprocals_1 if > optimize_recip_sqrt was ineffective. > > 2018-09-12 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > PR tree-optimization/87259 > * gcc.dg/pr87259.c: New test.
On 12/09/18 15:51, Richard Biener wrote: > On Wed, Sep 12, 2018 at 2:56 PM Kyrill Tkachov > <kyrylo.tkachov@foss.arm.com> wrote: >> Hi all, >> >> This PR is an SSA validation error where the definition ends up not dominating a use. >> This appeared after my recip_sqrt optimisation. Though the optimisation doesn't get triggered >> as there is no sqrt involved it prevents execute_cse_reciprocals_1 from running as it's currently >> an either/or situation. I think execute_cse_reciprocals_1 doesn't like it when some divisions get >> execute_cse_reciprocals_1 called on them and some don't. >> >> This patch changes optimize_recip_sqrt to return a boolean indicating success and call execute_cse_reciprocals_1 >> if no transform was made so that execute_cse_reciprocals_1 has a chance to optimise it as well. >> >> Bootstrapped and tested on x86_64-unknown-linux-gnu. >> Ok for trunk? > OK. But your analysis suggests to do it the other way around - call > execute_cse_reciprocals_1 and if that > leaves a division at *gsi process it with optimize_recip_sqrt? > Ah, that's a shorter approach, this version implements that. Bootstrapped and tested on x86_64-unknown-linux-gnu. Is this preferable? Thanks, Kyrill 2018-09-12 Kyrylo Tkachov <kyrylo.tkachov@arm.com> PR tree-optimization/87259 PR lto/87283 (pass_cse_reciprocals::execute): Run optimize_recip_sqrt after execute_cse_reciprocals_1 has tried transforming. 2018-09-12 Kyrylo Tkachov <kyrylo.tkachov@arm.com> PR tree-optimization/87259 * gcc.dg/pr87259.c: New test. diff --git a/gcc/testsuite/gcc.dg/pr87259.c b/gcc/testsuite/gcc.dg/pr87259.c new file mode 100644 index 0000000000000000000000000000000000000000..527a60a37adb520591ff40a52a12706aa1582068 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr87259.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-Ofast" } */ + +int a, b, c; +int *e; +float f; +void h() { + for (int g;;) { + float d = b, i = 0 / f, j = a / (f * f), k, l = 0 / d; + c = i + j; + g = l; + e[g] = c / d * k / d; + } +} diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c index 055b4e08e97469bac3ba7e26d61193512c234575..2b3ee887ad047b6e1430efe901965e2aeccf163a 100644 --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -547,7 +547,7 @@ free_bb (struct occurrence *occ) depending on the uses of x, r1, r2. This removes one multiplication and allows the sqrt and division operations to execute in parallel. DEF_GSI is the gsi of the initial division by sqrt that defines - DEF (x in the example abovs). */ + DEF (x in the example above). */ static void optimize_recip_sqrt (gimple_stmt_iterator *def_gsi, tree def) @@ -947,13 +947,13 @@ pass_cse_reciprocals::execute (function *fun) && FLOAT_TYPE_P (TREE_TYPE (def)) && TREE_CODE (def) == SSA_NAME) { + execute_cse_reciprocals_1 (&gsi, def); + stmt = gsi_stmt (gsi); if (flag_unsafe_math_optimizations && is_gimple_assign (stmt) && !stmt_can_throw_internal (stmt) && gimple_assign_rhs_code (stmt) == RDIV_EXPR) optimize_recip_sqrt (&gsi, def); - else - execute_cse_reciprocals_1 (&gsi, def); } }
On Wed, Sep 12, 2018 at 6:25 PM Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: > > > On 12/09/18 15:51, Richard Biener wrote: > > On Wed, Sep 12, 2018 at 2:56 PM Kyrill Tkachov > > <kyrylo.tkachov@foss.arm.com> wrote: > >> Hi all, > >> > >> This PR is an SSA validation error where the definition ends up not dominating a use. > >> This appeared after my recip_sqrt optimisation. Though the optimisation doesn't get triggered > >> as there is no sqrt involved it prevents execute_cse_reciprocals_1 from running as it's currently > >> an either/or situation. I think execute_cse_reciprocals_1 doesn't like it when some divisions get > >> execute_cse_reciprocals_1 called on them and some don't. > >> > >> This patch changes optimize_recip_sqrt to return a boolean indicating success and call execute_cse_reciprocals_1 > >> if no transform was made so that execute_cse_reciprocals_1 has a chance to optimise it as well. > >> > >> Bootstrapped and tested on x86_64-unknown-linux-gnu. > >> Ok for trunk? > > OK. But your analysis suggests to do it the other way around - call > > execute_cse_reciprocals_1 and if that > > leaves a division at *gsi process it with optimize_recip_sqrt? > > > > Ah, that's a shorter approach, this version implements that. > > Bootstrapped and tested on x86_64-unknown-linux-gnu. > Is this preferable? Yes. OK for trunk. Richard. > Thanks, > Kyrill > > 2018-09-12 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > PR tree-optimization/87259 > PR lto/87283 > (pass_cse_reciprocals::execute): Run optimize_recip_sqrt after > execute_cse_reciprocals_1 has tried transforming. > > 2018-09-12 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > PR tree-optimization/87259 > * gcc.dg/pr87259.c: New test. >
diff --git a/gcc/testsuite/gcc.dg/pr87259.c b/gcc/testsuite/gcc.dg/pr87259.c new file mode 100644 index 0000000000000000000000000000000000000000..527a60a37adb520591ff40a52a12706aa1582068 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr87259.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-Ofast" } */ + +int a, b, c; +int *e; +float f; +void h() { + for (int g;;) { + float d = b, i = 0 / f, j = a / (f * f), k, l = 0 / d; + c = i + j; + g = l; + e[g] = c / d * k / d; + } +} diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c index 055b4e08e97469bac3ba7e26d61193512c234575..700ce6cecf0bc00a6d562db4c1f268df5b25f03a 100644 --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -547,9 +547,10 @@ free_bb (struct occurrence *occ) depending on the uses of x, r1, r2. This removes one multiplication and allows the sqrt and division operations to execute in parallel. DEF_GSI is the gsi of the initial division by sqrt that defines - DEF (x in the example abovs). */ + DEF (x in the example abovs). + Return TRUE if a transformation was possible. */ -static void +static bool optimize_recip_sqrt (gimple_stmt_iterator *def_gsi, tree def) { gimple *use_stmt; @@ -562,13 +563,13 @@ optimize_recip_sqrt (gimple_stmt_iterator *def_gsi, tree def) if (TREE_CODE (orig_sqrt_ssa_name) != SSA_NAME || TREE_CODE (div_rhs1) != REAL_CST || !real_equal (&TREE_REAL_CST (div_rhs1), &dconst1)) - return; + return false; gcall *sqrt_stmt = dyn_cast <gcall *> (SSA_NAME_DEF_STMT (orig_sqrt_ssa_name)); if (!sqrt_stmt || !gimple_call_lhs (sqrt_stmt)) - return; + return false; switch (gimple_call_combined_fn (sqrt_stmt)) { @@ -577,7 +578,7 @@ optimize_recip_sqrt (gimple_stmt_iterator *def_gsi, tree def) break; default: - return; + return false; } tree a = gimple_call_arg (sqrt_stmt, 0); @@ -615,15 +616,15 @@ optimize_recip_sqrt (gimple_stmt_iterator *def_gsi, tree def) a multiplication so make sure we don't introduce a multiplication on a path where there was none. */ if (has_other_use && !mult_on_main_path) - return; + return false; if (sqr_stmts.is_empty () && mult_stmts.is_empty ()) - return; + return false; /* If x = 1.0 / sqrt (a) has uses other than those optimized here we want to be able to compose it from the sqr and mult cases. */ if (has_other_use && (sqr_stmts.is_empty () || mult_stmts.is_empty ())) - return; + return false; if (dump_file) { @@ -715,6 +716,7 @@ optimize_recip_sqrt (gimple_stmt_iterator *def_gsi, tree def) gsi_remove (&gsi2, true); release_defs (stmt); } + return true; } /* Look for floating-point divisions among DEF's uses, and try to @@ -950,8 +952,9 @@ pass_cse_reciprocals::execute (function *fun) if (flag_unsafe_math_optimizations && is_gimple_assign (stmt) && !stmt_can_throw_internal (stmt) - && gimple_assign_rhs_code (stmt) == RDIV_EXPR) - optimize_recip_sqrt (&gsi, def); + && gimple_assign_rhs_code (stmt) == RDIV_EXPR + && optimize_recip_sqrt (&gsi, def)) + ; else execute_cse_reciprocals_1 (&gsi, def); }