diff mbox series

[tree-ssa-mathopts] PR tree-optimization/87259: Try execute_cse_reciprocals_1 when optimize_recip_sqrt fails

Message ID 5B990CF8.2020807@foss.arm.com
State New
Headers show
Series [tree-ssa-mathopts] PR tree-optimization/87259: Try execute_cse_reciprocals_1 when optimize_recip_sqrt fails | expand

Commit Message

Kyrill Tkachov Sept. 12, 2018, 12:56 p.m. UTC
Hi all,

This PR is an SSA validation error where the definition ends up not dominating a use.
This appeared after my recip_sqrt optimisation. Though the optimisation doesn't get triggered
as there is no sqrt involved it prevents execute_cse_reciprocals_1 from running as it's currently
an either/or situation. I think execute_cse_reciprocals_1 doesn't like it when some divisions get
execute_cse_reciprocals_1 called on them and some don't.

This patch changes optimize_recip_sqrt to return a boolean indicating success and call execute_cse_reciprocals_1
if no transform was made so that execute_cse_reciprocals_1 has a chance to optimise it as well.

Bootstrapped and tested on x86_64-unknown-linux-gnu.
Ok for trunk?

Thanks,
Kyrill

2018-09-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR tree-optimization/87259
     PR lto/87283
     * tree-ssa-math-opts.c (optimize_recip_sqrt): Make function return
     boolean.
     (pass_cse_reciprocals::execute): Try execute_cse_reciprocals_1 if
     optimize_recip_sqrt was ineffective.

2018-09-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR tree-optimization/87259
     * gcc.dg/pr87259.c: New test.

Comments

Richard Biener Sept. 12, 2018, 2:51 p.m. UTC | #1
On Wed, Sep 12, 2018 at 2:56 PM Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
>
> Hi all,
>
> This PR is an SSA validation error where the definition ends up not dominating a use.
> This appeared after my recip_sqrt optimisation. Though the optimisation doesn't get triggered
> as there is no sqrt involved it prevents execute_cse_reciprocals_1 from running as it's currently
> an either/or situation. I think execute_cse_reciprocals_1 doesn't like it when some divisions get
> execute_cse_reciprocals_1 called on them and some don't.
>
> This patch changes optimize_recip_sqrt to return a boolean indicating success and call execute_cse_reciprocals_1
> if no transform was made so that execute_cse_reciprocals_1 has a chance to optimise it as well.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> Ok for trunk?

OK.  But your analysis suggests to do it the other way around - call
execute_cse_reciprocals_1 and if that
leaves a division at *gsi process it with optimize_recip_sqrt?

Richard.

> Thanks,
> Kyrill
>
> 2018-09-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>      PR tree-optimization/87259
>      PR lto/87283
>      * tree-ssa-math-opts.c (optimize_recip_sqrt): Make function return
>      boolean.
>      (pass_cse_reciprocals::execute): Try execute_cse_reciprocals_1 if
>      optimize_recip_sqrt was ineffective.
>
> 2018-09-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>      PR tree-optimization/87259
>      * gcc.dg/pr87259.c: New test.
Kyrill Tkachov Sept. 12, 2018, 4:25 p.m. UTC | #2
On 12/09/18 15:51, Richard Biener wrote:
> On Wed, Sep 12, 2018 at 2:56 PM Kyrill Tkachov
> <kyrylo.tkachov@foss.arm.com> wrote:
>> Hi all,
>>
>> This PR is an SSA validation error where the definition ends up not dominating a use.
>> This appeared after my recip_sqrt optimisation. Though the optimisation doesn't get triggered
>> as there is no sqrt involved it prevents execute_cse_reciprocals_1 from running as it's currently
>> an either/or situation. I think execute_cse_reciprocals_1 doesn't like it when some divisions get
>> execute_cse_reciprocals_1 called on them and some don't.
>>
>> This patch changes optimize_recip_sqrt to return a boolean indicating success and call execute_cse_reciprocals_1
>> if no transform was made so that execute_cse_reciprocals_1 has a chance to optimise it as well.
>>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>> Ok for trunk?
> OK.  But your analysis suggests to do it the other way around - call
> execute_cse_reciprocals_1 and if that
> leaves a division at *gsi process it with optimize_recip_sqrt?
>

Ah, that's a shorter approach, this version implements that.

Bootstrapped and tested on x86_64-unknown-linux-gnu.
Is this preferable?

Thanks,
Kyrill

2018-09-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR tree-optimization/87259
     PR lto/87283
     (pass_cse_reciprocals::execute): Run optimize_recip_sqrt after
     execute_cse_reciprocals_1 has tried transforming.

2018-09-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR tree-optimization/87259
     * gcc.dg/pr87259.c: New test.
diff --git a/gcc/testsuite/gcc.dg/pr87259.c b/gcc/testsuite/gcc.dg/pr87259.c
new file mode 100644
index 0000000000000000000000000000000000000000..527a60a37adb520591ff40a52a12706aa1582068
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr87259.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+
+int a, b, c;
+int *e;
+float f;
+void h() {
+  for (int g;;) {
+    float d = b, i = 0 / f, j = a / (f * f), k, l = 0 / d;
+    c = i + j;
+    g = l;
+    e[g] = c / d * k / d;
+  }
+}
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 055b4e08e97469bac3ba7e26d61193512c234575..2b3ee887ad047b6e1430efe901965e2aeccf163a 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -547,7 +547,7 @@ free_bb (struct occurrence *occ)
    depending on the uses of x, r1, r2.  This removes one multiplication and
    allows the sqrt and division operations to execute in parallel.
    DEF_GSI is the gsi of the initial division by sqrt that defines
-   DEF (x in the example abovs).  */
+   DEF (x in the example above).  */
 
 static void
 optimize_recip_sqrt (gimple_stmt_iterator *def_gsi, tree def)
@@ -947,13 +947,13 @@ pass_cse_reciprocals::execute (function *fun)
 	      && FLOAT_TYPE_P (TREE_TYPE (def))
 	      && TREE_CODE (def) == SSA_NAME)
 	    {
+	      execute_cse_reciprocals_1 (&gsi, def);
+	      stmt = gsi_stmt (gsi);
 	      if (flag_unsafe_math_optimizations
 		  && is_gimple_assign (stmt)
 		  && !stmt_can_throw_internal (stmt)
 		  && gimple_assign_rhs_code (stmt) == RDIV_EXPR)
 		optimize_recip_sqrt (&gsi, def);
-	      else
-		execute_cse_reciprocals_1 (&gsi, def);
 	    }
 	}
Richard Biener Sept. 14, 2018, 1:07 p.m. UTC | #3
On Wed, Sep 12, 2018 at 6:25 PM Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
>
>
> On 12/09/18 15:51, Richard Biener wrote:
> > On Wed, Sep 12, 2018 at 2:56 PM Kyrill Tkachov
> > <kyrylo.tkachov@foss.arm.com> wrote:
> >> Hi all,
> >>
> >> This PR is an SSA validation error where the definition ends up not dominating a use.
> >> This appeared after my recip_sqrt optimisation. Though the optimisation doesn't get triggered
> >> as there is no sqrt involved it prevents execute_cse_reciprocals_1 from running as it's currently
> >> an either/or situation. I think execute_cse_reciprocals_1 doesn't like it when some divisions get
> >> execute_cse_reciprocals_1 called on them and some don't.
> >>
> >> This patch changes optimize_recip_sqrt to return a boolean indicating success and call execute_cse_reciprocals_1
> >> if no transform was made so that execute_cse_reciprocals_1 has a chance to optimise it as well.
> >>
> >> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >> Ok for trunk?
> > OK.  But your analysis suggests to do it the other way around - call
> > execute_cse_reciprocals_1 and if that
> > leaves a division at *gsi process it with optimize_recip_sqrt?
> >
>
> Ah, that's a shorter approach, this version implements that.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> Is this preferable?

Yes.

OK for trunk.
Richard.

> Thanks,
> Kyrill
>
> 2018-09-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>      PR tree-optimization/87259
>      PR lto/87283
>      (pass_cse_reciprocals::execute): Run optimize_recip_sqrt after
>      execute_cse_reciprocals_1 has tried transforming.
>
> 2018-09-12  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>      PR tree-optimization/87259
>      * gcc.dg/pr87259.c: New test.
>
diff mbox series

Patch

diff --git a/gcc/testsuite/gcc.dg/pr87259.c b/gcc/testsuite/gcc.dg/pr87259.c
new file mode 100644
index 0000000000000000000000000000000000000000..527a60a37adb520591ff40a52a12706aa1582068
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr87259.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+
+int a, b, c;
+int *e;
+float f;
+void h() {
+  for (int g;;) {
+    float d = b, i = 0 / f, j = a / (f * f), k, l = 0 / d;
+    c = i + j;
+    g = l;
+    e[g] = c / d * k / d;
+  }
+}
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 055b4e08e97469bac3ba7e26d61193512c234575..700ce6cecf0bc00a6d562db4c1f268df5b25f03a 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -547,9 +547,10 @@  free_bb (struct occurrence *occ)
    depending on the uses of x, r1, r2.  This removes one multiplication and
    allows the sqrt and division operations to execute in parallel.
    DEF_GSI is the gsi of the initial division by sqrt that defines
-   DEF (x in the example abovs).  */
+   DEF (x in the example abovs).
+   Return TRUE if a transformation was possible.  */
 
-static void
+static bool
 optimize_recip_sqrt (gimple_stmt_iterator *def_gsi, tree def)
 {
   gimple *use_stmt;
@@ -562,13 +563,13 @@  optimize_recip_sqrt (gimple_stmt_iterator *def_gsi, tree def)
   if (TREE_CODE (orig_sqrt_ssa_name) != SSA_NAME
       || TREE_CODE (div_rhs1) != REAL_CST
       || !real_equal (&TREE_REAL_CST (div_rhs1), &dconst1))
-    return;
+    return false;
 
   gcall *sqrt_stmt
     = dyn_cast <gcall *> (SSA_NAME_DEF_STMT (orig_sqrt_ssa_name));
 
   if (!sqrt_stmt || !gimple_call_lhs (sqrt_stmt))
-    return;
+    return false;
 
   switch (gimple_call_combined_fn (sqrt_stmt))
     {
@@ -577,7 +578,7 @@  optimize_recip_sqrt (gimple_stmt_iterator *def_gsi, tree def)
       break;
 
     default:
-      return;
+      return false;
     }
   tree a = gimple_call_arg (sqrt_stmt, 0);
 
@@ -615,15 +616,15 @@  optimize_recip_sqrt (gimple_stmt_iterator *def_gsi, tree def)
      a multiplication so make sure we don't introduce a multiplication
      on a path where there was none.  */
   if (has_other_use && !mult_on_main_path)
-    return;
+    return false;
 
   if (sqr_stmts.is_empty () && mult_stmts.is_empty ())
-    return;
+    return false;
 
   /* If x = 1.0 / sqrt (a) has uses other than those optimized here we want
      to be able to compose it from the sqr and mult cases.  */
   if (has_other_use && (sqr_stmts.is_empty () || mult_stmts.is_empty ()))
-    return;
+    return false;
 
   if (dump_file)
     {
@@ -715,6 +716,7 @@  optimize_recip_sqrt (gimple_stmt_iterator *def_gsi, tree def)
       gsi_remove (&gsi2, true);
       release_defs (stmt);
     }
+  return true;
 }
 
 /* Look for floating-point divisions among DEF's uses, and try to
@@ -950,8 +952,9 @@  pass_cse_reciprocals::execute (function *fun)
 	      if (flag_unsafe_math_optimizations
 		  && is_gimple_assign (stmt)
 		  && !stmt_can_throw_internal (stmt)
-		  && gimple_assign_rhs_code (stmt) == RDIV_EXPR)
-		optimize_recip_sqrt (&gsi, def);
+		  && gimple_assign_rhs_code (stmt) == RDIV_EXPR
+		  && optimize_recip_sqrt (&gsi, def))
+		;
 	      else
 		execute_cse_reciprocals_1 (&gsi, def);
 	    }