Message ID | f30ac263-bc14-e3af-d178-69777bedcd2c@linux.vnet.ibm.com |
---|---|
State | New |
Headers | show |
On Fri, Jun 16, 2017 at 6:10 PM, Bill Schmidt <wschmidt@linux.vnet.ibm.com> wrote: > Hi, > > PR71815 identifies a situation where SLSR misses opportunities for > PHI candidates when code hoisting is enabled (which is now on by > default). The basic problem is that SLSR currently uses an overly > simple test for profitability of the transformation. The algorithm > currently requires that the PHI basis (through which the non-local > SLSR candidate is propagated) has only one use, which is the > candidate statement. The true requirement for profitability is > that, if the candidate statement will be dead after transformation, > then so will the PHI candidate. > > This patch fixes the problem by looking at the transitive reachability > of the PHI definitions. If all paths terminate in the candidate > statement, then we know the PHI basis will go dead and we will not > make the code worse with the planned replacement. To avoid compile > time issues, path search is arbitrarily terminated at depth 10. The > new test is used throughout the cost calculation, so appears multiple > times in the code. > > Also, I've added a check to avoid replacing multiply candidates with > a stride of 1. Such a candidate is really a copy or cast statement, > and if we replace it, we will just generate a different copy or cast > statement. I noticed this with one of the test cases from the PR > while debugging the problem. > > I've updated the two test cases that were previously enabled only > with -fno-code-hoisting, removing that restriction. > > Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no > regressions. I've also tested this with SPEC cpu2006 and the > patch is performance neutral on a POWER8 box (as expected). Is > this ok for trunk? > > Thanks, > Bill > > > [gcc] > > 2016-06-16 Bill Schmidt <wschmidt@linux.vnet.ibm.com> > > * gimple-ssa-strength-reduction.c (uses_consumed_by_stmt): New > function. > (find_basis_for_candidate): Call uses_consumed_by_stmt rather than > has_single_use. > (slsr_process_phi): Likewise. > (replace_uncond_cands_and_profitable_phis): Don't replace a > multiply candidate with a stride of 1 (copy or cast). > (phi_incr_cost): Call uses_consumed_by_stmt rather than > has_single_use. > (lowest_cost_path): Likewise. > (total_savings): Likewise. > > [gcc/testsuite] > > 2016-06-16 Bill Schmidt <wschmidt@linux.vnet.ibm.com> > > * gcc.dg/tree-ssa/slsr-35.c: Remove -fno-code-hoisting workaround. > * gcc.dg/tree-ssa/slsr-36.c: Likewise. > > > Index: gcc/gimple-ssa-strength-reduction.c > =================================================================== > --- gcc/gimple-ssa-strength-reduction.c (revision 239241) > +++ gcc/gimple-ssa-strength-reduction.c (working copy) > @@ -475,6 +475,48 @@ find_phi_def (tree base) > return c->cand_num; > } > > +/* Determine whether all uses of NAME are directly or indirectly > + used by STMT. That is, we want to know whether if STMT goes > + dead, the definition of NAME also goes dead. */ > +static bool > +uses_consumed_by_stmt (tree name, gimple *stmt, unsigned recurse) use a default arg 'unsigned recurse = 0' to hide this implementation detail at users. > +{ > + gimple *use_stmt; > + imm_use_iterator iter; > + bool retval = true; > + > + FOR_EACH_IMM_USE_STMT (use_stmt, iter, name) > + { > + if (use_stmt == stmt || is_gimple_debug (use_stmt)) > + continue; > + > + if (!is_gimple_assign (use_stmt)) > + { > + retval = false; > + BREAK_FROM_IMM_USE_STMT (iter); > + } > + > + /* Limit recursion. */ > + if (recurse >= 10) > + { > + retval = false; > + BREAK_FROM_IMM_USE_STMT (iter); > + } Put this limit right before the recursion. > + tree next_name = gimple_get_lhs (use_stmt); > + if (!next_name || !is_gimple_reg (next_name)) > + { > + retval = false; > + BREAK_FROM_IMM_USE_STMT (iter); > + } > + > + if (uses_consumed_by_stmt (next_name, stmt, recurse + 1)) > + continue; So this doesn't change dependent on the result which means you likely meant if (! uses....) { retval = false; BREAK... } which possibly also invalidates your testing? The whole thing is probably easier to optimize if you merge the ifs that break into one. Richard. > + } > + > + return retval; > +} > + > /* Helper routine for find_basis_for_candidate. May be called twice: > once for the candidate's base expr, and optionally again either for > the candidate's phi definition or for a CAND_REF's alternative base > @@ -550,7 +592,8 @@ find_basis_for_candidate (slsr_cand_t c) > > /* If we found a hidden basis, estimate additional dead-code > savings if the phi and its feeding statements can be removed. */ > - if (basis && has_single_use (gimple_phi_result (phi_cand->cand_stmt))) > + tree feeding_var = gimple_phi_result (phi_cand->cand_stmt); > + if (basis && uses_consumed_by_stmt (feeding_var, c->cand_stmt, 0)) > c->dead_savings += phi_cand->dead_savings; > } > } > @@ -777,7 +820,7 @@ slsr_process_phi (gphi *phi, bool speed) > > /* Gather potential dead code savings if the phi statement > can be removed later on. */ > - if (has_single_use (arg)) > + if (uses_consumed_by_stmt (arg, phi, 0)) > { > if (gimple_code (arg_stmt) == GIMPLE_PHI) > savings += arg_cand->dead_savings; > @@ -2384,7 +2427,9 @@ replace_uncond_cands_and_profitable_phis (slsr_can > { > if (phi_dependent_cand_p (c)) > { > - if (c->kind == CAND_MULT) > + /* A multiply candidate with a stride of 1 is just an artifice > + of a copy or cast; there is no value in replacing it. */ > + if (c->kind == CAND_MULT && wi::to_widest (c->stride) != 1) > { > /* A candidate dependent upon a phi will replace a multiply by > a constant with an add, and will insert at most one add for > @@ -2630,8 +2675,9 @@ phi_incr_cost (slsr_cand_t c, const widest_int &in > if (gimple_code (arg_def) == GIMPLE_PHI) > { > int feeding_savings = 0; > + tree feeding_var = gimple_phi_result (arg_def); > cost += phi_incr_cost (c, incr, arg_def, &feeding_savings); > - if (has_single_use (gimple_phi_result (arg_def))) > + if (uses_consumed_by_stmt (feeding_var, phi, 0)) > *savings += feeding_savings; > } > else > @@ -2644,7 +2690,7 @@ phi_incr_cost (slsr_cand_t c, const widest_int &in > tree basis_lhs = gimple_assign_lhs (basis->cand_stmt); > tree lhs = gimple_assign_lhs (arg_cand->cand_stmt); > cost += add_cost (true, TYPE_MODE (TREE_TYPE (basis_lhs))); > - if (has_single_use (lhs)) > + if (uses_consumed_by_stmt (lhs, phi, 0)) > *savings += stmt_cost (arg_cand->cand_stmt, true); > } > } > @@ -2721,7 +2767,7 @@ lowest_cost_path (int cost_in, int repl_savings, s > gimple *phi = lookup_cand (c->def_phi)->cand_stmt; > local_cost += phi_incr_cost (c, incr, phi, &savings); > > - if (has_single_use (gimple_phi_result (phi))) > + if (uses_consumed_by_stmt (gimple_phi_result (phi), c->cand_stmt, 0)) > local_cost -= savings; > } > > @@ -2765,7 +2811,7 @@ total_savings (int repl_savings, slsr_cand_t c, co > gimple *phi = lookup_cand (c->def_phi)->cand_stmt; > savings -= phi_incr_cost (c, incr, phi, &phi_savings); > > - if (has_single_use (gimple_phi_result (phi))) > + if (uses_consumed_by_stmt (gimple_phi_result (phi), c->cand_stmt, 0)) > savings += phi_savings; > } > > Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-35.c > =================================================================== > --- gcc/testsuite/gcc.dg/tree-ssa/slsr-35.c (revision 239241) > +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-35.c (working copy) > @@ -3,7 +3,7 @@ > phi has an argument which is a parameter. */ > > /* { dg-do compile } */ > -/* { dg-options "-O3 -fno-code-hoisting -fdump-tree-optimized" } */ > +/* { dg-options "-O3 -fdump-tree-optimized" } */ > > int > f (int c, int i) > Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-36.c > =================================================================== > --- gcc/testsuite/gcc.dg/tree-ssa/slsr-36.c (revision 239241) > +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-36.c (working copy) > @@ -3,7 +3,7 @@ > phi has an argument which is a parameter. */ > > /* { dg-do compile } */ > -/* { dg-options "-O3 -fno-code-hoisting -fdump-tree-optimized" } */ > +/* { dg-options "-O3 -fdump-tree-optimized" } */ > > int > f (int s, int c, int i) >
On Jun 20, 2017, at 6:23 AM, Richard Biener <richard.guenther@gmail.com> wrote: > > On Fri, Jun 16, 2017 at 6:10 PM, Bill Schmidt > <wschmidt@linux.vnet.ibm.com> wrote: >> Hi, >> >> PR71815 identifies a situation where SLSR misses opportunities for >> PHI candidates when code hoisting is enabled (which is now on by >> default). The basic problem is that SLSR currently uses an overly >> simple test for profitability of the transformation. The algorithm >> currently requires that the PHI basis (through which the non-local >> SLSR candidate is propagated) has only one use, which is the >> candidate statement. The true requirement for profitability is >> that, if the candidate statement will be dead after transformation, >> then so will the PHI candidate. >> >> This patch fixes the problem by looking at the transitive reachability >> of the PHI definitions. If all paths terminate in the candidate >> statement, then we know the PHI basis will go dead and we will not >> make the code worse with the planned replacement. To avoid compile >> time issues, path search is arbitrarily terminated at depth 10. The >> new test is used throughout the cost calculation, so appears multiple >> times in the code. >> >> Also, I've added a check to avoid replacing multiply candidates with >> a stride of 1. Such a candidate is really a copy or cast statement, >> and if we replace it, we will just generate a different copy or cast >> statement. I noticed this with one of the test cases from the PR >> while debugging the problem. >> >> I've updated the two test cases that were previously enabled only >> with -fno-code-hoisting, removing that restriction. >> >> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no >> regressions. I've also tested this with SPEC cpu2006 and the >> patch is performance neutral on a POWER8 box (as expected). Is >> this ok for trunk? >> >> Thanks, >> Bill >> >> >> [gcc] >> >> 2016-06-16 Bill Schmidt <wschmidt@linux.vnet.ibm.com> >> >> * gimple-ssa-strength-reduction.c (uses_consumed_by_stmt): New >> function. >> (find_basis_for_candidate): Call uses_consumed_by_stmt rather than >> has_single_use. >> (slsr_process_phi): Likewise. >> (replace_uncond_cands_and_profitable_phis): Don't replace a >> multiply candidate with a stride of 1 (copy or cast). >> (phi_incr_cost): Call uses_consumed_by_stmt rather than >> has_single_use. >> (lowest_cost_path): Likewise. >> (total_savings): Likewise. >> >> [gcc/testsuite] >> >> 2016-06-16 Bill Schmidt <wschmidt@linux.vnet.ibm.com> >> >> * gcc.dg/tree-ssa/slsr-35.c: Remove -fno-code-hoisting workaround. >> * gcc.dg/tree-ssa/slsr-36.c: Likewise. >> >> >> Index: gcc/gimple-ssa-strength-reduction.c >> =================================================================== >> --- gcc/gimple-ssa-strength-reduction.c (revision 239241) >> +++ gcc/gimple-ssa-strength-reduction.c (working copy) >> @@ -475,6 +475,48 @@ find_phi_def (tree base) >> return c->cand_num; >> } >> >> +/* Determine whether all uses of NAME are directly or indirectly >> + used by STMT. That is, we want to know whether if STMT goes >> + dead, the definition of NAME also goes dead. */ >> +static bool >> +uses_consumed_by_stmt (tree name, gimple *stmt, unsigned recurse) > > use a default arg 'unsigned recurse = 0' to hide this implementation > detail at users. Good idea, thanks. > >> +{ >> + gimple *use_stmt; >> + imm_use_iterator iter; >> + bool retval = true; >> + >> + FOR_EACH_IMM_USE_STMT (use_stmt, iter, name) >> + { >> + if (use_stmt == stmt || is_gimple_debug (use_stmt)) >> + continue; >> + >> + if (!is_gimple_assign (use_stmt)) >> + { >> + retval = false; >> + BREAK_FROM_IMM_USE_STMT (iter); >> + } >> + >> + /* Limit recursion. */ >> + if (recurse >= 10) >> + { >> + retval = false; >> + BREAK_FROM_IMM_USE_STMT (iter); >> + } > > Put this limit right before the recursion. > >> + tree next_name = gimple_get_lhs (use_stmt); >> + if (!next_name || !is_gimple_reg (next_name)) >> + { >> + retval = false; >> + BREAK_FROM_IMM_USE_STMT (iter); >> + } >> + >> + if (uses_consumed_by_stmt (next_name, stmt, recurse + 1)) >> + continue; > > So this doesn't change dependent on the result which means you likely meant > > if (! uses....) > { > retval = false; > BREAK... > } > > which possibly also invalidates your testing? Grumble. Can't believe I did that. Yep, will respin. > > The whole thing is probably easier to optimize if you merge the ifs > that break into one. Will do! Thanks, Richard! Bill > > Richard. > >> + } >> + >> + return retval; >> +} >> + >> /* Helper routine for find_basis_for_candidate. May be called twice: >> once for the candidate's base expr, and optionally again either for >> the candidate's phi definition or for a CAND_REF's alternative base >> @@ -550,7 +592,8 @@ find_basis_for_candidate (slsr_cand_t c) >> >> /* If we found a hidden basis, estimate additional dead-code >> savings if the phi and its feeding statements can be removed. */ >> - if (basis && has_single_use (gimple_phi_result (phi_cand->cand_stmt))) >> + tree feeding_var = gimple_phi_result (phi_cand->cand_stmt); >> + if (basis && uses_consumed_by_stmt (feeding_var, c->cand_stmt, 0)) >> c->dead_savings += phi_cand->dead_savings; >> } >> } >> @@ -777,7 +820,7 @@ slsr_process_phi (gphi *phi, bool speed) >> >> /* Gather potential dead code savings if the phi statement >> can be removed later on. */ >> - if (has_single_use (arg)) >> + if (uses_consumed_by_stmt (arg, phi, 0)) >> { >> if (gimple_code (arg_stmt) == GIMPLE_PHI) >> savings += arg_cand->dead_savings; >> @@ -2384,7 +2427,9 @@ replace_uncond_cands_and_profitable_phis (slsr_can >> { >> if (phi_dependent_cand_p (c)) >> { >> - if (c->kind == CAND_MULT) >> + /* A multiply candidate with a stride of 1 is just an artifice >> + of a copy or cast; there is no value in replacing it. */ >> + if (c->kind == CAND_MULT && wi::to_widest (c->stride) != 1) >> { >> /* A candidate dependent upon a phi will replace a multiply by >> a constant with an add, and will insert at most one add for >> @@ -2630,8 +2675,9 @@ phi_incr_cost (slsr_cand_t c, const widest_int &in >> if (gimple_code (arg_def) == GIMPLE_PHI) >> { >> int feeding_savings = 0; >> + tree feeding_var = gimple_phi_result (arg_def); >> cost += phi_incr_cost (c, incr, arg_def, &feeding_savings); >> - if (has_single_use (gimple_phi_result (arg_def))) >> + if (uses_consumed_by_stmt (feeding_var, phi, 0)) >> *savings += feeding_savings; >> } >> else >> @@ -2644,7 +2690,7 @@ phi_incr_cost (slsr_cand_t c, const widest_int &in >> tree basis_lhs = gimple_assign_lhs (basis->cand_stmt); >> tree lhs = gimple_assign_lhs (arg_cand->cand_stmt); >> cost += add_cost (true, TYPE_MODE (TREE_TYPE (basis_lhs))); >> - if (has_single_use (lhs)) >> + if (uses_consumed_by_stmt (lhs, phi, 0)) >> *savings += stmt_cost (arg_cand->cand_stmt, true); >> } >> } >> @@ -2721,7 +2767,7 @@ lowest_cost_path (int cost_in, int repl_savings, s >> gimple *phi = lookup_cand (c->def_phi)->cand_stmt; >> local_cost += phi_incr_cost (c, incr, phi, &savings); >> >> - if (has_single_use (gimple_phi_result (phi))) >> + if (uses_consumed_by_stmt (gimple_phi_result (phi), c->cand_stmt, 0)) >> local_cost -= savings; >> } >> >> @@ -2765,7 +2811,7 @@ total_savings (int repl_savings, slsr_cand_t c, co >> gimple *phi = lookup_cand (c->def_phi)->cand_stmt; >> savings -= phi_incr_cost (c, incr, phi, &phi_savings); >> >> - if (has_single_use (gimple_phi_result (phi))) >> + if (uses_consumed_by_stmt (gimple_phi_result (phi), c->cand_stmt, 0)) >> savings += phi_savings; >> } >> >> Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-35.c >> =================================================================== >> --- gcc/testsuite/gcc.dg/tree-ssa/slsr-35.c (revision 239241) >> +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-35.c (working copy) >> @@ -3,7 +3,7 @@ >> phi has an argument which is a parameter. */ >> >> /* { dg-do compile } */ >> -/* { dg-options "-O3 -fno-code-hoisting -fdump-tree-optimized" } */ >> +/* { dg-options "-O3 -fdump-tree-optimized" } */ >> >> int >> f (int c, int i) >> Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-36.c >> =================================================================== >> --- gcc/testsuite/gcc.dg/tree-ssa/slsr-36.c (revision 239241) >> +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-36.c (working copy) >> @@ -3,7 +3,7 @@ >> phi has an argument which is a parameter. */ >> >> /* { dg-do compile } */ >> -/* { dg-options "-O3 -fno-code-hoisting -fdump-tree-optimized" } */ >> +/* { dg-options "-O3 -fdump-tree-optimized" } */ >> >> int >> f (int s, int c, int i)
Index: gcc/gimple-ssa-strength-reduction.c =================================================================== --- gcc/gimple-ssa-strength-reduction.c (revision 239241) +++ gcc/gimple-ssa-strength-reduction.c (working copy) @@ -475,6 +475,48 @@ find_phi_def (tree base) return c->cand_num; } +/* Determine whether all uses of NAME are directly or indirectly + used by STMT. That is, we want to know whether if STMT goes + dead, the definition of NAME also goes dead. */ +static bool +uses_consumed_by_stmt (tree name, gimple *stmt, unsigned recurse) +{ + gimple *use_stmt; + imm_use_iterator iter; + bool retval = true; + + FOR_EACH_IMM_USE_STMT (use_stmt, iter, name) + { + if (use_stmt == stmt || is_gimple_debug (use_stmt)) + continue; + + if (!is_gimple_assign (use_stmt)) + { + retval = false; + BREAK_FROM_IMM_USE_STMT (iter); + } + + /* Limit recursion. */ + if (recurse >= 10) + { + retval = false; + BREAK_FROM_IMM_USE_STMT (iter); + } + + tree next_name = gimple_get_lhs (use_stmt); + if (!next_name || !is_gimple_reg (next_name)) + { + retval = false; + BREAK_FROM_IMM_USE_STMT (iter); + } + + if (uses_consumed_by_stmt (next_name, stmt, recurse + 1)) + continue; + } + + return retval; +} + /* Helper routine for find_basis_for_candidate. May be called twice: once for the candidate's base expr, and optionally again either for the candidate's phi definition or for a CAND_REF's alternative base @@ -550,7 +592,8 @@ find_basis_for_candidate (slsr_cand_t c) /* If we found a hidden basis, estimate additional dead-code savings if the phi and its feeding statements can be removed. */ - if (basis && has_single_use (gimple_phi_result (phi_cand->cand_stmt))) + tree feeding_var = gimple_phi_result (phi_cand->cand_stmt); + if (basis && uses_consumed_by_stmt (feeding_var, c->cand_stmt, 0)) c->dead_savings += phi_cand->dead_savings; } } @@ -777,7 +820,7 @@ slsr_process_phi (gphi *phi, bool speed) /* Gather potential dead code savings if the phi statement can be removed later on. */ - if (has_single_use (arg)) + if (uses_consumed_by_stmt (arg, phi, 0)) { if (gimple_code (arg_stmt) == GIMPLE_PHI) savings += arg_cand->dead_savings; @@ -2384,7 +2427,9 @@ replace_uncond_cands_and_profitable_phis (slsr_can { if (phi_dependent_cand_p (c)) { - if (c->kind == CAND_MULT) + /* A multiply candidate with a stride of 1 is just an artifice + of a copy or cast; there is no value in replacing it. */ + if (c->kind == CAND_MULT && wi::to_widest (c->stride) != 1) { /* A candidate dependent upon a phi will replace a multiply by a constant with an add, and will insert at most one add for @@ -2630,8 +2675,9 @@ phi_incr_cost (slsr_cand_t c, const widest_int &in if (gimple_code (arg_def) == GIMPLE_PHI) { int feeding_savings = 0; + tree feeding_var = gimple_phi_result (arg_def); cost += phi_incr_cost (c, incr, arg_def, &feeding_savings); - if (has_single_use (gimple_phi_result (arg_def))) + if (uses_consumed_by_stmt (feeding_var, phi, 0)) *savings += feeding_savings; } else @@ -2644,7 +2690,7 @@ phi_incr_cost (slsr_cand_t c, const widest_int &in tree basis_lhs = gimple_assign_lhs (basis->cand_stmt); tree lhs = gimple_assign_lhs (arg_cand->cand_stmt); cost += add_cost (true, TYPE_MODE (TREE_TYPE (basis_lhs))); - if (has_single_use (lhs)) + if (uses_consumed_by_stmt (lhs, phi, 0)) *savings += stmt_cost (arg_cand->cand_stmt, true); } } @@ -2721,7 +2767,7 @@ lowest_cost_path (int cost_in, int repl_savings, s gimple *phi = lookup_cand (c->def_phi)->cand_stmt; local_cost += phi_incr_cost (c, incr, phi, &savings); - if (has_single_use (gimple_phi_result (phi))) + if (uses_consumed_by_stmt (gimple_phi_result (phi), c->cand_stmt, 0)) local_cost -= savings; } @@ -2765,7 +2811,7 @@ total_savings (int repl_savings, slsr_cand_t c, co gimple *phi = lookup_cand (c->def_phi)->cand_stmt; savings -= phi_incr_cost (c, incr, phi, &phi_savings); - if (has_single_use (gimple_phi_result (phi))) + if (uses_consumed_by_stmt (gimple_phi_result (phi), c->cand_stmt, 0)) savings += phi_savings; } Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-35.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/slsr-35.c (revision 239241) +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-35.c (working copy) @@ -3,7 +3,7 @@ phi has an argument which is a parameter. */ /* { dg-do compile } */ -/* { dg-options "-O3 -fno-code-hoisting -fdump-tree-optimized" } */ +/* { dg-options "-O3 -fdump-tree-optimized" } */ int f (int c, int i) Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-36.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/slsr-36.c (revision 239241) +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-36.c (working copy) @@ -3,7 +3,7 @@ phi has an argument which is a parameter. */ /* { dg-do compile } */ -/* { dg-options "-O3 -fno-code-hoisting -fdump-tree-optimized" } */ +/* { dg-options "-O3 -fdump-tree-optimized" } */ int f (int s, int c, int i)