Message ID | orpmjyfbnd.fsf@lxoliva.fsfla.org |
---|---|
State | New |
Headers | show |
Series | [PR105665] ivopts: check defs of names in base for undefs | expand |
On Sat, May 28, 2022 at 7:52 AM Alexandre Oliva via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > > The patch for PR 100810 tested for undefined SSA_NAMEs appearing > directly in the base expression of the potential IV candidate, but > that's not enough. The testcase for PR105665 shows an undefined > SSA_NAME has the same ill effect if it's referenced as an PHI_NODE arg > in the referenced SSA_NAME. The variant of that test shows it can be > further removed from the referenced SSA_NAME. > > To avoid deep recursion, precompute SSA_NAMEs deemed unsuitable > candidates, so that we can skip them with a flag test. > > Regstrapped on x86_64-linux-gnu. Ok to install? I don't think you can rely on TREE_VISITED not set at the start of the pass (and you don't clear it either). So it's probably better to use a sbitmap. I also wonder how you decide that tracking PHIs with (one) uninit arg is "enough"? The previous patch indeed is also only somewhat a hack because the GIMPLE IL doesn't stabilize the "value" of an SSA default def. Is it important which edge of the PHI the undef appears in? I presume the testcase might have it on the loop entry edge? I presume only PHIs in loop headers are to be considered? Richard. > > for gcc/ChangeLog > > PR tree-optimization/105665 > PR tree-optimization/100810 > * tree-ssa-loop-ivopts.cc (mark_ssa_undefs): Precompute > unsuitability of SSA_NAMEs in TREE_VISITED. > (find_ssa_undef): Check the precomputed flag. > (tree_ssa_iv_optimize): Call mark_ssa_undefs. > > for gcc/testsuite/ChangeLog > > PR tree-optimization/105665 > PR tree-optimization/100810 > * gcc.dg/torture/pr105665.c: New. > --- > gcc/testsuite/gcc.dg/torture/pr105665.c | 20 ++++++++++ > gcc/tree-ssa-loop-ivopts.cc | 62 ++++++++++++++++++++++++++++++- > 2 files changed, 80 insertions(+), 2 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/torture/pr105665.c > > diff --git a/gcc/testsuite/gcc.dg/torture/pr105665.c b/gcc/testsuite/gcc.dg/torture/pr105665.c > new file mode 100644 > index 0000000000000..34cfc65843495 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/torture/pr105665.c > @@ -0,0 +1,20 @@ > +/* { dg-do run } */ > + > +int a, b, c[1], d[2], *e = c; > +int main() { > + int f = 0; > + for (; b < 2; b++) { > + int g; > + if (f) > + g++, b = 40; > + a = d[b * b]; > + for (f = 0; f < 3; f++) { > + if (e) > + break; > + g--; > + if (a) > + a = g; > + } > + } > + return 0; > +} > diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc > index 81b536f930415..d8200f2a53b21 100644 > --- a/gcc/tree-ssa-loop-ivopts.cc > +++ b/gcc/tree-ssa-loop-ivopts.cc > @@ -3071,13 +3071,70 @@ get_loop_invariant_expr (struct ivopts_data *data, tree inv_expr) > return *slot; > } > > -/* Find the first undefined SSA name in *TP. */ > +/* Mark as TREe_VISITED any SSA_NAMEs that are unsuitable as ivopts > + candidates for potentially involving undefined behavior. */ > + > +static void > +mark_ssa_undefs (void) > +{ > + auto_vec<tree> queue; > + > + unsigned int i; > + tree var; > + FOR_EACH_SSA_NAME (i, var, cfun) > + { > + if (SSA_NAME_IS_VIRTUAL_OPERAND (var) > + || ssa_defined_default_def_p (var) > + || !ssa_undefined_value_p (var, false)) > + TREE_VISITED (var) = false; > + else > + { > + TREE_VISITED (var) = true; > + queue.safe_push (var); > + if (dump_file) > + fprintf (dump_file, "marking _%i as undef\n", > + SSA_NAME_VERSION (var)); > + } > + } > + > + while (!queue.is_empty ()) > + { > + var = queue.pop (); > + gimple *stmt; > + imm_use_iterator iter; > + FOR_EACH_IMM_USE_STMT (stmt, iter, var) > + { > + if (is_gimple_call (stmt) || is_a <gasm *> (stmt)) > + continue; > + > + def_operand_p defvar; > + ssa_op_iter diter; > + FOR_EACH_PHI_OR_STMT_DEF (defvar, stmt, diter, SSA_OP_DEF) > + { > + gcc_checking_assert (is_gimple_assign (stmt) > + || is_a <gphi *> (stmt)); > + tree def = DEF_FROM_PTR (defvar); > + if (TREE_VISITED (def)) > + continue; > + TREE_VISITED (def) = true; > + queue.safe_push (def); > + if (dump_file) > + fprintf (dump_file, "Marking _%i as undef because of _%i\n", > + SSA_NAME_VERSION (def), SSA_NAME_VERSION (var)); > + } > + } > + } > +} > + > +/* Return *TP if it is an SSA_NAME marked with TREE_VISITED, i.e., as > + unsuitable as ivopts candidates for potentially involving undefined > + behavior. */ > > static tree > find_ssa_undef (tree *tp, int *walk_subtrees, void *) > { > if (TREE_CODE (*tp) == SSA_NAME > - && ssa_undefined_value_p (*tp, false)) > + && TREE_VISITED (*tp)) > return *tp; > if (!EXPR_P (*tp)) > *walk_subtrees = 0; > @@ -8192,6 +8249,7 @@ tree_ssa_iv_optimize (void) > auto_bitmap toremove; > > tree_ssa_iv_optimize_init (&data); > + mark_ssa_undefs (); > > /* Optimize the loops starting with the innermost ones. */ > for (auto loop : loops_list (cfun, LI_FROM_INNERMOST)) > > > -- > Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ > Free Software Activist GNU Toolchain Engineer > Disinformation flourishes because many people care deeply about injustice > but very few check the facts. Ask me about <https://stallmansupport.org>
On May 30, 2022, Richard Biener <richard.guenther@gmail.com> wrote: > I don't think you can rely on TREE_VISITED not set at the start of the > pass (and you don't clear it either). I don't clear it, but I loop over all SSA names and set TREE_VISITED to either true or false, so that's covered. I even had a test patch that checked that TREE_VISITED remains unchanged and still matched the expected value, with a recursive verification. I could switch to an sbitmap if that's preferred, though. > I also wonder how you decide that tracking PHIs with (one) uninit arg > is "enough"? It's a conservative assumption, granted. One could imagine cases in which an uninit def is never actually used, say because of conditionals forced by external circumstances the compiler cannot possibly know about. But then, just as this sort of bug shows, sometimes even when an uninit is not actually used, the fact that it is uninit and thus undefined may end up percolating onto stuff that is actually used, so I figured we'd be better off leaving alone whatever is potentially derived from an uninit value. > Is it important which edge of the PHI the undef appears in? At some point, I added recursion to find_ssa_undef, at PHI nodes and assignments, and pondered whether to recurse at PHI nodes only for defs that were "earlier" ones, rather than coming from back edges. I ended up reasoning that back edges would affect step and rule out an IV candidate even sooner. But the forward propagation of maybe-undef obviated that reasoning. Now, almost tautologically, if circumstances are such that the compiler could only tell that an ssa name is defined with external knowledge, then, since such external knowledge is not available to the compiler, it has to assume that the ssa name may be undefined. > I presume the testcase might have it on the loop entry edge? The original testcase did. The modified one (the added increment) shows it can be an earlier edge that has the maybe-undef name. > I presume only PHIs in loop headers are to be considered? As in the modified testcase, earlier PHIs that are entirely outside the loop can still trigger the bug. Adding more increments of g guarded by conditionals involving other global variables pushes the undef ssa name further and further away from the inner loop, while still rendering g an unsuitable IV. >> +int a, b, c[1], d[2], *e = c; >> +int main() { >> + int f = 0; >> + for (; b < 2; b++) { >> + int g; >> + if (f) >> + g++, b = 40; >> + a = d[b * b]; >> + for (f = 0; f < 3; f++) { >> + if (e) >> + break; >> + g--; >> + if (a) >> + a = g; >> + } >> + } >> + return 0; >> +}
On Tue, May 31, 2022 at 3:27 PM Alexandre Oliva <oliva@adacore.com> wrote: > > On May 30, 2022, Richard Biener <richard.guenther@gmail.com> wrote: > > > I don't think you can rely on TREE_VISITED not set at the start of the > > pass (and you don't clear it either). > > I don't clear it, but I loop over all SSA names and set TREE_VISITED to > either true or false, so that's covered. > > I even had a test patch that checked that TREE_VISITED remains unchanged > and still matched the expected value, with a recursive verification. > > I could switch to an sbitmap if that's preferred, though. > > > I also wonder how you decide that tracking PHIs with (one) uninit arg > > is "enough"? > > It's a conservative assumption, granted. One could imagine cases in > which an uninit def is never actually used, say because of conditionals > forced by external circumstances the compiler cannot possibly know > about. But then, just as this sort of bug shows, sometimes even when an > uninit is not actually used, the fact that it is uninit and thus > undefined may end up percolating onto stuff that is actually used, so I > figured we'd be better off leaving alone whatever is potentially derived > from an uninit value. > > > Is it important which edge of the PHI the undef appears in? > > At some point, I added recursion to find_ssa_undef, at PHI nodes and > assignments, and pondered whether to recurse at PHI nodes only for defs > that were "earlier" ones, rather than coming from back edges. I ended > up reasoning that back edges would affect step and rule out an IV > candidate even sooner. But the forward propagation of maybe-undef > obviated that reasoning. Now, almost tautologically, if circumstances > are such that the compiler could only tell that an ssa name is defined > with external knowledge, then, since such external knowledge is not > available to the compiler, it has to assume that the ssa name may be > undefined. > > > I presume the testcase might have it on the loop entry edge? > > The original testcase did. The modified one (the added increment) shows > it can be an earlier edge that has the maybe-undef name. > > > I presume only PHIs in loop headers are to be considered? > > As in the modified testcase, earlier PHIs that are entirely outside the > loop can still trigger the bug. Adding more increments of g guarded by > conditionals involving other global variables pushes the undef ssa name > further and further away from the inner loop, while still rendering g an > unsuitable IV. So for example void foo (int flag, int init, int n, int *p) { int i; if (flag) i = init; for (; i < n; ++i) *(p++) = 1; } will now not consider 'i - i-on-entry' as an IV to express *(p++) = 1. But void foo (int flag, int init, int n, int *p) { int i; if (flag) i = init; i++; for (; i < n; ++i) *(p++) = 1; } still would (the propagation stops at i++). That said - I wonder if we want to compute a conservative set of 'must-initialized' here or to say, do we understand the failure mode good enough to say that a must-def (even if from an SSA name without a definition) is good enough to avoid the issues we are seeing? One would argue that my example above invokes undefined behavior if (!flag), but IIRC the cases in the PRs we talk about are not but IVOPTs with its IV choice exposes undefined behavior - orignially by relying on undef - undef being zero. That said, the contains-undef thing tries to avoid rewriting expressions with terms that possibly contain undefs which means if we want to strenthen it then we look for must-defined (currently it's must-undefined)? Richard. > >> +int a, b, c[1], d[2], *e = c; > >> +int main() { > >> + int f = 0; > >> + for (; b < 2; b++) { > >> + int g; > >> + if (f) > >> + g++, b = 40; > >> + a = d[b * b]; > >> + for (f = 0; f < 3; f++) { > >> + if (e) > >> + break; > >> + g--; > >> + if (a) > >> + a = g; > >> + } > >> + } > >> + return 0; > >> +} > > -- > Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ > Free Software Activist GNU Toolchain Engineer > Disinformation flourishes because many people care deeply about injustice > but very few check the facts. Ask me about <https://stallmansupport.org>
On Jun 1, 2022, Richard Biener <richard.guenther@gmail.com> wrote: > On Tue, May 31, 2022 at 3:27 PM Alexandre Oliva <oliva@adacore.com> wrote: > int i; > if (flag) > i = init; > i++; > still would (the propagation stops at i++). Uh, are you sure? That doesn't sound right. I meant for the propagation to affect the incremented i as well, and any users thereof: the incremented i is maybe-undefined, since its computation involves another maybe-undefined value. > do we understand the failure mode good enough to say that > a must-def (even if from an SSA name without a definition) is good > enough to avoid the issues we are seeing? A must-def computed from a maybe-undef doesn't protect us from the failure. I assumed the failure mode was understood well enough to make directly-visible undefined SSA_NAMEs problematic, and, given the finding that indirectly-visible ones were also problematic, even with multiple-steps removed, I figured other optimizations such as jump threading could turn indirectly-visible undef names into directly-visible ones, or that other passes could take advantage of indirectly-visible undefinedness leading to potential undefined behavior, to the point that we ought to avoid that. > One would argue that my example above invokes undefined behavior > if (!flag), but IIRC the cases in the PRs we talk about are not but > IVOPTs with its IV choice exposes undefined behavior - orignially > by relying on undef - undef being zero. *nod*. Perhaps we could refrain from propagating through assignments, like the i++ increment above, rather than PHIs after the conditional increment in my modified testcase, on the grounds that, if such non-PHI assignments exercised, then we can assume any of its operands are defined, because otherwise undefined behavior would have been invoked. I.e., at least for ivopts purposes, we could propagate maybe-undefined down PHI nodes only. > That said, the contains-undef thing tries to avoid rewriting expressions > with terms that possibly contain undefs which means if we want to > strenthen it then we look for must-defined (currently it's must-undefined)? I went for must-defined in the patch, but by computing its negated form, maybe-undefined. Now I'm thinking we can go for an even stricter predicate to disable the optimization: if a non-PHI use of a maybe-undefined dominates the loop, then we can still perform the optimization: >> >> + int g; >> >> + if (f) >> >> + g++, b = 40; y = g+2; [loop] The mere use of g before the loop requires it to be defined (even though that's impossible above: either path carries the uninitialized value, incremented or not), so we can proceed with the optimization under the assumption that, after undefined behavior, anything goes. This would be more useful for the case you showed, in which there's a conditional initialization followed by an unconditional use. The requirement of no undefined behavior implies the conditional guarding the initializer must hold, so the optimization can be performed. WDYT?
On Jun 1, 2022, Alexandre Oliva <oliva@adacore.com> wrote: > Now I'm thinking we can go for an even stricter predicate to disable > the optimization: if a non-PHI use of a maybe-undefined dominates the > loop, then we can still perform the optimization: Here it is. [PR105665] ivopts: check defs of names in base for undefs From: Alexandre Oliva <oliva@adacore.com> The patch for PR 100810 tested for undefined SSA_NAMEs appearing directly in the base expression of the potential IV candidate, but that's not enough. The testcase for PR105665 shows an undefined SSA_NAME has the same ill effect if it's referenced as an PHI_NODE arg in the referenced SSA_NAME. The variant of that test shows it can be further removed from the referenced SSA_NAME. To avoid deep recursion, precompute maybe-undefined SSA_NAMEs: start from known-undefined nonvirtual default defs, and propagate them to any PHI nodes reached by a maybe-undefined arg, as long as there aren't intervening non-PHI uses, that would imply the maybe-undefined name must be defined at that point, otherwise it would invoke undefined behavior. Also test for intervening non-PHI uses of DEFs in the base expr. The test for intervening uses implemented herein relies on dominance; this could be further extended, regarding conditional uses in every path leading to a point as an unconditional use dominating that point, but I haven't implemented that. for gcc/ChangeLog PR tree-optimization/105665 PR tree-optimization/100810 * tree-ssa-loop-ivopts.cc (ssa_name_maybe_undef_p, ssa_name_set_maybe_undef): New. (ssa_name_any_use_dominates_bb_p, mark_ssa_maybe_undefs): New. (find_ssa_undef): Check precomputed flag and intervening uses. (tree_ssa_iv_optimize): Call mark_ssa_maybe_undefs. for gcc/testsuite/ChangeLog PR tree-optimization/105665 PR tree-optimization/100810 * gcc.dg/torture/pr105665.c: New. --- gcc/testsuite/gcc.dg/torture/pr105665.c | 20 +++++ gcc/tree-ssa-loop-ivopts.cc | 124 ++++++++++++++++++++++++++++++- 2 files changed, 140 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/torture/pr105665.c diff --git a/gcc/testsuite/gcc.dg/torture/pr105665.c b/gcc/testsuite/gcc.dg/torture/pr105665.c new file mode 100644 index 0000000000000..34cfc65843495 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr105665.c @@ -0,0 +1,20 @@ +/* { dg-do run } */ + +int a, b, c[1], d[2], *e = c; +int main() { + int f = 0; + for (; b < 2; b++) { + int g; + if (f) + g++, b = 40; + a = d[b * b]; + for (f = 0; f < 3; f++) { + if (e) + break; + g--; + if (a) + a = g; + } + } + return 0; +} diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc index 81b536f930415..f20a985d7ca22 100644 --- a/gcc/tree-ssa-loop-ivopts.cc +++ b/gcc/tree-ssa-loop-ivopts.cc @@ -3071,13 +3071,128 @@ get_loop_invariant_expr (struct ivopts_data *data, tree inv_expr) return *slot; } -/* Find the first undefined SSA name in *TP. */ +/* Return TRUE iff VAR is marked as maybe-undefined. See + mark_ssa_maybe_undefs. */ + +static inline bool +ssa_name_maybe_undef_p (tree var) +{ + gcc_checking_assert (TREE_CODE (var) == SSA_NAME); + return TREE_VISITED (var); +} + +/* Set (or clear, depending on VALUE) VAR's maybe-undefined mark. */ + +static inline void +ssa_name_set_maybe_undef (tree var, bool value = true) +{ + gcc_checking_assert (TREE_CODE (var) == SSA_NAME); + TREE_VISITED (var) = value; +} + +/* Return TRUE iff there are any non-PHI uses of VAR that dominate the + end of BB. If we return TRUE and BB is a loop header, then VAR we + be assumed to be defined within the loop, even if it is marked as + maybe-undefined. */ + +static inline bool +ssa_name_any_use_dominates_bb_p (tree var, basic_block bb) +{ + imm_use_iterator iter; + use_operand_p use_p; + FOR_EACH_IMM_USE_FAST (use_p, iter, var) + { + if (is_a <gphi *> (USE_STMT (use_p))) + continue; + basic_block dombb = gimple_bb (USE_STMT (use_p)); + if (dominated_by_p (CDI_DOMINATORS, bb, dombb)) + return true; + } + + return false; +} + +/* Mark as maybe_undef any SSA_NAMEs that are unsuitable as ivopts + candidates for potentially involving undefined behavior. */ + +static void +mark_ssa_maybe_undefs (void) +{ + auto_vec<tree> queue; + + /* Scan all SSA_NAMEs, marking the definitely-undefined ones as + maybe-undefined and queuing them for propagation, while clearing + the mark on others. */ + unsigned int i; + tree var; + FOR_EACH_SSA_NAME (i, var, cfun) + { + if (SSA_NAME_IS_VIRTUAL_OPERAND (var) + || !ssa_undefined_value_p (var, false)) + ssa_name_set_maybe_undef (var, false); + else + { + ssa_name_set_maybe_undef (var); + queue.safe_push (var); + if (dump_file) + fprintf (dump_file, "marking _%i as maybe-undef\n", + SSA_NAME_VERSION (var)); + } + } + + /* Now propagate maybe-undefined from a DEF to any other PHI that + uses it, as long as there isn't any intervening use of DEF. */ + while (!queue.is_empty ()) + { + var = queue.pop (); + imm_use_iterator iter; + use_operand_p use_p; + FOR_EACH_IMM_USE_FAST (use_p, iter, var) + { + /* Any uses of VAR that aren't PHI args imply VAR must be + defined, otherwise undefined behavior would have been + definitely invoked. Only PHI args may hold + maybe-undefined values without invoking undefined + behavior for that reason alone. */ + if (!is_a <gphi *> (USE_STMT (use_p))) + continue; + gphi *phi = as_a <gphi *> (USE_STMT (use_p)); + + tree def = gimple_phi_result (phi); + if (ssa_name_maybe_undef_p (def)) + continue; + + /* Look for any uses of the maybe-unused SSA_NAME that + dominates the block that reaches the incoming block + corresponding to the PHI arg in which it is mentioned. + That means we can assume the SSA_NAME is defined in that + path, so we only mark a PHI result as maybe-undef if we + find an unused reaching SSA_NAME. */ + int idx = phi_arg_index_from_use (use_p); + basic_block bb = gimple_phi_arg_edge (phi, idx)->src; + if (ssa_name_any_use_dominates_bb_p (var, bb)) + continue; + + ssa_name_set_maybe_undef (def); + queue.safe_push (def); + if (dump_file) + fprintf (dump_file, "marking _%i as maybe-undef because of _%i\n", + SSA_NAME_VERSION (def), SSA_NAME_VERSION (var)); + } + } +} + +/* Return *TP if it is an SSA_NAME marked with TREE_VISITED, i.e., as + unsuitable as ivopts candidates for potentially involving undefined + behavior. */ static tree -find_ssa_undef (tree *tp, int *walk_subtrees, void *) +find_ssa_undef (tree *tp, int *walk_subtrees, void *bb_) { + basic_block bb = (basic_block) bb_; if (TREE_CODE (*tp) == SSA_NAME - && ssa_undefined_value_p (*tp, false)) + && ssa_name_maybe_undef_p (*tp) + && !ssa_name_any_use_dominates_bb_p (*tp, bb)) return *tp; if (!EXPR_P (*tp)) *walk_subtrees = 0; @@ -3114,7 +3229,7 @@ add_candidate_1 (struct ivopts_data *data, tree base, tree step, bool important, /* If BASE contains undefined SSA names make sure we only record the original IV. */ bool involves_undefs = false; - if (walk_tree (&base, find_ssa_undef, NULL, NULL)) + if (walk_tree (&base, find_ssa_undef, data->current_loop->header, NULL)) { if (pos != IP_ORIGINAL) return NULL; @@ -8192,6 +8307,7 @@ tree_ssa_iv_optimize (void) auto_bitmap toremove; tree_ssa_iv_optimize_init (&data); + mark_ssa_maybe_undefs (); /* Optimize the loops starting with the innermost ones. */ for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
On Thu, Jun 2, 2022 at 9:10 AM Alexandre Oliva <oliva@adacore.com> wrote: > > On Jun 1, 2022, Alexandre Oliva <oliva@adacore.com> wrote: > > > Now I'm thinking we can go for an even stricter predicate to disable > > the optimization: if a non-PHI use of a maybe-undefined dominates the > > loop, then we can still perform the optimization: > > Here it is. > > > [PR105665] ivopts: check defs of names in base for undefs > > From: Alexandre Oliva <oliva@adacore.com> > > The patch for PR 100810 tested for undefined SSA_NAMEs appearing > directly in the base expression of the potential IV candidate, but > that's not enough. The testcase for PR105665 shows an undefined > SSA_NAME has the same ill effect if it's referenced as an PHI_NODE arg > in the referenced SSA_NAME. The variant of that test shows it can be > further removed from the referenced SSA_NAME. > > To avoid deep recursion, precompute maybe-undefined SSA_NAMEs: start > from known-undefined nonvirtual default defs, and propagate them to > any PHI nodes reached by a maybe-undefined arg, as long as there > aren't intervening non-PHI uses, that would imply the maybe-undefined > name must be defined at that point, otherwise it would invoke > undefined behavior. Also test for intervening non-PHI uses of DEFs in > the base expr. > > The test for intervening uses implemented herein relies on dominance; > this could be further extended, regarding conditional uses in every > path leading to a point as an unconditional use dominating that point, > but I haven't implemented that. > > > for gcc/ChangeLog > > PR tree-optimization/105665 > PR tree-optimization/100810 > * tree-ssa-loop-ivopts.cc > (ssa_name_maybe_undef_p, ssa_name_set_maybe_undef): New. > (ssa_name_any_use_dominates_bb_p, mark_ssa_maybe_undefs): New. > (find_ssa_undef): Check precomputed flag and intervening uses. > (tree_ssa_iv_optimize): Call mark_ssa_maybe_undefs. > > for gcc/testsuite/ChangeLog > > PR tree-optimization/105665 > PR tree-optimization/100810 > * gcc.dg/torture/pr105665.c: New. > --- > gcc/testsuite/gcc.dg/torture/pr105665.c | 20 +++++ > gcc/tree-ssa-loop-ivopts.cc | 124 ++++++++++++++++++++++++++++++- > 2 files changed, 140 insertions(+), 4 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/torture/pr105665.c > > diff --git a/gcc/testsuite/gcc.dg/torture/pr105665.c b/gcc/testsuite/gcc.dg/torture/pr105665.c > new file mode 100644 > index 0000000000000..34cfc65843495 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/torture/pr105665.c > @@ -0,0 +1,20 @@ > +/* { dg-do run } */ > + > +int a, b, c[1], d[2], *e = c; > +int main() { > + int f = 0; > + for (; b < 2; b++) { > + int g; > + if (f) > + g++, b = 40; > + a = d[b * b]; > + for (f = 0; f < 3; f++) { > + if (e) > + break; > + g--; > + if (a) > + a = g; > + } > + } > + return 0; > +} > diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc > index 81b536f930415..f20a985d7ca22 100644 > --- a/gcc/tree-ssa-loop-ivopts.cc > +++ b/gcc/tree-ssa-loop-ivopts.cc > @@ -3071,13 +3071,128 @@ get_loop_invariant_expr (struct ivopts_data *data, tree inv_expr) > return *slot; > } > > -/* Find the first undefined SSA name in *TP. */ > +/* Return TRUE iff VAR is marked as maybe-undefined. See > + mark_ssa_maybe_undefs. */ > + > +static inline bool > +ssa_name_maybe_undef_p (tree var) > +{ > + gcc_checking_assert (TREE_CODE (var) == SSA_NAME); > + return TREE_VISITED (var); > +} > + > +/* Set (or clear, depending on VALUE) VAR's maybe-undefined mark. */ > + > +static inline void > +ssa_name_set_maybe_undef (tree var, bool value = true) > +{ > + gcc_checking_assert (TREE_CODE (var) == SSA_NAME); > + TREE_VISITED (var) = value; > +} > + > +/* Return TRUE iff there are any non-PHI uses of VAR that dominate the > + end of BB. If we return TRUE and BB is a loop header, then VAR we > + be assumed to be defined within the loop, even if it is marked as > + maybe-undefined. */ > + > +static inline bool > +ssa_name_any_use_dominates_bb_p (tree var, basic_block bb) > +{ > + imm_use_iterator iter; > + use_operand_p use_p; > + FOR_EACH_IMM_USE_FAST (use_p, iter, var) > + { > + if (is_a <gphi *> (USE_STMT (use_p))) I think you also want to skip debug stmts here? > + continue; > + basic_block dombb = gimple_bb (USE_STMT (use_p)); > + if (dominated_by_p (CDI_DOMINATORS, bb, dombb)) > + return true; > + } > + > + return false; > +} > + > +/* Mark as maybe_undef any SSA_NAMEs that are unsuitable as ivopts > + candidates for potentially involving undefined behavior. */ > + > +static void > +mark_ssa_maybe_undefs (void) > +{ > + auto_vec<tree> queue; > + > + /* Scan all SSA_NAMEs, marking the definitely-undefined ones as > + maybe-undefined and queuing them for propagation, while clearing > + the mark on others. */ > + unsigned int i; > + tree var; > + FOR_EACH_SSA_NAME (i, var, cfun) > + { > + if (SSA_NAME_IS_VIRTUAL_OPERAND (var) > + || !ssa_undefined_value_p (var, false)) > + ssa_name_set_maybe_undef (var, false); > + else > + { > + ssa_name_set_maybe_undef (var); > + queue.safe_push (var); > + if (dump_file) && (dump_flags & TDF_DETAILS) please > + fprintf (dump_file, "marking _%i as maybe-undef\n", > + SSA_NAME_VERSION (var)); > + } > + } > + > + /* Now propagate maybe-undefined from a DEF to any other PHI that > + uses it, as long as there isn't any intervening use of DEF. */ > + while (!queue.is_empty ()) > + { > + var = queue.pop (); > + imm_use_iterator iter; > + use_operand_p use_p; > + FOR_EACH_IMM_USE_FAST (use_p, iter, var) > + { > + /* Any uses of VAR that aren't PHI args imply VAR must be > + defined, otherwise undefined behavior would have been > + definitely invoked. Only PHI args may hold > + maybe-undefined values without invoking undefined > + behavior for that reason alone. */ > + if (!is_a <gphi *> (USE_STMT (use_p))) > + continue; > + gphi *phi = as_a <gphi *> (USE_STMT (use_p)); > + > + tree def = gimple_phi_result (phi); > + if (ssa_name_maybe_undef_p (def)) > + continue; > + > + /* Look for any uses of the maybe-unused SSA_NAME that > + dominates the block that reaches the incoming block > + corresponding to the PHI arg in which it is mentioned. > + That means we can assume the SSA_NAME is defined in that > + path, so we only mark a PHI result as maybe-undef if we > + find an unused reaching SSA_NAME. */ > + int idx = phi_arg_index_from_use (use_p); > + basic_block bb = gimple_phi_arg_edge (phi, idx)->src; > + if (ssa_name_any_use_dominates_bb_p (var, bb)) > + continue; > + > + ssa_name_set_maybe_undef (def); > + queue.safe_push (def); > + if (dump_file) likewise OK with those changes. Thanks, Richard. > + fprintf (dump_file, "marking _%i as maybe-undef because of _%i\n", > + SSA_NAME_VERSION (def), SSA_NAME_VERSION (var)); > + } > + } > +} > + > +/* Return *TP if it is an SSA_NAME marked with TREE_VISITED, i.e., as > + unsuitable as ivopts candidates for potentially involving undefined > + behavior. */ > > static tree > -find_ssa_undef (tree *tp, int *walk_subtrees, void *) > +find_ssa_undef (tree *tp, int *walk_subtrees, void *bb_) > { > + basic_block bb = (basic_block) bb_; > if (TREE_CODE (*tp) == SSA_NAME > - && ssa_undefined_value_p (*tp, false)) > + && ssa_name_maybe_undef_p (*tp) > + && !ssa_name_any_use_dominates_bb_p (*tp, bb)) > return *tp; > if (!EXPR_P (*tp)) > *walk_subtrees = 0; > @@ -3114,7 +3229,7 @@ add_candidate_1 (struct ivopts_data *data, tree base, tree step, bool important, > /* If BASE contains undefined SSA names make sure we only record > the original IV. */ > bool involves_undefs = false; > - if (walk_tree (&base, find_ssa_undef, NULL, NULL)) > + if (walk_tree (&base, find_ssa_undef, data->current_loop->header, NULL)) > { > if (pos != IP_ORIGINAL) > return NULL; > @@ -8192,6 +8307,7 @@ tree_ssa_iv_optimize (void) > auto_bitmap toremove; > > tree_ssa_iv_optimize_init (&data); > + mark_ssa_maybe_undefs (); > > /* Optimize the loops starting with the innermost ones. */ > for (auto loop : loops_list (cfun, LI_FROM_INNERMOST)) > > > -- > Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ > Free Software Activist GNU Toolchain Engineer > Disinformation flourishes because many people care deeply about injustice > but very few check the facts. Ask me about <https://stallmansupport.org>
On Jun 2, 2022, Richard Biener <richard.guenther@gmail.com> wrote: >> + if (is_a <gphi *> (USE_STMT (use_p))) > I think you also want to skip debug stmts here? Indeed, thanks! (live by the sword, die by the sword ;-) >> + if (dump_file) > && (dump_flags & TDF_DETAILS) please ack >> + if (dump_file) > likewise ditto > OK with those changes. Thanks, here's what I'm installing momentarily. [PR105665] ivopts: check defs of names in base for undefs From: Alexandre Oliva <oliva@adacore.com> The patch for PR 100810 tested for undefined SSA_NAMEs appearing directly in the base expression of the potential IV candidate, but that's not enough. The testcase for PR105665 shows an undefined SSA_NAME has the same ill effect if it's referenced as an PHI_NODE arg in the referenced SSA_NAME. The variant of that test shows it can be further removed from the referenced SSA_NAME. To avoid deep recursion, precompute maybe-undefined SSA_NAMEs: start from known-undefined nonvirtual default defs, and propagate them to any PHI nodes reached by a maybe-undefined arg, as long as there aren't intervening non-PHI uses, that would imply the maybe-undefined name must be defined at that point, otherwise it would invoke undefined behavior. Also test for intervening non-PHI uses of DEFs in the base expr. The test for intervening uses implemented herein relies on dominance; this could be further extended, regarding conditional uses in every path leading to a point as an unconditional use dominating that point, but I haven't implemented that. for gcc/ChangeLog PR tree-optimization/105665 PR tree-optimization/100810 * tree-ssa-loop-ivopts.cc (ssa_name_maybe_undef_p, ssa_name_set_maybe_undef): New. (ssa_name_any_use_dominates_bb_p, mark_ssa_maybe_undefs): New. (find_ssa_undef): Check precomputed flag and intervening uses. (tree_ssa_iv_optimize): Call mark_ssa_maybe_undefs. for gcc/testsuite/ChangeLog PR tree-optimization/105665 PR tree-optimization/100810 * gcc.dg/torture/pr105665.c: New. --- gcc/testsuite/gcc.dg/torture/pr105665.c | 20 +++++ gcc/tree-ssa-loop-ivopts.cc | 125 ++++++++++++++++++++++++++++++- 2 files changed, 141 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/torture/pr105665.c diff --git a/gcc/testsuite/gcc.dg/torture/pr105665.c b/gcc/testsuite/gcc.dg/torture/pr105665.c new file mode 100644 index 0000000000000..34cfc65843495 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr105665.c @@ -0,0 +1,20 @@ +/* { dg-do run } */ + +int a, b, c[1], d[2], *e = c; +int main() { + int f = 0; + for (; b < 2; b++) { + int g; + if (f) + g++, b = 40; + a = d[b * b]; + for (f = 0; f < 3; f++) { + if (e) + break; + g--; + if (a) + a = g; + } + } + return 0; +} diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc index 81b536f930415..549168aebd632 100644 --- a/gcc/tree-ssa-loop-ivopts.cc +++ b/gcc/tree-ssa-loop-ivopts.cc @@ -3071,13 +3071,129 @@ get_loop_invariant_expr (struct ivopts_data *data, tree inv_expr) return *slot; } -/* Find the first undefined SSA name in *TP. */ +/* Return TRUE iff VAR is marked as maybe-undefined. See + mark_ssa_maybe_undefs. */ + +static inline bool +ssa_name_maybe_undef_p (tree var) +{ + gcc_checking_assert (TREE_CODE (var) == SSA_NAME); + return TREE_VISITED (var); +} + +/* Set (or clear, depending on VALUE) VAR's maybe-undefined mark. */ + +static inline void +ssa_name_set_maybe_undef (tree var, bool value = true) +{ + gcc_checking_assert (TREE_CODE (var) == SSA_NAME); + TREE_VISITED (var) = value; +} + +/* Return TRUE iff there are any non-PHI uses of VAR that dominate the + end of BB. If we return TRUE and BB is a loop header, then VAR we + be assumed to be defined within the loop, even if it is marked as + maybe-undefined. */ + +static inline bool +ssa_name_any_use_dominates_bb_p (tree var, basic_block bb) +{ + imm_use_iterator iter; + use_operand_p use_p; + FOR_EACH_IMM_USE_FAST (use_p, iter, var) + { + if (is_a <gphi *> (USE_STMT (use_p)) + || is_gimple_debug (USE_STMT (use_p))) + continue; + basic_block dombb = gimple_bb (USE_STMT (use_p)); + if (dominated_by_p (CDI_DOMINATORS, bb, dombb)) + return true; + } + + return false; +} + +/* Mark as maybe_undef any SSA_NAMEs that are unsuitable as ivopts + candidates for potentially involving undefined behavior. */ + +static void +mark_ssa_maybe_undefs (void) +{ + auto_vec<tree> queue; + + /* Scan all SSA_NAMEs, marking the definitely-undefined ones as + maybe-undefined and queuing them for propagation, while clearing + the mark on others. */ + unsigned int i; + tree var; + FOR_EACH_SSA_NAME (i, var, cfun) + { + if (SSA_NAME_IS_VIRTUAL_OPERAND (var) + || !ssa_undefined_value_p (var, false)) + ssa_name_set_maybe_undef (var, false); + else + { + ssa_name_set_maybe_undef (var); + queue.safe_push (var); + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, "marking _%i as maybe-undef\n", + SSA_NAME_VERSION (var)); + } + } + + /* Now propagate maybe-undefined from a DEF to any other PHI that + uses it, as long as there isn't any intervening use of DEF. */ + while (!queue.is_empty ()) + { + var = queue.pop (); + imm_use_iterator iter; + use_operand_p use_p; + FOR_EACH_IMM_USE_FAST (use_p, iter, var) + { + /* Any uses of VAR that aren't PHI args imply VAR must be + defined, otherwise undefined behavior would have been + definitely invoked. Only PHI args may hold + maybe-undefined values without invoking undefined + behavior for that reason alone. */ + if (!is_a <gphi *> (USE_STMT (use_p))) + continue; + gphi *phi = as_a <gphi *> (USE_STMT (use_p)); + + tree def = gimple_phi_result (phi); + if (ssa_name_maybe_undef_p (def)) + continue; + + /* Look for any uses of the maybe-unused SSA_NAME that + dominates the block that reaches the incoming block + corresponding to the PHI arg in which it is mentioned. + That means we can assume the SSA_NAME is defined in that + path, so we only mark a PHI result as maybe-undef if we + find an unused reaching SSA_NAME. */ + int idx = phi_arg_index_from_use (use_p); + basic_block bb = gimple_phi_arg_edge (phi, idx)->src; + if (ssa_name_any_use_dominates_bb_p (var, bb)) + continue; + + ssa_name_set_maybe_undef (def); + queue.safe_push (def); + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, "marking _%i as maybe-undef because of _%i\n", + SSA_NAME_VERSION (def), SSA_NAME_VERSION (var)); + } + } +} + +/* Return *TP if it is an SSA_NAME marked with TREE_VISITED, i.e., as + unsuitable as ivopts candidates for potentially involving undefined + behavior. */ static tree -find_ssa_undef (tree *tp, int *walk_subtrees, void *) +find_ssa_undef (tree *tp, int *walk_subtrees, void *bb_) { + basic_block bb = (basic_block) bb_; if (TREE_CODE (*tp) == SSA_NAME - && ssa_undefined_value_p (*tp, false)) + && ssa_name_maybe_undef_p (*tp) + && !ssa_name_any_use_dominates_bb_p (*tp, bb)) return *tp; if (!EXPR_P (*tp)) *walk_subtrees = 0; @@ -3114,7 +3230,7 @@ add_candidate_1 (struct ivopts_data *data, tree base, tree step, bool important, /* If BASE contains undefined SSA names make sure we only record the original IV. */ bool involves_undefs = false; - if (walk_tree (&base, find_ssa_undef, NULL, NULL)) + if (walk_tree (&base, find_ssa_undef, data->current_loop->header, NULL)) { if (pos != IP_ORIGINAL) return NULL; @@ -8192,6 +8308,7 @@ tree_ssa_iv_optimize (void) auto_bitmap toremove; tree_ssa_iv_optimize_init (&data); + mark_ssa_maybe_undefs (); /* Optimize the loops starting with the innermost ones. */ for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
diff --git a/gcc/testsuite/gcc.dg/torture/pr105665.c b/gcc/testsuite/gcc.dg/torture/pr105665.c new file mode 100644 index 0000000000000..34cfc65843495 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr105665.c @@ -0,0 +1,20 @@ +/* { dg-do run } */ + +int a, b, c[1], d[2], *e = c; +int main() { + int f = 0; + for (; b < 2; b++) { + int g; + if (f) + g++, b = 40; + a = d[b * b]; + for (f = 0; f < 3; f++) { + if (e) + break; + g--; + if (a) + a = g; + } + } + return 0; +} diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc index 81b536f930415..d8200f2a53b21 100644 --- a/gcc/tree-ssa-loop-ivopts.cc +++ b/gcc/tree-ssa-loop-ivopts.cc @@ -3071,13 +3071,70 @@ get_loop_invariant_expr (struct ivopts_data *data, tree inv_expr) return *slot; } -/* Find the first undefined SSA name in *TP. */ +/* Mark as TREe_VISITED any SSA_NAMEs that are unsuitable as ivopts + candidates for potentially involving undefined behavior. */ + +static void +mark_ssa_undefs (void) +{ + auto_vec<tree> queue; + + unsigned int i; + tree var; + FOR_EACH_SSA_NAME (i, var, cfun) + { + if (SSA_NAME_IS_VIRTUAL_OPERAND (var) + || ssa_defined_default_def_p (var) + || !ssa_undefined_value_p (var, false)) + TREE_VISITED (var) = false; + else + { + TREE_VISITED (var) = true; + queue.safe_push (var); + if (dump_file) + fprintf (dump_file, "marking _%i as undef\n", + SSA_NAME_VERSION (var)); + } + } + + while (!queue.is_empty ()) + { + var = queue.pop (); + gimple *stmt; + imm_use_iterator iter; + FOR_EACH_IMM_USE_STMT (stmt, iter, var) + { + if (is_gimple_call (stmt) || is_a <gasm *> (stmt)) + continue; + + def_operand_p defvar; + ssa_op_iter diter; + FOR_EACH_PHI_OR_STMT_DEF (defvar, stmt, diter, SSA_OP_DEF) + { + gcc_checking_assert (is_gimple_assign (stmt) + || is_a <gphi *> (stmt)); + tree def = DEF_FROM_PTR (defvar); + if (TREE_VISITED (def)) + continue; + TREE_VISITED (def) = true; + queue.safe_push (def); + if (dump_file) + fprintf (dump_file, "Marking _%i as undef because of _%i\n", + SSA_NAME_VERSION (def), SSA_NAME_VERSION (var)); + } + } + } +} + +/* Return *TP if it is an SSA_NAME marked with TREE_VISITED, i.e., as + unsuitable as ivopts candidates for potentially involving undefined + behavior. */ static tree find_ssa_undef (tree *tp, int *walk_subtrees, void *) { if (TREE_CODE (*tp) == SSA_NAME - && ssa_undefined_value_p (*tp, false)) + && TREE_VISITED (*tp)) return *tp; if (!EXPR_P (*tp)) *walk_subtrees = 0; @@ -8192,6 +8249,7 @@ tree_ssa_iv_optimize (void) auto_bitmap toremove; tree_ssa_iv_optimize_init (&data); + mark_ssa_undefs (); /* Optimize the loops starting with the innermost ones. */ for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))