Message ID | 2c4b23c0-7be6-8050-364a-323cb8b51408@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | [RFC] Fix PR62147 by passing finiteness information to RTL phase | expand |
On Tue, 25 Jun 2019, Kewen.Lin wrote: > Hi all, > > As PR62147, for some cases loop iv analysis is unable to identify one loop is > finite even if the loop is actually finite and we have known it in middle-end. > It will prevent doloop_optimize and end up with worse codes. > > This patch is going to leverage existing middle-end function finite_loop_p, > record the finiteness information in loop structure. Later loop iv can make use > of this saved information. > > It's based on two observations: > 1) the loop structure for one specific loop is shared between middle-end and > back-end. > 2) for one specific loop, if it's finite then never become infinite itself. > > As one gcc newbie, I'm not sure whether these two observations are true in all > cases. Please feel free to correct me if anything missing. I think 2) is not true with -ffinite-loops. > btw, I also took a look at how the loop constraint LOOP_C_FINITE is used, I think > it's not suitable for this purpose, it's mainly set by vectorizer and tell niter > and scev to take one loop as finite. The original patch has the words > "constraint flag is mainly set by consumers and affects certain semantics of > niter analyzer APIs". > > Bootstrapped and regression testing passed on powerpc64le-unknown-linux-gnu. Did you consider to simply use finite_loop_p () from doloop.c? That would be a much simpler patch. For the testcase in question -ffinite-loops would provide this guarantee even on RTL, so would the upper bound that may be still set. Richard. > > Thanks, > Kewen > > --------------------------- > > gcc/ChangeLog > > 2019-06-25 Kewen Lin <linkw@gcc.gnu.org> > > PR target/62147 > * gcc/cfgloop.c (alloc_loop): Init new field. > * gcc/cfgloop.h (struct loop): New field. > * gcc/cfgloopmanip.c (copy_loop_info): Copy new field. > * gcc/tree-ssa-loop-niter.c (finite_loop_p): Rename... > (test_and_update_loop_finite): ...this. Test and update above field. > * gcc/tree-ssa-loop-niter.h (finite_loop_p): Rename... > (test_and_update_loop_finite): ...this. > * gcc/ipa-pure-const.c (analyze_function): Adjust to use above renamed > function. > * gcc/tree-ssa-dce.c (find_obviously_necessary_stmts): Likewise. > * gcc/tree-ssa-loop-im.c (fill_always_executed_in_1): Likewise. > * gcc/loop-iv.c (find_simple_exit): Update finiteness by new field. > * gcc/tree-ssa-loop-ivopts.c (struct ivopts_data): New field. > (rewrite_use_compare): Update new field. > (tree_ssa_iv_optimize_loop): Init new field and call > test_and_update_loop_finite if set. > > gcc/testsuite/ChangeLog > > 2019-06-25 Kewen Lin <linkw@gcc.gnu.org> > > PR target/62147 > * gcc.target/powerpc/pr62147.c: New test. > > > > > diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c > index f64326b944e..89e4dd069ac 100644 > --- a/gcc/cfgloop.c > +++ b/gcc/cfgloop.c > @@ -355,6 +355,7 @@ alloc_loop (void) > loop->nb_iterations_upper_bound = 0; > loop->nb_iterations_likely_upper_bound = 0; > loop->nb_iterations_estimate = 0; > + loop->is_finite = false; > return loop; > } > > diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h > index 2f8ab106d03..08ec930597e 100644 > --- a/gcc/cfgloop.h > +++ b/gcc/cfgloop.h > @@ -224,6 +224,9 @@ struct GTY ((chain_next ("%h.next"))) loop { > /* True if the loop is part of an oacc kernels region. */ > unsigned in_oacc_kernels_region : 1; > > + /* True if middle end analysis ensure this loop is finite. */ > + unsigned is_finite : 1; > + > /* The number of times to unroll the loop. 0 means no information given, > just do what we always do. A value of 1 means do not unroll the loop. > A value of USHRT_MAX means unroll with no specific unrolling factor. > diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c > index 50250ec4d7c..f1033d11865 100644 > --- a/gcc/cfgloopmanip.c > +++ b/gcc/cfgloopmanip.c > @@ -1026,6 +1026,7 @@ copy_loop_info (struct loop *loop, struct loop *target) > target->in_oacc_kernels_region = loop->in_oacc_kernels_region; > target->unroll = loop->unroll; > target->owned_clique = loop->owned_clique; > + target->is_finite = loop->is_finite; > } > > /* Copies copy of LOOP as subloop of TARGET loop, placing newly > diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c > index bb561d00853..f7282adc5b1 100644 > --- a/gcc/ipa-pure-const.c > +++ b/gcc/ipa-pure-const.c > @@ -1089,7 +1089,7 @@ end: > struct loop *loop; > scev_initialize (); > FOR_EACH_LOOP (loop, 0) > - if (!finite_loop_p (loop)) > + if (!test_and_update_loop_finite (loop)) > { > if (dump_file) > fprintf (dump_file, " cannot prove finiteness of " > diff --git a/gcc/loop-iv.c b/gcc/loop-iv.c > index 82b4bdb1523..a6bc82975cc 100644 > --- a/gcc/loop-iv.c > +++ b/gcc/loop-iv.c > @@ -2997,6 +2997,19 @@ find_simple_exit (struct loop *loop, struct niter_desc *desc) > fprintf (dump_file, "Loop %d is not simple.\n", loop->num); > } > > + /* Fix up the finiteness if possible. We can only do it for single exit, > + since the loop is finite, but it's possible that we predicate one loop > + exit to be finite which can not be determined as finite in middle-end as > + well. It results in incorrect predicate information on the exit condition > + expression. For example, if says [(int) _1 + -8, + , -8] != 0 finite, > + it means _1 can exactly divide -8. */ > + if (single_exit (loop) && desc->infinite && loop->is_finite) > + { > + desc->infinite = NULL_RTX; > + if (dump_file) > + fprintf (dump_file, " infinite updated to finite.\n"); > + } > + > free (body); > } > > diff --git a/gcc/testsuite/gcc.target/powerpc/pr62147.c b/gcc/testsuite/gcc.target/powerpc/pr62147.c > new file mode 100644 > index 00000000000..635c73711da > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr62147.c > @@ -0,0 +1,24 @@ > +/* { dg-do compile { target { powerpc*-*-* } } } */ > +/* { dg-options "-O2 -fno-tree-loop-distribute-patterns" } */ > + > +/* Note that it's required to disable loop-distribute-patterns, otherwise the > + loop will be optimized to memset. */ > + > +/* Expect loop_iv can know the loop is finite so the doloop_optimize > + can perform the doloop transformation. */ > + > +typedef struct { > + int l; > + int b[258]; > +} S; > + > +void clear (S* s ) > +{ > + int i; > + int len = s->l + 1; > + > + for (i = 0; i <= len; i++) > + s->b[i] = 0; > +} > + > +/* { dg-final { scan-assembler {\mbdnz\M} } } */ > diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c > index 2478219d873..7c183c7269a 100644 > --- a/gcc/tree-ssa-dce.c > +++ b/gcc/tree-ssa-dce.c > @@ -417,7 +417,7 @@ find_obviously_necessary_stmts (bool aggressive) > } > > FOR_EACH_LOOP (loop, 0) > - if (!finite_loop_p (loop)) > + if (!test_and_update_loop_finite (loop)) > { > if (dump_file) > fprintf (dump_file, "cannot prove finiteness of loop %i\n", loop->num); > diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c > index 2064c2900fb..49b8577ccd1 100644 > --- a/gcc/tree-ssa-loop-im.c > +++ b/gcc/tree-ssa-loop-im.c > @@ -2484,7 +2484,7 @@ fill_always_executed_in_1 (struct loop *loop, sbitmap contains_call) > /* Or we enter a possibly non-finite loop. */ > if (flow_loop_nested_p (bb->loop_father, > e->dest->loop_father) > - && ! finite_loop_p (e->dest->loop_father)) > + && ! test_and_update_loop_finite (e->dest->loop_father)) > break; > } > if (e) > diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c > index 890f9b788b4..484a6626c05 100644 > --- a/gcc/tree-ssa-loop-ivopts.c > +++ b/gcc/tree-ssa-loop-ivopts.c > @@ -612,6 +612,9 @@ struct ivopts_data > > /* Whether the loop body can only be exited via single exit. */ > bool loop_single_exit_p; > + > + /* Whether to compute loop finiteness for loop iv. */ > + bool compute_loop_finite_p; > }; > > /* An assignment of iv candidates to uses. */ > @@ -7145,6 +7148,9 @@ rewrite_use_compare (struct ivopts_data *data, > gimple_cond_set_lhs (cond_stmt, var); > gimple_cond_set_code (cond_stmt, compare); > gimple_cond_set_rhs (cond_stmt, op); > + > + data->compute_loop_finite_p = true; > + > return; > } > > @@ -7526,6 +7532,7 @@ tree_ssa_iv_optimize_loop (struct ivopts_data *data, struct loop *loop, > data->current_loop = loop; > data->loop_loc = find_loop_location (loop).get_location_t (); > data->speed = optimize_loop_for_speed_p (loop); > + data->compute_loop_finite_p = false; > > if (dump_file && (dump_flags & TDF_DETAILS)) > { > @@ -7592,6 +7599,10 @@ tree_ssa_iv_optimize_loop (struct ivopts_data *data, struct loop *loop, > /* Remove the ivs that are unused after rewriting. */ > remove_unused_ivs (data, toremove); > > + /* Update loop finiteness info. */ > + if (data->compute_loop_finite_p) > + test_and_update_loop_finite (data->current_loop); > + > finish: > free (body); > free_loop_data (data); > diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c > index 84e6e313c85..78152714791 100644 > --- a/gcc/tree-ssa-loop-niter.c > +++ b/gcc/tree-ssa-loop-niter.c > @@ -2808,10 +2808,14 @@ find_loop_niter (struct loop *loop, edge *exit) > /* Return true if loop is known to have bounded number of iterations. */ > > bool > -finite_loop_p (struct loop *loop) > +test_and_update_loop_finite (struct loop *loop) > { > widest_int nit; > int flags; > + bool finite_p = false; > + > + if (loop->is_finite) > + return true; > > flags = flags_from_decl_or_type (current_function_decl); > if ((flags & (ECF_CONST|ECF_PURE)) && !(flags & ECF_LOOPING_CONST_OR_PURE)) > @@ -2819,7 +2823,8 @@ finite_loop_p (struct loop *loop) > if (dump_file && (dump_flags & TDF_DETAILS)) > fprintf (dump_file, "Found loop %i to be finite: it is within pure or const function.\n", > loop->num); > - return true; > + finite_p = true; > + goto finish; > } > > if (loop->any_upper_bound > @@ -2828,9 +2833,14 @@ finite_loop_p (struct loop *loop) > if (dump_file && (dump_flags & TDF_DETAILS)) > fprintf (dump_file, "Found loop %i to be finite: upper bound found.\n", > loop->num); > - return true; > + finite_p = true; > + goto finish; > } > - return false; > + > + finish: > + if (finite_p) > + loop->is_finite = true; > + return finite_p; > } > > /* > diff --git a/gcc/tree-ssa-loop-niter.h b/gcc/tree-ssa-loop-niter.h > index dc116489218..01d587f1cbb 100644 > --- a/gcc/tree-ssa-loop-niter.h > +++ b/gcc/tree-ssa-loop-niter.h > @@ -30,7 +30,7 @@ extern bool number_of_iterations_exit_assumptions (struct loop *, edge, > struct tree_niter_desc *, > gcond **, bool = true); > extern tree find_loop_niter (struct loop *, edge *); > -extern bool finite_loop_p (struct loop *); > +extern bool test_and_update_loop_finite (struct loop *); > extern tree loop_niter_by_eval (struct loop *, edge); > extern tree find_loop_niter_by_eval (struct loop *, edge *); > extern bool estimated_loop_iterations (struct loop *, widest_int *); > >
diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c index f64326b944e..89e4dd069ac 100644 --- a/gcc/cfgloop.c +++ b/gcc/cfgloop.c @@ -355,6 +355,7 @@ alloc_loop (void) loop->nb_iterations_upper_bound = 0; loop->nb_iterations_likely_upper_bound = 0; loop->nb_iterations_estimate = 0; + loop->is_finite = false; return loop; } diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h index 2f8ab106d03..08ec930597e 100644 --- a/gcc/cfgloop.h +++ b/gcc/cfgloop.h @@ -224,6 +224,9 @@ struct GTY ((chain_next ("%h.next"))) loop { /* True if the loop is part of an oacc kernels region. */ unsigned in_oacc_kernels_region : 1; + /* True if middle end analysis ensure this loop is finite. */ + unsigned is_finite : 1; + /* The number of times to unroll the loop. 0 means no information given, just do what we always do. A value of 1 means do not unroll the loop. A value of USHRT_MAX means unroll with no specific unrolling factor. diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c index 50250ec4d7c..f1033d11865 100644 --- a/gcc/cfgloopmanip.c +++ b/gcc/cfgloopmanip.c @@ -1026,6 +1026,7 @@ copy_loop_info (struct loop *loop, struct loop *target) target->in_oacc_kernels_region = loop->in_oacc_kernels_region; target->unroll = loop->unroll; target->owned_clique = loop->owned_clique; + target->is_finite = loop->is_finite; } /* Copies copy of LOOP as subloop of TARGET loop, placing newly diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c index bb561d00853..f7282adc5b1 100644 --- a/gcc/ipa-pure-const.c +++ b/gcc/ipa-pure-const.c @@ -1089,7 +1089,7 @@ end: struct loop *loop; scev_initialize (); FOR_EACH_LOOP (loop, 0) - if (!finite_loop_p (loop)) + if (!test_and_update_loop_finite (loop)) { if (dump_file) fprintf (dump_file, " cannot prove finiteness of " diff --git a/gcc/loop-iv.c b/gcc/loop-iv.c index 82b4bdb1523..a6bc82975cc 100644 --- a/gcc/loop-iv.c +++ b/gcc/loop-iv.c @@ -2997,6 +2997,19 @@ find_simple_exit (struct loop *loop, struct niter_desc *desc) fprintf (dump_file, "Loop %d is not simple.\n", loop->num); } + /* Fix up the finiteness if possible. We can only do it for single exit, + since the loop is finite, but it's possible that we predicate one loop + exit to be finite which can not be determined as finite in middle-end as + well. It results in incorrect predicate information on the exit condition + expression. For example, if says [(int) _1 + -8, + , -8] != 0 finite, + it means _1 can exactly divide -8. */ + if (single_exit (loop) && desc->infinite && loop->is_finite) + { + desc->infinite = NULL_RTX; + if (dump_file) + fprintf (dump_file, " infinite updated to finite.\n"); + } + free (body); } diff --git a/gcc/testsuite/gcc.target/powerpc/pr62147.c b/gcc/testsuite/gcc.target/powerpc/pr62147.c new file mode 100644 index 00000000000..635c73711da --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr62147.c @@ -0,0 +1,24 @@ +/* { dg-do compile { target { powerpc*-*-* } } } */ +/* { dg-options "-O2 -fno-tree-loop-distribute-patterns" } */ + +/* Note that it's required to disable loop-distribute-patterns, otherwise the + loop will be optimized to memset. */ + +/* Expect loop_iv can know the loop is finite so the doloop_optimize + can perform the doloop transformation. */ + +typedef struct { + int l; + int b[258]; +} S; + +void clear (S* s ) +{ + int i; + int len = s->l + 1; + + for (i = 0; i <= len; i++) + s->b[i] = 0; +} + +/* { dg-final { scan-assembler {\mbdnz\M} } } */ diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c index 2478219d873..7c183c7269a 100644 --- a/gcc/tree-ssa-dce.c +++ b/gcc/tree-ssa-dce.c @@ -417,7 +417,7 @@ find_obviously_necessary_stmts (bool aggressive) } FOR_EACH_LOOP (loop, 0) - if (!finite_loop_p (loop)) + if (!test_and_update_loop_finite (loop)) { if (dump_file) fprintf (dump_file, "cannot prove finiteness of loop %i\n", loop->num); diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c index 2064c2900fb..49b8577ccd1 100644 --- a/gcc/tree-ssa-loop-im.c +++ b/gcc/tree-ssa-loop-im.c @@ -2484,7 +2484,7 @@ fill_always_executed_in_1 (struct loop *loop, sbitmap contains_call) /* Or we enter a possibly non-finite loop. */ if (flow_loop_nested_p (bb->loop_father, e->dest->loop_father) - && ! finite_loop_p (e->dest->loop_father)) + && ! test_and_update_loop_finite (e->dest->loop_father)) break; } if (e) diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c index 890f9b788b4..484a6626c05 100644 --- a/gcc/tree-ssa-loop-ivopts.c +++ b/gcc/tree-ssa-loop-ivopts.c @@ -612,6 +612,9 @@ struct ivopts_data /* Whether the loop body can only be exited via single exit. */ bool loop_single_exit_p; + + /* Whether to compute loop finiteness for loop iv. */ + bool compute_loop_finite_p; }; /* An assignment of iv candidates to uses. */ @@ -7145,6 +7148,9 @@ rewrite_use_compare (struct ivopts_data *data, gimple_cond_set_lhs (cond_stmt, var); gimple_cond_set_code (cond_stmt, compare); gimple_cond_set_rhs (cond_stmt, op); + + data->compute_loop_finite_p = true; + return; } @@ -7526,6 +7532,7 @@ tree_ssa_iv_optimize_loop (struct ivopts_data *data, struct loop *loop, data->current_loop = loop; data->loop_loc = find_loop_location (loop).get_location_t (); data->speed = optimize_loop_for_speed_p (loop); + data->compute_loop_finite_p = false; if (dump_file && (dump_flags & TDF_DETAILS)) { @@ -7592,6 +7599,10 @@ tree_ssa_iv_optimize_loop (struct ivopts_data *data, struct loop *loop, /* Remove the ivs that are unused after rewriting. */ remove_unused_ivs (data, toremove); + /* Update loop finiteness info. */ + if (data->compute_loop_finite_p) + test_and_update_loop_finite (data->current_loop); + finish: free (body); free_loop_data (data); diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c index 84e6e313c85..78152714791 100644 --- a/gcc/tree-ssa-loop-niter.c +++ b/gcc/tree-ssa-loop-niter.c @@ -2808,10 +2808,14 @@ find_loop_niter (struct loop *loop, edge *exit) /* Return true if loop is known to have bounded number of iterations. */ bool -finite_loop_p (struct loop *loop) +test_and_update_loop_finite (struct loop *loop) { widest_int nit; int flags; + bool finite_p = false; + + if (loop->is_finite) + return true; flags = flags_from_decl_or_type (current_function_decl); if ((flags & (ECF_CONST|ECF_PURE)) && !(flags & ECF_LOOPING_CONST_OR_PURE)) @@ -2819,7 +2823,8 @@ finite_loop_p (struct loop *loop) if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "Found loop %i to be finite: it is within pure or const function.\n", loop->num); - return true; + finite_p = true; + goto finish; } if (loop->any_upper_bound @@ -2828,9 +2833,14 @@ finite_loop_p (struct loop *loop) if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "Found loop %i to be finite: upper bound found.\n", loop->num); - return true; + finite_p = true; + goto finish; } - return false; + + finish: + if (finite_p) + loop->is_finite = true; + return finite_p; } /* diff --git a/gcc/tree-ssa-loop-niter.h b/gcc/tree-ssa-loop-niter.h index dc116489218..01d587f1cbb 100644 --- a/gcc/tree-ssa-loop-niter.h +++ b/gcc/tree-ssa-loop-niter.h @@ -30,7 +30,7 @@ extern bool number_of_iterations_exit_assumptions (struct loop *, edge, struct tree_niter_desc *, gcond **, bool = true); extern tree find_loop_niter (struct loop *, edge *); -extern bool finite_loop_p (struct loop *); +extern bool test_and_update_loop_finite (struct loop *); extern tree loop_niter_by_eval (struct loop *, edge); extern tree find_loop_niter_by_eval (struct loop *, edge *); extern bool estimated_loop_iterations (struct loop *, widest_int *);