Message ID | 565455C9.7000206@mentor.com |
---|---|
State | New |
Headers | show |
On Tue, 24 Nov 2015, Tom de Vries wrote: > On 23/11/15 11:02, Richard Biener wrote: > > On Fri, 20 Nov 2015, Tom de Vries wrote: > > > > > On 20/11/15 14:29, Richard Biener wrote: > > > > I agree it's somewhat of an odd behavior but all passes should > > > > either be placed in a sub-pipeline with an outer > > > > loop_optimizer_init()/finalize () call or call both themselves. > > > > > > Hmm, but adding loop_optimizer_finalize at the end of pass_lim breaks the > > > loop > > > pipeline. > > > > > > We could use the style used in pass_slp_vectorize::execute: > > > ... > > > pass_slp_vectorize::execute (function *fun) > > > { > > > basic_block bb; > > > > > > bool in_loop_pipeline = scev_initialized_p (); > > > if (!in_loop_pipeline) > > > { > > > loop_optimizer_init (LOOPS_NORMAL); > > > scev_initialize (); > > > } > > > > > > ... > > > > > > if (!in_loop_pipeline) > > > { > > > scev_finalize (); > > > loop_optimizer_finalize (); > > > } > > > ... > > > > > > Although that doesn't strike me as particularly clean. > > > > At least it would be a consistent "unclean" style. So yes, the > > above would work for me. > > > > Reposting using the in_loop_pipeline style in pass_lim. The tree-ssa-loop-im.c changes are ok (I suppose the other changes are in the other patch you posted as well). Thanks, Richard.
On 24/11/15 14:13, Richard Biener wrote: > On Tue, 24 Nov 2015, Tom de Vries wrote: > >> >On 23/11/15 11:02, Richard Biener wrote: >>> > >On Fri, 20 Nov 2015, Tom de Vries wrote: >>> > > >>>> > > >On 20/11/15 14:29, Richard Biener wrote: >>>>> > > > >I agree it's somewhat of an odd behavior but all passes should >>>>> > > > >either be placed in a sub-pipeline with an outer >>>>> > > > >loop_optimizer_init()/finalize () call or call both themselves. >>>> > > > >>>> > > >Hmm, but adding loop_optimizer_finalize at the end of pass_lim breaks the >>>> > > >loop >>>> > > >pipeline. >>>> > > > >>>> > > >We could use the style used in pass_slp_vectorize::execute: >>>> > > >... >>>> > > >pass_slp_vectorize::execute (function *fun) >>>> > > >{ >>>> > > > basic_block bb; >>>> > > > >>>> > > > bool in_loop_pipeline = scev_initialized_p (); >>>> > > > if (!in_loop_pipeline) >>>> > > > { >>>> > > > loop_optimizer_init (LOOPS_NORMAL); >>>> > > > scev_initialize (); >>>> > > > } >>>> > > > >>>> > > > ... >>>> > > > >>>> > > > if (!in_loop_pipeline) >>>> > > > { >>>> > > > scev_finalize (); >>>> > > > loop_optimizer_finalize (); >>>> > > > } >>>> > > >... >>>> > > > >>>> > > >Although that doesn't strike me as particularly clean. >>> > > >>> > >At least it would be a consistent "unclean" style. So yes, the >>> > >above would work for me. >>> > > >> > >> >Reposting using the in_loop_pipeline style in pass_lim. > The tree-ssa-loop-im.c changes are ok OK, I'll commit those. > (I suppose the other changes > are in the other patch you posted as well). This ( https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02882.html ) patch contains changes related to adding pass_oacc_kernels2. Are those the "other changes" you're referring to? Thanks, - Tom
On Tue, 24 Nov 2015, Tom de Vries wrote: > On 24/11/15 14:13, Richard Biener wrote: > > On Tue, 24 Nov 2015, Tom de Vries wrote: > > > > > >On 23/11/15 11:02, Richard Biener wrote: > > > > > >On Fri, 20 Nov 2015, Tom de Vries wrote: > > > > > > > > > > > > > >On 20/11/15 14:29, Richard Biener wrote: > > > > > > > > > >I agree it's somewhat of an odd behavior but all passes > > > > > > should > > > > > > > > > >either be placed in a sub-pipeline with an outer > > > > > > > > > >loop_optimizer_init()/finalize () call or call both > > > > > > themselves. > > > > > > > > > > > > > > > >Hmm, but adding loop_optimizer_finalize at the end of pass_lim > > > > > breaks the > > > > > > > >loop > > > > > > > >pipeline. > > > > > > > > > > > > > > > >We could use the style used in pass_slp_vectorize::execute: > > > > > > > >... > > > > > > > >pass_slp_vectorize::execute (function *fun) > > > > > > > >{ > > > > > > > > basic_block bb; > > > > > > > > > > > > > > > > bool in_loop_pipeline = scev_initialized_p (); > > > > > > > > if (!in_loop_pipeline) > > > > > > > > { > > > > > > > > loop_optimizer_init (LOOPS_NORMAL); > > > > > > > > scev_initialize (); > > > > > > > > } > > > > > > > > > > > > > > > > ... > > > > > > > > > > > > > > > > if (!in_loop_pipeline) > > > > > > > > { > > > > > > > > scev_finalize (); > > > > > > > > loop_optimizer_finalize (); > > > > > > > > } > > > > > > > >... > > > > > > > > > > > > > > > >Although that doesn't strike me as particularly clean. > > > > > > > > > > > >At least it would be a consistent "unclean" style. So yes, the > > > > > >above would work for me. > > > > > > > > > > > > > >Reposting using the in_loop_pipeline style in pass_lim. > > The tree-ssa-loop-im.c changes are ok > > OK, I'll commit those. > > > (I suppose the other changes > > are in the other patch you posted as well). > > This ( https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02882.html ) patch > contains changes related to adding pass_oacc_kernels2. Are those the "other > changes" you're referring to? No, the other pathc adding oacc_kernels pass group to passes.def. Btw, at some point splitting patches too much becomes very much confusing instead of helping. Richard.
On 24/11/15 15:33, Richard Biener wrote: > On Tue, 24 Nov 2015, Tom de Vries wrote: > >> On 24/11/15 14:13, Richard Biener wrote: >>> On Tue, 24 Nov 2015, Tom de Vries wrote: >>> >>>>> On 23/11/15 11:02, Richard Biener wrote: >>>>>>> On Fri, 20 Nov 2015, Tom de Vries wrote: >>>>>>> >>>>>>>>> On 20/11/15 14:29, Richard Biener wrote: >>>>>>>>>>> I agree it's somewhat of an odd behavior but all passes >>>>>>> should >>>>>>>>>>> either be placed in a sub-pipeline with an outer >>>>>>>>>>> loop_optimizer_init()/finalize () call or call both >>>>>>> themselves. >>>>>>>>> >>>>>>>>> Hmm, but adding loop_optimizer_finalize at the end of pass_lim >>>>>> breaks the >>>>>>>>> loop >>>>>>>>> pipeline. >>>>>>>>> >>>>>>>>> We could use the style used in pass_slp_vectorize::execute: >>>>>>>>> ... >>>>>>>>> pass_slp_vectorize::execute (function *fun) >>>>>>>>> { >>>>>>>>> basic_block bb; >>>>>>>>> >>>>>>>>> bool in_loop_pipeline = scev_initialized_p (); >>>>>>>>> if (!in_loop_pipeline) >>>>>>>>> { >>>>>>>>> loop_optimizer_init (LOOPS_NORMAL); >>>>>>>>> scev_initialize (); >>>>>>>>> } >>>>>>>>> >>>>>>>>> ... >>>>>>>>> >>>>>>>>> if (!in_loop_pipeline) >>>>>>>>> { >>>>>>>>> scev_finalize (); >>>>>>>>> loop_optimizer_finalize (); >>>>>>>>> } >>>>>>>>> ... >>>>>>>>> >>>>>>>>> Although that doesn't strike me as particularly clean. >>>>>>> >>>>>>> At least it would be a consistent "unclean" style. So yes, the >>>>>>> above would work for me. >>>>>>> >>>>> >>>>> Reposting using the in_loop_pipeline style in pass_lim. >>> The tree-ssa-loop-im.c changes are ok >> >> OK, I'll commit those. >> >>> (I suppose the other changes >>> are in the other patch you posted as well). >> >> This ( https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02882.html ) patch >> contains changes related to adding pass_oacc_kernels2. Are those the "other >> changes" you're referring to? > > No, the other pathc adding oacc_kernels pass group to passes.def. > I don't understand. There 's only one patch adding oacc_kernels pass group to passes.def (which is the one in this thread). > Btw, at some point splitting patches too much becomes very much > confusing instead of helping. Would it help if I merge "Add pass_oacc_kernels" with this patch? Thanks, - Tom
On Tue, 24 Nov 2015, Tom de Vries wrote: > On 24/11/15 15:33, Richard Biener wrote: > > On Tue, 24 Nov 2015, Tom de Vries wrote: > > > > > On 24/11/15 14:13, Richard Biener wrote: > > > > On Tue, 24 Nov 2015, Tom de Vries wrote: > > > > > > > > > > On 23/11/15 11:02, Richard Biener wrote: > > > > > > > > On Fri, 20 Nov 2015, Tom de Vries wrote: > > > > > > > > > > > > > > > > > > On 20/11/15 14:29, Richard Biener wrote: > > > > > > > > > > > > I agree it's somewhat of an odd behavior but all passes > > > > > > > > should > > > > > > > > > > > > either be placed in a sub-pipeline with an outer > > > > > > > > > > > > loop_optimizer_init()/finalize () call or call both > > > > > > > > themselves. > > > > > > > > > > > > > > > > > > > > Hmm, but adding loop_optimizer_finalize at the end of > > > > > > > > > > pass_lim > > > > > > > breaks the > > > > > > > > > > loop > > > > > > > > > > pipeline. > > > > > > > > > > > > > > > > > > > > We could use the style used in pass_slp_vectorize::execute: > > > > > > > > > > ... > > > > > > > > > > pass_slp_vectorize::execute (function *fun) > > > > > > > > > > { > > > > > > > > > > basic_block bb; > > > > > > > > > > > > > > > > > > > > bool in_loop_pipeline = scev_initialized_p (); > > > > > > > > > > if (!in_loop_pipeline) > > > > > > > > > > { > > > > > > > > > > loop_optimizer_init (LOOPS_NORMAL); > > > > > > > > > > scev_initialize (); > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > ... > > > > > > > > > > > > > > > > > > > > if (!in_loop_pipeline) > > > > > > > > > > { > > > > > > > > > > scev_finalize (); > > > > > > > > > > loop_optimizer_finalize (); > > > > > > > > > > } > > > > > > > > > > ... > > > > > > > > > > > > > > > > > > > > Although that doesn't strike me as particularly clean. > > > > > > > > > > > > > > > > At least it would be a consistent "unclean" style. So yes, the > > > > > > > > above would work for me. > > > > > > > > > > > > > > > > > > > > Reposting using the in_loop_pipeline style in pass_lim. > > > > The tree-ssa-loop-im.c changes are ok > > > > > > OK, I'll commit those. > > > > > > > (I suppose the other changes > > > > are in the other patch you posted as well). > > > > > > This ( https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02882.html ) patch > > > contains changes related to adding pass_oacc_kernels2. Are those the > > > "other > > > changes" you're referring to? > > > > No, the other pathc adding oacc_kernels pass group to passes.def. > > > > I don't understand. There 's only one patch adding oacc_kernels pass group to > passes.def (which is the one in this thread). > > > Btw, at some point splitting patches too much becomes very much > > confusing instead of helping. > > Would it help if I merge "Add pass_oacc_kernels" with this patch? It would have, yes. As said, the excessive splitting just confuses the review process. Will review in the present state anyway. Richard. > Thanks, > - Tom > >
On Tue, 24 Nov 2015, Tom de Vries wrote: > On 23/11/15 11:02, Richard Biener wrote: > > On Fri, 20 Nov 2015, Tom de Vries wrote: > > > > > On 20/11/15 14:29, Richard Biener wrote: > > > > I agree it's somewhat of an odd behavior but all passes should > > > > either be placed in a sub-pipeline with an outer > > > > loop_optimizer_init()/finalize () call or call both themselves. > > > > > > Hmm, but adding loop_optimizer_finalize at the end of pass_lim breaks the > > > loop > > > pipeline. > > > > > > We could use the style used in pass_slp_vectorize::execute: > > > ... > > > pass_slp_vectorize::execute (function *fun) > > > { > > > basic_block bb; > > > > > > bool in_loop_pipeline = scev_initialized_p (); > > > if (!in_loop_pipeline) > > > { > > > loop_optimizer_init (LOOPS_NORMAL); > > > scev_initialize (); > > > } > > > > > > ... > > > > > > if (!in_loop_pipeline) > > > { > > > scev_finalize (); > > > loop_optimizer_finalize (); > > > } > > > ... > > > > > > Although that doesn't strike me as particularly clean. > > > > At least it would be a consistent "unclean" style. So yes, the > > above would work for me. > > > > Reposting using the in_loop_pipeline style in pass_lim. Ok. Thanks, Richard.
Add pass_oacc_kernels pass group in passes.def 2015-11-09 Tom de Vries <tom@codesourcery.com> * omp-low.c (pass_expand_omp_ssa::clone): New function. * passes.def: Add pass_oacc_kernels pass group. * tree-ssa-loop-ch.c (pass_ch::clone): New function. * tree-ssa-loop-im.c (tree_ssa_lim): Make static. (pass_lim::execute): Allow to run outside pass_tree_loop. --- gcc/omp-low.c | 1 + gcc/passes.def | 18 ++++++++++++++++++ gcc/tree-ssa-loop-ch.c | 2 ++ gcc/tree-ssa-loop-im.c | 12 ++++++++++-- 4 files changed, 31 insertions(+), 2 deletions(-) diff --git a/gcc/omp-low.c b/gcc/omp-low.c index efe5d3a..7318b0e 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -13366,6 +13366,7 @@ public: return !(fun->curr_properties & PROP_gimple_eomp); } virtual unsigned int execute (function *) { return execute_expand_omp (); } + opt_pass * clone () { return new pass_expand_omp_ssa (m_ctxt); } }; // class pass_expand_omp_ssa diff --git a/gcc/passes.def b/gcc/passes.def index 17027786..f1969c0 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -88,7 +88,25 @@ along with GCC; see the file COPYING3. If not see /* pass_build_ealias is a dummy pass that ensures that we execute TODO_rebuild_alias at this point. */ NEXT_PASS (pass_build_ealias); + /* Pass group that runs when the function is an offloaded function + containing oacc kernels loops. Part 1. */ + NEXT_PASS (pass_oacc_kernels); + PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels) + NEXT_PASS (pass_ch); + POP_INSERT_PASSES () NEXT_PASS (pass_fre); + /* Pass group that runs when the function is an offloaded function + containing oacc kernels loops. Part 2. */ + NEXT_PASS (pass_oacc_kernels2); + PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels2) + /* We use pass_lim to rewrite in-memory iteration and reduction + variable accesses in loops into local variables accesses. */ + NEXT_PASS (pass_lim); + NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */); + NEXT_PASS (pass_dce); + NEXT_PASS (pass_parallelize_loops_oacc_kernels); + NEXT_PASS (pass_expand_omp_ssa); + POP_INSERT_PASSES () NEXT_PASS (pass_merge_phi); NEXT_PASS (pass_dse); NEXT_PASS (pass_cd_dce); diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c index 7e618bf..6493fcc 100644 --- a/gcc/tree-ssa-loop-ch.c +++ b/gcc/tree-ssa-loop-ch.c @@ -165,6 +165,8 @@ public: /* Initialize and finalize loop structures, copying headers inbetween. */ virtual unsigned int execute (function *); + opt_pass * clone () { return new pass_ch (m_ctxt); } + protected: /* ch_base method: */ virtual bool process_loop_p (struct loop *loop); diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c index 30b53ce..0d82d36 100644 --- a/gcc/tree-ssa-loop-im.c +++ b/gcc/tree-ssa-loop-im.c @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-ssa-propagate.h" #include "trans-mem.h" #include "gimple-fold.h" +#include "tree-scalar-evolution.h" /* TODO: Support for predicated code motion. I.e. @@ -2496,7 +2497,7 @@ tree_ssa_lim_finalize (void) /* Moves invariants from loops. Only "expensive" invariants are moved out -- i.e. those that are likely to be win regardless of the register pressure. */ -unsigned int +static unsigned int tree_ssa_lim (void) { unsigned int todo; @@ -2560,10 +2561,17 @@ public: unsigned int pass_lim::execute (function *fun) { + bool in_loop_pipeline = scev_initialized_p (); + if (!in_loop_pipeline) + loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS); + if (number_of_loops (fun) <= 1) return 0; + unsigned int todo = tree_ssa_lim (); - return tree_ssa_lim (); + if (!in_loop_pipeline) + loop_optimizer_finalize (); + return todo; } } // anon namespace