Message ID | 87618p1cov.fsf@kepler.schwinge.homeip.net |
---|---|
State | New |
On Tue, 21 Apr 2015, Thomas Schwinge wrote:

> Hi!
>
> On Tue, 25 Nov 2014 12:29:28 +0100, Tom de Vries <Tom_deVries@mentor.com> wrote:
> > On 15-11-14 18:21, Tom de Vries wrote:
> > > On 15-11-14 13:14, Tom de Vries wrote:
> > >> I'm submitting a patch series with initial support for the oacc kernels
> > >> directive.
> > >>
> > >> The patch series uses pass_parallelize_loops to implement parallelization of
> > >> loops in the oacc kernels region.
> > >>
> > >> The patch series consists of these 8 patches:
> > >> ...
> > >> 1 Expand oacc kernels after pass_build_ealias
> > >> 2 Add pass_oacc_kernels
> > >> 3 Add pass_ch_oacc_kernels to pass_oacc_kernels
> > >> 4 Add pass_tree_loop_{init,done} to pass_oacc_kernels
> > >> 5 Add pass_loop_im to pass_oacc_kernels
> > >> 6 Add pass_ccp to pass_oacc_kernels
> > >> 7 Add pass_parloops_oacc_kernels to pass_oacc_kernels
> > >> 8 Do simple omp lowering for no address taken var
> > >> ...
> > >
> > > This patch adds pass_tree_loop_init and pass_tree_loop_done to
> > > pass_oacc_kernels.
> > >
> > > Pass_parallelize_loops is run between these passes in the pass group
> > > pass_tree_loop, since it requires loop information. We do the same for
> > > pass_oacc_kernels.
> > >
> >
> > Updated for moving pass_oacc_kernels down past pass_fre in the pass list.
> >
> > Bootstrapped and reg-tested as before.
> >
> > OK for trunk?

Both passes should be basically no-ops. Why not call
loop_optimizer_init/finalize from expand_omp_ssa instead?

> Committed to gomp-4_0-branch in r222282:
>
> commit cb95b4a1efcdb96c58cda986d53b20c3537c1ab7
> Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
> Date: Tue Apr 21 19:51:33 2015 +0000
>
>     Add pass_tree_loop_{init,done} to pass_oacc_kernels
>
>     gcc/
>     * passes.def: Run pass_tree_loop_init and pass_tree_loop_done in pass
>     group pass_oacc_kernels.
>     * tree-ssa-loop.c (pass_tree_loop_init::clone)
>     (pass_tree_loop_done::clone): New function.
>
>     git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@222282 138bc75d-0d04-0410-961f-82ee72b054a4
> ---
>  gcc/ChangeLog.gomp  | 5 +++++
>  gcc/passes.def      | 2 ++
>  gcc/tree-ssa-loop.c | 2 ++
>  3 files changed, 9 insertions(+)
>
> diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
> index d00c5e0..1fb060f 100644
> --- gcc/ChangeLog.gomp
> +++ gcc/ChangeLog.gomp
> @@ -1,5 +1,10 @@
>  2015-04-21 Tom de Vries <tom@codesourcery.com>
>
> +  * passes.def: Run pass_tree_loop_init and pass_tree_loop_done in pass
> +  group pass_oacc_kernels.
> +  * tree-ssa-loop.c (pass_tree_loop_init::clone)
> +  (pass_tree_loop_done::clone): New function.
> +
>    * omp-low.c (loop_in_oacc_kernels_region_p): New function.
>    * omp-low.h (loop_in_oacc_kernels_region_p): Declare.
>    * passes.def: Add pass_ch_oacc_kernels to pass group pass_oacc_kernels.
> diff --git gcc/passes.def gcc/passes.def
> index 5cdbc87..83ae04e 100644
> --- gcc/passes.def
> +++ gcc/passes.def
> @@ -91,7 +91,9 @@ along with GCC; see the file COPYING3. If not see
>    NEXT_PASS (pass_oacc_kernels);
>    PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
>      NEXT_PASS (pass_ch_oacc_kernels);
> +    NEXT_PASS (pass_tree_loop_init);
>      NEXT_PASS (pass_expand_omp_ssa);
> +    NEXT_PASS (pass_tree_loop_done);
>    POP_INSERT_PASSES ()
>    NEXT_PASS (pass_merge_phi);
>    NEXT_PASS (pass_cd_dce);
> diff --git gcc/tree-ssa-loop.c gcc/tree-ssa-loop.c
> index a041858..2a96a39 100644
> --- gcc/tree-ssa-loop.c
> +++ gcc/tree-ssa-loop.c
> @@ -272,6 +272,7 @@ public:
>
>    /* opt_pass methods: */
>    virtual unsigned int execute (function *);
> +  opt_pass * clone () { return new pass_tree_loop_init (m_ctxt); }
>
>  }; // class pass_tree_loop_init
>
> @@ -566,6 +567,7 @@ public:
>
>    /* opt_pass methods: */
>    virtual unsigned int execute (function *) { return tree_ssa_loop_done (); }
> +  opt_pass * clone () { return new pass_tree_loop_done (m_ctxt); }
>
>  }; // class pass_tree_loop_done
>
>
>
> Grüße,
> Thomas
>
On 22-04-15 09:40, Richard Biener wrote:
> On Tue, 21 Apr 2015, Thomas Schwinge wrote:
>
>> Hi!
>>
>> On Tue, 25 Nov 2014 12:29:28 +0100, Tom de Vries <Tom_deVries@mentor.com> wrote:
>>> On 15-11-14 18:21, Tom de Vries wrote:
>>>> On 15-11-14 13:14, Tom de Vries wrote:
>>>>> I'm submitting a patch series with initial support for the oacc kernels
>>>>> directive.
>>>>>
>>>>> The patch series uses pass_parallelize_loops to implement parallelization of
>>>>> loops in the oacc kernels region.
>>>>>
>>>>> The patch series consists of these 8 patches:
>>>>> ...
>>>>> 1 Expand oacc kernels after pass_build_ealias
>>>>> 2 Add pass_oacc_kernels
>>>>> 3 Add pass_ch_oacc_kernels to pass_oacc_kernels
>>>>> 4 Add pass_tree_loop_{init,done} to pass_oacc_kernels
>>>>> 5 Add pass_loop_im to pass_oacc_kernels
>>>>> 6 Add pass_ccp to pass_oacc_kernels
>>>>> 7 Add pass_parloops_oacc_kernels to pass_oacc_kernels
>>>>> 8 Do simple omp lowering for no address taken var
>>>>> ...
>>>>
>>>> This patch adds pass_tree_loop_init and pass_tree_loop_done to
>>>> pass_oacc_kernels.
>>>>
>>>> Pass_parallelize_loops is run between these passes in the pass group
>>>> pass_tree_loop, since it requires loop information. We do the same for
>>>> pass_oacc_kernels.
>>>>
>>>
>>> Updated for moving pass_oacc_kernels down past pass_fre in the pass list.
>>>
>>> Bootstrapped and reg-tested as before.
>>>
>>> OK for trunk?
>
> Both passes should be basically no-ops. Why not call
> loop_optimizer_init/finalize from expand_omp_ssa instead?
>

The current pass list is:
...
  NEXT_PASS (pass_build_ealias);
  NEXT_PASS (pass_fre);
  /* Pass group that runs when there are oacc kernels in the
     function. */
  NEXT_PASS (pass_oacc_kernels);
  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
    NEXT_PASS (pass_ch_oacc_kernels);
    NEXT_PASS (pass_fre);
    NEXT_PASS (pass_tree_loop_init);
    NEXT_PASS (pass_lim);
    NEXT_PASS (pass_copy_prop);
    NEXT_PASS (pass_scev_cprop);
    NEXT_PASS (pass_parallelize_loops_oacc_kernels);
    NEXT_PASS (pass_expand_omp_ssa);
    NEXT_PASS (pass_tree_loop_done);
  POP_INSERT_PASSES ()
  NEXT_PASS (pass_merge_phi);
  NEXT_PASS (pass_dse);
...

Do you want to call loop_optimizer_init from pass_lim and
loop_optimizer_finalize from pass_expand_omp_ssa, or are things ok as they
are?

Thanks,
- Tom

>> Committed to gomp-4_0-branch in r222282:
>>
>> commit cb95b4a1efcdb96c58cda986d53b20c3537c1ab7
>> Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
>> Date: Tue Apr 21 19:51:33 2015 +0000
>>
>>     Add pass_tree_loop_{init,done} to pass_oacc_kernels
>>
>>     gcc/
>>     * passes.def: Run pass_tree_loop_init and pass_tree_loop_done in pass
>>     group pass_oacc_kernels.
>>     * tree-ssa-loop.c (pass_tree_loop_init::clone)
>>     (pass_tree_loop_done::clone): New function.
>>
>>     git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@222282 138bc75d-0d04-0410-961f-82ee72b054a4
>> ---
>>  gcc/ChangeLog.gomp  | 5 +++++
>>  gcc/passes.def      | 2 ++
>>  gcc/tree-ssa-loop.c | 2 ++
>>  3 files changed, 9 insertions(+)
>>
>> diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
>> index d00c5e0..1fb060f 100644
>> --- gcc/ChangeLog.gomp
>> +++ gcc/ChangeLog.gomp
>> @@ -1,5 +1,10 @@
>>  2015-04-21 Tom de Vries <tom@codesourcery.com>
>>
>> +  * passes.def: Run pass_tree_loop_init and pass_tree_loop_done in pass
>> +  group pass_oacc_kernels.
>> +  * tree-ssa-loop.c (pass_tree_loop_init::clone)
>> +  (pass_tree_loop_done::clone): New function.
>> +
>>    * omp-low.c (loop_in_oacc_kernels_region_p): New function.
>>    * omp-low.h (loop_in_oacc_kernels_region_p): Declare.
>>    * passes.def: Add pass_ch_oacc_kernels to pass group pass_oacc_kernels.
>> diff --git gcc/passes.def gcc/passes.def
>> index 5cdbc87..83ae04e 100644
>> --- gcc/passes.def
>> +++ gcc/passes.def
>> @@ -91,7 +91,9 @@ along with GCC; see the file COPYING3. If not see
>>    NEXT_PASS (pass_oacc_kernels);
>>    PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
>>      NEXT_PASS (pass_ch_oacc_kernels);
>> +    NEXT_PASS (pass_tree_loop_init);
>>      NEXT_PASS (pass_expand_omp_ssa);
>> +    NEXT_PASS (pass_tree_loop_done);
>>    POP_INSERT_PASSES ()
>>    NEXT_PASS (pass_merge_phi);
>>    NEXT_PASS (pass_cd_dce);
>> diff --git gcc/tree-ssa-loop.c gcc/tree-ssa-loop.c
>> index a041858..2a96a39 100644
>> --- gcc/tree-ssa-loop.c
>> +++ gcc/tree-ssa-loop.c
>> @@ -272,6 +272,7 @@ public:
>>
>>    /* opt_pass methods: */
>>    virtual unsigned int execute (function *);
>> +  opt_pass * clone () { return new pass_tree_loop_init (m_ctxt); }
>>
>>  }; // class pass_tree_loop_init
>>
>> @@ -566,6 +567,7 @@ public:
>>
>>    /* opt_pass methods: */
>>    virtual unsigned int execute (function *) { return tree_ssa_loop_done (); }
>> +  opt_pass * clone () { return new pass_tree_loop_done (m_ctxt); }
>>
>>  }; // class pass_tree_loop_done
>>
>>
>>
>> Grüße,
>> Thomas
>>
>
On Tue, 2 Jun 2015, Tom de Vries wrote:

> On 22-04-15 09:40, Richard Biener wrote:
> > On Tue, 21 Apr 2015, Thomas Schwinge wrote:
> >
> > > Hi!
> > >
> > > On Tue, 25 Nov 2014 12:29:28 +0100, Tom de Vries <Tom_deVries@mentor.com>
> > > wrote:
> > > > On 15-11-14 18:21, Tom de Vries wrote:
> > > > > On 15-11-14 13:14, Tom de Vries wrote:
> > > > > > I'm submitting a patch series with initial support for the oacc
> > > > > > kernels
> > > > > > directive.
> > > > > >
> > > > > > The patch series uses pass_parallelize_loops to implement
> > > > > > parallelization of
> > > > > > loops in the oacc kernels region.
> > > > > >
> > > > > > The patch series consists of these 8 patches:
> > > > > > ...
> > > > > > 1 Expand oacc kernels after pass_build_ealias
> > > > > > 2 Add pass_oacc_kernels
> > > > > > 3 Add pass_ch_oacc_kernels to pass_oacc_kernels
> > > > > > 4 Add pass_tree_loop_{init,done} to pass_oacc_kernels
> > > > > > 5 Add pass_loop_im to pass_oacc_kernels
> > > > > > 6 Add pass_ccp to pass_oacc_kernels
> > > > > > 7 Add pass_parloops_oacc_kernels to pass_oacc_kernels
> > > > > > 8 Do simple omp lowering for no address taken var
> > > > > > ...
> > > > >
> > > > > This patch adds pass_tree_loop_init and pass_tree_loop_done to
> > > > > pass_oacc_kernels.
> > > > >
> > > > > Pass_parallelize_loops is run between these passes in the pass group
> > > > > pass_tree_loop, since it requires loop information. We do the same
> > > > > for
> > > > > pass_oacc_kernels.
> > > > >
> > > >
> > > > Updated for moving pass_oacc_kernels down past pass_fre in the pass
> > > > list.
> > > >
> > > > Bootstrapped and reg-tested as before.
> > > >
> > > > OK for trunk?
> >
> > Both passes should be basically no-ops. Why not call
> > loop_optimizer_init/finalize from expand_omp_ssa instead?
> >
>
> The current pass list is:
> ...
>   NEXT_PASS (pass_build_ealias);
>   NEXT_PASS (pass_fre);
>   /* Pass group that runs when there are oacc kernels in the
>      function. */
>   NEXT_PASS (pass_oacc_kernels);
>   PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
>     NEXT_PASS (pass_ch_oacc_kernels);
>     NEXT_PASS (pass_fre);
>     NEXT_PASS (pass_tree_loop_init);
>     NEXT_PASS (pass_lim);
>     NEXT_PASS (pass_copy_prop);
>     NEXT_PASS (pass_scev_cprop);
>     NEXT_PASS (pass_parallelize_loops_oacc_kernels);
>     NEXT_PASS (pass_expand_omp_ssa);
>     NEXT_PASS (pass_tree_loop_done);
>   POP_INSERT_PASSES ()
>   NEXT_PASS (pass_merge_phi);
>   NEXT_PASS (pass_dse);
> ...
>
> Do you want to call loop_optimizer_init from pass_lim and
> loop_optimizer_finalize from pass_expand_omp_ssa, or are things ok as they
> are?

No, Jakub probably means to call loop_optimizer_init/finalize in each
of the passes. Note that keeping loops initialized keeps you in
loop-closed SSA form and also preserves some more loop properties
during cfg-cleanup.

So I think things are ok as they are. As far as I understand, at least
SCEV-cprop and parloops need loop-closed SSA form to work (LIM doesn't
need anything fancy, apart from disambiguated latches).

Btw, I wonder why you don't organize the oacc-kernel passes in
a new simple-IPA group after pass_local_optimization_passes.

Richard.

> Thanks,
> - Tom
>
> > > Committed to gomp-4_0-branch in r222282:
> > >
> > > commit cb95b4a1efcdb96c58cda986d53b20c3537c1ab7
> > > Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
> > > Date: Tue Apr 21 19:51:33 2015 +0000
> > >
> > >     Add pass_tree_loop_{init,done} to pass_oacc_kernels
> > >
> > >     gcc/
> > >     * passes.def: Run pass_tree_loop_init and pass_tree_loop_done
> > >     in pass
> > >     group pass_oacc_kernels.
> > >     * tree-ssa-loop.c (pass_tree_loop_init::clone)
> > >     (pass_tree_loop_done::clone): New function.
> > >
> > >     git-svn-id:
> > >     svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@222282
> > >     138bc75d-0d04-0410-961f-82ee72b054a4
> > > ---
> > >  gcc/ChangeLog.gomp  | 5 +++++
> > >  gcc/passes.def      | 2 ++
> > >  gcc/tree-ssa-loop.c | 2 ++
> > >  3 files changed, 9 insertions(+)
> > >
> > > diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
> > > index d00c5e0..1fb060f 100644
> > > --- gcc/ChangeLog.gomp
> > > +++ gcc/ChangeLog.gomp
> > > @@ -1,5 +1,10 @@
> > >  2015-04-21 Tom de Vries <tom@codesourcery.com>
> > >
> > > +  * passes.def: Run pass_tree_loop_init and pass_tree_loop_done in pass
> > > +  group pass_oacc_kernels.
> > > +  * tree-ssa-loop.c (pass_tree_loop_init::clone)
> > > +  (pass_tree_loop_done::clone): New function.
> > > +
> > >    * omp-low.c (loop_in_oacc_kernels_region_p): New function.
> > >    * omp-low.h (loop_in_oacc_kernels_region_p): Declare.
> > >    * passes.def: Add pass_ch_oacc_kernels to pass group
> > >    pass_oacc_kernels.
> > > diff --git gcc/passes.def gcc/passes.def
> > > index 5cdbc87..83ae04e 100644
> > > --- gcc/passes.def
> > > +++ gcc/passes.def
> > > @@ -91,7 +91,9 @@ along with GCC; see the file COPYING3. If not see
> > >    NEXT_PASS (pass_oacc_kernels);
> > >    PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
> > >      NEXT_PASS (pass_ch_oacc_kernels);
> > > +    NEXT_PASS (pass_tree_loop_init);
> > >      NEXT_PASS (pass_expand_omp_ssa);
> > > +    NEXT_PASS (pass_tree_loop_done);
> > >    POP_INSERT_PASSES ()
> > >    NEXT_PASS (pass_merge_phi);
> > >    NEXT_PASS (pass_cd_dce);
> > > diff --git gcc/tree-ssa-loop.c gcc/tree-ssa-loop.c
> > > index a041858..2a96a39 100644
> > > --- gcc/tree-ssa-loop.c
> > > +++ gcc/tree-ssa-loop.c
> > > @@ -272,6 +272,7 @@ public:
> > >
> > >    /* opt_pass methods: */
> > >    virtual unsigned int execute (function *);
> > > +  opt_pass * clone () { return new pass_tree_loop_init (m_ctxt); }
> > >
> > >  }; // class pass_tree_loop_init
> > >
> > > @@ -566,6 +567,7 @@ public:
> > >
> > >    /* opt_pass methods: */
> > >    virtual unsigned int execute (function *) { return tree_ssa_loop_done
> > >    (); }
> > > +  opt_pass * clone () { return new pass_tree_loop_done (m_ctxt); }
> > >
> > >  }; // class pass_tree_loop_done
> > >
> > >
> > >
> > > Grüße,
> > > Thomas
> > >
> >
>
>
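The difference between bracketing the pass group with dedicated init/done passes and setting up loop information inside each pass can be illustrated with a small standalone model. This is only a sketch under loud assumptions: `toy_loop_state`, `toy_init`, `toy_finalize`, `run_bracketed`, `run_per_pass` and `toy_group` are hypothetical names invented for illustration, not GCC functions, and the model only counts how often loop information has to be rebuilt.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy stand-in for per-function loop information; NOT GCC's data structure.
struct toy_loop_state {
  bool available = false;  // loop information currently exists
  int setups = 0;          // how often it had to be (re)built
};

// Hypothetical helpers that model "set up" and "tear down".
void toy_init(toy_loop_state &s) { s.available = true; ++s.setups; }
void toy_finalize(toy_loop_state &s) { s.available = false; }

using toy_pass_fn = std::function<void(const toy_loop_state &)>;

// Variant 1: dedicated init/done steps bracket the whole pass group,
// so loop information stays alive between the passes.
int run_bracketed(const std::vector<toy_pass_fn> &group) {
  toy_loop_state s;
  toy_init(s);
  for (const auto &pass : group) pass(s);
  toy_finalize(s);
  return s.setups;
}

// Variant 2: every pass sets up and tears down loop information itself,
// so nothing survives from one pass to the next.
int run_per_pass(const std::vector<toy_pass_fn> &group) {
  toy_loop_state s;
  for (const auto &pass : group) {
    toy_init(s);
    pass(s);
    toy_finalize(s);
  }
  return s.setups;
}

// Three passes that all require loop information.
std::vector<toy_pass_fn> toy_group() {
  toy_pass_fn needs_loops = [](const toy_loop_state &s) { assert(s.available); };
  return {needs_loops, needs_loops, needs_loops};
}
```

In this model `run_bracketed(toy_group())` builds the state once, while `run_per_pass(toy_group())` builds it once per pass; whatever was established at set-up time is discarded at each tear-down in the second variant.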
On 02-06-15 15:58, Richard Biener wrote:
> Btw, I wonder why you don't organize the oacc-kernel passes in
> a new simple-IPA group after pass_local_optimization_passes.

I've placed the pass group as early as possible (meaning after ealias) and put
passes in front only when that served a purpose for parallelization
(pass_fre). The idea there was to minimize the number of passes that have to
be modified to deal (conservatively) with a kernels region.

So AFAICT, there's nothing against placing the pass group after
pass_local_optimization_passes, other than that it's more work in more passes
to keep the region intact.

What would be the benefit of doing so?

Thanks,
- Tom
On Tue, 2 Jun 2015, Tom de Vries wrote:

> On 02-06-15 15:58, Richard Biener wrote:
> > Btw, I wonder why you don't organize the oacc-kernel passes in
> > a new simple-IPA group after pass_local_optimization_passes.
>
> I've placed the pass group as early as possible (meaning after ealias) and put
> passes in front only when that served a purpose for parallelization
> (pass_fre). The idea there was to minimize the number of passes that have to
> be modified to deal (conservatively) with a kernels region.

I see.

> So AFAICT, there's nothing against placing the pass group after
> pass_local_optimization_passes, other than that it's more work in more passes
> to keep the region intact.
>
> What would be the benefit of doing so?

Get all the local optimizations done, including pure-const discovery.

Richard.
diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index d00c5e0..1fb060f 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,10 @@
 2015-04-21 Tom de Vries <tom@codesourcery.com>

+  * passes.def: Run pass_tree_loop_init and pass_tree_loop_done in pass
+  group pass_oacc_kernels.
+  * tree-ssa-loop.c (pass_tree_loop_init::clone)
+  (pass_tree_loop_done::clone): New function.
+
   * omp-low.c (loop_in_oacc_kernels_region_p): New function.
   * omp-low.h (loop_in_oacc_kernels_region_p): Declare.
   * passes.def: Add pass_ch_oacc_kernels to pass group pass_oacc_kernels.
diff --git gcc/passes.def gcc/passes.def
index 5cdbc87..83ae04e 100644
--- gcc/passes.def
+++ gcc/passes.def
@@ -91,7 +91,9 @@ along with GCC; see the file COPYING3. If not see
   NEXT_PASS (pass_oacc_kernels);
   PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
     NEXT_PASS (pass_ch_oacc_kernels);
+    NEXT_PASS (pass_tree_loop_init);
     NEXT_PASS (pass_expand_omp_ssa);
+    NEXT_PASS (pass_tree_loop_done);
   POP_INSERT_PASSES ()
   NEXT_PASS (pass_merge_phi);
   NEXT_PASS (pass_cd_dce);
diff --git gcc/tree-ssa-loop.c gcc/tree-ssa-loop.c
index a041858..2a96a39 100644
--- gcc/tree-ssa-loop.c
+++ gcc/tree-ssa-loop.c
@@ -272,6 +272,7 @@ public:

   /* opt_pass methods: */
   virtual unsigned int execute (function *);
+  opt_pass * clone () { return new pass_tree_loop_init (m_ctxt); }

 }; // class pass_tree_loop_init

@@ -566,6 +567,7 @@ public:

   /* opt_pass methods: */
   virtual unsigned int execute (function *) { return tree_ssa_loop_done (); }
+  opt_pass * clone () { return new pass_tree_loop_done (m_ctxt); }

 }; // class pass_tree_loop_done
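
The patch gives both passes a `clone ()` method because they now occur a second time in the pass list, inside pass_oacc_kernels in addition to their existing position. The following standalone sketch shows the general idea of a pass list that needs a fresh instance for each repeated use. It is a simplified model, not GCC's pass manager: `toy_pass`, `toy_loop_init`, `toy_no_clone` and `schedule` are hypothetical names, and the assumption that the first use takes the original object while later uses take clones is part of the model.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a compiler pass; NOT GCC's opt_pass.
struct toy_pass {
  explicit toy_pass(std::string n) : name(std::move(n)) {}
  virtual ~toy_pass() = default;

  // By default a pass refuses to be duplicated.
  virtual std::shared_ptr<toy_pass> clone() {
    throw std::logic_error("pass " + name + " does not support cloning");
  }

  std::string name;
  int instance = 1;  // which occurrence in the pass list this object is
};

// A pass that may be scheduled several times: it overrides clone().
struct toy_loop_init : toy_pass {
  toy_loop_init() : toy_pass("loop_init") {}
  std::shared_ptr<toy_pass> clone() override {
    return std::make_shared<toy_loop_init>();
  }
};

// A pass without a clone() override: usable once only.
struct toy_no_clone : toy_pass {
  toy_no_clone() : toy_pass("no_clone") {}
};

// Place `uses` occurrences of a pass in a list (modelling assumption:
// the first use takes the original, every later use takes a clone).
std::vector<std::shared_ptr<toy_pass>> schedule(std::shared_ptr<toy_pass> original,
                                                int uses) {
  assert(uses >= 1);
  std::vector<std::shared_ptr<toy_pass>> list;
  for (int i = 1; i <= uses; ++i) {
    std::shared_ptr<toy_pass> p = (i == 1) ? original : original->clone();
    p->instance = i;
    list.push_back(p);
  }
  return list;
}
```

Calling `schedule(std::make_shared<toy_loop_init>(), 2)` yields two distinct objects, whereas the same call with `toy_no_clone` throws on the second use, which is the situation the two one-line `clone ()` overrides in the patch avoid.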