Message ID | 20200327180656.83131-2-kito.cheng@sifive.com |
---|---|
State | New |
Headers | show |
Series | [1/2] Move out increase_alignment into ipa-increase-alignment.cc | expand |
On Fri, Mar 27, 2020 at 11:07 AM Kito Cheng <kito.cheng@sifive.com> wrote: > > - The alignment for local variable was adjust during estimate_stack_frame_size, > however it seems wrong spot to adjust that, expand phase will adjust that > but it little too late to some gimple optimization, which rely on certain > target hooks need to check alignment, forwprop is an example for > that, result of simplify_builtin_call rely on the alignment on some > target like ARM or RISC-V. > > - So we must adjust at some point, and here is already a pass to adjust > alignment, which is ipa-increase-alignment, used for tree vectorization. > > - This patch fix gfortran.dg/pr45636.f90 for arm and riscv. > > - Regression test on riscv32/riscv64 and x86_64-linux-gnu, no new fail > introduced. > > gcc/ChangeLog > > PR target/90811 > * ipa-increase-alignment.cc (increase_alignment_local_var): New. > (increase_alignment_global_var): New. > (pass_ipa_increase_alignment::gate): Remove. > --- > gcc/ipa-increase-alignment.cc | 45 ++++++++++++++++++++++++++++------- > 1 file changed, 36 insertions(+), 9 deletions(-) > > diff --git a/gcc/ipa-increase-alignment.cc b/gcc/ipa-increase-alignment.cc > index 6e34124bc03..8f28ad83349 100644 > --- a/gcc/ipa-increase-alignment.cc > +++ b/gcc/ipa-increase-alignment.cc > @@ -148,10 +148,35 @@ get_vec_alignment_for_type (tree type) > return (alignment > TYPE_ALIGN (type)) ? alignment : 0; > } > > -/* Entry point to increase_alignment pass. */ > -static unsigned int > -increase_alignment (void) > +/* Adjust alignment for local variable. */ > +static void > +increase_alignment_local_var (void) > +{ > + size_t i; > + tree var; > + struct cgraph_node *node; > + unsigned int align; > + > + FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) > + { > + function *fun = node->get_fun (); > + FOR_EACH_LOCAL_DECL (fun, i, var) > + { > + align = LOCAL_DECL_ALIGNMENT (var); > + > + SET_DECL_ALIGN (var, align); I think this is wrong if the user has set the alignment already. You need to check DECL_USER_ALIGN. Thanks, Andrew Pinski > + } > + } > +} > + > +/* Adjust alignment for global variable, only used for tree vectorization > + currently. */ > +static void > +increase_alignment_global_var (void) > { > + if (!(flag_section_anchors && flag_tree_loop_vectorize)) > + return; > + > varpool_node *vnode; > > vect_location = dump_user_location_t (); > @@ -178,6 +203,14 @@ increase_alignment (void) > } > > delete type_align_map; > +} > + > +/* Entry point to increase_alignment pass. */ > +static unsigned int > +increase_alignment (void) > +{ > + increase_alignment_local_var (); > + increase_alignment_global_var (); > return 0; > } > > @@ -204,12 +237,6 @@ public: > : simple_ipa_opt_pass (pass_data_ipa_increase_alignment, ctxt) > {} > > - /* opt_pass methods: */ > - virtual bool gate (function *) > - { > - return flag_section_anchors && flag_tree_loop_vectorize; > - } > - > virtual unsigned int execute (function *) { return increase_alignment (); } > > }; // class pass_ipa_increase_alignment > -- > 2.25.2 >
On Sat, Mar 28, 2020 at 02:06:56AM +0800, Kito Cheng wrote: > PR target/90811 > * ipa-increase-alignment.cc (increase_alignment_local_var): New. > (increase_alignment_global_var): New. > (pass_ipa_increase_alignment::gate): Remove. I'm afraid this is too early, pass_ipa_increase_alignment is part of all_small_ipa_passes list, those are run before the LTO streaming. See cgraphunit.c (ipa_passes). For OpenMP/OpenACC offloading, this has to run after the streaming, which means either a pass in the all_regular_ipa_passes list, or much better (I don't see any advantage of this being an IPA pass) some new pass that is run very early in the all_passes list. At which point I don't really see the benefit of moving this into a new file either. Jakub
On Fri, Mar 27, 2020 at 7:27 PM Jakub Jelinek <jakub@redhat.com> wrote: > > On Sat, Mar 28, 2020 at 02:06:56AM +0800, Kito Cheng wrote: > > PR target/90811 > > * ipa-increase-alignment.cc (increase_alignment_local_var): New. > > (increase_alignment_global_var): New. > > (pass_ipa_increase_alignment::gate): Remove. > > I'm afraid this is too early, pass_ipa_increase_alignment is part of > all_small_ipa_passes list, those are run before the LTO streaming. > See cgraphunit.c (ipa_passes). > For OpenMP/OpenACC offloading, this has to run after the streaming, > which means either a pass in the all_regular_ipa_passes list, or much better > (I don't see any advantage of this being an IPA pass) some new pass that is > run very early in the all_passes list. It's an IPA pass because we IPA propagate alignment. I fear that offload targets have to agree on local variable alignment if we want to go this way (update DECL_ALIGN). But I wonder if we can instead fix the memcpy inlining issue by making the predicates involved honor LOCAL_ALIGNMENT instead of relying on DECL_ALIGN? > > At which point I don't really see the benefit of moving this into a new file > either. > > Jakub >
On Mon, Mar 30, 2020 at 11:09:40AM +0200, Richard Biener wrote: > On Fri, Mar 27, 2020 at 7:27 PM Jakub Jelinek <jakub@redhat.com> wrote: > > > > On Sat, Mar 28, 2020 at 02:06:56AM +0800, Kito Cheng wrote: > > > PR target/90811 > > > * ipa-increase-alignment.cc (increase_alignment_local_var): New. > > > (increase_alignment_global_var): New. > > > (pass_ipa_increase_alignment::gate): Remove. > > > > I'm afraid this is too early, pass_ipa_increase_alignment is part of > > all_small_ipa_passes list, those are run before the LTO streaming. > > See cgraphunit.c (ipa_passes). > > For OpenMP/OpenACC offloading, this has to run after the streaming, > > which means either a pass in the all_regular_ipa_passes list, or much better > > (I don't see any advantage of this being an IPA pass) some new pass that is > > run very early in the all_passes list. > > It's an IPA pass because we IPA propagate alignment. For global vars sure, but we are talking here about automatic vars, and the offloading targets really have serious problems dealing with the 16/32 byte alignment forced from x86 backend. Jakub
> But I wonder if we can instead fix the memcpy inlining issue by > making the predicates involved honor LOCAL_ALIGNMENT > instead of relying on DECL_ALIGN? The memcpy inlining issue is not the only one affected by alignment issue, I guess? So I think it would be better to fix DECL_ALIGN? On Mon, Mar 30, 2020 at 5:15 PM Jakub Jelinek <jakub@redhat.com> wrote: > > On Mon, Mar 30, 2020 at 11:09:40AM +0200, Richard Biener wrote: > > On Fri, Mar 27, 2020 at 7:27 PM Jakub Jelinek <jakub@redhat.com> wrote: > > > > > > On Sat, Mar 28, 2020 at 02:06:56AM +0800, Kito Cheng wrote: > > > > PR target/90811 > > > > * ipa-increase-alignment.cc (increase_alignment_local_var): New. > > > > (increase_alignment_global_var): New. > > > > (pass_ipa_increase_alignment::gate): Remove. > > > > > > I'm afraid this is too early, pass_ipa_increase_alignment is part of > > > all_small_ipa_passes list, those are run before the LTO streaming. > > > See cgraphunit.c (ipa_passes). > > > For OpenMP/OpenACC offloading, this has to run after the streaming, > > > which means either a pass in the all_regular_ipa_passes list, or much better > > > (I don't see any advantage of this being an IPA pass) some new pass that is > > > run very early in the all_passes list. > > > > It's an IPA pass because we IPA propagate alignment. > > For global vars sure, but we are talking here about automatic vars, and the > offloading targets really have serious problems dealing with the 16/32 byte > alignment forced from x86 backend. > > Jakub >
Hi Jakub: I saw the omp and oacc related passes are in the head of all_passes, so I plan added after pass_omp_target_link, does it late enough? diff --git a/gcc/passes.def b/gcc/passes.def index 2bf2cb78fc5..92cbe587a8a 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -183,6 +183,7 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_oacc_device_lower); NEXT_PASS (pass_omp_device_lower); NEXT_PASS (pass_omp_target_link); + NEXT_PASS (pass_adjust_alignment); NEXT_PASS (pass_all_optimizations); PUSH_INSERT_PASSES_WITHIN (pass_all_optimizations) NEXT_PASS (pass_remove_cgraph_callee_edges); Thanks :) On Sat, Mar 28, 2020 at 2:27 AM Jakub Jelinek <jakub@redhat.com> wrote: > > On Sat, Mar 28, 2020 at 02:06:56AM +0800, Kito Cheng wrote: > > PR target/90811 > > * ipa-increase-alignment.cc (increase_alignment_local_var): New. > > (increase_alignment_global_var): New. > > (pass_ipa_increase_alignment::gate): Remove. > > I'm afraid this is too early, pass_ipa_increase_alignment is part of > all_small_ipa_passes list, those are run before the LTO streaming. > See cgraphunit.c (ipa_passes). > For OpenMP/OpenACC offloading, this has to run after the streaming, > which means either a pass in the all_regular_ipa_passes list, or much better > (I don't see any advantage of this being an IPA pass) some new pass that is > run very early in the all_passes list. > > At which point I don't really see the benefit of moving this into a new file > either. > > Jakub >
Hi Andrew: > > + FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) > > + { > > + function *fun = node->get_fun (); > > + FOR_EACH_LOCAL_DECL (fun, i, var) > > + { > > + align = LOCAL_DECL_ALIGNMENT (var); > > + > > + SET_DECL_ALIGN (var, align); > > > I think this is wrong if the user has set the alignment already. > You need to check DECL_USER_ALIGN. Good point, maybe we need to fix align_local_variable too :) Thanks.
On Mon, Mar 30, 2020 at 06:24:32PM +0800, Kito Cheng wrote: > Hi Jakub: > > I saw the omp and oacc related passes are in the head of all_passes, > so I plan added after pass_omp_target_link, does it late enough? Yes, that is ok. > diff --git a/gcc/passes.def b/gcc/passes.def > index 2bf2cb78fc5..92cbe587a8a 100644 > --- a/gcc/passes.def > +++ b/gcc/passes.def > @@ -183,6 +183,7 @@ along with GCC; see the file COPYING3. If not see > NEXT_PASS (pass_oacc_device_lower); > NEXT_PASS (pass_omp_device_lower); > NEXT_PASS (pass_omp_target_link); > + NEXT_PASS (pass_adjust_alignment); > NEXT_PASS (pass_all_optimizations); > PUSH_INSERT_PASSES_WITHIN (pass_all_optimizations) > NEXT_PASS (pass_remove_cgraph_callee_edges); Jakub
diff --git a/gcc/ipa-increase-alignment.cc b/gcc/ipa-increase-alignment.cc index 6e34124bc03..8f28ad83349 100644 --- a/gcc/ipa-increase-alignment.cc +++ b/gcc/ipa-increase-alignment.cc @@ -148,10 +148,35 @@ get_vec_alignment_for_type (tree type) return (alignment > TYPE_ALIGN (type)) ? alignment : 0; } -/* Entry point to increase_alignment pass. */ -static unsigned int -increase_alignment (void) +/* Adjust alignment for local variable. */ +static void +increase_alignment_local_var (void) +{ + size_t i; + tree var; + struct cgraph_node *node; + unsigned int align; + + FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) + { + function *fun = node->get_fun (); + FOR_EACH_LOCAL_DECL (fun, i, var) + { + align = LOCAL_DECL_ALIGNMENT (var); + + SET_DECL_ALIGN (var, align); + } + } +} + +/* Adjust alignment for global variable, only used for tree vectorization + currently. */ +static void +increase_alignment_global_var (void) { + if (!(flag_section_anchors && flag_tree_loop_vectorize)) + return; + varpool_node *vnode; vect_location = dump_user_location_t (); @@ -178,6 +203,14 @@ increase_alignment (void) } delete type_align_map; +} + +/* Entry point to increase_alignment pass. */ +static unsigned int +increase_alignment (void) +{ + increase_alignment_local_var (); + increase_alignment_global_var (); return 0; } @@ -204,12 +237,6 @@ public: : simple_ipa_opt_pass (pass_data_ipa_increase_alignment, ctxt) {} - /* opt_pass methods: */ - virtual bool gate (function *) - { - return flag_section_anchors && flag_tree_loop_vectorize; - } - virtual unsigned int execute (function *) { return increase_alignment (); } }; // class pass_ipa_increase_alignment