Message ID | CAO2gOZUhpbGL02wQbTi9tp1SByj-NkDfhCB3r_QK-WOVCLcMiQ@mail.gmail.com |
---|---|
State | New |
Headers | show |
Ok for now as a workraround, but this is probably not a long term fix. David On Mon, Nov 17, 2014 at 12:47 PM, Dehao Chen <dehao@google.com> wrote: > The patch was updated to ignore comdat einline tuning for AutoFDO. > Performance testing is green. > > OK for google-4_9? > > Thanks, > Dehao > > Index: gcc/auto-profile.c > =================================================================== > --- gcc/auto-profile.c (revision 217523) > +++ gcc/auto-profile.c (working copy) > @@ -1771,6 +1771,7 @@ auto_profile (void) > free_dominance_info (CDI_DOMINATORS); > free_dominance_info (CDI_POST_DOMINATORS); > rebuild_cgraph_edges (); > + compute_inline_parameters (cgraph_get_node > (current_function_decl), true); > pop_cfun (); > } > > Index: gcc/ipa-inline.c > =================================================================== > --- gcc/ipa-inline.c (revision 217523) > +++ gcc/ipa-inline.c (working copy) > @@ -501,7 +501,7 @@ want_early_inline_function_p (struct cgraph_edge * > growth); > want_inline = false; > } > - else if (DECL_COMDAT (callee->decl) > + else if (!flag_auto_profile && DECL_COMDAT (callee->decl) > && growth <= PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_COMDAT)) > ; > else if ((n = num_calls (callee)) != 0 > > On Thu, Nov 13, 2014 at 3:42 PM, Dehao Chen <dehao@google.com> wrote: >> We do not do sophisticated recursive call detection in einline phase. >> It only happens in ipa-inline phase. >> >> Dehao >> >> On Thu, Nov 13, 2014 at 3:18 PM, Xinliang David Li <davidxl@google.com> wrote: >>> On Thu, Nov 13, 2014 at 2:57 PM, Dehao Chen <dehao@google.com> wrote: >>>> IIRC, AutoFDO the actual iteration for AutoFDO is mostly <3. But it >>>> should not harm to set max iter as 10. >>>> >>>> On Thu, Nov 13, 2014 at 2:51 PM, Xinliang David Li <davidxl@google.com> wrote: >>>>> After inline summary is recomputed, the large code growth problem will >>>>> also be better controlled, right? >>>> >>>> For this case, recomputing inline summary does not help because the >>>> code was bloated in first einline phase. >>> >>> For recursive inlining, the inline summary for the cloned edges need >>> to be updated to prevent the growth? >>> >>> david >>> >>>> >>>> Dehao >>>> >>>>> >>>>> David >>>>> >>>>> On Thu, Nov 13, 2014 at 2:48 PM, Xinliang David Li <davidxl@google.com> wrote: >>>>>> Is there a need to have 10 iterations of early inline for autofdo? >>>>>> >>>>>> David >>>>>> >>>>>> On Thu, Nov 13, 2014 at 2:25 PM, Dehao Chen <dehao@google.com> wrote: >>>>>>> In AutoFDO, we increase einline iterations. This could lead to >>>>>>> extensive code bloat if we have recursive calls like: >>>>>>> >>>>>>> dtor() { >>>>>>> destroy(node); >>>>>>> } >>>>>>> >>>>>>> destroy(node) { >>>>>>> destroy(left) >>>>>>> destroy(right) >>>>>>> } >>>>>>> >>>>>>> In this case, the size growth will be around 8 which is smaller than >>>>>>> threshold (11). However, if we allow this to happen for 2 iterations, >>>>>>> it will expand the size by 1024X. To fix this problem, we want to set >>>>>>> a much smaller threshold in the AutoFDO case. This is because AutoFDO >>>>>>> do not not rely on aggressive einline to gain more profile context. >>>>>>> >>>>>>> And also, in AutoFDO pass, after we processed a function, we need to >>>>>>> recompute inline parameters because rebuild_cgraph_edges will zero out >>>>>>> all inline parameters. >>>>>>> >>>>>>> The patch is attached below, bootstrapped and perf test on-going. OK >>>>>>> for google-4_9? >>>>>>> >>>>>>> Thanks, >>>>>>> Dehao >>>>>>> >>>>>>> Index: gcc/auto-profile.c >>>>>>> =================================================================== >>>>>>> --- gcc/auto-profile.c (revision 217523) >>>>>>> +++ gcc/auto-profile.c (working copy) >>>>>>> @@ -1771,6 +1771,7 @@ auto_profile (void) >>>>>>> free_dominance_info (CDI_DOMINATORS); >>>>>>> free_dominance_info (CDI_POST_DOMINATORS); >>>>>>> rebuild_cgraph_edges (); >>>>>>> + compute_inline_parameters (cgraph_get_node >>>>>>> (current_function_decl), true); >>>>>>> pop_cfun (); >>>>>>> } >>>>>>> >>>>>>> Index: gcc/opts.c >>>>>>> =================================================================== >>>>>>> --- gcc/opts.c (revision 217523) >>>>>>> +++ gcc/opts.c (working copy) >>>>>>> @@ -1853,6 +1853,12 @@ common_handle_option (struct gcc_options *opts, >>>>>>> maybe_set_param_value ( >>>>>>> PARAM_EARLY_INLINER_MAX_ITERATIONS, 10, >>>>>>> opts->x_param_values, opts_set->x_param_values); >>>>>>> + maybe_set_param_value ( >>>>>>> + PARAM_EARLY_INLINING_INSNS, 4, >>>>>>> + opts->x_param_values, opts_set->x_param_values); >>>>>>> + maybe_set_param_value ( >>>>>>> + PARAM_EARLY_INLINING_INSNS_COMDAT, 4, >>>>>>> + opts->x_param_values, opts_set->x_param_values); >>>>>>> value = true; >>>>>>> /* No break here - do -fauto-profile processing. */ >>>>>>> case OPT_fauto_profile:
Xinliang David Li <davidxl@google.com> writes: > Ok for now as a workraround, but this is probably not a long term fix. Is the workaround needed for the mainline autofdo version too? -Andi > > David > > > On Mon, Nov 17, 2014 at 12:47 PM, Dehao Chen <dehao@google.com> wrote: >> The patch was updated to ignore comdat einline tuning for AutoFDO. >> Performance testing is green.
There are actually two patches needed to port to mainline. I'll send out the patch tomorrow. Dehao On Mon, Nov 17, 2014 at 4:58 PM, Andi Kleen <andi@firstfloor.org> wrote: > Xinliang David Li <davidxl@google.com> writes: > >> Ok for now as a workraround, but this is probably not a long term fix. > > Is the workaround needed for the mainline autofdo version too? > > -Andi >> >> David >> >> >> On Mon, Nov 17, 2014 at 12:47 PM, Dehao Chen <dehao@google.com> wrote: >>> The patch was updated to ignore comdat einline tuning for AutoFDO. >>> Performance testing is green.
Index: gcc/auto-profile.c =================================================================== --- gcc/auto-profile.c (revision 217523) +++ gcc/auto-profile.c (working copy) @@ -1771,6 +1771,7 @@ auto_profile (void) free_dominance_info (CDI_DOMINATORS); free_dominance_info (CDI_POST_DOMINATORS); rebuild_cgraph_edges (); + compute_inline_parameters (cgraph_get_node (current_function_decl), true); pop_cfun (); } Index: gcc/ipa-inline.c =================================================================== --- gcc/ipa-inline.c (revision 217523) +++ gcc/ipa-inline.c (working copy) @@ -501,7 +501,7 @@ want_early_inline_function_p (struct cgraph_edge * growth); want_inline = false; } - else if (DECL_COMDAT (callee->decl) + else if (!flag_auto_profile && DECL_COMDAT (callee->decl) && growth <= PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_COMDAT)) ; else if ((n = num_calls (callee)) != 0