Message ID | CAO2gOZVj7vwDDtn1=XjE4PsiB-1B-a-9Ku2gFHViOVOLZUYMrg@mail.gmail.com |
---|---|
State | New |
Headers | show |
> Is this patch ok for trunk? Bootstrapped and regression test on-going. > > Thanks, > Dehao > > 2014-05-16 Dehao Chen <dehao@google.com> > > * tree-inline.c (initialize_cfun): Ensure count_scale is no larger > than REG_BR_PROB_BASE. > (copy_cfg_body): Likewise. This seems like wrong place to paper around the problem - symmetric count scaling is done during production of the inline clone. I think if we want to be smart about broken profiles, we should do it at that place instead here at inliner... What kind of problem does this patch solve? Honza
On Fri, May 16, 2014 at 4:41 PM, Jan Hubicka <hubicka@ucw.cz> wrote: > > > Is this patch ok for trunk? Bootstrapped and regression test on-going. > > > > Thanks, > > Dehao > > > > 2014-05-16 Dehao Chen <dehao@google.com> > > > > * tree-inline.c (initialize_cfun): Ensure count_scale is no larger > > than REG_BR_PROB_BASE. > > (copy_cfg_body): Likewise. > > This seems like wrong place to paper around the problem - symmetric count > scaling is done during production of the inline clone. I think if we want to > be smart about broken profiles, we should do it at that place instead here > at inliner... > > What kind of problem does this patch solve? In AutoFDO, a basic block's count can be much larger than it's actual count because debug info might be incorrect. In this case, a call edge count (calculated from BB count) could be much larger than callee's header count, making the count_scale incorrectly large. Dehao > > > Honza
> In AutoFDO, a basic block's count can be much larger than it's actual > count because debug info might be incorrect. In this case, a call edge > count (calculated from BB count) could be much larger than callee's > header count, making the count_scale incorrectly large. In this case I still think we should handle this when producing the clone: we do not want to have clone's count much larger as well, so i think inliner and ipa-cp needs to deal with capping here instead.... Honza > > Dehao > > > > > > Honza
Do you mean adjusting bb->count? Because in expand_call_inline(tree-inline.c), it will use bb->count to pass into copy_body to calculate count_scale. Thanks, Dehao On Fri, May 16, 2014 at 5:22 PM, Jan Hubicka <hubicka@ucw.cz> wrote: >> In AutoFDO, a basic block's count can be much larger than it's actual >> count because debug info might be incorrect. In this case, a call edge >> count (calculated from BB count) could be much larger than callee's >> header count, making the count_scale incorrectly large. > > In this case I still think we should handle this when producing the clone: > we do not want to have clone's count much larger as well, so i think inliner > and ipa-cp needs to deal with capping here instead.... > > Honza >> >> Dehao >> > >> > >> > Honza
> Do you mean adjusting bb->count? Because in > expand_call_inline(tree-inline.c), it will use bb->count to pass into > copy_body to calculate count_scale. What about taking here callee->count instead? For inline nodes without any capping hack, bb->count == edge->count = callee->count. When profile ends up being obviously inconsistent, I would say that inliner can cap callee->count during producing the clone... Honza > > Thanks, > Dehao > > On Fri, May 16, 2014 at 5:22 PM, Jan Hubicka <hubicka@ucw.cz> wrote: > >> In AutoFDO, a basic block's count can be much larger than it's actual > >> count because debug info might be incorrect. In this case, a call edge > >> count (calculated from BB count) could be much larger than callee's > >> header count, making the count_scale incorrectly large. > > > > In this case I still think we should handle this when producing the clone: > > we do not want to have clone's count much larger as well, so i think inliner > > and ipa-cp needs to deal with capping here instead.... > > > > Honza > >> > >> Dehao > >> > > >> > > >> > Honza
Index: gcc/tree-inline.c =================================================================== --- gcc/tree-inline.c (revision 210535) +++ gcc/tree-inline.c (working copy) @@ -2190,7 +2190,8 @@ initialize_cfun (tree new_fndecl, tree callee_fnde if (!DECL_RESULT (new_fndecl)) DECL_RESULT (new_fndecl) = DECL_RESULT (callee_fndecl); - if (ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count) + if (ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count + && count < ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count) count_scale = GCOV_COMPUTE_SCALE (count, ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count); @@ -2451,7 +2452,8 @@ copy_cfg_body (copy_body_data * id, gcov_type coun freqs_to_counts (id->src_node, count > in_count ? count : in_count); } - if (ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count) + if (ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count + && count < ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count) count_scale = GCOV_COMPUTE_SCALE (count, ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count);