diff mbox

decide edge's hotness when there is profile info

Message ID CAAe5K+XHKWOkZEaavJkFEfhWME4qKg_ByuXSrzxhKjgCu-QpWQ@mail.gmail.com
State New
Headers show

Commit Message

Teresa Johnson Oct. 18, 2013, 8:34 p.m. UTC
Here is the patch updated to use the new parameter from r203830.
Passed bootstrap and regression tests.

2013-10-18  Jan Hubicka  <jh@suse.cz>
            Teresa Johnson  <tejohnson@google.com>

        * predict.c (handle_missing_profiles): New function.
        (counts_to_freqs): Don't overwrite estimated frequencies
        when function has no profile counts.
        * predict.h (handle_missing_profiles): Declare.
        * tree-profile.c (tree_profiling): Invoke handle_missing_profiles.


On Wed, Oct 16, 2013 at 12:02 PM, Teresa Johnson <tejohnson@google.com> wrote:
> On Tue, Oct 15, 2013 at 3:39 PM, Teresa Johnson <tejohnson@google.com> wrote:
>> On Tue, Oct 15, 2013 at 2:57 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>>> On Tue, Oct 15, 2013 at 2:03 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>>> >> Yes, that would work. So let's discard this patch because the fix for
>>>> >> comdat can also fix this problem.
>>>> >
>>>> > Unforutnately ipa-profile-estimate is an IPA pass and as such it does
>>>> > not have access to profile_status_to_function.
>>>>
>>>> I see. I was coding that up today but hadn't tested it yet.
>>>>
>>>> > You can probably just factor this out into a function that can be called
>>>> > and for normal FDO we call it where the loop stis now and for auto-FDO we can
>>>> > probably have another invocation from before early passes where auto-FDO is collected.
>>>>
>>>> Ok, let's go with that approach for now. It won't address the 0 count
>>>> COMDAT calling another 0 count COMDAT problem, but I will probably
>>>> just find a way to deal with this when inlining.
>>>
>>> You can still propagate, since tree-profile is an simple-ipa pass.
>>
>> Ok, I think I will leave that for the second patch, since I need to do
>> some testing on the effects - i.e. the propagation I had in mind would
>> make any 0-count routine called by a routine that was dropped to
>> guessed to also be dropped to guessed (and no longer unlikely). It may
>> be too aggressive, I need to check.
>>
>>>>
>>>> >> >>> +      if (node->count)
>>>> >> >>> + continue;
>>>> > Also here we should sum the counts and consider function non unlikely executed
>>>> > in the same way as probably_never_executed does.
>>>>
>>>> I assume you mean by doing the same comparison to the number of
>>>> profile->runs. Yes, this makes sense.
>>>
>>> Yes.
>>>>
>>>> >
>>>> > I can prepare updated patch, but i am currently travelling, so i would not
>>>> > be disapointed if you beat me ;)
>>>>
>>>> I'm working on it, and I think based on Dehao's needs I am going to
>>>> split up the patch into two phases, the one that does just the part
>>>> you had sent a patch for (making sure 0 count routines with non-zero
>>>> calls are marked guessed and have their node frequency set
>>>> appropriately), and a subsequent one to do the count application when
>>>> we inline a 0-count routine into a non-zero callsite. I'll shoot for
>>>> getting this ready by tomorrow.
>>>>
>>>> BTW, in your original patch you are checking for both e->count or
>>>> cgraph_maybe_hot_edge_p(e), but AFAICT the call to
>>>> cgraph_maybe_hot_edge_p will never return true when e->count is zero.
>>>> When there is a profile it will return false via maybe_hot_count_p
>>>> since e->count == 0. When there is no profile it will return false
>>>> when the callee has NODE_FREQUENCY_UNLIKELY_EXECUTED. So I think just
>>>> checking for e->count >0 is sufficient here.
>>>
>>> I think I was checking caller count here (that is read) and the code
>>> was supposed to make functoin with hot caller to be hot...
>>
>> The code in your patch was just checking the edge count, not the
>> caller count. cgraph_maybe_hot_edge_p also doesn't check the caller
>> count, just the edge count. Do we want to make all functions with hot
>> callers also be hot, even when the edge count is not hot? This may be
>> to aggressive. I was simply going to make the code check e->count.
>
> Here is the patch from Honza, that I revised based on discussions and
> testing. Once this related patch goes in, I can change the check against the
> profile_info->runs that hardcodes the threshold with the new parameter
> proposed for probably_never_executed there:
>   http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00743.html
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?
>
> Thanks, Teresa
>
> 2013-10-16  Jan Hubicka  <jh@suse.cz>
>             Teresa Johnson  <tejohnson@google.com>
>
>         * predict.c (handle_missing_profiles): New function.
>         (counts_to_freqs): Don't overwrite estimated frequencies
>         when function has no profile counts.
>         * predict.h (handle_missing_profiles): Declare.
>         * tree-profile.c (tree_profiling): Invoke handle_missing_profiles.
>
> Index: predict.c
> ===================================================================
> --- predict.c   (revision 203633)
> +++ predict.c   (working copy)
> @@ -2742,6 +2742,39 @@ estimate_loops (void)
>    BITMAP_FREE (tovisit);
>  }
>
> +void
> +handle_missing_profiles (void)
> +{
> +  struct cgraph_node *node;
> +  /* See if 0 count function has non-0 count callers.  In this case we
> +     lost some profile.  Drop its function profile to PROFILE_GUESSED.  */
> +  FOR_EACH_DEFINED_FUNCTION (node)
> +    {
> +      struct cgraph_edge *e;
> +      gcov_type call_count = 0;
> +      struct function *fn = DECL_STRUCT_FUNCTION (node->symbol.decl);
> +      if (node->count)
> +        continue;
> +      for (e = node->callers; e; e = e->next_caller)
> +        call_count += e->count;
> +      if (call_count
> +          && fn && fn->cfg
> +          && (call_count * 4 >= profile_info->runs))
> +        {
> +          bool maybe_hot = maybe_hot_count_p (NULL, call_count);
> +          if (dump_file)
> +            fprintf (dump_file,
> +                     "Dropping 0 profile for %s/%i. %s based on calls.\n",
> +                     cgraph_node_name (node), node->symbol.order,
> +                     maybe_hot ? "Function is hot" : "Function is normal");
> +          profile_status_for_function (fn)
> +              = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT);
> +          node->frequency
> +              = maybe_hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL;
> +        }
> +    }
> +}
> +
>  /* Convert counts measured by profile driven feedback to frequencies.
>     Return nonzero iff there was any nonzero execution count.  */
>
> @@ -2751,6 +2784,9 @@ counts_to_freqs (void)
>    gcov_type count_max, true_count_max = 0;
>    basic_block bb;
>
> +  if (!ENTRY_BLOCK_PTR->count)
> +    return 0;
> +
>    FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
>      true_count_max = MAX (bb->count, true_count_max);
>
> Index: predict.h
> ===================================================================
> --- predict.h   (revision 203633)
> +++ predict.h   (working copy)
> @@ -37,6 +37,7 @@ enum prediction
>
>  extern void predict_insn_def (rtx, enum br_predictor, enum prediction);
>  extern int counts_to_freqs (void);
> +extern void handle_missing_profiles (void);
>  extern void estimate_bb_frequencies (bool);
>  extern const char *predictor_name (enum br_predictor);
>  extern tree build_predict_expr (enum br_predictor, enum prediction);
> Index: tree-profile.c
> ===================================================================
> --- tree-profile.c      (revision 203633)
> +++ tree-profile.c      (working copy)
> @@ -607,6 +607,8 @@ tree_profiling (void)
>        pop_cfun ();
>      }
>
> +  handle_missing_profiles ();
> +
>    del_node_map ();
>    return 0;
>  }
>
>
>>
>> Teresa
>>
>>>
>>> Honza
>>>>
>>>> Thanks,
>>>> Teresa
>>>>
>>>> >
>>>> > Honza
>>>>
>>>>
>>>>
>>>> --
>>>> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413

Comments

Jan Hubicka Oct. 18, 2013, 9:17 p.m. UTC | #1
> Here is the patch updated to use the new parameter from r203830.
> Passed bootstrap and regression tests.
> 
> 2013-10-18  Jan Hubicka  <jh@suse.cz>
>             Teresa Johnson  <tejohnson@google.com>
> 
>         * predict.c (handle_missing_profiles): New function.
>         (counts_to_freqs): Don't overwrite estimated frequencies
>         when function has no profile counts.
>         * predict.h (handle_missing_profiles): Declare.
>         * tree-profile.c (tree_profiling): Invoke handle_missing_profiles.
> 
> Index: predict.c
> ===================================================================
> --- predict.c   (revision 203830)
> +++ predict.c   (working copy)
> @@ -2758,6 +2758,40 @@ estimate_loops (void)
>    BITMAP_FREE (tovisit);
>  }
> 

You need block comment. It should explain the problem of COMDATs and how they are handled.
It is not an obvious thing.

> +void
> +handle_missing_profiles (void)
> +{
> +  struct cgraph_node *node;
> +  int unlikely_count_fraction = PARAM_VALUE (UNLIKELY_BB_COUNT_FRACTION);
extra line
> +  /* See if 0 count function has non-0 count callers.  In this case we
> +     lost some profile.  Drop its function profile to PROFILE_GUESSED.  */
> +  FOR_EACH_DEFINED_FUNCTION (node)
> +    {
> +      struct cgraph_edge *e;
> +      gcov_type call_count = 0;
> +      struct function *fn = DECL_STRUCT_FUNCTION (node->symbol.decl);
Extra line
> +      if (node->count)
> +        continue;
> +      for (e = node->callers; e; e = e->next_caller)
> +        call_count += e->count;
What about checking if the sum is way off even for non-0 counts. 
I.e. for case where function was inlined to some calls but not to others?  In
that case we may want to scale up the counts (with some resonable care for
capping)

Also I think the code really should propagate - i.e. have simple worklist and
look for functions that are COMDAT and are called by other COMDAT functions
where we decided to drop the profile.  Having non-trivial chains of comdats is
common thing.

What about outputting user visible warning/error on the incosnsistency when
no COMDAT function is involved same way as we do for BB profile?

Honza
diff mbox

Patch

Index: predict.c
===================================================================
--- predict.c   (revision 203830)
+++ predict.c   (working copy)
@@ -2758,6 +2758,40 @@  estimate_loops (void)
   BITMAP_FREE (tovisit);
 }

+void
+handle_missing_profiles (void)
+{
+  struct cgraph_node *node;
+  int unlikely_count_fraction = PARAM_VALUE (UNLIKELY_BB_COUNT_FRACTION);
+  /* See if 0 count function has non-0 count callers.  In this case we
+     lost some profile.  Drop its function profile to PROFILE_GUESSED.  */
+  FOR_EACH_DEFINED_FUNCTION (node)
+    {
+      struct cgraph_edge *e;
+      gcov_type call_count = 0;
+      struct function *fn = DECL_STRUCT_FUNCTION (node->symbol.decl);
+      if (node->count)
+        continue;
+      for (e = node->callers; e; e = e->next_caller)
+        call_count += e->count;
+      if (call_count
+          && fn && fn->cfg
+          && (call_count * unlikely_count_fraction >= profile_info->runs))
+        {
+          bool maybe_hot = maybe_hot_count_p (NULL, call_count);
+          if (dump_file)
+            fprintf (dump_file,
+                     "Dropping 0 profile for %s/%i. %s based on calls.\n",
+                     cgraph_node_name (node), node->symbol.order,
+                     maybe_hot ? "Function is hot" : "Function is normal");
+          profile_status_for_function (fn)
+              = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT);
+          node->frequency
+              = maybe_hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL;
+        }
+    }
+}
+
 /* Convert counts measured by profile driven feedback to frequencies.
    Return nonzero iff there was any nonzero execution count.  */

@@ -2767,6 +2801,9 @@  counts_to_freqs (void)
   gcov_type count_max, true_count_max = 0;
   basic_block bb;

+  if (!ENTRY_BLOCK_PTR->count)
+    return 0;
+
   FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
     true_count_max = MAX (bb->count, true_count_max);

Index: predict.h
===================================================================
--- predict.h   (revision 203830)
+++ predict.h   (working copy)
@@ -37,6 +37,7 @@  enum prediction

 extern void predict_insn_def (rtx, enum br_predictor, enum prediction);
 extern int counts_to_freqs (void);
+extern void handle_missing_profiles (void);
 extern void estimate_bb_frequencies (bool);
 extern const char *predictor_name (enum br_predictor);
 extern tree build_predict_expr (enum br_predictor, enum prediction);
Index: tree-profile.c
===================================================================
--- tree-profile.c      (revision 203830)
+++ tree-profile.c      (working copy)
@@ -607,6 +607,8 @@  tree_profiling (void)
       pop_cfun ();
     }

+  handle_missing_profiles ();
+
   del_node_map ();
   return 0;
 }