diff mbox

Handle profile counts truncated to 0 in coldness test

Message ID CAAe5K+WwBXt3uEPZ+mkVyhkMurNZ_dwOKfacpUYZN3Mvv7TcHA@mail.gmail.com
State New
Headers show

Commit Message

Teresa Johnson Oct. 10, 2013, 8:38 p.m. UTC
On Mon, Oct 7, 2013 at 10:49 PM, Teresa Johnson <tejohnson@google.com> wrote:
> On Sun, Oct 6, 2013 at 6:51 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>> The second part ensures that frequencies are correctly updated after
>>> inlining. The problem is that after inlining the frequencies were
>>> being recomputed directly from the corresponding bb counts in
>>> rebuild_frequencies. If the counts had been truncated to 0, then the
>>> recomputed frequencies would become 0 as well (often leading to
>>> inconsistencies in the frequencies between blocks). Then the above
>>> change to probably_never_executed would not help identify these blocks
>>> as non-cold from the frequencies. The fix was to use the existing
>>> logic used during static profile rebuilding to also
>>> recompute/propagate the frequencies from the branch probabilities in
>>> the profile feedback case. I also renamed a number of estimate_*
>>> routines to compute_* and updated the comments to reflect the fact
>>> that these routines are not doing estimation (from a static profile),
>>> but in fact recomputing/propagating frequencies from the existing
>>> (either guessed or profile-feedback-based) probabilities.
>>
>> I see where your plan - you want frequencies to be computed purely on
>> probabilities that are there.  This however will lead to unnecesary loss of
>> precisions for functions with large counts: probabilities are truncated in
>> precision and the propagation won't give correct answers and has serious
>> truncating issues.
>>
>> So perhaps we want to execute this only if maximal count in function is low?
>> (i.e. less than REG_BR_PROB_BASE or even REG_BR_PROB_BASE/10 - if counts
>> are close enough the roundoff errors of count updating should be less terrible
>> than errors introduced by the propagation).
>
> Yes, I think that is a better approach.
>
>>
>>> -/* Estimate probabilities of loopback edges in loops at same nest level.  */
>>> +/* Compute frequencies in loops at same nest level.  */
>>>
>>>  static void
>>> -estimate_loops_at_level (struct loop *first_loop)
>>> +compute_loops_at_level (struct loop *first_loop)
>>
>> I would keep "estimate".  The whole logic is not safe and it will lead to
>> mistakes for irreducible loops and many other cases...
>
> Ok
>
>>
>>> @@ -3027,18 +3042,16 @@ void
>>>  rebuild_frequencies (void)
>>>  {
>>>    timevar_push (TV_REBUILD_FREQUENCIES);
>>> -  if (profile_status == PROFILE_GUESSED)
>>> +  if (profile_status == PROFILE_GUESSED || profile_status == PROFILE_READ)
>>>      {
>>>        loop_optimizer_init (0);
>>>        add_noreturn_fake_exit_edges ();
>>>        mark_irreducible_loops ();
>>>        connect_infinite_loops_to_exit ();
>>> -      estimate_bb_frequencies ();
>>> +      compute_bb_frequencies (true);
>>
>> Here please add the check about maximal count in functiona and commnet
>> (especially for case we will remember to remove this hack if we switch to sreal
>> based counts/frequencies/probabilities)
>>
>> OK, with those changes (if it fixes the problem)
>
> Ok thanks.
> Teresa

This has been committed as r203395:

2013-10-10  Teresa Johnson  <tejohnson@google.com>

        * predict.c (tree_estimate_probability): Add new parameter
        for estimate_bb_frequencies.
        (estimate_bb_frequencies): Add new parameter to force estimation.
        (rebuild_frequencies): When max frequency in function is small,
        recompute counts from frequencies.
        * predict.h (estimate_bb_frequencies): New parameter.



>
>>
>> Thanks,
>> Honza
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
diff mbox

Patch

Index: predict.c
===================================================================
--- predict.c   (revision 203389)
+++ predict.c   (working copy)
@@ -2389,7 +2389,7 @@  tree_estimate_probability (void)
   pointer_map_destroy (bb_predictions);
   bb_predictions = NULL;

-  estimate_bb_frequencies ();
+  estimate_bb_frequencies (false);
   free_dominance_info (CDI_POST_DOMINATORS);
   remove_fake_exit_edges ();
 }
@@ -2692,7 +2692,7 @@  propagate_freq (basic_block head, bitmap tovisit)
     }
 }

-/* Estimate probabilities of loopback edges in loops at same nest level.  */
+/* Estimate frequencies in loops at same nest level.  */

 static void
 estimate_loops_at_level (struct loop *first_loop)
@@ -2801,15 +2801,17 @@  expensive_function_p (int threshold)
   return false;
 }

-/* Estimate basic blocks frequency by given branch probabilities.  */
+/* Estimate and propagate basic block frequencies using the given branch
+   probabilities.  If FORCE is true, the frequencies are used to estimate
+   the counts even when there are already non-zero profile counts.  */

 void
-estimate_bb_frequencies (void)
+estimate_bb_frequencies (bool force)
 {
   basic_block bb;
   sreal freq_max;

-  if (profile_status != PROFILE_READ || !counts_to_freqs ())
+  if (force || profile_status != PROFILE_READ || !counts_to_freqs ())
     {
       static int real_values_initialized = 0;

@@ -2846,8 +2848,8 @@  void
            }
        }

-      /* First compute probabilities locally for each loop from innermost
-         to outermost to examine probabilities for back edges.  */
+      /* First compute frequencies locally for each loop from innermost
+         to outermost to examine frequencies for back edges.  */
       estimate_loops ();

       memcpy (&freq_max, &real_zero, sizeof (real_zero));
@@ -3028,13 +3030,29 @@  void
 rebuild_frequencies (void)
 {
   timevar_push (TV_REBUILD_FREQUENCIES);
-  if (profile_status == PROFILE_GUESSED)
+
+  /* When the max bb count in the function is small, there is a higher
+     chance that there were truncation errors in the integer scaling
+     of counts by inlining and other optimizations. This could lead
+     to incorrect classification of code as being cold when it isn't.
+     In that case, force the estimation of bb counts/frequencies from the
+     branch probabilities, rather than computing frequencies from counts,
+     which may also lead to frequencies incorrectly reduced to 0. There
+     is less precision in the probabilities, so we only do this for small
+     max counts.  */
+  gcov_type count_max = 0;
+  basic_block bb;
+  FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
+    count_max = MAX (bb->count, count_max);
+
+  if (profile_status == PROFILE_GUESSED
+      || (profile_status == PROFILE_READ && count_max < REG_BR_PROB_BASE/10))
     {
       loop_optimizer_init (0);
       add_noreturn_fake_exit_edges ();
       mark_irreducible_loops ();
       connect_infinite_loops_to_exit ();
-      estimate_bb_frequencies ();
+      estimate_bb_frequencies (true);
       remove_fake_exit_edges ();
       loop_optimizer_finalize ();
     }
Index: predict.h
===================================================================
--- predict.h   (revision 203389)
+++ predict.h   (working copy)
@@ -37,7 +37,7 @@  enum prediction

 extern void predict_insn_def (rtx, enum br_predictor, enum prediction);
 extern int counts_to_freqs (void);
-extern void estimate_bb_frequencies (void);
+extern void estimate_bb_frequencies (bool);
 extern const char *predictor_name (enum br_predictor);
 extern tree build_predict_expr (enum br_predictor, enum prediction);
 extern void tree_estimate_probability (void);