diff mbox series

x86: Update branch hint for Redwood Cove.

Message ID 20240702025133.2737549-1-hongtao.liu@intel.com
State New
Headers show
Series x86: Update branch hint for Redwood Cove. | expand

Commit Message

liuhongt July 2, 2024, 2:51 a.m. UTC
From: "H.J. Lu" <hjl.tools@gmail.com>

According to Intel® 64 and IA-32 Architectures Optimization Reference
Manual[1], Branch Hint is updated for Redwood Cove.

--------cut from [1]-------------------------
Starting with the Redwood Cove microarchitecture, if the predictor has
no stored information about a branch, the branch has the Intel® SSE2
branch taken hint (i.e., instruction prefix 3EH), When the codec
decodes the branch, it flips the branch’s prediction from not-taken to
taken. It then flushes the pipeline in front of it and steers this
pipeline to fetch the taken path of the branch.
--------cut end -----------------------------

For -mtune-ctrl=branch_prediction_hints, always generate branch hint for
conditional branches, this tune is disabled by default.

[1] https://www.intel.com/content/www/us/en/content-details/821612/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready push to trunk.

gcc/

	* config/i386/i386.cc (ix86_print_operand): Always generate
	branch hint for conditional branches.
---
 gcc/config/i386/i386.cc | 24 +++++-------------------
 1 file changed, 5 insertions(+), 19 deletions(-)

Comments

Richard Biener July 2, 2024, 8:15 a.m. UTC | #1
On Tue, Jul 2, 2024 at 4:54 AM liuhongt <hongtao.liu@intel.com> wrote:
>
> From: "H.J. Lu" <hjl.tools@gmail.com>
>
> According to Intel® 64 and IA-32 Architectures Optimization Reference
> Manual[1], Branch Hint is updated for Redwood Cove.
>
> --------cut from [1]-------------------------
> Starting with the Redwood Cove microarchitecture, if the predictor has
> no stored information about a branch, the branch has the Intel® SSE2
> branch taken hint (i.e., instruction prefix 3EH), When the codec
> decodes the branch, it flips the branch’s prediction from not-taken to
> taken. It then flushes the pipeline in front of it and steers this
> pipeline to fetch the taken path of the branch.
> --------cut end -----------------------------

The above reads like it would be worth splitting branc_prediction_hits
into branch_prediction_hints_taken and branch_prediction_hints_not_taken
given not-taken is the default and thus will just increase code size?

> For -mtune-ctrl=branch_prediction_hints, always generate branch hint for
> conditional branches, this tune is disabled by default.
>
> [1] https://www.intel.com/content/www/us/en/content-details/821612/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ready push to trunk.
>
> gcc/
>
>         * config/i386/i386.cc (ix86_print_operand): Always generate
>         branch hint for conditional branches.
> ---
>  gcc/config/i386/i386.cc | 24 +++++-------------------
>  1 file changed, 5 insertions(+), 19 deletions(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 1f71ed04be6..9992b9d6186 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -14050,25 +14050,11 @@ ix86_print_operand (FILE *file, rtx x, int code)
>                 int pred_val = profile_probability::from_reg_br_prob_note
>                                  (XINT (x, 0)).to_reg_br_prob_base ();
>
> -               if (pred_val < REG_BR_PROB_BASE * 45 / 100
> -                   || pred_val > REG_BR_PROB_BASE * 55 / 100)
> -                 {
> -                   bool taken = pred_val > REG_BR_PROB_BASE / 2;
> -                   bool cputaken
> -                     = final_forward_branch_p (current_output_insn) == 0;
> -
> -                   /* Emit hints only in the case default branch prediction
> -                      heuristics would fail.  */
> -                   if (taken != cputaken)
> -                     {
> -                       /* We use 3e (DS) prefix for taken branches and
> -                          2e (CS) prefix for not taken branches.  */
> -                       if (taken)
> -                         fputs ("ds ; ", file);
> -                       else
> -                         fputs ("cs ; ", file);
> -                     }
> -                 }
> +               bool taken = pred_val > REG_BR_PROB_BASE / 2;
> +               /* We use 3e (DS) prefix for taken branches and
> +                  2e (CS) prefix for not taken branches.  */
> +               if (taken)
> +                 fputs ("ds ; ", file);
>               }
>             return;
>           }
> --
> 2.31.1
>
Andi Kleen July 2, 2024, 6:10 p.m. UTC | #2
liuhongt <hongtao.liu@intel.com> writes:

> From: "H.J. Lu" <hjl.tools@gmail.com>
>
> According to Intel® 64 and IA-32 Architectures Optimization Reference
> Manual[1], Branch Hint is updated for Redwood Cove.
>
> --------cut from [1]-------------------------
> Starting with the Redwood Cove microarchitecture, if the predictor has
> no stored information about a branch, the branch has the Intel® SSE2
> branch taken hint (i.e., instruction prefix 3EH), When the codec
> decodes the branch, it flips the branch’s prediction from not-taken to
> taken. It then flushes the pipeline in front of it and steers this
> pipeline to fetch the taken path of the branch.
> --------cut end -----------------------------
>
> For -mtune-ctrl=branch_prediction_hints, always generate branch hint for
> conditional branches, this tune is disabled by default.
>
> [1] https://www.intel.com/content/www/us/en/content-details/821612/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ready push to trunk.

So what does it do to code size?
You may not want to do it with -Os.

Maybe it should be only done with actual profile feedback data
available, i'm not sure if the builtin heuristics are good enough to
justify it and there is a risk that it is very wrong.  

Yes as long as it's disabled by default that's all not a problem, but it
would need to be solved to enable it.

-Andi
Hongtao Liu July 3, 2024, 1:43 a.m. UTC | #3
On Wed, Jul 3, 2024 at 2:10 AM Andi Kleen <ak@linux.intel.com> wrote:
>
> liuhongt <hongtao.liu@intel.com> writes:
>
> > From: "H.J. Lu" <hjl.tools@gmail.com>
> >
> > According to Intel® 64 and IA-32 Architectures Optimization Reference
> > Manual[1], Branch Hint is updated for Redwood Cove.
> >
> > --------cut from [1]-------------------------
> > Starting with the Redwood Cove microarchitecture, if the predictor has
> > no stored information about a branch, the branch has the Intel® SSE2
> > branch taken hint (i.e., instruction prefix 3EH), When the codec
> > decodes the branch, it flips the branch’s prediction from not-taken to
> > taken. It then flushes the pipeline in front of it and steers this
> > pipeline to fetch the taken path of the branch.
> > --------cut end -----------------------------
> >
> > For -mtune-ctrl=branch_prediction_hints, always generate branch hint for
> > conditional branches, this tune is disabled by default.
> >
> > [1] https://www.intel.com/content/www/us/en/content-details/821612/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ready push to trunk.
>
> So what does it do to code size?
> You may not want to do it with -Os.
It won't be enabled by Os, because it's guarded by

    if (!optimize
        || optimize_function_for_size_p (cfun)
|| !TARGET_BRANCH_PREDICTION_HINTS)
      return;
>
> Maybe it should be only done with actual profile feedback data
> available, i'm not sure if the builtin heuristics are good enough to
> justify it and there is a risk that it is very wrong.
Yes, best used with a PGO.
>
> Yes as long as it's disabled by default that's all not a problem, but it
> would need to be solved to enable it.
>
> -Andi
diff mbox series

Patch

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 1f71ed04be6..9992b9d6186 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -14050,25 +14050,11 @@  ix86_print_operand (FILE *file, rtx x, int code)
 		int pred_val = profile_probability::from_reg_br_prob_note
 				 (XINT (x, 0)).to_reg_br_prob_base ();
 
-		if (pred_val < REG_BR_PROB_BASE * 45 / 100
-		    || pred_val > REG_BR_PROB_BASE * 55 / 100)
-		  {
-		    bool taken = pred_val > REG_BR_PROB_BASE / 2;
-		    bool cputaken
-		      = final_forward_branch_p (current_output_insn) == 0;
-
-		    /* Emit hints only in the case default branch prediction
-		       heuristics would fail.  */
-		    if (taken != cputaken)
-		      {
-			/* We use 3e (DS) prefix for taken branches and
-			   2e (CS) prefix for not taken branches.  */
-			if (taken)
-			  fputs ("ds ; ", file);
-			else
-			  fputs ("cs ; ", file);
-		      }
-		  }
+		bool taken = pred_val > REG_BR_PROB_BASE / 2;
+		/* We use 3e (DS) prefix for taken branches and
+		   2e (CS) prefix for not taken branches.  */
+		if (taken)
+		  fputs ("ds ; ", file);
 	      }
 	    return;
 	  }