[i386] : Fix PR 91654: Runtime SPEC regression on Haswell

Message ID CAFULd4bb6LM8xDg3HgKqcA3DK7_67_yhvg+DOqvAjkL54Aim8A@mail.gmail.com
State New

Commit Message

Uros Bizjak Sept. 6, 2019, 7:31 p.m. UTC
On Thu, Sep 5, 2019 at 10:53 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Thu, Sep 5, 2019 at 7:47 AM Hongtao Liu <crazylht@gmail.com> wrote:
> >
> > Changing the cost from 2 to 6 gives:
> > -------------
> > 531.deepsjeng_r   9.64%
> > 548.exchange2_r  10.24%
> > 557.xz_r          7.99%
> > 508.namd_r        1.08%
> > 527.cam4_r        6.91%
> > 544.nab_r         3.06%
> > ------------
> >
> > For 531, 548, 557 and 527, the result is even better than the version
> > before the regression.
> > For 508 and 544, there is still a small regression compared to the
> > version before the regression.
>
> Good, that brings us into the "noise" region.
>
> Based on these results and other findings, I propose the following solution:
>
> - The inter-regset move costs of architectures that were defined
> before r125951 remain the same. These are: size, i386, i486, pentium,
> pentiumpro, geode, k6, athlon, k8, amdfam10, pentium4 and nocona.
> - bdver, btver1 and btver2 have costs higher than 8, so they are not affected.
> - lakemont, znver1, znver2, atom, slm, intel and generic have
> inter-regset costs above the intra-regset cost and below or equal to the
> memory load/store cost, so they should remain as they are. Additionally,
> intel and generic costs are regularly re-tuned.
> - Only skylake and core costs remain problematic.
>
> So, I propose to raise the XMM<->intreg costs of the skylake and core
> architectures to 6 to solve the regression. These can be fine-tuned
> later, since we are now able to change the cost for RA independently of
> RTX costs. Also, the RA cost can be asymmetrical.
>
> Attached patch implements the proposal. If there are no other
> proposals or discussions, I plan to commit it on Friday.
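
For illustration, the criterion from the third bullet above (inter-regset
cost above the intra-regset cost and below or equal to the memory
load/store cost) can be sketched in C. This is only a sketch under stated
assumptions: struct regset_costs and both functions are hypothetical and
are not GCC's struct processor_costs or any GCC API. In the usage note
below, the memory cost of 8 is taken from the 32-bit SSE store cost in the
skylake_cost hunk of this patch, while the intra-regset cost of 2 is an
assumed value.

```c
#include <stdbool.h>

/* Hypothetical, simplified cost record for illustration only.
   This is NOT GCC's struct processor_costs.  */
struct regset_costs
{
  int intra_move;      /* Move within one register set (assumed value).  */
  int sse_to_integer;  /* Inter-regset move, SSE -> integer.  */
  int integer_to_sse;  /* Inter-regset move, integer -> SSE.  */
  int mem_move;        /* Memory load/store cost.  */
};

/* Return true if COST is above the intra-regset cost INTRA and
   below or equal to the memory cost MEM.  */
bool
cost_in_window (int cost, int intra, int mem)
{
  return cost > intra && cost <= mem;
}

/* Check both directions separately, so the two inter-regset costs
   are allowed to differ (asymmetrical costs).  */
bool
inter_regset_costs_ok (const struct regset_costs *c)
{
  return (cost_in_window (c->sse_to_integer, c->intra_move, c->mem_move)
          && cost_in_window (c->integer_to_sse, c->intra_move, c->mem_move));
}
```

With an assumed intra-regset cost of 2 and a memory cost of 8, the old
inter-regset costs { 2, 2 } fall outside the window because they are not
above the intra-regset cost, while the proposed { 6, 6 } fall inside it.
An asymmetrical pair such as { 6, 4 } would pass as well.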

2019-09-06  Uroš Bizjak  <ubizjak@gmail.com>

    PR target/91654
    * config/i386/x86-tune-costs.h (skylake_cost): Raise the
    cost of SSE->integer and integer->SSE moves from 2 to 6.
    (core_cost): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.

Patch

diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
index 3381b8bf143c..00edece3eb68 100644
--- a/gcc/config/i386/x86-tune-costs.h
+++ b/gcc/config/i386/x86-tune-costs.h
@@ -1610,7 +1610,7 @@  struct processor_costs skylake_cost = {
 					   in 32,64,128,256 and 512-bit */
   {8, 8, 8, 12, 24},			/* cost of storing SSE registers
 					   in 32,64,128,256 and 512-bit */
-  2, 2,					/* SSE->integer and integer->SSE moves */
+  6, 6,					/* SSE->integer and integer->SSE moves */
   /* End of register allocator costs.  */
   },
 
@@ -2555,7 +2555,7 @@  struct processor_costs core_cost = {
 					   in 32,64,128,256 and 512-bit */
   {6, 6, 6, 6, 12},			/* cost of storing SSE registers
 					   in 32,64,128,256 and 512-bit */
-  2, 2,					/* SSE->integer and integer->SSE moves */
+  6, 6,					/* SSE->integer and integer->SSE moves */
   /* End of register allocator costs.  */
   },