Message ID | 5C3DFC6A.1040301@foss.arm.com |
---|---|
State | New |
Series | [AArch64] Initial -mcpu=ares tuning |
Apologies for the duplicate. I've had issues with my mail client. They are both identical submissions.

Kyrill

On 15/01/19 15:29, Kyrill Tkachov wrote:
> Hi all,
>
> This patch adds a tuning struct for the Arm Ares CPU and uses it for -m{cpu,tune}=ares.
> The tunings are an initial attempt and may be improved upon in the future, but they serve
> as a decent starting point for GCC 9.
>
> With this I see a 1.3% improvement on SPEC2006 int and 0.3% on SPEC2006 fp with -mcpu=ares.
> On SPEC2017 I see a 0.6% improvement in intrate and changes in the noise for fprate.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Ok for trunk?
> Thanks,
> Kyrill
>
> 2019-01-15 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>
> * config/aarch64/aarch64.c (ares_tunings): Define.
> * config/aarch64/aarch64-cores.def (ares): Use the above.

On Tue, Jan 15, 2019 at 09:29:46AM -0600, Kyrill Tkachov wrote:
> Hi all,
>
> This patch adds a tuning struct for the Arm Ares CPU and uses it for -m{cpu,tune}=ares.
> The tunings are an initial attempt and may be improved upon in the future, but they serve
> as a decent starting point for GCC 9.
>
> With this I see a 1.3% improvement on SPEC2006 int and 0.3% on SPEC2006 fp with -mcpu=ares.
> On SPEC2017 I see a 0.6% improvement in intrate and changes in the noise for fprate.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Ok for trunk?

This only changes non-default tuning.

OK.

Are we nearly done with these types of changes in AArch64 for GCC 9? I'd
like to see us start acting like it is stage 4 soon!

James

> 2019-01-15 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>
> * config/aarch64/aarch64.c (ares_tunings): Define.
> * config/aarch64/aarch64-cores.def (ares): Use the above.

On Wed, Jan 16, 2019 at 10:28 AM James Greenhalgh <james.greenhalgh@arm.com> wrote:
>
> On Tue, Jan 15, 2019 at 09:29:46AM -0600, Kyrill Tkachov wrote:
> > Hi all,
> >
> > This patch adds a tuning struct for the Arm Ares CPU and uses it for -m{cpu,tune}=ares.
> > The tunings are an initial attempt and may be improved upon in the future, but they serve
> > as a decent starting point for GCC 9.
> >
> > With this I see a 1.3% improvement on SPEC2006 int and 0.3% on SPEC2006 fp with -mcpu=ares.
> > On SPEC2017 I see a 0.6% improvement in intrate and changes in the noise for fprate.
> >
> > Bootstrapped and tested on aarch64-none-linux-gnu.
> >
> > Ok for trunk?
>
> This only changes non-default tuning.
>
> OK.
>
> Are we nearly done with these types of changes in AArch64 for GCC 9? I'd
> like to see us start acting like it is stage 4 soon!

I am in the process of getting OcteonTX2 patches in shape to submit.
I hope to have them in a good shape by end of next week.

Thanks,
Andrew Pinski

>
> James
>
> > 2019-01-15 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
> >
> > * config/aarch64/aarch64.c (ares_tunings): Define.
> > * config/aarch64/aarch64-cores.def (ares): Use the above.

Hi James,

On 16/01/19 18:27, James Greenhalgh wrote:
> On Tue, Jan 15, 2019 at 09:29:46AM -0600, Kyrill Tkachov wrote:
>> Hi all,
>>
>> This patch adds a tuning struct for the Arm Ares CPU and uses it for -m{cpu,tune}=ares.
>> The tunings are an initial attempt and may be improved upon in the future, but they serve
>> as a decent starting point for GCC 9.
>>
>> With this I see a 1.3% improvement on SPEC2006 int and 0.3% on SPEC2006 fp with -mcpu=ares.
>> On SPEC2017 I see a 0.6% improvement in intrate and changes in the noise for fprate.
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>
>> Ok for trunk?
> This only changes non-default tuning.
>
> OK.

Thanks.

>
> Are we nearly done with these types of changes in AArch64 for GCC 9? I'd
> like to see us start acting like it is stage 4 soon!

I believe there's a CPU tuning patch from Wuyuan waiting for review at:
https://gcc.gnu.org/ml/gcc-patches/2019-01/msg00777.html

Kyrill

> James
>
>> 2019-01-15 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>>
>> * config/aarch64/aarch64.c (ares_tunings): Define.
>> * config/aarch64/aarch64-cores.def (ares): Use the above.

diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 70b076694d9901ccf15bfd26c950b6466d3d1cc2..7c4bd52049e5ae33241acce37414da91abaa989c 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -100,7 +100,7 @@ AARCH64_CORE("thunderx2t99", thunderx2t99, thunderx2t99, 8_1A, AARCH64_FL_FOR
 AARCH64_CORE("cortex-a55", cortexa55, cortexa53, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa53, 0x41, 0xd05, -1)
 AARCH64_CORE("cortex-a75", cortexa75, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa73, 0x41, 0xd0a, -1)
 AARCH64_CORE("cortex-a76", cortexa76, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa72, 0x41, 0xd0b, -1)
-AARCH64_CORE("ares", ares, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD | AARCH64_FL_PROFILE, cortexa72, 0x41, 0xd0c, -1)
+AARCH64_CORE("ares", ares, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD | AARCH64_FL_PROFILE, ares, 0x41, 0xd0c, -1)
 
 /* HiSilicon ('H') cores.  */
 AARCH64_CORE("tsv110", tsv110, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2, tsv110, 0x48, 0xd01, -1)
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 2fd6bb9821a256eaa2acaee305926b4efebf9c8c..b1e5eacb69728741517e313b110ebbc203d415a4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1083,6 +1083,32 @@ static const struct tune_params thunderx2t99_tunings =
   &thunderx2t99_prefetch_tune
 };
 
+static const struct tune_params ares_tunings =
+{
+  &cortexa57_extra_costs,
+  &generic_addrcost_table,
+  &generic_regmove_cost,
+  &cortexa57_vector_cost,
+  &generic_branch_cost,
+  &generic_approx_modes,
+  SVE_NOT_IMPLEMENTED, /* sve_width */
+  4, /* memmov_cost */
+  3, /* issue_rate */
+  AARCH64_FUSE_AES_AESMC, /* fusible_ops */
+  "32:16", /* function_align. */
+  "32:16", /* jump_align. */
+  "32:16", /* loop_align. */
+  2, /* int_reassoc_width. */
+  4, /* fp_reassoc_width. */
+  2, /* vec_reassoc_width. */
+  2, /* min_div_recip_mul_sf. */
+  2, /* min_div_recip_mul_df. */
+  0, /* max_case_values. */
+  tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */
+  (AARCH64_EXTRA_TUNE_NONE), /* tune_flags. */
+  &generic_prefetch_tune
+};
+
 /* Support for fine-grained override of the tuning structures.  */
 struct aarch64_tuning_override_function
 {
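
For readers unfamiliar with how the sixth AARCH64_CORE argument selects a tuning struct, the following is a minimal, self-contained sketch of the idea and not GCC's actual code. It assumes the argument is token-pasted onto a `_tunings` suffix, which is what the change from `cortexa72` to `ares` in the diff suggests. Every name ending in `_sketch` is hypothetical, the struct is reduced to two fields, and the cortexa72 values are placeholders rather than the real tunings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for GCC's struct tune_params.  */
struct tune_params_sketch
{
  int issue_rate;
  const char *function_align;
};

/* Placeholder values only: NOT the real cortexa72 tunings.  */
static const struct tune_params_sketch cortexa72_tunings_sketch
  = { 0, "unspecified" };

/* issue_rate and function_align as given in the patch's ares_tunings.  */
static const struct tune_params_sketch ares_tunings_sketch
  = { 3, "32:16" };

/* Reduced core list: (core name, tuning identifier).  After the patch,
   "ares" names its own tuning identifier instead of cortexa72.  */
#define CORE_LIST_SKETCH \
  CORE_SKETCH ("cortex-a76", cortexa72) \
  CORE_SKETCH ("ares", ares)

struct core_entry_sketch
{
  const char *name;
  const struct tune_params_sketch *tune;
};

/* Expand each list entry into a table row; the tuning identifier is
   pasted onto the _tunings_sketch suffix to name the struct.  */
static const struct core_entry_sketch all_cores_sketch[] =
{
#define CORE_SKETCH(NAME, TUNE) { NAME, &TUNE##_tunings_sketch },
  CORE_LIST_SKETCH
#undef CORE_SKETCH
};

/* Return the tuning struct for CPU, or NULL if the name is unknown.  */
static const struct tune_params_sketch *
lookup_tune_sketch (const char *cpu)
{
  assert (cpu != NULL);
  for (size_t i = 0;
       i < sizeof all_cores_sketch / sizeof all_cores_sketch[0]; i++)
    if (strcmp (all_cores_sketch[i].name, cpu) == 0)
      return all_cores_sketch[i].tune;
  return NULL;
}
```

In this sketch, `lookup_tune_sketch ("ares")` returns a pointer to `ares_tunings_sketch`, while `"cortex-a76"` still resolves to the cortexa72 entry, mirroring the one-identifier change in aarch64-cores.def.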