Message ID | 181ffc5f-94d6-85dd-dd60-825e618cbd99@arm.com |
---|---|
State | New |
Headers | show |
Series | [AArch64] PR target/105157 Increase number of cores TARGET_CPU_DEFAULT can encode | expand |
"Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com> writes: > Hi, > > This addresses the compile-time increase seen in the PR target/105157. > This was being caused by selecting the wrong core tuning, as when we > added the latest AArch64 the TARGET_CPU_generic tuning was pushed beyond > the 0x3f mask we used to encode both target cpu and attributes into > TARGET_CPU_DEFAULT. > > This is a quick fix to increase the number of allowed cores for now late > in stage 4, but for next GCC we plan to fix this in a more suitable and > maintainable manner. > > gcc/ChangeLog: > > PR target/105157 > * config.gcc: Shift ext_mask by TARGET_CPU_NBITS. > * config/aarch64/aarch64.h (TARGET_CPU_NBITS): New macro. > (TARGET_CPU_MASK): Likewise. > (TARGET_CPU_DEFAULT): Use TARGET_CPU_NBITS. > * config/aarch64/aarch64.cc (aarch64_get_tune_cpu): Use > TARGET_CPU_MASK. > (aarch64_get_arch): Likewise. > (aarch64_override_options): Use TARGET_CPU_NBITS. > (aarch64_run_selftests): Add test to guarantee TARGET_CPU_NBITS > is big enough. Looks good, just a minor nit: > > diff --git a/gcc/config.gcc b/gcc/config.gcc > index 2cc5aeec9e46aacc46e4e509d01f57f9f6585169..8946984f5e93cd6fe5364b44e74c64e714ba1d6a 100644 > --- a/gcc/config.gcc > +++ b/gcc/config.gcc > @@ -4239,7 +4239,7 @@ case "${target}" in > ext_val=`echo $ext_val | sed -e 's/[a-z0-9]\+//'` > done > > - ext_mask="(("$ext_mask") << 6)" > + ext_mask="(("$ext_mask") << TARGET_CPU_NBITS)" > if [ x"$base_id" != x ]; then > target_cpu_cname="TARGET_CPU_$base_id | $ext_mask" > fi > diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h > index efa46ac0b8799b5849b609d591186e26e5cb37ff..97b2d38dac9f13717e90f4f6996ab94814922950 100644 > --- a/gcc/config/aarch64/aarch64.h > +++ b/gcc/config/aarch64/aarch64.h > @@ -813,10 +813,17 @@ enum target_cpus > TARGET_CPU_generic > }; > > + Nit: excess extra line. > +/* Define how many bits are used to represent the CPU in TARGET_CPU_DEFAULT. > + This needs to be big enough to fit the value of TARGET_CPU_generic. > + All bits after this are used to represent the AARCH64_CPU_DEFAULT_FLAGS. */ > +#define TARGET_CPU_NBITS 8 > +#define TARGET_CPU_MASK ((1 << TARGET_CPU_NBITS) - 1) > + > /* If there is no CPU defined at configure, use generic as default. */ > #ifndef TARGET_CPU_DEFAULT > #define TARGET_CPU_DEFAULT \ > - (TARGET_CPU_generic | (AARCH64_CPU_DEFAULT_FLAGS << 6)) > + (TARGET_CPU_generic | (AARCH64_CPU_DEFAULT_FLAGS << TARGET_CPU_NBITS)) > #endif > > /* If inserting NOP before a mult-accumulate insn remember to adjust the > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc > index d66626748e5edc46e46edf10e243114a9f74be97..9d83e007d7a31455c167bf21124cf9fbc0d31788 100644 > --- a/gcc/config/aarch64/aarch64.cc > +++ b/gcc/config/aarch64/aarch64.cc > @@ -18062,12 +18062,12 @@ aarch64_get_tune_cpu (enum aarch64_processor cpu) > if (cpu != aarch64_none) > return &all_cores[cpu]; > > - /* The & 0x3f is to extract the bottom 6 bits that encode the > - default cpu as selected by the --with-cpu GCC configure option > + /* The & TARGET_CPU_MASK is to extract the bottom TARGET_CPU_NBITS bits that > + encode the default cpu as selected by the --with-cpu GCC configure option > in config.gcc. > ???: The whole TARGET_CPU_DEFAULT and AARCH64_CPU_DEFAULT_FLAGS > flags mechanism should be reworked to make it more sane. */ > - return &all_cores[TARGET_CPU_DEFAULT & 0x3f]; > + return &all_cores[TARGET_CPU_DEFAULT & TARGET_CPU_MASK]; > } > > /* Return the architecture corresponding to the enum ARCH. > @@ -18079,7 +18079,8 @@ aarch64_get_arch (enum aarch64_arch arch) > if (arch != aarch64_no_arch) > return &all_architectures[arch]; > > - const struct processor *cpu = &all_cores[TARGET_CPU_DEFAULT & 0x3f]; > + const struct processor *cpu > + = &all_cores[TARGET_CPU_DEFAULT & TARGET_CPU_MASK]; > > return &all_architectures[cpu->arch]; > } > @@ -18166,7 +18167,7 @@ aarch64_override_options (void) > { > /* Get default configure-time CPU. */ > selected_cpu = aarch64_get_tune_cpu (aarch64_none); > - aarch64_isa_flags = TARGET_CPU_DEFAULT >> 6; > + aarch64_isa_flags = TARGET_CPU_DEFAULT >> TARGET_CPU_NBITS; > } > > if (selected_tune) > @@ -27260,6 +27261,9 @@ aarch64_run_selftests (void) > { > aarch64_test_loading_full_dump (); > aarch64_test_fractional_cost (); > + /* Check whether the TARGET_CPU_MASK is big enough to fit all of the definded > + CPUs. */ > + ASSERT_TRUE (TARGET_CPU_generic < TARGET_CPU_MASK); I think this would be better as a static assert at the top level: static_assert (TARGET_CPU_generic < TARGET_CPU_MASK, "TARGET_CPU_NBITS is big enough"); OK with that change, thanks. Richard > } > > } // namespace selftest
On 08/04/2022 08:04, Richard Sandiford wrote: > I think this would be better as a static assert at the top level: > > static_assert (TARGET_CPU_generic < TARGET_CPU_MASK, > "TARGET_CPU_NBITS is big enough"); The motivation being that you want this to be checked regardless of whether we are using CHECKING_P?
"Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com> writes: > On 08/04/2022 08:04, Richard Sandiford wrote: >> I think this would be better as a static assert at the top level: >> >> static_assert (TARGET_CPU_generic < TARGET_CPU_MASK, >> "TARGET_CPU_NBITS is big enough"); > The motivation being that you want this to be checked regardless of > whether we are using CHECKING_P? Yeah, that's one reason. Others are that static_asserts are caught at the earliest possible moment (as a build failure) and that they don't require any object code. Thanks, Richard
diff --git a/gcc/config.gcc b/gcc/config.gcc index 2cc5aeec9e46aacc46e4e509d01f57f9f6585169..8946984f5e93cd6fe5364b44e74c64e714ba1d6a 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -4239,7 +4239,7 @@ case "${target}" in ext_val=`echo $ext_val | sed -e 's/[a-z0-9]\+//'` done - ext_mask="(("$ext_mask") << 6)" + ext_mask="(("$ext_mask") << TARGET_CPU_NBITS)" if [ x"$base_id" != x ]; then target_cpu_cname="TARGET_CPU_$base_id | $ext_mask" fi diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index efa46ac0b8799b5849b609d591186e26e5cb37ff..97b2d38dac9f13717e90f4f6996ab94814922950 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -813,10 +813,17 @@ enum target_cpus TARGET_CPU_generic }; + +/* Define how many bits are used to represent the CPU in TARGET_CPU_DEFAULT. + This needs to be big enough to fit the value of TARGET_CPU_generic. + All bits after this are used to represent the AARCH64_CPU_DEFAULT_FLAGS. */ +#define TARGET_CPU_NBITS 8 +#define TARGET_CPU_MASK ((1 << TARGET_CPU_NBITS) - 1) + /* If there is no CPU defined at configure, use generic as default. */ #ifndef TARGET_CPU_DEFAULT #define TARGET_CPU_DEFAULT \ - (TARGET_CPU_generic | (AARCH64_CPU_DEFAULT_FLAGS << 6)) + (TARGET_CPU_generic | (AARCH64_CPU_DEFAULT_FLAGS << TARGET_CPU_NBITS)) #endif /* If inserting NOP before a mult-accumulate insn remember to adjust the diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index d66626748e5edc46e46edf10e243114a9f74be97..9d83e007d7a31455c167bf21124cf9fbc0d31788 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -18062,12 +18062,12 @@ aarch64_get_tune_cpu (enum aarch64_processor cpu) if (cpu != aarch64_none) return &all_cores[cpu]; - /* The & 0x3f is to extract the bottom 6 bits that encode the - default cpu as selected by the --with-cpu GCC configure option + /* The & TARGET_CPU_MASK is to extract the bottom TARGET_CPU_NBITS bits that + encode the default cpu as selected by the --with-cpu GCC configure option in config.gcc. ???: The whole TARGET_CPU_DEFAULT and AARCH64_CPU_DEFAULT_FLAGS flags mechanism should be reworked to make it more sane. */ - return &all_cores[TARGET_CPU_DEFAULT & 0x3f]; + return &all_cores[TARGET_CPU_DEFAULT & TARGET_CPU_MASK]; } /* Return the architecture corresponding to the enum ARCH. @@ -18079,7 +18079,8 @@ aarch64_get_arch (enum aarch64_arch arch) if (arch != aarch64_no_arch) return &all_architectures[arch]; - const struct processor *cpu = &all_cores[TARGET_CPU_DEFAULT & 0x3f]; + const struct processor *cpu + = &all_cores[TARGET_CPU_DEFAULT & TARGET_CPU_MASK]; return &all_architectures[cpu->arch]; } @@ -18166,7 +18167,7 @@ aarch64_override_options (void) { /* Get default configure-time CPU. */ selected_cpu = aarch64_get_tune_cpu (aarch64_none); - aarch64_isa_flags = TARGET_CPU_DEFAULT >> 6; + aarch64_isa_flags = TARGET_CPU_DEFAULT >> TARGET_CPU_NBITS; } if (selected_tune) @@ -27260,6 +27261,9 @@ aarch64_run_selftests (void) { aarch64_test_loading_full_dump (); aarch64_test_fractional_cost (); + /* Check whether the TARGET_CPU_MASK is big enough to fit all of the definded + CPUs. */ + ASSERT_TRUE (TARGET_CPU_generic < TARGET_CPU_MASK); } } // namespace selftest