Message ID | 20230808075600.1878105-1-hongtao.liu@intel.com |
---|---|
State | New |
Headers | show |
Series | [X86] Workaround possible CPUID bug in Sandy Bridge. | expand |
On Tue, Aug 08, 2023 at 03:56:00PM +0800, liuhongt via Gcc-patches wrote:
> + /* According to document, when subleaf is invliad, EAX,EBX,ECX,EDX should
s/invliad/invalid/
Jakub
On Tue, Aug 8, 2023 at 9:58 AM liuhongt <hongtao.liu@intel.com> wrote: > > Don't access leaf 7 subleaf 1 unless subleaf 0 says it is > supported via EAX. > > Intel documentation says invalid subleaves return 0. We had been > relying on that behavior instead of checking the max sublef number. > > It appears that some Sandy Bridge CPUs return at least the subleaf 0 > EDX value for subleaf 1. Best guess is that this is a bug in a > microcode patch since all of the bits we're seeing set in EDX were > introduced after Sandy Bridge was originally released. > > This is causing avxvnniint16 to be incorrectly enabled with > -march=native on these CPUs. > > BTW: Thanks for reminder from llvm forks Phoebe and Craig. > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Please rather do it in a more self-descriptive way, as proposed in the attached patch. You won't need a comment then. (Please note that indentation is wrong in the patch in order to better see the changes). Uros. diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h index 30ef0d334ca..49724d2cba1 100644 --- a/gcc/common/config/i386/cpuinfo.h +++ b/gcc/common/config/i386/cpuinfo.h @@ -762,7 +762,9 @@ get_available_features (struct __processor_model *cpu_model, /* Get Advanced Features at level 7 (eax = 7, ecx = 0/1). */ if (max_cpuid_level >= 7) { - __cpuid_count (7, 0, eax, ebx, ecx, edx); + unsigned subleaf_level; + + __cpuid_count (7, 0, subleaf_level, ebx, ecx, edx); if (ebx & bit_BMI) set_feature (FEATURE_BMI); if (ebx & bit_SGX) @@ -873,8 +875,9 @@ get_available_features (struct __processor_model *cpu_model, if (edx & bit_AVX512FP16) set_feature (FEATURE_AVX512FP16); } - - __cpuid_count (7, 1, eax, ebx, ecx, edx); + if (subleaf_level >= 1) + { + __cpuid_count (7, 1, eax, ebx, ecx, edx); if (eax & bit_HRESET) set_feature (FEATURE_HRESET); if (eax & bit_CMPCCXADD) @@ -914,6 +917,7 @@ get_available_features (struct __processor_model *cpu_model, if (edx & bit_AMX_COMPLEX) set_feature (FEATURE_AMX_COMPLEX); } + } } /* Get Advanced Features at level 0xd (eax = 0xd, ecx = 1). */
diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h index 30ef0d334ca..24ab2252eb0 100644 --- a/gcc/common/config/i386/cpuinfo.h +++ b/gcc/common/config/i386/cpuinfo.h @@ -874,45 +874,53 @@ get_available_features (struct __processor_model *cpu_model, set_feature (FEATURE_AVX512FP16); } - __cpuid_count (7, 1, eax, ebx, ecx, edx); - if (eax & bit_HRESET) - set_feature (FEATURE_HRESET); - if (eax & bit_CMPCCXADD) - set_feature(FEATURE_CMPCCXADD); - if (edx & bit_PREFETCHI) - set_feature (FEATURE_PREFETCHI); - if (eax & bit_RAOINT) - set_feature (FEATURE_RAOINT); - if (avx_usable) - { - if (eax & bit_AVXVNNI) - set_feature (FEATURE_AVXVNNI); - if (eax & bit_AVXIFMA) - set_feature (FEATURE_AVXIFMA); - if (edx & bit_AVXVNNIINT8) - set_feature (FEATURE_AVXVNNIINT8); - if (edx & bit_AVXNECONVERT) - set_feature (FEATURE_AVXNECONVERT); - if (edx & bit_AVXVNNIINT16) - set_feature (FEATURE_AVXVNNIINT16); - if (eax & bit_SM3) - set_feature (FEATURE_SM3); - if (eax & bit_SHA512) - set_feature (FEATURE_SHA512); - if (eax & bit_SM4) - set_feature (FEATURE_SM4); - } - if (avx512_usable) - { - if (eax & bit_AVX512BF16) - set_feature (FEATURE_AVX512BF16); - } - if (amx_usable) + /* According to document, when subleaf is invliad, EAX,EBX,ECX,EDX should + return 0 for CPUID (7, 1, EAX, EBX, ECX, EDX). + But looks like it doesn't satisfy the document on some CPU, refer to + https://reviews.llvm.org/D155145. + Manually check valid subleaf here. */ + if (eax) { - if (eax & bit_AMX_FP16) - set_feature (FEATURE_AMX_FP16); - if (edx & bit_AMX_COMPLEX) - set_feature (FEATURE_AMX_COMPLEX); + __cpuid_count (7, 1, eax, ebx, ecx, edx); + if (eax & bit_HRESET) + set_feature (FEATURE_HRESET); + if (eax & bit_CMPCCXADD) + set_feature(FEATURE_CMPCCXADD); + if (edx & bit_PREFETCHI) + set_feature (FEATURE_PREFETCHI); + if (eax & bit_RAOINT) + set_feature (FEATURE_RAOINT); + if (avx_usable) + { + if (eax & bit_AVXVNNI) + set_feature (FEATURE_AVXVNNI); + if (eax & bit_AVXIFMA) + set_feature (FEATURE_AVXIFMA); + if (edx & bit_AVXVNNIINT8) + set_feature (FEATURE_AVXVNNIINT8); + if (edx & bit_AVXNECONVERT) + set_feature (FEATURE_AVXNECONVERT); + if (edx & bit_AVXVNNIINT16) + set_feature (FEATURE_AVXVNNIINT16); + if (eax & bit_SM3) + set_feature (FEATURE_SM3); + if (eax & bit_SHA512) + set_feature (FEATURE_SHA512); + if (eax & bit_SM4) + set_feature (FEATURE_SM4); + } + if (avx512_usable) + { + if (eax & bit_AVX512BF16) + set_feature (FEATURE_AVX512BF16); + } + if (amx_usable) + { + if (eax & bit_AMX_FP16) + set_feature (FEATURE_AMX_FP16); + if (edx & bit_AMX_COMPLEX) + set_feature (FEATURE_AMX_COMPLEX); + } } }