Message ID | 158326549681.40452.14219781294371149597.stgit@naples-babu.amd.com |
---|---|
State | New |
Headers | show |
Series | APIC ID fixes for AMD EPYC CPU model | expand |
On Tue, 03 Mar 2020 13:58:16 -0600 Babu Moger <babu.moger@amd.com> wrote: > Add the new EPYC model specific handlers to fix the apicid decoding. > > The APIC ID is decoded based on the sequence sockets->dies->cores->threads. > This works fine for most standard AMD and other vendors' configurations, > but this decoding sequence does not follow that of AMD's APIC ID enumeration > strictly. In some cases this can cause CPU topology inconsistency. > > When booting a guest VM, the kernel tries to validate the topology, and finds > it inconsistent with the enumeration of EPYC cpu models. The more details are > in the bug https://bugzilla.redhat.com/show_bug.cgi?id=1728166. > > To fix the problem we need to build the topology as per the Processor > Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 > Processors. > It is available at https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip > > Here is the text from the PPR. > Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the > number of least significant bits in the Initial APIC ID that indicate core ID > within a processor, in constructing per-core CPUID masks. > Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores > (MNC) that the processor could theoretically support, not the actual number of > cores that are actually implemented or enabled on the processor, as indicated > by Core::X86::Cpuid::SizeId[NC]. > Each Core::X86::Apic::ApicId[ApicId] register is preset as follows: > • ApicId[6] = Socket ID. > • ApicId[5:4] = Node ID. > • ApicId[3] = Logical CCX L3 complex ID > • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]} > > Signed-off-by: Babu Moger <babu.moger@amd.com> Acked-by: Igor Mammedov <imammedo@redhat.com> > --- > target/i386/cpu.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c > index f33d8b77f5..f870f7c55b 100644 > --- a/target/i386/cpu.c > +++ b/target/i386/cpu.c > @@ -3884,6 +3884,10 @@ static X86CPUDefinition builtin_x86_defs[] = { > .xlevel = 0x8000001E, > .model_id = "AMD EPYC Processor", > .cache_info = &epyc_cache_info, > + .apicid_from_cpu_idx = x86_apicid_from_cpu_idx_epyc, > + .topo_ids_from_apicid = x86_topo_ids_from_apicid_epyc, > + .apicid_from_topo_ids = x86_apicid_from_topo_ids_epyc, > + .apicid_pkg_offset = apicid_pkg_offset_epyc, > .versions = (X86CPUVersionDefinition[]) { > { .version = 1 }, > { > >
On 3/9/20 10:03 AM, Igor Mammedov wrote: > On Tue, 03 Mar 2020 13:58:16 -0600 > Babu Moger <babu.moger@amd.com> wrote: > >> Add the new EPYC model specific handlers to fix the apicid decoding. >> >> The APIC ID is decoded based on the sequence sockets->dies->cores->threads. >> This works fine for most standard AMD and other vendors' configurations, >> but this decoding sequence does not follow that of AMD's APIC ID enumeration >> strictly. In some cases this can cause CPU topology inconsistency. >> >> When booting a guest VM, the kernel tries to validate the topology, and finds >> it inconsistent with the enumeration of EPYC cpu models. The more details are >> in the bug https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.redhat.com%2Fshow_bug.cgi%3Fid%3D1728166&data=02%7C01%7Cbabu.moger%40amd.com%7C3ddda6803d584aac171b08d7c43b0530%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637193630113242242&sdata=3TkNY2O8HWBqaOrmO8QQoxlzvIv2oEdTO1P9k6VglmU%3D&reserved=0. >> >> To fix the problem we need to build the topology as per the Processor >> Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 >> Processors. >> It is available at https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.amd.com%2Fsystem%2Ffiles%2FTechDocs%2F55570-B1_PUB.zip&data=02%7C01%7Cbabu.moger%40amd.com%7C3ddda6803d584aac171b08d7c43b0530%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637193630113242242&sdata=cPYmcthdgilh9lOiGDrwKwOt7cn%2BvBcT%2F8PhRs92x8I%3D&reserved=0 >> >> Here is the text from the PPR. >> Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the >> number of least significant bits in the Initial APIC ID that indicate core ID >> within a processor, in constructing per-core CPUID masks. >> Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores >> (MNC) that the processor could theoretically support, not the actual number of >> cores that are actually implemented or enabled on the processor, as indicated >> by Core::X86::Cpuid::SizeId[NC]. >> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows: >> • ApicId[6] = Socket ID. >> • ApicId[5:4] = Node ID. >> • ApicId[3] = Logical CCX L3 complex ID >> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]} >> >> Signed-off-by: Babu Moger <babu.moger@amd.com> > > Acked-by: Igor Mammedov <imammedo@redhat.com> If use a boolean variable, then I dont need all these handlers in X86CPUDefinition. I just need to set a boolean variable here. > >> --- >> target/i386/cpu.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c >> index f33d8b77f5..f870f7c55b 100644 >> --- a/target/i386/cpu.c >> +++ b/target/i386/cpu.c >> @@ -3884,6 +3884,10 @@ static X86CPUDefinition builtin_x86_defs[] = { >> .xlevel = 0x8000001E, >> .model_id = "AMD EPYC Processor", >> .cache_info = &epyc_cache_info, >> + .apicid_from_cpu_idx = x86_apicid_from_cpu_idx_epyc, >> + .topo_ids_from_apicid = x86_topo_ids_from_apicid_epyc, >> + .apicid_from_topo_ids = x86_apicid_from_topo_ids_epyc, >> + .apicid_pkg_offset = apicid_pkg_offset_epyc, >> .versions = (X86CPUVersionDefinition[]) { >> { .version = 1 }, >> { >> >> >
On Mon, 9 Mar 2020 14:12:10 -0500 Babu Moger <babu.moger@amd.com> wrote: > On 3/9/20 10:03 AM, Igor Mammedov wrote: > > On Tue, 03 Mar 2020 13:58:16 -0600 > > Babu Moger <babu.moger@amd.com> wrote: > > > >> Add the new EPYC model specific handlers to fix the apicid decoding. > >> > >> The APIC ID is decoded based on the sequence sockets->dies->cores->threads. > >> This works fine for most standard AMD and other vendors' configurations, > >> but this decoding sequence does not follow that of AMD's APIC ID enumeration > >> strictly. In some cases this can cause CPU topology inconsistency. > >> > >> When booting a guest VM, the kernel tries to validate the topology, and finds > >> it inconsistent with the enumeration of EPYC cpu models. The more details are > >> in the bug https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.redhat.com%2Fshow_bug.cgi%3Fid%3D1728166&data=02%7C01%7Cbabu.moger%40amd.com%7C3ddda6803d584aac171b08d7c43b0530%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637193630113242242&sdata=3TkNY2O8HWBqaOrmO8QQoxlzvIv2oEdTO1P9k6VglmU%3D&reserved=0. > >> > >> To fix the problem we need to build the topology as per the Processor > >> Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 > >> Processors. > >> It is available at https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.amd.com%2Fsystem%2Ffiles%2FTechDocs%2F55570-B1_PUB.zip&data=02%7C01%7Cbabu.moger%40amd.com%7C3ddda6803d584aac171b08d7c43b0530%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637193630113242242&sdata=cPYmcthdgilh9lOiGDrwKwOt7cn%2BvBcT%2F8PhRs92x8I%3D&reserved=0 > >> > >> Here is the text from the PPR. > >> Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the > >> number of least significant bits in the Initial APIC ID that indicate core ID > >> within a processor, in constructing per-core CPUID masks. > >> Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores > >> (MNC) that the processor could theoretically support, not the actual number of > >> cores that are actually implemented or enabled on the processor, as indicated > >> by Core::X86::Cpuid::SizeId[NC]. > >> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows: > >> • ApicId[6] = Socket ID. > >> • ApicId[5:4] = Node ID. > >> • ApicId[3] = Logical CCX L3 complex ID > >> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]} > >> > >> Signed-off-by: Babu Moger <babu.moger@amd.com> > > > > Acked-by: Igor Mammedov <imammedo@redhat.com> > > If use a boolean variable, then I dont need all these handlers in > X86CPUDefinition. I just need to set a boolean variable here. agreed, that would be better > > > >> --- > >> target/i386/cpu.c | 4 ++++ > >> 1 file changed, 4 insertions(+) > >> > >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c > >> index f33d8b77f5..f870f7c55b 100644 > >> --- a/target/i386/cpu.c > >> +++ b/target/i386/cpu.c > >> @@ -3884,6 +3884,10 @@ static X86CPUDefinition builtin_x86_defs[] = { > >> .xlevel = 0x8000001E, > >> .model_id = "AMD EPYC Processor", > >> .cache_info = &epyc_cache_info, > >> + .apicid_from_cpu_idx = x86_apicid_from_cpu_idx_epyc, > >> + .topo_ids_from_apicid = x86_topo_ids_from_apicid_epyc, > >> + .apicid_from_topo_ids = x86_apicid_from_topo_ids_epyc, > >> + .apicid_pkg_offset = apicid_pkg_offset_epyc, > >> .versions = (X86CPUVersionDefinition[]) { > >> { .version = 1 }, > >> { > >> > >> > > >
diff --git a/target/i386/cpu.c b/target/i386/cpu.c index f33d8b77f5..f870f7c55b 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -3884,6 +3884,10 @@ static X86CPUDefinition builtin_x86_defs[] = { .xlevel = 0x8000001E, .model_id = "AMD EPYC Processor", .cache_info = &epyc_cache_info, + .apicid_from_cpu_idx = x86_apicid_from_cpu_idx_epyc, + .topo_ids_from_apicid = x86_topo_ids_from_apicid_epyc, + .apicid_from_topo_ids = x86_apicid_from_topo_ids_epyc, + .apicid_pkg_offset = apicid_pkg_offset_epyc, .versions = (X86CPUVersionDefinition[]) { { .version = 1 }, {
Add the new EPYC model specific handlers to fix the apicid decoding. The APIC ID is decoded based on the sequence sockets->dies->cores->threads. This works fine for most standard AMD and other vendors' configurations, but this decoding sequence does not follow that of AMD's APIC ID enumeration strictly. In some cases this can cause CPU topology inconsistency. When booting a guest VM, the kernel tries to validate the topology, and finds it inconsistent with the enumeration of EPYC cpu models. The more details are in the bug https://bugzilla.redhat.com/show_bug.cgi?id=1728166. To fix the problem we need to build the topology as per the Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors. It is available at https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip Here is the text from the PPR. Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the number of least significant bits in the Initial APIC ID that indicate core ID within a processor, in constructing per-core CPUID masks. Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores (MNC) that the processor could theoretically support, not the actual number of cores that are actually implemented or enabled on the processor, as indicated by Core::X86::Cpuid::SizeId[NC]. Each Core::X86::Apic::ApicId[ApicId] register is preset as follows: • ApicId[6] = Socket ID. • ApicId[5:4] = Node ID. • ApicId[3] = Logical CCX L3 complex ID • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]} Signed-off-by: Babu Moger <babu.moger@amd.com> --- target/i386/cpu.c | 4 ++++ 1 file changed, 4 insertions(+)