[v9,0/5] Support x2APIC mode with TCG accelerator

Message ID 20231024152105.35942-1-minhquangbui99@gmail.com
State New

Commit Message

Bui Quang Minh Oct. 24, 2023, 3:21 p.m. UTC
Hi everyone,

This series implements x2APIC mode in the userspace local APIC, along with the
RDMSR/WRMSR helpers that access x2APIC registers in that mode. The Intel and
AMD IOMMUs are adjusted to support x2APIC interrupt remapping. With this
series, a Linux kernel can boot into x2APIC mode under the TCG accelerator
using either the Intel or the AMD IOMMU.
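
For readers less familiar with x2APIC register access, here is a minimal sketch of how an x2APIC MSR index relates to the xAPIC MMIO offset of the same register. The 0x800 base and the 16-byte register spacing are architectural; the function names are made up for this illustration and are not the helpers added by the series. The mapping is only the general rule: a few registers are exceptions (for example, the ICR becomes a single 64-bit MSR and the DFR has no x2APIC equivalent).

```c
#include <assert.h>
#include <stdint.h>

/* Architectural constants; the function names below are hypothetical. */
#define X2APIC_MSR_BASE 0x800u  /* first MSR of the x2APIC range */
#define X2APIC_MSR_LAST 0x8ffu  /* last MSR of the x2APIC range */

/*
 * Map an x2APIC MSR index to the xAPIC MMIO offset of the same register.
 * Returns -1 if the index lies outside the x2APIC MSR range.
 */
int64_t x2apic_msr_to_mmio_offset(uint32_t msr)
{
    if (msr < X2APIC_MSR_BASE || msr > X2APIC_MSR_LAST) {
        return -1;
    }
    return (int64_t)(msr - X2APIC_MSR_BASE) << 4;
}

/* Inverse mapping: xAPIC registers sit on 16-byte boundaries in a 4 KiB page. */
uint32_t mmio_offset_to_x2apic_msr(uint32_t offset)
{
    assert((offset & 0xf) == 0 && offset < 0x1000);
    return X2APIC_MSR_BASE + (offset >> 4);
}
```

With this rule, MSR 0x803 corresponds to MMIO offset 0x30 (the APIC version register) and MSR 0x808 to offset 0x80 (the task priority register).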

I tested by booting a self-built Linux 6.3.0-rc2 kernel. It boots successfully
with x2APIC enabled and can enumerate a CPU with APIC ID 257.

Using Intel IOMMU

qemu/build/qemu-system-x86_64 \
  -smp 2,maxcpus=260 \
  -cpu qemu64,x2apic=on \
  -machine q35 \
  -device intel-iommu,intremap=on,eim=on \
  -device qemu64-x86_64-cpu,x2apic=on,core-id=257,socket-id=0,thread-id=0 \
  -m 2G \
  -kernel $KERNEL_DIR \
  -append "nokaslr console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0" \
  -drive file=$IMAGE_DIR,format=raw \
  -nographic \
  -s

Using AMD IOMMU

qemu/build/qemu-system-x86_64 \
  -smp 2,maxcpus=260 \
  -cpu qemu64,x2apic=on \
  -machine q35 \
  -device amd-iommu,intremap=on,xtsup=on \
  -device qemu64-x86_64-cpu,x2apic=on,core-id=257,socket-id=0,thread-id=0 \
  -m 2G \
  -kernel $KERNEL_DIR \
  -append "nokaslr console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0" \
  -drive file=$IMAGE_DIR,format=raw \
  -nographic \
  -s

To test the emulated userspace APIC with kvm-unit-tests, disable the test
device with the lib/x86/fwcfg.c patch shown under "Patch" below, then run:

~ env QEMU=/home/minh/Desktop/oss/qemu/build/qemu-system-x86_64 ACCEL=tcg \
./run_tests.sh -v -g apic

TESTNAME=apic-split TIMEOUT=90s ACCEL=tcg ./x86/run x86/apic.flat -smp 2 -cpu qemu64,+x2apic,+tsc-deadline -machine kernel_irqchip=split
FAIL apic-split (54 tests, 8 unexpected failures, 1 skipped)
TESTNAME=ioapic-split TIMEOUT=90s ACCEL=tcg ./x86/run x86/ioapic.flat -smp 1 -cpu qemu64 -machine kernel_irqchip=split
PASS ioapic-split (19 tests)
TESTNAME=x2apic TIMEOUT=30 ACCEL=tcg ./x86/run x86/apic.flat -smp 2 -cpu qemu64,+x2apic,+tsc-deadline
FAIL x2apic (54 tests, 8 unexpected failures, 1 skipped)
TESTNAME=xapic TIMEOUT=60 ACCEL=tcg ./x86/run x86/apic.flat -smp 2 -cpu qemu64,-x2apic,+tsc-deadline -machine pit=off
FAIL xapic (43 tests, 6 unexpected failures, 2 skipped)

  FAIL: apic_disable: *0xfee00030: 50014
  FAIL: apic_disable: *0xfee00080: f0
  FAIL: apic_disable: *0xfee00030: 50014
  FAIL: apic_disable: *0xfee00080: f0
  FAIL: apicbase: relocate apic

These failures occur because we do not disable the MMIO region when switching
to x2APIC mode and do not yet support relocating it. The MMIO region is
currently shared by all CPUs, so supporting either feature requires a way to
allocate and manage a separate MMIO region for each CPU. This can be a future
improvement.
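
To make the relocation and mode-switch discussion concrete, here is a rough sketch of how the IA32_APIC_BASE MSR value decodes into an MMIO base address and mode flags. The bit positions (BSP at bit 8, x2APIC enable at bit 10, global enable at bit 11, base address from bit 12 up) are architectural. The struct, the function names, and the 52-bit physical address width used for the mask are assumptions made for this illustration, not code from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define APICBASE_BSP     (1ull << 8)   /* bootstrap processor flag */
#define APICBASE_EXTD    (1ull << 10)  /* x2APIC mode enable */
#define APICBASE_ENABLE  (1ull << 11)  /* APIC global enable */
/* Assumption for this sketch: the base field covers bits 12..51. */
#define APICBASE_ADDR    0x000ffffffffff000ull
#define APIC_DEFAULT_MMIO_BASE 0xfee00000ull

/* Hypothetical decoded view of the MSR; not a QEMU structure. */
struct apic_base_info {
    uint64_t mmio_base;  /* where the 4 KiB register window is mapped */
    bool enabled;
    bool x2apic;
    bool bsp;
};

/* Split a raw IA32_APIC_BASE value into its fields. */
void decode_apic_base(uint64_t msr_val, struct apic_base_info *out)
{
    assert(out != NULL);
    out->mmio_base = msr_val & APICBASE_ADDR;
    out->enabled = (msr_val & APICBASE_ENABLE) != 0;
    out->x2apic = (msr_val & APICBASE_EXTD) != 0;
    out->bsp = (msr_val & APICBASE_BSP) != 0;
}

/* True if the guest moved the register window away from the default. */
bool apic_base_relocated(const struct apic_base_info *info)
{
    return info->mmio_base != APIC_DEFAULT_MMIO_BASE;
}
```

The failing addresses above, 0xfee00030 and 0xfee00080, are offsets 0x30 and 0x80 inside the default window at 0xfee00000.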

  FAIL: nmi-after-sti
  FAIL: multiple nmi

These failures come from the way CPU_INTERRUPT_NMI is handled in core TCG.

  FAIL: TMCCT should stay at zero

This failure is related to the APIC timer and should be addressed in a
separate patch.

Version 9 changes,
- Patch 1:
  + Create apic_msr_read/write, small wrappers around
  apic_register_read/write that add an x2APIC mode check
- Patch 2:
  + Remove raise_exception_ra, which is TCG specific. Instead, return -1
  and let the accelerator raise the appropriate exception
  + Refactor apic_get_delivery_bitmask a little bit to reduce line length
  + Move cpu_has_x2apic_feature and cpu_set_apic_feature from patch 3 to
  patch 2 so that patch 2 can be compiled without patch 3
- Patch 3:
  + set_base in APICCommonClass now returns an int to indicate error
  + Remove raise_exception_ra in apic_set_base, which is TCG specific.
  Instead, return -1 and let the accelerator raise the appropriate
  exception
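
The "return -1 and let the accelerator raise the exception" split can be pictured roughly as follows. This is only a sketch under stated assumptions: the enum, sketch_apic_set_mode and sketch_wrmsr_faults are hypothetical names, not the functions in the series. The transition rules encoded are the architectural ones (x2APIC cannot go straight back to xAPIC, and a disabled APIC cannot jump straight to x2APIC).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; not the identifiers used in QEMU. */
enum apic_mode { APIC_DISABLED, APIC_XAPIC, APIC_X2APIC };

/*
 * Device-side helper: apply a mode change, or return -1 if the transition
 * is one the architecture forbids. It never raises an exception itself,
 * so it carries no dependency on a particular accelerator.
 */
int sketch_apic_set_mode(enum apic_mode *cur, enum apic_mode next)
{
    assert(cur != NULL);

    bool illegal =
        (*cur == APIC_X2APIC && next == APIC_XAPIC) ||   /* must disable first */
        (*cur == APIC_DISABLED && next == APIC_X2APIC);  /* must enter xAPIC first */

    if (illegal) {
        return -1;
    }
    *cur = next;
    return 0;
}

/*
 * Accelerator-side caller: turns the error code into a fault decision.
 * A real accelerator would inject #GP here; this sketch only reports it.
 */
bool sketch_wrmsr_faults(enum apic_mode *cur, enum apic_mode next)
{
    return sketch_apic_set_mode(cur, next) < 0;
}
```

Keeping the exception out of the device helper means each accelerator can reuse the same check and inject the fault in its own way.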

Version 8 changes,
- Patch 2, 4:
  + Rebase to master and resolve conflicts in these 2 patches

Version 7 changes,
- Patch 4:
  + If eim=on, keep checking if kvm x2APIC is enabled when kernel-irqchip
  is split

Version 6 changes,
- Patch 5:
  + Make all places use the amdvi_extended_feature_register to get extended
  feature register

Version 5 changes,
- Patch 3:
  + Rebase to master and fix conflict
- Patch 5:
  + Create a helper function to get amdvi extended feature register instead
  of storing it in AMDVIState

Version 4 changes,
- Patch 5:
  + Instead of replacing IVHD type 0x10 with type 0x11, export both types
  for backward compatibility with old guest operating system
  + Flip the xtsup feature check condition in amdvi_int_remap_ga for
  readability

Version 3 changes,
- Patch 2:
  + Allow APIC ID > 255 only when x2APIC feature is supported on CPU
  + Make physical destination mode IPI which has destination id 0xffffffff
  a broadcast to xAPIC CPUs
  + Make cluster address 0xf in cluster model of xAPIC logical destination
  mode a broadcast to all clusters
  + Create new extended_log_dest to store APIC_LDR information in x2APIC
  instead of extending log_dest for backward compatibility in vmstate
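
As background for the destination handling listed above, here is a small sketch of how the x2APIC logical ID is derived from the APIC ID and how a logical destination is matched. The derivation (cluster from ID bits 19:4, one-hot bit from ID bits 3:0) and the 0xffffffff broadcast value are architectural. The function names are invented for this example and do not correspond to the code in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X2APIC_BROADCAST 0xffffffffu  /* destination meaning "all CPUs" */

/* xAPIC IDs are 8 bits wide, so IDs above 255 are only reachable in x2APIC. */
bool apic_id_needs_x2apic(uint32_t id)
{
    return id > 255;
}

/*
 * Derive the x2APIC logical ID from the APIC ID:
 * cluster = ID[19:4] placed in bits 31:16, one-hot bit for ID[3:0] in bits 15:0.
 */
uint32_t x2apic_ldr_from_id(uint32_t id)
{
    assert(id != X2APIC_BROADCAST);  /* reserved for broadcast, not a valid ID */
    return (((id >> 4) & 0xffffu) << 16) | (1u << (id & 0xfu));
}

/* Logical-mode match: broadcast, or same cluster with a common member bit. */
bool x2apic_logical_match(uint32_t ldr, uint32_t dest)
{
    if (dest == X2APIC_BROADCAST) {
        return true;
    }
    return (ldr >> 16) == (dest >> 16) && (ldr & dest & 0xffffu) != 0;
}
```

For the CPU with APIC ID 257 (0x101) used in the boot test, this gives cluster 0x10 and member bit 1, so the logical ID is 0x00100002.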

Version 2 changes,
- Add support for APIC ID larger than 255
- Adjust AMD iommu for x2APIC support
- Reorganize and split patch 1,2 into patch 1,2,3 in version 2

Thanks,
Quang Minh.

Bui Quang Minh (5):
  i386/tcg: implement x2APIC registers MSR access
  apic: add support for x2APIC mode
  apic, i386/tcg: add x2apic transitions
  intel_iommu: allow Extended Interrupt Mode when using userspace APIC
  amd_iommu: report x2APIC support to the operating system

 hw/i386/acpi-build.c                 | 129 +++++---
 hw/i386/amd_iommu.c                  |  29 +-
 hw/i386/amd_iommu.h                  |  16 +-
 hw/i386/intel_iommu.c                |   6 +-
 hw/i386/kvm/apic.c                   |   3 +-
 hw/i386/x86.c                        |   6 +-
 hw/i386/xen/xen_apic.c               |   3 +-
 hw/intc/apic.c                       | 464 +++++++++++++++++++++------
 hw/intc/apic_common.c                |  22 +-
 hw/intc/trace-events                 |   4 +-
 include/hw/i386/apic.h               |   8 +-
 include/hw/i386/apic_internal.h      |   9 +-
 target/i386/cpu-sysemu.c             |  18 +-
 target/i386/cpu.c                    |   9 +-
 target/i386/cpu.h                    |   9 +
 target/i386/tcg/sysemu/misc_helper.c |  41 ++-
 target/i386/whpx/whpx-apic.c         |   3 +-
 17 files changed, 591 insertions(+), 188 deletions(-)

Comments

Santosh Shukla Nov. 9, 2023, 10:11 a.m. UTC | #1
Hi Bui,

I have tested v9 on an EPYC-Genoa system with KVM acceleration enabled, and I
could see more than 255 vCPUs in both Linux and Windows guests.

Tested-by: Santosh Shukla <Santosh.Shukla@amd.com>

Thanks,
Santosh
Bui Quang Minh Nov. 9, 2023, 2:10 p.m. UTC | #2
On 11/9/23 17:11, Santosh Shukla wrote:
> Hi Bui,
> 
> I have tested v9 on EPYC-Genoa system with kvm acceleration mode on, I could
> see > 255 vCPU for Linux and Windows Guest.
> 
> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com>

Hi Santosh,

With KVM enabled, you may be using the in-kernel APIC from KVM rather than the
emulated userspace APIC that this series changes.

Thanks,
Quang Minh.
Joao Martins Nov. 9, 2023, 2:32 p.m. UTC | #3
On 09/11/2023 14:10, Bui Quang Minh wrote:
> On 11/9/23 17:11, Santosh Shukla wrote:
>> Hi Bui,
>>
>> I have tested v9 on EPYC-Genoa system with kvm acceleration mode on, I could
>> see > 255 vCPU for Linux and Windows Guest.
>>
>> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com>
> 
> Hi Santosh,
> 
> With KVM enabled, you may be using the in kernel APIC from KVM not the emulated
> APIC in userspace as in this series.
> 

Your XTSup code isn't necessarily userspace APIC specific. You can have
accel=kvm with split irqchip and things will still work. I suspect that's how
Santosh tested it.

	Joao
Bui Quang Minh Nov. 9, 2023, 2:42 p.m. UTC | #4
On 11/9/23 21:32, Joao Martins wrote:
> On 09/11/2023 14:10, Bui Quang Minh wrote:
>> On 11/9/23 17:11, Santosh Shukla wrote:
>>> Hi Bui,
>>>
>>> I have tested v9 on EPYC-Genoa system with kvm acceleration mode on, I could
>>> see > 255 vCPU for Linux and Windows Guest.
>>>
>>> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com>
>>
>> Hi Santosh,
>>
>> With KVM enabled, you may be using the in kernel APIC from KVM not the emulated
>> APIC in userspace as in this series.
>>
> 
> Your XTSup code isn't necessarily userspace APIC specific. You can have
> accel=kvm with split irqchip and things will still work. I suspect that's how
> Santosh tested it.

Ah, I got it. Thanks Santosh, Joao.
Quang Minh.
Santosh Shukla Nov. 9, 2023, 3:29 p.m. UTC | #5
On 11/9/2023 8:12 PM, Bui Quang Minh wrote:
> On 11/9/23 21:32, Joao Martins wrote:
>> On 09/11/2023 14:10, Bui Quang Minh wrote:
>>> On 11/9/23 17:11, Santosh Shukla wrote:
>>>> Hi Bui,
>>>>
>>>> I have tested v9 on EPYC-Genoa system with kvm acceleration mode on, I could
>>>> see > 255 vCPU for Linux and Windows Guest.
>>>>
>>>> Tested-by: Santosh Shukla <Santosh.Shukla@amd.com>
>>>
>>> Hi Santosh,
>>>
>>> With KVM enabled, you may be using the in kernel APIC from KVM not the emulated
>>> APIC in userspace as in this series.
>>>
>>
>> Your XTSup code isn't necessarily userspace APIC specific. You can have
>> accel=kvm with split irqchip and things will still work. I suspect that's how
>> Santosh tested it.
> 
That's correct.

> Ah, I got it. Thanks Santosh, Joao.
> Quang Minh.
> 

Thanks,
Santosh
Patch

diff --git a/lib/x86/fwcfg.c b/lib/x86/fwcfg.c
index 1734afb..f56fe1c 100644
--- a/lib/x86/fwcfg.c
+++ b/lib/x86/fwcfg.c
@@ -27,6 +27,7 @@  static void read_cfg_override(void)

        if ((str = getenv("TEST_DEVICE")))
                no_test_device = !atol(str);
+       no_test_device = true;

        if ((str = getenv("MEMLIMIT")))
                fw_override[FW_CFG_MAX_RAM] = atol(str) * 1024 * 1024;