[v1,00/40] TDX QEMU support

Message ID 20220802074750.2581308-1-xiaoyao.li@intel.com

Message

Xiaoyao Li Aug. 2, 2022, 7:47 a.m. UTC
This is the first version without the RFC tag, since the last RFC received
several Acked-by tags. I hope more people and reviewers can help review it.


This patch series enables TDX support so that QEMU can create and boot a
TD (TDX VM). It requires the corresponding KVM patch series [1].
TDX-related documents can be found in [2].

This series is also available on GitHub:

https://github.com/intel/qemu-tdx/tree/tdx-qemu-upstream-v1

Booting a TDX VM requires several changes and additional steps in the flow:

 1. specify the vm type KVM_X86_TDX_VM when creating VM with
    IOCTL(KVM_CREATE_VM);
 2. initialize VM scope configuration before creating any VCPU;
 3. initialize VCPU scope configuration;
 4. initialize virtual firmware (TDVF) in guest private memory before
    vcpu running;
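
The ordering constraints in the four steps above can be illustrated with a
small model. This is only a sketch in Python, not QEMU or KVM code: the class
TdBringup and all of its method names are hypothetical and exist purely to
show which step must precede which. The real flow issues ioctls to KVM.

```python
class TdBringup:
    """Toy model of the TD bring-up order; every name here is hypothetical."""

    def __init__(self, vm_type):
        # Step 1: the VM type is chosen when the VM is created.
        if vm_type != "KVM_X86_TDX_VM":
            raise ValueError("a TD must be created with vm type KVM_X86_TDX_VM")
        self.vm_initialized = False
        self.firmware_loaded = False
        self.vcpus = []  # one dict per vCPU
        self.log = []

    def init_vm(self):
        # Step 2: VM-scope configuration, before any vCPU exists.
        if self.vcpus:
            raise RuntimeError("VM-scope init must precede vCPU creation")
        self.vm_initialized = True
        self.log.append("init_vm")

    def create_vcpu(self):
        if not self.vm_initialized:
            raise RuntimeError("create vCPUs only after VM-scope init")
        self.vcpus.append({"initialized": False})
        self.log.append("create_vcpu")
        return len(self.vcpus) - 1

    def init_vcpu(self, index):
        # Step 3: vCPU-scope configuration.
        self.vcpus[index]["initialized"] = True
        self.log.append("init_vcpu")

    def load_firmware(self):
        # Step 4: place TDVF in guest private memory before any vCPU runs.
        if not self.vm_initialized:
            raise RuntimeError("load firmware only after VM-scope init")
        self.firmware_loaded = True
        self.log.append("load_firmware")

    def run(self):
        if not self.firmware_loaded:
            raise RuntimeError("firmware must be loaded before running")
        if not self.vcpus or not all(v["initialized"] for v in self.vcpus):
            raise RuntimeError("every vCPU must be initialized before running")
        self.log.append("run")
```

Calling the methods out of order raises an error in this model, which is the
only point of the sketch; it says nothing about how KVM itself reports such
mistakes.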

In addition, a TDX VM must boot with TDVF (TDX virtual firmware), and currently
upstream OVMF can serve as TDVF. This series adds support for parsing TDVF,
loading TDVF into the guest's private memory and preparing TD HOB info for TDVF.

[1] KVM TDX basic feature support v7
https://lore.kernel.org/all/cover.1656366337.git.isaku.yamahata@intel.com/

[2] https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html

== Limitation and future work ==
- Readonly memslot

  TDX supports readonly (write-protected) memslots only for shared memory, not
  for private memory. For simplicity, readonly memslots are marked as entirely
  unsupported for TDX.

- CPU model

  We cannot create a TD with an arbitrary CPU model as we can for non-TDX VMs,
  because only a subset of features can be configured for a TD.
  
  - It's recommended to use '-cpu host' to create TD;
  - '+feature/-feature' might not work as expected;

  Future work: introduce a specific CPU model for TDs and enhance +/-feature
               handling for TDs.

- gdb support

  Using gdb to debug a TD in off-debug mode is future work.

== Patch organization ==
1           Manually fetch Linux UAPI changes for TDX;
2-19,29-30  Basic TDX support that parses vm-type and invokes
            TDX-specific IOCTLs;
20-28       Load, parse and initialize TDVF for TDX VM;
31-35       Disable unsupported functions for TDX VM;
36-39       Avoid errors due to KVM's requirement on TDX;
40          Add documentation of TDX;

== Change history ==
Changes from RFC v4:
[RFC v4] https://lore.kernel.org/qemu-devel/20220512031803.3315890-1-xiaoyao.li@intel.com/

- Add 3 more patches (9, 10, 11) to improve tdx_get_supported_cpuid();
- make attributes of object tdx-guest not settable by user;
- improve get_tdx_capabilities() by using a known starting value and
  limiting the loop with a known size;
- clarify why isa.bios needs to be skipped;
- remove the MMIO hob setup since OVMF sets them up itself;

Changes from RFC v3:
[RFC v3] https://lore.kernel.org/qemu-devel/20220317135913.2166202-1-xiaoyao.li@intel.com/

- Load TDVF with -bios interface;
- Adapt to KVM API changes;
	- KVM_TDX_CAPABILITIES changes back to KVM-scope;
	- struct kvm_tdx_init_vm changes;
- Define TDX_SUPPORTED_KVM_FEATURES;
- Drop the patch of introducing property sept-ve-disable since it's not
  public yet;
- some misc cleanups


Changes from RFC v2:
[RFC v2] https://lore.kernel.org/qemu-devel/cover.1625704980.git.isaku.yamahata@intel.com/

- Get vm-type from confidential-guest-support object type;
- Drop machine_init_done_late_notifiers;
- Refactor tdx_ioctl implementation;
- re-use existing pflash interface to load TDVF (i.e., OVMF binaries);
- introduce a new data structure to track memory type instead of changing
  the e820 table;
- Force smm to off for TDX VM;
- Drop the patches that suppress level-trigger/SMI/INIT/SIPI since KVM
  will ignore them;
- Add documentation;


Changes from RFC v1:
[RFC v1] https://lore.kernel.org/qemu-devel/cover.1613188118.git.isaku.yamahata@intel.com/

- suppress level trigger/SMI/INIT/SIPI related to IOAPIC.
- add VM attribute sha384 to TD measurement.
- guest TSC Hz specification


Isaku Yamahata (4):
  i386/tdvf: Introduce function to parse TDVF metadata
  i386/tdx: Add TDVF memory via KVM_TDX_INIT_MEM_REGION
  hw/i386: add option to forcibly report edge trigger in acpi tables
  i386/tdx: Don't synchronize guest tsc for TDs

Sean Christopherson (2):
  i386/kvm: Move architectural CPUID leaf generation to separate helper
  i386/tdx: Don't get/put guest state for TDX VMs

Xiaoyao Li (34):
  *** HACK *** linux-headers: Update headers to pull in TDX API changes
  i386: Introduce tdx-guest object
  target/i386: Implement mc->kvm_type() to get VM type
  target/i386: Introduce kvm_confidential_guest_init()
  i386/tdx: Implement tdx_kvm_init() to initialize TDX VM context
  i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES
  i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object
  i386/tdx: Adjust the supported CPUID based on TDX restrictions
  i386/tdx: Update tdx_fixed0/1 bits by tdx_caps.cpuid_config[]
  i386/tdx: Integrate tdx_caps->xfam_fixed0/1 into tdx_cpuid_lookup
  i386/tdx: Integrate tdx_caps->attrs_fixed0/1 to tdx_cpuid_lookup
  KVM: Introduce kvm_arch_pre_create_vcpu()
  i386/tdx: Initialize TDX before creating TD vcpus
  i386/tdx: Add property sept-ve-disable for tdx-guest object
  i386/tdx: Wire CPU features up with attributes of TD guest
  i386/tdx: Validate TD attributes
  i386/tdx: Implement user specified tsc frequency
  i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM
  i386/tdx: Parse TDVF metadata for TDX VM
  i386/tdx: Skip BIOS shadowing setup
  i386/tdx: Don't initialize pc.rom for TDX VMs
  i386/tdx: Track mem_ptr for each firmware entry of TDVF
  i386/tdx: Track RAM entries for TDX VM
  headers: Add definitions from UEFI spec for volumes, resources, etc...
  i386/tdx: Setup the TD HOB list
  i386/tdx: Call KVM_TDX_INIT_VCPU to initialize TDX vcpu
  i386/tdx: Finalize TDX VM
  i386/tdx: Disable SMM for TDX VMs
  i386/tdx: Disable PIC for TDX VMs
  i386/tdx: Don't allow system reset for TDX VMs
  hw/i386: add eoi_intercept_unsupported member to X86MachineState
  i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() for TDs
  i386/tdx: Skip kvm_put_apicbase() for TDs
  docs: Add TDX documentation

 accel/kvm/kvm-all.c                        |  21 +-
 configs/devices/i386-softmmu/default.mak   |   1 +
 docs/system/confidential-guest-support.rst |   1 +
 docs/system/i386/tdx.rst                   | 105 +++
 docs/system/target-i386.rst                |   1 +
 hw/i386/Kconfig                            |   6 +
 hw/i386/acpi-build.c                       |  99 ++-
 hw/i386/acpi-common.c                      |  50 +-
 hw/i386/meson.build                        |   1 +
 hw/i386/pc.c                               |  21 +-
 hw/i386/pc_sysfw.c                         |   7 +
 hw/i386/tdvf-hob.c                         | 146 ++++
 hw/i386/tdvf-hob.h                         |  24 +
 hw/i386/tdvf.c                             | 198 +++++
 hw/i386/x86.c                              |  35 +-
 include/hw/i386/tdvf.h                     |  58 ++
 include/hw/i386/x86.h                      |   1 +
 include/standard-headers/uefi/uefi.h       | 198 +++++
 include/sysemu/kvm.h                       |   1 +
 linux-headers/asm-x86/kvm.h                |  95 +++
 linux-headers/linux/kvm.h                  |   2 +
 qapi/qom.json                              |  14 +
 target/i386/cpu-internal.h                 |   9 +
 target/i386/cpu.c                          |  12 -
 target/i386/cpu.h                          |  21 +
 target/i386/kvm/kvm.c                      | 363 +++++----
 target/i386/kvm/kvm_i386.h                 |   6 +
 target/i386/kvm/meson.build                |   2 +
 target/i386/kvm/tdx-stub.c                 |  19 +
 target/i386/kvm/tdx.c                      | 838 +++++++++++++++++++++
 target/i386/kvm/tdx.h                      |  55 ++
 target/i386/sev.c                          |   1 -
 target/i386/sev.h                          |   2 +
 33 files changed, 2193 insertions(+), 220 deletions(-)
 create mode 100644 docs/system/i386/tdx.rst
 create mode 100644 hw/i386/tdvf-hob.c
 create mode 100644 hw/i386/tdvf-hob.h
 create mode 100644 hw/i386/tdvf.c
 create mode 100644 include/hw/i386/tdvf.h
 create mode 100644 include/standard-headers/uefi/uefi.h
 create mode 100644 target/i386/kvm/tdx-stub.c
 create mode 100644 target/i386/kvm/tdx.c
 create mode 100644 target/i386/kvm/tdx.h

Comments

Daniel P. Berrangé Aug. 2, 2022, 9:49 a.m. UTC | #1
On Tue, Aug 02, 2022 at 03:47:10PM +0800, Xiaoyao Li wrote:
> This is the first version that removes RFC tag since last RFC gots
> several acked-by. Hope more people and reviewers can help review it.
> 
> 
> This patch series aims to enable TDX support to allow creating and booting a
> TD (TDX VM) with QEMU. It needs to work with corresponding KVM patch [1].
> TDX related documents can be found in [2].
> 
> this series is also available in github:
> 
> https://github.com/intel/qemu-tdx/tree/tdx-qemu-upstream-v1
> 
> To boot a TDX VM, it requires several changes/additional steps in the flow:
> 
>  1. specify the vm type KVM_X86_TDX_VM when creating VM with
>     IOCTL(KVM_CREATE_VM);
>  2. initialize VM scope configuration before creating any VCPU;
>  3. initialize VCPU scope configuration;
>  4. initialize virtual firmware (TDVF) in guest private memory before
>     vcpu running;
> 
> Besides, TDX VM needs to boot with TDVF (TDX virtual firmware) and currently
> upstream OVMF can serve as TDVF. This series adds the support of parsing TDVF,
> loading TDVF into guest's private memory and preparing TD HOB info for TDVF.
> 
> [1] KVM TDX basic feature support v7
> https://lore.kernel.org/all/cover.1656366337.git.isaku.yamahata@intel.com/
> 
> [2] https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html
> 
> == Limitation and future work ==


> - CPU model
> 
>   We cannot create a TD with arbitrary CPU model like what for non-TDX VMs,
>   because only a subset of features can be configured for TD.
>   
>   - It's recommended to use '-cpu host' to create TD;
>   - '+feature/-feature' might not work as expected;
> 
>   future work: To introduce specific CPU model for TDs and enhance +/-features
>                for TDs.

Which features are incompatible with TDX ?

Presumably you have such a list, so that KVM can block them when
using '-cpu host' ? If so, we should be able to sanity check the
use of these features in QEMU for the named CPU models / feature
selection too.


With regards,
Daniel
Xiaoyao Li Aug. 2, 2022, 10:55 a.m. UTC | #2
On 8/2/2022 5:49 PM, Daniel P. Berrangé wrote:
> On Tue, Aug 02, 2022 at 03:47:10PM +0800, Xiaoyao Li wrote:

>> - CPU model
>>
>>    We cannot create a TD with arbitrary CPU model like what for non-TDX VMs,
>>    because only a subset of features can be configured for TD.
>>    
>>    - It's recommended to use '-cpu host' to create TD;
>>    - '+feature/-feature' might not work as expected;
>>
>>    future work: To introduce specific CPU model for TDs and enhance +/-features
>>                 for TDs.
> 
> Which features are incompatible with TDX ?

TDX forces some features to be fixed to 1 (e.g., CPUID_EXT_X2APIC,
CPUID_EXT_HYPERVISOR) and some to be fixed to 0 (e.g., CPUID_EXT_VMX).

Details can be found in patch 8 and in the TDX spec chapter "CPUID
virtualization".
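
As a rough illustration of what "fixed to 1" and "fixed to 0" mean for a single
CPUID feature word, here is a minimal Python sketch. The three bit positions
are those of CPUID.(EAX=1):ECX; the FIXED0/FIXED1 masks and the function
apply_fixed_bits() are assumptions made up for this example and cover only the
three features named above, not the real masks from the TDX module.

```python
# Bit positions in CPUID.(EAX=1):ECX.
CPUID_EXT_VMX = 1 << 5
CPUID_EXT_X2APIC = 1 << 21
CPUID_EXT_HYPERVISOR = 1 << 31

# Hypothetical masks for this one feature word (illustration only).
FIXED1 = CPUID_EXT_X2APIC | CPUID_EXT_HYPERVISOR  # always 1 in a TD
FIXED0 = CPUID_EXT_VMX                            # always 0 in a TD


def apply_fixed_bits(requested, fixed0=FIXED0, fixed1=FIXED1):
    """Return (effective, forced_on, forced_off) for a 32-bit feature word."""
    effective = (requested | fixed1) & ~fixed0 & 0xFFFFFFFF
    forced_on = effective & ~requested    # bits the user did not ask for
    forced_off = requested & ~effective   # bits the user asked for but lost
    return effective, forced_on, forced_off
```

A non-zero forced_on or forced_off is exactly the case where what the guest
sees from CPUID differs from what the user requested.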

> Presumably you have such a list, so that KVM can block them when
> using '-cpu host' ? 

No, KVM doesn't do this. As a result, KVM reports no error, but what the
TD OS sees from CPUID might differ from what the user specifies in QEMU.

> If so, we should be able to sanity check the
> use of these features in QEMU for the named CPU models / feature
> selection too.

This series enhances get_supported_cpuid() for TDX. If a named CPU model
is used to boot a TDX guest, QEMU will likely warn that "xxx feature is
not available".

We have another series to enhance "-feature" for TDX, to warn if a
fixed-1 feature is requested to be removed. In addition, we will introduce
a specific named CPU model for TDX, e.g., TDX-SapphireRapids, which
contains the maximum feature set a TDX guest can have on an SPR host.

> 
> With regards,
> Daniel
Daniel P. Berrangé Aug. 3, 2022, 5:44 p.m. UTC | #3
On Tue, Aug 02, 2022 at 06:55:48PM +0800, Xiaoyao Li wrote:
> On 8/2/2022 5:49 PM, Daniel P. Berrangé wrote:
> > On Tue, Aug 02, 2022 at 03:47:10PM +0800, Xiaoyao Li wrote:
> 
> > > - CPU model
> > > 
> > >    We cannot create a TD with arbitrary CPU model like what for non-TDX VMs,
> > >    because only a subset of features can be configured for TD.
> > >    - It's recommended to use '-cpu host' to create TD;
> > >    - '+feature/-feature' might not work as expected;
> > > 
> > >    future work: To introduce specific CPU model for TDs and enhance +/-features
> > >                 for TDs.
> > 
> > Which features are incompatible with TDX ?
> 
> TDX enforces some features fixed to 1 (e.g., CPUID_EXT_X2APIC,
> CPUID_EXT_HYPERVISOR)and some fixed to 0 (e.g., CPUID_EXT_VMX ).
> 
> Details can be found in patch 8 and TDX spec chapter "CPUID virtualization"
> 
> > Presumably you have such a list, so that KVM can block them when
> > using '-cpu host' ?
> 
> No, KVM doesn't do this. The result is no error reported from KVM but what
> TD OS sees from CPUID might be different what user specifies in QEMU.
> 
> > If so, we should be able to sanity check the
> > use of these features in QEMU for the named CPU models / feature
> > selection too.
> 
> This series enhances get_supported_cpuid() for TDX. If named CPU models are
> used to boot a TDX guest, it likely gets warning of "xxx feature is not
> available"

If the  ',check=on' arg is given to -cpu, does it ensure that the
guest fails to startup with an incompatible feature set ? That's
really the key thing to protect the user from mistakes.


> We have another series to enhance the "-feature" for TDX, to warn out if
> some fixed1 is specified to be removed. Besides, we will introduce specific
> named CPU model for TDX. e.g., TDX-SapphireRapids which contains the maximum
> feature set a TDX guest can have on SPR host.

I don't know if this is the right approach or not, but we should at least
consider making use of CPU versioning here.  ie have a single "SapphireRapids"
alias, which resolves to a suitable specific CPU version depending on whether
TDX is used or not.

With regards,
Daniel
Xiaoyao Li Aug. 5, 2022, 12:16 a.m. UTC | #4
On 8/4/2022 1:44 AM, Daniel P. Berrangé wrote:
> On Tue, Aug 02, 2022 at 06:55:48PM +0800, Xiaoyao Li wrote:
>> On 8/2/2022 5:49 PM, Daniel P. Berrangé wrote:
>>> On Tue, Aug 02, 2022 at 03:47:10PM +0800, Xiaoyao Li wrote:
>>
>>>> - CPU model
>>>>
>>>>     We cannot create a TD with arbitrary CPU model like what for non-TDX VMs,
>>>>     because only a subset of features can be configured for TD.
>>>>     - It's recommended to use '-cpu host' to create TD;
>>>>     - '+feature/-feature' might not work as expected;
>>>>
>>>>     future work: To introduce specific CPU model for TDs and enhance +/-features
>>>>                  for TDs.
>>>
>>> Which features are incompatible with TDX ?
>>
>> TDX enforces some features fixed to 1 (e.g., CPUID_EXT_X2APIC,
>> CPUID_EXT_HYPERVISOR)and some fixed to 0 (e.g., CPUID_EXT_VMX ).
>>
>> Details can be found in patch 8 and TDX spec chapter "CPUID virtualization"
>>
>>> Presumably you have such a list, so that KVM can block them when
>>> using '-cpu host' ?
>>
>> No, KVM doesn't do this. The result is no error reported from KVM but what
>> TD OS sees from CPUID might be different what user specifies in QEMU.
>>
>>> If so, we should be able to sanity check the
>>> use of these features in QEMU for the named CPU models / feature
>>> selection too.
>>
>> This series enhances get_supported_cpuid() for TDX. If named CPU models are
>> used to boot a TDX guest, it likely gets warning of "xxx feature is not
>> available"
> 
> If the  ',check=on' arg is given to -cpu, does it ensure that the
> guest fails to startup with an incompatible feature set ? That's
> really the key thing to protect the user from mistakes.

"check=on" won't stop startup with an incompatible feature set, but
"enforce=on" will. Yes, this series can ensure it with "enforce=on".
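
The difference between the two options can be sketched as follows. This is a
toy Python model over sets of feature names, not QEMU code: the function
resolve_features() and the feature names used with it are hypothetical.

```python
import warnings


def resolve_features(requested, supported, enforce=False):
    """Warn about unavailable features, or refuse to start when enforce is set."""
    missing = sorted(requested - supported)
    if missing and enforce:
        # Models enforce=on: an incompatible feature set stops startup.
        raise RuntimeError("unavailable features: " + ", ".join(missing))
    for name in missing:
        # Models check=on: only a warning, the guest still starts.
        warnings.warn(f"{name} feature is not available")
    return requested & supported
```

With enforce left off, the call returns the reduced feature set and the guest
would start with fewer features than requested; with enforce=True the same
input raises instead.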

> 
>> We have another series to enhance the "-feature" for TDX, to warn out if
>> some fixed1 is specified to be removed. Besides, we will introduce specific
>> named CPU model for TDX. e.g., TDX-SapphireRapids which contains the maximum
>> feature set a TDX guest can have on SPR host.
> 
> I don't know if this is the right approach or not, but we should at least
> consider making use of CPU versioning here.  ie have a single "SapphireRapids"
> alias, which resolves to a suitable specific CPU version depending on whether
> TDX is used or not.

A new version of a CPU model inherits from the previous version. This fits
well with CPU model fixups, where features need to be removed from or added
to an existing CPU model to make it work well with the latest kernel, and a
new version is created.

However, I think it is less appropriate to define a TDX variant as a
versioned CPU model. For example, if we have SPR-V(x), then we need to define
SPR-V(x+1) and alias it as SPR-TDX. For SPR-V(x+1), we need to add and
remove several features. In the future, we may need an SPR-V(x+2) to fix
up the normal SPR CPU model SPR-V(x). All the changes in V(x+1)/SPR-TDX
would have to be reverted first.
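
A small Python sketch of this concern, assuming that each version is defined
as a set of changes applied on top of the previous version. The feature names,
the version contents and the resolve() helper are all hypothetical.

```python
BASE = {"avx512f", "vmx"}  # hypothetical v1 feature set

# One (add, remove) pair per later version; each builds on the previous one.
DELTAS = [
    (set(), {"vmx"}),       # v2: hypothetical TDX variant drops vmx
    ({"new-fix"}, set()),   # v3: fixup intended for the normal model
]


def resolve(version, base=BASE, deltas=DELTAS):
    """Apply the deltas of all versions up to `version` on top of the base."""
    features = set(base)
    for add, remove in deltas[: version - 1]:
        features |= add
        features -= remove
    return features
```

In this model v3 inherits the removal made for the TDX variant in v2, so a
fixup meant for the normal model would have to re-add "vmx" explicitly.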

Anyway, we can discuss it in the future when we post the series of TDX 
CPU model. We plan to do that after this basic series gets merged. :)

> With regards,
> Daniel
Xiaoyao Li Sept. 5, 2022, 12:58 a.m. UTC | #5
Hi Gerd

On 8/2/2022 3:47 PM, Xiaoyao Li wrote:
..
> == Change history ==
> Changes from RFC v4:
> [RFC v4] https://lore.kernel.org/qemu-devel/20220512031803.3315890-1-xiaoyao.li@intel.com/
> 
> - Add 3 more patches(9, 10, 11) to improve the tdx_get_supported_cpuid();

Patches 8-11 are the only ones left that don't have your Acked-by. Do you
have any comments on them?