Message ID | 20240229063726.610065-26-xiaoyao.li@intel.com |
---|---|
State | New |
Headers | show |
Series | QEMU Guest memfd + QEMU TDX support | expand |
Copying Zhenzhong Duan as my point relates to the proposed libvirt TDX patches. On Thu, Feb 29, 2024 at 01:36:46AM -0500, Xiaoyao Li wrote: > Bit 28 of TD attribute, named SEPT_VE_DISABLE. When set to 1, it disables > EPT violation conversion to #VE on guest TD access of PENDING pages. > > Some guest OS (e.g., Linux TD guest) may require this bit as 1. > Otherwise refuse to boot. > > Add sept-ve-disable property for tdx-guest object, for user to configure > this bit. > > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> > Acked-by: Gerd Hoffmann <kraxel@redhat.com> > Acked-by: Markus Armbruster <armbru@redhat.com> > --- > Changes in v4: > - collect Acked-by from Markus > > Changes in v3: > - update the comment of property @sept-ve-disable to make it more > descriptive and use new format. (Daniel and Markus) > --- > qapi/qom.json | 7 ++++++- > target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++ > 2 files changed, 30 insertions(+), 1 deletion(-) > > diff --git a/qapi/qom.json b/qapi/qom.json > index 220cc6c98d4b..89ed89b9b46e 100644 > --- a/qapi/qom.json > +++ b/qapi/qom.json > @@ -900,10 +900,15 @@ > # > # Properties for tdx-guest objects. > # > +# @sept-ve-disable: toggle bit 28 of TD attributes to control disabling > +# of EPT violation conversion to #VE on guest TD access of PENDING > +# pages. Some guest OS (e.g., Linux TD guest) may require this to > +# be set, otherwise they refuse to boot. > +# > # Since: 9.0 > ## > { 'struct': 'TdxGuestProperties', > - 'data': { }} > + 'data': { '*sept-ve-disable': 'bool' } } So this exposes a single boolean property that gets mapped into one specific bit in the TD attributes: > + > +static void tdx_guest_set_sept_ve_disable(Object *obj, bool value, Error **errp) > +{ > + TdxGuest *tdx = TDX_GUEST(obj); > + > + if (value) { > + tdx->attributes |= TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; > + } else { > + tdx->attributes &= ~TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; > + } > +} If I look at the documentation for TD attributes https://download.01.org/intel-sgx/latest/dcap-latest/linux/docs/Intel_TDX_DCAP_Quoting_Library_API.pdf Section "A.3.4. TD Attributes" I see "TD attributes" is a 64-bit int, with 5 bits currently defined "DEBUG", "SEPT_VE_DISABLE", "PKS", "PL", "PERFMON", and the rest currently reserved for future use. This makes me wonder about our modelling approach into the future ? For the AMD SEV equivalent we've just directly exposed the whole field as an int: 'policy' : 'uint32', For the proposed SEV-SNP patches, the same has been done again https://lists.nongnu.org/archive/html/qemu-devel/2024-06/msg00536.html '*policy': 'uint64', The advantage of exposing individual booleans is that it is self-documenting at the QAPI level, but the disadvantage is that every time we want to expose ability to control a new bit in the policy we have to modify QEMU, libvirt, the mgmt app above libvirt, and whatever tools the end user has to talk to the mgmt app. If we expose a policy int, then newly defined bits only require a change in QEMU, and everything above QEMU will already be capable of setting it. In fact if I look at the proposed libvirt patches, they have proposed just exposing a policy "int" field in the XML, which then has to be unpacked to set the individual QAPI booleans https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/WXWXEESYUA77DP7YIBP55T2OPSVKV5QW/ On balance, I think it would be better if QEMU just exposed the raw TD attributes policy as an uint64 at QAPI, instead of trying to unpack it to discrete bool fields. This gives consistency with SEV and SEV-SNP, and with what's proposed at the libvirt level, and minimizes future changes when more policy bits are defined. > + > /* tdx guest */ > OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest, > tdx_guest, > @@ -529,6 +549,10 @@ static void tdx_guest_init(Object *obj) > qemu_mutex_init(&tdx->lock); > > tdx->attributes = 0; > + > + object_property_add_bool(obj, "sept-ve-disable", > + tdx_guest_get_sept_ve_disable, > + tdx_guest_set_sept_ve_disable); > } > > static void tdx_guest_finalize(Object *obj) > -- > 2.34.1 > With regards, Daniel
On 6/6/2024 6:45 PM, Daniel P. Berrangé wrote: > Copying Zhenzhong Duan as my point relates to the proposed libvirt > TDX patches. > > On Thu, Feb 29, 2024 at 01:36:46AM -0500, Xiaoyao Li wrote: >> Bit 28 of TD attribute, named SEPT_VE_DISABLE. When set to 1, it disables >> EPT violation conversion to #VE on guest TD access of PENDING pages. >> >> Some guest OS (e.g., Linux TD guest) may require this bit as 1. >> Otherwise refuse to boot. >> >> Add sept-ve-disable property for tdx-guest object, for user to configure >> this bit. >> >> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> >> Acked-by: Gerd Hoffmann <kraxel@redhat.com> >> Acked-by: Markus Armbruster <armbru@redhat.com> >> --- >> Changes in v4: >> - collect Acked-by from Markus >> >> Changes in v3: >> - update the comment of property @sept-ve-disable to make it more >> descriptive and use new format. (Daniel and Markus) >> --- >> qapi/qom.json | 7 ++++++- >> target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++ >> 2 files changed, 30 insertions(+), 1 deletion(-) >> >> diff --git a/qapi/qom.json b/qapi/qom.json >> index 220cc6c98d4b..89ed89b9b46e 100644 >> --- a/qapi/qom.json >> +++ b/qapi/qom.json >> @@ -900,10 +900,15 @@ >> # >> # Properties for tdx-guest objects. >> # >> +# @sept-ve-disable: toggle bit 28 of TD attributes to control disabling >> +# of EPT violation conversion to #VE on guest TD access of PENDING >> +# pages. Some guest OS (e.g., Linux TD guest) may require this to >> +# be set, otherwise they refuse to boot. >> +# >> # Since: 9.0 >> ## >> { 'struct': 'TdxGuestProperties', >> - 'data': { }} >> + 'data': { '*sept-ve-disable': 'bool' } } > > So this exposes a single boolean property that gets mapped into one > specific bit in the TD attributes: > >> + >> +static void tdx_guest_set_sept_ve_disable(Object *obj, bool value, Error **errp) >> +{ >> + TdxGuest *tdx = TDX_GUEST(obj); >> + >> + if (value) { >> + tdx->attributes |= TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; >> + } else { >> + tdx->attributes &= ~TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; >> + } >> +} > > If I look at the documentation for TD attributes > > https://download.01.org/intel-sgx/latest/dcap-latest/linux/docs/Intel_TDX_DCAP_Quoting_Library_API.pdf > > Section "A.3.4. TD Attributes" > > I see "TD attributes" is a 64-bit int, with 5 bits currently > defined "DEBUG", "SEPT_VE_DISABLE", "PKS", "PL", "PERFMON", > and the rest currently reserved for future use. This makes me > wonder about our modelling approach into the future ? > > For the AMD SEV equivalent we've just directly exposed the whole > field as an int: > > 'policy' : 'uint32', > > For the proposed SEV-SNP patches, the same has been done again > > https://lists.nongnu.org/archive/html/qemu-devel/2024-06/msg00536.html > > '*policy': 'uint64', > > > The advantage of exposing individual booleans is that it is > self-documenting at the QAPI level, but the disadvantage is > that every time we want to expose ability to control a new > bit in the policy we have to modify QEMU, libvirt, the mgmt > app above libvirt, and whatever tools the end user has to > talk to the mgmt app. > > If we expose a policy int, then newly defined bits only require > a change in QEMU, and everything above QEMU will already be > capable of setting it. > > In fact if I look at the proposed libvirt patches, they have > proposed just exposing a policy "int" field in the XML, which > then has to be unpacked to set the individual QAPI booleans > > https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/WXWXEESYUA77DP7YIBP55T2OPSVKV5QW/ > > On balance, I think it would be better if QEMU just exposed > the raw TD attributes policy as an uint64 at QAPI, instead > of trying to unpack it to discrete bool fields. This gives > consistency with SEV and SEV-SNP, and with what's proposed > at the libvirt level, and minimizes future changes when > more policy bits are defined. The reasons why introducing individual bit of sept-ve-disable instead of a raw TD attribute as a whole are that 1. other bits like perfmon, PKS, KL are associated with cpu properties, e.g., perfmon -> pmu, pks -> pks, kl -> keylokcer feature that QEMU currently doesn't support If allowing configuring attribute directly, we need to deal with the inconsistence between attribute vs cpu property. 2. people need to know the exact bit position of each attribute. I don't think it is a user-friendly interface to require user to be aware of such details. For example, if user wants to create a Debug TD, user just needs to set 'debug=on' for tdx-guest object. It's much more friendly than that user needs to set the bit 0 of the attribute. >> + >> /* tdx guest */ >> OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest, >> tdx_guest, >> @@ -529,6 +549,10 @@ static void tdx_guest_init(Object *obj) >> qemu_mutex_init(&tdx->lock); >> >> tdx->attributes = 0; >> + >> + object_property_add_bool(obj, "sept-ve-disable", >> + tdx_guest_get_sept_ve_disable, >> + tdx_guest_set_sept_ve_disable); >> } >> >> static void tdx_guest_finalize(Object *obj) >> -- >> 2.34.1 >> > > With regards, > Daniel
>-----Original Message----- >From: Li, Xiaoyao <xiaoyao.li@intel.com> >Subject: Re: [PATCH v5 25/65] i386/tdx: Add property sept-ve-disable for >tdx-guest object > >On 6/6/2024 6:45 PM, Daniel P. Berrangé wrote: >> Copying Zhenzhong Duan as my point relates to the proposed libvirt >> TDX patches. >> >> On Thu, Feb 29, 2024 at 01:36:46AM -0500, Xiaoyao Li wrote: >>> Bit 28 of TD attribute, named SEPT_VE_DISABLE. When set to 1, it >disables >>> EPT violation conversion to #VE on guest TD access of PENDING pages. >>> >>> Some guest OS (e.g., Linux TD guest) may require this bit as 1. >>> Otherwise refuse to boot. >>> >>> Add sept-ve-disable property for tdx-guest object, for user to configure >>> this bit. >>> >>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> >>> Acked-by: Gerd Hoffmann <kraxel@redhat.com> >>> Acked-by: Markus Armbruster <armbru@redhat.com> >>> --- >>> Changes in v4: >>> - collect Acked-by from Markus >>> >>> Changes in v3: >>> - update the comment of property @sept-ve-disable to make it more >>> descriptive and use new format. (Daniel and Markus) >>> --- >>> qapi/qom.json | 7 ++++++- >>> target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++ >>> 2 files changed, 30 insertions(+), 1 deletion(-) >>> >>> diff --git a/qapi/qom.json b/qapi/qom.json >>> index 220cc6c98d4b..89ed89b9b46e 100644 >>> --- a/qapi/qom.json >>> +++ b/qapi/qom.json >>> @@ -900,10 +900,15 @@ >>> # >>> # Properties for tdx-guest objects. >>> # >>> +# @sept-ve-disable: toggle bit 28 of TD attributes to control disabling >>> +# of EPT violation conversion to #VE on guest TD access of PENDING >>> +# pages. Some guest OS (e.g., Linux TD guest) may require this to >>> +# be set, otherwise they refuse to boot. >>> +# >>> # Since: 9.0 >>> ## >>> { 'struct': 'TdxGuestProperties', >>> - 'data': { }} >>> + 'data': { '*sept-ve-disable': 'bool' } } >> >> So this exposes a single boolean property that gets mapped into one >> specific bit in the TD attributes: >> >>> + >>> +static void tdx_guest_set_sept_ve_disable(Object *obj, bool value, Error >**errp) >>> +{ >>> + TdxGuest *tdx = TDX_GUEST(obj); >>> + >>> + if (value) { >>> + tdx->attributes |= TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; >>> + } else { >>> + tdx->attributes &= ~TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; >>> + } >>> +} >> >> If I look at the documentation for TD attributes >> >> https://download.01.org/intel-sgx/latest/dcap- >latest/linux/docs/Intel_TDX_DCAP_Quoting_Library_API.pdf >> >> Section "A.3.4. TD Attributes" >> >> I see "TD attributes" is a 64-bit int, with 5 bits currently >> defined "DEBUG", "SEPT_VE_DISABLE", "PKS", "PL", "PERFMON", >> and the rest currently reserved for future use. This makes me >> wonder about our modelling approach into the future ? >> >> For the AMD SEV equivalent we've just directly exposed the whole >> field as an int: >> >> 'policy' : 'uint32', >> >> For the proposed SEV-SNP patches, the same has been done again >> >> https://lists.nongnu.org/archive/html/qemu-devel/2024- >06/msg00536.html >> >> '*policy': 'uint64', >> >> >> The advantage of exposing individual booleans is that it is >> self-documenting at the QAPI level, but the disadvantage is >> that every time we want to expose ability to control a new >> bit in the policy we have to modify QEMU, libvirt, the mgmt >> app above libvirt, and whatever tools the end user has to >> talk to the mgmt app. >> >> If we expose a policy int, then newly defined bits only require >> a change in QEMU, and everything above QEMU will already be >> capable of setting it. >> >> In fact if I look at the proposed libvirt patches, they have >> proposed just exposing a policy "int" field in the XML, which >> then has to be unpacked to set the individual QAPI booleans >> >> >https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/WXWX >EESYUA77DP7YIBP55T2OPSVKV5QW/ >> >> On balance, I think it would be better if QEMU just exposed >> the raw TD attributes policy as an uint64 at QAPI, instead >> of trying to unpack it to discrete bool fields. This gives >> consistency with SEV and SEV-SNP, and with what's proposed >> at the libvirt level, and minimizes future changes when >> more policy bits are defined. > >The reasons why introducing individual bit of sept-ve-disable instead of >a raw TD attribute as a whole are that > >1. other bits like perfmon, PKS, KL are associated with cpu properties, >e.g., > > perfmon -> pmu, > pks -> pks, > kl -> keylokcer feature that QEMU currently doesn't support > >If allowing configuring attribute directly, we need to deal with the >inconsistence between attribute vs cpu property. What about defining those bits associated with cpu properties reserved But other bits work as Daniel suggested way. Thanks Zhenzhong > >2. people need to know the exact bit position of each attribute. I don't >think it is a user-friendly interface to require user to be aware of >such details. > >For example, if user wants to create a Debug TD, user just needs to set >'debug=on' for tdx-guest object. It's much more friendly than that user >needs to set the bit 0 of the attribute. > > >>> + >>> /* tdx guest */ >>> OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest, >>> tdx_guest, >>> @@ -529,6 +549,10 @@ static void tdx_guest_init(Object *obj) >>> qemu_mutex_init(&tdx->lock); >>> >>> tdx->attributes = 0; >>> + >>> + object_property_add_bool(obj, "sept-ve-disable", >>> + tdx_guest_get_sept_ve_disable, >>> + tdx_guest_set_sept_ve_disable); >>> } >>> >>> static void tdx_guest_finalize(Object *obj) >>> -- >>> 2.34.1 >>> >> >> With regards, >> Daniel
On 6/13/2024 4:35 PM, Duan, Zhenzhong wrote: > > >> -----Original Message----- >> From: Li, Xiaoyao <xiaoyao.li@intel.com> >> Subject: Re: [PATCH v5 25/65] i386/tdx: Add property sept-ve-disable for >> tdx-guest object >> >> On 6/6/2024 6:45 PM, Daniel P. Berrangé wrote: >>> Copying Zhenzhong Duan as my point relates to the proposed libvirt >>> TDX patches. >>> >>> On Thu, Feb 29, 2024 at 01:36:46AM -0500, Xiaoyao Li wrote: >>>> Bit 28 of TD attribute, named SEPT_VE_DISABLE. When set to 1, it >> disables >>>> EPT violation conversion to #VE on guest TD access of PENDING pages. >>>> >>>> Some guest OS (e.g., Linux TD guest) may require this bit as 1. >>>> Otherwise refuse to boot. >>>> >>>> Add sept-ve-disable property for tdx-guest object, for user to configure >>>> this bit. >>>> >>>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> >>>> Acked-by: Gerd Hoffmann <kraxel@redhat.com> >>>> Acked-by: Markus Armbruster <armbru@redhat.com> >>>> --- >>>> Changes in v4: >>>> - collect Acked-by from Markus >>>> >>>> Changes in v3: >>>> - update the comment of property @sept-ve-disable to make it more >>>> descriptive and use new format. (Daniel and Markus) >>>> --- >>>> qapi/qom.json | 7 ++++++- >>>> target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++ >>>> 2 files changed, 30 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/qapi/qom.json b/qapi/qom.json >>>> index 220cc6c98d4b..89ed89b9b46e 100644 >>>> --- a/qapi/qom.json >>>> +++ b/qapi/qom.json >>>> @@ -900,10 +900,15 @@ >>>> # >>>> # Properties for tdx-guest objects. >>>> # >>>> +# @sept-ve-disable: toggle bit 28 of TD attributes to control disabling >>>> +# of EPT violation conversion to #VE on guest TD access of PENDING >>>> +# pages. Some guest OS (e.g., Linux TD guest) may require this to >>>> +# be set, otherwise they refuse to boot. >>>> +# >>>> # Since: 9.0 >>>> ## >>>> { 'struct': 'TdxGuestProperties', >>>> - 'data': { }} >>>> + 'data': { '*sept-ve-disable': 'bool' } } >>> >>> So this exposes a single boolean property that gets mapped into one >>> specific bit in the TD attributes: >>> >>>> + >>>> +static void tdx_guest_set_sept_ve_disable(Object *obj, bool value, Error >> **errp) >>>> +{ >>>> + TdxGuest *tdx = TDX_GUEST(obj); >>>> + >>>> + if (value) { >>>> + tdx->attributes |= TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; >>>> + } else { >>>> + tdx->attributes &= ~TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; >>>> + } >>>> +} >>> >>> If I look at the documentation for TD attributes >>> >>> https://download.01.org/intel-sgx/latest/dcap- >> latest/linux/docs/Intel_TDX_DCAP_Quoting_Library_API.pdf >>> >>> Section "A.3.4. TD Attributes" >>> >>> I see "TD attributes" is a 64-bit int, with 5 bits currently >>> defined "DEBUG", "SEPT_VE_DISABLE", "PKS", "PL", "PERFMON", >>> and the rest currently reserved for future use. This makes me >>> wonder about our modelling approach into the future ? >>> >>> For the AMD SEV equivalent we've just directly exposed the whole >>> field as an int: >>> >>> 'policy' : 'uint32', >>> >>> For the proposed SEV-SNP patches, the same has been done again >>> >>> https://lists.nongnu.org/archive/html/qemu-devel/2024- >> 06/msg00536.html >>> >>> '*policy': 'uint64', >>> >>> >>> The advantage of exposing individual booleans is that it is >>> self-documenting at the QAPI level, but the disadvantage is >>> that every time we want to expose ability to control a new >>> bit in the policy we have to modify QEMU, libvirt, the mgmt >>> app above libvirt, and whatever tools the end user has to >>> talk to the mgmt app. >>> >>> If we expose a policy int, then newly defined bits only require >>> a change in QEMU, and everything above QEMU will already be >>> capable of setting it. >>> >>> In fact if I look at the proposed libvirt patches, they have >>> proposed just exposing a policy "int" field in the XML, which >>> then has to be unpacked to set the individual QAPI booleans >>> >>> >> https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/WXWX >> EESYUA77DP7YIBP55T2OPSVKV5QW/ >>> >>> On balance, I think it would be better if QEMU just exposed >>> the raw TD attributes policy as an uint64 at QAPI, instead >>> of trying to unpack it to discrete bool fields. This gives >>> consistency with SEV and SEV-SNP, and with what's proposed >>> at the libvirt level, and minimizes future changes when >>> more policy bits are defined. >> >> The reasons why introducing individual bit of sept-ve-disable instead of >> a raw TD attribute as a whole are that >> >> 1. other bits like perfmon, PKS, KL are associated with cpu properties, >> e.g., >> >> perfmon -> pmu, >> pks -> pks, >> kl -> keylokcer feature that QEMU currently doesn't support >> >> If allowing configuring attribute directly, we need to deal with the >> inconsistence between attribute vs cpu property. > > What about defining those bits associated with cpu properties reserved > But other bits work as Daniel suggested way. I don't understand. Do you mean we provide the interface to configure raw 64 bit attributes while some bits of it are reserved? > Thanks > Zhenzhong > >> >> 2. people need to know the exact bit position of each attribute. I don't >> think it is a user-friendly interface to require user to be aware of >> such details. >> >> For example, if user wants to create a Debug TD, user just needs to set >> 'debug=on' for tdx-guest object. It's much more friendly than that user >> needs to set the bit 0 of the attribute. >> >> >>>> + >>>> /* tdx guest */ >>>> OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest, >>>> tdx_guest, >>>> @@ -529,6 +549,10 @@ static void tdx_guest_init(Object *obj) >>>> qemu_mutex_init(&tdx->lock); >>>> >>>> tdx->attributes = 0; >>>> + >>>> + object_property_add_bool(obj, "sept-ve-disable", >>>> + tdx_guest_get_sept_ve_disable, >>>> + tdx_guest_set_sept_ve_disable); >>>> } >>>> >>>> static void tdx_guest_finalize(Object *obj) >>>> -- >>>> 2.34.1 >>>> >>> >>> With regards, >>> Daniel >
>-----Original Message----- >From: Li, Xiaoyao <xiaoyao.li@intel.com> >Subject: Re: [PATCH v5 25/65] i386/tdx: Add property sept-ve-disable for >tdx-guest object > >On 6/13/2024 4:35 PM, Duan, Zhenzhong wrote: >> >> >>> -----Original Message----- >>> From: Li, Xiaoyao <xiaoyao.li@intel.com> >>> Subject: Re: [PATCH v5 25/65] i386/tdx: Add property sept-ve-disable for >>> tdx-guest object >>> >>> On 6/6/2024 6:45 PM, Daniel P. Berrangé wrote: >>>> Copying Zhenzhong Duan as my point relates to the proposed libvirt >>>> TDX patches. >>>> >>>> On Thu, Feb 29, 2024 at 01:36:46AM -0500, Xiaoyao Li wrote: >>>>> Bit 28 of TD attribute, named SEPT_VE_DISABLE. When set to 1, it >>> disables >>>>> EPT violation conversion to #VE on guest TD access of PENDING pages. >>>>> >>>>> Some guest OS (e.g., Linux TD guest) may require this bit as 1. >>>>> Otherwise refuse to boot. >>>>> >>>>> Add sept-ve-disable property for tdx-guest object, for user to configure >>>>> this bit. >>>>> >>>>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> >>>>> Acked-by: Gerd Hoffmann <kraxel@redhat.com> >>>>> Acked-by: Markus Armbruster <armbru@redhat.com> >>>>> --- >>>>> Changes in v4: >>>>> - collect Acked-by from Markus >>>>> >>>>> Changes in v3: >>>>> - update the comment of property @sept-ve-disable to make it more >>>>> descriptive and use new format. (Daniel and Markus) >>>>> --- >>>>> qapi/qom.json | 7 ++++++- >>>>> target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++ >>>>> 2 files changed, 30 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/qapi/qom.json b/qapi/qom.json >>>>> index 220cc6c98d4b..89ed89b9b46e 100644 >>>>> --- a/qapi/qom.json >>>>> +++ b/qapi/qom.json >>>>> @@ -900,10 +900,15 @@ >>>>> # >>>>> # Properties for tdx-guest objects. >>>>> # >>>>> +# @sept-ve-disable: toggle bit 28 of TD attributes to control disabling >>>>> +# of EPT violation conversion to #VE on guest TD access of PENDING >>>>> +# pages. Some guest OS (e.g., Linux TD guest) may require this to >>>>> +# be set, otherwise they refuse to boot. >>>>> +# >>>>> # Since: 9.0 >>>>> ## >>>>> { 'struct': 'TdxGuestProperties', >>>>> - 'data': { }} >>>>> + 'data': { '*sept-ve-disable': 'bool' } } >>>> >>>> So this exposes a single boolean property that gets mapped into one >>>> specific bit in the TD attributes: >>>> >>>>> + >>>>> +static void tdx_guest_set_sept_ve_disable(Object *obj, bool value, >Error >>> **errp) >>>>> +{ >>>>> + TdxGuest *tdx = TDX_GUEST(obj); >>>>> + >>>>> + if (value) { >>>>> + tdx->attributes |= TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; >>>>> + } else { >>>>> + tdx->attributes &= ~TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; >>>>> + } >>>>> +} >>>> >>>> If I look at the documentation for TD attributes >>>> >>>> https://download.01.org/intel-sgx/latest/dcap- >>> latest/linux/docs/Intel_TDX_DCAP_Quoting_Library_API.pdf >>>> >>>> Section "A.3.4. TD Attributes" >>>> >>>> I see "TD attributes" is a 64-bit int, with 5 bits currently >>>> defined "DEBUG", "SEPT_VE_DISABLE", "PKS", "PL", "PERFMON", >>>> and the rest currently reserved for future use. This makes me >>>> wonder about our modelling approach into the future ? >>>> >>>> For the AMD SEV equivalent we've just directly exposed the whole >>>> field as an int: >>>> >>>> 'policy' : 'uint32', >>>> >>>> For the proposed SEV-SNP patches, the same has been done again >>>> >>>> https://lists.nongnu.org/archive/html/qemu-devel/2024- >>> 06/msg00536.html >>>> >>>> '*policy': 'uint64', >>>> >>>> >>>> The advantage of exposing individual booleans is that it is >>>> self-documenting at the QAPI level, but the disadvantage is >>>> that every time we want to expose ability to control a new >>>> bit in the policy we have to modify QEMU, libvirt, the mgmt >>>> app above libvirt, and whatever tools the end user has to >>>> talk to the mgmt app. >>>> >>>> If we expose a policy int, then newly defined bits only require >>>> a change in QEMU, and everything above QEMU will already be >>>> capable of setting it. >>>> >>>> In fact if I look at the proposed libvirt patches, they have >>>> proposed just exposing a policy "int" field in the XML, which >>>> then has to be unpacked to set the individual QAPI booleans >>>> >>>> >>> >https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/WXWX >>> EESYUA77DP7YIBP55T2OPSVKV5QW/ >>>> >>>> On balance, I think it would be better if QEMU just exposed >>>> the raw TD attributes policy as an uint64 at QAPI, instead >>>> of trying to unpack it to discrete bool fields. This gives >>>> consistency with SEV and SEV-SNP, and with what's proposed >>>> at the libvirt level, and minimizes future changes when >>>> more policy bits are defined. >>> >>> The reasons why introducing individual bit of sept-ve-disable instead of >>> a raw TD attribute as a whole are that >>> >>> 1. other bits like perfmon, PKS, KL are associated with cpu properties, >>> e.g., >>> >>> perfmon -> pmu, >>> pks -> pks, >>> kl -> keylokcer feature that QEMU currently doesn't support >>> >>> If allowing configuring attribute directly, we need to deal with the >>> inconsistence between attribute vs cpu property. >> >> What about defining those bits associated with cpu properties reserved >> But other bits work as Daniel suggested way. > >I don't understand. Do you mean we provide the interface to configure >raw 64 bit attributes while some bits of it are reserved? Yes, qemu provide raw 64bit attribute but ignore those cpuid related bits to avoid config conflict. You can still provide sept-ve-disable and debug properties in qemu if you want, But Libvirt will not use them, it will use the raw 64bit attribute. Thanks Zhenzhong > >> Thanks >> Zhenzhong >> >>> >>> 2. people need to know the exact bit position of each attribute. I don't >>> think it is a user-friendly interface to require user to be aware of >>> such details. >>> >>> For example, if user wants to create a Debug TD, user just needs to set >>> 'debug=on' for tdx-guest object. It's much more friendly than that user >>> needs to set the bit 0 of the attribute. >>> >>> >>>>> + >>>>> /* tdx guest */ >>>>> OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest, >>>>> tdx_guest, >>>>> @@ -529,6 +549,10 @@ static void tdx_guest_init(Object *obj) >>>>> qemu_mutex_init(&tdx->lock); >>>>> >>>>> tdx->attributes = 0; >>>>> + >>>>> + object_property_add_bool(obj, "sept-ve-disable", >>>>> + tdx_guest_get_sept_ve_disable, >>>>> + tdx_guest_set_sept_ve_disable); >>>>> } >>>>> >>>>> static void tdx_guest_finalize(Object *obj) >>>>> -- >>>>> 2.34.1 >>>>> >>>> >>>> With regards, >>>> Daniel >>
On Fri, Jun 14, 2024 at 09:04:33AM +0800, Xiaoyao Li wrote: > On 6/13/2024 4:35 PM, Duan, Zhenzhong wrote: > > > > > > > -----Original Message----- > > > From: Li, Xiaoyao <xiaoyao.li@intel.com> > > > Subject: Re: [PATCH v5 25/65] i386/tdx: Add property sept-ve-disable for > > > tdx-guest object > > > > > > On 6/6/2024 6:45 PM, Daniel P. Berrangé wrote: > > > > Copying Zhenzhong Duan as my point relates to the proposed libvirt > > > > TDX patches. > > > > > > > > On Thu, Feb 29, 2024 at 01:36:46AM -0500, Xiaoyao Li wrote: > > > > > Bit 28 of TD attribute, named SEPT_VE_DISABLE. When set to 1, it > > > disables > > > > > EPT violation conversion to #VE on guest TD access of PENDING pages. > > > > > > > > > > Some guest OS (e.g., Linux TD guest) may require this bit as 1. > > > > > Otherwise refuse to boot. > > > > > > > > > > Add sept-ve-disable property for tdx-guest object, for user to configure > > > > > this bit. > > > > > > > > > > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> > > > > > Acked-by: Gerd Hoffmann <kraxel@redhat.com> > > > > > Acked-by: Markus Armbruster <armbru@redhat.com> > > > > > --- > > > > > Changes in v4: > > > > > - collect Acked-by from Markus > > > > > > > > > > Changes in v3: > > > > > - update the comment of property @sept-ve-disable to make it more > > > > > descriptive and use new format. (Daniel and Markus) > > > > > --- > > > > > qapi/qom.json | 7 ++++++- > > > > > target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++ > > > > > 2 files changed, 30 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/qapi/qom.json b/qapi/qom.json > > > > > index 220cc6c98d4b..89ed89b9b46e 100644 > > > > > --- a/qapi/qom.json > > > > > +++ b/qapi/qom.json > > > > > @@ -900,10 +900,15 @@ > > > > > # > > > > > # Properties for tdx-guest objects. > > > > > # > > > > > +# @sept-ve-disable: toggle bit 28 of TD attributes to control disabling > > > > > +# of EPT violation conversion to #VE on guest TD access of PENDING > > > > > +# pages. Some guest OS (e.g., Linux TD guest) may require this to > > > > > +# be set, otherwise they refuse to boot. > > > > > +# > > > > > # Since: 9.0 > > > > > ## > > > > > { 'struct': 'TdxGuestProperties', > > > > > - 'data': { }} > > > > > + 'data': { '*sept-ve-disable': 'bool' } } > > > > > > > > So this exposes a single boolean property that gets mapped into one > > > > specific bit in the TD attributes: > > > > > > > > > + > > > > > +static void tdx_guest_set_sept_ve_disable(Object *obj, bool value, Error > > > **errp) > > > > > +{ > > > > > + TdxGuest *tdx = TDX_GUEST(obj); > > > > > + > > > > > + if (value) { > > > > > + tdx->attributes |= TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; > > > > > + } else { > > > > > + tdx->attributes &= ~TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; > > > > > + } > > > > > +} > > > > > > > > If I look at the documentation for TD attributes > > > > > > > > https://download.01.org/intel-sgx/latest/dcap- > > > latest/linux/docs/Intel_TDX_DCAP_Quoting_Library_API.pdf > > > > > > > > Section "A.3.4. TD Attributes" > > > > > > > > I see "TD attributes" is a 64-bit int, with 5 bits currently > > > > defined "DEBUG", "SEPT_VE_DISABLE", "PKS", "PL", "PERFMON", > > > > and the rest currently reserved for future use. This makes me > > > > wonder about our modelling approach into the future ? > > > > > > > > For the AMD SEV equivalent we've just directly exposed the whole > > > > field as an int: > > > > > > > > 'policy' : 'uint32', > > > > > > > > For the proposed SEV-SNP patches, the same has been done again > > > > > > > > https://lists.nongnu.org/archive/html/qemu-devel/2024- > > > 06/msg00536.html > > > > > > > > '*policy': 'uint64', > > > > > > > > > > > > The advantage of exposing individual booleans is that it is > > > > self-documenting at the QAPI level, but the disadvantage is > > > > that every time we want to expose ability to control a new > > > > bit in the policy we have to modify QEMU, libvirt, the mgmt > > > > app above libvirt, and whatever tools the end user has to > > > > talk to the mgmt app. > > > > > > > > If we expose a policy int, then newly defined bits only require > > > > a change in QEMU, and everything above QEMU will already be > > > > capable of setting it. > > > > > > > > In fact if I look at the proposed libvirt patches, they have > > > > proposed just exposing a policy "int" field in the XML, which > > > > then has to be unpacked to set the individual QAPI booleans > > > > > > > > > > > https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/WXWX > > > EESYUA77DP7YIBP55T2OPSVKV5QW/ > > > > > > > > On balance, I think it would be better if QEMU just exposed > > > > the raw TD attributes policy as an uint64 at QAPI, instead > > > > of trying to unpack it to discrete bool fields. This gives > > > > consistency with SEV and SEV-SNP, and with what's proposed > > > > at the libvirt level, and minimizes future changes when > > > > more policy bits are defined. > > > > > > The reasons why introducing individual bit of sept-ve-disable instead of > > > a raw TD attribute as a whole are that > > > > > > 1. other bits like perfmon, PKS, KL are associated with cpu properties, > > > e.g., > > > > > > perfmon -> pmu, > > > pks -> pks, > > > kl -> keylokcer feature that QEMU currently doesn't support > > > > > > If allowing configuring attribute directly, we need to deal with the > > > inconsistence between attribute vs cpu property. > > > > What about defining those bits associated with cpu properties reserved > > But other bits work as Daniel suggested way. > > I don't understand. Do you mean we provide the interface to configure raw 64 > bit attributes while some bits of it are reserved? Just have a mask of what bits are permitted to be set, and report an error if the user sets non-permitted bits. With regards, Daniel
On Fri, Jun 14, 2024 at 08:49:57AM +0100, Daniel P. Berrangé wrote: > On Fri, Jun 14, 2024 at 09:04:33AM +0800, Xiaoyao Li wrote: > > On 6/13/2024 4:35 PM, Duan, Zhenzhong wrote: > > > > > > > > > > -----Original Message----- > > > > From: Li, Xiaoyao <xiaoyao.li@intel.com> > > > > Subject: Re: [PATCH v5 25/65] i386/tdx: Add property sept-ve-disable for > > > > tdx-guest object > > > > > > > > On 6/6/2024 6:45 PM, Daniel P. Berrangé wrote: > > > > > Copying Zhenzhong Duan as my point relates to the proposed libvirt > > > > > TDX patches. > > > > > > > > > > On Thu, Feb 29, 2024 at 01:36:46AM -0500, Xiaoyao Li wrote: > > > > > > Bit 28 of TD attribute, named SEPT_VE_DISABLE. When set to 1, it > > > > disables > > > > > > EPT violation conversion to #VE on guest TD access of PENDING pages. > > > > > > > > > > > > Some guest OS (e.g., Linux TD guest) may require this bit as 1. > > > > > > Otherwise refuse to boot. > > > > > > > > > > > > Add sept-ve-disable property for tdx-guest object, for user to configure > > > > > > this bit. > > > > > > > > > > > > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> > > > > > > Acked-by: Gerd Hoffmann <kraxel@redhat.com> > > > > > > Acked-by: Markus Armbruster <armbru@redhat.com> > > > > > > --- > > > > > > Changes in v4: > > > > > > - collect Acked-by from Markus > > > > > > > > > > > > Changes in v3: > > > > > > - update the comment of property @sept-ve-disable to make it more > > > > > > descriptive and use new format. (Daniel and Markus) > > > > > > --- > > > > > > qapi/qom.json | 7 ++++++- > > > > > > target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++ > > > > > > 2 files changed, 30 insertions(+), 1 deletion(-) > > > > > > > > > > > > diff --git a/qapi/qom.json b/qapi/qom.json > > > > > > index 220cc6c98d4b..89ed89b9b46e 100644 > > > > > > --- a/qapi/qom.json > > > > > > +++ b/qapi/qom.json > > > > > > @@ -900,10 +900,15 @@ > > > > > > # > > > > > > # Properties for tdx-guest objects. > > > > > > # > > > > > > +# @sept-ve-disable: toggle bit 28 of TD attributes to control disabling > > > > > > +# of EPT violation conversion to #VE on guest TD access of PENDING > > > > > > +# pages. Some guest OS (e.g., Linux TD guest) may require this to > > > > > > +# be set, otherwise they refuse to boot. > > > > > > +# > > > > > > # Since: 9.0 > > > > > > ## > > > > > > { 'struct': 'TdxGuestProperties', > > > > > > - 'data': { }} > > > > > > + 'data': { '*sept-ve-disable': 'bool' } } > > > > > > > > > > So this exposes a single boolean property that gets mapped into one > > > > > specific bit in the TD attributes: > > > > > > > > > > > + > > > > > > +static void tdx_guest_set_sept_ve_disable(Object *obj, bool value, Error > > > > **errp) > > > > > > +{ > > > > > > + TdxGuest *tdx = TDX_GUEST(obj); > > > > > > + > > > > > > + if (value) { > > > > > > + tdx->attributes |= TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; > > > > > > + } else { > > > > > > + tdx->attributes &= ~TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; > > > > > > + } > > > > > > +} > > > > > > > > > > If I look at the documentation for TD attributes > > > > > > > > > > https://download.01.org/intel-sgx/latest/dcap- > > > > latest/linux/docs/Intel_TDX_DCAP_Quoting_Library_API.pdf > > > > > > > > > > Section "A.3.4. TD Attributes" > > > > > > > > > > I see "TD attributes" is a 64-bit int, with 5 bits currently > > > > > defined "DEBUG", "SEPT_VE_DISABLE", "PKS", "PL", "PERFMON", > > > > > and the rest currently reserved for future use. This makes me > > > > > wonder about our modelling approach into the future ? > > > > > > > > > > For the AMD SEV equivalent we've just directly exposed the whole > > > > > field as an int: > > > > > > > > > > 'policy' : 'uint32', > > > > > > > > > > For the proposed SEV-SNP patches, the same has been done again > > > > > > > > > > https://lists.nongnu.org/archive/html/qemu-devel/2024- > > > > 06/msg00536.html > > > > > > > > > > '*policy': 'uint64', > > > > > > > > > > > > > > > The advantage of exposing individual booleans is that it is > > > > > self-documenting at the QAPI level, but the disadvantage is > > > > > that every time we want to expose ability to control a new > > > > > bit in the policy we have to modify QEMU, libvirt, the mgmt > > > > > app above libvirt, and whatever tools the end user has to > > > > > talk to the mgmt app. > > > > > > > > > > If we expose a policy int, then newly defined bits only require > > > > > a change in QEMU, and everything above QEMU will already be > > > > > capable of setting it. > > > > > > > > > > In fact if I look at the proposed libvirt patches, they have > > > > > proposed just exposing a policy "int" field in the XML, which > > > > > then has to be unpacked to set the individual QAPI booleans > > > > > > > > > > > > > > https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/WXWX > > > > EESYUA77DP7YIBP55T2OPSVKV5QW/ > > > > > > > > > > On balance, I think it would be better if QEMU just exposed > > > > > the raw TD attributes policy as an uint64 at QAPI, instead > > > > > of trying to unpack it to discrete bool fields. This gives > > > > > consistency with SEV and SEV-SNP, and with what's proposed > > > > > at the libvirt level, and minimizes future changes when > > > > > more policy bits are defined. > > > > > > > > The reasons why introducing individual bit of sept-ve-disable instead of > > > > a raw TD attribute as a whole are that > > > > > > > > 1. other bits like perfmon, PKS, KL are associated with cpu properties, > > > > e.g., > > > > > > > > perfmon -> pmu, > > > > pks -> pks, > > > > kl -> keylokcer feature that QEMU currently doesn't support > > > > > > > > If allowing configuring attribute directly, we need to deal with the > > > > inconsistence between attribute vs cpu property. > > > > > > What about defining those bits associated with cpu properties reserved > > > But other bits work as Daniel suggested way. > > > > I don't understand. Do you mean we provide the interface to configure raw 64 > > bit attributes while some bits of it are reserved? > > Just have a mask of what bits are permitted to be set, and report an > error if the user sets non-permitted bits. Looking at the IGVM patches, the CnofidentialGuestSupport class is gaining a "set_guest_policy" method, which takes a "uint64_t policy". If (when) IGVM support is extended to cover TDX too, then we're going to need to accept the user providing the policy in integer format via the IGVM file metadata. This will require adding code to check for any inconsistency between the policy bitmask, and the CPU flags. https://lists.nongnu.org/archive/html/qemu-devel/2024-06/msg04037.html so to me this is another reason to just expose the policy as an integer in the QAPI/QOM structure too. Everywhere just wants to be working with policy in integer format. With regards, Daniel
On 6/24/2024 11:01 PM, Daniel P. Berrangé wrote: > On Fri, Jun 14, 2024 at 08:49:57AM +0100, Daniel P. Berrangé wrote: >> On Fri, Jun 14, 2024 at 09:04:33AM +0800, Xiaoyao Li wrote: >>> On 6/13/2024 4:35 PM, Duan, Zhenzhong wrote: >>>> >>>> >>>>> -----Original Message----- >>>>> From: Li, Xiaoyao <xiaoyao.li@intel.com> >>>>> Subject: Re: [PATCH v5 25/65] i386/tdx: Add property sept-ve-disable for >>>>> tdx-guest object >>>>> >>>>> On 6/6/2024 6:45 PM, Daniel P. Berrangé wrote: >>>>>> Copying Zhenzhong Duan as my point relates to the proposed libvirt >>>>>> TDX patches. >>>>>> >>>>>> On Thu, Feb 29, 2024 at 01:36:46AM -0500, Xiaoyao Li wrote: >>>>>>> Bit 28 of TD attribute, named SEPT_VE_DISABLE. When set to 1, it >>>>> disables >>>>>>> EPT violation conversion to #VE on guest TD access of PENDING pages. >>>>>>> >>>>>>> Some guest OS (e.g., Linux TD guest) may require this bit as 1. >>>>>>> Otherwise refuse to boot. >>>>>>> >>>>>>> Add sept-ve-disable property for tdx-guest object, for user to configure >>>>>>> this bit. >>>>>>> >>>>>>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> >>>>>>> Acked-by: Gerd Hoffmann <kraxel@redhat.com> >>>>>>> Acked-by: Markus Armbruster <armbru@redhat.com> >>>>>>> --- >>>>>>> Changes in v4: >>>>>>> - collect Acked-by from Markus >>>>>>> >>>>>>> Changes in v3: >>>>>>> - update the comment of property @sept-ve-disable to make it more >>>>>>> descriptive and use new format. (Daniel and Markus) >>>>>>> --- >>>>>>> qapi/qom.json | 7 ++++++- >>>>>>> target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++ >>>>>>> 2 files changed, 30 insertions(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/qapi/qom.json b/qapi/qom.json >>>>>>> index 220cc6c98d4b..89ed89b9b46e 100644 >>>>>>> --- a/qapi/qom.json >>>>>>> +++ b/qapi/qom.json >>>>>>> @@ -900,10 +900,15 @@ >>>>>>> # >>>>>>> # Properties for tdx-guest objects. >>>>>>> # >>>>>>> +# @sept-ve-disable: toggle bit 28 of TD attributes to control disabling >>>>>>> +# of EPT violation conversion to #VE on guest TD access of PENDING >>>>>>> +# pages. Some guest OS (e.g., Linux TD guest) may require this to >>>>>>> +# be set, otherwise they refuse to boot. >>>>>>> +# >>>>>>> # Since: 9.0 >>>>>>> ## >>>>>>> { 'struct': 'TdxGuestProperties', >>>>>>> - 'data': { }} >>>>>>> + 'data': { '*sept-ve-disable': 'bool' } } >>>>>> >>>>>> So this exposes a single boolean property that gets mapped into one >>>>>> specific bit in the TD attributes: >>>>>> >>>>>>> + >>>>>>> +static void tdx_guest_set_sept_ve_disable(Object *obj, bool value, Error >>>>> **errp) >>>>>>> +{ >>>>>>> + TdxGuest *tdx = TDX_GUEST(obj); >>>>>>> + >>>>>>> + if (value) { >>>>>>> + tdx->attributes |= TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; >>>>>>> + } else { >>>>>>> + tdx->attributes &= ~TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; >>>>>>> + } >>>>>>> +} >>>>>> >>>>>> If I look at the documentation for TD attributes >>>>>> >>>>>> https://download.01.org/intel-sgx/latest/dcap- >>>>> latest/linux/docs/Intel_TDX_DCAP_Quoting_Library_API.pdf >>>>>> >>>>>> Section "A.3.4. TD Attributes" >>>>>> >>>>>> I see "TD attributes" is a 64-bit int, with 5 bits currently >>>>>> defined "DEBUG", "SEPT_VE_DISABLE", "PKS", "PL", "PERFMON", >>>>>> and the rest currently reserved for future use. This makes me >>>>>> wonder about our modelling approach into the future ? >>>>>> >>>>>> For the AMD SEV equivalent we've just directly exposed the whole >>>>>> field as an int: >>>>>> >>>>>> 'policy' : 'uint32', >>>>>> >>>>>> For the proposed SEV-SNP patches, the same has been done again >>>>>> >>>>>> https://lists.nongnu.org/archive/html/qemu-devel/2024- >>>>> 06/msg00536.html >>>>>> >>>>>> '*policy': 'uint64', >>>>>> >>>>>> >>>>>> The advantage of exposing individual booleans is that it is >>>>>> self-documenting at the QAPI level, but the disadvantage is >>>>>> that every time we want to expose ability to control a new >>>>>> bit in the policy we have to modify QEMU, libvirt, the mgmt >>>>>> app above libvirt, and whatever tools the end user has to >>>>>> talk to the mgmt app. >>>>>> >>>>>> If we expose a policy int, then newly defined bits only require >>>>>> a change in QEMU, and everything above QEMU will already be >>>>>> capable of setting it. >>>>>> >>>>>> In fact if I look at the proposed libvirt patches, they have >>>>>> proposed just exposing a policy "int" field in the XML, which >>>>>> then has to be unpacked to set the individual QAPI booleans >>>>>> >>>>>> >>>>> https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/WXWX >>>>> EESYUA77DP7YIBP55T2OPSVKV5QW/ >>>>>> >>>>>> On balance, I think it would be better if QEMU just exposed >>>>>> the raw TD attributes policy as an uint64 at QAPI, instead >>>>>> of trying to unpack it to discrete bool fields. This gives >>>>>> consistency with SEV and SEV-SNP, and with what's proposed >>>>>> at the libvirt level, and minimizes future changes when >>>>>> more policy bits are defined. >>>>> >>>>> The reasons why introducing individual bit of sept-ve-disable instead of >>>>> a raw TD attribute as a whole are that >>>>> >>>>> 1. other bits like perfmon, PKS, KL are associated with cpu properties, >>>>> e.g., >>>>> >>>>> perfmon -> pmu, >>>>> pks -> pks, >>>>> kl -> keylokcer feature that QEMU currently doesn't support >>>>> >>>>> If allowing configuring attribute directly, we need to deal with the >>>>> inconsistence between attribute vs cpu property. >>>> >>>> What about defining those bits associated with cpu properties reserved >>>> But other bits work as Daniel suggested way. >>> >>> I don't understand. Do you mean we provide the interface to configure raw 64 >>> bit attributes while some bits of it are reserved? >> >> Just have a mask of what bits are permitted to be set, and report an >> error if the user sets non-permitted bits. > > Looking at the IGVM patches, the CnofidentialGuestSupport class is > gaining a "set_guest_policy" method, which takes a "uint64_t policy". > > If (when) IGVM support is extended to cover TDX too, then we're > going to need to accept the user providing the policy in integer > format via the IGVM file metadata. This will require adding code > to check for any inconsistency between the policy bitmask, and > the CPU flags. > > https://lists.nongnu.org/archive/html/qemu-devel/2024-06/msg04037.html Just have a glance at the IGVM series (which is a news to me). The IGVM series looks specific to SEV-*. Though some of the callbacks introduced in ConfidentialGuestSupportClass can be tweaked for TDX, but the arguments introduced are specific to SEV-*. Aside from above, are you going to map (sev-*)'s policy to TDX's attributes? I don't think it is a good idea. It mixes up things with limited benefit. To me, the common of them is only they are both 64-bit field. > so to me this is another reason to just expose the policy as an > integer in the QAPI/QOM structure too. Everywhere just wants to > be working with policy in integer format. I'm not reluctant to expose TD's attribute as a raw 64-bit data in QAPI structure. I just don't like the idea of the permitted mask, which makes the 64-bit field incomplete, and makes things complicated. People need to learn that some bits are configured in attribute directly while other bits are configured via cpu properties indirectly. Maybe we can allow the direct configurability of raw 64-bit attribute and give it highest priority. If inconsistent value is provided via cpu properties, warn it and let the attribute value overwrite CPU properties? > With regards, > Daniel
diff --git a/qapi/qom.json b/qapi/qom.json index 220cc6c98d4b..89ed89b9b46e 100644 --- a/qapi/qom.json +++ b/qapi/qom.json @@ -900,10 +900,15 @@ # # Properties for tdx-guest objects. # +# @sept-ve-disable: toggle bit 28 of TD attributes to control disabling +# of EPT violation conversion to #VE on guest TD access of PENDING +# pages. Some guest OS (e.g., Linux TD guest) may require this to +# be set, otherwise they refuse to boot. +# # Since: 9.0 ## { 'struct': 'TdxGuestProperties', - 'data': { }} + 'data': { '*sept-ve-disable': 'bool' } } ## # @ThreadContextProperties: diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c index d548ec340285..806192158c9d 100644 --- a/target/i386/kvm/tdx.c +++ b/target/i386/kvm/tdx.c @@ -32,6 +32,8 @@ (1U << KVM_FEATURE_PV_SCHED_YIELD) | \ (1U << KVM_FEATURE_MSI_EXT_DEST_ID)) +#define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE BIT_ULL(28) + #define TDX_ATTRIBUTES_MAX_BITS 64 static FeatureMask tdx_attrs_ctrl_fields[TDX_ATTRIBUTES_MAX_BITS] = { @@ -514,6 +516,24 @@ int tdx_pre_create_vcpu(CPUState *cpu, Error **errp) return 0; } +static bool tdx_guest_get_sept_ve_disable(Object *obj, Error **errp) +{ + TdxGuest *tdx = TDX_GUEST(obj); + + return !!(tdx->attributes & TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE); +} + +static void tdx_guest_set_sept_ve_disable(Object *obj, bool value, Error **errp) +{ + TdxGuest *tdx = TDX_GUEST(obj); + + if (value) { + tdx->attributes |= TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; + } else { + tdx->attributes &= ~TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE; + } +} + /* tdx guest */ OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest, tdx_guest, @@ -529,6 +549,10 @@ static void tdx_guest_init(Object *obj) qemu_mutex_init(&tdx->lock); tdx->attributes = 0; + + object_property_add_bool(obj, "sept-ve-disable", + tdx_guest_get_sept_ve_disable, + tdx_guest_set_sept_ve_disable); } static void tdx_guest_finalize(Object *obj)