Message ID: 20191120114334.2287-1-frankja@linux.ibm.com
Series: s390x: Protected Virtualization support
On Wed, 20 Nov 2019 06:43:19 -0500
Janosch Frank <frankja@linux.ibm.com> wrote:

Do you have a branch with this somewhere?

> Most of the QEMU changes for PV are related to the new IPL type with
> subcodes 8 - 10 and the execution of the necessary Ultravisor calls to
> IPL secure guests. Note that we can only boot into secure mode from
> normal mode, i.e. stfle 161 is not active in secure mode.
>
> The other changes relate to data gathering for emulation and
> disabling addressing checks in secure mode, as well as CPU resets.
>
> While working on this I sprinkled in some cleanups, as we sometimes
> significantly increase the line count of some functions and they
> become unreadable.

Any other cleanups than in the first two patches? I.e., anything that
could be picked up independently?

> Janosch Frank (15):
>   s390x: Cleanup cpu resets
>   s390x: Beautify diag308 handling
>   s390x: protvirt: Add diag308 subcodes 8 - 10
>   Header sync protvirt
>   s390x: protvirt: Sync PV state
>   s390x: protvirt: Support unpack facility
>   s390x: protvirt: Handle diag 308 subcodes 0,1,3,4
>   s390x: protvirt: KVM intercept changes
>   s390x: protvirt: SCLP interpretation
>   s390x: protvirt: Add new VCPU reset functions
>   RFC: s390x: Exit on vcpu reset error
>   s390x: protvirt: Set guest IPL PSW
>   s390x: protvirt: Move diag 308 data over SIDAD
>   s390x: protvirt: Disable address checks for PV guest IO emulation
>   s390x: protvirt: Handle SIGP store status correctly
>
>  hw/s390x/Makefile.objs              |   1 +
>  hw/s390x/ipl.c                      |  81 +++++++++++++++++-
>  hw/s390x/ipl.h                      |  35 ++++++++
>  hw/s390x/pv.c                       | 123 +++++++++++++++++++++++++++
>  hw/s390x/pv.h                       |  27 ++++++
>  hw/s390x/s390-virtio-ccw.c          |  79 ++++++++++++++---
>  hw/s390x/sclp.c                     |  16 ++++
>  include/hw/s390x/sclp.h             |   2 +
>  linux-headers/asm-s390/kvm.h        |   4 +-
>  linux-headers/linux/kvm.h           |  43 ++++++++++
>  target/s390x/cpu.c                  | 127 ++++++++++++++--------------
>  target/s390x/cpu.h                  |   1 +
>  target/s390x/cpu_features_def.inc.h |   1 +
>  target/s390x/diag.c                 | 108 +++++++++++++++++------
>  target/s390x/ioinst.c               |  46 ++++++----
>  target/s390x/kvm-stub.c             |  10 ++-
>  target/s390x/kvm.c                  |  58 +++++++++++--
>  target/s390x/kvm_s390x.h            |   4 +-
>  target/s390x/sigp.c                 |   7 +-
>  19 files changed, 640 insertions(+), 133 deletions(-)
>  create mode 100644 hw/s390x/pv.c
>  create mode 100644 hw/s390x/pv.h
On 11/20/19 2:26 PM, Cornelia Huck wrote:
> On Wed, 20 Nov 2019 06:43:19 -0500
> Janosch Frank <frankja@linux.ibm.com> wrote:
>
> Do you have a branch with this somewhere?
>
>> While working on this I sprinkled in some cleanups, as we sometimes
>> significantly increase the line count of some functions and they
>> become unreadable.
>
> Any other cleanups than in the first two patches? I.e., anything that
> could be picked up independently?

Maybe patch #11, but that's RFC.

[...]
On 11/20/19 2:26 PM, Cornelia Huck wrote:
> On Wed, 20 Nov 2019 06:43:19 -0500
> Janosch Frank <frankja@linux.ibm.com> wrote:
>
> Do you have a branch with this somewhere?

Just for you:
https://github.com/frankjaa/qemu/tree/protvirt
On Thu, 21 Nov 2019 10:13:29 +0100
Janosch Frank <frankja@linux.ibm.com> wrote:

> On 11/20/19 2:26 PM, Cornelia Huck wrote:
>> On Wed, 20 Nov 2019 06:43:19 -0500
>> Janosch Frank <frankja@linux.ibm.com> wrote:
>>
>> Do you have a branch with this somewhere?
>
> Just for you:
> https://github.com/frankjaa/qemu/tree/protvirt

Thanks!
On Wed, Nov 20, 2019 at 06:43:19AM -0500, Janosch Frank wrote:
> Most of the QEMU changes for PV are related to the new IPL type with
> subcodes 8 - 10 and the execution of the necessary Ultravisor calls to
> IPL secure guests. Note that we can only boot into secure mode from
> normal mode, i.e. stfle 161 is not active in secure mode.
>
> The other changes relate to data gathering for emulation and
> disabling addressing checks in secure mode, as well as CPU resets.
>
> While working on this I sprinkled in some cleanups, as we sometimes
> significantly increase the line count of some functions and they
> become unreadable.

Can you give some guidance on how management applications, including
libvirt & the layers above it (oVirt, OpenStack, etc.), would/should
use this feature? What new command line / monitor calls are needed,
and what feature restrictions are there on its use?

Regards,
Daniel
On 11/29/19 12:08 PM, Daniel P. Berrangé wrote:
> On Wed, Nov 20, 2019 at 06:43:19AM -0500, Janosch Frank wrote:
[...]
> Can you give some guidance on how management applications, including
> libvirt & the layers above it (oVirt, OpenStack, etc.), would/should
> use this feature? What new command line / monitor calls are needed,
> and what feature restrictions are there on its use?

Hey Daniel,

management applications generally do not need to know about this
feature. Most of the magic is in the guest image, which boots up in a
certain way to become a protected machine.

The requirements for that to happen are:
* Machine/firmware support
* KVM & QEMU support
* IO only with iommu
* Guest needs to use IO bounce buffers
* A kernel image, or a kernel on a disk, that was prepared with
  special tooling

Such VMs are started like any other VM and run a short "normal" stub
that prepares some things and then requests to be protected.

Most of the restrictions are memory related and might be lifted in the
future:
* No paging
* No migration
* No huge page backings
* No collaborative memory management

There are no monitor changes or cmd additions currently.
We're trying to insert protected VMs into the normal VM flow as much as
possible. You can even do a memory dump without any segfault or
protection exception for QEMU; however, the guest's memory content will
be unreadable because it's encrypted.
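To make the iommu requirement above concrete: an invocation might look
roughly like the following sketch. The machine options, device ids, and
image path are made up for illustration; iommu_platform=on is the
existing QEMU virtio device property that marks the device as sitting
behind an IOMMU, which is what pushes the guest onto the bounce-buffer
path.

    # Hypothetical example: ids and the image path are illustrative.
    qemu-system-s390x \
        -machine s390-ccw-virtio \
        -cpu host \
        -m 4G \
        -nographic \
        -drive if=none,id=disk0,file=prepared-secure-guest.qcow2 \
        -device virtio-blk-ccw,drive=disk0,iommu_platform=on \
        -netdev user,id=net0 \
        -device virtio-net-ccw,netdev=net0,iommu_platform=on

Note that, per the description above, nothing in this command line is
specific to protected mode; the guest image itself decides whether to
request protection at boot.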
On Fri, Nov 29, 2019 at 01:14:27PM +0100, Janosch Frank wrote:
> management applications generally do not need to know about this
> feature. Most of the magic is in the guest image, which boots up in a
> certain way to become a protected machine.
>
> The requirements for that to happen are:
> * Machine/firmware support
> * KVM & QEMU support
> * IO only with iommu
> * Guest needs to use IO bounce buffers
> * A kernel image, or a kernel on a disk, that was prepared with
>   special tooling

If the user has a guest image that's expecting to run in protected
machine mode, presumably this will fail to boot if run on a host
which doesn't support this feature?

As a mgmt app, I think there will be a need to be able to determine
whether a host + QEMU combo is actually able to support protected
machines. If the mgmt app is given an image and the user says it
requires protected mode, then the mgmt app needs to know which
host(s) are able to run it.

Doing version number checks is not particularly desirable, so is
there a way libvirt can determine if QEMU + host in general supports
protected machines, so that we can report this feature to mgmt apps?

If a guest has booted & activated protected mode, is there any way
for libvirt to query that status? This would allow the mgmt app
to know that the guest is not going to be migratable thereafter.

Is there any way to prevent a guest from using protected mode even
if QEMU supports it? E.g. the mgmt app may want to be able to
guarantee that all VMs are migratable, so it doesn't want a guest OS
secretly activating protected mode, which blocks migration.

> Such VMs are started like any other VM and run a short "normal" stub
> that prepares some things and then requests to be protected.
>
> Most of the restrictions are memory related and might be lifted in the
> future:
> * No paging
> * No migration

Presumably QEMU is going to set a migration blocker when a guest
activates protected mode?

> * No huge page backings
> * No collaborative memory management
>
> There are no monitor changes or cmd additions currently.
> We're trying to insert protected VMs into the normal VM flow as much as
> possible. You can even do a memory dump without any segfault or
> protection exception for QEMU; however, the guest's memory content will
> be unreadable because it's encrypted.

Is there any way to securely acquire a key needed to interpret this,
or is the memory dump completely useless?

Regards,
Daniel
On 11/29/19 1:35 PM, Daniel P. Berrangé wrote:
> On Fri, Nov 29, 2019 at 01:14:27PM +0100, Janosch Frank wrote:
[...]
> If the user has a guest image that's expecting to run in protected
> machine mode, presumably this will fail to boot if run on a host
> which doesn't support this feature?

Yes, the guest will lack stfle facility 161, and KVM will report a
specification exception on diag 308 subcodes 8 - 10.

> As a mgmt app, I think there will be a need to be able to determine
> whether a host + QEMU combo is actually able to support protected
> machines. If the mgmt app is given an image and the user says it
> requires protected mode, then the mgmt app needs to know which
> host(s) are able to run it.
>
> Doing version number checks is not particularly desirable, so is
> there a way libvirt can determine if QEMU + host in general supports
> protected machines, so that we can report this feature to mgmt apps?

I thought that would be visible via the cpu model, by checking for the
unpack facility (161)? Time for somebody else to explain that.

@Viktor @Boris: This one's for you.

> If a guest has booted & activated protected mode, is there any way
> for libvirt to query that status? This would allow the mgmt app
> to know that the guest is not going to be migratable thereafter.

Currently not.

> Is there any way to prevent a guest from using protected mode even
> if QEMU supports it? E.g. the mgmt app may want to be able to
> guarantee that all VMs are migratable, so it doesn't want a guest OS
> secretly activating protected mode, which blocks migration.

Not enabling facility 161 is enough.

> Presumably QEMU is going to set a migration blocker when a guest
> activates protected mode?

Well, that's stuff I still need to figure out :)

> Is there any way to securely acquire a key needed to interpret this,
> or is the memory dump completely useless?

It's part of the design, but not yet implemented.
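For context on the migration-blocker question above: QEMU has a
long-standing generic mechanism for this, migrate_add_blocker(). The
following is a minimal sketch of that generic pattern only, not code
from this series; the function and variable names (s390_pv_block_migration,
pv_mig_blocker) are made up for illustration.

    /*
     * Sketch of the generic QEMU migration-blocker pattern.
     * Not part of this series; names are illustrative.
     */
    #include "qemu/osdep.h"
    #include "qapi/error.h"
    #include "qemu/error-report.h"
    #include "migration/blocker.h"

    static Error *pv_mig_blocker;

    static int s390_pv_block_migration(void)
    {
        Error *local_err = NULL;

        error_setg(&pv_mig_blocker,
                   "protected virtualization: guest is not migratable");
        if (migrate_add_blocker(pv_mig_blocker, &local_err)) {
            /* Fails e.g. when a migration is already in progress. */
            error_report_err(local_err);
            error_free(pv_mig_blocker);
            pv_mig_blocker = NULL;
            return -EBUSY;
        }
        return 0;
    }

Presumably such a blocker would be installed on the transition into
protected mode and removed again on a full reset, but as the reply
above says, that is still to be worked out.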
On 11/29/19 3:02 PM, Janosch Frank wrote:
[...]
>> As a mgmt app, I think there will be a need to be able to determine
>> whether a host + QEMU combo is actually able to support protected
>> machines. If the mgmt app is given an image and the user says it
>> requires protected mode, then the mgmt app needs to know which
>> host(s) are able to run it.
>>
>> Doing version number checks is not particularly desirable, so is
>> there a way libvirt can determine if QEMU + host in general supports
>> protected machines, so that we can report this feature to mgmt apps?
>
> I thought that would be visible via the cpu model, by checking for the
> unpack facility (161)? Time for somebody else to explain that.
>
> @Viktor @Boris: This one's for you.

Right, a management app could check the supported CPU model with
something like virsh domcapabilities. The domain's CPU model would have
to require the 'unpack' facility. So, in theory, any management app
establishing CPU model compatibility using the libvirt APIs should be
able to find appropriate hosts.

[...]
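As an illustration of the scheme Viktor describes, and assuming libvirt
exposes the facility under the feature name 'unpack' as suggested
above, the domain XML would pin the requirement roughly like this:

    <!-- Illustrative fragment: requiring the unpack facility means the
         domain can only start on hosts whose CPU model provides it. -->
    <cpu mode='host-model'>
      <feature policy='require' name='unpack'/>
    </cpu>

A management app could then inspect each candidate host's reported CPU
model via virsh domcapabilities and schedule the guest only on hosts
where that feature is available.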
On Fri, Nov 29, 2019 at 03:02:41PM +0100, Janosch Frank wrote:
> On 11/29/19 1:35 PM, Daniel P. Berrangé wrote:
>> Is there any way to prevent a guest from using protected mode even
>> if QEMU supports it? E.g. the mgmt app may want to be able to
>> guarantee that all VMs are migratable, so it doesn't want a guest OS
>> secretly activating protected mode, which blocks migration.
>
> Not enabling facility 161 is enough.

Is this facility enabled by default in any scenario?

What happens if the feature is enabled & QEMU is also configured to
use huge pages, or does not have memory pinned into RAM, given that
those features are said to be incompatible?

>> Presumably QEMU is going to set a migration blocker when a guest
>> activates protected mode?
>
> Well, that's stuff I still need to figure out :)

Regards,
Daniel
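On the QEMU side, that per-VM switch would come down to the CPU model.
Assuming the series wires facility 161 into the CPU model as a feature
named 'unpack' (the patch touching cpu_features_def.inc.h suggests
something along those lines; treat the name as an assumption), a mgmt
app could control it per guest:

    # Illustrative only: the feature name 'unpack' is an assumption.
    # Allow the guest to transition to protected mode:
    qemu-system-s390x -machine s390-ccw-virtio -cpu host,unpack=on ...

    # Guarantee migratability by withholding the facility:
    qemu-system-s390x -machine s390-ccw-virtio -cpu host,unpack=off ...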