Message ID: 1556749903-19221-1-git-send-email-walling@linux.ibm.com
State: New
Series: [v4] s390: diagnose 318 info reset and migration support
On 02.05.19 00:31, Collin Walling wrote:
> DIAGNOSE 0x318 (diag318) is a privileged s390x instruction that must
> be intercepted by SIE and handled via KVM. Let's introduce some
> functions to communicate between QEMU and KVM via ioctls. These
> will be used to get/set the diag318 related information (also known
> as the "Control Program Code" or "CPC"), as well as check whether
> KVM supports handling this instruction.
>
> The availability of this instruction is determined by byte 134, bit 0
> of the Read Info block. This coincidentally expands into the space used
> for CPU entries, which means VMs running with the diag318 capability
> will have a reduced maximum CPU count. To alleviate this, let's calculate
> the actual max CPU entry space by subtracting the size of Read Info from
> the SCCB size, then dividing that number by the size of a CPU entry. If
> this value is less than the value denoted by S390_MAX_CPUS, then let's
> reduce the max cpus for s390 from 248 to 240 in an effort to anticipate
> this potentially happening again in the future.
>
> In order to simplify the migration and system reset requirements of
> the diag318 data, let's introduce it as a device class and include
> a VMStateDescription.
>
> Diag318 is reset during modified clear and load normal.
>
> Signed-off-by: Collin Walling <walling@linux.ibm.com>
> ---
>  hw/s390x/Makefile.objs       |   1 +
>  hw/s390x/diag318.c           | 100 +++++++++++++++++++++++++++++++++++++++++++
>  hw/s390x/diag318.h           |  39 +++++++++++++++++
>  hw/s390x/s390-virtio-ccw.c   |  23 ++++++++++
>  hw/s390x/sclp.c              |   5 +++
>  include/hw/s390x/sclp.h      |   2 +
>  linux-headers/asm-s390/kvm.h |   4 ++
>  target/s390x/kvm-stub.c      |  15 +++++++
>  target/s390x/kvm.c           |  32 ++++++++++++++
>  target/s390x/kvm_s390x.h     |   3 ++
>  10 files changed, 224 insertions(+)
>  create mode 100644 hw/s390x/diag318.c
>  create mode 100644 hw/s390x/diag318.h
>
> diff --git a/hw/s390x/Makefile.objs b/hw/s390x/Makefile.objs
> index e02ed80..93621dc 100644
> --- a/hw/s390x/Makefile.objs
> +++ b/hw/s390x/Makefile.objs
> @@ -34,3 +34,4 @@ obj-$(CONFIG_KVM) += s390-stattrib-kvm.o
>  obj-y += s390-ccw.o
>  obj-y += ap-device.o
>  obj-y += ap-bridge.o
> +obj-y += diag318.o
> diff --git a/hw/s390x/diag318.c b/hw/s390x/diag318.c
> new file mode 100644
> index 0000000..94b44da
> --- /dev/null
> +++ b/hw/s390x/diag318.c
> @@ -0,0 +1,100 @@
> +/*
> + * DIAGNOSE 0x318 functions for reset and migration
> + *
> + * Copyright IBM, Corp. 2019
> + *
> + * Authors:
> + *  Collin Walling <walling@linux.ibm.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or (at your
> + * option) any later version. See the COPYING file in the top-level directory.
> + */
> +
> +#include "hw/s390x/diag318.h"
> +#include "qapi/error.h"
> +#include "kvm_s390x.h"
> +#include "sysemu/kvm.h"
> +
> +static int diag318_post_load(void *opaque, int version_id)
> +{
> +    DIAG318State *d = opaque;
> +
> +    kvm_s390_set_cpc(d->cpc);
> +
> +    /* It is not necessary to retain a copy of the cpc after migration. */
> +    d->cpc = 0;

But we also do not need to zero it out? Can't you just drop these 2 lines?

> +
> +    return 0;
> +}
> +
> +static int diag318_pre_save(void *opaque)
> +{
> +    DIAG318State *d = opaque;
> +
> +    kvm_s390_get_cpc(&d->cpc);
> +    return 0;
> +}
> +
> +static bool diag318_needed(void *opaque)
> +{
> +    DIAG318State *d = opaque;
> +
> +    return d->enabled;
> +}

I would like to have a cpumodel entry that allows disabling that feature. And we should
then check for this.

> +
> +const VMStateDescription vmstate_diag318 = {
> +    .name = "vmstate_diag318",
> +    .post_load = diag318_post_load,
> +    .pre_save = diag318_pre_save,
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = diag318_needed,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT64(cpc, DIAG318State),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
> +static void s390_diag318_realize(DeviceState *dev, Error **errp)
> +{
> +    DIAG318State *d = DIAG318(dev);
> +
> +    if (kvm_s390_has_diag318()) {
> +        d->enabled = true;
> +    }

same here -> cpumodel
Then we can get rid of the enabled bool.

> +}
> +
> +static void s390_diag318_reset(DeviceState *dev)
> +{
> +    DIAG318State *d = DIAG318(dev);
> +
> +    if (d->enabled) {
> +        kvm_s390_set_cpc(0);
> +    }
> +}
> +
> +static void s390_diag318_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +
> +    dc->realize = s390_diag318_realize;
> +    dc->reset = s390_diag318_reset;
> +    dc->vmsd = &vmstate_diag318;
> +    dc->hotpluggable = false;
> +    /* Reason: Set automatically during IPL */
> +    dc->user_creatable = false;
> +}
> +
> +static const TypeInfo s390_diag318_info = {
> +    .class_init = s390_diag318_class_init,
> +    .parent = TYPE_DEVICE,
> +    .name = TYPE_S390_DIAG318,
> +    .instance_size = sizeof(DIAG318State),
> +};
> +
> +static void s390_diag318_register_types(void)
> +{
> +    type_register_static(&s390_diag318_info);
> +}
> +
> +type_init(s390_diag318_register_types)
> diff --git a/hw/s390x/diag318.h b/hw/s390x/diag318.h
> new file mode 100644
> index 0000000..c154b0a
> --- /dev/null
> +++ b/hw/s390x/diag318.h
> @@ -0,0 +1,39 @@
> +/*
> + * DIAGNOSE 0x318 functions for reset and migration
> + *
> + * Copyright IBM, Corp. 2019
> + *
> + * Authors:
> + *  Collin Walling <walling@linux.ibm.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or (at your
> + * option) any later version. See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef DIAG318_H
> +#define DIAG318_H
> +
> +#include "qemu/osdep.h"
> +#include "hw/qdev.h"
> +
> +#define TYPE_S390_DIAG318 "diag318"
> +#define DIAG318(obj) \
> +    OBJECT_CHECK(DIAG318State, (obj), TYPE_S390_DIAG318)
> +
> +typedef struct DIAG318State {
> +    /*< private >*/
> +    DeviceState parent_obj;
> +
> +    /*< public >*/
> +    uint64_t cpc;
> +    bool enabled;
> +} DIAG318State;
> +
> +typedef struct DIAG318Class {
> +    /*< private >*/
> +    DeviceClass parent_class;
> +
> +    /*< public >*/
> +} DIAG318Class;
> +
> +#endif /* DIAG318_H */
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index d11069b..44a424b 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -36,6 +36,7 @@
>  #include "cpu_models.h"
>  #include "hw/nmi.h"
>  #include "hw/s390x/tod.h"
> +#include "hw/s390x/diag318.h"
>
>  S390CPU *s390_cpu_addr2state(uint16_t cpu_addr)
>  {
> @@ -92,6 +93,7 @@ static const char *const reset_dev_types[] = {
>      "s390-sclp-event-facility",
>      "s390-flic",
>      "diag288",
> +    TYPE_S390_DIAG318,
>  };
>
>  static void subsystem_reset(void)
> @@ -246,6 +248,17 @@ static void s390_create_sclpconsole(const char *type, Chardev *chardev)
>      qdev_init_nofail(dev);
>  }
>
> +static void s390_init_diag318(void)
> +{
> +    Object *new = object_new(TYPE_S390_DIAG318);
> +    DeviceState *dev = DEVICE(new);
> +
> +    object_property_add_child(qdev_get_machine(), TYPE_S390_DIAG318,
> +                              new, NULL);
> +    object_unref(new);
> +    qdev_init_nofail(dev);
> +}
> +
>  static void ccw_init(MachineState *machine)
>  {
>      int ret;
> @@ -302,6 +315,8 @@ static void ccw_init(MachineState *machine)
>
>      /* init the TOD clock */
>      s390_init_tod();
> +
> +    s390_init_diag318();
>  }
>
>  static void s390_cpu_plug(HotplugHandler *hotplug_dev,
> @@ -570,6 +585,7 @@ static void machine_set_loadparm(Object *obj, const char *val, Error **errp)
>          ms->loadparm[i] = ' '; /* pad right with spaces */
>      }
>  }
> +
>  static inline void s390_machine_initfn(Object *obj)
>  {
>      object_property_add_bool(obj, "aes-key-wrap",
> @@ -652,6 +668,13 @@ static void ccw_machine_4_0_instance_options(MachineState *machine)
>
>  static void ccw_machine_4_0_class_options(MachineClass *mc)
>  {
> +    /*
> +     * Read Info might reveal more bytes used to detect facilities, thus
> +     * reducing the number of CPU entries. Let's reduce the max CPUs by
> +     * an arbitrary number in effort to anticipate future facility bytes.
> +     */
> +    if ((SCCB_SIZE - sizeof(ReadInfo)) / sizeof(CPUEntry) < S390_MAX_CPUS)
> +        mc->max_cpus = S390_MAX_CPUS - 8;

Maybe this should depend on the presence of this feature in the cpumodel?

>  }
>  DEFINE_CCW_MACHINE(4_0, "4.0", true);
>
> diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
> index 4510a80..9cfa188 100644
> --- a/hw/s390x/sclp.c
> +++ b/hw/s390x/sclp.c
> @@ -22,6 +22,7 @@
>  #include "hw/s390x/event-facility.h"
>  #include "hw/s390x/s390-pci-bus.h"
>  #include "hw/s390x/ipl.h"
> +#include "kvm_s390x.h"
>
>  static inline SCLPDevice *get_sclp_device(void)
>  {
> @@ -74,6 +75,10 @@ static void read_SCP_info(SCLPDevice *sclp, SCCB *sccb)
>      s390_get_feat_block(S390_FEAT_TYPE_SCLP_CONF_CHAR_EXT,
>                          read_info->conf_char_ext);
>
> +    /* Enable diag318 for guest if KVM supports emulation */
> +    if (kvm_s390_has_diag318())
> +        read_info->fac134 = 0x80;

I think we should rather make this part of the cpumodel

> +
>      read_info->facilities = cpu_to_be64(SCLP_HAS_CPU_INFO |
>                                          SCLP_HAS_IOA_RECONFIG);
>
> diff --git a/include/hw/s390x/sclp.h b/include/hw/s390x/sclp.h
> index f9db243..d47e10a 100644
> --- a/include/hw/s390x/sclp.h
> +++ b/include/hw/s390x/sclp.h
> @@ -133,6 +133,8 @@ typedef struct ReadInfo {
>      uint16_t highest_cpu;
>      uint8_t _reserved5[124 - 122];     /* 122-123 */
>      uint32_t hmfai;
> +    uint8_t _reserved7[134 - 128];     /* 128-133 */
> +    uint8_t fac134;
>      struct CPUEntry entries[0];
>  } QEMU_PACKED ReadInfo;
>
> diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h
> index 0265482..735f5a4 100644
> --- a/linux-headers/asm-s390/kvm.h
> +++ b/linux-headers/asm-s390/kvm.h
> @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
>  #define KVM_S390_VM_CRYPTO 2
>  #define KVM_S390_VM_CPU_MODEL 3
>  #define KVM_S390_VM_MIGRATION 4
> +#define KVM_S390_VM_MISC 5

Can you split that out into a separate patch (header sync)? (We also need the relevant kernel
patch first, I could not find it right now.)

>
>  /* kvm attributes for mem_ctrl */
>  #define KVM_S390_VM_MEM_ENABLE_CMMA 0
> @@ -168,6 +169,9 @@ struct kvm_s390_vm_cpu_subfunc {
>  #define KVM_S390_VM_MIGRATION_START 1
>  #define KVM_S390_VM_MIGRATION_STATUS 2
>
> +/* kvm attributes for KVM_S390_VM_MISC */
> +#define KVM_S390_VM_MISC_CPC 0
> +
>  /* for KVM_GET_REGS and KVM_SET_REGS */
>  struct kvm_regs {
>      /* general purpose regs for s390 */
> diff --git a/target/s390x/kvm-stub.c b/target/s390x/kvm-stub.c
> index bf7795e..7861ccd 100644
> --- a/target/s390x/kvm-stub.c
> +++ b/target/s390x/kvm-stub.c
> @@ -104,3 +104,18 @@ void kvm_s390_stop_interrupt(S390CPU *cpu)
>  void kvm_s390_restart_interrupt(S390CPU *cpu)
>  {
>  }
> +
> +int kvm_s390_get_cpc(uint64_t *cpc)
> +{
> +    return 0;
> +}
> +
> +int kvm_s390_set_cpc(uint64_t cpc)
> +{
> +    return 0;
> +}
> +
> +bool kvm_s390_has_diag318(void)
> +{
> +    return false;
> +}
> diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
> index 19530fb..225e516 100644
> --- a/target/s390x/kvm.c
> +++ b/target/s390x/kvm.c
> @@ -747,6 +747,38 @@ int kvm_s390_set_clock_ext(uint8_t tod_high, uint64_t tod_low)
>      return kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
>  }
>
> +int kvm_s390_get_cpc(uint64_t *cpc)
> +{
> +    struct kvm_device_attr attr = {
> +        .group = KVM_S390_VM_MISC,
> +        .attr = KVM_S390_VM_MISC_CPC,
> +        .addr = (uint64_t)cpc,
> +    };
> +
> +    return kvm_vm_ioctl(kvm_state, KVM_GET_DEVICE_ATTR, &attr);
> +}
> +
> +int kvm_s390_set_cpc(uint64_t cpc)
> +{
> +    struct kvm_device_attr attr = {
> +        .group = KVM_S390_VM_MISC,
> +        .attr = KVM_S390_VM_MISC_CPC,
> +        .addr = (uint64_t)&cpc,
> +    };
> +
> +    return kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
> +}
> +
> +bool kvm_s390_has_diag318(void)
> +{
> +    struct kvm_device_attr attr = {
> +        .group = KVM_S390_VM_MISC,
> +        .attr = KVM_S390_VM_MISC_CPC,
> +    };
> +
> +    return kvm_vm_ioctl(kvm_state, KVM_HAS_DEVICE_ATTR, &attr) == 0;
> +}
> +
>  /**
>   * kvm_s390_mem_op:
>   * @addr: the logical start address in guest memory
> diff --git a/target/s390x/kvm_s390x.h b/target/s390x/kvm_s390x.h
> index 6e52287..53f165f 100644
> --- a/target/s390x/kvm_s390x.h
> +++ b/target/s390x/kvm_s390x.h
> @@ -29,6 +29,9 @@ int kvm_s390_get_clock(uint8_t *tod_high, uint64_t *tod_clock);
>  int kvm_s390_get_clock_ext(uint8_t *tod_high, uint64_t *tod_clock);
>  int kvm_s390_set_clock(uint8_t tod_high, uint64_t tod_clock);
>  int kvm_s390_set_clock_ext(uint8_t tod_high, uint64_t tod_clock);
> +int kvm_s390_get_cpc(uint64_t *cpc);
> +int kvm_s390_set_cpc(uint64_t cpc);
> +bool kvm_s390_has_diag318(void);
>  void kvm_s390_enable_css_support(S390CPU *cpu);
>  int kvm_s390_assign_subch_ioeventfd(EventNotifier *notifier, uint32_t sch,
>                                      int vq, bool assign);
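For reference, the arithmetic behind the 248 -> 240 reduction in the commit message can be checked directly. The constants below follow the quoted headers (a 4 KiB SCCB, 16-byte CPUEntry, and ReadInfo growing from 128 to roughly 135 bytes once fac134 is added); this is an illustrative sketch, not part of the patch.

```c
#include <stdio.h>

/* Illustrative only: sizes as implied by the quoted sclp.h definitions. */
#define SCCB_SIZE        4096   /* assumed 4 KiB SCCB */
#define CPU_ENTRY_SIZE     16   /* sizeof(struct CPUEntry) */
#define READ_INFO_OLD     128   /* ReadInfo up to and including hmfai */
#define READ_INFO_NEW     135   /* ReadInfo extended through fac134 */

int main(void)
{
    /* (4096 - 128) / 16 = 248 -- the current S390_MAX_CPUS */
    printf("entries before: %d\n",
           (SCCB_SIZE - READ_INFO_OLD) / CPU_ENTRY_SIZE);
    /* (4096 - 135) / 16 = 247 -- the patch rounds down to 240 for headroom */
    printf("entries after:  %d\n",
           (SCCB_SIZE - READ_INFO_NEW) / CPU_ENTRY_SIZE);
    return 0;
}
```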
On 09.05.19 11:58, Christian Borntraeger wrote:
> On 02.05.19 00:31, Collin Walling wrote:
[...]
>> +    /* Enable diag318 for guest if KVM supports emulation */
>> +    if (kvm_s390_has_diag318())
>> +        read_info->fac134 = 0x80;
>
> I think we should rather make this part of the cpumodel

+1, had the same idea when skimming over this.
On 5/9/19 5:58 AM, Christian Borntraeger wrote:
> On 02.05.19 00:31, Collin Walling wrote:
[...]
>> +    /* It is not necessary to retain a copy of the cpc after migration. */
>> +    d->cpc = 0;
>
> But we also do not need to zero it out? Can't you just drop these 2 lines?

Absolutely. I'll drop them.

[...]
> I would like to have a cpumodel entry that allows disabling that feature. And we should
> then check for this.

Noted.

[...]
>> +    if (kvm_s390_has_diag318()) {
>> +        d->enabled = true;
>> +    }
>
> same here -> cpumodel
> Then we can get rid of the enabled bool.

Noted.

[...]
>> +    if ((SCCB_SIZE - sizeof(ReadInfo)) / sizeof(CPUEntry) < S390_MAX_CPUS)
>> +        mc->max_cpus = S390_MAX_CPUS - 8;
>
> Maybe this should depend on the presence of this feature in the cpumodel?

That's a good idea. This would allow the user to fine tune their VM to
either allow the original max 248 _or_ enable the diag318 instruction.

Do you think the reduction to 240 max CPUs still makes sense? I do not
know what the future of the RSCPI looks like, but if we end up with more
bytes indicating facility bits, we could simply reduce the max CPUs
dynamically depending on which features the user / CPU model enables. In
this case, if diag318 is enabled then reduce the max CPUs to 247.

CPU models can protect us WRT migration -- we wouldn't have to concern
ourselves with whether the destination machine can support the same
number of CPUs. In the future, if feature X is enabled, then reduce the
max CPUs to as many as we can fit in the remainder of the SCCB.

[...]
>> +    /* Enable diag318 for guest if KVM supports emulation */
>> +    if (kvm_s390_has_diag318())
>> +        read_info->fac134 = 0x80;
>
> I think we should rather make this part of the cpumodel

As per an older discussion, does it still make sense to restrict this
feature starting with a particular CPU model (we decided to allow this
feature starting with the zEC12 model)? We have the opportunity to
simply enable this feature for *all* CPU models, since this only
requires KVM to support emulation. Pseudo:

    if diag318_cpu_feat_is_enabled
        if kvm_has_diag318
            set facility bit for diag318
        else
            error "KVM does not support emulation of diag318"
            <terminate>

The next question would be this: do we want this feature to be enabled
by default, or only if it is specified by the user?

Default enabled: VMs will have the diag318 capability, but max CPUs will
also be reduced by default. Users will have to explicitly disable
diag318 to allow the max 248 CPUs.

User enabled: VMs will have to be configured to enable this capability
(users might not realize / care to add this feature themselves). This is
as simple as providing the CPU model in a guest XML or command line and
providing the diag318 feature. The max CPUs will not be altered by
default.

[...]
>> +#define KVM_S390_VM_MISC 5
>
> Can you split that out into a separate patch (header sync)? (We also need the relevant kernel
> patch first, I could not find it right now.)

Gah, I keep forgetting about splitting this. Thanks for the reminder :)

[...]
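Collin's pseudo-code above would translate into roughly the following C sketch. The feature constant and check function are assumed names following the patterns discussed in the thread (a CPU model feature plus the existing kvm_s390_has_diag318() probe), not the final implementation.

```c
#include "qemu/osdep.h"
#include "qapi/error.h"

/* Sketch only: S390_FEAT_DIAG318 is an assumed CPU-model feature name. */
static void s390_diag318_check(Error **errp)
{
    if (!s390_has_feat(S390_FEAT_DIAG318)) {
        /* feature not requested in the CPU model, nothing to enable */
        return;
    }
    if (!kvm_s390_has_diag318()) {
        error_setg(errp, "KVM does not support emulation of diag318");
        return;
    }
    /* feature requested and backed by KVM: set facility byte 134, bit 0 */
}
```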
On 09/05/2019 22.50, Collin Walling wrote:
> On 5/9/19 5:58 AM, Christian Borntraeger wrote:
[...]
>>> +    if ((SCCB_SIZE - sizeof(ReadInfo)) / sizeof(CPUEntry) < S390_MAX_CPUS)
>>> +        mc->max_cpus = S390_MAX_CPUS - 8;
>>
>> Maybe this should depend on the presence of this feature in the cpumodel?
>
> That's a good idea. This would allow the user to fine tune their VM to
> either allow the original max 248 _or_ enable the diag318 instruction.

Please also note that ccw_machine_4_0_class_options() is called from
ccw_machine_3_1_class_options() and thus all older machine class option
functions - you have to make sure to restore the 248 for them.

As a test, please check that you can migrate a s390-ccw-virtio-3.1
machine with 248 CPUs from a QEMU without your patch to a QEMU with your
patch and back.

 Thomas
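Following Thomas's remark, the restore would presumably ride on QEMU's usual machine-class compat chaining, where each older class function calls the newer one and then undoes what does not apply to it. A minimal sketch, with the explicit max_cpus restore being the assumed addition:

```c
static void ccw_machine_4_0_class_options(MachineClass *mc)
{
    mc->max_cpus = 240;   /* leave room for future facility bytes */
}

static void ccw_machine_3_1_class_options(MachineClass *mc)
{
    ccw_machine_4_0_class_options(mc);
    mc->max_cpus = 248;   /* 3.1 and older keep the original limit */
    /* ... existing 3.1 compat properties ... */
}
```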
On 02.05.19 00:31, Collin Walling wrote:
[...]
>  static void ccw_machine_4_0_class_options(MachineClass *mc)
>  {
> +    /*
> +     * Read Info might reveal more bytes used to detect facilities, thus
> +     * reducing the number of CPU entries. Let's reduce the max CPUs by
> +     * an arbitrary number in effort to anticipate future facility bytes.
> +     */
> +    if ((SCCB_SIZE - sizeof(ReadInfo)) / sizeof(CPUEntry) < S390_MAX_CPUS)
> +        mc->max_cpus = S390_MAX_CPUS - 8;

This is too complicated, just set it always to 240.

However, I am still not sure how to best handle this scenario. One
solution is

1. Set it statically to 240 for machine > 4.1
2. Keep the old machines unmodified
3. Don't indicate the CPU feature for machines <= 4.0

#3 is the problematic part, as it mixes host CPU features and machines.
Bad. The host CPU model should always look the same on all machines. I
don't like this.
On 13.05.19 09:46, David Hildenbrand wrote:
> On 02.05.19 00:31, Collin Walling wrote:
[...]
>> +    if ((SCCB_SIZE - sizeof(ReadInfo)) / sizeof(CPUEntry) < S390_MAX_CPUS)
>> +        mc->max_cpus = S390_MAX_CPUS - 8;
>
> This is too complicated, just set it always to 240.
>
> However, I am still not sure how to best handle this scenario. One
> solution is
>
> 1. Set it statically to 240 for machine > 4.1
> 2. Keep the old machines unmodified
> 3. Don't indicate the CPU feature for machines <= 4.0
>
> #3 is the problematic part, as it mixes host CPU features and machines.
> Bad. The host CPU model should always look the same on all machines. I
> don't like this.

FWIW, #3 is only an issue when modeling it via the CPU model, like
Christian suggested.

I suggest the following

1. Set the max #cpus for 4.1 to 240 (already done)
2. Keep it for the other machines unmodified (as suggested by Thomas)
3. Create the layout of the SCCB depending on the machine type (to be done)

If we want to model diag318 via a CPU feature (which makes sense for
migration):

4. Disable diag318 with a warning if used with a machine < 4.1
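Point 4 could look roughly like the sketch below, gating on a machine-class flag that only 4.1+ machine types would set. All names here (diag318_allowed, S390_FEAT_DIAG318) are illustrative assumptions, not code from the series.

```c
#include "qemu/osdep.h"
#include "qemu/error-report.h"
#include "qemu/bitops.h"

/* Sketch: warn and mask the feature off on pre-4.1 machine types. */
static void s390_diag318_apply_machine_compat(bool diag318_allowed,
                                              unsigned long *features)
{
    if (!diag318_allowed && test_bit(S390_FEAT_DIAG318, features)) {
        warn_report("diag318 is not supported on machine types older "
                    "than 4.1, disabling");
        clear_bit(S390_FEAT_DIAG318, features);
    }
}
```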
On 13.05.19 10:03, David Hildenbrand wrote:
[...]
> I suggest the following
>
> 1. Set the max #cpus for 4.1 to 240 (already done)
> 2. Keep it for the other machines unmodified (as suggested by Thomas)
> 3. Create the layout of the SCCB depending on the machine type (to be done)
>
> If we want to model diag318 via a CPU feature (which makes sense for
> migration):
>
> 4. Disable diag318 with a warning if used with a machine < 4.1

I think there is a simpler solution. It is perfectly fine to fail the
startup if we cannot fulfil the cpu model. So let's just allow 248 and
allow this feature also for older machines. And if somebody chooses both
at the same time, let's fail the startup.
On 13.05.19 11:34, Christian Borntraeger wrote:
> On 13.05.19 10:03, David Hildenbrand wrote:
[...]
>> If we want to model diag318 via a CPU feature (which makes sense for
>> migration):
>>
>> 4. Disable diag318 with a warning if used with a machine < 4.1
>
> I think there is a simpler solution. It is perfectly fine to fail the
> startup if we cannot fulfil the cpu model. So let's just allow 248 and
> allow this feature also for older machines. And if somebody chooses both
> at the same time, let's fail the startup.

To which knob do you want to glue the layout of the SCLP response? Like
I described? Do you mean instead of warning and masking the feature off
as I suggested, simply failing?

In that case, -machine ..-4.0 -cpu host will not work on new HW with new
KVM. Just noting.
On 13.05.19 11:40, David Hildenbrand wrote:
[...]
> To which knob do you want to glue the layout of the SCLP response? Like
> I described? Do you mean instead of warning and masking the feature off
> as I suggested, simply failing?

The sclp response will depend on the diag318 cpu model flag. If it's on,
the sclp response will have it, otherwise not.
- host-passthrough: not migration safe anyway
- host-model: if the target has diag318, good; otherwise we reject migration

> In that case, -machine ..-4.0 -cpu host will not work on new HW with new
> KVM. Just noting.

Only if you have 248 CPUs (which is unlikely). My point was to do that for
all machine levels.
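As a rough sketch, gluing the Read Info layout to the CPU model flag could look like the fragment below; s390_has_feat() is QEMU's existing feature query, but the S390_FEAT_DIAG318 name and the bit value are assumptions, since the feature definition does not exist yet at this point in the thread:

```c
/* Sketch for sclp.c's read_SCP_info(): only advertise the diag318
 * facility (byte 134, bit 0) when the CPU model has the flag enabled,
 * so guests on older models see the unmodified SCCB layout. */
if (s390_has_feat(S390_FEAT_DIAG318)) {     /* hypothetical feature name */
    read_info->fac134 |= 0x80;              /* bit 0, MSB-first numbering */
}
```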
On 13.05.19 11:51, Christian Borntraeger wrote:
[...]
> The sclp response will depend on the diag318 cpu model flag. If it's on,
> the sclp response will have it, otherwise not.
> - host-passthrough: not migration safe anyway
> - host-model: if the target has diag318, good; otherwise we reject migration
[...]
> Only if you have 248 CPUs (which is unlikely). My point was to do that for
> all machine levels.

The issue with this approach is that e.g. libvirt is not aware of this
restriction. It could query "max_cpus" and expand the host-cpu model,
but starting a guest with > 240 cpus would fail. Maybe this is acceptable.

The approach you describe is actually pretty nice. We would have to

1. Modify SCLP code to change the layout based on the feature availability
2. Check against the cpus when trying to enable the cpu model.
On 13.05.19 11:57, David Hildenbrand wrote:
[...]
> The issue with this approach is that e.g. libvirt is not aware of this
> restriction. It could query "max_cpus" and expand the host-cpu model,
> but starting a guest with > 240 cpus would fail. Maybe this is acceptable.

As of today we do the cpu model check in the same way. libvirt actually tries
to run QEMU and handles failures.

For a failure, the user still has to use >240 CPUs in its XML. The only
downside is that libvirt will not reject this right away.

During startup we would then print an error message like

"The diag318 cpu feature is only supported for 240 or fewer CPUs."

This is of similar quality as
"Selected CPU GA level is too new. Maximum supported model in the
configuration: '%s'",
and others that we have today.

So yes, I think this would be acceptable.

> The approach you describe is actually pretty nice. We would have to
>
> 1. Modify SCLP code to change the layout based on the feature availability
> 2. Check against the cpus when trying to enable the cpu model.
On 13.05.19 12:55, Christian Borntraeger wrote:
[...]
> During startup we would then print an error message like
>
> "The diag318 cpu feature is only supported for 240 or fewer CPUs."
>
> This is of similar quality as
> "Selected CPU GA level is too new. Maximum supported model in the
> configuration: '%s'",

But that can be tested using the runnability information if I am not wrong.

> and others that we have today.
>
> So yes, I think this would be acceptable.

I guess it is acceptable yes. I doubt anybody uses that many CPUs in
production either way. But you never know.
On Mon, 13 May 2019 13:34:35 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 13.05.19 12:55, Christian Borntraeger wrote:
[...]
> But that can be tested using the runnability information if I am not wrong.

You mean the cpu level information, right?

> > and others that we have today.
> >
> > So yes, I think this would be acceptable.
>
> I guess it is acceptable yes. I doubt anybody uses that many CPUs in
> production either way. But you never know.

I think that using that many cpus is a more uncommon setup, but I still
think that having to wait for actual failure is worse than being able
to find out beforehand. Any way to make this discoverable?
On 13.05.19 13:46, Cornelia Huck wrote:
> On Mon, 13 May 2019 13:34:35 +0200
> David Hildenbrand <david@redhat.com> wrote:
[...]
> I think that using that many cpus is a more uncommon setup, but I still
> think that having to wait for actual failure

That can happen all the time today. You can easily say z14 in the xml when
on a zEC12. Only at startup you get the error. The question is really: do
you want to error on definition of the xml or on startup? And I think
startup is the better place here. This allows to create definitions that
will be useful in the future (pre-planning), e.g. if you know that you will
update your machine or the code soon.

> is worse than being able
> to find out beforehand. Any way to make this discoverable?
>>> But that can be tested using the runnability information if I am not wrong.
>>
>> You mean the cpu level information, right?

Yes, query-cpu-definitions includes for each model runnability information
via "unavailable-features" (valid under the started QEMU machine).

[...]
>> That can happen all the time today. You can easily say z14 in the xml when
>> on a zEC12. Only at startup you get the error. The question is really:

"-smp 248 -cpu host" will no longer work, while e.g. "-smp 248 -cpu z12"
will work. Actually, even "-smp 248" will no longer work on affected
machines.

That is why I wonder if it is better to disable the feature and print a
warning. Similar to CMMA, where we want to tolerate when CMMA is not
possible in the current environment (huge pages).

"Diag318 will not be enabled because it is not compatible with more than
240 CPUs."

However, I still think that implementing support for more than one SCLP
response page is the best solution. Guests will need adaptations for > 240
CPUs with Diag318, but who cares? Existing setups will continue to work.

Implementing that SCLP thingy will avoid any warnings and any errors. It
just works from the QEMU perspective.

Is implementing this realistic?

> do you want to error on definition of the xml or on startup?

I actually have no idea what the best practice on the libvirt side is.
There seems to be a user for max-cpus and unavailable-features in QEMU.

> And I think
> startup is the better place here. This allows to create definitions that
> will be useful in the future (pre-planning), e.g. if you know that you
> will update your machine or the code soon.
On 14.05.19 09:28, David Hildenbrand wrote:
[...]
> However, I still think that implementing support for more than one SCLP
> response page is the best solution. Guests will need adaptations for > 240
> CPUs with Diag318, but who cares? Existing setups will continue to work.
>
> Implementing that SCLP thingy will avoid any warnings and any errors. It
> just works from the QEMU perspective.
>
> Is implementing this realistic?

Yes it is, but it will take time. I will try to get this rolling. To make
progress on the diag318 thing, can we error on startup now and simply
remove that check when we have implemented a larger sccb? If we now did
all kinds of "change the max number" games, it would be harder to "fix".
On Tue, 14 May 2019 10:37:32 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:
[...]
> Yes it is, but it will take time. I will try to get this rolling. To make
> progress on the diag318 thing, can we error on startup now and simply
> remove that check when we have implemented a larger sccb? If we now did
> all kinds of "change the max number" games, it would be harder to "fix".

So, the idea right now is:

- fail to start if you try to specify a diag318 device and more than
  240 cpus (do we need a knob to turn off the device?)
- in the future, support more than one SCLP response page

I'm getting a bit lost in the discussion; but the above sounds
reasonable to me.
On 14.05.19 10:37, Christian Borntraeger wrote:
[...]
> Yes it is, but it will take time. I will try to get this rolling. To make
> progress on the diag318 thing, can we error on startup now and simply
> remove that check when we have implemented a larger sccb? If we now did
> all kinds of "change the max number" games, it would be harder to "fix".

Another idea for temporary handling: Simply only indicate 240 CPUs to
the guest if the response does not fit into a page. Once we have that
SCLP thingy, this will be fixed. Guest migration back and forth should
work, as the VCPUs are fully functional (and initially always stopped);
the guest will simply not be able to detect them via SCLP when booting
up, and therefore not use them.
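A sketch of that temporary clamp in the Read Info path, assuming the ReadInfo/CPUEntry layout quoted later in the thread; MIN() and ms->possible_cpus are existing QEMU helpers, the rest is illustrative:

```c
/* Report only as many CPU entries as fit into the single-page SCCB.
 * Extra VCPUs remain fully functional (and initially stopped); the
 * guest just cannot discover them via SCLP at boot. */
int max_entries = (SCCB_SIZE - offsetof(ReadInfo, entries)) /
                  sizeof(CPUEntry);
int reported = MIN(ms->possible_cpus->len, max_entries);

read_info->entries_cpu = cpu_to_be16(reported);
```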
On 14.05.19 10:49, Cornelia Huck wrote:
[...]
> So, the idea right now is:
>
> - fail to start if you try to specify a diag318 device and more than
>   240 cpus (do we need a knob to turn off the device?)

The knob will be the cpu model, e.g. -cpu z14,diag318=off or something
like that.

> - in the future, support more than one SCLP response page
>
> I'm getting a bit lost in the discussion; but the above sounds
> reasonable to me.
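For completeness, a hypothetical sketch of such a feature definition, modeled on the FEAT_INIT() entries in target/s390x/cpu_features_def.h; the type constant S390_FEAT_TYPE_SCLP_FAC134 and the description text are assumptions, since no such entry exists yet at this point:

```c
/* Hypothetical entry; would make -cpu z14,diag318=off work like any
 * other feature toggle and let host-model migration checks cover it. */
FEAT_INIT("diag318", S390_FEAT_TYPE_SCLP_FAC134, 0,
          "Control program name and version codes (diag 318)"),
```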
On 14.05.19 10:50, David Hildenbrand wrote:
[...]
> Another idea for temporary handling: Simply only indicate 240 CPUs to
> the guest if the response does not fit into a page. Once we have that
> SCLP thingy, this will be fixed. Guest migration back and forth should
> work, as the VCPUs are fully functional (and initially always stopped);
> the guest will simply not be able to detect them via SCLP when booting
> up, and therefore not use them.

Yes, that looks like a good temporary solution. In fact, if the guest
relies on simply probing, it could even make use of the additional CPUs.
It's just the sclp response that is limited to 240 (or make it 247?).
On 14.05.19 10:49, Cornelia Huck wrote:
[...]
> So, the idea right now is:
>
> - fail to start if you try to specify a diag318 device and more than
>   240 cpus (do we need a knob to turn off the device?)
> - in the future, support more than one SCLP response page
>
> I'm getting a bit lost in the discussion; but the above sounds
> reasonable to me.

We can

1. Fail to start with #cpus > 240 when diag318=on
2. Remove the error once we support more than one SCLP response page

Or

1. Allow to start with #cpus > 240 when diag318=on, but indicate only
   240 CPUs via SCLP
2. Print a warning
3. Remove the restriction and the warning once we support more than one
   SCLP response page

While I prefer the second approach (similar to defining zPCI devices
without zpci=on), I could also live with the first approach.
On Tue, 14 May 2019 10:56:43 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> On 14.05.19 10:50, David Hildenbrand wrote:
> > Another idea for temporary handling: Simply only indicate 240 CPUs to
> > the guest if the response does not fit into a page. [...]
>
> Yes, that looks like a good temporary solution. In fact, if the guest
> relies on simply probing, it could even make use of the additional CPUs.
> It's just the sclp response that is limited to 240 (or make it 247?).

Where did the 240 come from - extra spare room? If so, 247 would
probably be all right?
On 14.05.19 10:56, Christian Borntraeger wrote:
[...]
> Yes, that looks like a good temporary solution. In fact, if the guest
> relies on simply probing, it could even make use of the additional CPUs.
> It's just the sclp response that is limited to 240 (or make it 247?).

I think the limiting factor was more than a single CPU, but I don't
recall. We can do the math again and come up with the right number.
On 14.05.19 11:00, Cornelia Huck wrote:
[...]
> Where did the 240 come from - extra spare room? If so, 247 would
> probably be all right?

+++ b/include/hw/s390x/sclp.h
@@ -133,6 +133,8 @@ typedef struct ReadInfo {
     uint16_t highest_cpu;
     uint8_t  _reserved5[124 - 122];     /* 122-123 */
     uint32_t hmfai;
+    uint8_t  _reserved7[134 - 128];     /* 128-133 */
+    uint8_t  fac134;
     struct CPUEntry entries[0];
 } QEMU_PACKED ReadInfo;

So we have "4096 - 135 + 1" memory. Each element is 16 bytes wide.
-> 246 CPUs fit.
On 14.05.19 11:00, David Hildenbrand wrote:
[...]
> I think the limiting factor was more than a single CPU, but I don't
> recall. We can do the math again and come up with the right number.

I think we need 8 bytes per CPU. With byte 134 we should still be ok with
247. Collin can do the math in the patch description.
On 14.05.19 11:03, David Hildenbrand wrote:
[...]
> So we have "4096 - 135 + 1" memory. Each element is 16 bytes wide.
> -> 246 CPUs fit.

(I meant 247 :( )
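Redoing that math as a self-contained check (offsets as in the header fragment above: the CPU entries start right after facility byte 134, i.e. at byte 135, and each CPUEntry is 16 bytes) confirms the corrected figure of 247:

```c
#include <stdio.h>

int main(void)
{
    const int sccb_size = 4096;     /* one-page SCCB */
    const int entries_offset = 135; /* first byte after fac134 above */
    const int entry_size = 16;      /* sizeof(struct CPUEntry) */

    /* (4096 - 135) / 16 = 247 CPU entries fit in one page */
    printf("%d\n", (sccb_size - entries_offset) / entry_size);
    return 0;
}
```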
On 14.05.19 10:59, David Hildenbrand wrote:
[...]
> We can
>
> 1. Fail to start with #cpus > 240 when diag318=on
> 2. Remove the error once we support more than one SCLP response page
>
> Or
>
> 1. Allow to start with #cpus > 240 when diag318=on, but indicate only
>    240 CPUs via SCLP
> 2. Print a warning
> 3. Remove the restriction and the warning once we support more than one
>    SCLP response page
>
> While I prefer the second approach (similar to defining zPCI devices
> without zpci=on), I could also live with the first approach.

Let's just continue with your other suggestion to simply limit the sclp
response and not do any failure or machine change. That seems like the
easiest solution.
On 14.05.19 10:59, David Hildenbrand wrote:
[...]
> While I prefer the second approach (similar to defining zPCI devices
> without zpci=on), I could also live with the first approach.

I prefer approach 1.
On Tue, 14 May 2019 11:07:41 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> On 14.05.19 10:59, David Hildenbrand wrote:
> > We can
> >
> > 1. Fail to start with #cpus > 240 when diag318=on
> > 2. Remove the error once we support more than one SCLP response page
> >
> > Or
> >
> > 1. Allow to start with #cpus > 240 when diag318=on, but indicate only
> >    240 CPUs via SCLP
> > 2. Print a warning
> > 3. Remove the restriction and the warning once we support more than one
> >    SCLP response page

We'd need compat handling for step 3, then?

> > While I prefer the second approach (similar to defining zPCI devices
> > without zpci=on), I could also live with the first approach.
>
> Let's just continue with your other suggestion to simply limit the sclp
> response and not do any failure or machine change. That seems like the
> easiest solution.

That's the second option, right? Should be reasonable.
On 14.05.19 11:10, Christian Borntraeger wrote:
[...]
>> We can
>>
>> 1. Fail to start with #cpus > 240 when diag318=on
>> 2. Remove the error once we support more than one SCLP response page
>>
>> Or
>>
>> 1. Allow to start with #cpus > 240 when diag318=on, but indicate only
>>    240 CPUs via SCLP
>> 2. Print a warning
>> 3. Remove the restriction and the warning once we support more than one
>>    SCLP response page
>>
>> While I prefer the second approach (similar to defining zPCI devices
>> without zpci=on), I could also live with the first approach.
>
> I prefer approach 1.

Isn't approach #2 what we discussed (limiting sclp, but of course to 247
CPUs), but with an additional warning? I'm confused.
On 14.05.19 11:20, David Hildenbrand wrote:
> On 14.05.19 11:10, Christian Borntraeger wrote:
> [...]
>>> While I prefer the second approach (similar to defining zPCI devices
>>> without zpci=on), I could also live with the first approach.
>>
>> I prefer approach 1.
>
> Isn't approach #2 what we discussed (limiting sclp, but of course to 247
> CPUs), but with an additional warning? I'm confused.

Different numbering interpretation. I was talking about 1 = "Allow to start with #cpus > 240 when diag318=on, but indicate only 240 CPUs via SCLP"
On 14.05.19 11:23, Christian Borntraeger wrote:
> On 14.05.19 11:20, David Hildenbrand wrote:
> [...]
>> Isn't approach #2 what we discussed (limiting sclp, but of course to 247
>> CPUs), but with an additional warning? I'm confused.
>
> Different numbering interpretation. I was talking about 1 = "Allow to start
> with #cpus > 240 when diag318=on, but indicate only 240 CPUs via SCLP"

So yes, variant 2 when I use your numbering. The only question is: do we need a warning? It probably does not hurt.
On 14.05.19 11:25, Christian Borntraeger wrote:
> [...]
> So yes, variant 2 when I use your numbering. The only question is: do we
> need a warning? It probably does not hurt.

After all, we are talking about 1 VCPU that the guest can only use by indirect probing ... I leave that up to Collin :)
On Tue, 14 May 2019 11:27:32 +0200 David Hildenbrand <david@redhat.com> wrote:
> On 14.05.19 11:25, Christian Borntraeger wrote:
> [...]
> > So yes, variant 2 when I use your numbering. The only question is: do we
> > need a warning? It probably does not hurt.
>
> After all, we are talking about 1 VCPU that the guest can only use by
> indirect probing ... I leave that up to Collin :)

I'd prefer a warning... even if it is a corner case, I think it's better to be explicit instead of silent.
On 5/14/19 5:04 AM, Christian Borntraeger wrote:
> On 14.05.19 11:00, David Hildenbrand wrote:
> [...]
>>>> Another idea for temporary handling: Simply only indicate 240 CPUs to
>>>> the guest if the response does not fit into a page. Once we have that
>>>> SCLP thingy, this will be fixed. Guest migration back and forth should
>>>> work, as the VCPUs are fully functional (and initially always stopped);
>>>> the guest will simply not be able to detect them via SCLP when booting
>>>> up, and therefore not use them.
>>>
>>> Yes, that looks like a good temporary solution. In fact, if the guest
>>> relies on simply probing, it could even make use of the additional CPUs.
>>> It's just the sclp response that is limited to 240 (or make it 247?)
>>
>> I think the limiting factor was more than a single CPU, but I don't
>> recall. We can do the math again and come up with the right number.
>
> I think we need 8 bytes per CPU. With byte 134 we should still be ok with
> 247. Collin can do the math in the patch description.

Yes, 247 fits just fine. The 240 came up as extra space in case we expand the Read SCP Info response even more in the future.
I used the (SCCB_SIZE - sizeof(ReadInfo)) / sizeof(CPUEntry) < S390_MAX_CPUS calculation as an example of what we could do if SCCB_SIZE ever increases (it would allow older machines to retroactively allow diag318 if they can also support a larger SCCB... but, thinking out loud, there might be many more moving parts that would make this preemptive approach too tricky to implement today).
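[Editorial note: to make the arithmetic above concrete, here is a small standalone sketch of that calculation. The sizes below are assumptions chosen to match the discussion (a single 4 KiB SCCB page, a ReadInfo block padded out to 136 bytes once facility byte 134 is in use, and 16-byte CPU entries); they are illustrative, not quoted from the QEMU headers.]

#include <stdio.h>

/* Assumed sizes, not taken from the QEMU headers: */
#define SCCB_SIZE       4096  /* one page for the SCLP response */
#define READ_INFO_SIZE   136  /* ReadInfo grown past facility byte 134 */
#define CPU_ENTRY_SIZE    16  /* one SCLP CPU entry */

int main(void)
{
    /* How many CPU entries still fit behind the ReadInfo header. */
    int max_entries = (SCCB_SIZE - READ_INFO_SIZE) / CPU_ENTRY_SIZE;

    printf("CPU entries per single-page SCCB: %d\n", max_entries);
    /* Prints 247, matching the number settled on in the discussion. */
    return 0;
}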
On 5/14/19 5:30 AM, Cornelia Huck wrote:
> [...]
>> After all, we are talking about 1 VCPU that the guest can only use by
>> indirect probing ... I leave that up to Collin :)
>
> I'd prefer a warning... even if it is a corner case, I think it's
> better to be explicit instead of silent.

Thanks for the discussion. I'll implement diag318 as a CPU feature again and adjust the max_cpu handling (to 247 now instead of 240).

Question: should diag318 be enabled by default for current CPU models? What should we do in the case where diag318=on but KVM does not support handling?

I am not familiar with how close QEMU and KVM versions stay in sync.
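[Editorial note: re-implementing diag318 as a CPU model feature would roughly mean gating the SCLP bit on the model instead of probing KVM directly, so it could be disabled on the command line (e.g. "-cpu host,diag318=off"). A hedged sketch follows; s390_has_feat() is the existing QEMU helper, but S390_FEAT_DIAG318, the helper name, and the include set are assumptions, not the final patch.]

#include "hw/s390x/sclp.h"   /* ReadInfo (assumed include path) */
#include "cpu_models.h"      /* s390_has_feat() (assumed include) */

/*
 * Hypothetical follow-up: once diag318 is a CPU model feature, the SCLP
 * Read Info code consults the model rather than calling into KVM.
 * S390_FEAT_DIAG318 is an assumed enum name for illustration only.
 */
static void read_info_set_diag318(ReadInfo *read_info)
{
    if (s390_has_feat(S390_FEAT_DIAG318)) {
        read_info->fac134 = 0x80;   /* byte 134, bit 0 */
    }
}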
On 16.05.19 15:35, Collin Walling wrote:
> [...]
> Thanks for the discussion. I'll implement diag318 as a CPU feature again
> and adjust the max_cpu handling (to 247 now instead of 240).

Well, the max cpu count is 248; we decided to just not report cpus > 247 to the guest via sclp.

> Question: should diag318 be enabled by default for current CPU models?
> What should we do in the case where diag318=on but KVM does not support
> handling?

Let's keep it in the full model for now. We can change that later on.

> I am not familiar with how close QEMU and KVM versions stay in sync.
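[Editorial note: putting the agreed interim behavior together (start normally, report at most 247 CPUs via Read SCP Info, and warn), a sketch could look like the following. SCCB_SIZE, ReadInfo, CPUEntry, and warn_report() exist in QEMU, but the helper, its name, and the message text are assumptions, not the final patch.]

#include "qemu/osdep.h"
#include "qemu/error-report.h"   /* warn_report() */
#include "hw/s390x/sclp.h"       /* SCCB_SIZE, ReadInfo, CPUEntry */

/* CPU entries that fit in a single-page SCCB behind the ReadInfo header. */
#define SCLP_CPU_ENTRY_MAX \
    ((SCCB_SIZE - sizeof(ReadInfo)) / sizeof(CPUEntry))

/*
 * Hypothetical helper for read_SCP_info(): clamp the reported CPU count
 * and warn, rather than failing to start or changing max_cpus.
 */
static unsigned int sclp_reportable_cpus(unsigned int configured)
{
    if (configured > SCLP_CPU_ENTRY_MAX) {
        warn_report("SCLP can only report %zu of %u CPUs to the guest",
                    (size_t)SCLP_CPU_ENTRY_MAX, configured);
        return SCLP_CPU_ENTRY_MAX;
    }
    return configured;
}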
diff --git a/hw/s390x/Makefile.objs b/hw/s390x/Makefile.objs
index e02ed80..93621dc 100644
--- a/hw/s390x/Makefile.objs
+++ b/hw/s390x/Makefile.objs
@@ -34,3 +34,4 @@ obj-$(CONFIG_KVM) += s390-stattrib-kvm.o
 obj-y += s390-ccw.o
 obj-y += ap-device.o
 obj-y += ap-bridge.o
+obj-y += diag318.o
diff --git a/hw/s390x/diag318.c b/hw/s390x/diag318.c
new file mode 100644
index 0000000..94b44da
--- /dev/null
+++ b/hw/s390x/diag318.c
@@ -0,0 +1,100 @@
+/*
+ * DIAGNOSE 0x318 functions for reset and migration
+ *
+ * Copyright IBM, Corp. 2019
+ *
+ * Authors:
+ *  Collin Walling <walling@linux.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at your
+ * option) any later version. See the COPYING file in the top-level directory.
+ */
+
+#include "hw/s390x/diag318.h"
+#include "qapi/error.h"
+#include "kvm_s390x.h"
+#include "sysemu/kvm.h"
+
+static int diag318_post_load(void *opaque, int version_id)
+{
+    DIAG318State *d = opaque;
+
+    kvm_s390_set_cpc(d->cpc);
+
+    /* It is not necessary to retain a copy of the cpc after migration. */
+    d->cpc = 0;
+
+    return 0;
+}
+
+static int diag318_pre_save(void *opaque)
+{
+    DIAG318State *d = opaque;
+
+    kvm_s390_get_cpc(&d->cpc);
+    return 0;
+}
+
+static bool diag318_needed(void *opaque)
+{
+    DIAG318State *d = opaque;
+
+    return d->enabled;
+}
+
+const VMStateDescription vmstate_diag318 = {
+    .name = "vmstate_diag318",
+    .post_load = diag318_post_load,
+    .pre_save = diag318_pre_save,
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = diag318_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT64(cpc, DIAG318State),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static void s390_diag318_realize(DeviceState *dev, Error **errp)
+{
+    DIAG318State *d = DIAG318(dev);
+
+    if (kvm_s390_has_diag318()) {
+        d->enabled = true;
+    }
+}
+
+static void s390_diag318_reset(DeviceState *dev)
+{
+    DIAG318State *d = DIAG318(dev);
+
+    if (d->enabled) {
+        kvm_s390_set_cpc(0);
+    }
+}
+
+static void s390_diag318_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    dc->realize = s390_diag318_realize;
+    dc->reset = s390_diag318_reset;
+    dc->vmsd = &vmstate_diag318;
+    dc->hotpluggable = false;
+    /* Reason: Set automatically during IPL */
+    dc->user_creatable = false;
+}
+
+static const TypeInfo s390_diag318_info = {
+    .class_init = s390_diag318_class_init,
+    .parent = TYPE_DEVICE,
+    .name = TYPE_S390_DIAG318,
+    .instance_size = sizeof(DIAG318State),
+};
+
+static void s390_diag318_register_types(void)
+{
+    type_register_static(&s390_diag318_info);
+}
+
+type_init(s390_diag318_register_types)
diff --git a/hw/s390x/diag318.h b/hw/s390x/diag318.h
new file mode 100644
index 0000000..c154b0a
--- /dev/null
+++ b/hw/s390x/diag318.h
@@ -0,0 +1,39 @@
+/*
+ * DIAGNOSE 0x318 functions for reset and migration
+ *
+ * Copyright IBM, Corp. 2019
+ *
+ * Authors:
+ *  Collin Walling <walling@linux.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at your
+ * option) any later version. See the COPYING file in the top-level directory.
+ */
+
+#ifndef DIAG318_H
+#define DIAG318_H
+
+#include "qemu/osdep.h"
+#include "hw/qdev.h"
+
+#define TYPE_S390_DIAG318 "diag318"
+#define DIAG318(obj) \
+    OBJECT_CHECK(DIAG318State, (obj), TYPE_S390_DIAG318)
+
+typedef struct DIAG318State {
+    /*< private >*/
+    DeviceState parent_obj;
+
+    /*< public >*/
+    uint64_t cpc;
+    bool enabled;
+} DIAG318State;
+
+typedef struct DIAG318Class {
+    /*< private >*/
+    DeviceClass parent_class;
+
+    /*< public >*/
+} DIAG318Class;
+
+#endif /* DIAG318_H */
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index d11069b..44a424b 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -36,6 +36,7 @@
 #include "cpu_models.h"
 #include "hw/nmi.h"
 #include "hw/s390x/tod.h"
+#include "hw/s390x/diag318.h"
 
 S390CPU *s390_cpu_addr2state(uint16_t cpu_addr)
 {
@@ -92,6 +93,7 @@ static const char *const reset_dev_types[] = {
     "s390-sclp-event-facility",
     "s390-flic",
     "diag288",
+    TYPE_S390_DIAG318,
 };
 
 static void subsystem_reset(void)
@@ -246,6 +248,17 @@ static void s390_create_sclpconsole(const char *type, Chardev *chardev)
     qdev_init_nofail(dev);
 }
 
+static void s390_init_diag318(void)
+{
+    Object *new = object_new(TYPE_S390_DIAG318);
+    DeviceState *dev = DEVICE(new);
+
+    object_property_add_child(qdev_get_machine(), TYPE_S390_DIAG318,
+                              new, NULL);
+    object_unref(new);
+    qdev_init_nofail(dev);
+}
+
 static void ccw_init(MachineState *machine)
 {
     int ret;
@@ -302,6 +315,8 @@ static void ccw_init(MachineState *machine)
 
     /* init the TOD clock */
     s390_init_tod();
+
+    s390_init_diag318();
 }
 
 static void s390_cpu_plug(HotplugHandler *hotplug_dev,
@@ -570,6 +585,7 @@ static void machine_set_loadparm(Object *obj, const char *val, Error **errp)
         ms->loadparm[i] = ' '; /* pad right with spaces */
     }
 }
+
 static inline void s390_machine_initfn(Object *obj)
 {
     object_property_add_bool(obj, "aes-key-wrap",
@@ -652,6 +668,13 @@ static void ccw_machine_4_0_instance_options(MachineState *machine)
 
 static void ccw_machine_4_0_class_options(MachineClass *mc)
 {
+    /*
+     * Read Info might reveal more bytes used to detect facilities, thus
+     * reducing the number of CPU entries. Let's reduce the max CPUs by
+     * an arbitrary number in effort to anticipate future facility bytes.
+     */
+    if ((SCCB_SIZE - sizeof(ReadInfo)) / sizeof(CPUEntry) < S390_MAX_CPUS)
+        mc->max_cpus = S390_MAX_CPUS - 8;
 }
 DEFINE_CCW_MACHINE(4_0, "4.0", true);
diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index 4510a80..9cfa188 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -22,6 +22,7 @@
 #include "hw/s390x/event-facility.h"
 #include "hw/s390x/s390-pci-bus.h"
 #include "hw/s390x/ipl.h"
+#include "kvm_s390x.h"
 
 static inline SCLPDevice *get_sclp_device(void)
 {
@@ -74,6 +75,10 @@ static void read_SCP_info(SCLPDevice *sclp, SCCB *sccb)
         s390_get_feat_block(S390_FEAT_TYPE_SCLP_CONF_CHAR_EXT,
                             read_info->conf_char_ext);
 
+    /* Enable diag318 for guest if KVM supports emulation */
+    if (kvm_s390_has_diag318())
+        read_info->fac134 = 0x80;
+
     read_info->facilities = cpu_to_be64(SCLP_HAS_CPU_INFO |
                                         SCLP_HAS_IOA_RECONFIG);
diff --git a/include/hw/s390x/sclp.h b/include/hw/s390x/sclp.h
index f9db243..d47e10a 100644
--- a/include/hw/s390x/sclp.h
+++ b/include/hw/s390x/sclp.h
@@ -133,6 +133,8 @@ typedef struct ReadInfo {
     uint16_t highest_cpu;
     uint8_t  _reserved5[124 - 122];     /* 122-123 */
     uint32_t hmfai;
+    uint8_t  _reserved7[134 - 128];     /* 128-133 */
+    uint8_t  fac134;
     struct CPUEntry entries[0];
 } QEMU_PACKED ReadInfo;
diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h
index 0265482..735f5a4 100644
--- a/linux-headers/asm-s390/kvm.h
+++ b/linux-headers/asm-s390/kvm.h
@@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
 #define KVM_S390_VM_CRYPTO        2
 #define KVM_S390_VM_CPU_MODEL     3
 #define KVM_S390_VM_MIGRATION     4
+#define KVM_S390_VM_MISC          5
 
 /* kvm attributes for mem_ctrl */
 #define KVM_S390_VM_MEM_ENABLE_CMMA  0
@@ -168,6 +169,9 @@ struct kvm_s390_vm_cpu_subfunc {
 #define KVM_S390_VM_MIGRATION_START   1
 #define KVM_S390_VM_MIGRATION_STATUS  2
 
+/* kvm attributes for KVM_S390_VM_MISC */
+#define KVM_S390_VM_MISC_CPC  0
+
 /* for KVM_GET_REGS and KVM_SET_REGS */
 struct kvm_regs {
 	/* general purpose regs for s390 */
diff --git a/target/s390x/kvm-stub.c b/target/s390x/kvm-stub.c
index bf7795e..7861ccd 100644
--- a/target/s390x/kvm-stub.c
+++ b/target/s390x/kvm-stub.c
@@ -104,3 +104,18 @@ void kvm_s390_stop_interrupt(S390CPU *cpu)
 void kvm_s390_restart_interrupt(S390CPU *cpu)
 {
 }
+
+int kvm_s390_get_cpc(uint64_t *cpc)
+{
+    return 0;
+}
+
+int kvm_s390_set_cpc(uint64_t cpc)
+{
+    return 0;
+}
+
+bool kvm_s390_has_diag318(void)
+{
+    return false;
+}
diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index 19530fb..225e516 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -747,6 +747,38 @@ int kvm_s390_set_clock_ext(uint8_t tod_high, uint64_t tod_low)
     return kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
 }
 
+int kvm_s390_get_cpc(uint64_t *cpc)
+{
+    struct kvm_device_attr attr = {
+        .group = KVM_S390_VM_MISC,
+        .attr = KVM_S390_VM_MISC_CPC,
+        .addr = (uint64_t)cpc,
+    };
+
+    return kvm_vm_ioctl(kvm_state, KVM_GET_DEVICE_ATTR, &attr);
+}
+
+int kvm_s390_set_cpc(uint64_t cpc)
+{
+    struct kvm_device_attr attr = {
+        .group = KVM_S390_VM_MISC,
+        .attr = KVM_S390_VM_MISC_CPC,
+        .addr = (uint64_t)&cpc,
+    };
+
+    return kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
+}
+
+bool kvm_s390_has_diag318(void)
+{
+    struct kvm_device_attr attr = {
+        .group = KVM_S390_VM_MISC,
+        .attr = KVM_S390_VM_MISC_CPC,
+    };
+
+    return kvm_vm_ioctl(kvm_state, KVM_HAS_DEVICE_ATTR, &attr) == 0;
+}
+
 /**
  * kvm_s390_mem_op:
  * @addr: the logical start address in guest memory
diff --git a/target/s390x/kvm_s390x.h b/target/s390x/kvm_s390x.h
index 6e52287..53f165f 100644
--- a/target/s390x/kvm_s390x.h
+++ b/target/s390x/kvm_s390x.h
@@ -29,6 +29,9 @@ int kvm_s390_get_clock(uint8_t *tod_high, uint64_t *tod_clock);
 int kvm_s390_get_clock_ext(uint8_t *tod_high, uint64_t *tod_clock);
 int kvm_s390_set_clock(uint8_t tod_high, uint64_t tod_clock);
 int kvm_s390_set_clock_ext(uint8_t tod_high, uint64_t tod_clock);
+int kvm_s390_get_cpc(uint64_t *cpc);
+int kvm_s390_set_cpc(uint64_t cpc);
+bool kvm_s390_has_diag318(void);
 void kvm_s390_enable_css_support(S390CPU *cpu);
 int kvm_s390_assign_subch_ioeventfd(EventNotifier *notifier, uint32_t sch,
                                     int vq, bool assign);
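[Editorial note: for context on the guest half of the interface, the guest loads the CPC value into a general register and issues the diagnose. Below is a hedged sketch of what a guest kernel might do; the 8-bit CPNC / 56-bit CPVC split shown is an assumption for illustration, and the architecture documentation defines the real layout.]

#include <stdint.h>

/* Assumed CPC layout: control program name code (CPNC) in the top byte,
 * control program version code (CPVC) in the remaining 56 bits.
 */
union diag318_cpc {
    uint64_t val;
    struct {
        uint64_t cpnc : 8;
        uint64_t cpvc : 56;
    };
};

static inline void diag318_set_cpc(uint64_t cpc)
{
    /* DIAGNOSE 0x318 is privileged: under KVM it is intercepted and the
     * value is kept in the kernel; QEMU then reads/writes that value
     * through the KVM_S390_VM_MISC_CPC attribute added by this patch
     * for migration and reset.
     */
    asm volatile("diag %0,0,0x318\n" : : "d" (cpc));
}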