Message ID | 20211111162746.100598-2-vkuznets@redhat.com |
---|---|
State | New |
Headers | show |
Series | KVM: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS and re-purpose it on x86 | expand |
Hi Vitaly, On 2021-11-11 16:27, Vitaly Kuznetsov wrote: > It doesn't make sense to return the recommended maximum number of > vCPUs which exceeds the maximum possible number of vCPUs. > > Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> > --- > arch/arm64/kvm/arm.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > index 7838e9fb693e..391dc7a921d5 100644 > --- a/arch/arm64/kvm/arm.c > +++ b/arch/arm64/kvm/arm.c > @@ -223,7 +223,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, > long ext) > r = 1; > break; > case KVM_CAP_NR_VCPUS: > - r = num_online_cpus(); > + if (kvm) > + r = min_t(unsigned int, num_online_cpus(), > + kvm->arch.max_vcpus); > + else > + r = min_t(unsigned int, num_online_cpus(), > + kvm_arm_default_max_vcpus()); > break; > case KVM_CAP_MAX_VCPUS: > case KVM_CAP_MAX_VCPU_ID: This looks odd. This means that depending on the phase userspace is in while initialising the VM, KVM_CAP_NR_VCPUS can return one thing or the other. For example, I create a VM on a 32 CPU system, NR_VCPUS says 32. I create a GICv2 interrupt controller, it now says 8. That's a change in behaviour that is visible by userspace, which I'm keen on avoiding. I'd rather have the kvm and !kvm cases return the same thing. Thanks, M.
Marc Zyngier <maz@kernel.org> writes: > Hi Vitaly, > > On 2021-11-11 16:27, Vitaly Kuznetsov wrote: >> It doesn't make sense to return the recommended maximum number of >> vCPUs which exceeds the maximum possible number of vCPUs. >> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> >> --- >> arch/arm64/kvm/arm.c | 7 ++++++- >> 1 file changed, 6 insertions(+), 1 deletion(-) >> >> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c >> index 7838e9fb693e..391dc7a921d5 100644 >> --- a/arch/arm64/kvm/arm.c >> +++ b/arch/arm64/kvm/arm.c >> @@ -223,7 +223,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, >> long ext) >> r = 1; >> break; >> case KVM_CAP_NR_VCPUS: >> - r = num_online_cpus(); >> + if (kvm) >> + r = min_t(unsigned int, num_online_cpus(), >> + kvm->arch.max_vcpus); >> + else >> + r = min_t(unsigned int, num_online_cpus(), >> + kvm_arm_default_max_vcpus()); >> break; >> case KVM_CAP_MAX_VCPUS: >> case KVM_CAP_MAX_VCPU_ID: > > This looks odd. This means that depending on the phase userspace is > in while initialising the VM, KVM_CAP_NR_VCPUS can return one thing > or the other. > > For example, I create a VM on a 32 CPU system, NR_VCPUS says 32. > I create a GICv2 interrupt controller, it now says 8. > > That's a change in behaviour that is visible by userspace Yes, I realize this is a userspace visible change. The reason I suggest it is that logically, it seems very odd that the maximum recommended number of vCPUs (KVM_CAP_NR_VCPUS) can be higher, than the maximum supported number of vCPUs (KVM_CAP_MAX_VCPUS). All userspaces which use this information somehow should already contain some workaround for this case. (maybe it's a rare one and nobody hit it yet or maybe there are no userspaces using KVM_CAP_NR_VCPUS for anything besides complaining -- like QEMU). I'd like KVM to be consistent across architectures and have the same (similar) meaning for KVM_CAP_NR_VCPUS. > which I'm keen on avoiding. I'd rather have the kvm and !kvm cases > return the same thing. Forgive me my (ARM?) ignorance but what would it be then? If we go for min(num_online_cpus(), kvm_arm_default_max_vcpus()) in both cases, cat this can still go above KVM_CAP_MAX_VCPUS after vGIC is created? Thanks for the feedback!
On Fri, Nov 12, 2021 at 10:51:10AM +0100, Vitaly Kuznetsov wrote: > Marc Zyngier <maz@kernel.org> writes: > > > Hi Vitaly, > > > > On 2021-11-11 16:27, Vitaly Kuznetsov wrote: > >> It doesn't make sense to return the recommended maximum number of > >> vCPUs which exceeds the maximum possible number of vCPUs. > >> > >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> > >> --- > >> arch/arm64/kvm/arm.c | 7 ++++++- > >> 1 file changed, 6 insertions(+), 1 deletion(-) > >> > >> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > >> index 7838e9fb693e..391dc7a921d5 100644 > >> --- a/arch/arm64/kvm/arm.c > >> +++ b/arch/arm64/kvm/arm.c > >> @@ -223,7 +223,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, > >> long ext) > >> r = 1; > >> break; > >> case KVM_CAP_NR_VCPUS: > >> - r = num_online_cpus(); > >> + if (kvm) > >> + r = min_t(unsigned int, num_online_cpus(), > >> + kvm->arch.max_vcpus); > >> + else > >> + r = min_t(unsigned int, num_online_cpus(), > >> + kvm_arm_default_max_vcpus()); > >> break; > >> case KVM_CAP_MAX_VCPUS: > >> case KVM_CAP_MAX_VCPU_ID: > > > > This looks odd. This means that depending on the phase userspace is > > in while initialising the VM, KVM_CAP_NR_VCPUS can return one thing > > or the other. > > > > For example, I create a VM on a 32 CPU system, NR_VCPUS says 32. > > I create a GICv2 interrupt controller, it now says 8. > > > > That's a change in behaviour that is visible by userspace > > Yes, I realize this is a userspace visible change. The reason I suggest > it is that logically, it seems very odd that the maximum recommended > number of vCPUs (KVM_CAP_NR_VCPUS) can be higher, than the maximum > supported number of vCPUs (KVM_CAP_MAX_VCPUS). All userspaces which use > this information somehow should already contain some workaround for this > case. (maybe it's a rare one and nobody hit it yet or maybe there are no > userspaces using KVM_CAP_NR_VCPUS for anything besides complaining -- > like QEMU). > > I'd like KVM to be consistent across architectures and have the same > (similar) meaning for KVM_CAP_NR_VCPUS. KVM_CAP_NR_VCPUS seems pretty useless if we just want to tell userspace the same thing it can get with _SC_NPROCESSORS_ONLN. In fact, if userspace knows something we don't about the future onlining of some pcpus, then maybe userspace would prefer to check _SC_NPROCESSORS_CONF. > > > which I'm keen on avoiding. I'd rather have the kvm and !kvm cases > > return the same thing. > > Forgive me my (ARM?) ignorance but what would it be then? If we go for > min(num_online_cpus(), kvm_arm_default_max_vcpus()) in both cases, cat > this can still go above KVM_CAP_MAX_VCPUS after vGIC is created? So the GIC version case looks like the type of thing that could make KVM_CAP_NR_VCPUS useful, i.e. being able to tell userspace a maximum number of vcpus supported for a given configuration. However, even in that case the concept of "recommended" number doesn't make sense, because, for the GICv2 example, a VM cannot configure more than 8 VCPUs, so it's a real limit, not a recommendation. Maybe KVM_CAP_NR_VCPUS should just be left alone, but deprecated, and, if there's need, a new CAP could be created for a config-vcpu-max. Thanks, drew
On 11/12/21 11:38, Andrew Jones wrote: >> >> I'd like KVM to be consistent across architectures and have the same >> (similar) meaning for KVM_CAP_NR_VCPUS. > KVM_CAP_NR_VCPUS seems pretty useless if we just want to tell userspace > the same thing it can get with _SC_NPROCESSORS_ONLN. In fact, if userspace > knows something we don't about the future onlining of some pcpus, then > maybe userspace would prefer to check _SC_NPROCESSORS_CONF. It's the same for most architectures, but for example PPC has to take into account the handling of threads, which can exist while the VMs run but not in the host. So KVM_CAP_NR_VCPUS on PPC is _SC_NPROCESSORS_CONF times the number of threads per core. If you don't count PPC (not sure about s390), it _is_ pretty useless indeed. Paolo
On Fri, 12 Nov 2021 09:51:10 +0000, Vitaly Kuznetsov <vkuznets@redhat.com> wrote: > > Marc Zyngier <maz@kernel.org> writes: > > > Hi Vitaly, > > > > On 2021-11-11 16:27, Vitaly Kuznetsov wrote: > >> It doesn't make sense to return the recommended maximum number of > >> vCPUs which exceeds the maximum possible number of vCPUs. > >> > >> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> > >> --- > >> arch/arm64/kvm/arm.c | 7 ++++++- > >> 1 file changed, 6 insertions(+), 1 deletion(-) > >> > >> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > >> index 7838e9fb693e..391dc7a921d5 100644 > >> --- a/arch/arm64/kvm/arm.c > >> +++ b/arch/arm64/kvm/arm.c > >> @@ -223,7 +223,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, > >> long ext) > >> r = 1; > >> break; > >> case KVM_CAP_NR_VCPUS: > >> - r = num_online_cpus(); > >> + if (kvm) > >> + r = min_t(unsigned int, num_online_cpus(), > >> + kvm->arch.max_vcpus); > >> + else > >> + r = min_t(unsigned int, num_online_cpus(), > >> + kvm_arm_default_max_vcpus()); > >> break; > >> case KVM_CAP_MAX_VCPUS: > >> case KVM_CAP_MAX_VCPU_ID: > > > > This looks odd. This means that depending on the phase userspace is > > in while initialising the VM, KVM_CAP_NR_VCPUS can return one thing > > or the other. > > > > For example, I create a VM on a 32 CPU system, NR_VCPUS says 32. > > I create a GICv2 interrupt controller, it now says 8. > > > > That's a change in behaviour that is visible by userspace > > Yes, I realize this is a userspace visible change. The reason I suggest > it is that logically, it seems very odd that the maximum recommended > number of vCPUs (KVM_CAP_NR_VCPUS) can be higher, than the maximum > supported number of vCPUs (KVM_CAP_MAX_VCPUS). I'm all for this change. > All userspaces which use > this information somehow should already contain some workaround for this > case. (maybe it's a rare one and nobody hit it yet or maybe there are no > userspaces using KVM_CAP_NR_VCPUS for anything besides complaining -- > like QEMU). > > I'd like KVM to be consistent across architectures and have the same > (similar) meaning for KVM_CAP_NR_VCPUS. Sure, but this is a pretty useless piece of information anyway. As Andrew pointed out, the information is available somewhere else, and all we need to do is to cap it to the number of supported vcpus, which is effectively a KVM limitation. Also, we are talking about representing the architecture to userspace. No amount of massaging is going to make an arm64 box look like an x86. > > which I'm keen on avoiding. I'd rather have the kvm and !kvm cases > > return the same thing. > > Forgive me my (ARM?) ignorance but what would it be then? If we go for > min(num_online_cpus(), kvm_arm_default_max_vcpus()) in both cases, cat > this can still go above KVM_CAP_MAX_VCPUS after vGIC is created? "min(num_online_cpus(), kvm_arm_default_max_vcpus())" is probably the right thing in all cases. Yes, KVM_CAP_NR_VCPUS will keep reporting more than the VM can actually support. But that's why we have KVM_CAP_MAX_VCPUS, which tells you now many vcpus you can create for a given configuration. This shows how useless KVM_CAP_NR_VCPUS is, and I wouldn't mind a documentation patch stating this. Thanks, M.
On 11/12/21 15:02, Marc Zyngier wrote: >> I'd like KVM to be consistent across architectures and have the same >> (similar) meaning for KVM_CAP_NR_VCPUS. > Sure, but this is a pretty useless piece of information anyway. As > Andrew pointed out, the information is available somewhere else, and > all we need to do is to cap it to the number of supported vcpus, which > is effectively a KVM limitation. > > Also, we are talking about representing the architecture to userspace. > No amount of massaging is going to make an arm64 box look like an x86. Not sure what you mean? The API is about providing a piece of information independent of the architecture, while catering for a ppc weirdness. Yes it's mostly useless if you don't care about ppc, but it's not about making arm64 look like x86 or ppc; it's about not having to special case ppc in userspace. If anything, if KVM_CAP_NR_VCPUS returns the same for kvm and !kvm, then *that* is making an arm64 box look like an x86. On ARM the max vCPUs depends on VM's GIC configuration, so KVM_CAP_NR_VCPUS should take that into account. Or KVM_CAP_NR_VCPUS should have been only for !kvm; but the ship for that has sailed. Paolo >>> which I'm keen on avoiding. I'd rather have the kvm and !kvm cases >>> return the same thing. >> Forgive me my (ARM?) ignorance but what would it be then? If we go for >> min(num_online_cpus(), kvm_arm_default_max_vcpus()) in both cases, cat >> this can still go above KVM_CAP_MAX_VCPUS after vGIC is created? > "min(num_online_cpus(), kvm_arm_default_max_vcpus())" is probably the > right thing in all cases. Yes, KVM_CAP_NR_VCPUS will keep reporting > more than the VM can actually support. But that's why we have > KVM_CAP_MAX_VCPUS, which tells you now many vcpus you can create for a > given configuration.
Paolo Bonzini <pbonzini@redhat.com> writes: > On 11/12/21 15:02, Marc Zyngier wrote: >>> I'd like KVM to be consistent across architectures and have the same >>> (similar) meaning for KVM_CAP_NR_VCPUS. >> Sure, but this is a pretty useless piece of information anyway. As >> Andrew pointed out, the information is available somewhere else, and >> all we need to do is to cap it to the number of supported vcpus, which >> is effectively a KVM limitation. >> >> Also, we are talking about representing the architecture to userspace. >> No amount of massaging is going to make an arm64 box look like an x86. > > Not sure what you mean? The API is about providing a piece of > information independent of the architecture, while catering for a ppc > weirdness. Yes it's mostly useless if you don't care about ppc, but > it's not about making arm64 look like x86 or ppc; it's about not having > to special case ppc in userspace. > > If anything, if KVM_CAP_NR_VCPUS returns the same for kvm and !kvm, then > *that* is making an arm64 box look like an x86. On ARM the max vCPUs > depends on VM's GIC configuration, so KVM_CAP_NR_VCPUS should take that > into account. (I'm about to send v2 as we have s390 sorted out.) So what do we decide about ARM? - Current approach (kvm->arch.max_vcpus/kvm_arm_default_max_vcpus() depending on 'if (kvm)') - that would be my preference. - Always kvm_arm_default_max_vcpus to make the output independent on 'if (kvm)'. - keep the status quo (drop the patch). Please advise)
On 11/16/21 14:23, Vitaly Kuznetsov wrote: > (I'm about to send v2 as we have s390 sorted out.) > > So what do we decide about ARM? > - Current approach (kvm->arch.max_vcpus/kvm_arm_default_max_vcpus() > depending on 'if (kvm)') - that would be my preference. That would be mine too. Paolo > - Always kvm_arm_default_max_vcpus to make the output independent on 'if > (kvm)'. > - keep the status quo (drop the patch).
On Tue, 16 Nov 2021 13:23:25 +0000, Vitaly Kuznetsov <vkuznets@redhat.com> wrote: > > Paolo Bonzini <pbonzini@redhat.com> writes: > > > On 11/12/21 15:02, Marc Zyngier wrote: > >>> I'd like KVM to be consistent across architectures and have the same > >>> (similar) meaning for KVM_CAP_NR_VCPUS. > >> Sure, but this is a pretty useless piece of information anyway. As > >> Andrew pointed out, the information is available somewhere else, and > >> all we need to do is to cap it to the number of supported vcpus, which > >> is effectively a KVM limitation. > >> > >> Also, we are talking about representing the architecture to userspace. > >> No amount of massaging is going to make an arm64 box look like an x86. > > > > Not sure what you mean? The API is about providing a piece of > > information independent of the architecture, while catering for a ppc > > weirdness. Yes it's mostly useless if you don't care about ppc, but > > it's not about making arm64 look like x86 or ppc; it's about not having > > to special case ppc in userspace. > > > > If anything, if KVM_CAP_NR_VCPUS returns the same for kvm and !kvm, then > > *that* is making an arm64 box look like an x86. On ARM the max vCPUs > > depends on VM's GIC configuration, so KVM_CAP_NR_VCPUS should take that > > into account. > > (I'm about to send v2 as we have s390 sorted out.) > > So what do we decide about ARM? [...] > - Always kvm_arm_default_max_vcpus to make the output independent on 'if > (kvm)'. This. Between two useless numbers, I prefer the one that doesn't introduce any userspace visible changes. Thanks, M.
On 11/16/21 16:55, Marc Zyngier wrote: >> - Always kvm_arm_default_max_vcpus to make the output independent on 'if >> (kvm)'. > This. Between two useless numbers, I prefer the one that doesn't > introduce any userspace visible changes. Fair enough, I'm not going to override you---but please add a comment that says /* * arm64 treats KVM_CAP_NR_CPUS different from all other * architectures, as it does not bound it to num_online_cpus(). * It should not matter much because this is just an advisory * value. */ or something like that. Paolo
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 7838e9fb693e..391dc7a921d5 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -223,7 +223,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) r = 1; break; case KVM_CAP_NR_VCPUS: - r = num_online_cpus(); + if (kvm) + r = min_t(unsigned int, num_online_cpus(), + kvm->arch.max_vcpus); + else + r = min_t(unsigned int, num_online_cpus(), + kvm_arm_default_max_vcpus()); break; case KVM_CAP_MAX_VCPUS: case KVM_CAP_MAX_VCPU_ID:
It doesn't make sense to return the recommended maximum number of vCPUs which exceeds the maximum possible number of vCPUs. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> --- arch/arm64/kvm/arm.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)