Message ID | 20190813100100.GC9567@blackberry |
---|---|
State | Accepted |
Headers | show |
Series | powerpc/xive: Fix race condition leading to host crashes and hangs | expand |
On 13/08/2019 12:01, Paul Mackerras wrote: > At present, when running a guest on POWER9 using HV KVM but not using > an in-kernel interrupt controller (XICS or XIVE), for example if QEMU > is run with the kernel_irqchip=off option, the guest entry code goes > ahead and tries to load the guest context into the XIVE hardware, even > though no context has been set up. > > To fix this, we check that the "CAM word" is non-zero before pushing > it to the hardware. The CAM word is initialized to a non-zero value > in kvmppc_xive_connect_vcpu() and kvmppc_xive_native_connect_vcpu(), > and is now cleared in kvmppc_xive_{,native_}cleanup_vcpu. If a "CAM word" is defined, it means the vCPU (VP) was enabled at the XIVE HW level. So this is the criteria to consider that a vCPU needs to update (push) its XIVE thread interrupt context when scheduled to run. Reviewed-by: Cédric Le Goater <clg@kaod.org> Thanks, C. > > Cc: stable@vger.kernel.org # v4.11+ > Reported-by: Cédric Le Goater <clg@kaod.org> > Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") > Signed-off-by: Paul Mackerras <paulus@ozlabs.org> > --- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 ++ > arch/powerpc/kvm/book3s_xive.c | 11 ++++++++++- > arch/powerpc/kvm/book3s_xive_native.c | 3 +++ > 3 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index 2e7e788..07181d0 100644 > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > @@ -942,6 +942,8 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300) > ld r11, VCPU_XIVE_SAVED_STATE(r4) > li r9, TM_QW1_OS > lwz r8, VCPU_XIVE_CAM_WORD(r4) > + cmpwi r8, 0 > + beq no_xive > li r7, TM_QW1_OS + TM_WORD2 > mfmsr r0 > andi. r0, r0, MSR_DR /* in real mode? */ > diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c > index 09f838a..586867e 100644 > --- a/arch/powerpc/kvm/book3s_xive.c > +++ b/arch/powerpc/kvm/book3s_xive.c > @@ -67,8 +67,14 @@ void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) > void __iomem *tima = local_paca->kvm_hstate.xive_tima_virt; > u64 pq; > > - if (!tima) > + /* > + * Nothing to do if the platform doesn't have a XIVE > + * or this vCPU doesn't have its own XIVE context > + * (e.g. because it's not using an in-kernel interrupt controller). > + */ > + if (!tima || !vcpu->arch.xive_cam_word) > return; > + > eieio(); > __raw_writeq(vcpu->arch.xive_saved_state.w01, tima + TM_QW1_OS); > __raw_writel(vcpu->arch.xive_cam_word, tima + TM_QW1_OS + TM_WORD2); > @@ -1146,6 +1152,9 @@ void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu) > /* Disable the VP */ > xive_native_disable_vp(xc->vp_id); > > + /* Clear the cam word so guest entry won't try to push context */ > + vcpu->arch.xive_cam_word = 0; > + > /* Free the queues */ > for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) { > struct xive_q *q = &xc->queues[i]; > diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c > index 368427f..11b91b4 100644 > --- a/arch/powerpc/kvm/book3s_xive_native.c > +++ b/arch/powerpc/kvm/book3s_xive_native.c > @@ -81,6 +81,9 @@ void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu) > /* Disable the VP */ > xive_native_disable_vp(xc->vp_id); > > + /* Clear the cam word so guest entry won't try to push context */ > + vcpu->arch.xive_cam_word = 0; > + > /* Free the queues */ > for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) { > kvmppc_xive_native_cleanup_queue(vcpu, i); >
On Tue, 2019-08-13 at 10:01:00 UTC, Paul Mackerras wrote: > At present, when running a guest on POWER9 using HV KVM but not using > an in-kernel interrupt controller (XICS or XIVE), for example if QEMU > is run with the kernel_irqchip=off option, the guest entry code goes > ahead and tries to load the guest context into the XIVE hardware, even > though no context has been set up. > > To fix this, we check that the "CAM word" is non-zero before pushing > it to the hardware. The CAM word is initialized to a non-zero value > in kvmppc_xive_connect_vcpu() and kvmppc_xive_native_connect_vcpu(), > and is now cleared in kvmppc_xive_{,native_}cleanup_vcpu. > > Cc: stable@vger.kernel.org # v4.11+ > Reported-by: Cédric Le Goater <clg@kaod.org> > Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") > Signed-off-by: Paul Mackerras <paulus@ozlabs.org> > Reviewed-by: Cédric Le Goater <clg@kaod.org> Applied to powerpc topic/ppc-kvm, thanks. https://git.kernel.org/powerpc/c/8d4ba9c931bc384bcc6889a43915aaaf19d3e499 cheers
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index 2e7e788..07181d0 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -942,6 +942,8 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300) ld r11, VCPU_XIVE_SAVED_STATE(r4) li r9, TM_QW1_OS lwz r8, VCPU_XIVE_CAM_WORD(r4) + cmpwi r8, 0 + beq no_xive li r7, TM_QW1_OS + TM_WORD2 mfmsr r0 andi. r0, r0, MSR_DR /* in real mode? */ diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c index 09f838a..586867e 100644 --- a/arch/powerpc/kvm/book3s_xive.c +++ b/arch/powerpc/kvm/book3s_xive.c @@ -67,8 +67,14 @@ void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) void __iomem *tima = local_paca->kvm_hstate.xive_tima_virt; u64 pq; - if (!tima) + /* + * Nothing to do if the platform doesn't have a XIVE + * or this vCPU doesn't have its own XIVE context + * (e.g. because it's not using an in-kernel interrupt controller). + */ + if (!tima || !vcpu->arch.xive_cam_word) return; + eieio(); __raw_writeq(vcpu->arch.xive_saved_state.w01, tima + TM_QW1_OS); __raw_writel(vcpu->arch.xive_cam_word, tima + TM_QW1_OS + TM_WORD2); @@ -1146,6 +1152,9 @@ void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu) /* Disable the VP */ xive_native_disable_vp(xc->vp_id); + /* Clear the cam word so guest entry won't try to push context */ + vcpu->arch.xive_cam_word = 0; + /* Free the queues */ for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) { struct xive_q *q = &xc->queues[i]; diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c index 368427f..11b91b4 100644 --- a/arch/powerpc/kvm/book3s_xive_native.c +++ b/arch/powerpc/kvm/book3s_xive_native.c @@ -81,6 +81,9 @@ void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu) /* Disable the VP */ xive_native_disable_vp(xc->vp_id); + /* Clear the cam word so guest entry won't try to push context */ + vcpu->arch.xive_cam_word = 0; + /* Free the queues */ for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) { kvmppc_xive_native_cleanup_queue(vcpu, i);
At present, when running a guest on POWER9 using HV KVM but not using an in-kernel interrupt controller (XICS or XIVE), for example if QEMU is run with the kernel_irqchip=off option, the guest entry code goes ahead and tries to load the guest context into the XIVE hardware, even though no context has been set up. To fix this, we check that the "CAM word" is non-zero before pushing it to the hardware. The CAM word is initialized to a non-zero value in kvmppc_xive_connect_vcpu() and kvmppc_xive_native_connect_vcpu(), and is now cleared in kvmppc_xive_{,native_}cleanup_vcpu. Cc: stable@vger.kernel.org # v4.11+ Reported-by: Cédric Le Goater <clg@kaod.org> Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> --- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 ++ arch/powerpc/kvm/book3s_xive.c | 11 ++++++++++- arch/powerpc/kvm/book3s_xive_native.c | 3 +++ 3 files changed, 15 insertions(+), 1 deletion(-)