[v1,19/55] KVM: PPC: Book3S HV P9: Demand fault PMU SPRs when marked not inuse

Message ID	20210726035036.739609-20-npiggin@gmail.com
State	New
Headers
Return-Path: <kvm-ppc-owner@vger.kernel.org>
From: Nicholas Piggin <npiggin@gmail.com>
To: kvm-ppc@vger.kernel.org
Cc: Nicholas Piggin <npiggin@gmail.com>, linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v1 19/55] KVM: PPC: Book3S HV P9: Demand fault PMU SPRs when marked not inuse
Date: Mon, 26 Jul 2021 13:50:00 +1000
Message-Id: <20210726035036.739609-20-npiggin@gmail.com>
In-Reply-To: <20210726035036.739609-1-npiggin@gmail.com>
References: <20210726035036.739609-1-npiggin@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
Series	[v1,01/55] KVM: PPC: Book3S HV: Remove TM emulation from POWER7/8 path
[v1,01/55] KVM: PPC: Book3S HV: Remove TM emulation from POWER7/8 path
[v1,02/55] KVM: PPC: Book3S HV P9: Fixes for TM softpatch interrupt
[v1,03/55] KVM: PPC: Book3S HV: Sanitise vcpu registers in nested path
[v1,04/55] KVM: PPC: Book3S HV: Stop forwarding all HFUs to L1
[v1,05/55] KVM: PPC: Book3S HV Nested: Reflect guest PMU in-use to L0 when guest SPRs are live
[v1,06/55] powerpc/64s: Remove WORT SPR from POWER9/10
[v1,07/55] KMV: PPC: Book3S HV P9: Use set_dec to set decrementer to host
[v1,08/55] KVM: PPC: Book3S HV P9: Use host timer accounting to avoid decrementer read
[v1,09/55] KVM: PPC: Book3S HV P9: Use large decrementer for HDEC
[v1,10/55] KVM: PPC: Book3S HV P9: Reduce mftb per guest entry/exit
[v1,11/55] powerpc/time: add API for KVM to re-arm the host timer/decrementer
[v1,12/55] KVM: PPC: Book3S HV: POWER10 enable HAIL when running radix guests
[v1,13/55] powerpc/64s: Keep AMOR SPR a constant ~0 at runtime
[v1,14/55] KVM: PPC: Book3S HV: Don't always save PMU for guest capable of nesting
[v1,15/55] powerpc/64s: Always set PMU control registers to frozen/disabled when not in use
[v1,16/55] powerpc/64s: Implement PMU override command line option
[v1,17/55] KVM: PPC: Book3S HV P9: Implement PMU save/restore in C
[v1,18/55] KVM: PPC: Book3S HV P9: Factor PMU save/load into context switch functions
[v1,19/55] KVM: PPC: Book3S HV P9: Demand fault PMU SPRs when marked not inuse
[v1,20/55] KVM: PPC: Book3S HV P9: Factor out yield_count increment
[v1,21/55] KVM: PPC: Book3S HV: CTRL SPR does not require read-modify-write
[v1,22/55] KVM: PPC: Book3S HV P9: Move SPRG restore to restore_p9_host_os_sprs
[v1,23/55] KVM: PPC: Book3S HV P9: Reduce mtmsrd instructions required to save host SPRs
[v1,24/55] KVM: PPC: Book3S HV P9: Improve mtmsrd scheduling by delaying MSR[EE] disable
[v1,25/55] KVM: PPC: Book3S HV P9: Add kvmppc_stop_thread to match kvmppc_start_thread
[v1,26/55] KVM: PPC: Book3S HV: Change dec_expires to be relative to guest timebase
[v1,27/55] KVM: PPC: Book3S HV P9: Move TB updates
[v1,28/55] KVM: PPC: Book3S HV P9: Optimise timebase reads
[v1,29/55] KVM: PPC: Book3S HV P9: Avoid SPR scoreboard stalls
[v1,30/55] KVM: PPC: Book3S HV P9: Only execute mtSPR if the value changed
[v1,31/55] KVM: PPC: Book3S HV P9: Juggle SPR switching around
[v1,32/55] KVM: PPC: Book3S HV P9: Move vcpu register save/restore into functions
[v1,33/55] KVM: PPC: Book3S HV P9: Move host OS save/restore functions to built-in
[v1,34/55] KVM: PPC: Book3S HV P9: Move nested guest entry into its own function
[v1,35/55] KVM: PPC: Book3S HV P9: Move remaining SPR and MSR access into low level entry
[v1,36/55] KVM: PPC: Book3S HV P9: Implement TM fastpath for guest entry/exit
[v1,37/55] KVM: PPC: Book3S HV P9: Switch PMU to guest as late as possible
[v1,38/55] KVM: PPC: Book3S HV P9: Restrict DSISR canary workaround to processors that require it
[v1,39/55] KVM: PPC: Book3S HV P9: More SPR speed improvements
[v1,40/55] KVM: PPC: Book3S HV P9: Demand fault EBB facility registers
[v1,41/55] KVM: PPC: Book3S HV P9: Demand fault TM facility registers
[v1,42/55] KVM: PPC: Book3S HV P9: Use Linux SPR save/restore to manage some host SPRs
[v1,43/55] KVM: PPC: Book3S HV P9: Comment and fix MMU context switching code
[v1,44/55] KVM: PPC: Book3S HV P9: Test dawr_enabled() before saving host DAWR SPRs
[v1,45/55] KVM: PPC: Book3S HV P9: Don't restore PSSCR if not needed
[v1,46/55] KVM: PPC: Book3S HV P9: Avoid tlbsync sequence on radix guest exit
[v1,47/55] KVM: PPC: Book3S HV Nested: Avoid extra mftb() in nested entry
[v1,48/55] KVM: PPC: Book3S HV P9: Improve mfmsr performance on entry
[v1,49/55] KVM: PPC: Book3S HV P9: Optimise hash guest SLB saving
[v1,50/55] KVM: PPC: Book3S HV P9: Add unlikely annotation for !mmu_ready
[v1,51/55] KVM: PPC: Book3S HV P9: Avoid cpu_in_guest atomics on entry and exit
[v1,52/55] KVM: PPC: Book3S HV P9: Remove most of the vcore logic
[v1,53/55] KVM: PPC: Book3S HV P9: Tidy kvmppc_create_dtl_entry
[v1,54/55] KVM: PPC: Book3S HV P9: Stop using vc->dpdes
[v1,55/55] KVM: PPC: Book3S HV P9: Remove subcore HMI handling

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h
index eaf3a562bf1e..df6bed4b2a46 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -39,6 +39,7 @@ struct kvm_nested_guest {
 	pgd_t *shadow_pgtable;		/* our page table for this guest */
 	u64 l1_gr_to_hr;		/* L1's addr of part'n-scoped table */
 	u64 process_table;		/* process table entry for this guest */
+	u64 hfscr;			/* L1's HFSCR */
 	long refcnt;			/* number of pointers to this struct */
 	struct mutex tlb_lock;		/* serialize page faults and tlbies */
 	struct kvm_nested_guest *next;
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 9f52f282b1aa..aee41edcfe6b 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -804,6 +804,7 @@ struct kvm_vcpu_arch {
 	struct kvmppc_vpa slb_shadow;
 
 	spinlock_t tbacct_lock;
+	u64 hfscr_permitted;	/* A mask of permitted HFSCR facilities */
 	u64 busy_stolen;
 	u64 busy_preempt;
 
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 091b67ef6eba..7c75f63648d6 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1421,6 +1421,23 @@ static int kvmppc_emulate_doorbell_instr(struct kvm_vcpu *vcpu)
 	return RESUME_GUEST;
 }
 
+/*
+ * If the lppaca had pmcregs_in_use clear when we exited the guest, then
+ * HFSCR_PM is cleared for next entry. If the guest then tries to access
+ * the PMU SPRs, we get this facility unavailable interrupt. Putting HFSCR_PM
+ * back in the guest HFSCR will cause the next entry to load the PMU SPRs and
+ * allow the guest access to continue.
+ */
+static int kvmppc_pmu_unavailable(struct kvm_vcpu *vcpu)
+{
+	if (!(vcpu->arch.hfscr_permitted & HFSCR_PM))
+		return EMULATE_FAIL;
+
+	vcpu->arch.hfscr |= HFSCR_PM;
+
+	return RESUME_GUEST;
+}
+
 static int kvmppc_handle_exit_hv(struct kvm_vcpu *vcpu,
 				 struct task_struct *tsk)
 {
@@ -1705,16 +1722,22 @@ XXX benchmark guest exits
 	 * to emulate.
 	 * Otherwise, we just generate a program interrupt to the guest.
 	 */
-	case BOOK3S_INTERRUPT_H_FAC_UNAVAIL:
+	case BOOK3S_INTERRUPT_H_FAC_UNAVAIL: {
 		r = EMULATE_FAIL;
-		if (((vcpu->arch.hfscr >> 56) == FSCR_MSGP_LG) &&
-		    cpu_has_feature(CPU_FTR_ARCH_300))
-			r = kvmppc_emulate_doorbell_instr(vcpu);
+		if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+			unsigned long cause = vcpu->arch.hfscr >> 56;
+
+			if (cause == FSCR_MSGP_LG)
+				r = kvmppc_emulate_doorbell_instr(vcpu);
+			if (cause == FSCR_PM_LG)
+				r = kvmppc_pmu_unavailable(vcpu);
+		}
 		if (r == EMULATE_FAIL) {
 			kvmppc_core_queue_program(vcpu, SRR1_PROGILL);
 			r = RESUME_GUEST;
 		}
 		break;
+	}
 
 	case BOOK3S_INTERRUPT_HV_RM_HARD:
 		r = RESUME_PASSTHROUGH;
@@ -2723,6 +2746,13 @@ static int kvmppc_core_vcpu_create_hv(struct kvm_vcpu *vcpu)
 	if (cpu_has_feature(CPU_FTR_TM_COMP))
 		vcpu->arch.hfscr |= HFSCR_TM;
 
+	vcpu->arch.hfscr_permitted = vcpu->arch.hfscr;
+
+	/*
+	 * PM is demand-faulted so start with it clear.
+	 */
+	vcpu->arch.hfscr &= ~HFSCR_PM;
+
 	kvmppc_mmu_book3s_hv_init(vcpu);
 
 	vcpu->arch.state = KVMPPC_VCPU_NOTREADY;
@@ -3793,6 +3823,14 @@ static void freeze_pmu(unsigned long mmcr0, unsigned long mmcra)
 static void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
 				struct p9_host_os_sprs *host_os_sprs)
 {
+	struct lppaca *lp;
+	int load_pmu = 1;
+
+	lp = vcpu->arch.vpa.pinned_addr;
+	if (lp)
+		load_pmu = lp->pmcregs_in_use;
+
+	/* Save host */
 	if (ppc_get_pmu_inuse()) {
 		/*
 		 * It might be better to put PMU handling (at least for the
@@ -3827,41 +3865,47 @@ static void switch_pmu_to_guest(struct kvm_vcpu *vcpu,
 	}
 
 #ifdef CONFIG_PPC_PSERIES
+	/* After saving PMU, before loading guest PMU, flip pmcregs_in_use */
 	if (kvmhv_on_pseries()) {
 		barrier();
-		if (vcpu->arch.vpa.pinned_addr) {
-			struct lppaca *lp = vcpu->arch.vpa.pinned_addr;
-			get_lppaca()->pmcregs_in_use = lp->pmcregs_in_use;
-		} else {
-			get_lppaca()->pmcregs_in_use = 1;
-		}
+		get_lppaca()->pmcregs_in_use = load_pmu;
 		barrier();
 	}
 #endif
 
-	/* load guest */
-	mtspr(SPRN_PMC1, vcpu->arch.pmc[0]);
-	mtspr(SPRN_PMC2, vcpu->arch.pmc[1]);
-	mtspr(SPRN_PMC3, vcpu->arch.pmc[2]);
-	mtspr(SPRN_PMC4, vcpu->arch.pmc[3]);
-	mtspr(SPRN_PMC5, vcpu->arch.pmc[4]);
-	mtspr(SPRN_PMC6, vcpu->arch.pmc[5]);
-	mtspr(SPRN_MMCR1, vcpu->arch.mmcr[1]);
-	mtspr(SPRN_MMCR2, vcpu->arch.mmcr[2]);
-	mtspr(SPRN_SDAR, vcpu->arch.sdar);
-	mtspr(SPRN_SIAR, vcpu->arch.siar);
-	mtspr(SPRN_SIER, vcpu->arch.sier[0]);
+	/*
+	 * Load guest. If the VPA said the PMCs are not in use but the guest
+	 * tried to access them anyway, HFSCR[PM] will be set by the HFAC
+	 * fault so we can make forward progress.
+	 */
+	if (load_pmu || (vcpu->arch.hfscr & HFSCR_PM)) {
+		mtspr(SPRN_PMC1, vcpu->arch.pmc[0]);
+		mtspr(SPRN_PMC2, vcpu->arch.pmc[1]);
+		mtspr(SPRN_PMC3, vcpu->arch.pmc[2]);
+		mtspr(SPRN_PMC4, vcpu->arch.pmc[3]);
+		mtspr(SPRN_PMC5, vcpu->arch.pmc[4]);
+		mtspr(SPRN_PMC6, vcpu->arch.pmc[5]);
+		mtspr(SPRN_MMCR1, vcpu->arch.mmcr[1]);
+		mtspr(SPRN_MMCR2, vcpu->arch.mmcr[2]);
+		mtspr(SPRN_SDAR, vcpu->arch.sdar);
+		mtspr(SPRN_SIAR, vcpu->arch.siar);
+		mtspr(SPRN_SIER, vcpu->arch.sier[0]);
+
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			mtspr(SPRN_MMCR3, vcpu->arch.mmcr[3]);
+			mtspr(SPRN_SIER2, vcpu->arch.sier[1]);
+			mtspr(SPRN_SIER3, vcpu->arch.sier[2]);
+		}
 
-	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
-		mtspr(SPRN_MMCR3, vcpu->arch.mmcr[3]);
-		mtspr(SPRN_SIER2, vcpu->arch.sier[1]);
-		mtspr(SPRN_SIER3, vcpu->arch.sier[2]);
-	}
+		/* Set MMCRA then MMCR0 last */
+		mtspr(SPRN_MMCRA, vcpu->arch.mmcra);
+		mtspr(SPRN_MMCR0, vcpu->arch.mmcr[0]);
+		/* No isync necessary because we're starting counters */
 
-	/* Set MMCRA then MMCR0 last */
-	mtspr(SPRN_MMCRA, vcpu->arch.mmcra);
-	mtspr(SPRN_MMCR0, vcpu->arch.mmcr[0]);
-	/* No isync necessary because we're starting counters */
+		if (!vcpu->arch.nested &&
+		    (vcpu->arch.hfscr_permitted & HFSCR_PM))
+			vcpu->arch.hfscr |= HFSCR_PM;
+	}
 }
 
 static void switch_pmu_to_host(struct kvm_vcpu *vcpu,
@@ -3897,9 +3941,32 @@ static void switch_pmu_to_host(struct kvm_vcpu *vcpu,
 			vcpu->arch.sier[1] = mfspr(SPRN_SIER2);
 			vcpu->arch.sier[2] = mfspr(SPRN_SIER3);
 		}
-	} else {
+
+	} else if (vcpu->arch.hfscr & HFSCR_PM) {
+		/*
+		 * The guest accessed PMC SPRs without specifying they should
+		 * be preserved, or it cleared pmcregs_in_use after the last
+		 * access. Just ensure they are frozen.
+		 */
 		freeze_pmu(mfspr(SPRN_MMCR0), mfspr(SPRN_MMCRA));
-	}
+
+		/*
+		 * Demand-fault PMU register access in the guest.
+		 *
+		 * This is used to grab the guest's VPA pmcregs_in_use value
+		 * and reflect it into the host's VPA in the case of a nested
+		 * hypervisor.
+		 *
+		 * It also avoids having to zero-out SPRs after each guest
+		 * exit to avoid side-channels when.
+		 *
+		 * This is cleared here when we exit the guest, so later HFSCR
+		 * interrupt handling can add it back to run the guest with
+		 * PM enabled next time.
+		 */
+		if (!vcpu->arch.nested)
+			vcpu->arch.hfscr &= ~HFSCR_PM;
+	} /* otherwise the PMU should still be frozen */
 
 #ifdef CONFIG_PPC_PSERIES
 	if (kvmhv_on_pseries()) {
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index 983628ed4376..3ffc63ffebc5 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -104,16 +104,6 @@ static void save_hv_return_state(struct kvm_vcpu *vcpu,
 {
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
 
-	/*
-	 * When loading the hypervisor-privileged registers to run L2,
-	 * we might have used bits from L1 state to restrict what the
-	 * L2 state is allowed to be. Since L1 is not allowed to read
-	 * the HV registers, do not include these modifications in the
-	 * return state.
-	 */
-	hr->hfscr = ((~HFSCR_INTR_CAUSE & hr->hfscr) |
-		     (HFSCR_INTR_CAUSE & vcpu->arch.hfscr));
-
 	hr->dpdes = vc->dpdes;
 	hr->purr = vcpu->arch.purr;
 	hr->spurr = vcpu->arch.spurr;
@@ -137,14 +127,23 @@ static void save_hv_return_state(struct kvm_vcpu *vcpu,
 	case BOOK3S_INTERRUPT_H_INST_STORAGE:
 		hr->asdr = vcpu->arch.fault_gpa;
 		break;
-	case BOOK3S_INTERRUPT_H_FAC_UNAVAIL:
-	{
-		u8 cause = vcpu->arch.hfscr >> 56;
+	case BOOK3S_INTERRUPT_H_FAC_UNAVAIL: {
+		u64 cause = vcpu->arch.hfscr >> 56;
 
 		WARN_ON_ONCE(cause >= BITS_PER_LONG);
 
-		if (!(hr->hfscr & (1UL << cause)))
+		/*
+		 * When loading the hypervisor-privileged registers to run L2,
+		 * we might have used bits from L1 state to restrict what the
+		 * L2 state is allowed to be. Since L1 is not allowed to read
+		 * the HV registers, do not include these modifications in the
+		 * return state.
+		 */
+		hr->hfscr &= ~HFSCR_INTR_CAUSE;
+		if (!(hr->hfscr & (1UL << cause))) {
+			hr->hfscr |= vcpu->arch.hfscr & HFSCR_INTR_CAUSE;
 			break;
+		}
 
 		/*
 		 * We have disabled this facility, so it does not
@@ -152,10 +151,6 @@ static void save_hv_return_state(struct kvm_vcpu *vcpu,
 		 */
 		vcpu->arch.trap = BOOK3S_INTERRUPT_H_EMUL_ASSIST;
 		kvmppc_load_last_inst(vcpu, INST_GENERIC, &vcpu->arch.emul_inst);
-
-		/* Don't leak the cause field */
-		hr->hfscr &= ~HFSCR_INTR_CAUSE;
-
 		fallthrough;
 	}
 	case BOOK3S_INTERRUPT_H_EMUL_ASSIST:
@@ -299,10 +294,10 @@ static void load_l2_hv_regs(struct kvm_vcpu *vcpu,
 		     (vc->lpcr & ~mask) | (*lpcr & mask));
 
 	/*
-	 * Don't let L1 enable features for L2 which we've disabled for L1,
-	 * but preserve the interrupt cause field.
+	 * Don't let L1 enable features for L2 which we disallow for L1.
+	 * Preserve the interrupt cause field.
 	 */
-	vcpu->arch.hfscr = l2_hv->hfscr & (HFSCR_INTR_CAUSE | l1_hv->hfscr);
+	vcpu->arch.hfscr = l2_hv->hfscr & (HFSCR_INTR_CAUSE | vcpu->arch.hfscr_permitted);
 
 	/* Don't let data address watchpoint match in hypervisor state */
 	vcpu->arch.dawrx0 = l2_hv->dawrx0 & ~DAWRX_HYP;
@@ -389,6 +384,7 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
 	/* set L1 state to L2 state */
 	vcpu->arch.nested = l2;
 	vcpu->arch.nested_vcpu_id = l2_hv.vcpu_token;
+	l2->hfscr = l2_hv.hfscr;
 	vcpu->arch.regs = l2_regs;
 
 	/* Guest must always run with ME enabled, HV disabled. */
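
For readers following the HFSCR[PM] life cycle across the hunks above, here is a minimal userspace sketch of the non-nested demand-fault flow as the patch comments describe it. It is not kernel code. Every `sketch_` name, the `SKETCH_HFSCR_PM` bit position, and the `pmu_loaded` flag are hypothetical stand-ins. The condition guarding the "save guest PMU" branch on exit is not visible in the diff context, so this sketch assumes it is the guest's `pmcregs_in_use` flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit; the real HFSCR_PM position is defined by the kernel headers. */
#define SKETCH_HFSCR_PM (1ULL << 60)

struct sketch_vcpu {
	uint64_t hfscr;           /* facilities enabled for the next guest entry */
	uint64_t hfscr_permitted; /* facilities the guest may ever be granted */
	bool pmcregs_in_use;      /* models the guest VPA (lppaca) flag */
	bool pmu_loaded;          /* models "PMU SPRs were loaded on this entry" */
};

/* Models the vcpu-create hunk: record the permitted mask, then start with PM clear. */
static void sketch_vcpu_create(struct sketch_vcpu *v)
{
	v->hfscr = SKETCH_HFSCR_PM;
	v->hfscr_permitted = v->hfscr;
	v->hfscr &= ~SKETCH_HFSCR_PM;
	v->pmcregs_in_use = false;
	v->pmu_loaded = false;
}

/* Models switch_pmu_to_guest(): load only if the VPA asks or PM was faulted in. */
static void sketch_guest_entry(struct sketch_vcpu *v)
{
	/* Invariant of this sketch: enabled bits never exceed the permitted mask. */
	assert((v->hfscr & ~v->hfscr_permitted) == 0);

	v->pmu_loaded = false;
	if (v->pmcregs_in_use || (v->hfscr & SKETCH_HFSCR_PM)) {
		v->pmu_loaded = true; /* the mtspr sequence would run here */
		if (v->hfscr_permitted & SKETCH_HFSCR_PM)
			v->hfscr |= SKETCH_HFSCR_PM;
	}
}

/* Models kvmppc_pmu_unavailable(): true means "resume guest", false "emulate fail". */
static bool sketch_pmu_unavailable(struct sketch_vcpu *v)
{
	if (!(v->hfscr_permitted & SKETCH_HFSCR_PM))
		return false;
	v->hfscr |= SKETCH_HFSCR_PM;
	return true;
}

/* Models switch_pmu_to_host() for a non-nested guest. */
static void sketch_guest_exit(struct sketch_vcpu *v)
{
	if (v->pmcregs_in_use) {
		/* assumed branch: guest PMU SPRs would be saved here */
	} else if (v->hfscr & SKETCH_HFSCR_PM) {
		/* PMU would be frozen here; revoke PM so the next access faults */
		v->hfscr &= ~SKETCH_HFSCR_PM;
	}
	/* otherwise the PMU should still be frozen */
}
```

In this model a guest that never sets `pmcregs_in_use` enters without PMU state loaded, takes one facility-unavailable fault on first access, runs with PM enabled after re-entry, and loses PM again on the next exit.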
