Message ID | 20240726235234.228822-49-seanjc@google.com |
---|---|
State | Handled Elsewhere, archived |
Series | KVM: Stop grabbing references to PFNMAP'd pages |
On 7/27/24 01:51, Sean Christopherson wrote:
> Move KVM x86's helper that "finishes" the faultin process to common KVM
> so that the logic can be shared across all architectures. Note, not all
> architectures implement a fast page fault path, but the gist of the
> comment applies to all architectures.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/mmu/mmu.c   | 24 ++----------------------
>  include/linux/kvm_host.h | 26 ++++++++++++++++++++++++++
>  2 files changed, 28 insertions(+), 22 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 95beb50748fc..2a0cfa225c8d 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4323,28 +4323,8 @@ static u8 kvm_max_private_mapping_level(struct kvm *kvm, kvm_pfn_t pfn,
>  static void kvm_mmu_finish_page_fault(struct kvm_vcpu *vcpu,
>  				      struct kvm_page_fault *fault, int r)
>  {
> -	lockdep_assert_once(lockdep_is_held(&vcpu->kvm->mmu_lock) ||
> -			    r == RET_PF_RETRY);
> -
> -	if (!fault->refcounted_page)
> -		return;
> -
> -	/*
> -	 * If the page that KVM got from the *primary MMU* is writable, and KVM
> -	 * installed or reused a SPTE, mark the page/folio dirty. Note, this
> -	 * may mark a folio dirty even if KVM created a read-only SPTE, e.g. if
> -	 * the GFN is write-protected. Folios can't be safely marked dirty
> -	 * outside of mmu_lock as doing so could race with writeback on the
> -	 * folio. As a result, KVM can't mark folios dirty in the fast page
> -	 * fault handler, and so KVM must (somewhat) speculatively mark the
> -	 * folio dirty if KVM could locklessly make the SPTE writable.
> -	 */
> -	if (r == RET_PF_RETRY)
> -		kvm_release_page_unused(fault->refcounted_page);
> -	else if (!fault->map_writable)
> -		kvm_release_page_clean(fault->refcounted_page);
> -	else
> -		kvm_release_page_dirty(fault->refcounted_page);
> +	kvm_release_faultin_page(vcpu->kvm, fault->refcounted_page,
> +				 r == RET_PF_RETRY, fault->map_writable);

Does it make sense to move RET_PF_* to common code, and avoid a bool
argument here?

Paolo

>  }
>
>  static int kvm_mmu_faultin_pfn_private(struct kvm_vcpu *vcpu,
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 9d2a97eb30e4..91341cdc6562 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -1216,6 +1216,32 @@ static inline void kvm_release_page_unused(struct page *page)
>  void kvm_release_page_clean(struct page *page);
>  void kvm_release_page_dirty(struct page *page);
>
> +static inline void kvm_release_faultin_page(struct kvm *kvm, struct page *page,
> +					    bool unused, bool dirty)
> +{
> +	lockdep_assert_once(lockdep_is_held(&kvm->mmu_lock) || unused);
> +
> +	if (!page)
> +		return;
> +
> +	/*
> +	 * If the page that KVM got from the *primary MMU* is writable, and KVM
> +	 * installed or reused a SPTE, mark the page/folio dirty. Note, this
> +	 * may mark a folio dirty even if KVM created a read-only SPTE, e.g. if
> +	 * the GFN is write-protected. Folios can't be safely marked dirty
> +	 * outside of mmu_lock as doing so could race with writeback on the
> +	 * folio. As a result, KVM can't mark folios dirty in the fast page
> +	 * fault handler, and so KVM must (somewhat) speculatively mark the
> +	 * folio dirty if KVM could locklessly make the SPTE writable.
> +	 */
> +	if (unused)
> +		kvm_release_page_unused(page);
> +	else if (dirty)
> +		kvm_release_page_dirty(page);
> +	else
> +		kvm_release_page_clean(page);
> +}
> +
>  kvm_pfn_t kvm_lookup_pfn(struct kvm *kvm, gfn_t gfn);
>  kvm_pfn_t __kvm_faultin_pfn(const struct kvm_memory_slot *slot, gfn_t gfn,
>  			    unsigned int foll, bool *writable,
On Tue, Jul 30, 2024, Paolo Bonzini wrote:
> On 7/27/24 01:51, Sean Christopherson wrote:
> > Move KVM x86's helper that "finishes" the faultin process to common KVM
> > so that the logic can be shared across all architectures. Note, not all
> > architectures implement a fast page fault path, but the gist of the
> > comment applies to all architectures.
> >
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c   | 24 ++----------------------
> >  include/linux/kvm_host.h | 26 ++++++++++++++++++++++++++
> >  2 files changed, 28 insertions(+), 22 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 95beb50748fc..2a0cfa225c8d 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -4323,28 +4323,8 @@ static u8 kvm_max_private_mapping_level(struct kvm *kvm, kvm_pfn_t pfn,
> >  static void kvm_mmu_finish_page_fault(struct kvm_vcpu *vcpu,
> >  				      struct kvm_page_fault *fault, int r)
> >  {
> > -	lockdep_assert_once(lockdep_is_held(&vcpu->kvm->mmu_lock) ||
> > -			    r == RET_PF_RETRY);
> > -
> > -	if (!fault->refcounted_page)
> > -		return;
> > -
> > -	/*
> > -	 * If the page that KVM got from the *primary MMU* is writable, and KVM
> > -	 * installed or reused a SPTE, mark the page/folio dirty. Note, this
> > -	 * may mark a folio dirty even if KVM created a read-only SPTE, e.g. if
> > -	 * the GFN is write-protected. Folios can't be safely marked dirty
> > -	 * outside of mmu_lock as doing so could race with writeback on the
> > -	 * folio. As a result, KVM can't mark folios dirty in the fast page
> > -	 * fault handler, and so KVM must (somewhat) speculatively mark the
> > -	 * folio dirty if KVM could locklessly make the SPTE writable.
> > -	 */
> > -	if (r == RET_PF_RETRY)
> > -		kvm_release_page_unused(fault->refcounted_page);
> > -	else if (!fault->map_writable)
> > -		kvm_release_page_clean(fault->refcounted_page);
> > -	else
> > -		kvm_release_page_dirty(fault->refcounted_page);
> > +	kvm_release_faultin_page(vcpu->kvm, fault->refcounted_page,
> > +				 r == RET_PF_RETRY, fault->map_writable);
>
> Does it make sense to move RET_PF_* to common code, and avoid a bool
> argument here?

After this series, probably? Especially if/when we make "struct kvm_page_fault"
a common structure and converge all arch code. In this series, definitely not,
as it would require even more patches to convert other architectures, and it's
not clear that it would be a net win, at least not without even more massaging.
On 7/30/24 21:15, Sean Christopherson wrote:
>> Does it make sense to move RET_PF_* to common code, and avoid a bool
>> argument here?
> After this series, probably? Especially if/when we make "struct kvm_page_fault"
> a common structure and converge all arch code. In this series, definitely not,
> as it would require even more patches to convert other architectures, and it's
> not clear that it would be a net win, at least not without even more massaging.

It does not seem to be hard, but I agree that all the other architectures
right now use 0/-errno in the callers of kvm_release_faultin_page().

Paolo
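To make the two calling conventions discussed in this exchange concrete, here is a small userspace sketch of how a caller might derive the "unused" argument. It is not kernel code. The `demo_` enum and both helper functions are hypothetical names, the enum values are illustrative rather than the kernel's RET_PF_* definitions, and treating any nonzero 0/-errno result as "page unused" is an assumption that the thread does not spell out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for x86's RET_PF_* codes; values are illustrative only. */
enum demo_ret_pf {
	DEMO_RET_PF_CONTINUE,
	DEMO_RET_PF_RETRY,
	DEMO_RET_PF_FIXED,
};

/* x86 convention, as in the patch: the page is unused iff the fault is retried. */
static bool unused_from_ret_pf(enum demo_ret_pf r)
{
	return r == DEMO_RET_PF_RETRY;
}

/*
 * ASSUMPTION: for callers that use 0/-errno, any nonzero result means the
 * page was never mapped and can be released as unused.
 */
static bool unused_from_errno(int ret)
{
	return ret != 0;
}
```

Under these assumptions the bool argument lets each caller keep its own return convention, which may be part of why converting everything to a shared RET_PF_* enum was deferred.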
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 95beb50748fc..2a0cfa225c8d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4323,28 +4323,8 @@ static u8 kvm_max_private_mapping_level(struct kvm *kvm, kvm_pfn_t pfn,
 static void kvm_mmu_finish_page_fault(struct kvm_vcpu *vcpu,
 				      struct kvm_page_fault *fault, int r)
 {
-	lockdep_assert_once(lockdep_is_held(&vcpu->kvm->mmu_lock) ||
-			    r == RET_PF_RETRY);
-
-	if (!fault->refcounted_page)
-		return;
-
-	/*
-	 * If the page that KVM got from the *primary MMU* is writable, and KVM
-	 * installed or reused a SPTE, mark the page/folio dirty. Note, this
-	 * may mark a folio dirty even if KVM created a read-only SPTE, e.g. if
-	 * the GFN is write-protected. Folios can't be safely marked dirty
-	 * outside of mmu_lock as doing so could race with writeback on the
-	 * folio. As a result, KVM can't mark folios dirty in the fast page
-	 * fault handler, and so KVM must (somewhat) speculatively mark the
-	 * folio dirty if KVM could locklessly make the SPTE writable.
-	 */
-	if (r == RET_PF_RETRY)
-		kvm_release_page_unused(fault->refcounted_page);
-	else if (!fault->map_writable)
-		kvm_release_page_clean(fault->refcounted_page);
-	else
-		kvm_release_page_dirty(fault->refcounted_page);
+	kvm_release_faultin_page(vcpu->kvm, fault->refcounted_page,
+				 r == RET_PF_RETRY, fault->map_writable);
 }
 
 static int kvm_mmu_faultin_pfn_private(struct kvm_vcpu *vcpu,
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9d2a97eb30e4..91341cdc6562 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1216,6 +1216,32 @@ static inline void kvm_release_page_unused(struct page *page)
 void kvm_release_page_clean(struct page *page);
 void kvm_release_page_dirty(struct page *page);
 
+static inline void kvm_release_faultin_page(struct kvm *kvm, struct page *page,
+					    bool unused, bool dirty)
+{
+	lockdep_assert_once(lockdep_is_held(&kvm->mmu_lock) || unused);
+
+	if (!page)
+		return;
+
+	/*
+	 * If the page that KVM got from the *primary MMU* is writable, and KVM
+	 * installed or reused a SPTE, mark the page/folio dirty. Note, this
+	 * may mark a folio dirty even if KVM created a read-only SPTE, e.g. if
+	 * the GFN is write-protected. Folios can't be safely marked dirty
+	 * outside of mmu_lock as doing so could race with writeback on the
+	 * folio. As a result, KVM can't mark folios dirty in the fast page
+	 * fault handler, and so KVM must (somewhat) speculatively mark the
+	 * folio dirty if KVM could locklessly make the SPTE writable.
+	 */
+	if (unused)
+		kvm_release_page_unused(page);
+	else if (dirty)
+		kvm_release_page_dirty(page);
+	else
+		kvm_release_page_clean(page);
+}
+
 kvm_pfn_t kvm_lookup_pfn(struct kvm *kvm, gfn_t gfn);
 kvm_pfn_t __kvm_faultin_pfn(const struct kvm_memory_slot *slot, gfn_t gfn,
 			    unsigned int foll, bool *writable,
Move KVM x86's helper that "finishes" the faultin process to common KVM
so that the logic can be shared across all architectures. Note, not all
architectures implement a fast page fault path, but the gist of the
comment applies to all architectures.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/mmu.c   | 24 ++----------------------
 include/linux/kvm_host.h | 26 ++++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 22 deletions(-)
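As an illustration of the branch order in kvm_release_faultin_page(), the following userspace model reproduces only its three-way decision. It is a sketch, not the kernel helper: `release_kind` and `classify_release()` are hypothetical names introduced here, the lockdep assertion is omitted, and the real kvm_release_page_*() calls are replaced by an enum result so the logic can be checked outside the kernel.

```c
#include <stdbool.h>

/* Hypothetical result type standing in for the kvm_release_page_*() calls. */
enum release_kind {
	RELEASE_NONE,	/* no refcounted page, nothing to release */
	RELEASE_UNUSED,	/* models kvm_release_page_unused() */
	RELEASE_CLEAN,	/* models kvm_release_page_clean() */
	RELEASE_DIRTY,	/* models kvm_release_page_dirty() */
};

/*
 * Same branch order as the helper in the patch: a missing page
 * short-circuits, "unused" takes priority over "dirty", and only a page
 * that was actually used is released as dirty or clean.
 */
static enum release_kind classify_release(bool have_page, bool unused, bool dirty)
{
	if (!have_page)
		return RELEASE_NONE;
	if (unused)
		return RELEASE_UNUSED;
	return dirty ? RELEASE_DIRTY : RELEASE_CLEAN;
}
```

Note that the common helper tests `dirty` directly, whereas the removed x86 code tested `!fault->map_writable` first; the two orderings select the same release function for every input.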