Message ID | 20120508102408.GB6745@bloggs.ozlabs.ibm.com |
---|---|
State | New, archived |
Headers | show |
On 08.05.2012, at 12:24, Paul Mackerras wrote: > From: David Gibson <david@gibson.dropbear.id.au> > > The H_REGISTER_VPA hcall implementation in HV Power KVM needs to pin some > guest memory pages into host memory so that they can be safely accessed > from usermode. It does this used get_user_pages_fast(). When the VPA is > unregistered, or the VCPUs are cleaned up, these pages are released using > put_page(). > > However, the get_user_pages() is invoked on the specific memory are of the > VPA which could lie within hugepages. In case the pinned page is huge, > we explicitly find the head page of the compound page before calling > put_page() on it. > > At least with the latest kernel, this is not correct. put_page() already > handles finding the correct head page of a compound, and also deals with > various counts on the individual tail page which are important for > transparent huge pages. We don't support transparent hugepages on Power, > but even so, bypassing this count maintenance can lead (when the VM ends) > to a hugepage being released back to the pool with a non-zero mapcount on > one of the tail pages. This can then lead to a bad_page() when the page > is released from the hugepage pool. > > This removes the explicit compound_head() call to correct this bug. > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Alexander Graf <agraf@suse.de> Avi, could you please make sure this makes the next 3.4-rc or -stable? Thanks! Alex -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/08/2012 04:10 PM, Alexander Graf wrote: > On 08.05.2012, at 12:24, Paul Mackerras wrote: > > > From: David Gibson <david@gibson.dropbear.id.au> > > > > The H_REGISTER_VPA hcall implementation in HV Power KVM needs to pin some > > guest memory pages into host memory so that they can be safely accessed > > from usermode. It does this used get_user_pages_fast(). When the VPA is > > unregistered, or the VCPUs are cleaned up, these pages are released using > > put_page(). > > > > However, the get_user_pages() is invoked on the specific memory are of the > > VPA which could lie within hugepages. In case the pinned page is huge, > > we explicitly find the head page of the compound page before calling > > put_page() on it. > > > > At least with the latest kernel, this is not correct. put_page() already > > handles finding the correct head page of a compound, and also deals with > > various counts on the individual tail page which are important for > > transparent huge pages. We don't support transparent hugepages on Power, > > but even so, bypassing this count maintenance can lead (when the VM ends) > > to a hugepage being released back to the pool with a non-zero mapcount on > > one of the tail pages. This can then lead to a bad_page() when the page > > is released from the hugepage pool. > > > > This removes the explicit compound_head() call to correct this bug. > > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > Signed-off-by: Paul Mackerras <paulus@samba.org> > > Acked-by: Alexander Graf <agraf@suse.de> > > Avi, could you please make sure this makes the next 3.4-rc or -stable? > > Sure, applied to master, will push tomorrow.
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index ddc485a..c3beaee 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -258,6 +258,8 @@ static long kvmppc_get_guest_page(struct kvm *kvm, unsigned long gfn, !(memslot->userspace_addr & (s - 1))) { start &= ~(s - 1); pgsize = s; + get_page(hpage); + put_page(page); page = hpage; } } @@ -281,11 +283,8 @@ static long kvmppc_get_guest_page(struct kvm *kvm, unsigned long gfn, err = 0; out: - if (got) { - if (PageHuge(page)) - page = compound_head(page); + if (got) put_page(page); - } return err; up_err: @@ -678,8 +677,15 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu, SetPageDirty(page); out_put: - if (page) - put_page(page); + if (page) { + /* + * We drop pages[0] here, not page because page might + * have been set to the head page of a compound, but + * we have to drop the reference on the correct tail + * page to match the get inside gup() + */ + put_page(pages[0]); + } return ret; out_unlock: @@ -979,6 +985,7 @@ void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long gpa, pa = *physp; } page = pfn_to_page(pa >> PAGE_SHIFT); + get_page(page); } else { hva = gfn_to_hva_memslot(memslot, gfn); npages = get_user_pages_fast(hva, 1, 1, pages); @@ -991,8 +998,6 @@ void *kvmppc_pin_guest_page(struct kvm *kvm, unsigned long gpa, page = compound_head(page); psize <<= compound_order(page); } - if (!kvm->arch.using_mmu_notifiers) - get_page(page); offset = gpa & (psize - 1); if (nb_ret) *nb_ret = psize - offset; @@ -1003,7 +1008,6 @@ void kvmppc_unpin_guest_page(struct kvm *kvm, void *va) { struct page *page = virt_to_page(va); - page = compound_head(page); put_page(page); } diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index d0bc79b..146c17c 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -1368,8 +1368,6 @@ static void unpin_slot(struct kvm *kvm, int slot_id) continue; pfn = physp[j] >> PAGE_SHIFT; page = pfn_to_page(pfn); - if (PageHuge(page)) - page = compound_head(page); SetPageDirty(page); put_page(page); }