From patchwork Tue Feb 9 01:07:14 2021
X-Patchwork-Submitter: Alistair Popple
X-Patchwork-Id: 1438056
X-Mailing-List: kvm-ppc@vger.kernel.org
From: Alistair Popple
Subject: [PATCH 1/9] mm/migrate.c: Always allow device private pages to migrate
Date: Tue, 9 Feb 2021 12:07:14 +1100
Message-ID: <20210209010722.13839-2-apopple@nvidia.com>
In-Reply-To: <20210209010722.13839-1-apopple@nvidia.com>
References: <20210209010722.13839-1-apopple@nvidia.com>

Device private pages are used to represent device memory that is not directly accessible from the CPU. Extra references to a device private page are only used to ensure the struct page itself remains valid whilst waiting for migration entries. Therefore extra references should not prevent device private page migration as this can lead to failures to migrate pages back to the CPU which are fatal to the user process.
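For reference, after this change the refcount check for anonymous pages in migrate_page_move_mapping() effectively behaves as in the sketch below (an illustrative helper, not the full kernel function; anon_refcount_check() is not a name from this series):

/*
 * Simplified sketch of the relaxed check: device private pages are exempt
 * because their extra references only keep the struct page alive and do not
 * indicate the data is still in use on the CPU.
 */
static int anon_refcount_check(struct page *page, int expected_count)
{
	if (!is_device_private_page(page) &&
	    page_count(page) != expected_count)
		return -EAGAIN;	/* unexpected references, caller retries */

	return MIGRATEPAGE_SUCCESS;
}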
Signed-off-by: Alistair Popple --- mm/migrate.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/mm/migrate.c b/mm/migrate.c index 20ca887ea769..053228559fd3 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -405,8 +405,13 @@ int migrate_page_move_mapping(struct address_space *mapping, int nr = thp_nr_pages(page); if (!mapping) { - /* Anonymous page without mapping */ - if (page_count(page) != expected_count) + /* + * Anonymous page without mapping. Device private pages should + * never have extra references except during migration, but it + * is safe to ignore these. + */ + if (!is_device_private_page(page) && + page_count(page) != expected_count) return -EAGAIN; /* No turning back from here */ From patchwork Tue Feb 9 01:07:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 1438057 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=nvidia.com header.i=@nvidia.com header.a=rsa-sha256 header.s=n1 header.b=C3dPDOP+; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4DZPvh5ffVz9sWP for ; Tue, 9 Feb 2021 12:10:24 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230063AbhBIBKK (ORCPT ); Mon, 8 Feb 2021 20:10:10 -0500 Received: from hqnvemgate24.nvidia.com ([216.228.121.143]:5859 "EHLO hqnvemgate24.nvidia.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229892AbhBIBKJ (ORCPT ); Mon, 8 Feb 2021 20:10:09 -0500 Received: from hqmail.nvidia.com (Not Verified[216.228.121.13]) by hqnvemgate24.nvidia.com (using TLS: TLSv1.2, AES256-SHA) id ; Mon, 08 Feb 2021 17:09:28 -0800 Received: from DRHQMAIL107.nvidia.com (10.27.9.16) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Tue, 9 Feb 2021 01:09:27 +0000 Received: from localhost (172.20.145.6) by DRHQMAIL107.nvidia.com (10.27.9.16) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Tue, 9 Feb 2021 01:09:27 +0000 From: Alistair Popple To: , , , CC: , , , , , , , "Alistair Popple" Subject: [PATCH 2/9] mm/migrate.c: Allow pfn flags to be passed to migrate_vma_setup() Date: Tue, 9 Feb 2021 12:07:15 +1100 Message-ID: <20210209010722.13839-3-apopple@nvidia.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210209010722.13839-1-apopple@nvidia.com> References: <20210209010722.13839-1-apopple@nvidia.com> MIME-Version: 1.0 X-Originating-IP: [172.20.145.6] X-ClientProxiedBy: HQMAIL101.nvidia.com (172.20.187.10) To DRHQMAIL107.nvidia.com (10.27.9.16) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1612832968; bh=iKGh0Em+q0SV2cD8tosOGsNNeFRDCVNZNzeMEYUMuKY=; h=From:To:CC:Subject:Date:Message-ID:X-Mailer:In-Reply-To: References:MIME-Version:Content-Transfer-Encoding:Content-Type: X-Originating-IP:X-ClientProxiedBy; b=C3dPDOP+AZYYPiUDaU8tmyIQKU0atPQPf0P8bxkKmEleXzrnWlxwA6cPTpzjRHxEw yjQU2wttE/M65pLMYv4hEABvsKiInM0Lbvb9GC/vgyKDfOtElNlUiIxXmGHGAdYk1K EzUcT+kxdkGr6MSirU2Qm96A54p4W+NJ4MHIQt9hzenYZeBUVEhs5B0esyZkLMU6b+ yO0kVMs6S8bX+GEKSKWOJjco0oQr3SuSMzXHTJhxr0GQoq8dW+TGbdHkXkTVTgr/As 
a1ccJhtxb23aKMFHeK97TewZeIPHFo62wBpSrxD7mqbISrdBfUV9e5kY+cvhgKWXtY uYSRX6/5DdaEw== Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Currently migrate_vma_setup() zeros both src and dst pfn arrays. This means it is not possible to pass per-pfn flags to migrate_vma_setup(). A future patch introduces per-pfn flags for migrate_vma_setup(), so ensure existing callers will not be affected by having the caller zero both src and dst pfn arrays. Signed-off-by: Alistair Popple --- arch/powerpc/kvm/book3s_hv_uvmem.c | 4 ++-- lib/test_hmm.c | 6 ++++-- mm/migrate.c | 1 - 3 files changed, 6 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c index 84e5a2dc8be5..d434783b272a 100644 --- a/arch/powerpc/kvm/book3s_hv_uvmem.c +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c @@ -506,7 +506,7 @@ static int __kvmppc_svm_page_out(struct vm_area_struct *vma, unsigned long end, unsigned long page_shift, struct kvm *kvm, unsigned long gpa) { - unsigned long src_pfn, dst_pfn = 0; + unsigned long src_pfn = 0, dst_pfn = 0; struct migrate_vma mig; struct page *dpage, *spage; struct kvmppc_uvmem_page_pvt *pvt; @@ -732,7 +732,7 @@ static int kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long page_shift, bool pagein) { - unsigned long src_pfn, dst_pfn = 0; + unsigned long src_pfn = 0, dst_pfn = 0; struct migrate_vma mig; struct page *spage; unsigned long pfn; diff --git a/lib/test_hmm.c b/lib/test_hmm.c index 80a78877bd93..98848b96ff09 100644 --- a/lib/test_hmm.c +++ b/lib/test_hmm.c @@ -696,6 +696,8 @@ static int dmirror_migrate(struct dmirror *dmirror, if (next > vma->vm_end) next = vma->vm_end; + memset(src_pfns, 0, ARRAY_SIZE(src_pfns)); + memset(dst_pfns, 0, ARRAY_SIZE(dst_pfns)); args.vma = vma; args.src = src_pfns; args.dst = dst_pfns; @@ -1025,8 +1027,8 @@ static vm_fault_t dmirror_devmem_fault_alloc_and_copy(struct migrate_vma *args, static vm_fault_t dmirror_devmem_fault(struct vm_fault *vmf) { struct migrate_vma args; - unsigned long src_pfns; - unsigned long dst_pfns; + unsigned long src_pfns = 0; + unsigned long dst_pfns = 0; struct page *rpage; struct dmirror *dmirror; vm_fault_t ret; diff --git a/mm/migrate.c b/mm/migrate.c index 053228559fd3..fe8bb322e2e3 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -2874,7 +2874,6 @@ int migrate_vma_setup(struct migrate_vma *args) if (!args->src || !args->dst) return -EINVAL; - memset(args->src, 0, sizeof(*args->src) * nr_pages); args->cpages = 0; args->npages = 0; From patchwork Tue Feb 9 01:07:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 1438058 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=nvidia.com header.i=@nvidia.com header.a=rsa-sha256 header.s=n1 header.b=RSfV/6n2; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4DZPvj2WsMz9sW3 for ; Tue, 9 Feb 2021 12:10:25 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230107AbhBIBKQ (ORCPT ); Mon, 8 Feb 2021 20:10:16 -0500 Received: from hqnvemgate24.nvidia.com 
From: Alistair Popple
Subject: [PATCH 3/9] mm/migrate: Add a unmap and pin migration mode
Date: Tue, 9 Feb 2021 12:07:16 +1100
Message-ID: <20210209010722.13839-4-apopple@nvidia.com>
In-Reply-To: <20210209010722.13839-1-apopple@nvidia.com>
References: <20210209010722.13839-1-apopple@nvidia.com>

Some drivers need to ensure that a device has access to a particular user page whilst preventing userspace access to that page. For example, this is required to allow a driver to implement atomic access to a page when the device hardware does not support atomic access to system memory.

This could be implemented by migrating the data to the device; however, this is not always optimal and may fail in some circumstances. In these cases it is advantageous to remap the page for device access without actually migrating the data.

To allow this kind of access, introduce an unmap and pin flag called MIGRATE_PFN_PIN/UNPIN for migration pfns. This will cause the original page to be remapped to the provided device private page as normal, but instead of returning or freeing the original CPU page it will pin it and leave it isolated from the LRU. This ensures the page remains pinned so that a device may access it exclusively.

Any userspace CPU accesses will fault and trigger the normal device private migrate_to_ram() callback, which must migrate the mapping back to the original page, after which the device will no longer have exclusive access to the page.

As the original page does not get freed it is safe to allow the unmap and pin operation to proceed in cases where there are extra page references present.
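To make the intended driver-side flow concrete, the sketch below shows how a driver might request unmap-and-pin for a single page. It is a hypothetical example based on the lib/test_hmm.c and Documentation/vm/hmm.rst changes later in this series, not code from the series itself; my_unmap_and_pin(), my_alloc_device_page() and owner are placeholder names and error handling is trimmed:

/* Hypothetical driver helper: pin one user page and stand a device private
 * page in for it, instead of copying the data to the device. */
static int my_unmap_and_pin(struct vm_area_struct *vma, unsigned long addr,
			    void *owner)
{
	unsigned long src = MIGRATE_PFN_PIN;	/* request unmap + pin */
	unsigned long dst = 0;
	struct migrate_vma args = {
		.vma		= vma,
		.src		= &src,
		.dst		= &dst,
		.start		= addr,
		.end		= addr + PAGE_SIZE,
		.pgmap_owner	= owner,
		.flags		= MIGRATE_VMA_SELECT_SYSTEM,
	};
	int ret;

	ret = migrate_vma_setup(&args);
	if (ret)
		return ret;

	if (src & MIGRATE_PFN_MIGRATE) {
		/* Device private page that tracks the pinned CPU page. */
		struct page *dpage = my_alloc_device_page();

		lock_page(dpage);
		dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
	}

	migrate_vma_pages(&args);
	migrate_vma_finalize(&args);
	return 0;
}

When the CPU later touches the page it faults into the driver's migrate_to_ram() callback, which hands the original page back with MIGRATE_PFN_UNPIN as described in the documentation update in patch 4.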
Signed-off-by: Alistair Popple --- include/linux/migrate.h | 2 + include/linux/migrate_mode.h | 1 + mm/migrate.c | 74 +++++++++++++++++++++++++----------- 3 files changed, 54 insertions(+), 23 deletions(-) diff --git a/include/linux/migrate.h b/include/linux/migrate.h index 4594838a0f7c..449fc61f9a99 100644 --- a/include/linux/migrate.h +++ b/include/linux/migrate.h @@ -144,6 +144,8 @@ static inline int migrate_misplaced_transhuge_page(struct mm_struct *mm, #define MIGRATE_PFN_MIGRATE (1UL << 1) #define MIGRATE_PFN_LOCKED (1UL << 2) #define MIGRATE_PFN_WRITE (1UL << 3) +#define MIGRATE_PFN_PIN (1UL << 4) +#define MIGRATE_PFN_UNPIN (1UL << 4) #define MIGRATE_PFN_SHIFT 6 static inline struct page *migrate_pfn_to_page(unsigned long mpfn) diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h index 883c99249033..823497eda927 100644 --- a/include/linux/migrate_mode.h +++ b/include/linux/migrate_mode.h @@ -17,6 +17,7 @@ enum migrate_mode { MIGRATE_SYNC_LIGHT, MIGRATE_SYNC, MIGRATE_SYNC_NO_COPY, + MIGRATE_REFERENCED, }; #endif /* MIGRATE_MODE_H_INCLUDED */ diff --git a/mm/migrate.c b/mm/migrate.c index fe8bb322e2e3..71edc2679c8e 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -410,7 +410,7 @@ int migrate_page_move_mapping(struct address_space *mapping, * never have extra references except during migration, but it * is safe to ignore these. */ - if (!is_device_private_page(page) && + if (!is_device_private_page(page) && extra_count >= 0 && page_count(page) != expected_count) return -EAGAIN; @@ -421,6 +421,8 @@ int migrate_page_move_mapping(struct address_space *mapping, __SetPageSwapBacked(newpage); return MIGRATEPAGE_SUCCESS; + } else if (extra_count < 0) { + return -EINVAL; } oldzone = page_zone(page); @@ -704,12 +706,15 @@ int migrate_page(struct address_space *mapping, BUG_ON(PageWriteback(page)); /* Writeback must be complete */ - rc = migrate_page_move_mapping(mapping, newpage, page, 0); + if (mode == MIGRATE_REFERENCED) + rc = migrate_page_move_mapping(mapping, newpage, page, -1); + else + rc = migrate_page_move_mapping(mapping, newpage, page, 0); if (rc != MIGRATEPAGE_SUCCESS) return rc; - if (mode != MIGRATE_SYNC_NO_COPY) + if (mode != MIGRATE_SYNC_NO_COPY && mode != MIGRATE_REFERENCED) migrate_page_copy(newpage, page); else migrate_page_states(newpage, page); @@ -2327,15 +2332,15 @@ static int migrate_vma_collect_hole(unsigned long start, if (!vma_is_anonymous(walk->vma)) { for (addr = start; addr < end; addr += PAGE_SIZE) { migrate->src[migrate->npages] = 0; - migrate->dst[migrate->npages] = 0; migrate->npages++; } return 0; } for (addr = start; addr < end; addr += PAGE_SIZE) { - migrate->src[migrate->npages] = MIGRATE_PFN_MIGRATE; - migrate->dst[migrate->npages] = 0; + if (vma_is_anonymous(walk->vma) && + !(migrate->src[migrate->npages] & MIGRATE_PFN_PIN)) + migrate->src[migrate->npages] = MIGRATE_PFN_MIGRATE; migrate->npages++; migrate->cpages++; } @@ -2425,7 +2430,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, pte = *ptep; if (pte_none(pte)) { - if (vma_is_anonymous(vma)) { + if (vma_is_anonymous(vma) && + !(migrate->src[migrate->npages] & MIGRATE_PFN_PIN)) { mpfn = MIGRATE_PFN_MIGRATE; migrate->cpages++; } @@ -2525,8 +2531,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, } next: - migrate->dst[migrate->npages] = 0; - migrate->src[migrate->npages++] = mpfn; + migrate->src[migrate->npages++] |= mpfn; } arch_leave_lazy_mmu_mode(); pte_unmap_unlock(ptep - 1, ptl); @@ -2695,7 +2700,13 @@ static void migrate_vma_prepare(struct migrate_vma *migrate) 
put_page(page); } - if (!migrate_vma_check_page(page)) { + /* + * If the page is being unmapped and pinned it isn't actually + * going to migrate, so it's safe to continue the operation with + * an elevated refcount. + */ + if (!migrate_vma_check_page(page) && + !(migrate->src[i] & MIGRATE_PFN_PIN)) { if (remap) { migrate->src[i] &= ~MIGRATE_PFN_MIGRATE; migrate->cpages--; @@ -2757,25 +2768,34 @@ static void migrate_vma_unmap(struct migrate_vma *migrate) if (!page || !(migrate->src[i] & MIGRATE_PFN_MIGRATE)) continue; - if (page_mapped(page)) { + if (page_mapped(page)) try_to_unmap(page, flags); - if (page_mapped(page)) - goto restore; + + if (page_mapped(page)) + migrate->src[i] &= ~MIGRATE_PFN_MIGRATE; + + if (!migrate_vma_check_page(page) && + !(migrate->src[i] & MIGRATE_PFN_PIN)) + migrate->src[i] &= ~MIGRATE_PFN_MIGRATE; + + if (migrate->src[i] & MIGRATE_PFN_PIN) { + if (page_maybe_dma_pinned(page)) + migrate->src[i] &= ~MIGRATE_PFN_MIGRATE; + else + page_ref_add(page, GUP_PIN_COUNTING_BIAS); } - if (migrate_vma_check_page(page)) + if (!(migrate->src[i] & MIGRATE_PFN_MIGRATE)) { + migrate->cpages--; + restore++; continue; - -restore: - migrate->src[i] &= ~MIGRATE_PFN_MIGRATE; - migrate->cpages--; - restore++; + } } for (addr = start, i = 0; i < npages && restore; addr += PAGE_SIZE, i++) { struct page *page = migrate_pfn_to_page(migrate->src[i]); - if (!page || (migrate->src[i] & MIGRATE_PFN_MIGRATE)) + if (!page || (migrate->src[i] & MIGRATE_PFN_MIGRATE)) continue; remove_migration_ptes(page, page, false); @@ -3092,7 +3112,11 @@ void migrate_vma_pages(struct migrate_vma *migrate) } } - r = migrate_page(mapping, newpage, page, MIGRATE_SYNC_NO_COPY); + if (migrate->src[i] & MIGRATE_PFN_PIN) + r = migrate_page(mapping, newpage, page, MIGRATE_REFERENCED); + else + r = migrate_page(mapping, newpage, page, MIGRATE_SYNC_NO_COPY); + if (r != MIGRATEPAGE_SUCCESS) migrate->src[i] &= ~MIGRATE_PFN_MIGRATE; } @@ -3148,15 +3172,19 @@ void migrate_vma_finalize(struct migrate_vma *migrate) if (is_zone_device_page(page)) put_page(page); - else + else if (!(migrate->src[i] & MIGRATE_PFN_PIN)) putback_lru_page(page); if (newpage != page) { unlock_page(newpage); if (is_zone_device_page(newpage)) put_page(newpage); - else + else { + if (migrate->dst[i] & MIGRATE_PFN_UNPIN) + page_ref_sub(newpage, GUP_PIN_COUNTING_BIAS); + putback_lru_page(newpage); + } } } } From patchwork Tue Feb 9 01:07:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 1438059 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=nvidia.com header.i=@nvidia.com header.a=rsa-sha256 header.s=n1 header.b=jeZjl3nf; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4DZPvj5CFfz9sW5 for ; Tue, 9 Feb 2021 12:10:25 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230128AbhBIBKR (ORCPT ); Mon, 8 Feb 2021 20:10:17 -0500 Received: from hqnvemgate26.nvidia.com ([216.228.121.65]:9755 "EHLO hqnvemgate26.nvidia.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230088AbhBIBKO (ORCPT ); Mon, 8 Feb 
2021 20:10:14 -0500 Received: from hqmail.nvidia.com (Not Verified[216.228.121.13]) by hqnvemgate26.nvidia.com (using TLS: TLSv1.2, AES256-SHA) id ; Mon, 08 Feb 2021 17:09:34 -0800 Received: from DRHQMAIL107.nvidia.com (10.27.9.16) by HQMAIL101.nvidia.com (172.20.187.10) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Tue, 9 Feb 2021 01:09:34 +0000 Received: from localhost (172.20.145.6) by DRHQMAIL107.nvidia.com (10.27.9.16) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Tue, 9 Feb 2021 01:09:33 +0000 From: Alistair Popple To: , , , CC: , , , , , , , "Alistair Popple" Subject: [PATCH 4/9] Documentation: Add unmap and pin to HMM Date: Tue, 9 Feb 2021 12:07:17 +1100 Message-ID: <20210209010722.13839-5-apopple@nvidia.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210209010722.13839-1-apopple@nvidia.com> References: <20210209010722.13839-1-apopple@nvidia.com> MIME-Version: 1.0 X-Originating-IP: [172.20.145.6] X-ClientProxiedBy: HQMAIL101.nvidia.com (172.20.187.10) To DRHQMAIL107.nvidia.com (10.27.9.16) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1612832974; bh=pyZp6/9XNMC0mK0WRRvgfyWXvzV3+8u2/OO5kZH2+Vg=; h=From:To:CC:Subject:Date:Message-ID:X-Mailer:In-Reply-To: References:MIME-Version:Content-Transfer-Encoding:Content-Type: X-Originating-IP:X-ClientProxiedBy; b=jeZjl3nfuSeq3FgFlgTJUt58XPL3emJVARijzZ4vq7ltCLgRVsr0bJW0RX7cLsy4+ 6i2ssucV5jShnIDv0z47fip74hEH6p2G7xEJGuH3szVMZf9TbumLk+9q9cM2uqpJqF zp0STD/loZHiqwCk48WN2CUfYhM/po5OgnEr3p85w9nntk+tBlaEU/GEolQNOEp1iF zEl1WiliIccRiu3A/3Ps6CBlVCYypWGxVgimQYvt7/3b0HLJ79gnhK9hGtXogpZXon 2dFsEzwULBU/1bCkZF5FIg7ycfCM4V4o6iEhQSzOB1i8QXdnME+lFPKBFz1v+szxGz jVlzmZBk+PGQA== Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Update the HMM documentation to include information on the unmap and pin operation. Signed-off-by: Alistair Popple --- Documentation/vm/hmm.rst | 22 +++++++++++++++++++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst index 09e28507f5b2..83234984f656 100644 --- a/Documentation/vm/hmm.rst +++ b/Documentation/vm/hmm.rst @@ -346,7 +346,15 @@ between device driver specific code and shared common code: from the LRU (if system memory since device private pages are not on the LRU), unmapped from the process, and a special migration PTE is inserted in place of the original PTE. - migrate_vma_setup() also clears the ``args->dst`` array. + + A device driver may also initialise ``src`` entries with the + ``MIGRATE_PFN_PIN`` flag. This allows the device driver to unmap and pin + the existing system page in place whilst migrating page metadata to a + device private page. This leaves the page isolated from the LRU and gives + the device exclusive access to the page data without the need to migrate + data as any CPU access will trigger a fault. The device driver needs to + keep track of the ``src`` page as it effectively becomes the owner of + the page and needs to pass it in when remapping and unpinning the page. 3. The device driver allocates destination pages and copies source pages to destination pages. @@ -357,8 +365,8 @@ between device driver specific code and shared common code: array for that page. 
The driver then allocates either a device private struct page or a - system memory page, locks the page with ``lock_page()``, and fills in the - ``dst`` array entry with:: + system memory page, locks the page with ``lock_page()``, and fills in + the ``dst`` array entry with:: dst[i] = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED; @@ -373,6 +381,14 @@ between device driver specific code and shared common code: destination or clear the destination device private memory if the pointer is ``NULL`` meaning the source page was not populated in system memory. + Alternatively a driver that is remapping and unpinning a source page + obtained from a ``MIGRATE_PFN_PIN`` operation should lock the original + source page and pass it in along with the ``MIGRATE_PFN_UNPIN`` flag + without any need to copy data:: + + dst[i] = migrate_pfn(page_to_pfn(spage)) | MIGRATE_PFN_LOCKED + | MIGRATE_PFN_UNPIN; + 4. ``migrate_vma_pages()`` This step is where the migration is actually "committed". From patchwork Tue Feb 9 01:07:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 1438060 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=nvidia.com header.i=@nvidia.com header.a=rsa-sha256 header.s=n1 header.b=ZUvc6YKw; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4DZPw366d7z9sW5 for ; Tue, 9 Feb 2021 12:10:43 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230174AbhBIBKX (ORCPT ); Mon, 8 Feb 2021 20:10:23 -0500 Received: from hqnvemgate25.nvidia.com ([216.228.121.64]:6629 "EHLO hqnvemgate25.nvidia.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230140AbhBIBKR (ORCPT ); Mon, 8 Feb 2021 20:10:17 -0500 Received: from hqmail.nvidia.com (Not Verified[216.228.121.13]) by hqnvemgate25.nvidia.com (using TLS: TLSv1.2, AES256-SHA) id ; Mon, 08 Feb 2021 17:09:37 -0800 Received: from DRHQMAIL107.nvidia.com (10.27.9.16) by HQMAIL111.nvidia.com (172.20.187.18) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Tue, 9 Feb 2021 01:09:36 +0000 Received: from localhost (172.20.145.6) by DRHQMAIL107.nvidia.com (10.27.9.16) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Tue, 9 Feb 2021 01:09:36 +0000 From: Alistair Popple To: , , , CC: , , , , , , , "Alistair Popple" Subject: [PATCH 5/9] hmm-tests: Add test for unmap and pin Date: Tue, 9 Feb 2021 12:07:18 +1100 Message-ID: <20210209010722.13839-6-apopple@nvidia.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210209010722.13839-1-apopple@nvidia.com> References: <20210209010722.13839-1-apopple@nvidia.com> MIME-Version: 1.0 X-Originating-IP: [172.20.145.6] X-ClientProxiedBy: HQMAIL101.nvidia.com (172.20.187.10) To DRHQMAIL107.nvidia.com (10.27.9.16) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1612832977; bh=eX6MwlujUfKQJ41uDmpXkpSxuNOWqBqcyMk+9htUm4I=; h=From:To:CC:Subject:Date:Message-ID:X-Mailer:In-Reply-To: References:MIME-Version:Content-Transfer-Encoding:Content-Type: X-Originating-IP:X-ClientProxiedBy; b=ZUvc6YKwpnF7J4Z7k4xQ9aVuBZ46oSRqSppWJIki35wS/IAu4LBbCO/dUkdeEVXL/ 
wBHqQ1XO0ZV2IHKJrqnyhrPM7QerfJ4Uwpa20mtIZGPiIj4AC4phTspMsvzcFN8y6/ /PVwslkyzXJmcdYHReSOHdlp6tsXxixwP9zm0B43laeT9TI1YRV0fMyTo0As6ItNAJ WHIqhCbx3m6af9VX9unRQtMEXsKIlAHyjZlfohaE2wGbbCstX/KnIGflWQ4W5vj6e+ M42Oh6R/N2rRhBjQs6mRLp7bbxWRmJqYH8njZ8H89vtZsRoRL+xvCXYfM97St2ClLq WMd2ksOkhewKw== Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Adds a basic test of the HMM unmap and pin operation. Signed-off-by: Alistair Popple --- lib/test_hmm.c | 107 +++++++++++++++++++++---- lib/test_hmm_uapi.h | 1 + tools/testing/selftests/vm/hmm-tests.c | 49 +++++++++++ 3 files changed, 140 insertions(+), 17 deletions(-) diff --git a/lib/test_hmm.c b/lib/test_hmm.c index 98848b96ff09..c78a473250a3 100644 --- a/lib/test_hmm.c +++ b/lib/test_hmm.c @@ -46,6 +46,7 @@ struct dmirror_bounce { unsigned long cpages; }; +#define DPT_XA_TAG_ATOMIC 1UL #define DPT_XA_TAG_WRITE 3UL /* @@ -83,6 +84,7 @@ struct dmirror_device { struct cdev cdevice; struct hmm_devmem *devmem; + unsigned int devmem_faults; unsigned int devmem_capacity; unsigned int devmem_count; struct dmirror_chunk **devmem_chunks; @@ -203,8 +205,18 @@ static void dmirror_do_update(struct dmirror *dmirror, unsigned long start, * Therefore, it is OK to just clear the entry. */ xa_for_each_range(&dmirror->pt, pfn, entry, start >> PAGE_SHIFT, - end >> PAGE_SHIFT) + end >> PAGE_SHIFT) { + /* + * Typically this would be done in devmap free page, but as + * we're using the XArray to store the reference to the original + * page do it here as it doesn't matter if clean up of the + * pinned page is delayed. + */ + if (xa_pointer_tag(entry) == DPT_XA_TAG_ATOMIC) + unpin_user_page(xa_untag_pointer(entry)); + xa_erase(&dmirror->pt, pfn); + } } static bool dmirror_interval_invalidate(struct mmu_interval_notifier *mni, @@ -571,7 +583,8 @@ static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice) } static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args, - struct dmirror *dmirror) + struct dmirror *dmirror, + int allow_ref) { struct dmirror_device *mdevice = dmirror->mdevice; const unsigned long *src = args->src; @@ -598,9 +611,17 @@ static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args, continue; rpage = dpage->zone_device_data; - if (spage) + if (spage && !(*src & MIGRATE_PFN_PIN)) copy_highpage(rpage, spage); else + /* + * In the MIGRATE_PFN_PIN case we don't really + * need rpage at all because the existing page is + * staying in place and will be mapped. However we need + * somewhere to store dmirror and that place is + * rpage->zone_device_data so we keep it for + * simplicity. + */ clear_highpage(rpage); /* @@ -620,7 +641,8 @@ static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args, } static int dmirror_migrate_finalize_and_map(struct migrate_vma *args, - struct dmirror *dmirror) + struct dmirror *dmirror, + int allow_ref) { unsigned long start = args->start; unsigned long end = args->end; @@ -647,8 +669,14 @@ static int dmirror_migrate_finalize_and_map(struct migrate_vma *args, * Store the page that holds the data so the page table * doesn't have to deal with ZONE_DEVICE private pages. 
*/ - entry = dpage->zone_device_data; - if (*dst & MIGRATE_PFN_WRITE) + if (*src & MIGRATE_PFN_PIN) + entry = migrate_pfn_to_page(*src); + else + entry = dpage->zone_device_data; + + if (*src & MIGRATE_PFN_PIN) + entry = xa_tag_pointer(entry, DPT_XA_TAG_ATOMIC); + else if (*dst & MIGRATE_PFN_WRITE) entry = xa_tag_pointer(entry, DPT_XA_TAG_WRITE); entry = xa_store(&dmirror->pt, pfn, entry, GFP_ATOMIC); if (xa_is_err(entry)) { @@ -662,7 +690,8 @@ static int dmirror_migrate_finalize_and_map(struct migrate_vma *args, } static int dmirror_migrate(struct dmirror *dmirror, - struct hmm_dmirror_cmd *cmd) + struct hmm_dmirror_cmd *cmd, + int allow_ref) { unsigned long start, end, addr; unsigned long size = cmd->npages << PAGE_SHIFT; @@ -673,7 +702,7 @@ static int dmirror_migrate(struct dmirror *dmirror, struct dmirror_bounce bounce; struct migrate_vma args; unsigned long next; - int ret; + int i, ret; start = cmd->addr; end = start + size; @@ -696,8 +725,13 @@ static int dmirror_migrate(struct dmirror *dmirror, if (next > vma->vm_end) next = vma->vm_end; - memset(src_pfns, 0, ARRAY_SIZE(src_pfns)); - memset(dst_pfns, 0, ARRAY_SIZE(dst_pfns)); + if (allow_ref) + for (i = 0; i < 64; ++i) + src_pfns[i] = MIGRATE_PFN_PIN; + else + memset(src_pfns, 0, sizeof(src_pfns)); + memset(dst_pfns, 0, sizeof(dst_pfns)); + args.vma = vma; args.src = src_pfns; args.dst = dst_pfns; @@ -709,9 +743,9 @@ static int dmirror_migrate(struct dmirror *dmirror, if (ret) goto out; - dmirror_migrate_alloc_and_copy(&args, dmirror); + dmirror_migrate_alloc_and_copy(&args, dmirror, allow_ref); migrate_vma_pages(&args); - dmirror_migrate_finalize_and_map(&args, dmirror); + dmirror_migrate_finalize_and_map(&args, dmirror, allow_ref); migrate_vma_finalize(&args); } mmap_read_unlock(mm); @@ -739,6 +773,28 @@ static int dmirror_migrate(struct dmirror *dmirror, return ret; } +static int dmirror_migrate_pin(struct dmirror *dmirror, + struct hmm_dmirror_cmd *cmd) +{ + void *tmp; + int nr_pages = cmd->npages; + int ret; + + ret = dmirror_migrate(dmirror, cmd, true); + + tmp = kmalloc(nr_pages << PAGE_SHIFT, GFP_KERNEL); + if (!tmp) + return -ENOMEM; + + /* Make sure user access faults */ + dmirror->mdevice->devmem_faults = 0; + if (copy_from_user(tmp, u64_to_user_ptr(cmd->addr), nr_pages << PAGE_SHIFT)) + ret = -EFAULT; + cmd->faults = dmirror->mdevice->devmem_faults; + + return ret; +} + static void dmirror_mkentry(struct dmirror *dmirror, struct hmm_range *range, unsigned char *perm, unsigned long entry) { @@ -948,7 +1004,11 @@ static long dmirror_fops_unlocked_ioctl(struct file *filp, break; case HMM_DMIRROR_MIGRATE: - ret = dmirror_migrate(dmirror, &cmd); + ret = dmirror_migrate(dmirror, &cmd, false); + break; + + case HMM_DMIRROR_MIGRATE_PIN: + ret = dmirror_migrate_pin(dmirror, &cmd); break; case HMM_DMIRROR_SNAPSHOT: @@ -1004,20 +1064,31 @@ static vm_fault_t dmirror_devmem_fault_alloc_and_copy(struct migrate_vma *args, for (addr = start; addr < end; addr += PAGE_SIZE, src++, dst++) { struct page *dpage, *spage; + void *entry; spage = migrate_pfn_to_page(*src); if (!spage || !(*src & MIGRATE_PFN_MIGRATE)) continue; - spage = spage->zone_device_data; - dpage = alloc_page_vma(GFP_HIGHUSER_MOVABLE, args->vma, addr); + entry = xa_load(&dmirror->pt, addr >> PAGE_SHIFT); + if (entry && xa_pointer_tag(entry) == DPT_XA_TAG_ATOMIC) { + spage = NULL; + dpage = xa_untag_pointer(entry); + *dst = migrate_pfn(page_to_pfn(dpage)) | + MIGRATE_PFN_LOCKED | MIGRATE_PFN_UNPIN; + } else { + spage = spage->zone_device_data; + dpage = 
alloc_page_vma(GFP_HIGHUSER_MOVABLE, args->vma, addr); + *dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED; + } + if (!dpage) continue; lock_page(dpage); xa_erase(&dmirror->pt, addr >> PAGE_SHIFT); - copy_highpage(dpage, spage); - *dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED; + if (spage) + copy_highpage(dpage, spage); if (*src & MIGRATE_PFN_WRITE) *dst |= MIGRATE_PFN_WRITE; } @@ -1041,6 +1112,8 @@ static vm_fault_t dmirror_devmem_fault(struct vm_fault *vmf) rpage = vmf->page->zone_device_data; dmirror = rpage->zone_device_data; + dmirror->mdevice->devmem_faults++; + /* FIXME demonstrate how we can adjust migrate range */ args.vma = vmf->vma; args.start = vmf->address; diff --git a/lib/test_hmm_uapi.h b/lib/test_hmm_uapi.h index 670b4ef2a5b6..b40f4e6affe0 100644 --- a/lib/test_hmm_uapi.h +++ b/lib/test_hmm_uapi.h @@ -33,6 +33,7 @@ struct hmm_dmirror_cmd { #define HMM_DMIRROR_WRITE _IOWR('H', 0x01, struct hmm_dmirror_cmd) #define HMM_DMIRROR_MIGRATE _IOWR('H', 0x02, struct hmm_dmirror_cmd) #define HMM_DMIRROR_SNAPSHOT _IOWR('H', 0x03, struct hmm_dmirror_cmd) +#define HMM_DMIRROR_MIGRATE_PIN _IOWR('H', 0x04, struct hmm_dmirror_cmd) /* * Values returned in hmm_dmirror_cmd.ptr for HMM_DMIRROR_SNAPSHOT. diff --git a/tools/testing/selftests/vm/hmm-tests.c b/tools/testing/selftests/vm/hmm-tests.c index 5d1ac691b9f4..7111ebab93c7 100644 --- a/tools/testing/selftests/vm/hmm-tests.c +++ b/tools/testing/selftests/vm/hmm-tests.c @@ -947,6 +947,55 @@ TEST_F(hmm, migrate_fault) hmm_buffer_free(buffer); } +TEST_F(hmm, migrate_fault_pin) +{ + struct hmm_buffer *buffer; + unsigned long npages; + unsigned long size; + unsigned long i; + int *ptr; + int ret; + + npages = ALIGN(HMM_BUFFER_SIZE, self->page_size) >> self->page_shift; + ASSERT_NE(npages, 0); + size = npages << self->page_shift; + + buffer = malloc(sizeof(*buffer)); + ASSERT_NE(buffer, NULL); + + buffer->fd = -1; + buffer->size = size; + buffer->mirror = malloc(size); + ASSERT_NE(buffer->mirror, NULL); + + buffer->ptr = mmap(NULL, size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, + buffer->fd, 0); + ASSERT_NE(buffer->ptr, MAP_FAILED); + + /* Initialize buffer in system memory. */ + for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i) + ptr[i] = i; + + /* Migrate memory to device. */ + ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE_PIN, buffer, npages); + ASSERT_EQ(ret, 0); + ASSERT_EQ(buffer->cpages, npages); + + /* Check what the device read. */ + for (i = 0, ptr = buffer->mirror; i < size / sizeof(*ptr); ++i) + ASSERT_EQ(ptr[i], i); + + ASSERT_EQ(buffer->faults, npages); + + /* Fault pages back to system memory and check them. */ + for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i) + ASSERT_EQ(ptr[i], i); + + hmm_buffer_free(buffer); +} + /* * Migrate anonymous shared memory to device private memory. 
 */
From patchwork Tue Feb 9 01:07:19 2021
X-Patchwork-Submitter: Alistair Popple
X-Patchwork-Id: 1438061
From: Alistair Popple
Subject: [PATCH 6/9] nouveau/dmem: Only map migrating pages
Date: Tue, 9 Feb 2021 12:07:19 +1100
Message-ID: <20210209010722.13839-7-apopple@nvidia.com>
In-Reply-To: <20210209010722.13839-1-apopple@nvidia.com>
References: <20210209010722.13839-1-apopple@nvidia.com>

Only pages which were actually migrated should be mapped on the GPU. migrate_vma_pages() clears MIGRATE_PFN_MIGRATE in the src_pfn array, so test this prior to mapping the pages on the GPU. If any pages failed to migrate, don't install any mappings; the GPU will demand-fault them as required.
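The shape of the added check is simply the following (an illustrative helper, not the exact form used in the hunk below, which open-codes the loop in nouveau_dmem_migrate_chunk()):

/* Return true only if every page in the range still has MIGRATE_PFN_MIGRATE
 * set after migrate_vma_pages(); otherwise skip nouveau_pfns_map() and let
 * the GPU fault the range in on demand. */
static bool all_pages_migrated(const unsigned long *src, unsigned long npages)
{
	unsigned long i;

	for (i = 0; i < npages; i++)
		if (!(src[i] & MIGRATE_PFN_MIGRATE))
			return false;

	return true;
}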
Signed-off-by: Alistair Popple --- drivers/gpu/drm/nouveau/nouveau_dmem.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c index 92987daa5e17..9579bd001f11 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c @@ -618,8 +618,9 @@ static void nouveau_dmem_migrate_chunk(struct nouveau_drm *drm, dma_addr_t *dma_addrs, u64 *pfns) { struct nouveau_fence *fence; - unsigned long addr = args->start, nr_dma = 0, i; + unsigned long addr = args->start, nr_dma = 0, i, npages; + npages = (args->start - args->end) >> PAGE_SHIFT; for (i = 0; addr < args->end; i++) { args->dst[i] = nouveau_dmem_migrate_copy_one(drm, svmm, args->src[i], dma_addrs + nr_dma, pfns + i); @@ -631,7 +632,17 @@ static void nouveau_dmem_migrate_chunk(struct nouveau_drm *drm, nouveau_fence_new(drm->dmem->migrate.chan, false, &fence); migrate_vma_pages(args); nouveau_dmem_fence_done(&fence); - nouveau_pfns_map(svmm, args->vma->vm_mm, args->start, pfns, i); + + for (i = 0; i < npages; i++) + if (!(args->src[i] & MIGRATE_PFN_MIGRATE)) + break; + + /* + * If all pages were migrated successfully map them on the GPU. If any + * failed just let the GPU fault to create the mapping. + */ + if (i == npages) + nouveau_pfns_map(svmm, args->vma->vm_mm, args->start, pfns, npages); while (nr_dma--) { dma_unmap_page(drm->dev->dev, dma_addrs[nr_dma], PAGE_SIZE, From patchwork Tue Feb 9 01:07:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 1438063 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=nvidia.com header.i=@nvidia.com header.a=rsa-sha256 header.s=n1 header.b=WF7S7iK6; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4DZPwQ3qRjz9sWY for ; Tue, 9 Feb 2021 12:11:02 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230273AbhBIBK7 (ORCPT ); Mon, 8 Feb 2021 20:10:59 -0500 Received: from hqnvemgate24.nvidia.com ([216.228.121.143]:5937 "EHLO hqnvemgate24.nvidia.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229587AbhBIBKw (ORCPT ); Mon, 8 Feb 2021 20:10:52 -0500 Received: from hqmail.nvidia.com (Not Verified[216.228.121.13]) by hqnvemgate24.nvidia.com (using TLS: TLSv1.2, AES256-SHA) id ; Mon, 08 Feb 2021 17:09:43 -0800 Received: from DRHQMAIL107.nvidia.com (10.27.9.16) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Tue, 9 Feb 2021 01:09:42 +0000 Received: from localhost (172.20.145.6) by DRHQMAIL107.nvidia.com (10.27.9.16) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Tue, 9 Feb 2021 01:09:42 +0000 From: Alistair Popple To: , , , CC: , , , , , , , "Alistair Popple" Subject: [PATCH 7/9] nouveau/svm: Refactor nouveau_range_fault Date: Tue, 9 Feb 2021 12:07:20 +1100 Message-ID: <20210209010722.13839-8-apopple@nvidia.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210209010722.13839-1-apopple@nvidia.com> References: <20210209010722.13839-1-apopple@nvidia.com> MIME-Version: 1.0 
X-Originating-IP: [172.20.145.6] X-ClientProxiedBy: HQMAIL101.nvidia.com (172.20.187.10) To DRHQMAIL107.nvidia.com (10.27.9.16) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1612832983; bh=/0aXIpMNQSL0FUewLuYlelhhBUIfEJ6iMMcRUZ+kk+8=; h=From:To:CC:Subject:Date:Message-ID:X-Mailer:In-Reply-To: References:MIME-Version:Content-Transfer-Encoding:Content-Type: X-Originating-IP:X-ClientProxiedBy; b=WF7S7iK6/QWbrW9k0bDZ2GomZQO/T/YDGZc9n//0dm9ost/4B48mXEu6WaksKERkw jCWu1McL+qQAAxUhZGe85iX2RnhDmETJ6L/BZR6j9TtP9JM+QJyWZ6M//A+Ijh88XO /+AzmUBT/FdsjyOPMK5ttSrp3TNEHVVWidorRX2noCwW14IhGpoX0cz693HQ2QqdFn wi3fk44sBuSeILPVG1edpKN3ghizzLPtdsqn/l6QuTQPn41YMYVBtFS3lwgDhutbYm OOcbZ0YV2FGxZnzFce6e2K/UjCLafBcHGOwacYZMjG+VfTCTQ7k9q4qdV3l1z0q7fr GHRFvlsMqL36A== Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Call mmu_interval_notifier_insert() as part of nouveau_range_fault(). This doesn't introduce any functional change but makes it easier for a subsequent patch to alter the behaviour of nouveau_range_fault() to support GPU atomic operations. Signed-off-by: Alistair Popple --- drivers/gpu/drm/nouveau/nouveau_svm.c | 34 ++++++++++++++++----------- 1 file changed, 20 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c index 1c3f890377d2..63332387402e 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.c +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c @@ -567,18 +567,27 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm, unsigned long hmm_pfns[1]; struct hmm_range range = { .notifier = ¬ifier->notifier, - .start = notifier->notifier.interval_tree.start, - .end = notifier->notifier.interval_tree.last + 1, .default_flags = hmm_flags, .hmm_pfns = hmm_pfns, .dev_private_owner = drm->dev, }; - struct mm_struct *mm = notifier->notifier.mm; + struct mm_struct *mm = svmm->notifier.mm; int ret; + ret = mmu_interval_notifier_insert(¬ifier->notifier, mm, + args->p.addr, args->p.size, + &nouveau_svm_mni_ops); + if (ret) + return ret; + + range.start = notifier->notifier.interval_tree.start; + range.end = notifier->notifier.interval_tree.last + 1; + while (true) { - if (time_after(jiffies, timeout)) - return -EBUSY; + if (time_after(jiffies, timeout)) { + ret = -EBUSY; + goto out; + } range.notifier_seq = mmu_interval_read_begin(range.notifier); mmap_read_lock(mm); @@ -587,7 +596,7 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm, if (ret) { if (ret == -EBUSY) continue; - return ret; + goto out; } mutex_lock(&svmm->mutex); @@ -606,6 +615,9 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm, svmm->vmm->vmm.object.client->super = false; mutex_unlock(&svmm->mutex); +out: + mmu_interval_notifier_remove(¬ifier->notifier); + return ret; } @@ -727,14 +739,8 @@ nouveau_svm_fault(struct nvif_notify *notify) } notifier.svmm = svmm; - ret = mmu_interval_notifier_insert(¬ifier.notifier, mm, - args.i.p.addr, args.i.p.size, - &nouveau_svm_mni_ops); - if (!ret) { - ret = nouveau_range_fault(svmm, svm->drm, &args.i, - sizeof(args), hmm_flags, ¬ifier); - mmu_interval_notifier_remove(¬ifier.notifier); - } + ret = nouveau_range_fault(svmm, svm->drm, &args.i, + sizeof(args), hmm_flags, ¬ifier); mmput(mm); limit = args.i.p.addr + args.i.p.size; From patchwork Tue Feb 9 01:07:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alistair Popple X-Patchwork-Id: 1438062 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: 
From: Alistair Popple
Subject: [PATCH 8/9] nouveau/dmem: Add support for multiple page types
Date: Tue, 9 Feb 2021 12:07:21 +1100
Message-ID: <20210209010722.13839-9-apopple@nvidia.com>
In-Reply-To: <20210209010722.13839-1-apopple@nvidia.com>
References: <20210209010722.13839-1-apopple@nvidia.com>

Device private pages are used to track a per-page migrate_to_ram() callback, which is called when the CPU attempts to access a GPU page. Currently the same callback is used for all GPU pages tracked by Nouveau. However a future patch requires support for calling a different callback when accessing some GPU pages.

This patch extends the existing Nouveau device private page allocator to make it easier to allocate device private pages with different callbacks but should not introduce any functional changes.
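In condensed form the allocator change amounts to keeping one free list per page type, threaded through page->zone_device_data (a sketch under that reading of the diff below; pop_free_dmem_page() is an illustrative name, the real function is nouveau_dmem_page_alloc_locked(), which also handles locking and new-chunk allocation):

static struct page *pop_free_dmem_page(struct nouveau_dmem *dmem,
				       enum nouveau_dmem_type type)
{
	/* Each type has its own singly linked list of free pages. */
	struct page *page = dmem->free_pages[type];

	if (page)
		dmem->free_pages[type] = page->zone_device_data;

	return page;	/* NULL means a new chunk of this type is needed */
}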
Signed-off-by: Alistair Popple --- drivers/gpu/drm/nouveau/nouveau_dmem.c | 27 ++++++++++++++------------ drivers/gpu/drm/nouveau/nouveau_dmem.h | 5 +++++ 2 files changed, 20 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c index 9579bd001f11..8fb4949f3778 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c @@ -67,6 +67,7 @@ struct nouveau_dmem_chunk { struct nouveau_bo *bo; struct nouveau_drm *drm; unsigned long callocated; + enum nouveau_dmem_type type; struct dev_pagemap pagemap; }; @@ -81,7 +82,7 @@ struct nouveau_dmem { struct nouveau_dmem_migrate migrate; struct list_head chunks; struct mutex mutex; - struct page *free_pages; + struct page *free_pages[NOUVEAU_DMEM_NTYPES]; spinlock_t lock; }; @@ -112,8 +113,8 @@ static void nouveau_dmem_page_free(struct page *page) struct nouveau_dmem *dmem = chunk->drm->dmem; spin_lock(&dmem->lock); - page->zone_device_data = dmem->free_pages; - dmem->free_pages = page; + page->zone_device_data = dmem->free_pages[chunk->type]; + dmem->free_pages[chunk->type] = page; WARN_ON(!chunk->callocated); chunk->callocated--; @@ -224,7 +225,8 @@ static const struct dev_pagemap_ops nouveau_dmem_pagemap_ops = { }; static int -nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage) +nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage, + enum nouveau_dmem_type type) { struct nouveau_dmem_chunk *chunk; struct resource *res; @@ -248,6 +250,7 @@ nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage) } chunk->drm = drm; + chunk->type = type; chunk->pagemap.type = MEMORY_DEVICE_PRIVATE; chunk->pagemap.range.start = res->start; chunk->pagemap.range.end = res->end; @@ -279,8 +282,8 @@ nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage) page = pfn_to_page(pfn_first); spin_lock(&drm->dmem->lock); for (i = 0; i < DMEM_CHUNK_NPAGES - 1; ++i, ++page) { - page->zone_device_data = drm->dmem->free_pages; - drm->dmem->free_pages = page; + page->zone_device_data = drm->dmem->free_pages[type]; + drm->dmem->free_pages[type] = page; } *ppage = page; chunk->callocated++; @@ -304,22 +307,22 @@ nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage) } static struct page * -nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm) +nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm, enum nouveau_dmem_type type) { struct nouveau_dmem_chunk *chunk; struct page *page = NULL; int ret; spin_lock(&drm->dmem->lock); - if (drm->dmem->free_pages) { - page = drm->dmem->free_pages; - drm->dmem->free_pages = page->zone_device_data; + if (drm->dmem->free_pages[type]) { + page = drm->dmem->free_pages[type]; + drm->dmem->free_pages[type] = page->zone_device_data; chunk = nouveau_page_to_chunk(page); chunk->callocated++; spin_unlock(&drm->dmem->lock); } else { spin_unlock(&drm->dmem->lock); - ret = nouveau_dmem_chunk_alloc(drm, &page); + ret = nouveau_dmem_chunk_alloc(drm, &page, type); if (ret) return NULL; } @@ -577,7 +580,7 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm, if (!(src & MIGRATE_PFN_MIGRATE)) goto out; - dpage = nouveau_dmem_page_alloc_locked(drm); + dpage = nouveau_dmem_page_alloc_locked(drm, NOUVEAU_DMEM); if (!dpage) goto out; diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.h b/drivers/gpu/drm/nouveau/nouveau_dmem.h index 64da5d3635c8..02e261c4acf1 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.h +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.h @@ -28,6 
+28,11 @@ struct nouveau_drm; struct nouveau_svmm; struct hmm_range; +enum nouveau_dmem_type { + NOUVEAU_DMEM, + NOUVEAU_DMEM_NTYPES, /* Number of types, must be last */ +}; + #if IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM) void nouveau_dmem_init(struct nouveau_drm *); void nouveau_dmem_fini(struct nouveau_drm *);
From patchwork Tue Feb 9 01:07:22 2021
X-Patchwork-Submitter: Alistair Popple
X-Patchwork-Id: 1438064
From: Alistair Popple
Subject: [PATCH 9/9] nouveau/svm: Implement atomic SVM access
Date: Tue, 9 Feb 2021 12:07:22 +1100
Message-ID: <20210209010722.13839-10-apopple@nvidia.com>
In-Reply-To: <20210209010722.13839-1-apopple@nvidia.com>
References: <20210209010722.13839-1-apopple@nvidia.com>

Some NVIDIA GPUs do not support direct atomic access to system memory via PCIe. Instead this must be emulated by granting the GPU exclusive access to the memory. This is achieved by migrating the userspace mappings to device private pages whilst leaving the actual page in place.
The driver then grants the GPU permission to update the page undergoing atomic access via the GPU page tables. When CPU access to the page is required a CPU fault is raised which calls into the device driver via migrate_to_ram() to remap the page into the CPU page tables and revoke GPU access. Signed-off-by: Alistair Popple --- drivers/gpu/drm/nouveau/include/nvif/if000c.h | 1 + drivers/gpu/drm/nouveau/nouveau_dmem.c | 148 ++++++++++++++++-- drivers/gpu/drm/nouveau/nouveau_dmem.h | 4 + drivers/gpu/drm/nouveau/nouveau_svm.c | 116 ++++++++++++-- drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 1 + .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 6 + 6 files changed, 249 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/nouveau/include/nvif/if000c.h b/drivers/gpu/drm/nouveau/include/nvif/if000c.h index d6dd40f21eed..9c7ff56831c5 100644 --- a/drivers/gpu/drm/nouveau/include/nvif/if000c.h +++ b/drivers/gpu/drm/nouveau/include/nvif/if000c.h @@ -77,6 +77,7 @@ struct nvif_vmm_pfnmap_v0 { #define NVIF_VMM_PFNMAP_V0_APER 0x00000000000000f0ULL #define NVIF_VMM_PFNMAP_V0_HOST 0x0000000000000000ULL #define NVIF_VMM_PFNMAP_V0_VRAM 0x0000000000000010ULL +#define NVIF_VMM_PFNMAP_V0_A 0x0000000000000004ULL #define NVIF_VMM_PFNMAP_V0_W 0x0000000000000002ULL #define NVIF_VMM_PFNMAP_V0_V 0x0000000000000001ULL #define NVIF_VMM_PFNMAP_V0_NONE 0x0000000000000000ULL diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c index 8fb4949f3778..7b103670af56 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c @@ -69,6 +69,7 @@ struct nouveau_dmem_chunk { unsigned long callocated; enum nouveau_dmem_type type; struct dev_pagemap pagemap; + struct page **atomic_pages; }; struct nouveau_dmem_migrate { @@ -107,12 +108,51 @@ unsigned long nouveau_dmem_page_addr(struct page *page) return chunk->bo->offset + off; } +static struct page *nouveau_dpage_to_atomic(struct page *dpage) +{ + struct nouveau_dmem_chunk *chunk = nouveau_page_to_chunk(dpage); + int index; + + if (WARN_ON_ONCE(chunk->type != NOUVEAU_ATOMIC)) + return NULL; + + index = page_to_pfn(dpage) - (chunk->pagemap.range.start >> PAGE_SHIFT); + + if (WARN_ON_ONCE(index > DMEM_CHUNK_NPAGES)) + return NULL; + + return chunk->atomic_pages[index]; +} + +void nouveau_dmem_set_atomic(struct page *dpage, struct page *atomic_page) +{ + struct nouveau_dmem_chunk *chunk = nouveau_page_to_chunk(dpage); + int index; + + if (WARN_ON_ONCE(chunk->type != NOUVEAU_ATOMIC)) + return; + + index = page_to_pfn(dpage) - (chunk->pagemap.range.start >> PAGE_SHIFT); + + if (WARN_ON_ONCE(index > DMEM_CHUNK_NPAGES)) + return; + + chunk->atomic_pages[index] = atomic_page; +} + static void nouveau_dmem_page_free(struct page *page) { struct nouveau_dmem_chunk *chunk = nouveau_page_to_chunk(page); struct nouveau_dmem *dmem = chunk->drm->dmem; spin_lock(&dmem->lock); + if (chunk->type == NOUVEAU_ATOMIC) { + struct page *atomic_page; + + atomic_page = nouveau_dpage_to_atomic(page); + if (atomic_page) + unpin_user_page(atomic_page); + } page->zone_device_data = dmem->free_pages[chunk->type]; dmem->free_pages[chunk->type] = page; @@ -125,6 +165,65 @@ static void nouveau_dmem_page_free(struct page *page) spin_unlock(&dmem->lock); } +static vm_fault_t nouveau_atomic_page_migrate(struct vm_fault *vmf) +{ + struct nouveau_drm *drm = page_to_drm(vmf->page); + struct page *dpage = vmf->page; + struct nouveau_svmm *svmm = dpage->zone_device_data; + struct migrate_vma args; + unsigned long src_pfn = 0, dst_pfn = 0; + 
struct page *src_page, *old_page; + int retry; + int ret = 0; + + args.vma = vmf->vma; + args.src = &src_pfn; + args.dst = &dst_pfn; + args.start = vmf->address; + args.end = vmf->address + PAGE_SIZE; + args.pgmap_owner = drm->dev; + args.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE; + + for (retry = 0; retry < 10; retry++) { + ret = migrate_vma_setup(&args); + if (ret) { + ret = VM_FAULT_SIGBUS; + goto out; + } + + if (src_pfn & MIGRATE_PFN_MIGRATE) + break; + + cond_resched(); + } + + src_page = migrate_pfn_to_page(src_pfn); + old_page = nouveau_dpage_to_atomic(dpage); + + /* + * If we can't get a lock on the atomic page it means another thread is + * racing or migrating so retry the fault. + */ + if (src_page && trylock_page(old_page)) + dst_pfn = migrate_pfn(page_to_pfn(old_page)) | + MIGRATE_PFN_LOCKED | MIGRATE_PFN_UNPIN; + else + ret = VM_FAULT_RETRY; + + migrate_vma_pages(&args); + if (src_pfn & MIGRATE_PFN_MIGRATE) { + nouveau_svmm_invalidate(svmm, args.start, args.end); + nouveau_dmem_set_atomic(dpage, NULL); + } + migrate_vma_finalize(&args); + +out: + if (ret == VM_FAULT_RETRY) + mmap_read_unlock(vmf->vma->vm_mm); + + return ret; +} + static void nouveau_dmem_fence_done(struct nouveau_fence **fence) { if (fence) { @@ -224,6 +323,11 @@ static const struct dev_pagemap_ops nouveau_dmem_pagemap_ops = { .migrate_to_ram = nouveau_dmem_migrate_to_ram, }; +static const struct dev_pagemap_ops nouveau_atomic_pagemap_ops = { + .page_free = nouveau_dmem_page_free, + .migrate_to_ram = nouveau_atomic_page_migrate, +}; + static int nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage, enum nouveau_dmem_type type) @@ -255,18 +359,30 @@ nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage, chunk->pagemap.range.start = res->start; chunk->pagemap.range.end = res->end; chunk->pagemap.nr_range = 1; - chunk->pagemap.ops = &nouveau_dmem_pagemap_ops; + if (type == NOUVEAU_DMEM) + chunk->pagemap.ops = &nouveau_dmem_pagemap_ops; + else + chunk->pagemap.ops = &nouveau_atomic_pagemap_ops; chunk->pagemap.owner = drm->dev; - ret = nouveau_bo_new(&drm->client, DMEM_CHUNK_SIZE, 0, - NOUVEAU_GEM_DOMAIN_VRAM, 0, 0, NULL, NULL, - &chunk->bo); - if (ret) - goto out_release; + if (type == NOUVEAU_DMEM) { + ret = nouveau_bo_new(&drm->client, DMEM_CHUNK_SIZE, 0, + NOUVEAU_GEM_DOMAIN_VRAM, 0, 0, NULL, NULL, + &chunk->bo); + if (ret) + goto out_release; - ret = nouveau_bo_pin(chunk->bo, NOUVEAU_GEM_DOMAIN_VRAM, false); - if (ret) - goto out_bo_free; + ret = nouveau_bo_pin(chunk->bo, NOUVEAU_GEM_DOMAIN_VRAM, false); + if (ret) + goto out_bo_free; + } else { + chunk->atomic_pages = kcalloc(DMEM_CHUNK_NPAGES, + sizeof(void *), GFP_KERNEL); + if (!chunk->atomic_pages) { + ret = -ENOMEM; + goto out_release; + } + } ptr = memremap_pages(&chunk->pagemap, numa_node_id()); if (IS_ERR(ptr)) { @@ -289,8 +405,8 @@ nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage, chunk->callocated++; spin_unlock(&drm->dmem->lock); - NV_INFO(drm, "DMEM: registered %ldMB of device memory\n", - DMEM_CHUNK_SIZE >> 20); + NV_INFO(drm, "DMEM: registered %ldMB of device %smemory\n", + DMEM_CHUNK_SIZE >> 20, type == NOUVEAU_ATOMIC ? 
"atomic " : ""); return 0; @@ -306,7 +422,7 @@ nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage, return ret; } -static struct page * +struct page * nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm, enum nouveau_dmem_type type) { struct nouveau_dmem_chunk *chunk; @@ -382,8 +498,12 @@ nouveau_dmem_fini(struct nouveau_drm *drm) mutex_lock(&drm->dmem->mutex); list_for_each_entry_safe(chunk, tmp, &drm->dmem->chunks, list) { - nouveau_bo_unpin(chunk->bo); - nouveau_bo_ref(NULL, &chunk->bo); + if (chunk->type == NOUVEAU_DMEM) { + nouveau_bo_unpin(chunk->bo); + nouveau_bo_ref(NULL, &chunk->bo); + } else { + kfree(chunk->atomic_pages); + } list_del(&chunk->list); memunmap_pages(&chunk->pagemap); release_mem_region(chunk->pagemap.range.start, diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.h b/drivers/gpu/drm/nouveau/nouveau_dmem.h index 02e261c4acf1..6b52a7a8dea4 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.h +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.h @@ -30,6 +30,7 @@ struct hmm_range; enum nouveau_dmem_type { NOUVEAU_DMEM, + NOUVEAU_ATOMIC, NOUVEAU_DMEM_NTYPES, /* Number of types, must be last */ }; @@ -45,6 +46,9 @@ int nouveau_dmem_migrate_vma(struct nouveau_drm *drm, unsigned long start, unsigned long end); unsigned long nouveau_dmem_page_addr(struct page *page); +struct page *nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm, + enum nouveau_dmem_type type); +void nouveau_dmem_set_atomic(struct page *dpage, struct page *atomic_page); #else /* IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM) */ static inline void nouveau_dmem_init(struct nouveau_drm *drm) {} diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c index 63332387402e..e571c6907bfd 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.c +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c @@ -35,6 +35,7 @@ #include #include #include +#include struct nouveau_svm { struct nouveau_drm *drm; @@ -421,9 +422,9 @@ nouveau_svm_fault_cmp(const void *a, const void *b) return ret; if ((ret = (s64)fa->addr - fb->addr)) return ret; - /*XXX: atomic? 
*/ - return (fa->access == 0 || fa->access == 3) - - (fb->access == 0 || fb->access == 3); + /* Atomic access (2) has highest priority */ + return (-1*(fa->access == 2) + (fa->access == 0 || fa->access == 3)) - + (-1*(fb->access == 2) + (fb->access == 0 || fb->access == 3)); } static void @@ -555,10 +556,86 @@ static void nouveau_hmm_convert_pfn(struct nouveau_drm *drm, args->p.phys[0] |= NVIF_VMM_PFNMAP_V0_W; } +static int nouveau_atomic_range_fault(struct nouveau_svmm *svmm, + struct nouveau_drm *drm, + struct nouveau_pfnmap_args *args, u32 size, + unsigned long hmm_flags, struct mm_struct *mm) +{ + struct page *dpage, *oldpage; + struct migrate_vma migrate_args; + unsigned long src_pfn = MIGRATE_PFN_PIN, dst_pfn = 0, start = args->p.addr; + int ret = 0; + + mmap_read_lock(mm); + + migrate_args.src = &src_pfn; + migrate_args.dst = &dst_pfn; + migrate_args.start = start; + migrate_args.end = migrate_args.start + PAGE_SIZE; + migrate_args.pgmap_owner = NULL; + migrate_args.flags = MIGRATE_VMA_SELECT_SYSTEM; + migrate_args.vma = find_vma_intersection(mm, migrate_args.start, migrate_args.end); + + ret = migrate_vma_setup(&migrate_args); + + if (ret) { + SVMM_ERR(svmm, "Unable to setup atomic page migration"); + ret = -ENOMEM; + goto out; + } + + oldpage = migrate_pfn_to_page(src_pfn); + + if (src_pfn & MIGRATE_PFN_MIGRATE) { + dpage = nouveau_dmem_page_alloc_locked(drm, NOUVEAU_ATOMIC); + if (!dpage) { + SVMM_ERR(svmm, "Unable to allocate atomic page"); + *migrate_args.dst = 0; + } else { + nouveau_dmem_set_atomic(dpage, oldpage); + dpage->zone_device_data = svmm; + *migrate_args.dst = migrate_pfn(page_to_pfn(dpage)) | + MIGRATE_PFN_LOCKED; + } + } + + migrate_vma_pages(&migrate_args); + + /* Map the page on the GPU for successfully migrated mappings. Do this + * before removing the migration PTEs in migrate_vma_finalize() as once + * that happens a CPU thread can race and invalidate the GPU mappings + * prior to mapping them here. 
+ */ + if (src_pfn & MIGRATE_PFN_MIGRATE) { + args->p.page = 12; + args->p.size = PAGE_SIZE; + args->p.addr = start; + args->p.phys[0] = page_to_phys(oldpage) | + NVIF_VMM_PFNMAP_V0_V | + NVIF_VMM_PFNMAP_V0_A | + NVIF_VMM_PFNMAP_V0_HOST; + + if (migrate_args.vma->vm_flags & VM_WRITE) + args->p.phys[0] |= NVIF_VMM_PFNMAP_V0_W; + + mutex_lock(&svmm->mutex); + svmm->vmm->vmm.object.client->super = true; + ret = nvif_object_ioctl(&svmm->vmm->vmm.object, args, size, NULL); + svmm->vmm->vmm.object.client->super = false; + mutex_unlock(&svmm->mutex); + } + + migrate_vma_finalize(&migrate_args); + +out: + mmap_read_unlock(mm); + return ret; +} + static int nouveau_range_fault(struct nouveau_svmm *svmm, struct nouveau_drm *drm, struct nouveau_pfnmap_args *args, u32 size, - unsigned long hmm_flags, + unsigned long hmm_flags, int atomic, struct svm_notifier *notifier) { unsigned long timeout = @@ -608,12 +685,18 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm, break; } - nouveau_hmm_convert_pfn(drm, &range, args); + if (atomic) { + mutex_unlock(&svmm->mutex); + ret = nouveau_atomic_range_fault(svmm, drm, args, + size, hmm_flags, mm); + } else { + nouveau_hmm_convert_pfn(drm, &range, args); - svmm->vmm->vmm.object.client->super = true; - ret = nvif_object_ioctl(&svmm->vmm->vmm.object, args, size, NULL); - svmm->vmm->vmm.object.client->super = false; - mutex_unlock(&svmm->mutex); + svmm->vmm->vmm.object.client->super = true; + ret = nvif_object_ioctl(&svmm->vmm->vmm.object, args, size, NULL); + svmm->vmm->vmm.object.client->super = false; + mutex_unlock(&svmm->mutex); + } out: mmu_interval_notifier_remove(¬ifier->notifier); @@ -637,7 +720,7 @@ nouveau_svm_fault(struct nvif_notify *notify) unsigned long hmm_flags; u64 inst, start, limit; int fi, fn; - int replay = 0, ret; + int replay = 0, atomic = 0, ret; /* Parse available fault buffer entries into a cache, and update * the GET pointer so HW can reuse the entries. @@ -718,12 +801,15 @@ nouveau_svm_fault(struct nvif_notify *notify) /* * Determine required permissions based on GPU fault * access flags. - * XXX: atomic? */ switch (buffer->fault[fi]->access) { case 0: /* READ. */ hmm_flags = HMM_PFN_REQ_FAULT; break; + case 2: /* ATOMIC. */ + hmm_flags = HMM_PFN_REQ_FAULT | HMM_PFN_REQ_WRITE; + atomic = true; + break; case 3: /* PREFETCH. */ hmm_flags = 0; break; @@ -740,7 +826,7 @@ nouveau_svm_fault(struct nvif_notify *notify) notifier.svmm = svmm; ret = nouveau_range_fault(svmm, svm->drm, &args.i, - sizeof(args), hmm_flags, ¬ifier); + sizeof(args), hmm_flags, atomic, ¬ifier); mmput(mm); limit = args.i.p.addr + args.i.p.size; @@ -760,7 +846,11 @@ nouveau_svm_fault(struct nvif_notify *notify) !(args.phys[0] & NVIF_VMM_PFNMAP_V0_V)) || (buffer->fault[fi]->access != 0 /* READ. */ && buffer->fault[fi]->access != 3 /* PREFETCH. */ && - !(args.phys[0] & NVIF_VMM_PFNMAP_V0_W))) + !(args.phys[0] & NVIF_VMM_PFNMAP_V0_W)) || + (buffer->fault[fi]->access != 0 /* READ. */ && + buffer->fault[fi]->access != 1 /* WRITE. */ && + buffer->fault[fi]->access != 3 /* PREFETCH. 
*/ && + !(args.phys[0] & NVIF_VMM_PFNMAP_V0_A))) break; } diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h index a2b179568970..f6188aa9171c 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h @@ -178,6 +178,7 @@ void nvkm_vmm_unmap_region(struct nvkm_vmm *, struct nvkm_vma *); #define NVKM_VMM_PFN_APER 0x00000000000000f0ULL #define NVKM_VMM_PFN_HOST 0x0000000000000000ULL #define NVKM_VMM_PFN_VRAM 0x0000000000000010ULL +#define NVKM_VMM_PFN_A 0x0000000000000004ULL #define NVKM_VMM_PFN_W 0x0000000000000002ULL #define NVKM_VMM_PFN_V 0x0000000000000001ULL #define NVKM_VMM_PFN_NONE 0x0000000000000000ULL diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c index 236db5570771..f02abd9cb4dd 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c @@ -88,6 +88,9 @@ gp100_vmm_pgt_pfn(struct nvkm_vmm *vmm, struct nvkm_mmu_pt *pt, if (!(*map->pfn & NVKM_VMM_PFN_W)) data |= BIT_ULL(6); /* RO. */ + if (!(*map->pfn & NVKM_VMM_PFN_A)) + data |= BIT_ULL(7); /* Atomic disable. */ + if (!(*map->pfn & NVKM_VMM_PFN_VRAM)) { addr = *map->pfn >> NVKM_VMM_PFN_ADDR_SHIFT; addr = dma_map_page(dev, pfn_to_page(addr), 0, @@ -322,6 +325,9 @@ gp100_vmm_pd0_pfn(struct nvkm_vmm *vmm, struct nvkm_mmu_pt *pt, if (!(*map->pfn & NVKM_VMM_PFN_W)) data |= BIT_ULL(6); /* RO. */ + if (!(*map->pfn & NVKM_VMM_PFN_A)) + data |= BIT_ULL(7); /* Atomic disable. */ + if (!(*map->pfn & NVKM_VMM_PFN_VRAM)) { addr = *map->pfn >> NVKM_VMM_PFN_ADDR_SHIFT; addr = dma_map_page(dev, pfn_to_page(addr), 0,
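For reference, the access codes used above are 0 (READ), 1 (WRITE), 2 (ATOMIC) and 3 (PREFETCH). The updated nouveau_svm_fault_cmp() scores each fault so that, for a given address, an atomic fault sorts ahead of a write fault, which in turn sorts ahead of a read or prefetch fault. A minimal restatement of that scoring (illustrative helper, not part of the patch):

/* Lower score sorts first: ATOMIC (-1) < WRITE (0) < READ/PREFETCH (+1). */
static int fault_access_score(unsigned int access)
{
	return -1 * (access == 2) + (access == 0 || access == 3);
}

With this helper, the comparator's final step is simply fault_access_score(fa->access) - fault_access_score(fb->access) once instance and address compare equal.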
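On the GPU side, the vmmgp100.c changes gate atomics in the page tables: when the new NVKM_VMM_PFN_A flag is clear in the PFN word, bit 7 (atomic disable) is set in the PTE, alongside the existing read-only handling of bit 6. A minimal sketch of that translation, using only the flag bits shown in the patch (the helper name is illustrative, not driver code):

/* Illustrative only: PTE access bits derived from a PFN word, per the
 * gp100_vmm_pgt_pfn()/gp100_vmm_pd0_pfn() hunks above. */
static u64 gp100_pte_access_bits(u64 pfn)
{
	u64 data = 0;

	if (!(pfn & NVKM_VMM_PFN_W))	/* 0x2: writeable */
		data |= BIT_ULL(6);	/* RO */
	if (!(pfn & NVKM_VMM_PFN_A))	/* 0x4: atomics allowed */
		data |= BIT_ULL(7);	/* Atomic disable */
	return data;
}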