From patchwork Fri Apr 2 00:56:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 1461530 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=jEH+4lhN; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4FBM8N6mqcz9sj0 for ; Fri, 2 Apr 2021 11:57:08 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234117AbhDBA5H (ORCPT ); Thu, 1 Apr 2021 20:57:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40342 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234029AbhDBA5F (ORCPT ); Thu, 1 Apr 2021 20:57:05 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 06373C06178A for ; Thu, 1 Apr 2021 17:57:04 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id v6so7612534ybk.9 for ; Thu, 01 Apr 2021 17:57:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=DA+Fdq2lOX/CZ+xUDvMaD5HbV7riu4wHkKUqD8IM4/k=; b=jEH+4lhN683scpuXyDyNcFSmirhDLEuGk0lJE37N6PVMAVFNUPQDsSlroxmScN5YO3 IBCGvHhBjtc8BMlUOqXKu1gYXzYw4MZpMUA7myDRgmdzVC/ZR0PTr4HKxjo/4SCJHEih Y08yAAvk7+fWeRM2OJoXDnImvQy8YXKdFHJEsuWkAfDOWtaTi+0uKhWpi3Ie7p/zDjMP iq644k8yugcuqQZ9i1mkoun+uj7px8E1zhFnLvbBtwUp/G6IJhnB5biBIIFY7KsXYCJg jnmfpegvWrj9R8JFD2AZ/HKcezGw+adh+gF5gk/hNaz+UDc8nBr71LdSLSS+hgpp5xK2 yJkg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=DA+Fdq2lOX/CZ+xUDvMaD5HbV7riu4wHkKUqD8IM4/k=; b=Zi5+7QOz6271GYS6wWYjGfa9Tx6lLNTsCE9pl+bX/JXdtdJgAwa1cq0ntydR38xZB3 ddsIIZIG+HAc5tYJXt5wlm/uUyHGwKr6q7//JGgGx95T0q+eVFHIsaTbeyJp83TfpE4v fRxO9kKUFjTT1FWBLh+++3us/KZL7x58QnKYNHL4GaFaRF7TK5cZKQxDt+bWaDLNbNVI CXr48XmN7k3kK4OoxFcSHEBEbmkyaxTurfFGFn59y36yI4lDlPRiR8Bb8M9njOqHLCVO +r1HTzIKwO+w+BvfCu5U9H6xCrwm+m+YQElJtazWkeRkO5AiyAoMX3DFx7n6POfTou7B qYtw== X-Gm-Message-State: AOAM530FAqMp87Ub9DYbj286BPGCWZAO25N+IkKyOCv2YW7PVKvlDlpy S+G0AQ3cjnWIsy8R6+fUVio30VfSrig= X-Google-Smtp-Source: ABdhPJygeKAdy1Ue7ggQyWInnKM/181me24PPoeX8s/XsaZF77/exnxBmdxBDDEECxkfyMlEpqfXzJSZ32w= X-Received: from seanjc798194.pdx.corp.google.com ([2620:15c:f:10:c0b4:8b8:bb34:6a56]) (user=seanjc job=sendgmr) by 2002:a25:af05:: with SMTP id a5mr16043377ybh.86.1617325023277; Thu, 01 Apr 2021 17:57:03 -0700 (PDT) Reply-To: Sean Christopherson Date: Thu, 1 Apr 2021 17:56:49 -0700 In-Reply-To: <20210402005658.3024832-1-seanjc@google.com> Message-Id: <20210402005658.3024832-2-seanjc@google.com> Mime-Version: 1.0 References: <20210402005658.3024832-1-seanjc@google.com> X-Mailer: git-send-email 2.31.0.208.g409f899ff0-goog Subject: [PATCH v2 01/10] KVM: Assert that notifier count is elevated in .change_pte() From: Sean Christopherson To: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Paul Mackerras , Paolo Bonzini Cc: James Morse , Julien Thierry , Suzuki K Poulose , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org In KVM's .change_pte() notification callback, replace the notifier sequence bump with a WARN_ON assertion that the notifier count is elevated. An elevated count provides stricter protections than bumping the sequence, and the sequence is guarnateed to be bumped before the count hits zero. When .change_pte() was added by commit 828502d30073 ("ksm: add mmu_notifier set_pte_at_notify()"), bumping the sequence was necessary as .change_pte() would be invoked without any surrounding notifications. However, since commit 6bdb913f0a70 ("mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end"), all calls to .change_pte() are guaranteed to be bookended by start() and end(), and so are guaranteed to run with an elevated notifier count. Note, wrapping .change_pte() with .invalidate_range_{start,end}() is a bug of sorts, as invalidating the secondary MMU's (KVM's) PTE defeats the purpose of .change_pte(). Every arch's kvm_set_spte_hva() assumes .change_pte() is called when the relevant SPTE is present in KVM's MMU, as the original goal was to accelerate Kernel Samepage Merging (KSM) by updating KVM's SPTEs without requiring a VM-Exit (due to invalidating the SPTE). I.e. it means that .change_pte() is effectively dead code on _all_ architectures. x86 and MIPS are clearcut nops if the old SPTE is not-present, and that is guaranteed due to the prior invalidation. PPC simply unmaps the SPTE, which again should be a nop due to the invalidation. arm64 is a bit murky, but it's also likely a nop because kvm_pgtable_stage2_map() is called without a cache pointer, which means it will map an entry if and only if an existing PTE was found. For now, take advantage of the bug to simplify future consolidation of KVMs's MMU notifier code. Doing so will not greatly complicate fixing .change_pte(), assuming it's even worth fixing. .change_pte() has been broken for 8+ years and no one has complained. Even if there are KSM+KVM users that care deeply about its performance, the benefits of avoiding VM-Exits via .change_pte() need to be reevaluated to justify the added complexity and testing burden. Ripping out .change_pte() entirely would be a lot easier. Signed-off-by: Sean Christopherson --- virt/kvm/kvm_main.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index d1de843b7618..8df091950161 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -461,12 +461,17 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, trace_kvm_set_spte_hva(address); + /* + * .change_pte() must be bookended by .invalidate_range_{start,end}(), + * and so always runs with an elevated notifier count. This obviates + * the need to bump the sequence count. + */ + WARN_ON_ONCE(!kvm->mmu_notifier_count); + idx = srcu_read_lock(&kvm->srcu); KVM_MMU_LOCK(kvm); - kvm->mmu_notifier_seq++; - if (kvm_set_spte_hva(kvm, address, pte)) kvm_flush_remote_tlbs(kvm); From patchwork Fri Apr 2 00:56:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 1461531 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=ZEPo10A/; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4FBM8Q6M30z9t18 for ; Fri, 2 Apr 2021 11:57:10 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234184AbhDBA5J (ORCPT ); Thu, 1 Apr 2021 20:57:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40360 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234072AbhDBA5G (ORCPT ); Thu, 1 Apr 2021 20:57:06 -0400 Received: from mail-qv1-xf4a.google.com (mail-qv1-xf4a.google.com [IPv6:2607:f8b0:4864:20::f4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 39F1DC061797 for ; Thu, 1 Apr 2021 17:57:06 -0700 (PDT) Received: by mail-qv1-xf4a.google.com with SMTP id b15so4377138qvz.15 for ; Thu, 01 Apr 2021 17:57:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=cER1R/dfFtJjfhbrNzOyLV9RRl0KhAMFTqRfmNRSD/k=; b=ZEPo10A/i5CuK8Z7a3kFvJAVebClAmO1nr7bsFWBFhyT1y63s6sVcIDp9HPfiGzcrE u8KySeZmzirnyGQvb78/ORRpezMt/5dSJN1qvOHxoh25TUTPQdwOdzvInJgbCDQQ64mP Ah24+JFERvRlpVbxMPQdVSmzKh7oZ1iQD7Bia49W1pVT8BgOOKi+f1F4M6HCBH9esSp7 gDp3wYD+bIhpqb5rCM0q7IXFx3ATCu8BnCvhUuTZHOmjh5MtcMA4FxdD4XcXrj9f0+KA jlkbJaUSMcDs5bHAkhIk2+Qh6tcl6iY+1Dbj6Ds+OGY184gsk9HFn8xjY86baHqxEVPl COlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=cER1R/dfFtJjfhbrNzOyLV9RRl0KhAMFTqRfmNRSD/k=; b=WXxfVUWCAJjMDypRBm1VZ8POD71ykBPix/xi4JwTZKf67dbxIKagbCbGhx8U7nUkOS dbwi0BHnD2nqXny3SSFV09GTGgmdMVD+TOlf0bsdIzOvgCr4pOVhxfEj8rPPEI/g2x+v kwI2HScZCIHpq/lYwq04Q5gYGAB1BIpGBjbjW+A1Iegvk0Cw9ZK6bOUhb3pgsZ7Lm2wQ 0U4MNLgSg2cix8mVI6BXwkgtWcy5PpJWbWZ0xQzOXEOORH8dNbFqaEbB3Q4E+BzOoDVJ s+UV4VmbXt7sRCoaXYjMe3flXxFrs45pvIpgmjZnWed5ilwSy/yX+aWctRyExfHd8PDW twUg== X-Gm-Message-State: AOAM532L5Tplx2On56S4YI020WXnngqvKW3QcBPx/5DQctGDuR5F7m7z mDy1hpIJgWb5DWBdPkcjbY/9CbIwsqQ= X-Google-Smtp-Source: ABdhPJxSY74pPU7B3D8hY/M1SCeH0HWvtjWfkbAtwD6kT5NPWxOt3yk7it8vs6KdBSxeSSKkBFECBYQuN8k= X-Received: from seanjc798194.pdx.corp.google.com ([2620:15c:f:10:c0b4:8b8:bb34:6a56]) (user=seanjc job=sendgmr) by 2002:ad4:4692:: with SMTP id bq18mr11010805qvb.0.1617325025366; Thu, 01 Apr 2021 17:57:05 -0700 (PDT) Reply-To: Sean Christopherson Date: Thu, 1 Apr 2021 17:56:50 -0700 In-Reply-To: <20210402005658.3024832-1-seanjc@google.com> Message-Id: <20210402005658.3024832-3-seanjc@google.com> Mime-Version: 1.0 References: <20210402005658.3024832-1-seanjc@google.com> X-Mailer: git-send-email 2.31.0.208.g409f899ff0-goog Subject: [PATCH v2 02/10] KVM: Move x86's MMU notifier memslot walkers to generic code From: Sean Christopherson To: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Paul Mackerras , Paolo Bonzini Cc: James Morse , Julien Thierry , Suzuki K Poulose , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Move the hva->gfn lookup for MMU notifiers into common code. Every arch does a similar lookup, and some arch code is all but identical across multiple architectures. In addition to consolidating code, this will allow introducing optimizations that will benefit all architectures without incurring multiple walks of the memslots, e.g. by taking mmu_lock if and only if a relevant range exists in the memslots. The use of __always_inline to avoid indirect call retpolines, as done by x86, may also benefit other architectures. Consolidating the lookups also fixes a wart in x86, where the legacy MMU and TDP MMU each do their own memslot walks. Lastly, future enhancements to the memslot implementation, e.g. to add an interval tree to track host address, will need to touch far less arch specific code. MIPS, PPC, and arm64 will be converted one at a time in future patches. Signed-off-by: Sean Christopherson --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/mmu/mmu.c | 127 +++++++---------- arch/x86/kvm/mmu/tdp_mmu.c | 241 ++++++++++++-------------------- arch/x86/kvm/mmu/tdp_mmu.h | 14 +- include/linux/kvm_host.h | 14 ++ virt/kvm/kvm_main.c | 169 +++++++++++++++++++++- 6 files changed, 317 insertions(+), 249 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 99778ac51243..a21e3698f4dc 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1718,6 +1718,7 @@ asmlinkage void kvm_spurious_fault(void); _ASM_EXTABLE(666b, 667b) #define KVM_ARCH_WANT_MMU_NOTIFIER +#define KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v); int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index efb41f31e80a..f2046c41eb93 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1298,26 +1298,25 @@ static bool kvm_zap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, return flush; } -static int kvm_unmap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, - struct kvm_memory_slot *slot, gfn_t gfn, int level, - unsigned long data) +static bool kvm_unmap_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, + struct kvm_memory_slot *slot, gfn_t gfn, int level, + pte_t unused) { return kvm_zap_rmapp(kvm, rmap_head, slot); } -static int kvm_set_pte_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, - struct kvm_memory_slot *slot, gfn_t gfn, int level, - unsigned long data) +static bool kvm_set_pte_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, + struct kvm_memory_slot *slot, gfn_t gfn, int level, + pte_t pte) { u64 *sptep; struct rmap_iterator iter; int need_flush = 0; u64 new_spte; - pte_t *ptep = (pte_t *)data; kvm_pfn_t new_pfn; - WARN_ON(pte_huge(*ptep)); - new_pfn = pte_pfn(*ptep); + WARN_ON(pte_huge(pte)); + new_pfn = pte_pfn(pte); restart: for_each_rmap_spte(rmap_head, &iter, sptep) { @@ -1326,7 +1325,7 @@ static int kvm_set_pte_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, need_flush = 1; - if (pte_write(*ptep)) { + if (pte_write(pte)) { pte_list_remove(rmap_head, sptep); goto restart; } else { @@ -1414,86 +1413,52 @@ static void slot_rmap_walk_next(struct slot_rmap_walk_iterator *iterator) slot_rmap_walk_okay(_iter_); \ slot_rmap_walk_next(_iter_)) -typedef int (*rmap_handler_t)(struct kvm *kvm, struct kvm_rmap_head *rmap_head, - struct kvm_memory_slot *slot, gfn_t gfn, - int level, unsigned long data); +typedef bool (*rmap_handler_t)(struct kvm *kvm, struct kvm_rmap_head *rmap_head, + struct kvm_memory_slot *slot, gfn_t gfn, + int level, pte_t pte); -static __always_inline int kvm_handle_hva_range(struct kvm *kvm, - unsigned long start, - unsigned long end, - unsigned long data, - rmap_handler_t handler) +static __always_inline bool kvm_handle_gfn_range(struct kvm *kvm, + struct kvm_gfn_range *range, + rmap_handler_t handler) { - struct kvm_memslots *slots; - struct kvm_memory_slot *memslot; struct slot_rmap_walk_iterator iterator; - int ret = 0; - int i; + bool ret = false; - for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { - slots = __kvm_memslots(kvm, i); - kvm_for_each_memslot(memslot, slots) { - unsigned long hva_start, hva_end; - gfn_t gfn_start, gfn_end; - - hva_start = max(start, memslot->userspace_addr); - hva_end = min(end, memslot->userspace_addr + - (memslot->npages << PAGE_SHIFT)); - if (hva_start >= hva_end) - continue; - /* - * {gfn(page) | page intersects with [hva_start, hva_end)} = - * {gfn_start, gfn_start+1, ..., gfn_end-1}. - */ - gfn_start = hva_to_gfn_memslot(hva_start, memslot); - gfn_end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, memslot); - - for_each_slot_rmap_range(memslot, PG_LEVEL_4K, - KVM_MAX_HUGEPAGE_LEVEL, - gfn_start, gfn_end - 1, - &iterator) - ret |= handler(kvm, iterator.rmap, memslot, - iterator.gfn, iterator.level, data); - } - } + for_each_slot_rmap_range(range->slot, PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL, + range->start, range->end - 1, &iterator) + ret |= handler(kvm, iterator.rmap, range->slot, iterator.gfn, + iterator.level, range->pte); return ret; } -static int kvm_handle_hva(struct kvm *kvm, unsigned long hva, - unsigned long data, rmap_handler_t handler) +bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) { - return kvm_handle_hva_range(kvm, hva, hva + 1, data, handler); -} - -int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end, - unsigned flags) -{ - int r; + bool flush; - r = kvm_handle_hva_range(kvm, start, end, 0, kvm_unmap_rmapp); + flush = kvm_handle_gfn_range(kvm, range, kvm_unmap_rmapp); if (is_tdp_mmu_enabled(kvm)) - r |= kvm_tdp_mmu_zap_hva_range(kvm, start, end); + flush |= kvm_tdp_mmu_unmap_gfn_range(kvm, range, flush); - return r; + return flush; } -int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte) +bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { - int r; + bool flush; - r = kvm_handle_hva(kvm, hva, (unsigned long)&pte, kvm_set_pte_rmapp); + flush = kvm_handle_gfn_range(kvm, range, kvm_set_pte_rmapp); if (is_tdp_mmu_enabled(kvm)) - r |= kvm_tdp_mmu_set_spte_hva(kvm, hva, &pte); + flush |= kvm_tdp_mmu_set_spte_gfn(kvm, range); - return r; + return flush; } -static int kvm_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, - struct kvm_memory_slot *slot, gfn_t gfn, int level, - unsigned long data) +static bool kvm_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, + struct kvm_memory_slot *slot, gfn_t gfn, int level, + pte_t unused) { u64 *sptep; struct rmap_iterator iter; @@ -1505,9 +1470,9 @@ static int kvm_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, return young; } -static int kvm_test_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, - struct kvm_memory_slot *slot, gfn_t gfn, - int level, unsigned long data) +static bool kvm_test_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, + struct kvm_memory_slot *slot, gfn_t gfn, + int level, pte_t unused) { u64 *sptep; struct rmap_iterator iter; @@ -1529,29 +1494,31 @@ static void rmap_recycle(struct kvm_vcpu *vcpu, u64 *spte, gfn_t gfn) rmap_head = gfn_to_rmap(vcpu->kvm, gfn, sp); - kvm_unmap_rmapp(vcpu->kvm, rmap_head, NULL, gfn, sp->role.level, 0); + kvm_unmap_rmapp(vcpu->kvm, rmap_head, NULL, gfn, sp->role.level, __pte(0)); kvm_flush_remote_tlbs_with_address(vcpu->kvm, sp->gfn, KVM_PAGES_PER_HPAGE(sp->role.level)); } -int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end) +bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { - int young = false; + bool young; + + young = kvm_handle_gfn_range(kvm, range, kvm_age_rmapp); - young = kvm_handle_hva_range(kvm, start, end, 0, kvm_age_rmapp); if (is_tdp_mmu_enabled(kvm)) - young |= kvm_tdp_mmu_age_hva_range(kvm, start, end); + young |= kvm_tdp_mmu_age_gfn_range(kvm, range); return young; } -int kvm_test_age_hva(struct kvm *kvm, unsigned long hva) +bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { - int young = false; + bool young; + + young = kvm_handle_gfn_range(kvm, range, kvm_test_age_rmapp); - young = kvm_handle_hva(kvm, hva, 0, kvm_test_age_rmapp); if (is_tdp_mmu_enabled(kvm)) - young |= kvm_tdp_mmu_test_age_hva(kvm, hva); + young |= kvm_tdp_mmu_test_age_gfn(kvm, range); return young; } diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index eeb644d2eb6f..7797d24f0937 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -878,142 +878,118 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, return ret; } -typedef int (*tdp_handler_t)(struct kvm *kvm, struct kvm_memory_slot *slot, - struct kvm_mmu_page *root, gfn_t start, gfn_t end, - unsigned long data); - -static __always_inline int kvm_tdp_mmu_handle_hva_range(struct kvm *kvm, - unsigned long start, - unsigned long end, - unsigned long data, - tdp_handler_t handler) +bool kvm_tdp_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range, + bool flush) { - struct kvm_memslots *slots; - struct kvm_memory_slot *memslot; struct kvm_mmu_page *root; - int ret = 0; - int as_id; - for (as_id = 0; as_id < KVM_ADDRESS_SPACE_NUM; as_id++) { - for_each_tdp_mmu_root_yield_safe(kvm, root, as_id) { - slots = __kvm_memslots(kvm, as_id); - kvm_for_each_memslot(memslot, slots) { - unsigned long hva_start, hva_end; - gfn_t gfn_start, gfn_end; + for_each_tdp_mmu_root(kvm, root, range->slot->as_id) + flush |= zap_gfn_range(kvm, root, range->start, range->end, + false, flush); + + return flush; +} + +typedef bool (*tdp_handler_t)(struct kvm *kvm, struct tdp_iter *iter, + struct kvm_gfn_range *range); + +static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm, + struct kvm_gfn_range *range, + tdp_handler_t handler) +{ + struct kvm_mmu_page *root; + struct tdp_iter iter; + bool ret = false; - hva_start = max(start, memslot->userspace_addr); - hva_end = min(end, memslot->userspace_addr + - (memslot->npages << PAGE_SHIFT)); - if (hva_start >= hva_end) - continue; - /* - * {gfn(page) | page intersects with [hva_start, hva_end)} = - * {gfn_start, gfn_start+1, ..., gfn_end-1}. - */ - gfn_start = hva_to_gfn_memslot(hva_start, memslot); - gfn_end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, memslot); + rcu_read_lock(); - ret |= handler(kvm, memslot, root, gfn_start, - gfn_end, data); - } - } + for_each_tdp_mmu_root(kvm, root, range->slot->as_id) { + tdp_root_for_each_leaf_pte(iter, root, range->start, range->end) + ret |= handler(kvm, &iter, range); } + rcu_read_unlock(); + return ret; } -static __always_inline int kvm_tdp_mmu_handle_hva(struct kvm *kvm, - unsigned long addr, - unsigned long data, - tdp_handler_t handler) -{ - return kvm_tdp_mmu_handle_hva_range(kvm, addr, addr + 1, data, handler); -} - -static int zap_gfn_range_hva_wrapper(struct kvm *kvm, - struct kvm_memory_slot *slot, - struct kvm_mmu_page *root, gfn_t start, - gfn_t end, unsigned long unused) -{ - return zap_gfn_range(kvm, root, start, end, false, false); -} - -int kvm_tdp_mmu_zap_hva_range(struct kvm *kvm, unsigned long start, - unsigned long end) -{ - return kvm_tdp_mmu_handle_hva_range(kvm, start, end, 0, - zap_gfn_range_hva_wrapper); -} - /* * Mark the SPTEs range of GFNs [start, end) unaccessed and return non-zero * if any of the GFNs in the range have been accessed. */ -static int age_gfn_range(struct kvm *kvm, struct kvm_memory_slot *slot, - struct kvm_mmu_page *root, gfn_t start, gfn_t end, - unsigned long unused) +static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter, + struct kvm_gfn_range *range) { - struct tdp_iter iter; - int young = 0; - u64 new_spte; + u64 new_spte = 0; - rcu_read_lock(); + /* If we have a non-accessed entry we don't need to change the pte. */ + if (!is_accessed_spte(iter->old_spte)) + return false; - tdp_root_for_each_leaf_pte(iter, root, start, end) { + new_spte = iter->old_spte; + + if (spte_ad_enabled(new_spte)) { + new_spte &= ~shadow_accessed_mask; + } else { /* - * If we have a non-accessed entry we don't need to change the - * pte. + * Capture the dirty status of the page, so that it doesn't get + * lost when the SPTE is marked for access tracking. */ - if (!is_accessed_spte(iter.old_spte)) - continue; - - new_spte = iter.old_spte; - - if (spte_ad_enabled(new_spte)) { - new_spte &= ~shadow_accessed_mask; - } else { - /* - * Capture the dirty status of the page, so that it doesn't get - * lost when the SPTE is marked for access tracking. - */ - if (is_writable_pte(new_spte)) - kvm_set_pfn_dirty(spte_to_pfn(new_spte)); - - new_spte = mark_spte_for_access_track(new_spte); - } - - tdp_mmu_set_spte_no_acc_track(kvm, &iter, new_spte); - young = 1; + if (is_writable_pte(new_spte)) + kvm_set_pfn_dirty(spte_to_pfn(new_spte)); + + new_spte = mark_spte_for_access_track(new_spte); } - rcu_read_unlock(); + tdp_mmu_set_spte_no_acc_track(kvm, iter, new_spte); - return young; + return true; } -int kvm_tdp_mmu_age_hva_range(struct kvm *kvm, unsigned long start, - unsigned long end) +bool kvm_tdp_mmu_age_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) { - return kvm_tdp_mmu_handle_hva_range(kvm, start, end, 0, - age_gfn_range); + return kvm_tdp_mmu_handle_gfn(kvm, range, age_gfn_range); } -static int test_age_gfn(struct kvm *kvm, struct kvm_memory_slot *slot, - struct kvm_mmu_page *root, gfn_t gfn, gfn_t end, - unsigned long unused) +static bool test_age_gfn(struct kvm *kvm, struct tdp_iter *iter, + struct kvm_gfn_range *range) { - struct tdp_iter iter; - - tdp_root_for_each_leaf_pte(iter, root, gfn, end) - if (is_accessed_spte(iter.old_spte)) - return 1; + return is_accessed_spte(iter->old_spte); +} - return 0; +bool kvm_tdp_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) +{ + return kvm_tdp_mmu_handle_gfn(kvm, range, test_age_gfn); } -int kvm_tdp_mmu_test_age_hva(struct kvm *kvm, unsigned long hva) +static bool set_spte_gfn(struct kvm *kvm, struct tdp_iter *iter, + struct kvm_gfn_range *range) { - return kvm_tdp_mmu_handle_hva(kvm, hva, 0, test_age_gfn); + u64 new_spte; + + /* Huge pages aren't expected to be modified without first being zapped. */ + WARN_ON(pte_huge(range->pte) || range->start + 1 != range->end); + + if (iter->level != PG_LEVEL_4K || + !is_shadow_present_pte(iter->old_spte)) + return false; + + /* + * Note, when changing a read-only SPTE, it's not strictly necessary to + * zero the SPTE before setting the new PFN, but doing so preserves the + * invariant that the PFN of a present * leaf SPTE can never change. + * See __handle_changed_spte(). + */ + tdp_mmu_set_spte(kvm, iter, 0); + + if (!pte_write(range->pte)) { + new_spte = kvm_mmu_changed_pte_notifier_make_spte(iter->old_spte, + pte_pfn(range->pte)); + + tdp_mmu_set_spte(kvm, iter, new_spte); + } + + return true; } /* @@ -1022,60 +998,15 @@ int kvm_tdp_mmu_test_age_hva(struct kvm *kvm, unsigned long hva) * notifier. * Returns non-zero if a flush is needed before releasing the MMU lock. */ -static int set_tdp_spte(struct kvm *kvm, struct kvm_memory_slot *slot, - struct kvm_mmu_page *root, gfn_t gfn, gfn_t end, - unsigned long data) +bool kvm_tdp_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { - struct tdp_iter iter; - pte_t *ptep = (pte_t *)data; - kvm_pfn_t new_pfn; - u64 new_spte; - int need_flush = 0; + bool flush = kvm_tdp_mmu_handle_gfn(kvm, range, set_spte_gfn); - rcu_read_lock(); + /* FIXME: return 'flush' instead of flushing here. */ + if (flush) + kvm_flush_remote_tlbs_with_address(kvm, range->start, 1); - WARN_ON(pte_huge(*ptep) || (gfn + 1) != end); - - new_pfn = pte_pfn(*ptep); - - tdp_root_for_each_leaf_pte(iter, root, gfn, gfn + 1) { - if (iter.level != PG_LEVEL_4K) - continue; - - if (!is_shadow_present_pte(iter.old_spte)) - break; - - /* - * Note, when changing a read-only SPTE, it's not strictly - * necessary to zero the SPTE before setting the new PFN, but - * doing so preserves the invariant that the PFN of a present - * leaf SPTE can never change. See __handle_changed_spte(). - */ - tdp_mmu_set_spte(kvm, &iter, 0); - - if (!pte_write(*ptep)) { - new_spte = kvm_mmu_changed_pte_notifier_make_spte( - iter.old_spte, new_pfn); - - tdp_mmu_set_spte(kvm, &iter, new_spte); - } - - need_flush = 1; - } - - if (need_flush) - kvm_flush_remote_tlbs_with_address(kvm, gfn, 1); - - rcu_read_unlock(); - - return 0; -} - -int kvm_tdp_mmu_set_spte_hva(struct kvm *kvm, unsigned long address, - pte_t *host_ptep) -{ - return kvm_tdp_mmu_handle_hva(kvm, address, (unsigned long)host_ptep, - set_tdp_spte); + return false; } /* diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h index bf3ce169122e..ee8efa58902f 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.h +++ b/arch/x86/kvm/mmu/tdp_mmu.h @@ -38,15 +38,11 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, int map_writable, int max_level, kvm_pfn_t pfn, bool prefault); -int kvm_tdp_mmu_zap_hva_range(struct kvm *kvm, unsigned long start, - unsigned long end); - -int kvm_tdp_mmu_age_hva_range(struct kvm *kvm, unsigned long start, - unsigned long end); -int kvm_tdp_mmu_test_age_hva(struct kvm *kvm, unsigned long hva); - -int kvm_tdp_mmu_set_spte_hva(struct kvm *kvm, unsigned long address, - pte_t *host_ptep); +bool kvm_tdp_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range, + bool flush); +bool kvm_tdp_mmu_age_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range); +bool kvm_tdp_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range); +bool kvm_tdp_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range); bool kvm_tdp_mmu_wrprot_slot(struct kvm *kvm, struct kvm_memory_slot *slot, int min_level); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index e6d77353025c..e6bb401dd856 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -219,11 +219,25 @@ int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu); #endif #ifdef KVM_ARCH_WANT_MMU_NOTIFIER +#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS +struct kvm_gfn_range { + struct kvm_memory_slot *slot; + gfn_t start; + gfn_t end; + pte_t pte; + bool may_block; +}; +bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range); +bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range); +bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range); +bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range); +#else int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end, unsigned flags); int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte); int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end); int kvm_test_age_hva(struct kvm *kvm, unsigned long hva); +#endif /* KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS */ #endif enum { diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 8df091950161..7a7e62ae5eb4 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -451,14 +451,131 @@ static void kvm_mmu_notifier_invalidate_range(struct mmu_notifier *mn, srcu_read_unlock(&kvm->srcu, idx); } +#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS + +typedef bool (*hva_handler_t)(struct kvm *kvm, struct kvm_gfn_range *range); + +struct kvm_hva_range { + unsigned long start; + unsigned long end; + pte_t pte; + hva_handler_t handler; + bool flush_on_ret; + bool may_block; +}; + +static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, + const struct kvm_hva_range *range) +{ + struct kvm_memory_slot *slot; + struct kvm_memslots *slots; + struct kvm_gfn_range gfn_range; + bool ret = false; + int i, idx; + + lockdep_assert_held_write(&kvm->mmu_lock); + + idx = srcu_read_lock(&kvm->srcu); + + for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { + slots = __kvm_memslots(kvm, i); + kvm_for_each_memslot(slot, slots) { + unsigned long hva_start, hva_end; + + hva_start = max(range->start, slot->userspace_addr); + hva_end = min(range->end, slot->userspace_addr + + (slot->npages << PAGE_SHIFT)); + if (hva_start >= hva_end) + continue; + + /* + * To optimize for the likely case where the address + * range is covered by zero or one memslots, don't + * bother making these conditional (to avoid writes on + * the second or later invocation of the handler). + */ + gfn_range.pte = range->pte; + gfn_range.may_block = range->may_block; + + /* + * {gfn(page) | page intersects with [hva_start, hva_end)} = + * {gfn_start, gfn_start+1, ..., gfn_end-1}. + */ + gfn_range.start = hva_to_gfn_memslot(hva_start, slot); + gfn_range.end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, slot); + gfn_range.slot = slot; + + ret |= range->handler(kvm, &gfn_range); + } + } + + if (range->flush_on_ret && (ret || kvm->tlbs_dirty)) + kvm_flush_remote_tlbs(kvm); + + srcu_read_unlock(&kvm->srcu, idx); + + /* The notifiers are averse to booleans. :-( */ + return (int)ret; +} + +static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn, + unsigned long start, + unsigned long end, + pte_t pte, + hva_handler_t handler) +{ + struct kvm *kvm = mmu_notifier_to_kvm(mn); + const struct kvm_hva_range range = { + .start = start, + .end = end, + .pte = pte, + .handler = handler, + .flush_on_ret = true, + .may_block = false, + }; + int ret; + + KVM_MMU_LOCK(kvm); + ret = __kvm_handle_hva_range(kvm, &range); + KVM_MMU_UNLOCK(kvm); + + return ret; +} + +static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn, + unsigned long start, + unsigned long end, + hva_handler_t handler) +{ + struct kvm *kvm = mmu_notifier_to_kvm(mn); + const struct kvm_hva_range range = { + .start = start, + .end = end, + .pte = __pte(0), + .handler = handler, + .flush_on_ret = false, + .may_block = false, + }; + int ret; + + KVM_MMU_LOCK(kvm); + ret = __kvm_handle_hva_range(kvm, &range); + KVM_MMU_UNLOCK(kvm); + + return ret; +} +#endif /* KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS */ + static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, struct mm_struct *mm, unsigned long address, pte_t pte) { struct kvm *kvm = mmu_notifier_to_kvm(mn); - int idx; +#ifndef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS + int idx; +#endif trace_kvm_set_spte_hva(address); /* @@ -468,6 +585,9 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, */ WARN_ON_ONCE(!kvm->mmu_notifier_count); +#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS + kvm_handle_hva_range(mn, address, address + 1, pte, kvm_set_spte_gfn); +#else idx = srcu_read_lock(&kvm->srcu); KVM_MMU_LOCK(kvm); @@ -477,17 +597,32 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, KVM_MMU_UNLOCK(kvm); srcu_read_unlock(&kvm->srcu, idx); +#endif } static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, const struct mmu_notifier_range *range) { struct kvm *kvm = mmu_notifier_to_kvm(mn); +#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS + const struct kvm_hva_range hva_range = { + .start = range->start, + .end = range->end, + .pte = __pte(0), + .handler = kvm_unmap_gfn_range, + .flush_on_ret = true, + .may_block = mmu_notifier_range_blockable(range), + }; +#else int need_tlb_flush = 0, idx; +#endif trace_kvm_unmap_hva_range(range->start, range->end); +#ifndef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS idx = srcu_read_lock(&kvm->srcu); +#endif + KVM_MMU_LOCK(kvm); /* * The count increase must become visible at unlock time as no @@ -513,14 +648,21 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, kvm->mmu_notifier_range_end = max(kvm->mmu_notifier_range_end, range->end); } + +#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS + __kvm_handle_hva_range(kvm, &hva_range); +#else need_tlb_flush = kvm_unmap_hva_range(kvm, range->start, range->end, range->flags); /* we've to flush the tlb before the pages can be freed */ if (need_tlb_flush || kvm->tlbs_dirty) kvm_flush_remote_tlbs(kvm); +#endif KVM_MMU_UNLOCK(kvm); +#ifndef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS srcu_read_unlock(&kvm->srcu, idx); +#endif return 0; } @@ -554,11 +696,15 @@ static int kvm_mmu_notifier_clear_flush_young(struct mmu_notifier *mn, unsigned long start, unsigned long end) { +#ifndef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS struct kvm *kvm = mmu_notifier_to_kvm(mn); int young, idx; - +#endif trace_kvm_age_hva(start, end); +#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS + return kvm_handle_hva_range(mn, start, end, __pte(0), kvm_age_gfn); +#else idx = srcu_read_lock(&kvm->srcu); KVM_MMU_LOCK(kvm); @@ -570,6 +716,7 @@ static int kvm_mmu_notifier_clear_flush_young(struct mmu_notifier *mn, srcu_read_unlock(&kvm->srcu, idx); return young; +#endif } static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn, @@ -577,13 +724,13 @@ static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn, unsigned long start, unsigned long end) { +#ifndef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS struct kvm *kvm = mmu_notifier_to_kvm(mn); int young, idx; +#endif trace_kvm_age_hva(start, end); - idx = srcu_read_lock(&kvm->srcu); - KVM_MMU_LOCK(kvm); /* * Even though we do not flush TLB, this will still adversely * affect performance on pre-Haswell Intel EPT, where there is @@ -597,22 +744,33 @@ static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn, * cadence. If we find this inaccurate, we might come up with a * more sophisticated heuristic later. */ +#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS + return kvm_handle_hva_range_no_flush(mn, start, end, kvm_age_gfn); +#else + idx = srcu_read_lock(&kvm->srcu); + KVM_MMU_LOCK(kvm); young = kvm_age_hva(kvm, start, end); KVM_MMU_UNLOCK(kvm); srcu_read_unlock(&kvm->srcu, idx); return young; +#endif } static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn, struct mm_struct *mm, unsigned long address) { +#ifndef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS struct kvm *kvm = mmu_notifier_to_kvm(mn); int young, idx; - +#endif trace_kvm_test_age_hva(address); +#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS + return kvm_handle_hva_range_no_flush(mn, address, address + 1, + kvm_test_age_gfn); +#else idx = srcu_read_lock(&kvm->srcu); KVM_MMU_LOCK(kvm); young = kvm_test_age_hva(kvm, address); @@ -620,6 +778,7 @@ static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn, srcu_read_unlock(&kvm->srcu, idx); return young; +#endif } static void kvm_mmu_notifier_release(struct mmu_notifier *mn, From patchwork Fri Apr 2 00:56:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 1461532 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=lb+qs0QK; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4FBM8b67cVz9sjB for ; Fri, 2 Apr 2021 11:57:19 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234354AbhDBA5S (ORCPT ); Thu, 1 Apr 2021 20:57:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40356 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233901AbhDBA5I (ORCPT ); Thu, 1 Apr 2021 20:57:08 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3E5A9C061788 for ; Thu, 1 Apr 2021 17:57:08 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id 10so7678168ybx.11 for ; Thu, 01 Apr 2021 17:57:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=tDfPmXbzjuXYEj7QinrksosPu+0bJFiZwViEuyBfiPM=; b=lb+qs0QKm/vKplDLvVTszodx9+lPNY1WATK8sX84CSZq8tI076ayThzvRmFWIC9ppR d8t9FSZnygTkkAg2cJC4qgr46yFW8WrEbI1c93uYTd4Tq+C40f7GmYGNn76no2eydFJm ELg1MV/l2E2VRXKBw+OxLEfDEUoMG2hdu1QZaxqsLn0ey5+2k2aTKxqSaN8KYwbGGpJZ LU2s/Dk7xnQsZ63ZuMGE1MCLqdVTzwBfjQTtDty7A1SwD1F+T3rYvXnjE6NGc0WaI/Wh rjmoa/1Y5xAuK8JFh3YXhP45SEhEa+DdFoZy2vyAMj8oY/jTXPmptXAWwzLudr/mz6DL n6Qg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=tDfPmXbzjuXYEj7QinrksosPu+0bJFiZwViEuyBfiPM=; b=YrJcKOuFwn5aTvmqXZHD/WQaYDX1DjSmDBJFQ9IbGY2sQ2rOEtxXj1O9cFwhzfae0A DVSvaBuo8Ewi6IYLgBj6uFPXWMVbI5+Tlsnnt0hLKcXS8FWzoO3xN0VvDZGBqp39yp2w UMpIlXP0kzBDmhnsstjk0lCQIWDWz26mkx9KOHJP4QkmX25WgWWVBG0eBw0iTS/Hy/Zf 0lGgcb2VC4n4C6Kt9fBFC6DjAovuFK8fgfsgV54RuxMF9Ntoh6KmsCfGBi2ckFyGBAZg z3BYCHgwYC0fQBtr7o5uJFS1TlqzTFFvNyDVNTLM5j6BWsD5AQl+70PtUOi1DvE6WqI3 8XNg== X-Gm-Message-State: AOAM5335JkzVQDnjkT/GfZ72rAXga+RjHR8tCBnjQ2dAxM33cTt8ZG5E +agSCxZ+WKGdy35gDj7JjEgiN64JtO0= X-Google-Smtp-Source: ABdhPJxbKr645X0CcfRDIWTDLNH8a38GjXA6vUJ3jECMSWadG+jHX2J4YQiJuT8sNhsO5srQAIOw4KIsjrs= X-Received: from seanjc798194.pdx.corp.google.com ([2620:15c:f:10:c0b4:8b8:bb34:6a56]) (user=seanjc job=sendgmr) by 2002:a25:f608:: with SMTP id t8mr16151942ybd.496.1617325027509; Thu, 01 Apr 2021 17:57:07 -0700 (PDT) Reply-To: Sean Christopherson Date: Thu, 1 Apr 2021 17:56:51 -0700 In-Reply-To: <20210402005658.3024832-1-seanjc@google.com> Message-Id: <20210402005658.3024832-4-seanjc@google.com> Mime-Version: 1.0 References: <20210402005658.3024832-1-seanjc@google.com> X-Mailer: git-send-email 2.31.0.208.g409f899ff0-goog Subject: [PATCH v2 03/10] KVM: arm64: Convert to the gfn-based MMU notifier callbacks From: Sean Christopherson To: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Paul Mackerras , Paolo Bonzini Cc: James Morse , Julien Thierry , Suzuki K Poulose , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Move arm64 to the gfn-base MMU notifier APIs, which do the hva->gfn lookup in common code. No meaningful functional change intended, though the exact order of operations is slightly different since the memslot lookups occur before calling into arch code. Signed-off-by: Sean Christopherson Reviewed-by: Marc Zyngier --- arch/arm64/include/asm/kvm_host.h | 1 + arch/arm64/kvm/mmu.c | 117 ++++++++---------------------- 2 files changed, 33 insertions(+), 85 deletions(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 72e6b4600264..1ad729cf7b0d 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -582,6 +582,7 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu, struct kvm_vcpu_events *events); #define KVM_ARCH_WANT_MMU_NOTIFIER +#define KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS void kvm_arm_halt_guest(struct kvm *kvm); void kvm_arm_resume_guest(struct kvm *kvm); diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 4b7e1e327337..35728231e9a0 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -839,7 +839,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, * gfn_to_pfn_prot (which calls get_user_pages), so that we don't risk * the page we just got a reference to gets unmapped before we have a * chance to grab the mmu_lock, which ensure that if the page gets - * unmapped afterwards, the call to kvm_unmap_hva will take it away + * unmapped afterwards, the call to kvm_unmap_gfn will take it away * from us again properly. This smp_rmb() interacts with the smp_wmb() * in kvm_mmu_notifier_invalidate_. */ @@ -1064,123 +1064,70 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) return ret; } -static int handle_hva_to_gpa(struct kvm *kvm, - unsigned long start, - unsigned long end, - int (*handler)(struct kvm *kvm, - gpa_t gpa, u64 size, - void *data), - void *data) -{ - struct kvm_memslots *slots; - struct kvm_memory_slot *memslot; - int ret = 0; - - slots = kvm_memslots(kvm); - - /* we only care about the pages that the guest sees */ - kvm_for_each_memslot(memslot, slots) { - unsigned long hva_start, hva_end; - gfn_t gpa; - - hva_start = max(start, memslot->userspace_addr); - hva_end = min(end, memslot->userspace_addr + - (memslot->npages << PAGE_SHIFT)); - if (hva_start >= hva_end) - continue; - - gpa = hva_to_gfn_memslot(hva_start, memslot) << PAGE_SHIFT; - ret |= handler(kvm, gpa, (u64)(hva_end - hva_start), data); - } - - return ret; -} - -static int kvm_unmap_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void *data) -{ - unsigned flags = *(unsigned *)data; - bool may_block = flags & MMU_NOTIFIER_RANGE_BLOCKABLE; - - __unmap_stage2_range(&kvm->arch.mmu, gpa, size, may_block); - return 0; -} - -int kvm_unmap_hva_range(struct kvm *kvm, - unsigned long start, unsigned long end, unsigned flags) +bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) { if (!kvm->arch.mmu.pgt) return 0; - handle_hva_to_gpa(kvm, start, end, &kvm_unmap_hva_handler, &flags); - return 0; -} + __unmap_stage2_range(&kvm->arch.mmu, range->start << PAGE_SHIFT, + (range->end - range->start) << PAGE_SHIFT, + range->may_block); -static int kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, u64 size, void *data) -{ - kvm_pfn_t *pfn = (kvm_pfn_t *)data; - - WARN_ON(size != PAGE_SIZE); - - /* - * The MMU notifiers will have unmapped a huge PMD before calling - * ->change_pte() (which in turn calls kvm_set_spte_hva()) and - * therefore we never need to clear out a huge PMD through this - * calling path and a memcache is not required. - */ - kvm_pgtable_stage2_map(kvm->arch.mmu.pgt, gpa, PAGE_SIZE, - __pfn_to_phys(*pfn), KVM_PGTABLE_PROT_R, NULL); return 0; } -int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte) +bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { - unsigned long end = hva + PAGE_SIZE; - kvm_pfn_t pfn = pte_pfn(pte); + kvm_pfn_t pfn = pte_pfn(range->pte); if (!kvm->arch.mmu.pgt) return 0; + WARN_ON(range->end - range->start != 1); + /* * We've moved a page around, probably through CoW, so let's treat it * just like a translation fault and clean the cache to the PoC. */ clean_dcache_guest_page(pfn, PAGE_SIZE); - handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &pfn); + + /* + * The MMU notifiers will have unmapped a huge PMD before calling + * ->change_pte() (which in turn calls kvm_set_spte_gfn()) and + * therefore we never need to clear out a huge PMD through this + * calling path and a memcache is not required. + */ + kvm_pgtable_stage2_map(kvm->arch.mmu.pgt, range->start << PAGE_SHIFT, + PAGE_SIZE, __pfn_to_phys(pfn), + KVM_PGTABLE_PROT_R, NULL); + return 0; } -static int kvm_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void *data) +bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { - pte_t pte; + u64 size = (range->end - range->start) << PAGE_SHIFT; kvm_pte_t kpte; + pte_t pte; + + if (!kvm->arch.mmu.pgt) + return 0; WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE); - kpte = kvm_pgtable_stage2_mkold(kvm->arch.mmu.pgt, gpa); + + kpte = kvm_pgtable_stage2_mkold(kvm->arch.mmu.pgt, + range->start << PAGE_SHIFT); pte = __pte(kpte); return pte_valid(pte) && pte_young(pte); } -static int kvm_test_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void *data) -{ - WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE); - return kvm_pgtable_stage2_is_young(kvm->arch.mmu.pgt, gpa); -} - -int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end) -{ - if (!kvm->arch.mmu.pgt) - return 0; - - return handle_hva_to_gpa(kvm, start, end, kvm_age_hva_handler, NULL); -} - -int kvm_test_age_hva(struct kvm *kvm, unsigned long hva) +bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { if (!kvm->arch.mmu.pgt) return 0; - return handle_hva_to_gpa(kvm, hva, hva + PAGE_SIZE, - kvm_test_age_hva_handler, NULL); + return kvm_pgtable_stage2_is_young(kvm->arch.mmu.pgt, + range->start << PAGE_SHIFT); } phys_addr_t kvm_mmu_get_httbr(void) From patchwork Fri Apr 2 00:56:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 1461533 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=ouD3RUjo; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4FBM8d1LNyz9t0Y for ; Fri, 2 Apr 2021 11:57:21 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234400AbhDBA5U (ORCPT ); Thu, 1 Apr 2021 20:57:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40386 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234238AbhDBA5K (ORCPT ); Thu, 1 Apr 2021 20:57:10 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 720CCC0617A9 for ; Thu, 1 Apr 2021 17:57:10 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id 10so7678246ybx.11 for ; Thu, 01 Apr 2021 17:57:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=noWYqCUZelFCMUSh7ZeHZ6bmzNaY4TJIP5y7J4PJixc=; b=ouD3RUjogj41I7jReVrpeXQU9Bb0xL3P8VWkyehuZzBOD88/Sc2uhZmTxennNsvLyT OqfLWztGLspvOdSW9kUrSN0V3oNghubKE1biSAWV6nMl4oHzXS716JlpFhiROhcyTwUH NbcktbtPM4RWwmh6SH9O64kxWxqQkxV/u5DbK70BqtuIOFBh4yhMHK2ChGOdkkS6Gv+I XgV7IbTc96h8bZePPBN13sw/RuQbx0zYdA+eYXja8N9DsEKEjoB1bbXu4L2tpOK43+zU jeK6OjT+/p9isY2SbCwiLQt0ceFPR6OFbFpaeX02nzYeRp1bvono9LbgC7yf01qw7JRI 8G3A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=noWYqCUZelFCMUSh7ZeHZ6bmzNaY4TJIP5y7J4PJixc=; b=Qgqgx1yz/soySkJCrOniYtx57C/GMv0cffKD+DMgx5UbNfzCVJfNW6n2R0+W59qeGo FK7OV1rYLSurkKaiqOHvqk6UJkI+n85Lga+8hXuGzJjl3PHwolYAVim2r49doDgxsG83 fDJqLgtMk6+wXDDvjtil6oMrxfJBzIiGOORWmVfY2JI9AYxsjMQJ2Onzb8PFqgwk4DhX oEKTi+Gp76zn1mstbcCUxrouvaYErzsTMKxg4YrsbeB9YEVkfbA3WCf5a2kp64tVlEEr TpieRexyV+EzACjA9uqTQthWGcfm+Za2O2zRiioU8d+khr8jMh7KDUvuANaxm0ghBgEn Jt9w== X-Gm-Message-State: AOAM53079bkoC/TYVga/p+xSMvgrZshS0f0YIjVU9w0hB6mfjYlFYRWx qY2WZWBjWetDzyAkMbpvi0xczqDoBIw= X-Google-Smtp-Source: ABdhPJydkmZrVRRl/TCK20DP5D2eCr2nOcpte6dzBOlW4oeOa5/178wYriGJI5XuziulLinORup7GUWeoVI= X-Received: from seanjc798194.pdx.corp.google.com ([2620:15c:f:10:c0b4:8b8:bb34:6a56]) (user=seanjc job=sendgmr) by 2002:a25:5146:: with SMTP id f67mr15369859ybb.332.1617325029648; Thu, 01 Apr 2021 17:57:09 -0700 (PDT) Reply-To: Sean Christopherson Date: Thu, 1 Apr 2021 17:56:52 -0700 In-Reply-To: <20210402005658.3024832-1-seanjc@google.com> Message-Id: <20210402005658.3024832-5-seanjc@google.com> Mime-Version: 1.0 References: <20210402005658.3024832-1-seanjc@google.com> X-Mailer: git-send-email 2.31.0.208.g409f899ff0-goog Subject: [PATCH v2 04/10] KVM: MIPS/MMU: Convert to the gfn-based MMU notifier callbacks From: Sean Christopherson To: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Paul Mackerras , Paolo Bonzini Cc: James Morse , Julien Thierry , Suzuki K Poulose , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Move MIPS to the gfn-based MMU notifier APIs, which do the hva->gfn lookup in common code, and whose code is nearly identical to MIPS' lookup. No meaningful functional change intended, though the exact order of operations is slightly different since the memslot lookups occur before calling into arch code. Signed-off-by: Sean Christopherson --- arch/mips/include/asm/kvm_host.h | 1 + arch/mips/kvm/mmu.c | 97 ++++++-------------------------- 2 files changed, 17 insertions(+), 81 deletions(-) diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h index feaa77036b67..374a3c8806e8 100644 --- a/arch/mips/include/asm/kvm_host.h +++ b/arch/mips/include/asm/kvm_host.h @@ -967,6 +967,7 @@ enum kvm_mips_fault_result kvm_trap_emul_gva_fault(struct kvm_vcpu *vcpu, bool write); #define KVM_ARCH_WANT_MMU_NOTIFIER +#define KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS /* Emulation */ int kvm_get_inst(u32 *opc, struct kvm_vcpu *vcpu, u32 *out); diff --git a/arch/mips/kvm/mmu.c b/arch/mips/kvm/mmu.c index 3dabeda82458..3dc885df2e32 100644 --- a/arch/mips/kvm/mmu.c +++ b/arch/mips/kvm/mmu.c @@ -439,85 +439,36 @@ static int kvm_mips_mkold_gpa_pt(struct kvm *kvm, gfn_t start_gfn, end_gfn << PAGE_SHIFT); } -static int handle_hva_to_gpa(struct kvm *kvm, - unsigned long start, - unsigned long end, - int (*handler)(struct kvm *kvm, gfn_t gfn, - gpa_t gfn_end, - struct kvm_memory_slot *memslot, - void *data), - void *data) +bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) { - struct kvm_memslots *slots; - struct kvm_memory_slot *memslot; - int ret = 0; - - slots = kvm_memslots(kvm); - - /* we only care about the pages that the guest sees */ - kvm_for_each_memslot(memslot, slots) { - unsigned long hva_start, hva_end; - gfn_t gfn, gfn_end; - - hva_start = max(start, memslot->userspace_addr); - hva_end = min(end, memslot->userspace_addr + - (memslot->npages << PAGE_SHIFT)); - if (hva_start >= hva_end) - continue; - - /* - * {gfn(page) | page intersects with [hva_start, hva_end)} = - * {gfn_start, gfn_start+1, ..., gfn_end-1}. - */ - gfn = hva_to_gfn_memslot(hva_start, memslot); - gfn_end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, memslot); - - ret |= handler(kvm, gfn, gfn_end, memslot, data); - } - - return ret; -} - - -static int kvm_unmap_hva_handler(struct kvm *kvm, gfn_t gfn, gfn_t gfn_end, - struct kvm_memory_slot *memslot, void *data) -{ - kvm_mips_flush_gpa_pt(kvm, gfn, gfn_end); - return 1; -} - -int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end, - unsigned flags) -{ - handle_hva_to_gpa(kvm, start, end, &kvm_unmap_hva_handler, NULL); + kvm_mips_flush_gpa_pt(kvm, range->start, range->end); kvm_mips_callbacks->flush_shadow_all(kvm); return 0; } -static int kvm_set_spte_handler(struct kvm *kvm, gfn_t gfn, gfn_t gfn_end, - struct kvm_memory_slot *memslot, void *data) +static bool __kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { - gpa_t gpa = gfn << PAGE_SHIFT; - pte_t hva_pte = *(pte_t *)data; + gpa_t gpa = range->start << PAGE_SHIFT; + pte_t hva_pte = range->pte; pte_t *gpa_pte = kvm_mips_pte_for_gpa(kvm, NULL, gpa); pte_t old_pte; if (!gpa_pte) - return 0; + return false; /* Mapping may need adjusting depending on memslot flags */ old_pte = *gpa_pte; - if (memslot->flags & KVM_MEM_LOG_DIRTY_PAGES && !pte_dirty(old_pte)) + if (range->slot->flags & KVM_MEM_LOG_DIRTY_PAGES && !pte_dirty(old_pte)) hva_pte = pte_mkclean(hva_pte); - else if (memslot->flags & KVM_MEM_READONLY) + else if (range->slot->flags & KVM_MEM_READONLY) hva_pte = pte_wrprotect(hva_pte); set_pte(gpa_pte, hva_pte); /* Replacing an absent or old page doesn't need flushes */ if (!pte_present(old_pte) || !pte_young(old_pte)) - return 0; + return false; /* Pages swapped, aged, moved, or cleaned require flushes */ return !pte_present(hva_pte) || @@ -526,27 +477,21 @@ static int kvm_set_spte_handler(struct kvm *kvm, gfn_t gfn, gfn_t gfn_end, (pte_dirty(old_pte) && !pte_dirty(hva_pte)); } -int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte) +bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { - unsigned long end = hva + PAGE_SIZE; - int ret; - - ret = handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &pte); - if (ret) + if (__kvm_set_spte_gfn(kvm, range)) kvm_mips_callbacks->flush_shadow_all(kvm); - return 0; + return false; } -static int kvm_age_hva_handler(struct kvm *kvm, gfn_t gfn, gfn_t gfn_end, - struct kvm_memory_slot *memslot, void *data) +bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { - return kvm_mips_mkold_gpa_pt(kvm, gfn, gfn_end); + return kvm_mips_mkold_gpa_pt(kvm, range->start, range->end); } -static int kvm_test_age_hva_handler(struct kvm *kvm, gfn_t gfn, gfn_t gfn_end, - struct kvm_memory_slot *memslot, void *data) +bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { - gpa_t gpa = gfn << PAGE_SHIFT; + gpa_t gpa = range->start << PAGE_SHIFT; pte_t *gpa_pte = kvm_mips_pte_for_gpa(kvm, NULL, gpa); if (!gpa_pte) @@ -554,16 +499,6 @@ static int kvm_test_age_hva_handler(struct kvm *kvm, gfn_t gfn, gfn_t gfn_end, return pte_young(*gpa_pte); } -int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end) -{ - return handle_hva_to_gpa(kvm, start, end, kvm_age_hva_handler, NULL); -} - -int kvm_test_age_hva(struct kvm *kvm, unsigned long hva) -{ - return handle_hva_to_gpa(kvm, hva, hva, kvm_test_age_hva_handler, NULL); -} - /** * _kvm_mips_map_page_fast() - Fast path GPA fault handler. * @vcpu: VCPU pointer. From patchwork Fri Apr 2 00:56:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 1461534 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=ht9+RccV; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4FBM8g2kywz9sX2 for ; Fri, 2 Apr 2021 11:57:23 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234314AbhDBA5W (ORCPT ); Thu, 1 Apr 2021 20:57:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40356 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234327AbhDBA5S (ORCPT ); Thu, 1 Apr 2021 20:57:18 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CD76DC0613A7 for ; Thu, 1 Apr 2021 17:57:12 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id v186so7617409ybe.5 for ; Thu, 01 Apr 2021 17:57:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=ExD4nHZ1gTZXgXom70NVZ5EPMql7Mn6cErqWsxaWOTo=; b=ht9+RccVS+Ni3FcE61F43bvFBtQU7Ogik1TA0EsPIihbnyEbZQ6vIvVR9CjZbb8QYI 5pnJtwuRcPZAz9xwSCho3xQRiqhGJCLHaVbgk1dsBACMCRDRG/qmqCnX0JLbtqViFAY+ MlK5X+97T5xWJfU+d0+DU4R0NaMhEBYtL0j1wVxb/US6JTdQmwL4I+i+Py/q/pMLfaBS +8/upG7zTAEE7HPHewkU/T3KOEB6EH1wXjpR0Q4a4L87SADYQ2Zhdxgxqnx/3jxRSSHM CZ2LYnjoes4JdTqEQhXGWevCGJxhxS+Ng32KhDcnknCOOzQl4b8tgV8CgMUCke5wwVJT eO+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=ExD4nHZ1gTZXgXom70NVZ5EPMql7Mn6cErqWsxaWOTo=; b=exPJ/3+FAou+GwVSNE/oAVYTZG+osd6L0yERz2mPPWebi3izL89bzeIGgOKK829ccg ADNIQwuc3lcTmi0axj/iMLFODjA2NJ78vvqyxHoxjWGmqNd+h2CRhdLhzRtW4jw5tCJK DoI8/e9sxC5zVivR7aEZERqkemK6VmHfFCyXUHQ5PqBViBG0NumSjWrVeC9sZUQITDau uq+3doarQOEZzkT6kFaBBhswtZAWUcg+kqU5L12U/hJeTAvS2K0+hUmeJkhX5CmQA3GO /2YUQuYBMdgKktlFBn42ti25LH+faBa9gWWswoSslZspsmdS6naB38sV2dcozq7ov2gq eY+A== X-Gm-Message-State: AOAM531mPXR79U2rc1RZh06N0aJfWP3dMxnFZmTykE9ccFCUBGgxJVJD iLxZCNJJ5liMjPs8MgRAoipUdA5jWuA= X-Google-Smtp-Source: ABdhPJwujQubJKZWaKvgWX3J/FOe+Sn16JN8JCoz25oiu8WQatrSvXcyB7qicuxobbf9I3kRkYuHcy4bekQ= X-Received: from seanjc798194.pdx.corp.google.com ([2620:15c:f:10:c0b4:8b8:bb34:6a56]) (user=seanjc job=sendgmr) by 2002:a25:d4cc:: with SMTP id m195mr16322358ybf.270.1617325032046; Thu, 01 Apr 2021 17:57:12 -0700 (PDT) Reply-To: Sean Christopherson Date: Thu, 1 Apr 2021 17:56:53 -0700 In-Reply-To: <20210402005658.3024832-1-seanjc@google.com> Message-Id: <20210402005658.3024832-6-seanjc@google.com> Mime-Version: 1.0 References: <20210402005658.3024832-1-seanjc@google.com> X-Mailer: git-send-email 2.31.0.208.g409f899ff0-goog Subject: [PATCH v2 05/10] KVM: PPC: Convert to the gfn-based MMU notifier callbacks From: Sean Christopherson To: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Paul Mackerras , Paolo Bonzini Cc: James Morse , Julien Thierry , Suzuki K Poulose , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Move PPC to the gfn-base MMU notifier APIs, and update all 15 bajillion PPC-internal hooks to work with gfns instead of hvas. No meaningful functional change intended, though the exact order of operations is slightly different since the memslot lookups occur before calling into arch code. Signed-off-by: Sean Christopherson --- arch/powerpc/include/asm/kvm_book3s.h | 12 ++-- arch/powerpc/include/asm/kvm_host.h | 1 + arch/powerpc/include/asm/kvm_ppc.h | 9 ++- arch/powerpc/kvm/book3s.c | 18 +++-- arch/powerpc/kvm/book3s.h | 10 ++- arch/powerpc/kvm/book3s_64_mmu_hv.c | 98 +++++++------------------- arch/powerpc/kvm/book3s_64_mmu_radix.c | 25 +++---- arch/powerpc/kvm/book3s_hv.c | 12 ++-- arch/powerpc/kvm/book3s_pr.c | 56 +++++---------- arch/powerpc/kvm/e500_mmu_host.c | 27 +++---- 10 files changed, 95 insertions(+), 173 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index 2f5f919f6cd3..2d03f2930767 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -210,12 +210,12 @@ extern void kvmppc_free_pgtable_radix(struct kvm *kvm, pgd_t *pgd, unsigned int lpid); extern int kvmppc_radix_init(void); extern void kvmppc_radix_exit(void); -extern int kvm_unmap_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, - unsigned long gfn); -extern int kvm_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, - unsigned long gfn); -extern int kvm_test_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, - unsigned long gfn); +extern bool kvm_unmap_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, + unsigned long gfn); +extern bool kvm_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, + unsigned long gfn); +extern bool kvm_test_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, + unsigned long gfn); extern long kvmppc_hv_get_dirty_log_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, unsigned long *map); extern void kvmppc_radix_flush_memslot(struct kvm *kvm, diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 1e83359f286b..1335f0001bdd 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -55,6 +55,7 @@ #include #define KVM_ARCH_WANT_MMU_NOTIFIER +#define KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS #define HPTEG_CACHE_NUM (1 << 15) #define HPTEG_HASH_BITS_PTE 13 diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index 8aacd76bb702..21ab0332eb42 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -281,11 +281,10 @@ struct kvmppc_ops { const struct kvm_memory_slot *old, const struct kvm_memory_slot *new, enum kvm_mr_change change); - int (*unmap_hva_range)(struct kvm *kvm, unsigned long start, - unsigned long end); - int (*age_hva)(struct kvm *kvm, unsigned long start, unsigned long end); - int (*test_age_hva)(struct kvm *kvm, unsigned long hva); - void (*set_spte_hva)(struct kvm *kvm, unsigned long hva, pte_t pte); + bool (*unmap_gfn_range)(struct kvm *kvm, struct kvm_gfn_range *range); + bool (*age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range); + bool (*test_age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range); + bool (*set_spte_gfn)(struct kvm *kvm, struct kvm_gfn_range *range); void (*free_memslot)(struct kvm_memory_slot *slot); int (*init_vm)(struct kvm *kvm); void (*destroy_vm)(struct kvm *kvm); diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index 44bf567b6589..2b691f4d1f26 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -834,26 +834,24 @@ void kvmppc_core_commit_memory_region(struct kvm *kvm, kvm->arch.kvm_ops->commit_memory_region(kvm, mem, old, new, change); } -int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end, - unsigned flags) +bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) { - return kvm->arch.kvm_ops->unmap_hva_range(kvm, start, end); + return kvm->arch.kvm_ops->unmap_gfn_range(kvm, range); } -int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end) +bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { - return kvm->arch.kvm_ops->age_hva(kvm, start, end); + return kvm->arch.kvm_ops->age_gfn(kvm, range); } -int kvm_test_age_hva(struct kvm *kvm, unsigned long hva) +bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { - return kvm->arch.kvm_ops->test_age_hva(kvm, hva); + return kvm->arch.kvm_ops->test_age_gfn(kvm, range); } -int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte) +bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { - kvm->arch.kvm_ops->set_spte_hva(kvm, hva, pte); - return 0; + return kvm->arch.kvm_ops->set_spte_gfn(kvm, range); } int kvmppc_core_init_vm(struct kvm *kvm) diff --git a/arch/powerpc/kvm/book3s.h b/arch/powerpc/kvm/book3s.h index 9b6323ec8e60..740e51def5a5 100644 --- a/arch/powerpc/kvm/book3s.h +++ b/arch/powerpc/kvm/book3s.h @@ -9,12 +9,10 @@ extern void kvmppc_core_flush_memslot_hv(struct kvm *kvm, struct kvm_memory_slot *memslot); -extern int kvm_unmap_hva_range_hv(struct kvm *kvm, unsigned long start, - unsigned long end); -extern int kvm_age_hva_hv(struct kvm *kvm, unsigned long start, - unsigned long end); -extern int kvm_test_age_hva_hv(struct kvm *kvm, unsigned long hva); -extern void kvm_set_spte_hva_hv(struct kvm *kvm, unsigned long hva, pte_t pte); +extern bool kvm_unmap_gfn_range_hv(struct kvm *kvm, struct kvm_gfn_range *range); +extern bool kvm_age_gfn_hv(struct kvm *kvm, struct kvm_gfn_range *range); +extern bool kvm_test_age_gfn_hv(struct kvm *kvm, struct kvm_gfn_range *range); +extern bool kvm_set_spte_gfn_hv(struct kvm *kvm, struct kvm_gfn_range *range); extern int kvmppc_mmu_init_pr(struct kvm_vcpu *vcpu); extern void kvmppc_mmu_destroy_pr(struct kvm_vcpu *vcpu); diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index bb6773594cf8..b7bd9ca040b8 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -752,51 +752,6 @@ void kvmppc_rmap_reset(struct kvm *kvm) srcu_read_unlock(&kvm->srcu, srcu_idx); } -typedef int (*hva_handler_fn)(struct kvm *kvm, struct kvm_memory_slot *memslot, - unsigned long gfn); - -static int kvm_handle_hva_range(struct kvm *kvm, - unsigned long start, - unsigned long end, - hva_handler_fn handler) -{ - int ret; - int retval = 0; - struct kvm_memslots *slots; - struct kvm_memory_slot *memslot; - - slots = kvm_memslots(kvm); - kvm_for_each_memslot(memslot, slots) { - unsigned long hva_start, hva_end; - gfn_t gfn, gfn_end; - - hva_start = max(start, memslot->userspace_addr); - hva_end = min(end, memslot->userspace_addr + - (memslot->npages << PAGE_SHIFT)); - if (hva_start >= hva_end) - continue; - /* - * {gfn(page) | page intersects with [hva_start, hva_end)} = - * {gfn, gfn+1, ..., gfn_end-1}. - */ - gfn = hva_to_gfn_memslot(hva_start, memslot); - gfn_end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, memslot); - - for (; gfn < gfn_end; ++gfn) { - ret = handler(kvm, memslot, gfn); - retval |= ret; - } - } - - return retval; -} - -static int kvm_handle_hva(struct kvm *kvm, unsigned long hva, - hva_handler_fn handler) -{ - return kvm_handle_hva_range(kvm, hva, hva + 1, handler); -} - /* Must be called with both HPTE and rmap locked */ static void kvmppc_unmap_hpte(struct kvm *kvm, unsigned long i, struct kvm_memory_slot *memslot, @@ -840,8 +795,8 @@ static void kvmppc_unmap_hpte(struct kvm *kvm, unsigned long i, } } -static int kvm_unmap_rmapp(struct kvm *kvm, struct kvm_memory_slot *memslot, - unsigned long gfn) +static bool kvm_unmap_rmapp(struct kvm *kvm, struct kvm_memory_slot *memslot, + unsigned long gfn) { unsigned long i; __be64 *hptep; @@ -874,16 +829,15 @@ static int kvm_unmap_rmapp(struct kvm *kvm, struct kvm_memory_slot *memslot, unlock_rmap(rmapp); __unlock_hpte(hptep, be64_to_cpu(hptep[0])); } - return 0; + return false; } -int kvm_unmap_hva_range_hv(struct kvm *kvm, unsigned long start, unsigned long end) +bool kvm_unmap_gfn_range_hv(struct kvm *kvm, struct kvm_gfn_range *range) { - hva_handler_fn handler; + if (kvm_is_radix(kvm)) + return kvm_unmap_radix(kvm, range->slot, range->start); - handler = kvm_is_radix(kvm) ? kvm_unmap_radix : kvm_unmap_rmapp; - kvm_handle_hva_range(kvm, start, end, handler); - return 0; + return kvm_unmap_rmapp(kvm, range->slot, range->start); } void kvmppc_core_flush_memslot_hv(struct kvm *kvm, @@ -913,8 +867,8 @@ void kvmppc_core_flush_memslot_hv(struct kvm *kvm, } } -static int kvm_age_rmapp(struct kvm *kvm, struct kvm_memory_slot *memslot, - unsigned long gfn) +static bool kvm_age_rmapp(struct kvm *kvm, struct kvm_memory_slot *memslot, + unsigned long gfn) { struct revmap_entry *rev = kvm->arch.hpt.rev; unsigned long head, i, j; @@ -968,26 +922,26 @@ static int kvm_age_rmapp(struct kvm *kvm, struct kvm_memory_slot *memslot, return ret; } -int kvm_age_hva_hv(struct kvm *kvm, unsigned long start, unsigned long end) +bool kvm_age_gfn_hv(struct kvm *kvm, struct kvm_gfn_range *range) { - hva_handler_fn handler; + if (kvm_is_radix(kvm)) + kvm_age_radix(kvm, range->slot, range->start); - handler = kvm_is_radix(kvm) ? kvm_age_radix : kvm_age_rmapp; - return kvm_handle_hva_range(kvm, start, end, handler); + return kvm_age_rmapp(kvm, range->slot, range->start); } -static int kvm_test_age_rmapp(struct kvm *kvm, struct kvm_memory_slot *memslot, - unsigned long gfn) +static bool kvm_test_age_rmapp(struct kvm *kvm, struct kvm_memory_slot *memslot, + unsigned long gfn) { struct revmap_entry *rev = kvm->arch.hpt.rev; unsigned long head, i, j; unsigned long *hp; - int ret = 1; + bool ret = true; unsigned long *rmapp; rmapp = &memslot->arch.rmap[gfn - memslot->base_gfn]; if (*rmapp & KVMPPC_RMAP_REFERENCED) - return 1; + return true; lock_rmap(rmapp); if (*rmapp & KVMPPC_RMAP_REFERENCED) @@ -1002,27 +956,27 @@ static int kvm_test_age_rmapp(struct kvm *kvm, struct kvm_memory_slot *memslot, goto out; } while ((i = j) != head); } - ret = 0; + ret = false; out: unlock_rmap(rmapp); return ret; } -int kvm_test_age_hva_hv(struct kvm *kvm, unsigned long hva) +bool kvm_test_age_gfn_hv(struct kvm *kvm, struct kvm_gfn_range *range) { - hva_handler_fn handler; + if (kvm_is_radix(kvm)) + kvm_test_age_radix(kvm, range->slot, range->start); - handler = kvm_is_radix(kvm) ? kvm_test_age_radix : kvm_test_age_rmapp; - return kvm_handle_hva(kvm, hva, handler); + return kvm_test_age_rmapp(kvm, range->slot, range->start); } -void kvm_set_spte_hva_hv(struct kvm *kvm, unsigned long hva, pte_t pte) +bool kvm_set_spte_gfn_hv(struct kvm *kvm, struct kvm_gfn_range *range) { - hva_handler_fn handler; + if (kvm_is_radix(kvm)) + return kvm_unmap_radix(kvm, range->slot, range->start); - handler = kvm_is_radix(kvm) ? kvm_unmap_radix : kvm_unmap_rmapp; - kvm_handle_hva(kvm, hva, handler); + return kvm_unmap_rmapp(kvm, range->slot, range->start); } static int vcpus_running(struct kvm *kvm) diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c index e603de7ade52..ec4f58fa9f5a 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c @@ -993,8 +993,8 @@ int kvmppc_book3s_radix_page_fault(struct kvm_vcpu *vcpu, } /* Called with kvm->mmu_lock held */ -int kvm_unmap_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, - unsigned long gfn) +bool kvm_unmap_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, + unsigned long gfn) { pte_t *ptep; unsigned long gpa = gfn << PAGE_SHIFT; @@ -1002,24 +1002,24 @@ int kvm_unmap_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_DONE) { uv_page_inval(kvm->arch.lpid, gpa, PAGE_SHIFT); - return 0; + return false; } ptep = find_kvm_secondary_pte(kvm, gpa, &shift); if (ptep && pte_present(*ptep)) kvmppc_unmap_pte(kvm, ptep, gpa, shift, memslot, kvm->arch.lpid); - return 0; + return false; } /* Called with kvm->mmu_lock held */ -int kvm_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, - unsigned long gfn) +bool kvm_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, + unsigned long gfn) { pte_t *ptep; unsigned long gpa = gfn << PAGE_SHIFT; unsigned int shift; - int ref = 0; + bool ref = false; unsigned long old, *rmapp; if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_DONE) @@ -1035,26 +1035,27 @@ int kvm_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, kvmhv_update_nest_rmap_rc_list(kvm, rmapp, _PAGE_ACCESSED, 0, old & PTE_RPN_MASK, 1UL << shift); - ref = 1; + ref = true; } return ref; } /* Called with kvm->mmu_lock held */ -int kvm_test_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, - unsigned long gfn) +bool kvm_test_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, + unsigned long gfn) + { pte_t *ptep; unsigned long gpa = gfn << PAGE_SHIFT; unsigned int shift; - int ref = 0; + bool ref = false; if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_DONE) return ref; ptep = find_kvm_secondary_pte(kvm, gpa, &shift); if (ptep && pte_present(*ptep) && pte_young(*ptep)) - ref = 1; + ref = true; return ref; } diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 13bad6bf4c95..07682ad4110e 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -4770,7 +4770,7 @@ int kvmppc_switch_mmu_to_hpt(struct kvm *kvm) kvmhv_release_all_nested(kvm); kvmppc_rmap_reset(kvm); kvm->arch.process_table = 0; - /* Mutual exclusion with kvm_unmap_hva_range etc. */ + /* Mutual exclusion with kvm_unmap_gfn_range etc. */ spin_lock(&kvm->mmu_lock); kvm->arch.radix = 0; spin_unlock(&kvm->mmu_lock); @@ -4792,7 +4792,7 @@ int kvmppc_switch_mmu_to_radix(struct kvm *kvm) if (err) return err; kvmppc_rmap_reset(kvm); - /* Mutual exclusion with kvm_unmap_hva_range etc. */ + /* Mutual exclusion with kvm_unmap_gfn_range etc. */ spin_lock(&kvm->mmu_lock); kvm->arch.radix = 1; spin_unlock(&kvm->mmu_lock); @@ -5654,10 +5654,10 @@ static struct kvmppc_ops kvm_ops_hv = { .flush_memslot = kvmppc_core_flush_memslot_hv, .prepare_memory_region = kvmppc_core_prepare_memory_region_hv, .commit_memory_region = kvmppc_core_commit_memory_region_hv, - .unmap_hva_range = kvm_unmap_hva_range_hv, - .age_hva = kvm_age_hva_hv, - .test_age_hva = kvm_test_age_hva_hv, - .set_spte_hva = kvm_set_spte_hva_hv, + .unmap_gfn_range = kvm_unmap_gfn_range_hv, + .age_gfn = kvm_age_gfn_hv, + .test_age_gfn = kvm_test_age_gfn_hv, + .set_spte_gfn = kvm_set_spte_gfn_hv, .free_memslot = kvmppc_core_free_memslot_hv, .init_vm = kvmppc_core_init_vm_hv, .destroy_vm = kvmppc_core_destroy_vm_hv, diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c index 913944dc3620..d7733b07f489 100644 --- a/arch/powerpc/kvm/book3s_pr.c +++ b/arch/powerpc/kvm/book3s_pr.c @@ -425,61 +425,39 @@ static int kvmppc_core_check_requests_pr(struct kvm_vcpu *vcpu) } /************* MMU Notifiers *************/ -static void do_kvm_unmap_hva(struct kvm *kvm, unsigned long start, - unsigned long end) +static bool do_kvm_unmap_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { long i; struct kvm_vcpu *vcpu; - struct kvm_memslots *slots; - struct kvm_memory_slot *memslot; - slots = kvm_memslots(kvm); - kvm_for_each_memslot(memslot, slots) { - unsigned long hva_start, hva_end; - gfn_t gfn, gfn_end; + kvm_for_each_vcpu(i, vcpu, kvm) + kvmppc_mmu_pte_pflush(vcpu, range->start << PAGE_SHIFT, + range->end << PAGE_SHIFT); - hva_start = max(start, memslot->userspace_addr); - hva_end = min(end, memslot->userspace_addr + - (memslot->npages << PAGE_SHIFT)); - if (hva_start >= hva_end) - continue; - /* - * {gfn(page) | page intersects with [hva_start, hva_end)} = - * {gfn, gfn+1, ..., gfn_end-1}. - */ - gfn = hva_to_gfn_memslot(hva_start, memslot); - gfn_end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, memslot); - kvm_for_each_vcpu(i, vcpu, kvm) - kvmppc_mmu_pte_pflush(vcpu, gfn << PAGE_SHIFT, - gfn_end << PAGE_SHIFT); - } + return false; } -static int kvm_unmap_hva_range_pr(struct kvm *kvm, unsigned long start, - unsigned long end) +static bool kvm_unmap_gfn_range_pr(struct kvm *kvm, struct kvm_gfn_range *range) { - do_kvm_unmap_hva(kvm, start, end); - - return 0; + return do_kvm_unmap_gfn(kvm, range); } -static int kvm_age_hva_pr(struct kvm *kvm, unsigned long start, - unsigned long end) +static bool kvm_age_gfn_pr(struct kvm *kvm, struct kvm_gfn_range *range) { /* XXX could be more clever ;) */ - return 0; + return false; } -static int kvm_test_age_hva_pr(struct kvm *kvm, unsigned long hva) +static bool kvm_test_age_gfn_pr(struct kvm *kvm, struct kvm_gfn_range *range) { /* XXX could be more clever ;) */ - return 0; + return false; } -static void kvm_set_spte_hva_pr(struct kvm *kvm, unsigned long hva, pte_t pte) +static bool kvm_set_spte_gfn_pr(struct kvm *kvm, struct kvm_gfn_range *range) { /* The page will get remapped properly on its next fault */ - do_kvm_unmap_hva(kvm, hva, hva + PAGE_SIZE); + return do_kvm_unmap_gfn(kvm, range); } /*****************************************/ @@ -2079,10 +2057,10 @@ static struct kvmppc_ops kvm_ops_pr = { .flush_memslot = kvmppc_core_flush_memslot_pr, .prepare_memory_region = kvmppc_core_prepare_memory_region_pr, .commit_memory_region = kvmppc_core_commit_memory_region_pr, - .unmap_hva_range = kvm_unmap_hva_range_pr, - .age_hva = kvm_age_hva_pr, - .test_age_hva = kvm_test_age_hva_pr, - .set_spte_hva = kvm_set_spte_hva_pr, + .unmap_gfn_range = kvm_unmap_gfn_range_pr, + .age_gfn = kvm_age_gfn_pr, + .test_age_gfn = kvm_test_age_gfn_pr, + .set_spte_gfn = kvm_set_spte_gfn_pr, .free_memslot = kvmppc_core_free_memslot_pr, .init_vm = kvmppc_core_init_vm_pr, .destroy_vm = kvmppc_core_destroy_vm_pr, diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 648aefe1a3e7..7f16afc331ef 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -721,43 +721,36 @@ int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, /************* MMU Notifiers *************/ -static int kvm_unmap_hva(struct kvm *kvm, unsigned long hva) +static bool kvm_e500_mmu_unmap_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { /* * Flush all shadow tlb entries everywhere. This is slow, but * we are 100% sure that we catch the to be unmapped page */ - kvm_flush_remote_tlbs(kvm); - - return 0; + return true; } -int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end, - unsigned flags) +bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) { - /* kvm_unmap_hva flushes everything anyways */ - kvm_unmap_hva(kvm, start); - - return 0; + return kvm_e500_mmu_unmap_gfn(kvm, range); } -int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end) +bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { /* XXX could be more clever ;) */ - return 0; + return false; } -int kvm_test_age_hva(struct kvm *kvm, unsigned long hva) +bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { /* XXX could be more clever ;) */ - return 0; + return false; } -int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte) +bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { /* The page will get remapped properly on its next fault */ - kvm_unmap_hva(kvm, hva); - return 0; + return kvm_e500_mmu_unmap_gfn(kvm, range); } /*****************************************/ From patchwork Fri Apr 2 00:56:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 1461535 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=eMrfPNSF; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4FBM8n5dttz9t1J for ; Fri, 2 Apr 2021 11:57:29 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234489AbhDBA52 (ORCPT ); Thu, 1 Apr 2021 20:57:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40358 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234103AbhDBA5T (ORCPT ); Thu, 1 Apr 2021 20:57:19 -0400 Received: from mail-qv1-xf49.google.com (mail-qv1-xf49.google.com [IPv6:2607:f8b0:4864:20::f49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1B40FC0613AB for ; Thu, 1 Apr 2021 17:57:15 -0700 (PDT) Received: by mail-qv1-xf49.google.com with SMTP id bt20so4392717qvb.0 for ; Thu, 01 Apr 2021 17:57:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=JN6JeOh2os7TETUBdatO/SE8rrYNulCw/eK8aUoUKlo=; b=eMrfPNSFFf9FEy/PgOFXlE0QW6+eiJ/B/GFm3098SJ8t5ML9FvPe4akfK9uP8Odqgd 5bYn9hI52Cau/Zsc617HbZQPDpkDKTLNcWBqUn/yAO2EpjZEkp+jvMD3XcuPSs4b2NMx WTv2z4lQ7k1SMUeYuNPXv8EQZaJNVMMEpYyaJPjxJJHpO0BLXzl+0s9GnNslh5WIcl2o TmKMXkqNiylGj6uzaa0oXxBmNOQYDw86juXhCK7lDI7IRZjJKhF4iYXtGKqBsaBj0esq MW2yBN8IALvf1FH0CFOQHqQlm6PjoXPO+2fS171RoAUufjT5YLihSSEqzQLxWtTVqGS5 uufw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=JN6JeOh2os7TETUBdatO/SE8rrYNulCw/eK8aUoUKlo=; b=r86R8ugIiG6AT+FPfuf1BWQYQ30StNOrSda8i3wCm0J8pM+ntoDAS4xMzpmvW/6M07 4ZQW15ksQfT0vhlOIrrAiXCuYseC+KoOksAfPiz7v0ZVGGHVTcA0QbpiJTDI74Snajd6 kD8vnSgpxHuakg5U7CvzM7Y/O/mXZ+EUY9RNOQLE0BoP9g5Tf3dn8qTuBN5AxgL1Td2H wgvwePHuP45ahrkMtTrsaLmLQoqm6orG9Tj1/sULCza60ENGTjaxoXHGwHUthiOMue3j qioZ1eTp/e3NQGxaZZUylg6ZohCSShS1xgSHq8oc4737KKTScZaqwrfCT1iGf4HBCYCY Gihg== X-Gm-Message-State: AOAM530yb/L9yVw7s+Wf1b5aWOpDwXNuiqwESaDJ9ri9PbhT74uiYvzZ t0fMyyJrAqkX3DgS4kQ9s6mzpypTZUY= X-Google-Smtp-Source: ABdhPJykRnNFawXAR7GTX30/MArrY/XbyRXdmuqfk1vxAaup1bjQe5AJdJRGU4o3M0zacGSEUMNzYTdDoHo= X-Received: from seanjc798194.pdx.corp.google.com ([2620:15c:f:10:c0b4:8b8:bb34:6a56]) (user=seanjc job=sendgmr) by 2002:a0c:c243:: with SMTP id w3mr342583qvh.34.1617325034296; Thu, 01 Apr 2021 17:57:14 -0700 (PDT) Reply-To: Sean Christopherson Date: Thu, 1 Apr 2021 17:56:54 -0700 In-Reply-To: <20210402005658.3024832-1-seanjc@google.com> Message-Id: <20210402005658.3024832-7-seanjc@google.com> Mime-Version: 1.0 References: <20210402005658.3024832-1-seanjc@google.com> X-Mailer: git-send-email 2.31.0.208.g409f899ff0-goog Subject: [PATCH v2 06/10] KVM: Kill off the old hva-based MMU notifier callbacks From: Sean Christopherson To: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Paul Mackerras , Paolo Bonzini Cc: James Morse , Julien Thierry , Suzuki K Poulose , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Yank out the hva-based MMU notifier APIs now that all architectures that use the notifiers have moved to the gfn-based APIs. No functional change intended. Signed-off-by: Sean Christopherson --- arch/arm64/include/asm/kvm_host.h | 1 - arch/mips/include/asm/kvm_host.h | 1 - arch/powerpc/include/asm/kvm_host.h | 1 - arch/x86/include/asm/kvm_host.h | 1 - include/linux/kvm_host.h | 8 --- virt/kvm/kvm_main.c | 85 ----------------------------- 6 files changed, 97 deletions(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 1ad729cf7b0d..72e6b4600264 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -582,7 +582,6 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu, struct kvm_vcpu_events *events); #define KVM_ARCH_WANT_MMU_NOTIFIER -#define KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS void kvm_arm_halt_guest(struct kvm *kvm); void kvm_arm_resume_guest(struct kvm *kvm); diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h index 374a3c8806e8..feaa77036b67 100644 --- a/arch/mips/include/asm/kvm_host.h +++ b/arch/mips/include/asm/kvm_host.h @@ -967,7 +967,6 @@ enum kvm_mips_fault_result kvm_trap_emul_gva_fault(struct kvm_vcpu *vcpu, bool write); #define KVM_ARCH_WANT_MMU_NOTIFIER -#define KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS /* Emulation */ int kvm_get_inst(u32 *opc, struct kvm_vcpu *vcpu, u32 *out); diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 1335f0001bdd..1e83359f286b 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -55,7 +55,6 @@ #include #define KVM_ARCH_WANT_MMU_NOTIFIER -#define KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS #define HPTEG_CACHE_NUM (1 << 15) #define HPTEG_HASH_BITS_PTE 13 diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index a21e3698f4dc..99778ac51243 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1718,7 +1718,6 @@ asmlinkage void kvm_spurious_fault(void); _ASM_EXTABLE(666b, 667b) #define KVM_ARCH_WANT_MMU_NOTIFIER -#define KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v); int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index e6bb401dd856..40ac2d40bb5a 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -219,7 +219,6 @@ int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu); #endif #ifdef KVM_ARCH_WANT_MMU_NOTIFIER -#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS struct kvm_gfn_range { struct kvm_memory_slot *slot; gfn_t start; @@ -231,13 +230,6 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range); bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range); bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range); bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range); -#else -int kvm_unmap_hva_range(struct kvm *kvm, - unsigned long start, unsigned long end, unsigned flags); -int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte); -int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end); -int kvm_test_age_hva(struct kvm *kvm, unsigned long hva); -#endif /* KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS */ #endif enum { diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 7a7e62ae5eb4..2e809d73c7f1 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -451,8 +451,6 @@ static void kvm_mmu_notifier_invalidate_range(struct mmu_notifier *mn, srcu_read_unlock(&kvm->srcu, idx); } -#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS - typedef bool (*hva_handler_t)(struct kvm *kvm, struct kvm_gfn_range *range); struct kvm_hva_range { @@ -564,8 +562,6 @@ static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn return ret; } -#endif /* KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS */ - static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, struct mm_struct *mm, unsigned long address, @@ -573,9 +569,6 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, { struct kvm *kvm = mmu_notifier_to_kvm(mn); -#ifndef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS - int idx; -#endif trace_kvm_set_spte_hva(address); /* @@ -585,26 +578,13 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, */ WARN_ON_ONCE(!kvm->mmu_notifier_count); -#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS kvm_handle_hva_range(mn, address, address + 1, pte, kvm_set_spte_gfn); -#else - idx = srcu_read_lock(&kvm->srcu); - - KVM_MMU_LOCK(kvm); - - if (kvm_set_spte_hva(kvm, address, pte)) - kvm_flush_remote_tlbs(kvm); - - KVM_MMU_UNLOCK(kvm); - srcu_read_unlock(&kvm->srcu, idx); -#endif } static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, const struct mmu_notifier_range *range) { struct kvm *kvm = mmu_notifier_to_kvm(mn); -#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS const struct kvm_hva_range hva_range = { .start = range->start, .end = range->end, @@ -613,16 +593,9 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, .flush_on_ret = true, .may_block = mmu_notifier_range_blockable(range), }; -#else - int need_tlb_flush = 0, idx; -#endif trace_kvm_unmap_hva_range(range->start, range->end); -#ifndef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS - idx = srcu_read_lock(&kvm->srcu); -#endif - KVM_MMU_LOCK(kvm); /* * The count increase must become visible at unlock time as no @@ -649,20 +622,9 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, max(kvm->mmu_notifier_range_end, range->end); } -#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS __kvm_handle_hva_range(kvm, &hva_range); -#else - need_tlb_flush = kvm_unmap_hva_range(kvm, range->start, range->end, - range->flags); - /* we've to flush the tlb before the pages can be freed */ - if (need_tlb_flush || kvm->tlbs_dirty) - kvm_flush_remote_tlbs(kvm); -#endif KVM_MMU_UNLOCK(kvm); -#ifndef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS - srcu_read_unlock(&kvm->srcu, idx); -#endif return 0; } @@ -696,27 +658,9 @@ static int kvm_mmu_notifier_clear_flush_young(struct mmu_notifier *mn, unsigned long start, unsigned long end) { -#ifndef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS - struct kvm *kvm = mmu_notifier_to_kvm(mn); - int young, idx; -#endif trace_kvm_age_hva(start, end); -#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS return kvm_handle_hva_range(mn, start, end, __pte(0), kvm_age_gfn); -#else - idx = srcu_read_lock(&kvm->srcu); - KVM_MMU_LOCK(kvm); - - young = kvm_age_hva(kvm, start, end); - if (young) - kvm_flush_remote_tlbs(kvm); - - KVM_MMU_UNLOCK(kvm); - srcu_read_unlock(&kvm->srcu, idx); - - return young; -#endif } static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn, @@ -724,11 +668,6 @@ static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn, unsigned long start, unsigned long end) { -#ifndef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS - struct kvm *kvm = mmu_notifier_to_kvm(mn); - int young, idx; -#endif - trace_kvm_age_hva(start, end); /* @@ -744,41 +683,17 @@ static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn, * cadence. If we find this inaccurate, we might come up with a * more sophisticated heuristic later. */ -#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS return kvm_handle_hva_range_no_flush(mn, start, end, kvm_age_gfn); -#else - idx = srcu_read_lock(&kvm->srcu); - KVM_MMU_LOCK(kvm); - young = kvm_age_hva(kvm, start, end); - KVM_MMU_UNLOCK(kvm); - srcu_read_unlock(&kvm->srcu, idx); - - return young; -#endif } static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn, struct mm_struct *mm, unsigned long address) { -#ifndef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS - struct kvm *kvm = mmu_notifier_to_kvm(mn); - int young, idx; -#endif trace_kvm_test_age_hva(address); -#ifdef KVM_ARCH_WANT_NEW_MMU_NOTIFIER_APIS return kvm_handle_hva_range_no_flush(mn, address, address + 1, kvm_test_age_gfn); -#else - idx = srcu_read_lock(&kvm->srcu); - KVM_MMU_LOCK(kvm); - young = kvm_test_age_hva(kvm, address); - KVM_MMU_UNLOCK(kvm); - srcu_read_unlock(&kvm->srcu, idx); - - return young; -#endif } static void kvm_mmu_notifier_release(struct mmu_notifier *mn, From patchwork Fri Apr 2 00:56:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 1461536 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=PrHz1XlM; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4FBM910R6Sz9sXb for ; Fri, 2 Apr 2021 11:57:41 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234618AbhDBA5j (ORCPT ); Thu, 1 Apr 2021 20:57:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40364 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234264AbhDBA5V (ORCPT ); Thu, 1 Apr 2021 20:57:21 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2DAC1C0613AF for ; Thu, 1 Apr 2021 17:57:17 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id k15so1101007ybh.6 for ; Thu, 01 Apr 2021 17:57:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=y/zVj+e4eh+grd7JhvH5qtp7OcgDLrdJbmGHERKxrC4=; b=PrHz1XlM0Hr7dj/3k+U6DFcxjEzY8eyRBNbENuiKfougxaC5m5hKN5hYidhk0YwqdT 1LEDnDhb/EBVy/LOAZAarZgzg7AH365xy5vZG1To6JtHcs5VoUrCierjI5vawTWHHZaI S3XJrNV82DxrEwSC+E9EFd6HQDUfpciTkbU/weWRx5LhLPdoxmeW914MN5/QzqtXY4YB EGen6yDxmxresT7lL0GH8/iVOR4NyDpNgSg+dJZSEStAy1itt5EUClsyWwtGjaKrokJ+ a/9d/Dr1ASfLg0C1b3mfphnteKHbNcGgzAklp33uDved6rN1h/vUnL6shLD7FYWQwAdY mIAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=y/zVj+e4eh+grd7JhvH5qtp7OcgDLrdJbmGHERKxrC4=; b=P/AOPM+9gSvnkE+pJLDwnpYha8CIKeasMwfwmYk0gioDKLnBlFqiLnwJG4mzXBN1ff QxjqSKq1UulJljndddUyDuxcuCp00mqY2Cjv14DXtVrSLcUx0qryEESxUhc+OYRKB4P2 yfGrr0FfTaP5I8k6WMQFp0QIx4mMSeE0Bc/Mj2HA6j77rqIjvU4PjpOEcq8lueRiZJjm U4ykJNfHEyVU4ehExXRJw9mvv6gyqjXypWYunwzAzEGvql9WXD7cdXxMaecwiYR76rcz /JdekKVtzl/Ml4zDh9Ugq/4PJ5Fw+WAMAmweA+3jrdQI6cS4+fuJUiR82QPeJw4mCSJX 5X0w== X-Gm-Message-State: AOAM533/NsS29zWz+B/U0WBQoXPEGygeWhKWDkl4u4c+WqirQa95WGPo 0JJbIvj9g4tgnXFPA1YkARQt56KRrlE= X-Google-Smtp-Source: ABdhPJxN7BC0gdMs77a2cNfqqIS58kJ1lGVFM4Q/Y24srpStDulvaRiUUdymISmdyQpNglBO93VbQ9/q4KM= X-Received: from seanjc798194.pdx.corp.google.com ([2620:15c:f:10:c0b4:8b8:bb34:6a56]) (user=seanjc job=sendgmr) by 2002:a25:2b08:: with SMTP id r8mr15849707ybr.194.1617325036437; Thu, 01 Apr 2021 17:57:16 -0700 (PDT) Reply-To: Sean Christopherson Date: Thu, 1 Apr 2021 17:56:55 -0700 In-Reply-To: <20210402005658.3024832-1-seanjc@google.com> Message-Id: <20210402005658.3024832-8-seanjc@google.com> Mime-Version: 1.0 References: <20210402005658.3024832-1-seanjc@google.com> X-Mailer: git-send-email 2.31.0.208.g409f899ff0-goog Subject: [PATCH v2 07/10] KVM: Move MMU notifier's mmu_lock acquisition into common helper From: Sean Christopherson To: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Paul Mackerras , Paolo Bonzini Cc: James Morse , Julien Thierry , Suzuki K Poulose , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Acquire and release mmu_lock in the __kvm_handle_hva_range() helper instead of requiring the caller to do the same. This paves the way for future patches to take mmu_lock if and only if an overlapping memslot is found, without also having to introduce the on_lock() shenanigans used to manipulate the notifier count and sequence. No functional change intended. Signed-off-by: Sean Christopherson --- Note, the WARN_ON_ONCE that asserts on_lock and handler aren't both null is optimized out of all functions on recent gcc (for x86). I wanted to make it a BUILD_BUG_ON, but older versions of gcc aren't agressive/smart enough to optimize it out, and using __builtin_constant_p() to get it to build on older compilers prevents the assertion from firing on newer compilers when given bad input. I'm also a-ok dropping the check altogether, it just felt wrong having the semi-funky on_lock -> !handler combo without documenting that handler isn't allowed to be null in the common case. virt/kvm/kvm_main.c | 125 +++++++++++++++++++++++++++++--------------- 1 file changed, 82 insertions(+), 43 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 2e809d73c7f1..25ecb5235e17 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -453,28 +453,57 @@ static void kvm_mmu_notifier_invalidate_range(struct mmu_notifier *mn, typedef bool (*hva_handler_t)(struct kvm *kvm, struct kvm_gfn_range *range); +typedef void (*on_lock_fn_t)(struct kvm *kvm, unsigned long start, + unsigned long end); + struct kvm_hva_range { unsigned long start; unsigned long end; pte_t pte; hva_handler_t handler; + on_lock_fn_t on_lock; bool flush_on_ret; bool may_block; }; +/* + * Use a dedicated stub instead of NULL to indicate that there is no callback + * function/handler. The compiler technically can't guarantee that a real + * function will have a non-zero address, and so it will generate code to + * check for !NULL, whereas comparing against a stub will be elided at compile + * time (unless the compiler is getting long in the tooth, e.g. gcc 4.9). + */ +static void kvm_null_fn(void) +{ + +} +#define IS_KVM_NULL_FN(fn) ((fn) == (void *)kvm_null_fn) + static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, const struct kvm_hva_range *range) { - struct kvm_memory_slot *slot; - struct kvm_memslots *slots; struct kvm_gfn_range gfn_range; + struct kvm_memory_slot *slot; + struct kvm_memslots *slots; bool ret = false; int i, idx; - lockdep_assert_held_write(&kvm->mmu_lock); + /* A null handler is allowed if and only if on_lock() is provided. */ + if (WARN_ON_ONCE(IS_KVM_NULL_FN(range->on_lock) && + IS_KVM_NULL_FN(range->handler))) + return 0; + + KVM_MMU_LOCK(kvm); idx = srcu_read_lock(&kvm->srcu); + if (!IS_KVM_NULL_FN(range->on_lock)) { + range->on_lock(kvm, range->start, range->end); + + if (IS_KVM_NULL_FN(range->handler)) + goto out_unlock; + } + for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { slots = __kvm_memslots(kvm, i); kvm_for_each_memslot(slot, slots) { @@ -510,6 +539,9 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, if (range->flush_on_ret && (ret || kvm->tlbs_dirty)) kvm_flush_remote_tlbs(kvm); +out_unlock: + KVM_MMU_UNLOCK(kvm); + srcu_read_unlock(&kvm->srcu, idx); /* The notifiers are averse to booleans. :-( */ @@ -528,16 +560,12 @@ static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn, .end = end, .pte = pte, .handler = handler, + .on_lock = (void *)kvm_null_fn, .flush_on_ret = true, .may_block = false, }; - int ret; - KVM_MMU_LOCK(kvm); - ret = __kvm_handle_hva_range(kvm, &range); - KVM_MMU_UNLOCK(kvm); - - return ret; + return __kvm_handle_hva_range(kvm, &range); } static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn, @@ -551,16 +579,12 @@ static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn .end = end, .pte = __pte(0), .handler = handler, + .on_lock = (void *)kvm_null_fn, .flush_on_ret = false, .may_block = false, }; - int ret; - KVM_MMU_LOCK(kvm); - ret = __kvm_handle_hva_range(kvm, &range); - KVM_MMU_UNLOCK(kvm); - - return ret; + return __kvm_handle_hva_range(kvm, &range); } static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, struct mm_struct *mm, @@ -581,22 +605,9 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, kvm_handle_hva_range(mn, address, address + 1, pte, kvm_set_spte_gfn); } -static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, - const struct mmu_notifier_range *range) +static void kvm_inc_notifier_count(struct kvm *kvm, unsigned long start, + unsigned long end) { - struct kvm *kvm = mmu_notifier_to_kvm(mn); - const struct kvm_hva_range hva_range = { - .start = range->start, - .end = range->end, - .pte = __pte(0), - .handler = kvm_unmap_gfn_range, - .flush_on_ret = true, - .may_block = mmu_notifier_range_blockable(range), - }; - - trace_kvm_unmap_hva_range(range->start, range->end); - - KVM_MMU_LOCK(kvm); /* * The count increase must become visible at unlock time as no * spte can be established without taking the mmu_lock and @@ -604,8 +615,8 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, */ kvm->mmu_notifier_count++; if (likely(kvm->mmu_notifier_count == 1)) { - kvm->mmu_notifier_range_start = range->start; - kvm->mmu_notifier_range_end = range->end; + kvm->mmu_notifier_range_start = start; + kvm->mmu_notifier_range_end = end; } else { /* * Fully tracking multiple concurrent ranges has dimishing @@ -617,24 +628,36 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, * complete. */ kvm->mmu_notifier_range_start = - min(kvm->mmu_notifier_range_start, range->start); + min(kvm->mmu_notifier_range_start, start); kvm->mmu_notifier_range_end = - max(kvm->mmu_notifier_range_end, range->end); + max(kvm->mmu_notifier_range_end, end); } - - __kvm_handle_hva_range(kvm, &hva_range); - - KVM_MMU_UNLOCK(kvm); - - return 0; } -static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn, +static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, const struct mmu_notifier_range *range) { struct kvm *kvm = mmu_notifier_to_kvm(mn); + const struct kvm_hva_range hva_range = { + .start = range->start, + .end = range->end, + .pte = __pte(0), + .handler = kvm_unmap_gfn_range, + .on_lock = kvm_inc_notifier_count, + .flush_on_ret = true, + .may_block = mmu_notifier_range_blockable(range), + }; - KVM_MMU_LOCK(kvm); + trace_kvm_unmap_hva_range(range->start, range->end); + + __kvm_handle_hva_range(kvm, &hva_range); + + return 0; +} + +static void kvm_dec_notifier_count(struct kvm *kvm, unsigned long start, + unsigned long end) +{ /* * This sequence increase will notify the kvm page fault that * the page that is going to be mapped in the spte could have @@ -648,7 +671,23 @@ static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn, * in conjunction with the smp_rmb in mmu_notifier_retry(). */ kvm->mmu_notifier_count--; - KVM_MMU_UNLOCK(kvm); +} + +static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn, + const struct mmu_notifier_range *range) +{ + struct kvm *kvm = mmu_notifier_to_kvm(mn); + const struct kvm_hva_range hva_range = { + .start = range->start, + .end = range->end, + .pte = __pte(0), + .handler = (void *)kvm_null_fn, + .on_lock = kvm_dec_notifier_count, + .flush_on_ret = true, + .may_block = mmu_notifier_range_blockable(range), + }; + + __kvm_handle_hva_range(kvm, &hva_range); BUG_ON(kvm->mmu_notifier_count < 0); } From patchwork Fri Apr 2 00:56:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 1461537 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=oYFGYSyo; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4FBM932bTyz9sXb for ; Fri, 2 Apr 2021 11:57:43 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234670AbhDBA5m (ORCPT ); Thu, 1 Apr 2021 20:57:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40356 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234200AbhDBA5V (ORCPT ); Thu, 1 Apr 2021 20:57:21 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7ED34C0613B4 for ; Thu, 1 Apr 2021 17:57:19 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id f75so7596068yba.8 for ; Thu, 01 Apr 2021 17:57:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=0/ecd8NmYTmeaiIYnkr6nuUXDKg3hRMT4+OQ0nyX4DE=; b=oYFGYSyo5B35UKJnfDl/ptvTjCVW8KNCbB5+juLbSKFFyhdI1o15HSF2gl9NbqxKmR WTeg8loXzFmw7sYPBcDz9SnE1ep/TQ8R11Jo0v6/+BXHFnVXD5GetugNTwD0b5Bqt1L/ mx+Wv2fDdgWFKAi4FQThRwTMjWA7vs3uA/I3KI9C+xT2UVJEymjeiJxY9WdDCMtNY1Gh GqU4Bg7UboHpTTQPVmttgw2eCvKNRYVbBfciANMUqXXzeQ8x+WKQOPRt9i/fgo0c9Wxl r3IbL0WnEuE8LcmbAeXvX6nCRCfN/9RtrNuT+vA3q5p7oxMFeV4Y6S1fLZYoQC3OglWL 5a1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=0/ecd8NmYTmeaiIYnkr6nuUXDKg3hRMT4+OQ0nyX4DE=; b=K+yWx4TFgMDfpoYZv52c8RlxM9SsinsUzhZByMnJeh+ctHjkUROzCOlm+vWJa4Vcb2 2A5DoWasTZDps64wEhpVID0DzvlNIArUpurXFhlKsBELdZJMAu6pvnw11PF3sg6isJDj YM7a4L7sqswphsjq4LpI8hOCTc6zj1fnqH4k/hST4bol+diMSMOgBAE/6LM6h5lS3pWR 5exK1mIbtn9XKoUL/yl6zKsiZzeGm4glZCFcybksQA2wcu9cCeUkambcBX7ZbB1+rJe9 Ep+k6Qud6q1KVK5rBWZrk3KT/sWuj1iliiyc19mi8EiN0Mp9umc2cxNElgKalBTAfAjM K6HQ== X-Gm-Message-State: AOAM531hlosV1h3TJxX5AxZWB+RoWpl39sF+fZbf+oTkUQUd7dEqG8H1 UQsmfdg6JSow8Nkl+w58XlyJVGrlAXg= X-Google-Smtp-Source: ABdhPJzt4o9P3PY2HrRiNrmz7rdm15sQ6vGfWjEexZR+eCJ9LN9XguP8ZdUR1MsXg+JiybkoEf+aAjrbxig= X-Received: from seanjc798194.pdx.corp.google.com ([2620:15c:f:10:c0b4:8b8:bb34:6a56]) (user=seanjc job=sendgmr) by 2002:a25:74d2:: with SMTP id p201mr16375842ybc.406.1617325038741; Thu, 01 Apr 2021 17:57:18 -0700 (PDT) Reply-To: Sean Christopherson Date: Thu, 1 Apr 2021 17:56:56 -0700 In-Reply-To: <20210402005658.3024832-1-seanjc@google.com> Message-Id: <20210402005658.3024832-9-seanjc@google.com> Mime-Version: 1.0 References: <20210402005658.3024832-1-seanjc@google.com> X-Mailer: git-send-email 2.31.0.208.g409f899ff0-goog Subject: [PATCH v2 08/10] KVM: Take mmu_lock when handling MMU notifier iff the hva hits a memslot From: Sean Christopherson To: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Paul Mackerras , Paolo Bonzini Cc: James Morse , Julien Thierry , Suzuki K Poulose , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Defer acquiring mmu_lock in the MMU notifier paths until a "hit" has been detected in the memslots, i.e. don't take the lock for notifications that don't affect the guest. For small VMs, spurious locking is a minor annoyance. And for "volatile" setups where the majority of notifications _are_ relevant, this barely qualifies as an optimization. But, for large VMs (hundreds of threads) with static setups, e.g. no page migration, no swapping, etc..., the vast majority of MMU notifier callbacks will be unrelated to the guest, e.g. will often be in response to the userspace VMM adjusting its own virtual address space. In such large VMs, acquiring mmu_lock can be painful as it blocks vCPUs from handling page faults. In some scenarios it can even be "fatal" in the sense that it causes unacceptable brownouts, e.g. when rebuilding huge pages after live migration, a significant percentage of vCPUs will be attempting to handle page faults. x86's TDP MMU implementation is especially susceptible to spurious locking due it taking mmu_lock for read when handling page faults. Because rwlock is fair, a single writer will stall future readers, while the writer is itself stalled waiting for in-progress readers to complete. This is exacerbated by the MMU notifiers often firing multiple times in quick succession, e.g. moving a page will (always?) invoke three separate notifiers: .invalidate_range_start(), invalidate_range_end(), and .change_pte(). Unnecessarily taking mmu_lock each time means even a single spurious sequence can be problematic. Note, this optimizes only the unpaired callbacks. Optimizing the .invalidate_range_{start,end}() pairs is more complex and will be done in a future patch. Suggested-by: Ben Gardon Signed-off-by: Sean Christopherson --- virt/kvm/kvm_main.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 25ecb5235e17..f6697ad741ed 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -482,10 +482,10 @@ static void kvm_null_fn(void) static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, const struct kvm_hva_range *range) { + bool ret = false, locked = false; struct kvm_gfn_range gfn_range; struct kvm_memory_slot *slot; struct kvm_memslots *slots; - bool ret = false; int i, idx; /* A null handler is allowed if and only if on_lock() is provided. */ @@ -493,11 +493,13 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, IS_KVM_NULL_FN(range->handler))) return 0; - KVM_MMU_LOCK(kvm); - idx = srcu_read_lock(&kvm->srcu); + /* The on_lock() path does not yet support lock elision. */ if (!IS_KVM_NULL_FN(range->on_lock)) { + locked = true; + KVM_MMU_LOCK(kvm); + range->on_lock(kvm, range->start, range->end); if (IS_KVM_NULL_FN(range->handler)) @@ -532,6 +534,10 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, gfn_range.end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, slot); gfn_range.slot = slot; + if (!locked) { + locked = true; + KVM_MMU_LOCK(kvm); + } ret |= range->handler(kvm, &gfn_range); } } @@ -540,7 +546,8 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, kvm_flush_remote_tlbs(kvm); out_unlock: - KVM_MMU_UNLOCK(kvm); + if (locked) + KVM_MMU_UNLOCK(kvm); srcu_read_unlock(&kvm->srcu, idx); From patchwork Fri Apr 2 00:56:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 1461538 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=rqAz0oKw; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4FBM9L2KPkz9t0Y for ; Fri, 2 Apr 2021 11:57:58 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234844AbhDBA55 (ORCPT ); Thu, 1 Apr 2021 20:57:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234323AbhDBA5W (ORCPT ); Thu, 1 Apr 2021 20:57:22 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C9EDCC061797 for ; Thu, 1 Apr 2021 17:57:21 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id u128so7588775ybf.12 for ; Thu, 01 Apr 2021 17:57:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=C3aZnevHleE8p3XJKln/4r7iVsuYY+IWqHwKw7jbYZw=; b=rqAz0oKwg+xO+mqrd/TjAqHUu8EArbD/uuJGl7OKeBcY3EwrHR9gQRQe0GvHDQcbO1 TIhniaZ8jINCps9CIjZsN7kHjzSXtmyQjArZQoJmr6yXjaYfIPpYiVrCT4HaPfh1j2yx 1mTrVvzXlMCQhkTCOXnrqYezeg2n0qnNstVACti7MGtSCc7Mn0xeApxKkkFdybx8Qwa4 dW6zby8GfzrKIHI+e4hSp5xu7wY/rLusRKL8ViD/EbLaC7KG2YECDq4QlOssGCHrxNP9 wgUVLhXlSm+YIFfm/2xqy0d1HCwgoyoWZ78lO5MGEaw+LGDqY6eZwfL3ZC4K6ZeIge1Z H4vw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=C3aZnevHleE8p3XJKln/4r7iVsuYY+IWqHwKw7jbYZw=; b=nO1fZXq99VYkCELvyV/P84qFg2iRELMfP8M2f2SBRHkz87Pljb/UP0i8XsYj7Zec9l /ij8AeoRaCAlW+0MQbcuiGZuWyHpgsiivVoXi9jhIKap1sUoM4GlRYyas37YDCygUCo0 DRXaR5Ri92eextQLtNmzE1wN/l366TX6ZZ47gUMYnKzuWWKA0oKs3I5FwdBpkJC8ysMk AVzMqCjJznoVLJOSRG5bJovvCJ/NiaTHA54U3RO1QG+Y9XFSgsZsSKFrQNS7okAcx2vi Z0Pi1OkflK8pmoZ/4eVADiFGuqRv156lVOU3rnz57jm31O2S8t16SLx7MSzdynDnlc4b ZTYA== X-Gm-Message-State: AOAM532mFi8YFBp25raHK40DJ7TIBRxuyqv6RikM+0yR5O9tYWOtnSVk 9DB4gUuWwCzoPHaWIuxIRQK4xlr7Dt0= X-Google-Smtp-Source: ABdhPJwqtEf1xCZRZlu0eDHw+f5rnx5wV3GYiZnH8BaCyZHgsIqWaN0RCtqefaVKt0oXIFZucaajI8x4kac= X-Received: from seanjc798194.pdx.corp.google.com ([2620:15c:f:10:c0b4:8b8:bb34:6a56]) (user=seanjc job=sendgmr) by 2002:a25:74d2:: with SMTP id p201mr16376013ybc.406.1617325041100; Thu, 01 Apr 2021 17:57:21 -0700 (PDT) Reply-To: Sean Christopherson Date: Thu, 1 Apr 2021 17:56:57 -0700 In-Reply-To: <20210402005658.3024832-1-seanjc@google.com> Message-Id: <20210402005658.3024832-10-seanjc@google.com> Mime-Version: 1.0 References: <20210402005658.3024832-1-seanjc@google.com> X-Mailer: git-send-email 2.31.0.208.g409f899ff0-goog Subject: [PATCH v2 09/10] KVM: Don't take mmu_lock for range invalidation unless necessary From: Sean Christopherson To: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Paul Mackerras , Paolo Bonzini Cc: James Morse , Julien Thierry , Suzuki K Poulose , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Avoid taking mmu_lock for unrelated .invalidate_range_{start,end}() notifications. Because mmu_notifier_count must be modified while holding mmu_lock for write, and must always be paired across start->end to stay balanced, lock elision must happen in both or none. To meet that requirement, add a rwsem to prevent memslot updates across range_start() and range_end(). Use a rwsem instead of a rwlock since most notifiers _allow_ blocking, and the lock will be endl across the entire start() ... end() sequence. If anything in the sequence sleeps, including the caller or a different notifier, holding the spinlock would be disastrous. For notifiers that _disallow_ blocking, e.g. OOM reaping, simply go down the slow path of unconditionally acquiring mmu_lock. The sane alternative would be to try to acquire the lock and force the notifier to retry on failure. But since OOM is currently the _only_ scenario where blocking is disallowed attempting to optimize a guest that has been marked for death is pointless. Unconditionally define and use mmu_notifier_slots_lock in the memslots code, purely to avoid more #ifdefs. The overhead of acquiring the lock is negligible when the lock is uncontested, which will always be the case when the MMU notifiers are not used. Note, technically flag-only memslot updates could be allowed in parallel, but stalling a memslot update for a relatively short amount of time is not a scalability issue, and this is all more than complex enough. Based heavily on code from Ben Gardon. Suggested-by: Ben Gardon Signed-off-by: Sean Christopherson --- include/linux/kvm_host.h | 6 ++- virt/kvm/kvm_main.c | 96 +++++++++++++++++++++++++++++++--------- 2 files changed, 80 insertions(+), 22 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 40ac2d40bb5a..bc3dd2838bb8 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -472,6 +472,7 @@ struct kvm { #endif /* KVM_HAVE_MMU_RWLOCK */ struct mutex slots_lock; + struct rw_semaphore mmu_notifier_slots_lock; struct mm_struct *mm; /* userspace tied to this vm */ struct kvm_memslots __rcu *memslots[KVM_ADDRESS_SPACE_NUM]; struct kvm_vcpu *vcpus[KVM_MAX_VCPUS]; @@ -660,8 +661,9 @@ static inline struct kvm_memslots *__kvm_memslots(struct kvm *kvm, int as_id) { as_id = array_index_nospec(as_id, KVM_ADDRESS_SPACE_NUM); return srcu_dereference_check(kvm->memslots[as_id], &kvm->srcu, - lockdep_is_held(&kvm->slots_lock) || - !refcount_read(&kvm->users_count)); + lockdep_is_held(&kvm->slots_lock) || + lockdep_is_held(&kvm->mmu_notifier_slots_lock) || + !refcount_read(&kvm->users_count)); } static inline struct kvm_memslots *kvm_memslots(struct kvm *kvm) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index f6697ad741ed..af28f39817a5 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -462,6 +462,7 @@ struct kvm_hva_range { pte_t pte; hva_handler_t handler; on_lock_fn_t on_lock; + bool must_lock; bool flush_on_ret; bool may_block; }; @@ -479,6 +480,25 @@ static void kvm_null_fn(void) } #define IS_KVM_NULL_FN(fn) ((fn) == (void *)kvm_null_fn) + +/* Acquire mmu_lock if necessary. Returns %true if @handler is "null" */ +static __always_inline bool kvm_mmu_lock_and_check_handler(struct kvm *kvm, + const struct kvm_hva_range *range, + bool *locked) +{ + if (*locked) + return false; + + *locked = true; + + KVM_MMU_LOCK(kvm); + + if (!IS_KVM_NULL_FN(range->on_lock)) + range->on_lock(kvm, range->start, range->end); + + return IS_KVM_NULL_FN(range->handler); +} + static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, const struct kvm_hva_range *range) { @@ -495,16 +515,9 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, idx = srcu_read_lock(&kvm->srcu); - /* The on_lock() path does not yet support lock elision. */ - if (!IS_KVM_NULL_FN(range->on_lock)) { - locked = true; - KVM_MMU_LOCK(kvm); - - range->on_lock(kvm, range->start, range->end); - - if (IS_KVM_NULL_FN(range->handler)) - goto out_unlock; - } + if (range->must_lock && + kvm_mmu_lock_and_check_handler(kvm, range, &locked)) + goto out_unlock; for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { slots = __kvm_memslots(kvm, i); @@ -534,10 +547,9 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, gfn_range.end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, slot); gfn_range.slot = slot; - if (!locked) { - locked = true; - KVM_MMU_LOCK(kvm); - } + if (kvm_mmu_lock_and_check_handler(kvm, range, &locked)) + goto out_unlock; + ret |= range->handler(kvm, &gfn_range); } } @@ -568,6 +580,7 @@ static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn, .pte = pte, .handler = handler, .on_lock = (void *)kvm_null_fn, + .must_lock = false, .flush_on_ret = true, .may_block = false, }; @@ -587,6 +600,7 @@ static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn .pte = __pte(0), .handler = handler, .on_lock = (void *)kvm_null_fn, + .must_lock = false, .flush_on_ret = false, .may_block = false, }; @@ -603,11 +617,15 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, trace_kvm_set_spte_hva(address); /* - * .change_pte() must be bookended by .invalidate_range_{start,end}(), - * and so always runs with an elevated notifier count. This obviates - * the need to bump the sequence count. + * .change_pte() must be bookended by .invalidate_range_{start,end}(). + * If mmu_notifier_count is zero, then start() didn't find a relevant + * memslot and wasn't forced down the slow path; rechecking here is + * unnecessary. This can only occur if memslot updates are blocked. */ - WARN_ON_ONCE(!kvm->mmu_notifier_count); + if (!kvm->mmu_notifier_count) { + lockdep_assert_held(&kvm->mmu_notifier_slots_lock); + return; + } kvm_handle_hva_range(mn, address, address + 1, pte, kvm_set_spte_gfn); } @@ -644,6 +662,7 @@ static void kvm_inc_notifier_count(struct kvm *kvm, unsigned long start, static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, const struct mmu_notifier_range *range) { + bool blockable = mmu_notifier_range_blockable(range); struct kvm *kvm = mmu_notifier_to_kvm(mn); const struct kvm_hva_range hva_range = { .start = range->start, @@ -651,12 +670,29 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn, .pte = __pte(0), .handler = kvm_unmap_gfn_range, .on_lock = kvm_inc_notifier_count, + .must_lock = !blockable, .flush_on_ret = true, - .may_block = mmu_notifier_range_blockable(range), + .may_block = blockable, }; trace_kvm_unmap_hva_range(range->start, range->end); + /* + * Prevent memslot modification between range_start() and range_end() + * so that conditionally locking provides the same result in both + * functions. Without that guarantee, the mmu_notifier_count + * adjustments will be imbalanced. + * + * Skip the memslot-lookup lock elision (set @must_lock above) to avoid + * having to take the semaphore on non-blockable calls, e.g. OOM kill. + * The complexity required to handle conditional locking for this case + * is not worth the marginal benefits, the VM is likely doomed anyways. + * + * Pairs with the unlock in range_end(). + */ + if (blockable) + down_read(&kvm->mmu_notifier_slots_lock); + __kvm_handle_hva_range(kvm, &hva_range); return 0; @@ -683,6 +719,7 @@ static void kvm_dec_notifier_count(struct kvm *kvm, unsigned long start, static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn, const struct mmu_notifier_range *range) { + bool blockable = mmu_notifier_range_blockable(range); struct kvm *kvm = mmu_notifier_to_kvm(mn); const struct kvm_hva_range hva_range = { .start = range->start, @@ -690,12 +727,17 @@ static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn, .pte = __pte(0), .handler = (void *)kvm_null_fn, .on_lock = kvm_dec_notifier_count, + .must_lock = !blockable, .flush_on_ret = true, - .may_block = mmu_notifier_range_blockable(range), + .may_block = blockable, }; __kvm_handle_hva_range(kvm, &hva_range); + /* Pairs with the lock in range_start(). */ + if (blockable) + up_read(&kvm->mmu_notifier_slots_lock); + BUG_ON(kvm->mmu_notifier_count < 0); } @@ -908,6 +950,7 @@ static struct kvm *kvm_create_vm(unsigned long type) mutex_init(&kvm->lock); mutex_init(&kvm->irq_lock); mutex_init(&kvm->slots_lock); + init_rwsem(&kvm->mmu_notifier_slots_lock); INIT_LIST_HEAD(&kvm->devices); BUILD_BUG_ON(KVM_MEM_SLOTS_NUM > SHRT_MAX); @@ -1028,6 +1071,16 @@ static void kvm_destroy_vm(struct kvm *kvm) kvm_coalesced_mmio_free(kvm); #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER) mmu_notifier_unregister(&kvm->mmu_notifier, kvm->mm); + /* + * Reset the lock used to prevent memslot updates between MMU notifier + * invalidate_range_start() and invalidate_range_end(). At this point, + * no more MMU notifiers will run and pending calls to ...start() have + * completed. But, the lock could still be held if KVM's notifier was + * removed between ...start() and ...end(). No threads can be waiting + * on the lock as the last reference on KVM has been dropped. If the + * lock is still held, freeing memslots will deadlock. + */ + init_rwsem(&kvm->mmu_notifier_slots_lock); #else kvm_arch_flush_shadow_all(kvm); #endif @@ -1279,7 +1332,10 @@ static struct kvm_memslots *install_new_memslots(struct kvm *kvm, WARN_ON(gen & KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS); slots->generation = gen | KVM_MEMSLOT_GEN_UPDATE_IN_PROGRESS; + down_write(&kvm->mmu_notifier_slots_lock); rcu_assign_pointer(kvm->memslots[as_id], slots); + up_write(&kvm->mmu_notifier_slots_lock); + synchronize_srcu_expedited(&kvm->srcu); /* From patchwork Fri Apr 2 00:56:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 1461539 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20161025 header.b=PRhdN0c+; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4FBM9M1yGQz9t1J for ; Fri, 2 Apr 2021 11:57:59 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234323AbhDBA55 (ORCPT ); Thu, 1 Apr 2021 20:57:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40430 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234523AbhDBA5c (ORCPT ); Thu, 1 Apr 2021 20:57:32 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 15FAFC0613AA for ; Thu, 1 Apr 2021 17:57:24 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id i6so7669543ybk.2 for ; Thu, 01 Apr 2021 17:57:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=reply-to:date:in-reply-to:message-id:mime-version:references :subject:from:to:cc; bh=FTDHlh7r2o77AAwrA0E9qeoNYgpcbLuRaDEVNkBVwHg=; b=PRhdN0c+oVeC84FkNtK2OfzN4+x6lsMyI2b4OVZY6LEzjuMMZ4+EfdsvsDxWmIs9Nq zAmlFPOUqakGcwtEzlGlIW87jbHzydFRrCawVR4WJj9ElDeMOXggR/xUf8WnMbUKoAh3 lEpJ//3eALC8qqX6jN5ZPFveFaoAcp0W8wTiEBs2aObLKZCHeuoH095sVwHNZJ/PswMS FVyxeb81nksnb4YTup/Ov7PJ2+WqfzrIlxJ8okqoIMrlmmVrp7U8IQ90sZ+PuN+Ped4m oTDFOV8oL/hOxYcLRJggJQ+r9Hb1KLnULxlCGV+FPKgt3oJpOweN7336d7VL5zPVquco 5veQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:reply-to:date:in-reply-to:message-id :mime-version:references:subject:from:to:cc; bh=FTDHlh7r2o77AAwrA0E9qeoNYgpcbLuRaDEVNkBVwHg=; b=gttGBF1fAZXoJfdMITxI1tdyuGzjqzIBce+tZ8k2AwsuTMIzgePnQu6TuV0PxoMmYY Xbn9skymF9CcgrgzDwbgdOQKWObeVT6X4Mmvc9ZbOsnoyw1eoQYNckTftfLGd9l+UzGu eIHttztJfrZuAPiTNSxd/LQ2YN8KsDpUS3DZmW4U8/YFM6shQqgr7Ct2TJbp/XwYCO4s CXJsabQA3N6eVqYrGfR0N+Cr9fe14NmsGkX6Q7wchUTMBAAVt9oEd+zk4yHrch8JDq8K fkReF6LEdcHxj7rFh2b+gLEuYspweRG+pwnT9Z2yOuTx8nMMUm/R1cCnxSfREkBNVOhP 8Zbw== X-Gm-Message-State: AOAM530Kz5V1GQ8DubVEm8usoRiXGQ0yAk4GUyMgaxeUhSYw7/qADaJR t68NiGzxHbSKYWTqAW9hHWRrkHQKidc= X-Google-Smtp-Source: ABdhPJwqaRwScDcukbswHxcp5MoAwCeA4G6inLsOkqCP5OYhV3sDapC/0YKNmrNh7+I1S9Rwld4HhjfDqxk= X-Received: from seanjc798194.pdx.corp.google.com ([2620:15c:f:10:c0b4:8b8:bb34:6a56]) (user=seanjc job=sendgmr) by 2002:a25:cc81:: with SMTP id l123mr8448930ybf.272.1617325043283; Thu, 01 Apr 2021 17:57:23 -0700 (PDT) Reply-To: Sean Christopherson Date: Thu, 1 Apr 2021 17:56:58 -0700 In-Reply-To: <20210402005658.3024832-1-seanjc@google.com> Message-Id: <20210402005658.3024832-11-seanjc@google.com> Mime-Version: 1.0 References: <20210402005658.3024832-1-seanjc@google.com> X-Mailer: git-send-email 2.31.0.208.g409f899ff0-goog Subject: [PATCH v2 10/10] KVM: x86/mmu: Allow yielding during MMU notifier unmap/zap, if possible From: Sean Christopherson To: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Paul Mackerras , Paolo Bonzini Cc: James Morse , Julien Thierry , Suzuki K Poulose , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-mips@vger.kernel.org, kvm@vger.kernel.org, kvm-ppc@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gardon Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org Let the TDP MMU yield when unmapping a range in response to a MMU notification, if yielding is allowed by said notification. There is no reason to disallow yielding in this case, and in theory the range being invalidated could be quite large. Cc: Ben Gardon Signed-off-by: Sean Christopherson --- arch/x86/kvm/mmu/tdp_mmu.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 7797d24f0937..dd17d9673ff2 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -885,7 +885,7 @@ bool kvm_tdp_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range, for_each_tdp_mmu_root(kvm, root, range->slot->as_id) flush |= zap_gfn_range(kvm, root, range->start, range->end, - false, flush); + range->may_block, flush); return flush; } @@ -903,6 +903,10 @@ static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm, rcu_read_lock(); + /* + * Don't support rescheduling, none of the MMU notifiers that funnel + * into this helper allow blocking; it'd be dead, wasteful code. + */ for_each_tdp_mmu_root(kvm, root, range->slot->as_id) { tdp_root_for_each_leaf_pte(iter, root, range->start, range->end) ret |= handler(kvm, &iter, range);