From patchwork Fri May 26 23:44:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Zhao X-Patchwork-Id: 1786690 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=hvb8rxkE; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QShQ91HNQz20Q8 for ; Sat, 27 May 2023 09:46:09 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4QShQ90XBCz3fKm for ; Sat, 27 May 2023 09:46:09 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=hvb8rxkE; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=flex--yuzhao.bounces.google.com (client-ip=2607:f8b0:4864:20::b49; helo=mail-yb1-xb49.google.com; envelope-from=3aurxzaykdoaawbjcqiqqing.eqonkpwzrre-fgxnkuvu.qbncdu.qti@flex--yuzhao.bounces.google.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=hvb8rxkE; dkim-atps=neutral Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4QShNd0R77z3c8r for ; Sat, 27 May 2023 09:44:47 +1000 (AEST) Received: by mail-yb1-xb49.google.com with SMTP id 3f1490d57ef6-bac6a453dd5so1730066276.2 for ; Fri, 26 May 2023 16:44:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1685144681; x=1687736681; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=kGsSI0lPP8ERxd6p+DSa9eEPYGyuTkZRYkJGde6ThZo=; b=hvb8rxkERCbIh1bZc0q4NRTJ+rJW/WU+U0AR8hQmLag3Ue+qSWO6cakjY3a2+aFQIE x+dhqQS7EQpNz5Tcg/879WfdtsT2g9m9IEV/ar3WsH6hpivs+AgK+n3cV9xivqmDDPOm KKmHGpOcNrwi3l5Yw+fAvneY5R+j+Mh2hjIBHhqkVzBeZTwwqArwApBbzG0OmYJUYGc6 DRb2e1iDRkrr5VQNcGKChaSGvbQt6NfDiW4utZo1zEzUWIhsS0vIBmPt/jCICfYU7AGj 2EXBMubac0epYeTWkMmnp5dg4awgt7i7k/gPemJKnVVjNCrCZqeAJZJODgbRp4LsgzpY rjvA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685144681; x=1687736681; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=kGsSI0lPP8ERxd6p+DSa9eEPYGyuTkZRYkJGde6ThZo=; b=AmOj+CJMtjWGpuod0PMxZ16bhFDTr9t8TePQ0we8Hrgl5gwchJwh78MUO+6mXmJNKJ lIjz6vsdOOeke/AicumXfFpmHhX2ECLJEKIoU/dhv+81kCE1OhJGUeDBA8M0IjmrC9sH LXGO38W3c9iHQf6g/WcwNLitXqKXn8giufekIWnRQ2JyafSAt17eya3oNXT27/pArfbq l3eUJ6bYB/gNF+MU1a5MHcHyz6qqSnCzaQ6AvH2DljbyRS7u1SNbwCuuXqR8T4If+pUk EPSb8EoTKVoN9OQ4FRFHx6L7kzuBju60rCSdOT5UNvYuBW9+mXCMCIizMwHd9NvDsewN o2Nw== X-Gm-Message-State: AC+VfDxHvl8uIM1KD+QARBkILc/cj0d/rdOj4GcZ7Kwx6jWUOgKhxSB2 rLECbgkgvIRnPVTtItRxywXWwArgeNk= X-Google-Smtp-Source: ACHHUZ6fi8C05HO0+KMDQENWGjZ4c7vAIwn8WK8PM0ILwAQbwBWVWIEplGkrZ1gydvcndVbPZA/WyjMMkr8= X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:910f:8a15:592b:2087]) (user=yuzhao job=sendgmr) by 2002:a05:6902:50d:b0:ba8:3e2d:58f8 with SMTP id x13-20020a056902050d00b00ba83e2d58f8mr1858255ybs.5.1685144681575; Fri, 26 May 2023 16:44:41 -0700 (PDT) Date: Fri, 26 May 2023 17:44:26 -0600 In-Reply-To: <20230526234435.662652-1-yuzhao@google.com> Message-Id: <20230526234435.662652-2-yuzhao@google.com> Mime-Version: 1.0 References: <20230526234435.662652-1-yuzhao@google.com> X-Mailer: git-send-email 2.41.0.rc0.172.g3f132b7071-goog Subject: [PATCH mm-unstable v2 01/10] mm/kvm: add mmu_notifier_ops->test_clear_young() From: Yu Zhao To: Andrew Morton , Paolo Bonzini X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Jason A. Donenfeld" , x86@kernel.org, Gavin Shan , kvm@vger.kernel.org, linux-doc@vger.kernel.org, Catalin Marinas , Dave Hansen , Peter Xu , linux-mm@kvack.org, Ben Gardon , Chao Peng , Will Deacon , Gaosheng Cui , Marc Zyngier , "H. Peter Anvin" , Yu Zhao , Jonathan Corbet , Alistair Popple , Jason Gunthorpe , Ingo Molnar , Zenghui Yu , linux-trace-kernel@vger.kernel.org, linux-mm@google.com, Thomas Huth , Suzuki K Poulose , Nicholas Piggin , Borislav Petkov , Steven Rostedt , kvmarm@lists.linux.dev, Thomas Gleixner , linux-arm-ke rnel@lists.infradead.org, Fabiano Rosas , Michael Larabel , Sean Christopherson , linux-kernel@vger.kernel.org, Oliver Upton , James Morse , Masami Hiramatsu , Anup Patel , linuxppc-dev@lists.ozlabs.org, Mike Rapoport Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Add mmu_notifier_ops->test_clear_young() to supersede test_young() and clear_young(). test_clear_young() has a fast path, which if supported, allows its callers to safely clear the accessed bit without taking kvm->mmu_lock. The fast path requires arch-specific code that generally relies on RCU and CAS: the former protects KVM page tables from being freed while the latter clears the accessed bit atomically against both the hardware and other software page table walkers. If the fast path is unsupported, test_clear_young() falls back to the existing slow path where kvm->mmu_lock is then taken. test_clear_young() can also operate on a range of KVM PTEs individually according to a bitmap, if the caller provides it. Signed-off-by: Yu Zhao --- include/linux/kvm_host.h | 22 +++++++++++ include/linux/mmu_notifier.h | 52 ++++++++++++++++++++++++ mm/mmu_notifier.c | 24 ++++++++++++ virt/kvm/kvm_main.c | 76 +++++++++++++++++++++++++++++++++++- 4 files changed, 173 insertions(+), 1 deletion(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 0e571e973bc2..374262545f96 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -258,6 +258,7 @@ int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu); #ifdef KVM_ARCH_WANT_MMU_NOTIFIER struct kvm_gfn_range { struct kvm_memory_slot *slot; + void *args; gfn_t start; gfn_t end; pte_t pte; @@ -267,6 +268,27 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range); bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range); bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range); bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range); +bool kvm_should_clear_young(struct kvm_gfn_range *range, gfn_t gfn); +bool kvm_arch_test_clear_young(struct kvm *kvm, struct kvm_gfn_range *range); +#endif + +/* + * Architectures that implement kvm_arch_test_clear_young() should override + * kvm_arch_has_test_clear_young(). + * + * kvm_arch_has_test_clear_young() is allowed to return false positive, i.e., it + * can return true if kvm_arch_test_clear_young() is supported but disabled due + * to some runtime constraint. In this case, kvm_arch_test_clear_young() should + * return true; otherwise, it should return false. + * + * For each young KVM PTE, kvm_arch_test_clear_young() should call + * kvm_should_clear_young() to decide whether to clear the accessed bit. + */ +#ifndef kvm_arch_has_test_clear_young +static inline bool kvm_arch_has_test_clear_young(void) +{ + return false; +} #endif enum { diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h index 64a3e051c3c4..dfdbb370682d 100644 --- a/include/linux/mmu_notifier.h +++ b/include/linux/mmu_notifier.h @@ -60,6 +60,8 @@ enum mmu_notifier_event { }; #define MMU_NOTIFIER_RANGE_BLOCKABLE (1 << 0) +#define MMU_NOTIFIER_RANGE_LOCKLESS (1 << 1) +#define MMU_NOTIFIER_RANGE_YOUNG (1 << 2) struct mmu_notifier_ops { /* @@ -122,6 +124,10 @@ struct mmu_notifier_ops { struct mm_struct *mm, unsigned long address); + int (*test_clear_young)(struct mmu_notifier *mn, struct mm_struct *mm, + unsigned long start, unsigned long end, + bool clear, unsigned long *bitmap); + /* * change_pte is called in cases that pte mapping to page is changed: * for example, when ksm remaps pte to point to a new shared page. @@ -392,6 +398,9 @@ extern int __mmu_notifier_clear_young(struct mm_struct *mm, unsigned long end); extern int __mmu_notifier_test_young(struct mm_struct *mm, unsigned long address); +extern int __mmu_notifier_test_clear_young(struct mm_struct *mm, + unsigned long start, unsigned long end, + bool clear, unsigned long *bitmap); extern void __mmu_notifier_change_pte(struct mm_struct *mm, unsigned long address, pte_t pte); extern int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *r); @@ -440,6 +449,35 @@ static inline int mmu_notifier_test_young(struct mm_struct *mm, return 0; } +/* + * mmu_notifier_test_clear_young() returns nonzero if any of the KVM PTEs within + * a given range was young. Specifically, it returns MMU_NOTIFIER_RANGE_LOCKLESS + * if the fast path was successful, MMU_NOTIFIER_RANGE_YOUNG otherwise. + * + * The last parameter to the function is a bitmap and only the fast path + * supports it: if it is NULL, the function falls back to the slow path if the + * fast path was unsuccessful; otherwise, the function bails out. + * + * The bitmap has the following specifications: + * 1. The number of bits should be at least (end-start)/PAGE_SIZE. + * 2. The offset of each bit should be relative to the end, i.e., the offset + * corresponding to addr should be (end-addr)/PAGE_SIZE-1. This is convenient + * for batching while forward looping. + * + * When testing, this function sets the corresponding bit in the bitmap for each + * young KVM PTE. When clearing, this function clears the accessed bit for each + * young KVM PTE whose corresponding bit in the bitmap is set. + */ +static inline int mmu_notifier_test_clear_young(struct mm_struct *mm, + unsigned long start, unsigned long end, + bool clear, unsigned long *bitmap) +{ + if (mm_has_notifiers(mm)) + return __mmu_notifier_test_clear_young(mm, start, end, clear, bitmap); + + return 0; +} + static inline void mmu_notifier_change_pte(struct mm_struct *mm, unsigned long address, pte_t pte) { @@ -684,12 +722,26 @@ static inline int mmu_notifier_clear_flush_young(struct mm_struct *mm, return 0; } +static inline int mmu_notifier_clear_young(struct mm_struct *mm, + unsigned long start, + unsigned long end) +{ + return 0; +} + static inline int mmu_notifier_test_young(struct mm_struct *mm, unsigned long address) { return 0; } +static inline int mmu_notifier_test_clear_young(struct mm_struct *mm, + unsigned long start, unsigned long end, + bool clear, unsigned long *bitmap) +{ + return 0; +} + static inline void mmu_notifier_change_pte(struct mm_struct *mm, unsigned long address, pte_t pte) { diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c index 50c0dde1354f..7e6aba4bddcb 100644 --- a/mm/mmu_notifier.c +++ b/mm/mmu_notifier.c @@ -424,6 +424,30 @@ int __mmu_notifier_test_young(struct mm_struct *mm, return young; } +int __mmu_notifier_test_clear_young(struct mm_struct *mm, + unsigned long start, unsigned long end, + bool clear, unsigned long *bitmap) +{ + int idx; + struct mmu_notifier *mn; + int young = 0; + + idx = srcu_read_lock(&srcu); + + hlist_for_each_entry_srcu(mn, &mm->notifier_subscriptions->list, hlist, + srcu_read_lock_held(&srcu)) { + if (mn->ops->test_clear_young) + young |= mn->ops->test_clear_young(mn, mm, start, end, clear, bitmap); + + if (young && !clear) + break; + } + + srcu_read_unlock(&srcu, idx); + + return young; +} + void __mmu_notifier_change_pte(struct mm_struct *mm, unsigned long address, pte_t pte) { diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 51e4882d0873..31ee58754b19 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -541,6 +541,7 @@ typedef void (*on_lock_fn_t)(struct kvm *kvm, unsigned long start, typedef void (*on_unlock_fn_t)(struct kvm *kvm); struct kvm_hva_range { + void *args; unsigned long start; unsigned long end; pte_t pte; @@ -549,6 +550,7 @@ struct kvm_hva_range { on_unlock_fn_t on_unlock; bool flush_on_ret; bool may_block; + bool lockless; }; /* @@ -602,6 +604,8 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, hva_end = min(range->end, slot->userspace_addr + (slot->npages << PAGE_SHIFT)); + gfn_range.args = range->args; + /* * To optimize for the likely case where the address * range is covered by zero or one memslots, don't @@ -619,7 +623,7 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, gfn_range.end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, slot); gfn_range.slot = slot; - if (!locked) { + if (!range->lockless && !locked) { locked = true; KVM_MMU_LOCK(kvm); if (!IS_KVM_NULL_FN(range->on_lock)) @@ -628,6 +632,9 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, break; } ret |= range->handler(kvm, &gfn_range); + + if (range->lockless && ret) + break; } } @@ -880,6 +887,72 @@ static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn, kvm_test_age_gfn); } +struct test_clear_young_args { + unsigned long *bitmap; + unsigned long end; + bool clear; + bool young; +}; + +bool kvm_should_clear_young(struct kvm_gfn_range *range, gfn_t gfn) +{ + struct test_clear_young_args *args = range->args; + + VM_WARN_ON_ONCE(gfn < range->start || gfn >= range->end); + + args->young = true; + + if (args->bitmap) { + int offset = hva_to_gfn_memslot(args->end - 1, range->slot) - gfn; + + if (args->clear) + return test_bit(offset, args->bitmap); + + __set_bit(offset, args->bitmap); + } + + return args->clear; +} + +static int kvm_mmu_notifier_test_clear_young(struct mmu_notifier *mn, struct mm_struct *mm, + unsigned long start, unsigned long end, + bool clear, unsigned long *bitmap) +{ + struct kvm *kvm = mmu_notifier_to_kvm(mn); + struct kvm_hva_range range = { + .start = start, + .end = end, + .on_lock = (void *)kvm_null_fn, + .on_unlock = (void *)kvm_null_fn, + }; + + trace_kvm_age_hva(start, end); + + if (kvm_arch_has_test_clear_young()) { + struct test_clear_young_args args = { + .bitmap = bitmap, + .end = end, + .clear = clear, + }; + + range.args = &args; + range.lockless = true; + range.handler = kvm_arch_test_clear_young; + + if (!__kvm_handle_hva_range(kvm, &range)) + return args.young ? MMU_NOTIFIER_RANGE_LOCKLESS : 0; + } + + if (bitmap) + return 0; + + range.args = NULL; + range.lockless = false; + range.handler = clear ? kvm_age_gfn : kvm_test_age_gfn; + + return __kvm_handle_hva_range(kvm, &range) ? MMU_NOTIFIER_RANGE_YOUNG : 0; +} + static void kvm_mmu_notifier_release(struct mmu_notifier *mn, struct mm_struct *mm) { @@ -898,6 +971,7 @@ static const struct mmu_notifier_ops kvm_mmu_notifier_ops = { .clear_flush_young = kvm_mmu_notifier_clear_flush_young, .clear_young = kvm_mmu_notifier_clear_young, .test_young = kvm_mmu_notifier_test_young, + .test_clear_young = kvm_mmu_notifier_test_clear_young, .change_pte = kvm_mmu_notifier_change_pte, .release = kvm_mmu_notifier_release, }; From patchwork Fri May 26 23:44:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Zhao X-Patchwork-Id: 1786691 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=HaeURiP8; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QShR73DFbz20Q8 for ; Sat, 27 May 2023 09:46:59 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4QShR62B3vz3fLk for ; Sat, 27 May 2023 09:46:58 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=HaeURiP8; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=flex--yuzhao.bounces.google.com (client-ip=2607:f8b0:4864:20::114a; helo=mail-yw1-x114a.google.com; envelope-from=3a0rxzaykdoicydlesksskpi.gsqpmrybttg-hizpmwxw.sdpefw.svk@flex--yuzhao.bounces.google.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=HaeURiP8; dkim-atps=neutral Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4QShNd0XlHz3f7x for ; Sat, 27 May 2023 09:44:47 +1000 (AEST) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-5618857518dso17693087b3.2 for ; Fri, 26 May 2023 16:44:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1685144683; x=1687736683; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=74b92Wk/fESh4Us3jjAt9ER/dw3pjjcYNf6aM3EASmw=; b=HaeURiP8Vv+S5DfSxjNudO8jlQJ3LkpTBadHgzP5bqFZBs2b3foI2By3BUQ7HvtvPw W7QXj6uSkaA/QUNh4WGZeI/+nH65GkUhIUrM4lyZvIxfASIS2yU+XXbU6dLJN4ghefvO HpvDwxabNs2VhwI1FmYErw8clASpd/3GLJz6qPYetihMIY0W6vexsuS/yVGVZpRWw3il an2SbsIcj+SL6k3rb4VmBY4XZ/nf3i0WPF6L2efKh0GQc6ZIGr6GiKv97NKXrU9K5c92 qSgR2Y5nDu/cn+wzryf1LV/5CRQgKNqjczW+qgHgoPm+B6opAwXo+9yxImX5iYvtFZ8h Yz0g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685144683; x=1687736683; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=74b92Wk/fESh4Us3jjAt9ER/dw3pjjcYNf6aM3EASmw=; b=XrG8YAFhB8/Ji7ICDs2Bb8ygIfq7I7/IG9hIMmAReerQHs0ejBdF7/NZYd+XpPVPRt PxzxlJLaSGn8uZVSUBvGMb0m9v3nIFj2AB0mZb2g3tkDOAgjQHNUP+hqhvwczUtu1Dtn y13DaBLF08cC7PFfHwOdTsjtZghpHDV5i8MujrrE2bv+VtUjGHCorT/lAZh8PcbtjIWR oBtPf5p1dAV7pkuiLEWVoBxGueg0S8xhqKkizXMXnwzj4e+kQWZf0sCia7/gAQ0rFESy v6gPBwMK0l2ik7APUj3HCQQuX01PVHhUPZEBdOrHefUzbvcJ7b/nrMOwxp3ID7qY3Nsm jURg== X-Gm-Message-State: AC+VfDxadmfPsm26Km0sEEBkxHKY87ragqJOgXh6FGx9T8N0KfJbeG+8 6G9wBjzFFjbQFARwleamG1+5pfaX9p4= X-Google-Smtp-Source: ACHHUZ77t8VJ8lKvCLJIppCq4GXisPABEajtkBmVe9W7YGxSZPGguPwTv3qLn5OBBx4lDI9spcGEUr4VdcA= X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:910f:8a15:592b:2087]) (user=yuzhao job=sendgmr) by 2002:a81:e608:0:b0:561:c4ef:1def with SMTP id u8-20020a81e608000000b00561c4ef1defmr2027484ywl.0.1685144683190; Fri, 26 May 2023 16:44:43 -0700 (PDT) Date: Fri, 26 May 2023 17:44:27 -0600 In-Reply-To: <20230526234435.662652-1-yuzhao@google.com> Message-Id: <20230526234435.662652-3-yuzhao@google.com> Mime-Version: 1.0 References: <20230526234435.662652-1-yuzhao@google.com> X-Mailer: git-send-email 2.41.0.rc0.172.g3f132b7071-goog Subject: [PATCH mm-unstable v2 02/10] mm/kvm: use mmu_notifier_ops->test_clear_young() From: Yu Zhao To: Andrew Morton , Paolo Bonzini X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Jason A. Donenfeld" , x86@kernel.org, Gavin Shan , kvm@vger.kernel.org, linux-doc@vger.kernel.org, Catalin Marinas , Dave Hansen , Peter Xu , linux-mm@kvack.org, Ben Gardon , Chao Peng , Will Deacon , Gaosheng Cui , Marc Zyngier , "H. Peter Anvin" , Yu Zhao , Jonathan Corbet , Alistair Popple , Jason Gunthorpe , Ingo Molnar , Zenghui Yu , linux-trace-kernel@vger.kernel.org, linux-mm@google.com, Thomas Huth , Suzuki K Poulose , Nicholas Piggin , Borislav Petkov , Steven Rostedt , kvmarm@lists.linux.dev, Thomas Gleixner , linux-arm-ke rnel@lists.infradead.org, Fabiano Rosas , Michael Larabel , Sean Christopherson , linux-kernel@vger.kernel.org, Oliver Upton , James Morse , Masami Hiramatsu , Anup Patel , linuxppc-dev@lists.ozlabs.org, Mike Rapoport Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Replace test_young() and clear_young() with test_clear_young(). Signed-off-by: Yu Zhao Reviewed-by: Jason Gunthorpe --- include/linux/mmu_notifier.h | 29 ++----------------- include/trace/events/kvm.h | 15 ---------- mm/mmu_notifier.c | 42 ---------------------------- virt/kvm/kvm_main.c | 54 ------------------------------------ 4 files changed, 2 insertions(+), 138 deletions(-) diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h index dfdbb370682d..c8f35fc08703 100644 --- a/include/linux/mmu_notifier.h +++ b/include/linux/mmu_notifier.h @@ -104,26 +104,6 @@ struct mmu_notifier_ops { unsigned long start, unsigned long end); - /* - * clear_young is a lightweight version of clear_flush_young. Like the - * latter, it is supposed to test-and-clear the young/accessed bitflag - * in the secondary pte, but it may omit flushing the secondary tlb. - */ - int (*clear_young)(struct mmu_notifier *subscription, - struct mm_struct *mm, - unsigned long start, - unsigned long end); - - /* - * test_young is called to check the young/accessed bitflag in - * the secondary pte. This is used to know if the page is - * frequently used without actually clearing the flag or tearing - * down the secondary mapping on the page. - */ - int (*test_young)(struct mmu_notifier *subscription, - struct mm_struct *mm, - unsigned long address); - int (*test_clear_young)(struct mmu_notifier *mn, struct mm_struct *mm, unsigned long start, unsigned long end, bool clear, unsigned long *bitmap); @@ -393,11 +373,6 @@ extern void __mmu_notifier_release(struct mm_struct *mm); extern int __mmu_notifier_clear_flush_young(struct mm_struct *mm, unsigned long start, unsigned long end); -extern int __mmu_notifier_clear_young(struct mm_struct *mm, - unsigned long start, - unsigned long end); -extern int __mmu_notifier_test_young(struct mm_struct *mm, - unsigned long address); extern int __mmu_notifier_test_clear_young(struct mm_struct *mm, unsigned long start, unsigned long end, bool clear, unsigned long *bitmap); @@ -437,7 +412,7 @@ static inline int mmu_notifier_clear_young(struct mm_struct *mm, unsigned long end) { if (mm_has_notifiers(mm)) - return __mmu_notifier_clear_young(mm, start, end); + return __mmu_notifier_test_clear_young(mm, start, end, true, NULL); return 0; } @@ -445,7 +420,7 @@ static inline int mmu_notifier_test_young(struct mm_struct *mm, unsigned long address) { if (mm_has_notifiers(mm)) - return __mmu_notifier_test_young(mm, address); + return __mmu_notifier_test_clear_young(mm, address, address + 1, false, NULL); return 0; } diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h index 3bd31ea23fee..46c347e56e60 100644 --- a/include/trace/events/kvm.h +++ b/include/trace/events/kvm.h @@ -489,21 +489,6 @@ TRACE_EVENT(kvm_age_hva, __entry->start, __entry->end) ); -TRACE_EVENT(kvm_test_age_hva, - TP_PROTO(unsigned long hva), - TP_ARGS(hva), - - TP_STRUCT__entry( - __field( unsigned long, hva ) - ), - - TP_fast_assign( - __entry->hva = hva; - ), - - TP_printk("mmu notifier test age hva: %#016lx", __entry->hva) -); - #endif /* _TRACE_KVM_MAIN_H */ /* This part must be outside protection */ diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c index 7e6aba4bddcb..c7e9747c9920 100644 --- a/mm/mmu_notifier.c +++ b/mm/mmu_notifier.c @@ -382,48 +382,6 @@ int __mmu_notifier_clear_flush_young(struct mm_struct *mm, return young; } -int __mmu_notifier_clear_young(struct mm_struct *mm, - unsigned long start, - unsigned long end) -{ - struct mmu_notifier *subscription; - int young = 0, id; - - id = srcu_read_lock(&srcu); - hlist_for_each_entry_rcu(subscription, - &mm->notifier_subscriptions->list, hlist, - srcu_read_lock_held(&srcu)) { - if (subscription->ops->clear_young) - young |= subscription->ops->clear_young(subscription, - mm, start, end); - } - srcu_read_unlock(&srcu, id); - - return young; -} - -int __mmu_notifier_test_young(struct mm_struct *mm, - unsigned long address) -{ - struct mmu_notifier *subscription; - int young = 0, id; - - id = srcu_read_lock(&srcu); - hlist_for_each_entry_rcu(subscription, - &mm->notifier_subscriptions->list, hlist, - srcu_read_lock_held(&srcu)) { - if (subscription->ops->test_young) { - young = subscription->ops->test_young(subscription, mm, - address); - if (young) - break; - } - } - srcu_read_unlock(&srcu, id); - - return young; -} - int __mmu_notifier_test_clear_young(struct mm_struct *mm, unsigned long start, unsigned long end, bool clear, unsigned long *bitmap) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 31ee58754b19..977baaf1b248 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -674,25 +674,6 @@ static __always_inline int kvm_handle_hva_range(struct mmu_notifier *mn, return __kvm_handle_hva_range(kvm, &range); } -static __always_inline int kvm_handle_hva_range_no_flush(struct mmu_notifier *mn, - unsigned long start, - unsigned long end, - hva_handler_t handler) -{ - struct kvm *kvm = mmu_notifier_to_kvm(mn); - const struct kvm_hva_range range = { - .start = start, - .end = end, - .pte = __pte(0), - .handler = handler, - .on_lock = (void *)kvm_null_fn, - .on_unlock = (void *)kvm_null_fn, - .flush_on_ret = false, - .may_block = false, - }; - - return __kvm_handle_hva_range(kvm, &range); -} static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn, struct mm_struct *mm, unsigned long address, @@ -854,39 +835,6 @@ static int kvm_mmu_notifier_clear_flush_young(struct mmu_notifier *mn, return kvm_handle_hva_range(mn, start, end, __pte(0), kvm_age_gfn); } -static int kvm_mmu_notifier_clear_young(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long start, - unsigned long end) -{ - trace_kvm_age_hva(start, end); - - /* - * Even though we do not flush TLB, this will still adversely - * affect performance on pre-Haswell Intel EPT, where there is - * no EPT Access Bit to clear so that we have to tear down EPT - * tables instead. If we find this unacceptable, we can always - * add a parameter to kvm_age_hva so that it effectively doesn't - * do anything on clear_young. - * - * Also note that currently we never issue secondary TLB flushes - * from clear_young, leaving this job up to the regular system - * cadence. If we find this inaccurate, we might come up with a - * more sophisticated heuristic later. - */ - return kvm_handle_hva_range_no_flush(mn, start, end, kvm_age_gfn); -} - -static int kvm_mmu_notifier_test_young(struct mmu_notifier *mn, - struct mm_struct *mm, - unsigned long address) -{ - trace_kvm_test_age_hva(address); - - return kvm_handle_hva_range_no_flush(mn, address, address + 1, - kvm_test_age_gfn); -} - struct test_clear_young_args { unsigned long *bitmap; unsigned long end; @@ -969,8 +917,6 @@ static const struct mmu_notifier_ops kvm_mmu_notifier_ops = { .invalidate_range_start = kvm_mmu_notifier_invalidate_range_start, .invalidate_range_end = kvm_mmu_notifier_invalidate_range_end, .clear_flush_young = kvm_mmu_notifier_clear_flush_young, - .clear_young = kvm_mmu_notifier_clear_young, - .test_young = kvm_mmu_notifier_test_young, .test_clear_young = kvm_mmu_notifier_test_clear_young, .change_pte = kvm_mmu_notifier_change_pte, .release = kvm_mmu_notifier_release, From patchwork Fri May 26 23:44:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Zhao X-Patchwork-Id: 1786689 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=lFfC+K1p; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QShP66Mb2z20Q8 for ; Sat, 27 May 2023 09:45:13 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4QShP30nGmz3f87 for ; Sat, 27 May 2023 09:45:11 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=lFfC+K1p; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=flex--yuzhao.bounces.google.com (client-ip=2607:f8b0:4864:20::1149; helo=mail-yw1-x1149.google.com; envelope-from=3berxzaykdomdzemftlttlqj.htrqnszcuuh-ijaqnxyx.teqfgx.twl@flex--yuzhao.bounces.google.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=lFfC+K1p; dkim-atps=neutral Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4QShNd0KyQz3bnr for ; Sat, 27 May 2023 09:44:47 +1000 (AEST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-5651d8acfe2so30773967b3.2 for ; Fri, 26 May 2023 16:44:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1685144684; x=1687736684; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=W5mqYLgSgL41U5MVABcpsNGg2VTYVhdnWBmYAGgA32M=; b=lFfC+K1prBwz+QWauf5t3Z5nGOG+XH3WsUV6BbAJbu8yEh9dwSmQ7iKcNTjxofo1xQ drv4Lmvn48X7bRRKv2GPWasyFIShdI23SCQ+h/TYUmoXc/hOnlnscK719N8Jpf3m7n6U mEaD6Epy7t6qJzvv/9f9Jp/JqQmnMe81faOyaQAqHTEXDufnGK7wY+Ey5ZFKaBi7dzJd 6O8PEoHW6NsTzElKeYmJHsoCaRTvtaP/AkDf6+cRfXVKyOf9lMpoYxPIHrUwcDj+nl2+ 7q+JpYkRKelWoBaQmJuf7/axxLP6kaz1KnqZQAEmA4XWZSzVgynpinko26WGtZXkTjpp oXNg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685144684; x=1687736684; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=W5mqYLgSgL41U5MVABcpsNGg2VTYVhdnWBmYAGgA32M=; b=fkmfNh9ZdfZBHOBHAfDEfAciVo01PmvztjsoeBF5iJ2O0A306y0JKfZrjUwY2wpCTx 7YkcMDDziPnUIuV5P+Fs5PEVwjsKr4mZ5cu/dzWuT5qBQl+oJsqCqFeNa/w7YGgZ+wNP Hgm9f5JhBv8ZH+1gzGjt27YRZmRPFPSYRcp57z6Wt5Ijkib/Hki+H5btnmeHBXRl/CPV 2jX6z1/UDCdJzVNZbulPCO8w6oCQh938SzbOqy+/tOCoF7VjlWscWVIiejuoXT2U8oCy Dd+eitLDdv9QhiOKMBfUsudVjiZfcUHvyDmNtwcj0iMiWSJhpY6Jr4uRgPNRoNsQOzvq 9lPQ== X-Gm-Message-State: AC+VfDw3tGZ/mYKC0kMaHTuAsbvGjd96XPG6CBmV9UZWHxX/FDrQzevU xUMHqLxKTmvYSX0TNZ9bcZc3HJSXEik= X-Google-Smtp-Source: ACHHUZ5aYhy2MBmaWD1q0FxpVfKGE+Nrskt2Z7iv2nNf760O+qZ11RhzmW2rToUdRrv6PXs76j2V3Kvga7o= X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:910f:8a15:592b:2087]) (user=yuzhao job=sendgmr) by 2002:a81:4421:0:b0:565:9f59:664f with SMTP id r33-20020a814421000000b005659f59664fmr2006806ywa.6.1685144684605; Fri, 26 May 2023 16:44:44 -0700 (PDT) Date: Fri, 26 May 2023 17:44:28 -0600 In-Reply-To: <20230526234435.662652-1-yuzhao@google.com> Message-Id: <20230526234435.662652-4-yuzhao@google.com> Mime-Version: 1.0 References: <20230526234435.662652-1-yuzhao@google.com> X-Mailer: git-send-email 2.41.0.rc0.172.g3f132b7071-goog Subject: [PATCH mm-unstable v2 03/10] kvm/arm64: export stage2_try_set_pte() and macros From: Yu Zhao To: Andrew Morton , Paolo Bonzini X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Jason A. Donenfeld" , x86@kernel.org, Gavin Shan , kvm@vger.kernel.org, linux-doc@vger.kernel.org, Catalin Marinas , Dave Hansen , Peter Xu , linux-mm@kvack.org, Ben Gardon , Chao Peng , Will Deacon , Gaosheng Cui , Marc Zyngier , "H. Peter Anvin" , Yu Zhao , Jonathan Corbet , Alistair Popple , Jason Gunthorpe , Ingo Molnar , Zenghui Yu , linux-trace-kernel@vger.kernel.org, linux-mm@google.com, Thomas Huth , Suzuki K Poulose , Nicholas Piggin , Borislav Petkov , Steven Rostedt , kvmarm@lists.linux.dev, Thomas Gleixner , linux-arm-ke rnel@lists.infradead.org, Fabiano Rosas , Michael Larabel , Sean Christopherson , linux-kernel@vger.kernel.org, Oliver Upton , James Morse , Masami Hiramatsu , Anup Patel , linuxppc-dev@lists.ozlabs.org, Mike Rapoport Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" stage2_try_set_pte() and KVM_PTE_LEAF_ATTR_LO_S2_AF are needed to implement kvm_arch_test_clear_young(). Signed-off-by: Yu Zhao --- arch/arm64/include/asm/kvm_pgtable.h | 53 ++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/pgtable.c | 53 ---------------------------- 2 files changed, 53 insertions(+), 53 deletions(-) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index dc3c072e862f..ff520598b62c 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -44,6 +44,49 @@ typedef u64 kvm_pte_t; #define KVM_PHYS_INVALID (-1ULL) +#define KVM_PTE_TYPE BIT(1) +#define KVM_PTE_TYPE_BLOCK 0 +#define KVM_PTE_TYPE_PAGE 1 +#define KVM_PTE_TYPE_TABLE 1 + +#define KVM_PTE_LEAF_ATTR_LO GENMASK(11, 2) + +#define KVM_PTE_LEAF_ATTR_LO_S1_ATTRIDX GENMASK(4, 2) +#define KVM_PTE_LEAF_ATTR_LO_S1_AP GENMASK(7, 6) +#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RO 3 +#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RW 1 +#define KVM_PTE_LEAF_ATTR_LO_S1_SH GENMASK(9, 8) +#define KVM_PTE_LEAF_ATTR_LO_S1_SH_IS 3 +#define KVM_PTE_LEAF_ATTR_LO_S1_AF BIT(10) + +#define KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR GENMASK(5, 2) +#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R BIT(6) +#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W BIT(7) +#define KVM_PTE_LEAF_ATTR_LO_S2_SH GENMASK(9, 8) +#define KVM_PTE_LEAF_ATTR_LO_S2_SH_IS 3 +#define KVM_PTE_LEAF_ATTR_LO_S2_AF BIT(10) + +#define KVM_PTE_LEAF_ATTR_HI GENMASK(63, 51) + +#define KVM_PTE_LEAF_ATTR_HI_SW GENMASK(58, 55) + +#define KVM_PTE_LEAF_ATTR_HI_S1_XN BIT(54) + +#define KVM_PTE_LEAF_ATTR_HI_S2_XN BIT(54) + +#define KVM_PTE_LEAF_ATTR_S2_PERMS (KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | \ + KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \ + KVM_PTE_LEAF_ATTR_HI_S2_XN) + +#define KVM_INVALID_PTE_OWNER_MASK GENMASK(9, 2) +#define KVM_MAX_OWNER_ID 1 + +/* + * Used to indicate a pte for which a 'break-before-make' sequence is in + * progress. + */ +#define KVM_INVALID_PTE_LOCKED BIT(10) + static inline bool kvm_pte_valid(kvm_pte_t pte) { return pte & KVM_PTE_VALID; @@ -224,6 +267,16 @@ static inline bool kvm_pgtable_walk_shared(const struct kvm_pgtable_visit_ctx *c return ctx->flags & KVM_PGTABLE_WALK_SHARED; } +static inline bool stage2_try_set_pte(const struct kvm_pgtable_visit_ctx *ctx, kvm_pte_t new) +{ + if (!kvm_pgtable_walk_shared(ctx)) { + WRITE_ONCE(*ctx->ptep, new); + return true; + } + + return cmpxchg(ctx->ptep, ctx->old, new) == ctx->old; +} + /** * struct kvm_pgtable_walker - Hook into a page-table walk. * @cb: Callback function to invoke during the walk. diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index 5282cb9ca4cf..24678ccba76a 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -12,49 +12,6 @@ #include -#define KVM_PTE_TYPE BIT(1) -#define KVM_PTE_TYPE_BLOCK 0 -#define KVM_PTE_TYPE_PAGE 1 -#define KVM_PTE_TYPE_TABLE 1 - -#define KVM_PTE_LEAF_ATTR_LO GENMASK(11, 2) - -#define KVM_PTE_LEAF_ATTR_LO_S1_ATTRIDX GENMASK(4, 2) -#define KVM_PTE_LEAF_ATTR_LO_S1_AP GENMASK(7, 6) -#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RO 3 -#define KVM_PTE_LEAF_ATTR_LO_S1_AP_RW 1 -#define KVM_PTE_LEAF_ATTR_LO_S1_SH GENMASK(9, 8) -#define KVM_PTE_LEAF_ATTR_LO_S1_SH_IS 3 -#define KVM_PTE_LEAF_ATTR_LO_S1_AF BIT(10) - -#define KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR GENMASK(5, 2) -#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R BIT(6) -#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W BIT(7) -#define KVM_PTE_LEAF_ATTR_LO_S2_SH GENMASK(9, 8) -#define KVM_PTE_LEAF_ATTR_LO_S2_SH_IS 3 -#define KVM_PTE_LEAF_ATTR_LO_S2_AF BIT(10) - -#define KVM_PTE_LEAF_ATTR_HI GENMASK(63, 51) - -#define KVM_PTE_LEAF_ATTR_HI_SW GENMASK(58, 55) - -#define KVM_PTE_LEAF_ATTR_HI_S1_XN BIT(54) - -#define KVM_PTE_LEAF_ATTR_HI_S2_XN BIT(54) - -#define KVM_PTE_LEAF_ATTR_S2_PERMS (KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | \ - KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \ - KVM_PTE_LEAF_ATTR_HI_S2_XN) - -#define KVM_INVALID_PTE_OWNER_MASK GENMASK(9, 2) -#define KVM_MAX_OWNER_ID 1 - -/* - * Used to indicate a pte for which a 'break-before-make' sequence is in - * progress. - */ -#define KVM_INVALID_PTE_LOCKED BIT(10) - struct kvm_pgtable_walk_data { struct kvm_pgtable_walker *walker; @@ -702,16 +659,6 @@ static bool stage2_pte_is_locked(kvm_pte_t pte) return !kvm_pte_valid(pte) && (pte & KVM_INVALID_PTE_LOCKED); } -static bool stage2_try_set_pte(const struct kvm_pgtable_visit_ctx *ctx, kvm_pte_t new) -{ - if (!kvm_pgtable_walk_shared(ctx)) { - WRITE_ONCE(*ctx->ptep, new); - return true; - } - - return cmpxchg(ctx->ptep, ctx->old, new) == ctx->old; -} - /** * stage2_try_break_pte() - Invalidates a pte according to the * 'break-before-make' requirements of the From patchwork Fri May 26 23:44:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Zhao X-Patchwork-Id: 1786692 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=cWUN3P8r; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QShS24Sdkz20Q8 for ; Sat, 27 May 2023 09:47:46 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4QShS21nnSz3fNq for ; Sat, 27 May 2023 09:47:46 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=cWUN3P8r; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=flex--yuzhao.bounces.google.com (client-ip=2607:f8b0:4864:20::114a; helo=mail-yw1-x114a.google.com; envelope-from=3bkrxzaykdoufbgohvnvvnsl.jvtspubewwj-klcspzaz.vgshiz.vyn@flex--yuzhao.bounces.google.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=cWUN3P8r; dkim-atps=neutral Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4QShNd0W4Cz3f6n for ; Sat, 27 May 2023 09:44:48 +1000 (AEST) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-5655d99da53so32319547b3.0 for ; Fri, 26 May 2023 16:44:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1685144686; x=1687736686; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=9t4P2GvyPz5aecPxpWswvlll2je3r362Snycxj82bkU=; b=cWUN3P8rM/d3APcGkGsw1zxYbzC0SLP23YWYAHvNkl8en24zIhenJIzsgauwSrM86L GtpWgjW+9PaIY51P7YD5ERtfy398MWWnA2qbOrXT+m6cJHuvmS8It0VeeYLeOMql5pG+ oF4R8PaA47tg1/M/GaUTjyQPHdY2Yt35bMOgs0yeO0A95+MuHExqrRIRhATT/aSAcAvS DuZIw+jQWwcC1PbwFYK5fVCc5aYJ2bcpXcCLA842C0fEFJDU8WE3MLQI6mTQVW9b35Gn iKpM9XslaKQGwLt7gqMS0LD/fBOZjhpB3/6lsoxW54UOFTglxZ15DmEhLn/Wm52Z8EKo 74zA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685144686; x=1687736686; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=9t4P2GvyPz5aecPxpWswvlll2je3r362Snycxj82bkU=; b=BZ+IcwmJKsIx8r4fVZ7uzjid0hV7c2i3W3nfZ/M21MiQ2gq+Jcrsqd7+/1hVl/v6ig hkUv6bbMr9xpgHhlskSQKRQtByNAkPRqZR05S28vCnFqoD2UyDC9SKtKnXUtCKXLmrrm EDlWsH758YOjwAS/0hEqJL+GQ6/pbYRwDtqezXIDqEiObR4hyGs396itaLx0lhHButz3 e07AkPbopD+MZKjVJoxiero8lgwZOjcd6QA+s2P6Sc1rid1VJ6cANh6IMXgI+ECE1Apy FL23esRdcrTiZ/XFf2yejaX7NDY2lJ77xe4dKWtzqMGlGue7F4zKdz8tMbdeE/Z5W0uf 89gg== X-Gm-Message-State: AC+VfDyVaQxZKHXjP8FkPh62oJtBTFStL6uRopR5pA865MC5COe+vI33 RANjWUQfxSgCCnDV8E0i2JAoLxE3qDA= X-Google-Smtp-Source: ACHHUZ671mv9b1WNngTqma0ZpDsFyu7jCdBVYj6cC/Q6bUpKz7y4SPqrDazly1Z4TuFOLliyXqQloqgktcM= X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:910f:8a15:592b:2087]) (user=yuzhao job=sendgmr) by 2002:a81:b627:0:b0:559:d859:d749 with SMTP id u39-20020a81b627000000b00559d859d749mr437593ywh.5.1685144686225; Fri, 26 May 2023 16:44:46 -0700 (PDT) Date: Fri, 26 May 2023 17:44:29 -0600 In-Reply-To: <20230526234435.662652-1-yuzhao@google.com> Message-Id: <20230526234435.662652-5-yuzhao@google.com> Mime-Version: 1.0 References: <20230526234435.662652-1-yuzhao@google.com> X-Mailer: git-send-email 2.41.0.rc0.172.g3f132b7071-goog Subject: [PATCH mm-unstable v2 04/10] kvm/arm64: make stage2 page tables RCU safe From: Yu Zhao To: Andrew Morton , Paolo Bonzini X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Jason A. Donenfeld" , x86@kernel.org, Gavin Shan , kvm@vger.kernel.org, linux-doc@vger.kernel.org, Catalin Marinas , Dave Hansen , Peter Xu , linux-mm@kvack.org, Ben Gardon , Chao Peng , Will Deacon , Gaosheng Cui , Marc Zyngier , "H. Peter Anvin" , Yu Zhao , Jonathan Corbet , Alistair Popple , Jason Gunthorpe , Ingo Molnar , Zenghui Yu , linux-trace-kernel@vger.kernel.org, linux-mm@google.com, Thomas Huth , Suzuki K Poulose , Nicholas Piggin , Borislav Petkov , Steven Rostedt , kvmarm@lists.linux.dev, Thomas Gleixner , linux-arm-ke rnel@lists.infradead.org, Fabiano Rosas , Michael Larabel , Sean Christopherson , linux-kernel@vger.kernel.org, Oliver Upton , James Morse , Masami Hiramatsu , Anup Patel , linuxppc-dev@lists.ozlabs.org, Mike Rapoport Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Stage2 page tables are currently not RCU safe against unmapping or VM destruction. The previous mmu_notifier_ops members rely on kvm->mmu_lock to synchronize with those operations. However, the new mmu_notifier_ops member test_clear_young() provides a fast path that does not take kvm->mmu_lock. To implement kvm_arch_test_clear_young() for that path, unmapped page tables need to be freed by RCU and kvm_free_stage2_pgd() needs to be after mmu_notifier_unregister(). Remapping, specifically stage2_free_removed_table(), is already RCU safe. Signed-off-by: Yu Zhao --- arch/arm64/include/asm/kvm_pgtable.h | 2 ++ arch/arm64/kvm/arm.c | 1 + arch/arm64/kvm/hyp/pgtable.c | 8 ++++++-- arch/arm64/kvm/mmu.c | 17 ++++++++++++++++- 4 files changed, 25 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index ff520598b62c..5cab52e3a35f 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -153,6 +153,7 @@ static inline bool kvm_level_supports_block_mapping(u32 level) * @put_page: Decrement the refcount on a page. When the * refcount reaches 0 the page is automatically * freed. + * @put_page_rcu: RCU variant of the above. * @page_count: Return the refcount of a page. * @phys_to_virt: Convert a physical address into a virtual * address mapped in the current context. @@ -170,6 +171,7 @@ struct kvm_pgtable_mm_ops { void (*free_removed_table)(void *addr, u32 level); void (*get_page)(void *addr); void (*put_page)(void *addr); + void (*put_page_rcu)(void *addr); int (*page_count)(void *addr); void* (*phys_to_virt)(phys_addr_t phys); phys_addr_t (*virt_to_phys)(void *addr); diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 14391826241c..ee93271035d9 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -191,6 +191,7 @@ vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf) */ void kvm_arch_destroy_vm(struct kvm *kvm) { + kvm_free_stage2_pgd(&kvm->arch.mmu); bitmap_free(kvm->arch.pmu_filter); free_cpumask_var(kvm->arch.supported_cpus); diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index 24678ccba76a..dbace4c6a841 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -988,8 +988,12 @@ static int stage2_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx, mm_ops->dcache_clean_inval_poc(kvm_pte_follow(ctx->old, mm_ops), kvm_granule_size(ctx->level)); - if (childp) - mm_ops->put_page(childp); + if (childp) { + if (mm_ops->put_page_rcu) + mm_ops->put_page_rcu(childp); + else + mm_ops->put_page(childp); + } return 0; } diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 3b9d4d24c361..c3b3e2afe26f 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -172,6 +172,21 @@ static int kvm_host_page_count(void *addr) return page_count(virt_to_page(addr)); } +static void kvm_s2_rcu_put_page(struct rcu_head *head) +{ + put_page(container_of(head, struct page, rcu_head)); +} + +static void kvm_s2_put_page_rcu(void *addr) +{ + struct page *page = virt_to_page(addr); + + if (kvm_host_page_count(addr) == 1) + kvm_account_pgtable_pages(addr, -1); + + call_rcu(&page->rcu_head, kvm_s2_rcu_put_page); +} + static phys_addr_t kvm_host_pa(void *addr) { return __pa(addr); @@ -704,6 +719,7 @@ static struct kvm_pgtable_mm_ops kvm_s2_mm_ops = { .free_removed_table = stage2_free_removed_table, .get_page = kvm_host_get_page, .put_page = kvm_s2_put_page, + .put_page_rcu = kvm_s2_put_page_rcu, .page_count = kvm_host_page_count, .phys_to_virt = kvm_host_va, .virt_to_phys = kvm_host_pa, @@ -1877,7 +1893,6 @@ void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) void kvm_arch_flush_shadow_all(struct kvm *kvm) { - kvm_free_stage2_pgd(&kvm->arch.mmu); } void kvm_arch_flush_shadow_memslot(struct kvm *kvm, From patchwork Fri May 26 23:44:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Zhao X-Patchwork-Id: 1786694 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=A+5tnxly; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QShTr4kcpz20Q8 for ; Sat, 27 May 2023 09:49:20 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4QShTr34Mtz3fPW for ; Sat, 27 May 2023 09:49:20 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=A+5tnxly; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=flex--yuzhao.bounces.google.com (client-ip=2607:f8b0:4864:20::b49; helo=mail-yb1-xb49.google.com; envelope-from=3b0rxzaykdoygchpiwowwotm.kwutqvcfxxk-lmdtqaba.whtija.wzo@flex--yuzhao.bounces.google.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=A+5tnxly; dkim-atps=neutral Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4QShNd53s5z3c8V for ; Sat, 27 May 2023 09:44:49 +1000 (AEST) Received: by mail-yb1-xb49.google.com with SMTP id 3f1490d57ef6-ba8338f20bdso1709668276.0 for ; Fri, 26 May 2023 16:44:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1685144687; x=1687736687; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=gdtrmeQ7r8fSzyqqoeow0Oco6w18CVCYcKcBQXKvLk0=; b=A+5tnxly4jAziM7PCgC2JSUrvZIuWUL9UuTqWd7F/s/V7U9ZvBSUtbCMn4PDjXhEkh xPkXA8ijRXU8kfBAAJh45kvENT/cQ+0BqqVDKv4CzNn3FeibDZ4esB2AS5+kj2xWW1A/ 5f+rTWaRryOWr0eLpQP68cwlR3USUfp7Ely4h3ulPbhNc/8a6O9EbhRXY0Hc8yu56mx8 LQMCwDasuefCJVI+rIDEYvr3/19bdInHAEUvVq6phC2Jc21wzdoWQlzM7WDhIq/i18eK BEwyErOKi0xkhmnwWH0N/n1UzNpwg5kRKa2pc24aeVs5XXh0p7GOcbD7gzDkKgQaeHXr C7IQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685144687; x=1687736687; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=gdtrmeQ7r8fSzyqqoeow0Oco6w18CVCYcKcBQXKvLk0=; b=AW04Jh/uNlXrfOyRDowaGu8k4iko/BkH99ex0EW0RP0Ju9En+SWwc/TzmB0XjLifT0 yaQb8S9ckoHM/+cBgLb0pLs2wvTYyZ91ELwlFJL7+tTv+Loah/eOX3mfQ0WMM1qxx390 S8tnz13grgNdel6f4nP4ImSQuJ76SJXZN6xNi+w0Q8OcGSFrN2WjqdqPHyWnIIBDZzqj uiKgkjb15GgAl6jfmnI2YWQPXsjIFTjiSmlFv2WkjBOeBWKuiPcDGaRJseFOgwa+Wn5I BaKRNv6pdm8SALprVUWVQtbw2KOUdnefX7IBYjoyFGhfaHuMtEd43uXevvuX4Cavz26h NiwQ== X-Gm-Message-State: AC+VfDztNXsmV3GXYkNaELFf9csHQq6kylQ/Or5YSw8xxL8IAkpOqM9s ZtGgkvkh4qvbThssHPMNLW+h5vYW7Dw= X-Google-Smtp-Source: ACHHUZ75roT4ZouTjM2lH4DZFyTBmS3QLSjPTFGxgUaEGtzpG1LtCtXvcTQXcsV4/+3guLPQ6FLc+WkYaOc= X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:910f:8a15:592b:2087]) (user=yuzhao job=sendgmr) by 2002:a25:b28d:0:b0:bad:155a:1004 with SMTP id k13-20020a25b28d000000b00bad155a1004mr1830575ybj.2.1685144687512; Fri, 26 May 2023 16:44:47 -0700 (PDT) Date: Fri, 26 May 2023 17:44:30 -0600 In-Reply-To: <20230526234435.662652-1-yuzhao@google.com> Message-Id: <20230526234435.662652-6-yuzhao@google.com> Mime-Version: 1.0 References: <20230526234435.662652-1-yuzhao@google.com> X-Mailer: git-send-email 2.41.0.rc0.172.g3f132b7071-goog Subject: [PATCH mm-unstable v2 05/10] kvm/arm64: add kvm_arch_test_clear_young() From: Yu Zhao To: Andrew Morton , Paolo Bonzini X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Jason A. Donenfeld" , x86@kernel.org, Gavin Shan , kvm@vger.kernel.org, linux-doc@vger.kernel.org, Catalin Marinas , Dave Hansen , Peter Xu , linux-mm@kvack.org, Ben Gardon , Chao Peng , Will Deacon , Gaosheng Cui , Marc Zyngier , "H. Peter Anvin" , Yu Zhao , Jonathan Corbet , Alistair Popple , Jason Gunthorpe , Ingo Molnar , Zenghui Yu , linux-trace-kernel@vger.kernel.org, linux-mm@google.com, Thomas Huth , Suzuki K Poulose , Nicholas Piggin , Borislav Petkov , Steven Rostedt , kvmarm@lists.linux.dev, Thomas Gleixner , linux-arm-ke rnel@lists.infradead.org, Fabiano Rosas , Michael Larabel , Sean Christopherson , linux-kernel@vger.kernel.org, Oliver Upton , James Morse , Masami Hiramatsu , Anup Patel , linuxppc-dev@lists.ozlabs.org, Mike Rapoport Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Implement kvm_arch_test_clear_young() to support the fast path in mmu_notifier_ops->test_clear_young(). It focuses on a simple case, i.e., hardware sets the accessed bit in KVM PTEs and VMs are not protected, where it can rely on RCU and cmpxchg to safely clear the accessed bit without taking kvm->mmu_lock. Complex cases fall back to the existing slow path where kvm->mmu_lock is then taken. Signed-off-by: Yu Zhao --- arch/arm64/include/asm/kvm_host.h | 6 ++++++ arch/arm64/kvm/mmu.c | 36 +++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 7e7e19ef6993..da32b0890716 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -1113,4 +1113,10 @@ static inline void kvm_hyp_reserve(void) { } void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu); bool kvm_arm_vcpu_stopped(struct kvm_vcpu *vcpu); +#define kvm_arch_has_test_clear_young kvm_arch_has_test_clear_young +static inline bool kvm_arch_has_test_clear_young(void) +{ + return cpu_has_hw_af() && !is_protected_kvm_enabled(); +} + #endif /* __ARM64_KVM_HOST_H__ */ diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index c3b3e2afe26f..26a8d955b49c 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1678,6 +1678,42 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) range->start << PAGE_SHIFT); } +static int stage2_test_clear_young(const struct kvm_pgtable_visit_ctx *ctx, + enum kvm_pgtable_walk_flags flags) +{ + kvm_pte_t new = ctx->old & ~KVM_PTE_LEAF_ATTR_LO_S2_AF; + + VM_WARN_ON_ONCE(!page_count(virt_to_page(ctx->ptep))); + + if (!kvm_pte_valid(new)) + return 0; + + if (new == ctx->old) + return 0; + + if (kvm_should_clear_young(ctx->arg, ctx->addr / PAGE_SIZE)) + stage2_try_set_pte(ctx, new); + + return 0; +} + +bool kvm_arch_test_clear_young(struct kvm *kvm, struct kvm_gfn_range *range) +{ + u64 start = range->start * PAGE_SIZE; + u64 end = range->end * PAGE_SIZE; + struct kvm_pgtable_walker walker = { + .cb = stage2_test_clear_young, + .arg = range, + .flags = KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_SHARED, + }; + + BUILD_BUG_ON(is_hyp_code()); + + kvm_pgtable_walk(kvm->arch.mmu.pgt, start, end - start, &walker); + + return false; +} + phys_addr_t kvm_mmu_get_httbr(void) { return __pa(hyp_pgtable->pgd); From patchwork Fri May 26 23:44:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Zhao X-Patchwork-Id: 1786695 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=Y62V38pt; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QShVl1nSXz20Pc for ; Sat, 27 May 2023 09:50:07 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4QShVl11RJz3fj2 for ; Sat, 27 May 2023 09:50:07 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=Y62V38pt; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=flex--yuzhao.bounces.google.com (client-ip=2607:f8b0:4864:20::b4a; helo=mail-yb1-xb4a.google.com; envelope-from=3curxzaykdogiejrkyqyyqvo.mywvsxehzzm-nofvscdc.yjvklc.ybq@flex--yuzhao.bounces.google.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=Y62V38pt; dkim-atps=neutral Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4QShNg5P3Sz3c8V for ; Sat, 27 May 2023 09:44:51 +1000 (AEST) Received: by mail-yb1-xb4a.google.com with SMTP id 3f1490d57ef6-ba2b9ecfadaso2800724276.2 for ; Fri, 26 May 2023 16:44:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1685144689; x=1687736689; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Bguo3hUZaLYxse9v+ik+MX3ufqUdHR10jXNpIhPD3cE=; b=Y62V38pt3Kf7P9M2YivEC5661SUHDQjGPJeLNpi65nB4y+p3TsqoRR0z/9AOIr1Kjf gl3yg0EBv9UJcyGGhkoV+eP5nxHHEckaVjNM+WV0ph8y6N8NKNrOoxejQzf6xVLF0GNF CMPzW7iyFKrk0I+YLOBH1dB2cwpR8z8XT9p+1tlyni+H85cDXJGB/ZGkaiDpbVC82/o/ GKwW1qcFloQI+vCY/C+X0I8xbKAWwMlWN0KpQsh6osmLARaX2g1tNlgMEbkEwKadQaPu 8s3poXNR7qoLyRgjp4108zRCziAP6oNak8GIUn6lV7gRCswVoQjlt0PKr+oTkaiyWyY0 Ciug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685144689; x=1687736689; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Bguo3hUZaLYxse9v+ik+MX3ufqUdHR10jXNpIhPD3cE=; b=aUG9rFuaCsUUKORaccYm2MSWU005SbDvgC1LK99MBCf25dpFsXQiJFJurStft/iuo9 0Fr879fFZ+5iNeTQSDSSO/vzILFUb/QzFgZcJ+cEWjDAGkoxW0BWlKUM6z+yc5j2kOQJ wBWhNBlFs0xU3I805KW+j/uM5H4XLS09q2MA8YXVNoeF4FZal/gBOqsEfyDbXSo/hGFM o5BRGJWdbe6HeNtmfISIz93xd+q8WhUVBN2JOwHU2yHcWfD8VkfD3HBPE6S4sQQ+tXMn QFRGvob1swKSEBnCt5BT2QahF3jToS0tJDodne9szJZNjHJ35sEZaqKb9RWdWbOpqmsC lbWw== X-Gm-Message-State: AC+VfDw3eFsoT6oWPOcdR3jgk6bQdrEwCY+ccPNdjrMMEm+wtxoAadzE dMngvnob44N//ul3mwR07N+DiBT9q0c= X-Google-Smtp-Source: ACHHUZ72Yke0umLVJsAYBjLDCmB2C1ZbHNslgBIJrMfaEGzX80HaCkEwKhsoBR2khvT0ady9wDj4p4x/v3M= X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:910f:8a15:592b:2087]) (user=yuzhao job=sendgmr) by 2002:a25:3cb:0:b0:ba8:337a:d8a3 with SMTP id 194-20020a2503cb000000b00ba8337ad8a3mr1807757ybd.11.1685144689141; Fri, 26 May 2023 16:44:49 -0700 (PDT) Date: Fri, 26 May 2023 17:44:31 -0600 In-Reply-To: <20230526234435.662652-1-yuzhao@google.com> Message-Id: <20230526234435.662652-7-yuzhao@google.com> Mime-Version: 1.0 References: <20230526234435.662652-1-yuzhao@google.com> X-Mailer: git-send-email 2.41.0.rc0.172.g3f132b7071-goog Subject: [PATCH mm-unstable v2 06/10] kvm/powerpc: make radix page tables RCU safe From: Yu Zhao To: Andrew Morton , Paolo Bonzini X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Jason A. Donenfeld" , x86@kernel.org, Gavin Shan , kvm@vger.kernel.org, linux-doc@vger.kernel.org, Catalin Marinas , Dave Hansen , Peter Xu , linux-mm@kvack.org, Ben Gardon , Chao Peng , Will Deacon , Gaosheng Cui , Marc Zyngier , "H. Peter Anvin" , Yu Zhao , Jonathan Corbet , Alistair Popple , Jason Gunthorpe , Ingo Molnar , Zenghui Yu , linux-trace-kernel@vger.kernel.org, linux-mm@google.com, Thomas Huth , Suzuki K Poulose , Nicholas Piggin , Borislav Petkov , Steven Rostedt , kvmarm@lists.linux.dev, Thomas Gleixner , linux-arm-ke rnel@lists.infradead.org, Fabiano Rosas , Michael Larabel , Sean Christopherson , linux-kernel@vger.kernel.org, Oliver Upton , James Morse , Masami Hiramatsu , Anup Patel , linuxppc-dev@lists.ozlabs.org, Mike Rapoport Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" KVM page tables are currently not RCU safe against remapping, i.e., kvmppc_unmap_free_pmd_entry_table() et al. The previous mmu_notifier_ops members rely on kvm->mmu_lock to synchronize with that operation. However, the new mmu_notifier_ops member test_clear_young() provides a fast path that does not take kvm->mmu_lock. To implement kvm_arch_test_clear_young() for that path, orphan page tables need to be freed by RCU. Unmapping, specifically kvm_unmap_radix(), does not free page tables, hence not a concern. Signed-off-by: Yu Zhao --- arch/powerpc/kvm/book3s_64_mmu_radix.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c index 461307b89c3a..3b65b3b11041 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c @@ -1469,13 +1469,15 @@ int kvmppc_radix_init(void) { unsigned long size = sizeof(void *) << RADIX_PTE_INDEX_SIZE; - kvm_pte_cache = kmem_cache_create("kvm-pte", size, size, 0, pte_ctor); + kvm_pte_cache = kmem_cache_create("kvm-pte", size, size, + SLAB_TYPESAFE_BY_RCU, pte_ctor); if (!kvm_pte_cache) return -ENOMEM; size = sizeof(void *) << RADIX_PMD_INDEX_SIZE; - kvm_pmd_cache = kmem_cache_create("kvm-pmd", size, size, 0, pmd_ctor); + kvm_pmd_cache = kmem_cache_create("kvm-pmd", size, size, + SLAB_TYPESAFE_BY_RCU, pmd_ctor); if (!kvm_pmd_cache) { kmem_cache_destroy(kvm_pte_cache); return -ENOMEM; From patchwork Fri May 26 23:44:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Zhao X-Patchwork-Id: 1786696 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=k8BERP9A; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QShWf5fnBz20Q1 for ; Sat, 27 May 2023 09:50:54 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4QShWf4lnNz3fkg for ; Sat, 27 May 2023 09:50:54 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=k8BERP9A; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=flex--yuzhao.bounces.google.com (client-ip=2607:f8b0:4864:20::b49; helo=mail-yb1-xb49.google.com; envelope-from=3ckrxzaykdokjfkslzrzzrwp.nzxwtyfiaan-opgwtded.zkwlmd.zcr@flex--yuzhao.bounces.google.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=k8BERP9A; dkim-atps=neutral Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4QShNj0QySz3c8V for ; Sat, 27 May 2023 09:44:53 +1000 (AEST) Received: by mail-yb1-xb49.google.com with SMTP id 3f1490d57ef6-ba69d93a6b5so2830849276.1 for ; Fri, 26 May 2023 16:44:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1685144691; x=1687736691; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=SAnqMQmXLaJfeZI+VeXqIwvB0doaLL+gp8U5/HUId5I=; b=k8BERP9A+6Uiw+Uaig20mMncTvjKAsRCKq+HC63zo5f14BiSqxPWFPVFrxvd4dUkfQ bKRY2Xn6Sj5cpDeqFqXaOaTxVBSBg6TnHG9z0eetRbywYU3ie/H8/CJOp1+Ez2wWWiIG CY1cPbCWNOAs/19D6iZC6kEHwAF0m7RMLVET0mpluU6zBeVTNhcDLR1hhBJpslaBZ/Nm 0fQXpAKGqnmpqrKCUVaNiYpgSsJF+6ODE5plgt1dvul/aWCITvYKB/5mkPp/iio/HINg 59adq2fXr8dYaO8RPmc4VSfW2nQVaqhTR9xX4O6C0F0j/4E6MdCu9N0KSqKFRTup9DJ6 82/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685144691; x=1687736691; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=SAnqMQmXLaJfeZI+VeXqIwvB0doaLL+gp8U5/HUId5I=; b=F+0gh3DuQamLXS/Qb9hrcqCPwzPk021K4pXhZjRhRiFUUy6KIGrnRmsZnfq/pIrAw3 bX6bvnFNg/fHYMQuxQLB9v/8JlnkRf2/leG+gt+avgc/eFzAXhex2u6Y7XbXsXz77DzI NTaPTGxyqaIQvhyVf9E4ZQeqm1r9rcJKMY8zavSFPCXU66KXRE83tJshRQ5Jl2tRzHBZ djY3UGhzTAaiO1und24r23yGC/d0ycBCzxc1mVtLdWA27IwRS5X20tr7BFOEfxjqCViJ nyb7uzhzHBBL1TVxXNCk6Zjt4D7wYKtNR2/+FL/0AkJ4VHJOrEV/HgkN57PkfmPnlSlP +t6Q== X-Gm-Message-State: AC+VfDzIly4a1F2RtVkAnPoV/ozcbHzKrLm9jbEb9NaLlg85OixAfgG8 sA02Pgs7MwTfgKe/oYSOwoI2zWPpoF4= X-Google-Smtp-Source: ACHHUZ56f5xRS5UFm0fGZF6KKL2sEC5mGRlgoj/Qp341qhbzjJy6m3cfCKx+HZdIQ3EgPejfKdLHbTHVxEI= X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:910f:8a15:592b:2087]) (user=yuzhao job=sendgmr) by 2002:a25:8211:0:b0:ba6:e7ee:bb99 with SMTP id q17-20020a258211000000b00ba6e7eebb99mr1850590ybk.12.1685144690927; Fri, 26 May 2023 16:44:50 -0700 (PDT) Date: Fri, 26 May 2023 17:44:32 -0600 In-Reply-To: <20230526234435.662652-1-yuzhao@google.com> Message-Id: <20230526234435.662652-8-yuzhao@google.com> Mime-Version: 1.0 References: <20230526234435.662652-1-yuzhao@google.com> X-Mailer: git-send-email 2.41.0.rc0.172.g3f132b7071-goog Subject: [PATCH mm-unstable v2 07/10] kvm/powerpc: add kvm_arch_test_clear_young() From: Yu Zhao To: Andrew Morton , Paolo Bonzini X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Jason A. Donenfeld" , x86@kernel.org, Gavin Shan , kvm@vger.kernel.org, linux-doc@vger.kernel.org, Catalin Marinas , Dave Hansen , Peter Xu , linux-mm@kvack.org, Ben Gardon , Chao Peng , Will Deacon , Gaosheng Cui , Marc Zyngier , "H. Peter Anvin" , Yu Zhao , Jonathan Corbet , Alistair Popple , Jason Gunthorpe , Ingo Molnar , Zenghui Yu , linux-trace-kernel@vger.kernel.org, linux-mm@google.com, Thomas Huth , Suzuki K Poulose , Nicholas Piggin , Borislav Petkov , Steven Rostedt , kvmarm@lists.linux.dev, Thomas Gleixner , linux-arm-ke rnel@lists.infradead.org, Fabiano Rosas , Michael Larabel , Sean Christopherson , linux-kernel@vger.kernel.org, Oliver Upton , James Morse , Masami Hiramatsu , Anup Patel , linuxppc-dev@lists.ozlabs.org, Mike Rapoport Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Implement kvm_arch_test_clear_young() to support the fast path in mmu_notifier_ops->test_clear_young(). It focuses on a simple case, i.e., radix MMU sets the accessed bit in KVM PTEs and VMs are not nested, where it can rely on RCU and pte_xchg() to safely clear the accessed bit without taking kvm->mmu_lock. Complex cases fall back to the existing slow path where kvm->mmu_lock is then taken. Signed-off-by: Yu Zhao --- arch/powerpc/include/asm/kvm_host.h | 8 ++++ arch/powerpc/include/asm/kvm_ppc.h | 1 + arch/powerpc/kvm/book3s.c | 6 +++ arch/powerpc/kvm/book3s.h | 1 + arch/powerpc/kvm/book3s_64_mmu_radix.c | 59 ++++++++++++++++++++++++++ arch/powerpc/kvm/book3s_hv.c | 5 +++ 6 files changed, 80 insertions(+) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 14ee0dece853..75c260ea8a9e 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -883,4 +883,12 @@ static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {} static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {} +#define kvm_arch_has_test_clear_young kvm_arch_has_test_clear_young +static inline bool kvm_arch_has_test_clear_young(void) +{ + return IS_ENABLED(CONFIG_KVM_BOOK3S_HV_POSSIBLE) && + cpu_has_feature(CPU_FTR_HVMODE) && cpu_has_feature(CPU_FTR_ARCH_300) && + radix_enabled(); +} + #endif /* __POWERPC_KVM_HOST_H__ */ diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index 79a9c0bb8bba..ff1af6a7b44f 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -287,6 +287,7 @@ struct kvmppc_ops { bool (*unmap_gfn_range)(struct kvm *kvm, struct kvm_gfn_range *range); bool (*age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range); bool (*test_age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range); + bool (*test_clear_young)(struct kvm *kvm, struct kvm_gfn_range *range); bool (*set_spte_gfn)(struct kvm *kvm, struct kvm_gfn_range *range); void (*free_memslot)(struct kvm_memory_slot *slot); int (*init_vm)(struct kvm *kvm); diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c index 686d8d9eda3e..37bf40b0c4ff 100644 --- a/arch/powerpc/kvm/book3s.c +++ b/arch/powerpc/kvm/book3s.c @@ -899,6 +899,12 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) return kvm->arch.kvm_ops->test_age_gfn(kvm, range); } +bool kvm_arch_test_clear_young(struct kvm *kvm, struct kvm_gfn_range *range) +{ + return !kvm->arch.kvm_ops->test_clear_young || + kvm->arch.kvm_ops->test_clear_young(kvm, range); +} + bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { return kvm->arch.kvm_ops->set_spte_gfn(kvm, range); diff --git a/arch/powerpc/kvm/book3s.h b/arch/powerpc/kvm/book3s.h index 58391b4b32ed..fa2659e21ccc 100644 --- a/arch/powerpc/kvm/book3s.h +++ b/arch/powerpc/kvm/book3s.h @@ -12,6 +12,7 @@ extern void kvmppc_core_flush_memslot_hv(struct kvm *kvm, extern bool kvm_unmap_gfn_range_hv(struct kvm *kvm, struct kvm_gfn_range *range); extern bool kvm_age_gfn_hv(struct kvm *kvm, struct kvm_gfn_range *range); extern bool kvm_test_age_gfn_hv(struct kvm *kvm, struct kvm_gfn_range *range); +extern bool kvm_test_clear_young_hv(struct kvm *kvm, struct kvm_gfn_range *range); extern bool kvm_set_spte_gfn_hv(struct kvm *kvm, struct kvm_gfn_range *range); extern int kvmppc_mmu_init_pr(struct kvm_vcpu *vcpu); diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c index 3b65b3b11041..0a392e9a100a 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c @@ -1088,6 +1088,65 @@ bool kvm_test_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot, return ref; } +bool kvm_test_clear_young_hv(struct kvm *kvm, struct kvm_gfn_range *range) +{ + bool err; + gfn_t gfn = range->start; + + rcu_read_lock(); + + err = !kvm_is_radix(kvm); + if (err) + goto unlock; + + /* + * Case 1: This function kvmppc_switch_mmu_to_hpt() + * + * rcu_read_lock() + * Test kvm_is_radix() kvm->arch.radix = 0 + * Use kvm->arch.pgtable synchronize_rcu() + * rcu_read_unlock() + * kvmppc_free_radix() + * + * + * Case 2: This function kvmppc_switch_mmu_to_radix() + * + * kvmppc_init_vm_radix() + * smp_wmb() + * Test kvm_is_radix() kvm->arch.radix = 1 + * smp_rmb() + * Use kvm->arch.pgtable + */ + smp_rmb(); + + while (gfn < range->end) { + pte_t *ptep; + pte_t old, new; + unsigned int shift; + + ptep = find_kvm_secondary_pte_unlocked(kvm, gfn * PAGE_SIZE, &shift); + if (!ptep) + goto next; + + VM_WARN_ON_ONCE(!page_count(virt_to_page(ptep))); + + old = READ_ONCE(*ptep); + if (!pte_present(old) || !pte_young(old)) + goto next; + + new = pte_mkold(old); + + if (kvm_should_clear_young(range, gfn)) + pte_xchg(ptep, old, new); +next: + gfn += shift ? BIT(shift - PAGE_SHIFT) : 1; + } +unlock: + rcu_read_unlock(); + + return err; +} + /* Returns the number of PAGE_SIZE pages that are dirty */ static int kvm_radix_test_clear_dirty(struct kvm *kvm, struct kvm_memory_slot *memslot, int pagenum) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 130bafdb1430..20a81ec9fde8 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -5262,6 +5262,8 @@ int kvmppc_switch_mmu_to_hpt(struct kvm *kvm) spin_lock(&kvm->mmu_lock); kvm->arch.radix = 0; spin_unlock(&kvm->mmu_lock); + /* see the comments in kvm_test_clear_young_hv() */ + synchronize_rcu(); kvmppc_free_radix(kvm); lpcr = LPCR_VPM1; @@ -5286,6 +5288,8 @@ int kvmppc_switch_mmu_to_radix(struct kvm *kvm) if (err) return err; kvmppc_rmap_reset(kvm); + /* see the comments in kvm_test_clear_young_hv() */ + smp_wmb(); /* Mutual exclusion with kvm_unmap_gfn_range etc. */ spin_lock(&kvm->mmu_lock); kvm->arch.radix = 1; @@ -6185,6 +6189,7 @@ static struct kvmppc_ops kvm_ops_hv = { .unmap_gfn_range = kvm_unmap_gfn_range_hv, .age_gfn = kvm_age_gfn_hv, .test_age_gfn = kvm_test_age_gfn_hv, + .test_clear_young = kvm_test_clear_young_hv, .set_spte_gfn = kvm_set_spte_gfn_hv, .free_memslot = kvmppc_core_free_memslot_hv, .init_vm = kvmppc_core_init_vm_hv, From patchwork Fri May 26 23:44:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Zhao X-Patchwork-Id: 1786697 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=Ne/YRQ/s; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QShXh3yPDz20Q1 for ; Sat, 27 May 2023 09:51:48 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4QShXh2r2rz3fJr for ; Sat, 27 May 2023 09:51:48 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=Ne/YRQ/s; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=flex--yuzhao.bounces.google.com (client-ip=2607:f8b0:4864:20::b49; helo=mail-yb1-xb49.google.com; envelope-from=3derxzaykdoslhmunbtbbtyr.pbzyvahkccp-qriyvfgf.bmynof.bet@flex--yuzhao.bounces.google.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=Ne/YRQ/s; dkim-atps=neutral Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4QShNl0mWQz3cP0 for ; Sat, 27 May 2023 09:44:54 +1000 (AEST) Received: by mail-yb1-xb49.google.com with SMTP id 3f1490d57ef6-bad475920a8so633624276.1 for ; Fri, 26 May 2023 16:44:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1685144693; x=1687736693; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=7JLLUnbPXAaoH9i/N14KC8FRF4yfK0mZvbZkcTBOq5Y=; b=Ne/YRQ/svpJFGXhXpgHmwxc0D9cLMxIoz8hghy2YEExILLlaUzGTTLTu8b32WLIGJt Pl2W5ENMW5Sa2Kwrgh0yVLYa4j/IXKl/wxvEFqusCYfl9i2LCglCwg4qFOwnu/SiZlAZ epK5zgrczZWcQF0/2oLgZ8laj6CsPXbNpnBrvrF3XhoGhjELY5RWTIQgLbNJgOlO3uxt 4d+XiEr2KggZNk0D2a/3M0F1IRHplA5m+1NXrAN0tP918Y2nokL0n9zRLbfOWAMyaJ2Q beEb9W4X5NKj/DpGHg7ZQVC9QnQZ7wDRATscsNx8htwusD9k+VmLxpoV1j1SSPGezxSx 58Rg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685144693; x=1687736693; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=7JLLUnbPXAaoH9i/N14KC8FRF4yfK0mZvbZkcTBOq5Y=; b=lcaLj1Aj/UmnxYCcJKMKWOjSMjp3B/TEuM346NgkJfZCU81ZfzqR2rMZMS2zKw5qwx LFEX39J5wMY1g8V0TyQjWAxpQfEgly/8SPLncuGcg5BuOs7JmBNVDrtZBXzgdVcN+k52 OlVR6wBEhIe8frOhlH60CQTm/oMZe1jHWdcL+p27DCY2z3G4DRKS2AEp3uTUF5beM5La ime3UVl3idqb1IkdVn264eTVgoS3RcT/NjObcFcjXlppVLal/qSG0XeSjzAgiaHagP4u zm0xtdtRHrYmiUXlH4ozuynI58eXHTl8ctozDv4zhwscexqxB2HwFVXNHzwhHjfJyALc XMgA== X-Gm-Message-State: AC+VfDxP7uMsq3oOD5XeAETcQcHkTfjva0Ic7YlUQnaJWEy2ndVvk+4v ILZNunpFIMLsxvSqP+PfHQ54sZ9eBKM= X-Google-Smtp-Source: ACHHUZ5CZlpEmT2VkUcPhwDn9jTP7EScM/+97DNWp/Z/n1uD6F2Bv8HNZfloYPpxuZT6MB0BvBWbDvtVv+8= X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:910f:8a15:592b:2087]) (user=yuzhao job=sendgmr) by 2002:a5b:9c6:0:b0:ba8:381b:f764 with SMTP id y6-20020a5b09c6000000b00ba8381bf764mr354063ybq.3.1685144692904; Fri, 26 May 2023 16:44:52 -0700 (PDT) Date: Fri, 26 May 2023 17:44:33 -0600 In-Reply-To: <20230526234435.662652-1-yuzhao@google.com> Message-Id: <20230526234435.662652-9-yuzhao@google.com> Mime-Version: 1.0 References: <20230526234435.662652-1-yuzhao@google.com> X-Mailer: git-send-email 2.41.0.rc0.172.g3f132b7071-goog Subject: [PATCH mm-unstable v2 08/10] kvm/x86: move tdp_mmu_enabled and shadow_accessed_mask From: Yu Zhao To: Andrew Morton , Paolo Bonzini X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Jason A. Donenfeld" , x86@kernel.org, Gavin Shan , kvm@vger.kernel.org, linux-doc@vger.kernel.org, Catalin Marinas , Dave Hansen , Peter Xu , linux-mm@kvack.org, Ben Gardon , Chao Peng , Will Deacon , Gaosheng Cui , Marc Zyngier , "H. Peter Anvin" , Yu Zhao , Jonathan Corbet , Alistair Popple , Jason Gunthorpe , Ingo Molnar , Zenghui Yu , linux-trace-kernel@vger.kernel.org, linux-mm@google.com, Thomas Huth , Suzuki K Poulose , Nicholas Piggin , Borislav Petkov , Steven Rostedt , kvmarm@lists.linux.dev, Thomas Gleixner , linux-arm-ke rnel@lists.infradead.org, Fabiano Rosas , Michael Larabel , Sean Christopherson , linux-kernel@vger.kernel.org, Oliver Upton , James Morse , Masami Hiramatsu , Anup Patel , linuxppc-dev@lists.ozlabs.org, Mike Rapoport Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" tdp_mmu_enabled and shadow_accessed_mask are needed to implement kvm_arch_has_test_clear_young(). Signed-off-by: Yu Zhao --- arch/x86/include/asm/kvm_host.h | 6 ++++++ arch/x86/kvm/mmu.h | 6 ------ arch/x86/kvm/mmu/spte.h | 1 - 3 files changed, 6 insertions(+), 7 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index fb9d1f2d6136..753c67072c47 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1772,6 +1772,7 @@ struct kvm_arch_async_pf { extern u32 __read_mostly kvm_nr_uret_msrs; extern u64 __read_mostly host_efer; +extern u64 __read_mostly shadow_accessed_mask; extern bool __read_mostly allow_smaller_maxphyaddr; extern bool __read_mostly enable_apicv; extern struct kvm_x86_ops kvm_x86_ops; @@ -1855,6 +1856,11 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin, bool mask); extern bool tdp_enabled; +#ifdef CONFIG_X86_64 +extern bool tdp_mmu_enabled; +#else +#define tdp_mmu_enabled false +#endif u64 vcpu_tsc_khz(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 92d5a1924fc1..84aedb2671ef 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -253,12 +253,6 @@ static inline bool kvm_shadow_root_allocated(struct kvm *kvm) return smp_load_acquire(&kvm->arch.shadow_root_allocated); } -#ifdef CONFIG_X86_64 -extern bool tdp_mmu_enabled; -#else -#define tdp_mmu_enabled false -#endif - static inline bool kvm_memslots_have_rmaps(struct kvm *kvm) { return !tdp_mmu_enabled || kvm_shadow_root_allocated(kvm); diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h index 1279db2eab44..a82c4fa1c47b 100644 --- a/arch/x86/kvm/mmu/spte.h +++ b/arch/x86/kvm/mmu/spte.h @@ -153,7 +153,6 @@ extern u64 __read_mostly shadow_mmu_writable_mask; extern u64 __read_mostly shadow_nx_mask; extern u64 __read_mostly shadow_x_mask; /* mutual exclusive with nx_mask */ extern u64 __read_mostly shadow_user_mask; -extern u64 __read_mostly shadow_accessed_mask; extern u64 __read_mostly shadow_dirty_mask; extern u64 __read_mostly shadow_mmio_value; extern u64 __read_mostly shadow_mmio_mask; From patchwork Fri May 26 23:44:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Zhao X-Patchwork-Id: 1786698 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=yrD5cypj; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QShYb6NQgz20Q1 for ; Sat, 27 May 2023 09:52:35 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4QShYb5byhz3fnv for ; Sat, 27 May 2023 09:52:35 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=yrD5cypj; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=flex--yuzhao.bounces.google.com (client-ip=2607:f8b0:4864:20::b49; helo=mail-yb1-xb49.google.com; envelope-from=3d0rxzaykdo4okpxqeweewbu.secbydknffs-tulbyiji.epbqri.ehw@flex--yuzhao.bounces.google.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=yrD5cypj; dkim-atps=neutral Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4QShNn37k3z3fF8 for ; Sat, 27 May 2023 09:44:57 +1000 (AEST) Received: by mail-yb1-xb49.google.com with SMTP id 3f1490d57ef6-bad06cc7fb7so2973483276.3 for ; Fri, 26 May 2023 16:44:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1685144695; x=1687736695; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=M3AF7fBmc1xnJAMOiZdfCsfXZHiS6oPk41ReBM4zbdo=; b=yrD5cypjwj76oGySeMKoL+EObdL+VlHBR0ZE4dlUQxoYbTEWiiVpSKdOJmkcaaiF/Z 4zilNg4E7S1qiA0OC/fMHUKvaq4lLe3h3XrcfBRajVZEzuTWsMjYGvjMBJ/wRyLbmfd/ x0p59R3lRKLAmUF0+sTBQhD8rRciT1hrCi2rbco+ObNSFPCgz7YI6gzDi295X90OlgMM zrJNip6i/8aKO3FHb/HgogYFOfPOgx60+DOedoHxymQQDAPdmrus+K28IPt1M7E0g0cI asbBmGj30I6PgAgE5VquUWTvHUX2/5VYaFLlsw0xBC7qUsoM7lDeTwedn+JU6nSVVIPC w2tA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685144695; x=1687736695; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=M3AF7fBmc1xnJAMOiZdfCsfXZHiS6oPk41ReBM4zbdo=; b=ggSuEgZYg1AKL8ng/h6afZs0fsHvFUDuo9dT3BZTDbmeKlmW2zi5Sd3fRenhRc8VCw 8Mj6unnG4NGvLhyN1XJzIzc1XnBmDe+a5FyQpwFKsZGSzJDRPJxsOq7nvlddsoEu3+PG cayugVwd6j+AN0vPOt70Q9CDAvZbQs4B/GFbWZv9QIs5s/RYfuUjxR9j/4UQPw4RITuN Z8zdt5V4CT9sVaVV1qttPchF22iyHCLHJALUxa8c0nUNRM6Iwm006/+naegwAIukAx4y 5TZRF0hW6dKmSIL8V24IImH1RdMSW9qishvzdZxIjgLFPHvjz7LpqkwRqdTjxBxlecwT oJlw== X-Gm-Message-State: AC+VfDypeMboP9YLixIqdpihjJrJxRdTK+nDaObAnonjZp6SpE24x0oU xPhGwNmNqL0xYuAifrasoJrQnVJ7AFc= X-Google-Smtp-Source: ACHHUZ4QgTvrucO0Q6McE6sU/ZYyF1Ujw9/eAceyjgq3u2aouYR16g/56Sc2t/zSHX7sxuPwIu+K0d0WjYg= X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:910f:8a15:592b:2087]) (user=yuzhao job=sendgmr) by 2002:a05:6902:1341:b0:bac:6bb:2549 with SMTP id g1-20020a056902134100b00bac06bb2549mr1840571ybu.7.1685144695170; Fri, 26 May 2023 16:44:55 -0700 (PDT) Date: Fri, 26 May 2023 17:44:34 -0600 In-Reply-To: <20230526234435.662652-1-yuzhao@google.com> Message-Id: <20230526234435.662652-10-yuzhao@google.com> Mime-Version: 1.0 References: <20230526234435.662652-1-yuzhao@google.com> X-Mailer: git-send-email 2.41.0.rc0.172.g3f132b7071-goog Subject: [PATCH mm-unstable v2 09/10] kvm/x86: add kvm_arch_test_clear_young() From: Yu Zhao To: Andrew Morton , Paolo Bonzini X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Jason A. Donenfeld" , x86@kernel.org, Gavin Shan , kvm@vger.kernel.org, linux-doc@vger.kernel.org, Catalin Marinas , Dave Hansen , Peter Xu , linux-mm@kvack.org, Ben Gardon , Chao Peng , Will Deacon , Gaosheng Cui , Marc Zyngier , "H. Peter Anvin" , Yu Zhao , Jonathan Corbet , Alistair Popple , Jason Gunthorpe , Ingo Molnar , Zenghui Yu , linux-trace-kernel@vger.kernel.org, linux-mm@google.com, Thomas Huth , Suzuki K Poulose , Nicholas Piggin , Borislav Petkov , Steven Rostedt , kvmarm@lists.linux.dev, Thomas Gleixner , linux-arm-ke rnel@lists.infradead.org, Fabiano Rosas , Michael Larabel , Sean Christopherson , linux-kernel@vger.kernel.org, Oliver Upton , James Morse , Masami Hiramatsu , Anup Patel , linuxppc-dev@lists.ozlabs.org, Mike Rapoport Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Implement kvm_arch_test_clear_young() to support the fast path in mmu_notifier_ops->test_clear_young(). It focuses on a simple case, i.e., TDP MMU sets the accessed bit in KVM PTEs and VMs are not nested, where it can rely on RCU and clear_bit() to safely clear the accessed bit without taking kvm->mmu_lock. Complex cases fall back to the existing slow path where kvm->mmu_lock is then taken. Signed-off-by: Yu Zhao --- arch/x86/include/asm/kvm_host.h | 7 +++++++ arch/x86/kvm/mmu/tdp_mmu.c | 34 +++++++++++++++++++++++++++++++++ 2 files changed, 41 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 753c67072c47..d6dfdebe3d94 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -2223,4 +2223,11 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages); */ #define KVM_EXIT_HYPERCALL_MBZ GENMASK_ULL(31, 1) +#define kvm_arch_has_test_clear_young kvm_arch_has_test_clear_young +static inline bool kvm_arch_has_test_clear_young(void) +{ + return IS_ENABLED(CONFIG_X86_64) && + (!IS_REACHABLE(CONFIG_KVM) || (tdp_mmu_enabled && shadow_accessed_mask)); +} + #endif /* _ASM_X86_KVM_HOST_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 08340219c35a..6875a819e007 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1232,6 +1232,40 @@ bool kvm_tdp_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) return kvm_tdp_mmu_handle_gfn(kvm, range, test_age_gfn); } +bool kvm_arch_test_clear_young(struct kvm *kvm, struct kvm_gfn_range *range) +{ + struct kvm_mmu_page *root; + int offset = ffs(shadow_accessed_mask) - 1; + + if (kvm_shadow_root_allocated(kvm)) + return true; + + rcu_read_lock(); + + list_for_each_entry_rcu(root, &kvm->arch.tdp_mmu_roots, link) { + struct tdp_iter iter; + + if (kvm_mmu_page_as_id(root) != range->slot->as_id) + continue; + + tdp_root_for_each_leaf_pte(iter, root, range->start, range->end) { + u64 *sptep = rcu_dereference(iter.sptep); + + VM_WARN_ON_ONCE(!page_count(virt_to_page(sptep))); + + if (!(iter.old_spte & shadow_accessed_mask)) + continue; + + if (kvm_should_clear_young(range, iter.gfn)) + clear_bit(offset, (unsigned long *)sptep); + } + } + + rcu_read_unlock(); + + return false; +} + static bool set_spte_gfn(struct kvm *kvm, struct tdp_iter *iter, struct kvm_gfn_range *range) { From patchwork Fri May 26 23:44:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Zhao X-Patchwork-Id: 1786699 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=Mb/TwMrv; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QShZf2VL8z20Q1 for ; Sat, 27 May 2023 09:53:30 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4QShZf1Mwdz3fqk for ; Sat, 27 May 2023 09:53:30 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=Mb/TwMrv; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=flex--yuzhao.bounces.google.com (client-ip=2607:f8b0:4864:20::1149; helo=mail-yw1-x1149.google.com; envelope-from=3eurxzaykdpaqmrzsgyggydw.ugedafmphhu-vwndaklk.grdstk.gjy@flex--yuzhao.bounces.google.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.a=rsa-sha256 header.s=20221208 header.b=Mb/TwMrv; dkim-atps=neutral Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4QShNq4DRRz3fB0 for ; Sat, 27 May 2023 09:44:59 +1000 (AEST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-56561689700so30245477b3.2 for ; Fri, 26 May 2023 16:44:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1685144697; x=1687736697; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=/RAbaZ7nXlJx1O5X4p6tDmC38Mqs0IR+WFkcAEID/ew=; b=Mb/TwMrvjKVwx825EpuGBZelUYT9y1q5MsHWLZEtcnYl8b+S/FmG3PWbpa3sjl3UbU EMKVg4Z0PKkBIeb7T3roSS0cZqaFmagaFcdMijquvBPxR+8cgeB8lhD+sxDLIAW92iZI 96AhG517DtCX6M6wF+UEXQXmnvh+bt6+TMXBvO0EA6fZL0y2SwCLgjF2MAZnONkhWo1g K/7v7mGu5qj4MRpb1kQzn2rxfgDVCTzMnRuNoX97x4mDqSBErwlSjFlQrZ/jaGBbYaIg V4TtJk4J+bx9E13IjCR46uiaNAT2c4P7QhpLAYBCKMArI7L7xnoEXLbituEB7FbRo2Qm 2ytw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685144697; x=1687736697; h=cc:to:from:subject:references:mime-version:message-id:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=/RAbaZ7nXlJx1O5X4p6tDmC38Mqs0IR+WFkcAEID/ew=; b=TqN4Rj2CEuWaz2FObYFf4zLhjIWlEjDIrG9bekkd+DZVFTm5KfOEUgsXDih84ogwMP QBBR0Br6Hvk1y9J7Ci4PmN+ACJKCaqHVVh9p8P8UiPjhlrKztTqkngVFOlYi7PhgtqTy TDp7QOh5wPisbKCf8I7kNga8uKnQYV9npC2j6EegHObl8NsmpbIOxiHcNoDxm9hUoL0Z lwt9rLhGe0WoS075qUDrge9AlSutfWG7T+LHluOaoiAwaMUyV9KY3heWUZaq9XQYOrIF thpaj6n7HoMSm+7HAfxniFKlndp5Dp1k3eWMZiIVId9/+k0u6+1CPj+WnRl9xwyQ0fOh ppDQ== X-Gm-Message-State: AC+VfDx0xlPcEnSJo5eNTByyyTqfzr2O7Ezs2YEx21fLPLrFS1w7yWh2 e2pMysAKOD4MvCqAj6t5Y7354zu1wfE= X-Google-Smtp-Source: ACHHUZ5acKzCjiqXNcqUiyX95r/zGZ1H0IqWVk5xr51BxFwYKtaOrUj82KqB/feXRTG2Ot17Gvh9/gGxmKY= X-Received: from yuzhao.bld.corp.google.com ([2620:15c:183:200:910f:8a15:592b:2087]) (user=yuzhao job=sendgmr) by 2002:a81:c542:0:b0:561:1d3b:af3f with SMTP id o2-20020a81c542000000b005611d3baf3fmr2060709ywj.8.1685144697389; Fri, 26 May 2023 16:44:57 -0700 (PDT) Date: Fri, 26 May 2023 17:44:35 -0600 In-Reply-To: <20230526234435.662652-1-yuzhao@google.com> Message-Id: <20230526234435.662652-11-yuzhao@google.com> Mime-Version: 1.0 References: <20230526234435.662652-1-yuzhao@google.com> X-Mailer: git-send-email 2.41.0.rc0.172.g3f132b7071-goog Subject: [PATCH mm-unstable v2 10/10] mm: multi-gen LRU: use mmu_notifier_test_clear_young() From: Yu Zhao To: Andrew Morton , Paolo Bonzini X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Jason A. Donenfeld" , x86@kernel.org, Gavin Shan , kvm@vger.kernel.org, linux-doc@vger.kernel.org, Catalin Marinas , Dave Hansen , Peter Xu , linux-mm@kvack.org, Ben Gardon , Chao Peng , Will Deacon , Gaosheng Cui , Marc Zyngier , "H. Peter Anvin" , Yu Zhao , Jonathan Corbet , Alistair Popple , Jason Gunthorpe , Ingo Molnar , Zenghui Yu , linux-trace-kernel@vger.kernel.org, linux-mm@google.com, Thomas Huth , Suzuki K Poulose , Nicholas Piggin , Borislav Petkov , Steven Rostedt , kvmarm@lists.linux.dev, Thomas Gleixner , linux-arm-ke rnel@lists.infradead.org, Fabiano Rosas , Michael Larabel , Sean Christopherson , linux-kernel@vger.kernel.org, Oliver Upton , James Morse , Masami Hiramatsu , Anup Patel , linuxppc-dev@lists.ozlabs.org, Mike Rapoport Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Use mmu_notifier_test_clear_young() to handle KVM PTEs in batches when the fast path is supported. This reduces the contention on kvm->mmu_lock when the host is under heavy memory pressure. An existing selftest can quickly demonstrate the effectiveness of this patch. On a generic workstation equipped with 128 CPUs and 256GB DRAM: $ sudo max_guest_memory_test -c 64 -m 250 -s 250 MGLRU run2 ------------------ Before [1] ~64s After ~51s kswapd (MGLRU before) 100.00% balance_pgdat 100.00% shrink_node 100.00% shrink_one 99.99% try_to_shrink_lruvec 99.71% evict_folios 97.29% shrink_folio_list ==>> 13.05% folio_referenced 12.83% rmap_walk_file 12.31% folio_referenced_one 7.90% __mmu_notifier_clear_young 7.72% kvm_mmu_notifier_clear_young 7.34% _raw_write_lock kswapd (MGLRU after) 100.00% balance_pgdat 100.00% shrink_node 100.00% shrink_one 99.99% try_to_shrink_lruvec 99.59% evict_folios 80.37% shrink_folio_list ==>> 3.74% folio_referenced 3.59% rmap_walk_file 3.19% folio_referenced_one 2.53% lru_gen_look_around 1.06% __mmu_notifier_test_clear_young [1] "mm: rmap: Don't flush TLB after checking PTE young for page reference" was included so that the comparison is apples to apples. https://lore.kernel.org/r/20220706112041.3831-1-21cnbao@gmail.com/ Signed-off-by: Yu Zhao --- Documentation/admin-guide/mm/multigen_lru.rst | 6 +- include/linux/mmzone.h | 6 +- mm/rmap.c | 8 +- mm/vmscan.c | 139 ++++++++++++++++-- 4 files changed, 138 insertions(+), 21 deletions(-) diff --git a/Documentation/admin-guide/mm/multigen_lru.rst b/Documentation/admin-guide/mm/multigen_lru.rst index 33e068830497..0ae2a6d4d94c 100644 --- a/Documentation/admin-guide/mm/multigen_lru.rst +++ b/Documentation/admin-guide/mm/multigen_lru.rst @@ -48,6 +48,10 @@ Values Components verified on x86 varieties other than Intel and AMD. If it is disabled, the multi-gen LRU will suffer a negligible performance degradation. +0x0008 Clearing the accessed bit in KVM page table entries in large + batches, when KVM MMU sets it (e.g., on x86_64). This can + improve the performance of guests when the host is under memory + pressure. [yYnN] Apply to all the components above. ====== =============================================================== @@ -56,7 +60,7 @@ E.g., echo y >/sys/kernel/mm/lru_gen/enabled cat /sys/kernel/mm/lru_gen/enabled - 0x0007 + 0x000f echo 5 >/sys/kernel/mm/lru_gen/enabled cat /sys/kernel/mm/lru_gen/enabled 0x0005 diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 5a7ada0413da..1b148f39fabc 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -369,6 +369,7 @@ enum { LRU_GEN_CORE, LRU_GEN_MM_WALK, LRU_GEN_NONLEAF_YOUNG, + LRU_GEN_KVM_MMU_WALK, NR_LRU_GEN_CAPS }; @@ -471,7 +472,7 @@ struct lru_gen_mm_walk { }; void lru_gen_init_lruvec(struct lruvec *lruvec); -void lru_gen_look_around(struct page_vma_mapped_walk *pvmw); +bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw); #ifdef CONFIG_MEMCG @@ -559,8 +560,9 @@ static inline void lru_gen_init_lruvec(struct lruvec *lruvec) { } -static inline void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) +static inline bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw) { + return false; } #ifdef CONFIG_MEMCG diff --git a/mm/rmap.c b/mm/rmap.c index ae127f60a4fb..3a2c4e6a0887 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -825,12 +825,10 @@ static bool folio_referenced_one(struct folio *folio, return false; /* To break the loop */ } - if (pvmw.pte) { - if (lru_gen_enabled() && pte_young(*pvmw.pte)) { - lru_gen_look_around(&pvmw); + if (lru_gen_enabled() && pvmw.pte) { + if (lru_gen_look_around(&pvmw)) referenced++; - } - + } else if (pvmw.pte) { if (ptep_clear_flush_young_notify(vma, address, pvmw.pte)) referenced++; diff --git a/mm/vmscan.c b/mm/vmscan.c index ef687f9be13c..3f734588b600 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -58,6 +58,7 @@ #include #include #include +#include #include #include @@ -3244,6 +3245,20 @@ static bool should_clear_pmd_young(void) return arch_has_hw_nonleaf_pmd_young() && get_cap(LRU_GEN_NONLEAF_YOUNG); } +#if IS_ENABLED(CONFIG_KVM) +#include + +static bool should_walk_kvm_mmu(void) +{ + return kvm_arch_has_test_clear_young() && get_cap(LRU_GEN_KVM_MMU_WALK); +} +#else +static bool should_walk_kvm_mmu(void) +{ + return false; +} +#endif + /****************************************************************************** * shorthand helpers ******************************************************************************/ @@ -3982,6 +3997,55 @@ static struct folio *get_pfn_folio(unsigned long pfn, struct mem_cgroup *memcg, return folio; } +static bool test_spte_young(struct mm_struct *mm, unsigned long addr, unsigned long end, + unsigned long *bitmap, unsigned long *last) +{ + if (!should_walk_kvm_mmu()) + return false; + + if (*last > addr) + goto done; + + *last = end - addr > MIN_LRU_BATCH * PAGE_SIZE ? + addr + MIN_LRU_BATCH * PAGE_SIZE - 1 : end - 1; + bitmap_zero(bitmap, MIN_LRU_BATCH); + + mmu_notifier_test_clear_young(mm, addr, *last + 1, false, bitmap); +done: + return test_bit((*last - addr) / PAGE_SIZE, bitmap); +} + +static void clear_spte_young(struct mm_struct *mm, unsigned long addr, + unsigned long *bitmap, unsigned long *last) +{ + int i; + unsigned long start, end = *last + 1; + + if (addr + PAGE_SIZE != end) + return; + + i = find_last_bit(bitmap, MIN_LRU_BATCH); + if (i == MIN_LRU_BATCH) + return; + + start = end - (i + 1) * PAGE_SIZE; + + i = find_first_bit(bitmap, MIN_LRU_BATCH); + + end -= i * PAGE_SIZE; + + mmu_notifier_test_clear_young(mm, start, end, true, bitmap); +} + +static void skip_spte_young(struct mm_struct *mm, unsigned long addr, + unsigned long *bitmap, unsigned long *last) +{ + if (*last > addr) + __clear_bit((*last - addr) / PAGE_SIZE, bitmap); + + clear_spte_young(mm, addr, bitmap, last); +} + static bool suitable_to_scan(int total, int young) { int n = clamp_t(int, cache_line_size() / sizeof(pte_t), 2, 8); @@ -3997,6 +4061,8 @@ static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end, pte_t *pte; spinlock_t *ptl; unsigned long addr; + DECLARE_BITMAP(bitmap, MIN_LRU_BATCH); + unsigned long last = 0; int total = 0; int young = 0; struct lru_gen_mm_walk *walk = args->private; @@ -4015,6 +4081,7 @@ static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end, pte = pte_offset_map(pmd, start & PMD_MASK); restart: for (i = pte_index(start), addr = start; addr != end; i++, addr += PAGE_SIZE) { + bool ret; unsigned long pfn; struct folio *folio; @@ -4022,20 +4089,27 @@ static bool walk_pte_range(pmd_t *pmd, unsigned long start, unsigned long end, walk->mm_stats[MM_LEAF_TOTAL]++; pfn = get_pte_pfn(pte[i], args->vma, addr); - if (pfn == -1) + if (pfn == -1) { + skip_spte_young(args->vma->vm_mm, addr, bitmap, &last); continue; + } - if (!pte_young(pte[i])) { + ret = test_spte_young(args->vma->vm_mm, addr, end, bitmap, &last); + if (!ret && !pte_young(pte[i])) { + skip_spte_young(args->vma->vm_mm, addr, bitmap, &last); walk->mm_stats[MM_LEAF_OLD]++; continue; } folio = get_pfn_folio(pfn, memcg, pgdat, walk->can_swap); - if (!folio) + if (!folio) { + skip_spte_young(args->vma->vm_mm, addr, bitmap, &last); continue; + } - if (!ptep_test_and_clear_young(args->vma, addr, pte + i)) - VM_WARN_ON_ONCE(true); + clear_spte_young(args->vma->vm_mm, addr, bitmap, &last); + if (pte_young(pte[i])) + ptep_test_and_clear_young(args->vma, addr, pte + i); young++; walk->mm_stats[MM_LEAF_YOUNG]++; @@ -4629,6 +4703,23 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) * rmap/PT walk feedback ******************************************************************************/ +static bool should_look_around(struct vm_area_struct *vma, unsigned long addr, + pte_t *pte, int *young) +{ + int ret = mmu_notifier_clear_young(vma->vm_mm, addr, addr + PAGE_SIZE); + + if (pte_young(*pte)) { + ptep_test_and_clear_young(vma, addr, pte); + *young = true; + return true; + } + + if (ret) + *young = true; + + return ret & MMU_NOTIFIER_RANGE_LOCKLESS; +} + /* * This function exploits spatial locality when shrink_folio_list() walks the * rmap. It scans the adjacent PTEs of a young PTE and promotes hot pages. If @@ -4636,12 +4727,14 @@ static void lru_gen_age_node(struct pglist_data *pgdat, struct scan_control *sc) * the PTE table to the Bloom filter. This forms a feedback loop between the * eviction and the aging. */ -void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) +bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw) { int i; unsigned long start; unsigned long end; struct lru_gen_mm_walk *walk; + DECLARE_BITMAP(bitmap, MIN_LRU_BATCH); + unsigned long last = 0; int young = 0; pte_t *pte = pvmw->pte; unsigned long addr = pvmw->address; @@ -4655,8 +4748,11 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) lockdep_assert_held(pvmw->ptl); VM_WARN_ON_ONCE_FOLIO(folio_test_lru(folio), folio); + if (!should_look_around(pvmw->vma, addr, pte, &young)) + return young; + if (spin_is_contended(pvmw->ptl)) - return; + return young; /* avoid taking the LRU lock under the PTL when possible */ walk = current->reclaim_state ? current->reclaim_state->mm_walk : NULL; @@ -4664,6 +4760,9 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) start = max(addr & PMD_MASK, pvmw->vma->vm_start); end = min(addr | ~PMD_MASK, pvmw->vma->vm_end - 1) + 1; + if (end - start == PAGE_SIZE) + return young; + if (end - start > MIN_LRU_BATCH * PAGE_SIZE) { if (addr - start < MIN_LRU_BATCH * PAGE_SIZE / 2) end = start + MIN_LRU_BATCH * PAGE_SIZE; @@ -4677,28 +4776,37 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) /* folio_update_gen() requires stable folio_memcg() */ if (!mem_cgroup_trylock_pages(memcg)) - return; + return young; arch_enter_lazy_mmu_mode(); pte -= (addr - start) / PAGE_SIZE; for (i = 0, addr = start; addr != end; i++, addr += PAGE_SIZE) { + bool ret; unsigned long pfn; pfn = get_pte_pfn(pte[i], pvmw->vma, addr); - if (pfn == -1) + if (pfn == -1) { + skip_spte_young(pvmw->vma->vm_mm, addr, bitmap, &last); continue; + } - if (!pte_young(pte[i])) + ret = test_spte_young(pvmw->vma->vm_mm, addr, end, bitmap, &last); + if (!ret && !pte_young(pte[i])) { + skip_spte_young(pvmw->vma->vm_mm, addr, bitmap, &last); continue; + } folio = get_pfn_folio(pfn, memcg, pgdat, !walk || walk->can_swap); - if (!folio) + if (!folio) { + skip_spte_young(pvmw->vma->vm_mm, addr, bitmap, &last); continue; + } - if (!ptep_test_and_clear_young(pvmw->vma, addr, pte + i)) - VM_WARN_ON_ONCE(true); + clear_spte_young(pvmw->vma->vm_mm, addr, bitmap, &last); + if (pte_young(pte[i])) + ptep_test_and_clear_young(pvmw->vma, addr, pte + i); young++; @@ -4728,6 +4836,8 @@ void lru_gen_look_around(struct page_vma_mapped_walk *pvmw) /* feedback from rmap walkers to page table walkers */ if (suitable_to_scan(i, young)) update_bloom_filter(lruvec, max_seq, pvmw->pmd); + + return young; } /****************************************************************************** @@ -5745,6 +5855,9 @@ static ssize_t enabled_show(struct kobject *kobj, struct kobj_attribute *attr, c if (should_clear_pmd_young()) caps |= BIT(LRU_GEN_NONLEAF_YOUNG); + if (should_walk_kvm_mmu()) + caps |= BIT(LRU_GEN_KVM_MMU_WALK); + return sysfs_emit(buf, "0x%04x\n", caps); }