From patchwork Thu Oct 10 18:23:43 2024
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 1995823
Date: Thu, 10 Oct 2024 11:23:43 -0700
In-Reply-To: <20241010182427.1434605-1-seanjc@google.com>
References: <20241010182427.1434605-1-seanjc@google.com>
Message-ID: <20241010182427.1434605-42-seanjc@google.com>
Subject: [PATCH v13 41/85] KVM: x86/mmu: Mark pages/folios dirty at the
 origin of make_spte()
From: Sean Christopherson
To: Paolo Bonzini, Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao,
 Huacai Chen, Michael Ellerman, Anup Patel, Paul Walmsley, Palmer Dabbelt,
 Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda,
 Sean Christopherson
Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.linux.dev, loongarch@lists.linux.dev,
 linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org, Alex Bennée, Yan Zhao, David Matlack,
 David Stevens, Andrew Jones
Reply-To: Sean Christopherson

Move the marking of folios dirty from make_spte() out to its callers,
which have access to the _struct page_, not just the underlying pfn.
Once all architectures follow suit, this will allow removing KVM's ugly
hack where KVM elevates the refcount of VM_MIXEDMAP pfns that happen to
be struct page memory.

Tested-by: Alex Bennée
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/mmu/mmu.c         | 30 ++++++++++++++++++++++++++++--
 arch/x86/kvm/mmu/paging_tmpl.h |  5 +++++
 arch/x86/kvm/mmu/spte.c        | 11 -----------
 3 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 31a6ae41a6f4..f730870887dd 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2964,7 +2964,17 @@ static bool kvm_mmu_prefetch_sptes(struct kvm_vcpu *vcpu, gfn_t gfn, u64 *sptep,
 	for (i = 0; i < nr_pages; i++, gfn++, sptep++) {
 		mmu_set_spte(vcpu, slot, sptep, access, gfn,
 			     page_to_pfn(pages[i]), NULL);
-		kvm_release_page_clean(pages[i]);
+
+		/*
+		 * KVM always prefetches writable pages from the primary MMU,
+		 * and KVM can make its SPTE writable in the fast page handler,
+		 * without notifying the primary MMU. Mark pages/folios dirty
+		 * now to ensure file data is written back if it ends up being
+		 * written by the guest. Because KVM's prefetching GUPs
+		 * writable PTEs, the probability of unnecessary writeback is
+		 * extremely low.
+		 */
+		kvm_release_page_dirty(pages[i]);
 	}
 
 	return true;
@@ -4360,7 +4370,23 @@ static u8 kvm_max_private_mapping_level(struct kvm *kvm, kvm_pfn_t pfn,
 static void kvm_mmu_finish_page_fault(struct kvm_vcpu *vcpu,
 				      struct kvm_page_fault *fault, int r)
 {
-	kvm_release_pfn_clean(fault->pfn);
+	lockdep_assert_once(lockdep_is_held(&vcpu->kvm->mmu_lock) ||
+			    r == RET_PF_RETRY);
+
+	/*
+	 * If the page that KVM got from the *primary MMU* is writable, and KVM
+	 * installed or reused a SPTE, mark the page/folio dirty. Note, this
+	 * may mark a folio dirty even if KVM created a read-only SPTE, e.g. if
+	 * the GFN is write-protected. Folios can't be safely marked dirty
+	 * outside of mmu_lock as doing so could race with writeback on the
+	 * folio. As a result, KVM can't mark folios dirty in the fast page
+	 * fault handler, and so KVM must (somewhat) speculatively mark the
+	 * folio dirty if KVM could locklessly make the SPTE writable.
+	 */
+	if (!fault->map_writable || r == RET_PF_RETRY)
+		kvm_release_pfn_clean(fault->pfn);
+	else
+		kvm_release_pfn_dirty(fault->pfn);
 }
 
 static int kvm_mmu_faultin_pfn_private(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 35d0c3f1a789..f4711674c47b 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -954,6 +954,11 @@ static int FNAME(sync_spte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, int
 		  spte_to_pfn(spte), spte, true, true,
 		  host_writable, &spte);
 
+	/*
+	 * There is no need to mark the pfn dirty, as the new protections must
+	 * be a subset of the old protections, i.e. synchronizing a SPTE cannot
+	 * change the SPTE from read-only to writable.
+	 */
 	return mmu_spte_update(sptep, spte);
 }
 
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 8e8d6ee79c8b..f1a50a78badb 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -277,17 +277,6 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 		mark_page_dirty_in_slot(vcpu->kvm, slot, gfn);
 	}
 
-	/*
-	 * If the page that KVM got from the primary MMU is writable, i.e. if
-	 * it's host-writable, mark the page/folio dirty. As alluded to above,
-	 * folios can't be safely marked dirty in the fast page fault handler,
-	 * and so KVM must (somewhat) speculatively mark the folio dirty even
-	 * though it isn't guaranteed to be written as KVM won't mark the folio
-	 * dirty if/when the SPTE is made writable.
-	 */
-	if (host_writable)
-		kvm_set_pfn_dirty(pfn);
-
 	*new_spte = spte;
 	return wrprot;
 }
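
As a reading aid for the kvm_mmu_finish_page_fault() hunk, the clean-vs-dirty
release decision can be modeled in a few lines of standalone C. This is only a
sketch, not kernel code: model_ret, MODEL_RET_PF_RETRY, MODEL_RET_PF_FIXED,
model_release_dirty() and model_self_check() are hypothetical names invented
for illustration, standing in for fault->map_writable and the RET_PF_* return
values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical two-value stand-in for KVM's RET_PF_* codes (illustration only). */
enum model_ret {
	MODEL_RET_PF_RETRY,	/* the fault is being retried */
	MODEL_RET_PF_FIXED,	/* stands for any non-retry result */
};

/*
 * Mirrors the condition in the patch: release clean if the mapping obtained
 * from the primary MMU was not writable or the fault is being retried,
 * otherwise release dirty. Returns true for "release dirty".
 */
bool model_release_dirty(bool map_writable, enum model_ret r)
{
	return !(!map_writable || r == MODEL_RET_PF_RETRY);
}

/* Walks the full truth table of the model. */
void model_self_check(void)
{
	assert(!model_release_dirty(false, MODEL_RET_PF_RETRY));
	assert(!model_release_dirty(false, MODEL_RET_PF_FIXED));
	assert(!model_release_dirty(true, MODEL_RET_PF_RETRY));
	assert(model_release_dirty(true, MODEL_RET_PF_FIXED));
}
```

Only the last row, a writable host mapping combined with a non-retry result,
releases the page dirty. That is the "(somewhat) speculative" case from the
comment in the patch, since the SPTE that was installed may still be read-only.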