From patchwork Tue Jun 2 02:08:16 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Marcelo Henrique Cerri
X-Patchwork-Id: 1302029
From: Marcelo Henrique Cerri
To: kernel-team@lists.ubuntu.com
Subject: [focal:linux-azure][PATCH 20/21] x86/hyperv: Suspend/resume the VP assist page for hibernation
Date: Mon, 1 Jun 2020 23:08:16 -0300
Message-Id: <20200602020817.236422-21-marcelo.cerri@canonical.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200602020817.236422-1-marcelo.cerri@canonical.com>
References: <20200602020817.236422-1-marcelo.cerri@canonical.com>
List-Id: Kernel team discussions

From: Dexuan Cui

BugLink: http://bugs.launchpad.net/bugs/1880032

Unlike the other CPUs, CPU0 is never offlined during hibernation, so in
the resume path, the "new" kernel's VP
assist page is not suspended (i.e. not disabled), and later when we jump
to the "old" kernel, the page is not properly re-enabled for CPU0 with
the allocated page from the old kernel.

So far, the VP assist page is used by hv_apic_eoi_write(), and is also
used in the case of nested virtualization (running KVM atop Hyper-V).

For hv_apic_eoi_write(), when the page is not properly re-enabled,
hvp->apic_assist is always 0, so the HV_X64_MSR_EOI MSR is always
written. This is not ideal with respect to performance, but Hyper-V can
still correctly handle this according to the Hyper-V spec; nevertheless,
Linux still must update the Hyper-V hypervisor with the correct VP
assist page to prevent Hyper-V from writing to the stale page, which
causes guest memory corruption and consequently may have caused the
hangs and triple faults seen during non-boot CPUs resume.

Fix the issue by calling hv_cpu_die()/hv_cpu_init() in the syscore ops.
Without the fix, hibernation can fail at a rate of 1/300 ~ 1/500.
With the fix, hibernation can pass a long-haul test of 2000 runs.

In the case of nested virtualization, disabling/reenabling the assist
page upon hibernation may be unsafe if there are active L2 guests.
It looks like KVM should be enhanced to abort the hibernation request
if there is any active L2 guest.

Fixes: 05bd330a7fd8 ("x86/hyperv: Suspend/resume the hypercall page for hibernation")
Cc: stable@vger.kernel.org
Signed-off-by: Dexuan Cui
Link: https://lore.kernel.org/r/1587437171-2472-1-git-send-email-decui@microsoft.com
Signed-off-by: Wei Liu
(cherry picked from commit 421f090c819d695942a470051cd624dc43deaf95)
Signed-off-by: Marcelo Henrique Cerri
---
 arch/x86/hyperv/hv_init.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index ce15f4228fde..c14469fe93d2 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -73,7 +73,8 @@ static int hv_cpu_init(unsigned int cpu)
 	struct page *pg;
 
 	input_arg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
-	pg = alloc_page(GFP_KERNEL);
+	/* hv_cpu_init() can be called with IRQs disabled from hv_resume() */
+	pg = alloc_page(irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL);
 	if (unlikely(!pg))
 		return -ENOMEM;
 	*input_arg = page_address(pg);
@@ -254,6 +255,7 @@ static int __init hv_pci_init(void)
 static int hv_suspend(void)
 {
 	union hv_x64_msr_hypercall_contents hypercall_msr;
+	int ret;
 
 	/*
 	 * Reset the hypercall page as it is going to be invalidated
@@ -270,12 +272,17 @@ static int hv_suspend(void)
 	hypercall_msr.enable = 0;
 	wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
 
-	return 0;
+	ret = hv_cpu_die(0);
+	return ret;
 }
 
 static void hv_resume(void)
 {
 	union hv_x64_msr_hypercall_contents hypercall_msr;
+	int ret;
+
+	ret = hv_cpu_init(0);
+	WARN_ON(ret);
 
 	/* Re-enable the hypercall page */
 	rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
@@ -288,6 +295,7 @@ static void hv_resume(void)
 	hv_hypercall_pg_saved = NULL;
 }
 
+/* Note: when the ops are called, only CPU0 is online and IRQs are disabled. */
 static struct syscore_ops hv_syscore_ops = {
 	.suspend = hv_suspend,
 	.resume = hv_resume,
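
Illustrative note, not part of the patch: the commit message describes
two effects of a stale VP assist page on CPU0. The guest never sees the
"no EOI required" hint, so it always writes HV_X64_MSR_EOI, and the
hypervisor keeps writing into a page the guest no longer owns. The
userspace sketch below models that behaviour under simplifying
assumptions. Every name in it (model_assist_page, model_vcpu,
model_hv_signal_no_eoi, model_eoi_write) is hypothetical and exists only
for this illustration; it is not the kernel's hv_apic_eoi_write(), and
the MSR write is replaced by a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: a stand-in for the per-VP assist page. */
struct model_assist_page {
	uint32_t apic_assist;	/* bit 0: hypervisor says "no EOI required" */
};

/* Hypothetical view of one virtual processor. */
struct model_vcpu {
	struct model_assist_page *guest_page;	/* page the guest reads */
	struct model_assist_page *hv_page;	/* page the hypervisor writes */
	unsigned int eoi_msr_writes;		/* counts simulated MSR writes */
};

/* Hypervisor side: set the hint in whatever page it was last given. */
static void model_hv_signal_no_eoi(struct model_vcpu *vp)
{
	assert(vp != NULL);
	if (vp->hv_page)
		vp->hv_page->apic_assist |= 1u;
}

/*
 * Guest side: consume the hint if present, otherwise fall back to the
 * (simulated) EOI MSR write. Returns true when the write was skipped.
 */
static bool model_eoi_write(struct model_vcpu *vp)
{
	assert(vp != NULL);
	struct model_assist_page *page = vp->guest_page;

	if (page) {
		uint32_t old = page->apic_assist;

		page->apic_assist = 0;	/* read-and-clear, like an exchange */
		if (old & 1u)
			return true;
	}
	vp->eoi_msr_writes++;
	return false;
}
```

When guest_page and hv_page point at the same page, the hint is seen and
the simulated MSR write is skipped. When hv_page still points at a
different (stale) page, the guest always takes the slow path and the
stale page is modified behind the guest's back, which is the situation
the patch avoids by calling hv_cpu_die()/hv_cpu_init() for CPU0 in the
syscore ops.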