From patchwork Wed Dec 19 10:03:19 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Juerg Haefliger
X-Patchwork-Id: 1015961
From: Juerg Haefliger
To: kernel-team@lists.ubuntu.com
Subject: [SRU][Xenial][PULL] Guests using IBRS incur a large performance penalty (LP: #1764956)
Date: Wed, 19 Dec 2018 11:03:19 +0100
Message-Id: <20181219100319.5191-1-juergh@canonical.com>
In-Reply-To: <1542895772-7867-1-git-send-email-gavin.guo@canonical.com>
References: <1542895772-7867-1-git-send-email-gavin.guo@canonical.com>
List-Id: Kernel team discussions

[Impact]
IBRS is mistakenly left enabled in the host when switching from an IBRS-enabled VM, and that causes a
performance overhead in the host. Conversely, IBRS can be mistakenly disabled in the VM when context-switching from the host. This could be considered a CVE.

[Fix]
The patch fixes the logic inside x86_virt_spec_ctrl(): it now checks ibrs_enabled and ORs hostval with SPEC_CTRL_IBRS, because x86_spec_ctrl_base is zero by default. The upstream implementation differs from Xenial's: upstream doesn't use IBRS as the formal fix, so the base value is zero by default.

In addition, after VM exit the SPEC_CTRL value needs to be saved manually by reading the SPEC_CTRL MSR, because the MSR intercept is disabled by default in hardware_setup() (v4.4) and vmx_init() (v3.13). Guest accesses to the SPEC_CTRL MSR are direct and don't trigger a trap, so vmx_set_msr() isn't called.

The v3.13 kernel hasn't been tested. However, the patch can be viewed at:
http://kernel.ubuntu.com/git/gavinguo/ubuntu-trusty-amd64.git/log/?h=sf00191076-sru

The v4.4 patch:
http://kernel.ubuntu.com/git/gavinguo/ubuntu-xenial.git/log/?h=sf00191076-spectre-v2-regres-backport-juerg

[Test]
The patch has been tested on 4.4.0-140.166 and works fine.

The reproducing environment:
Guest kernel version: 4.4.0-138.164
Host kernel version: 4.4.0-140.166

The cases below are written as (host IBRS, guest IBRS).

- 1). (0, 1).

The case can be reproduced with the following instructions:

guest$ echo 1 | sudo tee /proc/sys/kernel/ibrs_enabled
1

host$ cat /proc/sys/kernel/ibrs_enabled
0

host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111000000000000000000010010100000000000000000

On some CPUs the IBRS bit in the SPEC_CTRL MSR is mistakenly enabled, even though IBRS is disabled in the host.

host$ taskset -c 5 stress-ng -c 1 --cpu-ops 2500
stress-ng: info: [11264] defaulting to a 86400 second run per stressor
stress-ng: info: [11264] dispatching hogs: 1 cpu
stress-ng: info: [11264] cache allocate: default cache size: 35840K
stress-ng: info: [11264] successful run completed in 33.48s

The host kernel doesn't notice that the IBRS bit is enabled, so the situation is the same as "echo 2 > /proc/sys/kernel/ibrs_enabled" in the host. The stress-ng run is a pure userspace CPU workload, so performance drops to about 1/3: without IBRS enabled, the same run takes about 10s.

- 2). (1, 1): disabling IBRS in the host should give (0, 1), but it actually becomes (0, 0). The guest IBRS has been mistakenly disabled.

guest$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111111111111111111111111111111111111111111111

host$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
11111111111111111111111111111111111111111111111111111111

host$ echo 0 | sudo tee /proc/sys/kernel/ibrs_enabled
host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
00000000000000000000000000000000000000000000000000000000

guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
00000000000000000000000000000000000000000000000000000000

[juergh: MSR-isolation between guests and the host is incomplete in Xenial. This PR is supposed to fix this and bring Xenial up to par with stable v4.9.]

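For illustration only, the intended behaviour described in [Fix] can be sketched as a small userspace model. This is not the kernel code from the series: struct cpu_model, struct vcpu_model, host_spec_ctrl(), vm_enter(), guest_wrmsr_spec_ctrl() and vm_exit() are hypothetical names, and the ibrs_enabled sysctl is reduced to a bool (an assumption that ignores the difference between the values 1 and 2). Only SPEC_CTRL_IBRS, x86_spec_ctrl_base and hostval mirror names used above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IBRS is bit 0 of IA32_SPEC_CTRL (MSR 0x48). */
#define SPEC_CTRL_IBRS (UINT64_C(1) << 0)

/* Hypothetical model of one physical CPU: only its SPEC_CTRL MSR. */
struct cpu_model {
    uint64_t msr_spec_ctrl;
};

/* Hypothetical model of a vCPU: the guest's saved SPEC_CTRL value. */
struct vcpu_model {
    uint64_t spec_ctrl;
};

/*
 * Host value to restore after VM exit. x86_spec_ctrl_base is zero by
 * default, so IBRS has to be ORed in when the host has IBRS enabled.
 * Assumption: ibrs_enabled is simplified to a bool in this sketch.
 */
static uint64_t host_spec_ctrl(uint64_t x86_spec_ctrl_base, bool ibrs_enabled)
{
    uint64_t hostval = x86_spec_ctrl_base;

    if (ibrs_enabled)
        hostval |= SPEC_CTRL_IBRS;
    return hostval;
}

/* VM entry: load the guest's saved value into the MSR. */
static void vm_enter(struct cpu_model *cpu, const struct vcpu_model *vcpu)
{
    assert(cpu != NULL && vcpu != NULL);
    cpu->msr_spec_ctrl = vcpu->spec_ctrl;
}

/* Guest write: goes straight to the MSR, no trap and no vmx_set_msr(). */
static void guest_wrmsr_spec_ctrl(struct cpu_model *cpu, uint64_t val)
{
    assert(cpu != NULL);
    cpu->msr_spec_ctrl = val;
}

/*
 * VM exit: the guest may have changed the MSR without the host noticing,
 * so save it by reading the MSR, then restore the host value.
 */
static void vm_exit(struct cpu_model *cpu, struct vcpu_model *vcpu,
                    uint64_t x86_spec_ctrl_base, bool ibrs_enabled)
{
    assert(cpu != NULL && vcpu != NULL);
    vcpu->spec_ctrl = cpu->msr_spec_ctrl;
    cpu->msr_spec_ctrl = host_spec_ctrl(x86_spec_ctrl_base, ibrs_enabled);
}
```

In this model, case 1) corresponds to a guest write of SPEC_CTRL_IBRS followed by vm_exit() with ibrs_enabled false: the MSR returns to 0 for the host. Case 2) corresponds to a later vm_enter(): the guest gets its saved SPEC_CTRL_IBRS back regardless of the host setting.
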
Signed-off-by: Juerg Haefliger
Acked-by: Stefan Bader
---

The following changes since commit d0b9a387cf1d68745c558d04fd3aa980497d1529:

  UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW hunk (2018-12-13 13:03:55 +0100)

are available in the Git repository at:

  git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2

for you to fetch changes up to 7ad0e9a99c1466f8fee92cba5ffeaa0af83f6622:

  UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT (2018-12-19 10:58:24 +0100)

----------------------------------------------------------------
Ashok Raj (1):
      KVM/x86: Add IBPB support

David Matlack (1):
      KVM: nVMX: mark vmcs12 pages dirty on L2 exit

Jim Mattson (5):
      kvm: nVMX: VMCLEAR an active shadow VMCS after last use
      kvm: vmx: Scrub hardware GPRs at VM-exit
      KVM: nVMX: Eliminate vmcs02 pool
      kvm: x86: IA32_ARCH_CAPABILITIES is always supported
      kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb

Juerg Haefliger (4):
      UBUNTU: SAUCE: [Fix] KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
      UBUNTU: SAUCE: [Fix] x86/KVM/VMX: Add L1D flush logic
      UBUNTU: SAUCE: KVM: Move code fragments, cleanup and re-indent
      UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT

KarimAllah Ahmed (3):
      KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
      KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
      X86/nVMX: Properly set spec_ctrl and pred_cmd before merging MSRs

Paolo Bonzini (5):
      KVM: VMX: introduce alloc_loaded_vmcs
      KVM: VMX: make MSR bitmaps per-VCPU
      KVM/x86: Remove indirect MSR op calls from SPEC_CTRL
      KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely()
      KVM: VMX: fixes for vmentry_l1d_flush module parameter

Radim Krčmář (1):
      KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC

Thomas Gleixner (2):
      KVM: SVM: Move spec control call after restore of GS
      KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled

Tom Lendacky (1):
      KVM: SVM: Add MSR-based feature support for serializing LFENCE

Wanpeng Li (1):
      KVM: X86: Allow userspace to define the microcode version

 arch/x86/include/asm/kvm_host.h |   1 +
 arch/x86/kernel/cpu/bugs.c      |   4 +
 arch/x86/kvm/cpuid.c            |  25 +-
 arch/x86/kvm/cpuid.h            |  74 ++--
 arch/x86/kvm/svm.c              | 209 +++++++++--
 arch/x86/kvm/vmx.c              | 777 ++++++++++++++++++++++------------------
 arch/x86/kvm/x86.c              |  12 +-
 7 files changed, 691 insertions(+), 411 deletions(-)