Message ID | 20181219100319.5191-1-juergh@canonical.com |
---|---|
State | New |
Headers | show |
Series | [SRU,Xenial,PULL] Guests using IBRS incur a large performance penalty (LP: #1764956) | expand |
On 19.12.18 11:03, Juerg Haefliger wrote: > [Impact] > the IBRS would be mistakenly enabled in the host when the switching > from an IBRS-enabled VM and that causes the performance overhead in > the host. The other condition could also mistakenly disables the IBRS > in VM when context-switching from the host. And this could be > considered a CVE host. > > [Fix] > The patch fixes the logic inside the x86_virt_spec_ctrl that it checks > the ibrs_enabled and _or_ the hostval with the SPEC_CTRL_IBRS as the > x86_spec_ctrl_base by default is zero. Because the upstream > implementation is not equal to the Xenial's implementation. Upstream > doesn't use the IBRS as the formal fix. So, by default, it's zero. > > On the other hand, after the VM exit, the SPEC_CTRL register also > needs to be saved manually by reading the SPEC_CTRL MSR as the MSR > intercept is disabled by default in the hardware_setup(v4.4) and > vmx_init(v3.13). The access to SPEC_CTRL MSR in VM is direct and > doesn't trigger a trap. So, the vmx_set_msr() function isn't called. > > The v3.13 kernel hasn't been tested. However, the patch can be viewed > at: > http://kernel.ubuntu.com/git/gavinguo/ubuntu-trusty-amd64.git/log/?h=sf00191076-sru > > The v4.4 patch: > http://kernel.ubuntu.com/git/gavinguo/ubuntu-xenial.git/log/?h=sf00191076-spectre-v2-regres-backport-juerg > > [Test] > > The patch has been tested on the 4.4.0-140.166 and works fine. > > The reproducing environment: > Guest kernel version: 4.4.0-138.164 > Host kernel version: 4.4.0-140.166 > > (host IBRS, guest IBRS) > > - 1). (0, 1). > The case can be reproduced by the following instructions: > guest$ echo 1 | sudo tee /proc/sys/kernel/ibrs_enabled > 1 > > <Several minutes later...> > > host$ cat /proc/sys/kernel/ibrs_enabled > 0 > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > 11111111111111000000000000000000010010100000000000000000 > > Some of the IBRS bit inside the SPEC_CTRL MSR are mistakenly > enabled. > > host$ taskset -c 5 stress-ng -c 1 --cpu-ops 2500 > stress-ng: info: [11264] defaulting to a 86400 second run per stressor > stress-ng: info: [11264] dispatching hogs: 1 cpu > stress-ng: info: [11264] cache allocate: default cache size: 35840K > stress-ng: info: [11264] successful run completed in 33.48s > > The host kernel didn't notice the IBRS bit is enabled. So, the situation > is the same as "echo 2 > /proc/sys/kernel/ibrs_enabled" in the host. > And running the stress-ng is a pure userspace CPU capability > calculation. So, the performance downgrades to about 1/3. Without the > IBRS enabled, it needs about 10s. > > - 2). (1, 1) disables IBRS in host -> (0, 1) actually it becomes (0, 0). > The guest IBRS has been mistakenly disabled. > > guest$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled > guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > 11111111111111111111111111111111111111111111111111111111 > > host$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > 11111111111111111111111111111111111111111111111111111111 > host$ echo 0 | sudo tee /proc/sys/kernel/ibrs_enabled > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > 00000000000000000000000000000000000000000000000000000000 > > guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > 00000000000000000000000000000000000000000000000000000000 > > > [juergh: MSR-isolation between guests and the host is incomplete in > Xenial. This PR is supposed to fix this and bring Xenial up to par with > stable v4.9.] > > Signed-off-by: Juerg Haefliger <juergh@canonical.com> Just for sanity checking: this pull request (hate hate hate) replaces the submitted patch, right? -Stefan > --- > > The following changes since commit d0b9a387cf1d68745c558d04fd3aa980497d1529: > > UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW hunk (2018-12-13 13:03:55 +0100) > > are available in the Git repository at: > > git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2 > > for you to fetch changes up to 7ad0e9a99c1466f8fee92cba5ffeaa0af83f6622: > > UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT (2018-12-19 10:58:24 +0100) > > ---------------------------------------------------------------- > Ashok Raj (1): > KVM/x86: Add IBPB support > > David Matlack (1): > KVM: nVMX: mark vmcs12 pages dirty on L2 exit > > Jim Mattson (5): > kvm: nVMX: VMCLEAR an active shadow VMCS after last use > kvm: vmx: Scrub hardware GPRs at VM-exit > KVM: nVMX: Eliminate vmcs02 pool > kvm: x86: IA32_ARCH_CAPABILITIES is always supported > kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb > > Juerg Haefliger (4): > UBUNTU: SAUCE: [Fix] KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD > UBUNTU: SAUCE: [Fix] x86/KVM/VMX: Add L1D flush logic > UBUNTU: SAUCE: KVM: Move code fragments, cleanup and re-indent > UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT > > KarimAllah Ahmed (3): > KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL > KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL > X86/nVMX: Properly set spec_ctrl and pred_cmd before merging MSRs > > Paolo Bonzini (5): > KVM: VMX: introduce alloc_loaded_vmcs > KVM: VMX: make MSR bitmaps per-VCPU > KVM/x86: Remove indirect MSR op calls from SPEC_CTRL > KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely() > KVM: VMX: fixes for vmentry_l1d_flush module parameter > > Radim Krčmář (1): > KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC > > Thomas Gleixner (2): > KVM: SVM: Move spec control call after restore of GS > KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled > > Tom Lendacky (1): > KVM: SVM: Add MSR-based feature support for serializing LFENCE > > Wanpeng Li (1): > KVM: X86: Allow userspace to define the microcode version > > arch/x86/include/asm/kvm_host.h | 1 + > arch/x86/kernel/cpu/bugs.c | 4 + > arch/x86/kvm/cpuid.c | 25 +- > arch/x86/kvm/cpuid.h | 74 ++-- > arch/x86/kvm/svm.c | 209 +++++++++-- > arch/x86/kvm/vmx.c | 777 ++++++++++++++++++++++------------------ > arch/x86/kvm/x86.c | 12 +- > 7 files changed, 691 insertions(+), 411 deletions(-) >
On Mon, 7 Jan 2019 19:26:55 +0100 Stefan Bader <stefan.bader@canonical.com> wrote: > On 19.12.18 11:03, Juerg Haefliger wrote: > > [Impact] > > the IBRS would be mistakenly enabled in the host when the switching > > from an IBRS-enabled VM and that causes the performance overhead in > > the host. The other condition could also mistakenly disables the IBRS > > in VM when context-switching from the host. And this could be > > considered a CVE host. > > > > [Fix] > > The patch fixes the logic inside the x86_virt_spec_ctrl that it checks > > the ibrs_enabled and _or_ the hostval with the SPEC_CTRL_IBRS as the > > x86_spec_ctrl_base by default is zero. Because the upstream > > implementation is not equal to the Xenial's implementation. Upstream > > doesn't use the IBRS as the formal fix. So, by default, it's zero. > > > > On the other hand, after the VM exit, the SPEC_CTRL register also > > needs to be saved manually by reading the SPEC_CTRL MSR as the MSR > > intercept is disabled by default in the hardware_setup(v4.4) and > > vmx_init(v3.13). The access to SPEC_CTRL MSR in VM is direct and > > doesn't trigger a trap. So, the vmx_set_msr() function isn't called. > > > > The v3.13 kernel hasn't been tested. However, the patch can be viewed > > at: > > http://kernel.ubuntu.com/git/gavinguo/ubuntu-trusty-amd64.git/log/?h=sf00191076-sru > > > > The v4.4 patch: > > http://kernel.ubuntu.com/git/gavinguo/ubuntu-xenial.git/log/?h=sf00191076-spectre-v2-regres-backport-juerg > > > > [Test] > > > > The patch has been tested on the 4.4.0-140.166 and works fine. > > > > The reproducing environment: > > Guest kernel version: 4.4.0-138.164 > > Host kernel version: 4.4.0-140.166 > > > > (host IBRS, guest IBRS) > > > > - 1). (0, 1). > > The case can be reproduced by the following instructions: > > guest$ echo 1 | sudo tee /proc/sys/kernel/ibrs_enabled > > 1 > > > > <Several minutes later...> > > > > host$ cat /proc/sys/kernel/ibrs_enabled > > 0 > > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > > 11111111111111000000000000000000010010100000000000000000 > > > > Some of the IBRS bit inside the SPEC_CTRL MSR are mistakenly > > enabled. > > > > host$ taskset -c 5 stress-ng -c 1 --cpu-ops 2500 > > stress-ng: info: [11264] defaulting to a 86400 second run per stressor > > stress-ng: info: [11264] dispatching hogs: 1 cpu > > stress-ng: info: [11264] cache allocate: default cache size: 35840K > > stress-ng: info: [11264] successful run completed in 33.48s > > > > The host kernel didn't notice the IBRS bit is enabled. So, the situation > > is the same as "echo 2 > /proc/sys/kernel/ibrs_enabled" in the host. > > And running the stress-ng is a pure userspace CPU capability > > calculation. So, the performance downgrades to about 1/3. Without the > > IBRS enabled, it needs about 10s. > > > > - 2). (1, 1) disables IBRS in host -> (0, 1) actually it becomes (0, 0). > > The guest IBRS has been mistakenly disabled. > > > > guest$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled > > guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > > 11111111111111111111111111111111111111111111111111111111 > > > > host$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled > > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > > 11111111111111111111111111111111111111111111111111111111 > > host$ echo 0 | sudo tee /proc/sys/kernel/ibrs_enabled > > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > > 00000000000000000000000000000000000000000000000000000000 > > > > guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > > 00000000000000000000000000000000000000000000000000000000 > > > > > > [juergh: MSR-isolation between guests and the host is incomplete in > > Xenial. This PR is supposed to fix this and bring Xenial up to par with > > stable v4.9.] > > > > Signed-off-by: Juerg Haefliger <juergh@canonical.com> > > Just for sanity checking: this pull request (hate hate hate) ? The PR itself, or the fact that it's a PR, or the submitter of the PR? > replaces the > submitted patch, right? Yes. ...Juerg > > -Stefan > > --- > > > > The following changes since commit d0b9a387cf1d68745c558d04fd3aa980497d1529: > > > > UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW hunk (2018-12-13 13:03:55 +0100) > > > > are available in the Git repository at: > > > > git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2 > > > > for you to fetch changes up to 7ad0e9a99c1466f8fee92cba5ffeaa0af83f6622: > > > > UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT (2018-12-19 10:58:24 +0100) > > > > ---------------------------------------------------------------- > > Ashok Raj (1): > > KVM/x86: Add IBPB support > > > > David Matlack (1): > > KVM: nVMX: mark vmcs12 pages dirty on L2 exit > > > > Jim Mattson (5): > > kvm: nVMX: VMCLEAR an active shadow VMCS after last use > > kvm: vmx: Scrub hardware GPRs at VM-exit > > KVM: nVMX: Eliminate vmcs02 pool > > kvm: x86: IA32_ARCH_CAPABILITIES is always supported > > kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb > > > > Juerg Haefliger (4): > > UBUNTU: SAUCE: [Fix] KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD > > UBUNTU: SAUCE: [Fix] x86/KVM/VMX: Add L1D flush logic > > UBUNTU: SAUCE: KVM: Move code fragments, cleanup and re-indent > > UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT > > > > KarimAllah Ahmed (3): > > KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL > > KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL > > X86/nVMX: Properly set spec_ctrl and pred_cmd before merging MSRs > > > > Paolo Bonzini (5): > > KVM: VMX: introduce alloc_loaded_vmcs > > KVM: VMX: make MSR bitmaps per-VCPU > > KVM/x86: Remove indirect MSR op calls from SPEC_CTRL > > KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely() > > KVM: VMX: fixes for vmentry_l1d_flush module parameter > > > > Radim Krčmář (1): > > KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC > > > > Thomas Gleixner (2): > > KVM: SVM: Move spec control call after restore of GS > > KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled > > > > Tom Lendacky (1): > > KVM: SVM: Add MSR-based feature support for serializing LFENCE > > > > Wanpeng Li (1): > > KVM: X86: Allow userspace to define the microcode version > > > > arch/x86/include/asm/kvm_host.h | 1 + > > arch/x86/kernel/cpu/bugs.c | 4 + > > arch/x86/kvm/cpuid.c | 25 +- > > arch/x86/kvm/cpuid.h | 74 ++-- > > arch/x86/kvm/svm.c | 209 +++++++++-- > > arch/x86/kvm/vmx.c | 777 ++++++++++++++++++++++------------------ > > arch/x86/kvm/x86.c | 12 +- > > 7 files changed, 691 insertions(+), 411 deletions(-) > > > >
On 08.01.19 12:33, Juerg Haefliger wrote: > On Mon, 7 Jan 2019 19:26:55 +0100 > Stefan Bader <stefan.bader@canonical.com> wrote: > >> On 19.12.18 11:03, Juerg Haefliger wrote: >>> [Impact] >>> the IBRS would be mistakenly enabled in the host when the switching >>> from an IBRS-enabled VM and that causes the performance overhead in >>> the host. The other condition could also mistakenly disables the IBRS >>> in VM when context-switching from the host. And this could be >>> considered a CVE host. >>> >>> [Fix] >>> The patch fixes the logic inside the x86_virt_spec_ctrl that it checks >>> the ibrs_enabled and _or_ the hostval with the SPEC_CTRL_IBRS as the >>> x86_spec_ctrl_base by default is zero. Because the upstream >>> implementation is not equal to the Xenial's implementation. Upstream >>> doesn't use the IBRS as the formal fix. So, by default, it's zero. >>> >>> On the other hand, after the VM exit, the SPEC_CTRL register also >>> needs to be saved manually by reading the SPEC_CTRL MSR as the MSR >>> intercept is disabled by default in the hardware_setup(v4.4) and >>> vmx_init(v3.13). The access to SPEC_CTRL MSR in VM is direct and >>> doesn't trigger a trap. So, the vmx_set_msr() function isn't called. >>> >>> The v3.13 kernel hasn't been tested. However, the patch can be viewed >>> at: >>> http://kernel.ubuntu.com/git/gavinguo/ubuntu-trusty-amd64.git/log/?h=sf00191076-sru >>> >>> The v4.4 patch: >>> http://kernel.ubuntu.com/git/gavinguo/ubuntu-xenial.git/log/?h=sf00191076-spectre-v2-regres-backport-juerg >>> >>> [Test] >>> >>> The patch has been tested on the 4.4.0-140.166 and works fine. >>> >>> The reproducing environment: >>> Guest kernel version: 4.4.0-138.164 >>> Host kernel version: 4.4.0-140.166 >>> >>> (host IBRS, guest IBRS) >>> >>> - 1). (0, 1). >>> The case can be reproduced by the following instructions: >>> guest$ echo 1 | sudo tee /proc/sys/kernel/ibrs_enabled >>> 1 >>> >>> <Several minutes later...> >>> >>> host$ cat /proc/sys/kernel/ibrs_enabled >>> 0 >>> host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done >>> 11111111111111000000000000000000010010100000000000000000 >>> >>> Some of the IBRS bit inside the SPEC_CTRL MSR are mistakenly >>> enabled. >>> >>> host$ taskset -c 5 stress-ng -c 1 --cpu-ops 2500 >>> stress-ng: info: [11264] defaulting to a 86400 second run per stressor >>> stress-ng: info: [11264] dispatching hogs: 1 cpu >>> stress-ng: info: [11264] cache allocate: default cache size: 35840K >>> stress-ng: info: [11264] successful run completed in 33.48s >>> >>> The host kernel didn't notice the IBRS bit is enabled. So, the situation >>> is the same as "echo 2 > /proc/sys/kernel/ibrs_enabled" in the host. >>> And running the stress-ng is a pure userspace CPU capability >>> calculation. So, the performance downgrades to about 1/3. Without the >>> IBRS enabled, it needs about 10s. >>> >>> - 2). (1, 1) disables IBRS in host -> (0, 1) actually it becomes (0, 0). >>> The guest IBRS has been mistakenly disabled. >>> >>> guest$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled >>> guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done >>> 11111111111111111111111111111111111111111111111111111111 >>> >>> host$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled >>> host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done >>> 11111111111111111111111111111111111111111111111111111111 >>> host$ echo 0 | sudo tee /proc/sys/kernel/ibrs_enabled >>> host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done >>> 00000000000000000000000000000000000000000000000000000000 >>> >>> guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done >>> 00000000000000000000000000000000000000000000000000000000 >>> >>> >>> [juergh: MSR-isolation between guests and the host is incomplete in >>> Xenial. This PR is supposed to fix this and bring Xenial up to par with >>> stable v4.9.] >>> >>> Signed-off-by: Juerg Haefliger <juergh@canonical.com> >> >> Just for sanity checking: this pull request (hate hate hate) > > ? The PR itself, or the fact that it's a PR, or the submitter of the PR? The submitter a bit for the size, but mostly Intel for getting us into this mess... > > >> replaces the >> submitted patch, right? > > Yes. > > ...Juerg > > >> >> -Stefan >>> --- >>> >>> The following changes since commit d0b9a387cf1d68745c558d04fd3aa980497d1529: >>> >>> UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW hunk (2018-12-13 13:03:55 +0100) >>> >>> are available in the Git repository at: >>> >>> git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2 >>> >>> for you to fetch changes up to 7ad0e9a99c1466f8fee92cba5ffeaa0af83f6622: >>> >>> UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT (2018-12-19 10:58:24 +0100) >>> >>> ---------------------------------------------------------------- >>> Ashok Raj (1): >>> KVM/x86: Add IBPB support >>> >>> David Matlack (1): >>> KVM: nVMX: mark vmcs12 pages dirty on L2 exit >>> >>> Jim Mattson (5): >>> kvm: nVMX: VMCLEAR an active shadow VMCS after last use >>> kvm: vmx: Scrub hardware GPRs at VM-exit >>> KVM: nVMX: Eliminate vmcs02 pool >>> kvm: x86: IA32_ARCH_CAPABILITIES is always supported >>> kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb >>> >>> Juerg Haefliger (4): >>> UBUNTU: SAUCE: [Fix] KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD >>> UBUNTU: SAUCE: [Fix] x86/KVM/VMX: Add L1D flush logic >>> UBUNTU: SAUCE: KVM: Move code fragments, cleanup and re-indent >>> UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT >>> >>> KarimAllah Ahmed (3): >>> KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL >>> KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL >>> X86/nVMX: Properly set spec_ctrl and pred_cmd before merging MSRs >>> >>> Paolo Bonzini (5): >>> KVM: VMX: introduce alloc_loaded_vmcs >>> KVM: VMX: make MSR bitmaps per-VCPU >>> KVM/x86: Remove indirect MSR op calls from SPEC_CTRL >>> KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely() >>> KVM: VMX: fixes for vmentry_l1d_flush module parameter >>> >>> Radim Krčmář (1): >>> KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC >>> >>> Thomas Gleixner (2): >>> KVM: SVM: Move spec control call after restore of GS >>> KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled >>> >>> Tom Lendacky (1): >>> KVM: SVM: Add MSR-based feature support for serializing LFENCE >>> >>> Wanpeng Li (1): >>> KVM: X86: Allow userspace to define the microcode version >>> >>> arch/x86/include/asm/kvm_host.h | 1 + >>> arch/x86/kernel/cpu/bugs.c | 4 + >>> arch/x86/kvm/cpuid.c | 25 +- >>> arch/x86/kvm/cpuid.h | 74 ++-- >>> arch/x86/kvm/svm.c | 209 +++++++++-- >>> arch/x86/kvm/vmx.c | 777 ++++++++++++++++++++++------------------ >>> arch/x86/kvm/x86.c | 12 +- >>> 7 files changed, 691 insertions(+), 411 deletions(-) >>> >> >> >
On 19.12.18 11:03, Juerg Haefliger wrote: > git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2 Glancing over the changes I noticed that "UBUNTU: SAUCE: x86/speculation: Cleanup IBPB runtime control handling" adds use of spec_ctrl_mutex but only "UBUNTU: SAUCE: x86/speculation: Cleanup IBRS runtime control handling" which follows next actually seems to declare it. In the end that does not matter only could make bisection a pain. Minor pedantic nitbit: is not "UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW hunk" moving the hunk further down? Apart from that the other changes roughly seemed to make sense but I have to admit that I went rather quickly over the non-SAUCE ones. The delta is just too big and I rely a bit on the successful testing. So if fixing up the first 2 patches (which should be doable when applying) Acked-by: Stefan Bader <stefan.bader@canonical.com>
On Tue, 8 Jan 2019 14:43:34 +0100 Stefan Bader <stefan.bader@canonical.com> wrote: > On 19.12.18 11:03, Juerg Haefliger wrote: > > git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2 > > > Glancing over the changes I noticed that "UBUNTU: SAUCE: x86/speculation: > Cleanup IBPB runtime control handling" adds use of spec_ctrl_mutex spec_ctrl_mutex is already defined (introduced by the original IBRS/IBPB patches). > but only > "UBUNTU: SAUCE: x86/speculation: Cleanup IBRS runtime control handling" which > follows next actually seems to declare it. That patch just moves the definition from kernel/smp.c to kernel/sysctl.c. > In the end that does not matter only > could make bisection a pain. IIRC, I've compile-tested every single patch. > Minor pedantic nitbit: is not "UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW > hunk" moving the hunk further down? Nope. The diff shows a different hunk being moved down as a result of moving RSB_CTXSW up. So the title is misleading and should probably read 'Move X86_FEATURE_IBPB hunk'. ...Juerg > Apart from that the other changes roughly seemed to make sense but I have to > admit that I went rather quickly over the non-SAUCE ones. The delta is just too > big and I rely a bit on the successful testing. > > So if fixing up the first 2 patches (which should be doable when applying) > > Acked-by: Stefan Bader <stefan.bader@canonical.com> >
On 1/8/19 2:43 PM, Stefan Bader wrote: > On 19.12.18 11:03, Juerg Haefliger wrote: >> git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2 > > Glancing over the changes I noticed that "UBUNTU: SAUCE: x86/speculation: > Cleanup IBPB runtime control handling" adds use of spec_ctrl_mutex but only > "UBUNTU: SAUCE: x86/speculation: Cleanup IBRS runtime control handling" which > follows next actually seems to declare it. In the end that does not matter only > could make bisection a pain. > > Minor pedantic nitbit: is not "UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW > hunk" moving the hunk further down? The mentioned commits, although they are on that branch, are not part of this PR (which is only d0b9a387cf1d68745c558d04fd3aa980497d1529..7ad0e9a99c1466f8fee92cba5ffeaa0af83f6622). They seem to have been sent as part of an earlier thread titled "[SRU][Xenial][PATCH v2 0/4] Cleanups for CVE-2017-5715 (Spectre v2)". Should we treat these patches as separated requests or as a single PR?
On 08.01.19 17:32, Juerg Haefliger wrote: > On Tue, 8 Jan 2019 14:43:34 +0100 > Stefan Bader <stefan.bader@canonical.com> wrote: > >> On 19.12.18 11:03, Juerg Haefliger wrote: >>> git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2 >> >> >> Glancing over the changes I noticed that "UBUNTU: SAUCE: x86/speculation: >> Cleanup IBPB runtime control handling" adds use of spec_ctrl_mutex > > spec_ctrl_mutex is already defined (introduced by the original IBRS/IBPB > patches). Hm, ok, sorry, then better not change it. Not that obvious looking at the patch. > > >> but only >> "UBUNTU: SAUCE: x86/speculation: Cleanup IBRS runtime control handling" which >> follows next actually seems to declare it. > > That patch just moves the definition from kernel/smp.c to kernel/sysctl.c. > > >> In the end that does not matter only >> could make bisection a pain. > > > IIRC, I've compile-tested every single patch. > > >> Minor pedantic nitbit: is not "UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW >> hunk" moving the hunk further down? > > Nope. The diff shows a different hunk being moved down as a result of moving > RSB_CTXSW up. So the title is misleading and should probably read 'Move > X86_FEATURE_IBPB hunk'. Actually could have been fooled by the way the diff shows. There was another patch which said move something before but then looked like moving something after. But that was just because what got moved was a hunk before the hunk that was claimed to be moved. And indeed, if you move something from before to after, then the thing after is implicitly moved to the front. :/ .Stefan > > ...Juerg > > >> Apart from that the other changes roughly seemed to make sense but I have to >> admit that I went rather quickly over the non-SAUCE ones. The delta is just too >> big and I rely a bit on the successful testing. >> >> So if fixing up the first 2 patches (which should be doable when applying) >> >> Acked-by: Stefan Bader <stefan.bader@canonical.com> >> >
On Tue, 8 Jan 2019 18:34:35 +0100 Kleber Souza <kleber.souza@canonical.com> wrote: > On 1/8/19 2:43 PM, Stefan Bader wrote: > > On 19.12.18 11:03, Juerg Haefliger wrote: > >> git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2 > > > > Glancing over the changes I noticed that "UBUNTU: SAUCE: x86/speculation: > > Cleanup IBPB runtime control handling" adds use of spec_ctrl_mutex but only > > "UBUNTU: SAUCE: x86/speculation: Cleanup IBRS runtime control handling" which > > follows next actually seems to declare it. In the end that does not matter only > > could make bisection a pain. > > > > Minor pedantic nitbit: is not "UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW > > hunk" moving the hunk further down? > > The mentioned commits, although they are on that branch, are not part of > this PR (which is only > d0b9a387cf1d68745c558d04fd3aa980497d1529..7ad0e9a99c1466f8fee92cba5ffeaa0af83f6622). > They seem to have been sent as part of an earlier thread titled > "[SRU][Xenial][PATCH v2 0/4] Cleanups for CVE-2017-5715 (Spectre v2)". Ah yes. That's a different patchset unrelated to the issue that the PR fixes. They just happen to be both IBRS/IBPB code changes. > Should we treat these patches as separated requests or as a single PR? Separately please. And somebody please review the assembly code changes (patch 3/4)... There are probably better ways to do this but my assembly foo isn't that stellar :-) ...Juerg
On 09.01.19 10:56, Juerg Haefliger wrote: > On Tue, 8 Jan 2019 18:34:35 +0100 > Kleber Souza <kleber.souza@canonical.com> wrote: > >> On 1/8/19 2:43 PM, Stefan Bader wrote: >>> On 19.12.18 11:03, Juerg Haefliger wrote: >>>> git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2 >>> >>> Glancing over the changes I noticed that "UBUNTU: SAUCE: x86/speculation: >>> Cleanup IBPB runtime control handling" adds use of spec_ctrl_mutex but only >>> "UBUNTU: SAUCE: x86/speculation: Cleanup IBRS runtime control handling" which >>> follows next actually seems to declare it. In the end that does not matter only >>> could make bisection a pain. >>> >>> Minor pedantic nitbit: is not "UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW >>> hunk" moving the hunk further down? >> >> The mentioned commits, although they are on that branch, are not part of >> this PR (which is only >> d0b9a387cf1d68745c558d04fd3aa980497d1529..7ad0e9a99c1466f8fee92cba5ffeaa0af83f6622). >> They seem to have been sent as part of an earlier thread titled >> "[SRU][Xenial][PATCH v2 0/4] Cleanups for CVE-2017-5715 (Spectre v2)". > > Ah yes. That's a different patchset unrelated to the issue that the PR fixes. > They just happen to be both IBRS/IBPB code changes. > > >> Should we treat these patches as separated requests or as a single PR? > > Separately please. And somebody please review the assembly code changes > (patch 3/4)... There are probably better ways to do this but my assembly foo > isn't that stellar :-) I did (but should respond about that to the other thread, I did not notice I went out of bounds yesterday). I think after a long time of head scratching those seemed to make sense. The whole r?x vs. e?x business is just architecturally ugly. > > ...Juerg > > >
On 08.01.19 14:43, Stefan Bader wrote: > On 19.12.18 11:03, Juerg Haefliger wrote: >> git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2 > > > Glancing over the changes I noticed that "UBUNTU: SAUCE: x86/speculation: > Cleanup IBPB runtime control handling" adds use of spec_ctrl_mutex but only > "UBUNTU: SAUCE: x86/speculation: Cleanup IBRS runtime control handling" which > follows next actually seems to declare it. In the end that does not matter only > could make bisection a pain. > > Minor pedantic nitbit: is not "UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW > hunk" moving the hunk further down? > > Apart from that the other changes roughly seemed to make sense but I have to > admit that I went rather quickly over the non-SAUCE ones. The delta is just too > big and I rely a bit on the successful testing. > > So if fixing up the first 2 patches (which should be doable when applying) > > Acked-by: Stefan Bader <stefan.bader@canonical.com> > > Just to re-new my ack with the patches which come before this pull request applied before from the separate thread. -Stefan
The update to stable 4.4.168 brings in a lot of the commits from this PR so nacking it. Will send a fix for 1764956 after all the current stable updates are applied. ...Juerg On Wed, 19 Dec 2018 11:03:19 +0100 Juerg Haefliger <juerg.haefliger@canonical.com> wrote: > [Impact] > the IBRS would be mistakenly enabled in the host when the switching > from an IBRS-enabled VM and that causes the performance overhead in > the host. The other condition could also mistakenly disables the IBRS > in VM when context-switching from the host. And this could be > considered a CVE host. > > [Fix] > The patch fixes the logic inside the x86_virt_spec_ctrl that it checks > the ibrs_enabled and _or_ the hostval with the SPEC_CTRL_IBRS as the > x86_spec_ctrl_base by default is zero. Because the upstream > implementation is not equal to the Xenial's implementation. Upstream > doesn't use the IBRS as the formal fix. So, by default, it's zero. > > On the other hand, after the VM exit, the SPEC_CTRL register also > needs to be saved manually by reading the SPEC_CTRL MSR as the MSR > intercept is disabled by default in the hardware_setup(v4.4) and > vmx_init(v3.13). The access to SPEC_CTRL MSR in VM is direct and > doesn't trigger a trap. So, the vmx_set_msr() function isn't called. > > The v3.13 kernel hasn't been tested. However, the patch can be viewed > at: > http://kernel.ubuntu.com/git/gavinguo/ubuntu-trusty-amd64.git/log/?h=sf00191076-sru > > The v4.4 patch: > http://kernel.ubuntu.com/git/gavinguo/ubuntu-xenial.git/log/?h=sf00191076-spectre-v2-regres-backport-juerg > > [Test] > > The patch has been tested on the 4.4.0-140.166 and works fine. > > The reproducing environment: > Guest kernel version: 4.4.0-138.164 > Host kernel version: 4.4.0-140.166 > > (host IBRS, guest IBRS) > > - 1). (0, 1). > The case can be reproduced by the following instructions: > guest$ echo 1 | sudo tee /proc/sys/kernel/ibrs_enabled > 1 > > <Several minutes later...> > > host$ cat /proc/sys/kernel/ibrs_enabled > 0 > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > 11111111111111000000000000000000010010100000000000000000 > > Some of the IBRS bit inside the SPEC_CTRL MSR are mistakenly > enabled. > > host$ taskset -c 5 stress-ng -c 1 --cpu-ops 2500 > stress-ng: info: [11264] defaulting to a 86400 second run per stressor > stress-ng: info: [11264] dispatching hogs: 1 cpu > stress-ng: info: [11264] cache allocate: default cache size: 35840K > stress-ng: info: [11264] successful run completed in 33.48s > > The host kernel didn't notice the IBRS bit is enabled. So, the situation > is the same as "echo 2 > /proc/sys/kernel/ibrs_enabled" in the host. > And running the stress-ng is a pure userspace CPU capability > calculation. So, the performance downgrades to about 1/3. Without the > IBRS enabled, it needs about 10s. > > - 2). (1, 1) disables IBRS in host -> (0, 1) actually it becomes (0, 0). > The guest IBRS has been mistakenly disabled. > > guest$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled > guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > 11111111111111111111111111111111111111111111111111111111 > > host$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > 11111111111111111111111111111111111111111111111111111111 > host$ echo 0 | sudo tee /proc/sys/kernel/ibrs_enabled > host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > 00000000000000000000000000000000000000000000000000000000 > > guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done > 00000000000000000000000000000000000000000000000000000000 > > > [juergh: MSR-isolation between guests and the host is incomplete in > Xenial. This PR is supposed to fix this and bring Xenial up to par with > stable v4.9.] > > Signed-off-by: Juerg Haefliger <juergh@canonical.com> > --- > > The following changes since commit d0b9a387cf1d68745c558d04fd3aa980497d1529: > > UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW hunk (2018-12-13 13:03:55 +0100) > > are available in the Git repository at: > > git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2 > > for you to fetch changes up to 7ad0e9a99c1466f8fee92cba5ffeaa0af83f6622: > > UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT (2018-12-19 10:58:24 +0100) > > ---------------------------------------------------------------- > Ashok Raj (1): > KVM/x86: Add IBPB support > > David Matlack (1): > KVM: nVMX: mark vmcs12 pages dirty on L2 exit > > Jim Mattson (5): > kvm: nVMX: VMCLEAR an active shadow VMCS after last use > kvm: vmx: Scrub hardware GPRs at VM-exit > KVM: nVMX: Eliminate vmcs02 pool > kvm: x86: IA32_ARCH_CAPABILITIES is always supported > kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb > > Juerg Haefliger (4): > UBUNTU: SAUCE: [Fix] KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD > UBUNTU: SAUCE: [Fix] x86/KVM/VMX: Add L1D flush logic > UBUNTU: SAUCE: KVM: Move code fragments, cleanup and re-indent > UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT > > KarimAllah Ahmed (3): > KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL > KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL > X86/nVMX: Properly set spec_ctrl and pred_cmd before merging MSRs > > Paolo Bonzini (5): > KVM: VMX: introduce alloc_loaded_vmcs > KVM: VMX: make MSR bitmaps per-VCPU > KVM/x86: Remove indirect MSR op calls from SPEC_CTRL > KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely() > KVM: VMX: fixes for vmentry_l1d_flush module parameter > > Radim Krčmář (1): > KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC > > Thomas Gleixner (2): > KVM: SVM: Move spec control call after restore of GS > KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled > > Tom Lendacky (1): > KVM: SVM: Add MSR-based feature support for serializing LFENCE > > Wanpeng Li (1): > KVM: X86: Allow userspace to define the microcode version > > arch/x86/include/asm/kvm_host.h | 1 + > arch/x86/kernel/cpu/bugs.c | 4 + > arch/x86/kvm/cpuid.c | 25 +- > arch/x86/kvm/cpuid.h | 74 ++-- > arch/x86/kvm/svm.c | 209 +++++++++-- > arch/x86/kvm/vmx.c | 777 ++++++++++++++++++++++------------------ > arch/x86/kvm/x86.c | 12 +- > 7 files changed, 691 insertions(+), 411 deletions(-)
[Impact] the IBRS would be mistakenly enabled in the host when the switching from an IBRS-enabled VM and that causes the performance overhead in the host. The other condition could also mistakenly disables the IBRS in VM when context-switching from the host. And this could be considered a CVE host. [Fix] The patch fixes the logic inside the x86_virt_spec_ctrl that it checks the ibrs_enabled and _or_ the hostval with the SPEC_CTRL_IBRS as the x86_spec_ctrl_base by default is zero. Because the upstream implementation is not equal to the Xenial's implementation. Upstream doesn't use the IBRS as the formal fix. So, by default, it's zero. On the other hand, after the VM exit, the SPEC_CTRL register also needs to be saved manually by reading the SPEC_CTRL MSR as the MSR intercept is disabled by default in the hardware_setup(v4.4) and vmx_init(v3.13). The access to SPEC_CTRL MSR in VM is direct and doesn't trigger a trap. So, the vmx_set_msr() function isn't called. The v3.13 kernel hasn't been tested. However, the patch can be viewed at: http://kernel.ubuntu.com/git/gavinguo/ubuntu-trusty-amd64.git/log/?h=sf00191076-sru The v4.4 patch: http://kernel.ubuntu.com/git/gavinguo/ubuntu-xenial.git/log/?h=sf00191076-spectre-v2-regres-backport-juerg [Test] The patch has been tested on the 4.4.0-140.166 and works fine. The reproducing environment: Guest kernel version: 4.4.0-138.164 Host kernel version: 4.4.0-140.166 (host IBRS, guest IBRS) - 1). (0, 1). The case can be reproduced by the following instructions: guest$ echo 1 | sudo tee /proc/sys/kernel/ibrs_enabled 1 <Several minutes later...> host$ cat /proc/sys/kernel/ibrs_enabled 0 host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done 11111111111111000000000000000000010010100000000000000000 Some of the IBRS bit inside the SPEC_CTRL MSR are mistakenly enabled. host$ taskset -c 5 stress-ng -c 1 --cpu-ops 2500 stress-ng: info: [11264] defaulting to a 86400 second run per stressor stress-ng: info: [11264] dispatching hogs: 1 cpu stress-ng: info: [11264] cache allocate: default cache size: 35840K stress-ng: info: [11264] successful run completed in 33.48s The host kernel didn't notice the IBRS bit is enabled. So, the situation is the same as "echo 2 > /proc/sys/kernel/ibrs_enabled" in the host. And running the stress-ng is a pure userspace CPU capability calculation. So, the performance downgrades to about 1/3. Without the IBRS enabled, it needs about 10s. - 2). (1, 1) disables IBRS in host -> (0, 1) actually it becomes (0, 0). The guest IBRS has been mistakenly disabled. guest$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done 11111111111111111111111111111111111111111111111111111111 host$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done 11111111111111111111111111111111111111111111111111111111 host$ echo 0 | sudo tee /proc/sys/kernel/ibrs_enabled host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done 00000000000000000000000000000000000000000000000000000000 guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done 00000000000000000000000000000000000000000000000000000000 [juergh: MSR-isolation between guests and the host is incomplete in Xenial. This PR is supposed to fix this and bring Xenial up to par with stable v4.9.] Signed-off-by: Juerg Haefliger <juergh@canonical.com> --- The following changes since commit d0b9a387cf1d68745c558d04fd3aa980497d1529: UBUNTU: SAUCE: x86/speculation: Move RSB_CTXSW hunk (2018-12-13 13:03:55 +0100) are available in the Git repository at: git://git.launchpad.net/~juergh/+git/xenial-linux lp1764956-v2 for you to fetch changes up to 7ad0e9a99c1466f8fee92cba5ffeaa0af83f6622: UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT (2018-12-19 10:58:24 +0100) ---------------------------------------------------------------- Ashok Raj (1): KVM/x86: Add IBPB support David Matlack (1): KVM: nVMX: mark vmcs12 pages dirty on L2 exit Jim Mattson (5): kvm: nVMX: VMCLEAR an active shadow VMCS after last use kvm: vmx: Scrub hardware GPRs at VM-exit KVM: nVMX: Eliminate vmcs02 pool kvm: x86: IA32_ARCH_CAPABILITIES is always supported kvm: svm: Ensure an IBPB on all affected CPUs when freeing a vmcb Juerg Haefliger (4): UBUNTU: SAUCE: [Fix] KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD UBUNTU: SAUCE: [Fix] x86/KVM/VMX: Add L1D flush logic UBUNTU: SAUCE: KVM: Move code fragments, cleanup and re-indent UBUNTU: SAUCE: Restore the IBRS host state on VMEXIT KarimAllah Ahmed (3): KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL X86/nVMX: Properly set spec_ctrl and pred_cmd before merging MSRs Paolo Bonzini (5): KVM: VMX: introduce alloc_loaded_vmcs KVM: VMX: make MSR bitmaps per-VCPU KVM/x86: Remove indirect MSR op calls from SPEC_CTRL KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely() KVM: VMX: fixes for vmentry_l1d_flush module parameter Radim Krčmář (1): KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC Thomas Gleixner (2): KVM: SVM: Move spec control call after restore of GS KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled Tom Lendacky (1): KVM: SVM: Add MSR-based feature support for serializing LFENCE Wanpeng Li (1): KVM: X86: Allow userspace to define the microcode version arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kernel/cpu/bugs.c | 4 + arch/x86/kvm/cpuid.c | 25 +- arch/x86/kvm/cpuid.h | 74 ++-- arch/x86/kvm/svm.c | 209 +++++++++-- arch/x86/kvm/vmx.c | 777 ++++++++++++++++++++++------------------ arch/x86/kvm/x86.c | 12 +- 7 files changed, 691 insertions(+), 411 deletions(-)