mbox series

[SRU,X,0/2] Hard lockups due to unrestricted lapic timer delay

Message ID 20190227191950.12074-1-gpiccoli@canonical.com
Headers show
Series Hard lockups due to unrestricted lapic timer delay | expand

Message

Guilherme G. Piccoli Feb. 27, 2019, 7:19 p.m. UTC
BugLink: https://bugs.launchpad.net/bugs/1817918

[Impact]

* There is a long-time report of an issue with the TSC delay present
in wait_lapic_expire() - basically the guest could have an expiration
timer configured in a way it induces host to wait a long time (with
preemption disabled), so there's a potential scenario for host lockups.

* The stack trace we have access (from an user report of this issue)
is (summarized) below:

NMI watchdog: Watchdog detected hard LOCKUP on cpu 16
[...]
CPU: 16 PID: 3024910 Comm: CPU 0/KVM Not tainted 4.4.0-139-generic #165-Ubuntu
RIP: 0010:[<addr>] [<addr>] delay_tsc+0x20/0x60
[...]
 __delay+0x15/0x20
wait_lapic_expire+0xc3/0x150 [kvm]
vcpu_enter_guest+0x743/0x11d0 [kvm]
kvm_arch_vcpu_ioctl_run+0xe6/0x410 [kvm]
kvm_vcpu_ioctl+0x33d/0x620 [kvm]
do_vfs_ioctl+0x2af/0x4b0
? __do_page_fault+0x1c1/0x410
? fire_user_return_notifiers+0x3e/0x50
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x22/0xc1

This matches the reported problem in the KVM mailing-list:
https://marc.info/?l=kvm&m=146374488028339

* A fix was proposed in the above thread, but discarded in favor of the
following approach: https://marc.info/?l=kvm&m=146647260109315
The patch was merged in Linus tree, hence we hereby request the SRU:
b606f189c7d5 ("KVM: LAPIC: cap __delay at lapic_timer_advance_ns").
There's one additional patch needed, which is just the header adjustment
for exporting a necessary function.

* The patch is missing only in 4.4 kernel series; Bionic (4.15) and the other
* newer releases have the patch already.

[Test Case]

* Unfortunately this is a hard to reproduce issue; we have reports of
this lockup from an user, hence the SRU request here.
Also, the patch was introduced originally in kernel 4.7, approx. 2.5 years
ago. So, we are confident that community is running this code long enough
without errors reported. Also, checked in the Linus tree and no fixes
for this code were introduced since kernel 4.7.

[Regression Potential]

* The code modification requested here affects the amount of delay in
a specific timer; the patch introduces a maximum time for delay, preventing
unbounded delays in host.
The regression potential is considered low, and given the nature of the
modification, latency issues in guests are likely to be the most problematic
regression potential we have.

Marcelo Tosatti (2):
  KVM: x86: move nsec_to_cycles from x86.c to x86.h
  KVM: LAPIC: cap __delay at lapic_timer_advance_ns

 arch/x86/kvm/lapic.c | 3 ++-
 arch/x86/kvm/x86.c   | 6 ------
 arch/x86/kvm/x86.h   | 8 ++++++++
 3 files changed, 10 insertions(+), 7 deletions(-)

Comments

Marcelo Henrique Cerri Feb. 27, 2019, 7:46 p.m. UTC | #1
Upstream patch, low regression risk and straightforward backport for
context adjustments.

Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Thadeu Lima de Souza Cascardo Feb. 27, 2019, 9:27 p.m. UTC | #2
On Wed, Feb 27, 2019 at 04:19:48PM -0300, Guilherme G. Piccoli wrote:
> BugLink: https://bugs.launchpad.net/bugs/1817918
> 
> [Impact]
> 
> * There is a long-time report of an issue with the TSC delay present
> in wait_lapic_expire() - basically the guest could have an expiration
> timer configured in a way it induces host to wait a long time (with
> preemption disabled), so there's a potential scenario for host lockups.
> 
> * The stack trace we have access (from an user report of this issue)
> is (summarized) below:
> 
> NMI watchdog: Watchdog detected hard LOCKUP on cpu 16
> [...]
> CPU: 16 PID: 3024910 Comm: CPU 0/KVM Not tainted 4.4.0-139-generic #165-Ubuntu
> RIP: 0010:[<addr>] [<addr>] delay_tsc+0x20/0x60
> [...]
>  __delay+0x15/0x20
> wait_lapic_expire+0xc3/0x150 [kvm]
> vcpu_enter_guest+0x743/0x11d0 [kvm]
> kvm_arch_vcpu_ioctl_run+0xe6/0x410 [kvm]
> kvm_vcpu_ioctl+0x33d/0x620 [kvm]
> do_vfs_ioctl+0x2af/0x4b0
> ? __do_page_fault+0x1c1/0x410
> ? fire_user_return_notifiers+0x3e/0x50
> SyS_ioctl+0x79/0x90
> entry_SYSCALL_64_fastpath+0x22/0xc1
> 
> This matches the reported problem in the KVM mailing-list:
> https://marc.info/?l=kvm&m=146374488028339
> 
> * A fix was proposed in the above thread, but discarded in favor of the
> following approach: https://marc.info/?l=kvm&m=146647260109315
> The patch was merged in Linus tree, hence we hereby request the SRU:
> b606f189c7d5 ("KVM: LAPIC: cap __delay at lapic_timer_advance_ns").
> There's one additional patch needed, which is just the header adjustment
> for exporting a necessary function.
> 
> * The patch is missing only in 4.4 kernel series; Bionic (4.15) and the other
> * newer releases have the patch already.
> 
> [Test Case]
> 
> * Unfortunately this is a hard to reproduce issue; we have reports of
> this lockup from an user, hence the SRU request here.
> Also, the patch was introduced originally in kernel 4.7, approx. 2.5 years
> ago. So, we are confident that community is running this code long enough
> without errors reported. Also, checked in the Linus tree and no fixes
> for this code were introduced since kernel 4.7.
> 
> [Regression Potential]
> 
> * The code modification requested here affects the amount of delay in
> a specific timer; the patch introduces a maximum time for delay, preventing
> unbounded delays in host.
> The regression potential is considered low, and given the nature of the
> modification, latency issues in guests are likely to be the most problematic
> regression potential we have.
> 
> Marcelo Tosatti (2):
>   KVM: x86: move nsec_to_cycles from x86.c to x86.h
>   KVM: LAPIC: cap __delay at lapic_timer_advance_ns
> 
>  arch/x86/kvm/lapic.c | 3 ++-
>  arch/x86/kvm/x86.c   | 6 ------
>  arch/x86/kvm/x86.h   | 8 ++++++++
>  3 files changed, 10 insertions(+), 7 deletions(-)
> 
> -- 
> 2.20.1

Though there is a missing test case, the change seems reasonable. If you could
just launch a guest with a patched host, though, I would feel more in peace.

Test cases are not only about testing the bug is fixed, but that other things
are not broken.

Cascardo.
Khalid Elmously Feb. 27, 2019, 10:35 p.m. UTC | #3
On 2019-02-27 16:19:48 , Guilherme G. Piccoli wrote:
> BugLink: https://bugs.launchpad.net/bugs/1817918
> 
> [Impact]
> 
> * There is a long-time report of an issue with the TSC delay present
> in wait_lapic_expire() - basically the guest could have an expiration
> timer configured in a way it induces host to wait a long time (with
> preemption disabled), so there's a potential scenario for host lockups.
> 
> * The stack trace we have access (from an user report of this issue)
> is (summarized) below:
> 
> NMI watchdog: Watchdog detected hard LOCKUP on cpu 16
> [...]
> CPU: 16 PID: 3024910 Comm: CPU 0/KVM Not tainted 4.4.0-139-generic #165-Ubuntu
> RIP: 0010:[<addr>] [<addr>] delay_tsc+0x20/0x60
> [...]
>  __delay+0x15/0x20
> wait_lapic_expire+0xc3/0x150 [kvm]
> vcpu_enter_guest+0x743/0x11d0 [kvm]
> kvm_arch_vcpu_ioctl_run+0xe6/0x410 [kvm]
> kvm_vcpu_ioctl+0x33d/0x620 [kvm]
> do_vfs_ioctl+0x2af/0x4b0
> ? __do_page_fault+0x1c1/0x410
> ? fire_user_return_notifiers+0x3e/0x50
> SyS_ioctl+0x79/0x90
> entry_SYSCALL_64_fastpath+0x22/0xc1
> 
> This matches the reported problem in the KVM mailing-list:
> https://marc.info/?l=kvm&m=146374488028339
> 
> * A fix was proposed in the above thread, but discarded in favor of the
> following approach: https://marc.info/?l=kvm&m=146647260109315
> The patch was merged in Linus tree, hence we hereby request the SRU:
> b606f189c7d5 ("KVM: LAPIC: cap __delay at lapic_timer_advance_ns").
> There's one additional patch needed, which is just the header adjustment
> for exporting a necessary function.
> 
> * The patch is missing only in 4.4 kernel series; Bionic (4.15) and the other
> * newer releases have the patch already.
> 
> [Test Case]
> 
> * Unfortunately this is a hard to reproduce issue; we have reports of
> this lockup from an user, hence the SRU request here.
> Also, the patch was introduced originally in kernel 4.7, approx. 2.5 years
> ago. So, we are confident that community is running this code long enough
> without errors reported. Also, checked in the Linus tree and no fixes
> for this code were introduced since kernel 4.7.
> 
> [Regression Potential]
> 
> * The code modification requested here affects the amount of delay in
> a specific timer; the patch introduces a maximum time for delay, preventing
> unbounded delays in host.
> The regression potential is considered low, and given the nature of the
> modification, latency issues in guests are likely to be the most problematic
> regression potential we have.
> 
> Marcelo Tosatti (2):
>   KVM: x86: move nsec_to_cycles from x86.c to x86.h
>   KVM: LAPIC: cap __delay at lapic_timer_advance_ns
> 
>  arch/x86/kvm/lapic.c | 3 ++-
>  arch/x86/kvm/x86.c   | 6 ------
>  arch/x86/kvm/x86.h   | 8 ++++++++
>  3 files changed, 10 insertions(+), 7 deletions(-)
> 
> -- 
> 2.20.1
> 
> 
> -- 
> kernel-team mailing list
> kernel-team@lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
Guilherme G. Piccoli Feb. 27, 2019, 10:49 p.m. UTC | #4
On Wed, Feb 27, 2019 at 6:27 PM Thadeu Lima de Souza Cascardo
<cascardo@canonical.com> wrote:
> Though there is a missing test case, the change seems reasonable. If you could
> just launch a guest with a patched host, though, I would feel more in peace.
>
> Test cases are not only about testing the bug is fixed, but that other things
> are not broken.

Thanks Khaled for applying the patch, and Cerri/Cascardo for the ack
and suggestion.
I've loaded the patched kernel in a  bare-metal machine and started a Xenial
guest, everything's working as expected.

Cheers,


Guilherme