Message ID | 20230912060246.150003-1-chengen.du@canonical.com |
---|---|
Series | NULL Pointer Dereference During KVM MMU Page Invalidation |
On 9/12/23 12:02 AM, Chengen Du wrote:
> BugLink: https://bugs.launchpad.net/bugs/2035166
>
> SRU Justification:
>
> [Impact]
> During VM live migration, there is a potential risk of dereferencing a NULL pointer,
> which can lead to memory access issues and result in an unstable environment.
>
> [Fix]
> The call trace is as follows:
>
> kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
> kernel: #PF: supervisor write access in kernel mode
> kernel: #PF: error_code(0x0002) - not-present page
> kernel: PGD 0 P4D 0
> kernel: Oops: 0002 [#1] SMP NOPTI
> kernel: CPU: 29 PID: 4063601 Comm: CPU 0/KVM Tainted: G IOE 5.15.0-53-generic #59~20.04.1-Ubuntu
> kernel: Hardware name: Dell Inc. PowerEdge R640/0H28RR, BIOS 2.12.2 07/09/2021
> kernel: RIP: 0010:__handle_changed_spte+0x3a9/0x620 [kvm]
> kernel: Code: 48 8b 58 28 44 0f b6 63 24 48 8b 43 28 41 83 e4 0f 48 89 45 a0 0f 1f 44 00 00 45 84 d2 0f 85 06 02 00 00 48 8b 43 08 48 8b 13 <48> 89 42 08 48 89 10 44 0f b6 6b 23 48 b8 00 01 00 00 00 00 ad de
> kernel: RSP: 0018:ffffb580320278a8 EFLAGS: 00010246
> kernel: RAX: 0000000000000000 RBX: ffffa0fe29e94c38 RCX: 0000000000000027
> kernel: RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffb5801e24ba58
> kernel: RBP: ffffb58032027930 R08: 0000000000000000 R09: 0000000000000004
> kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000003
> kernel: R13: 0000000000000004 R14: 0000000000000000 R15: ffffb5801e235000
> kernel: FS: 00007f1553fff700(0000) GS:ffffa20eff780000(0000) knlGS:0000000000000000
> kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> kernel: CR2: 0000000000000008 CR3: 000000e7f7544004 CR4: 00000000007726e0
> kernel: PKRU: 55555554
> kernel: Call Trace:
> kernel: <TASK>
> kernel: ? __switch_to_xtra+0x109/0x510
> kernel: zap_gfn_range+0x218/0x360 [kvm]
> kernel: ? __smp_call_single_queue+0x59/0x90
> kernel: ? alloc_cpumask_var_node+0x1/0x30
> kernel: ? kvm_make_vcpus_request_mask+0x150/0x1d0 [kvm]
> kernel: kvm_tdp_mmu_zap_invalidated_roots+0x5b/0xb0 [kvm]
> kernel: kvm_mmu_zap_all_fast+0x19a/0x1d0 [kvm]
> --
> kernel: RAX: ffffffffffffffda RBX: 000000004020ae46 RCX: 00007f15aa26e3ab
> kernel: RDX: 00007f1553ffe050 RSI: 000000004020ae46 RDI: 000000000000002f
> kernel: RBP: 00005602a885a410 R08: 00005602a82ad000 R09: 00007f154c087470
> kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007f1553ffe050
> kernel: R13: 00007f1553ffe160 R14: 0000000000000000 R15: 0000000000800000
> kernel: </TASK>
>
> The error occurred randomly in different production environments of the customer, all with the same call trace.
> Therefore, the likelihood of other processes contaminating memory is low.
> After analyzing the call trace with the help of debug symbols, we can pinpoint the source of the error.
>
> root@focal:~/ddeb# eu-addr2line -ifae ./usr/lib/debug/lib/modules/5.15.0-53-generic/kernel/arch/x86/kvm/kvm.ko __handle_changed_spte+0x3a9
> 0x0000000000068109
> __list_del inlined at /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:135:2 in __handle_changed_spte
> /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:112:13
> __list_del_entry
> /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:135:2
> list_del
> /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/include/linux/list.h:146:2
> tdp_mmu_unlink_page
> /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:305:2
> handle_removed_tdp_mmu_page
> /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:340:2
> __handle_changed_spte
> /build/linux-hwe-5.15-ZCQu4B/linux-hwe-5.15-5.15.0/arch/x86/kvm/mmu/tdp_mmu.c:491:3
>
> The error occurred when the kernel attempted to delete an entry from a list.
> This issue may potentially be related to timing and has proven challenging to reproduce consistently, making it difficult for us to pinpoint the cause.
> It's worth noting that the current kernel has replaced the list_head with atomic_t, as indicated by the following commit.
>
> d25ceb926436 KVM: x86/mmu: Track the number of TDP MMU pages, but not the actual pages
>
> While this patch doesn't modify the triggering logic, it replaces the problematic section with a more reliable approach while keeping the original logic unchanged.
> If the issue persists, it should not result in any memory access problems.
> We also requested the customer to set up a test environment and simulate a workload similar to the production environment.
> The patch worked well and did not introduce any adverse effects.
>
> [Test Plan]
> Reproducing the issue has proven to be challenging.
> Simulating heavy live migration activity in the customer's production environment is the appropriate approach to ensure that applying the patch will not result in any adverse effects.
>
> [Where problems could occur]
> The patch will impact the live migration workflow, but it only modifies the data structure in use, and no functionality will be altered.
>
> Sean Christopherson (1):
>   KVM: x86/mmu: Track the number of TDP MMU pages, but not the actual
>     pages
>
>  arch/x86/include/asm/kvm_host.h | 11 +++--------
>  arch/x86/kvm/mmu/tdp_mmu.c      | 20 +++++++++++---------
>  2 files changed, 14 insertions(+), 17 deletions(-)
>

Acked-by: Tim Gardner <tim.gardner@canonical.com>
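
For readers following the analysis in the cover letter, here is a minimal userspace sketch of why a list deletion faults at address 0x8, and of what counter-only bookkeeping looks like. It rests on these observations from the oops: the faulting instruction bytes `<48> 89 42 08` are a store to `0x8(%rdx)`, RDX is 0, and list.h:112 is the `next->prev = prev` store in `__list_del()`. So a `list_head` whose `next` pointer is NULL makes the first store land at offset 8 on x86-64. The `struct list_head` below is a stand-in with the same two-pointer layout, not the kernel header. `list_del_first_store_addr`, `struct page_counter` and the `counter_*` helpers are hypothetical names for illustration only. They are not taken from commit d25ceb926436, which should be consulted for the real change.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct list_head (same layout). */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/*
 * Address that the first store of __list_del(), "next->prev = prev",
 * would write to when unlinking entry. It is computed arithmetically,
 * so this sketch never dereferences entry->next.
 */
static uintptr_t list_del_first_store_addr(const struct list_head *entry)
{
    return (uintptr_t)entry->next + offsetof(struct list_head, prev);
}

/*
 * Hypothetical counter-only bookkeeping (an assumption, modeled on the
 * "replace list_head with atomic_t" description in the cover letter).
 * Unlinking touches only the counter, so no stale or zeroed list
 * pointer is ever followed.
 */
struct page_counter {
    atomic_long nr_pages;
};

static void counter_init(struct page_counter *c)
{
    atomic_init(&c->nr_pages, 0);
}

static void counter_link(struct page_counter *c)
{
    atomic_fetch_add(&c->nr_pages, 1);
}

static void counter_unlink(struct page_counter *c)
{
    atomic_fetch_sub(&c->nr_pages, 1);
}

static long counter_read(struct page_counter *c)
{
    return atomic_load(&c->nr_pages);
}
```

With a zero-filled entry, `list_del_first_store_addr()` returns `sizeof(void *)`, which is 8 on x86-64 and matches CR2 in the trace. The counter variant has no equivalent failure mode, which is consistent with the cover letter's claim that the underlying trigger may remain but should no longer cause a bad memory access.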
On 12/09/2023 08:02, Chengen Du wrote:
> BugLink: https://bugs.launchpad.net/bugs/2035166
>
> SRU Justification:
> [...]

Applied to jammy:master-next. I added the missing buglink. Thanks!