[0/1,focal/linux] focal/linux - Invalid VMA causes panic

Message ID 20220614185456.14339-1-tim.gardner@canonical.com
State New
Commit Message

Tim Gardner June 14, 2022, 6:54 p.m. UTC
BugLink: https://bugs.launchpad.net/bugs/1978719

SRU Justification

[Impact]

The 5.4.0 series of the Ubuntu kernel has missed a patch which resolves a null dereference:

[104602.951260] BUG: kernel NULL pointer dereference, address: 0000000000000034
[104602.951263] #PF: supervisor write access in kernel mode
[104602.951264] #PF: error_code(0x0002) - not-present page
[104602.951266] PGD 0 P4D 0
[104602.951269] Oops: 0002 [#1] SMP PTI
[104602.951272] CPU: 6 PID: 176572 Comm: ThreadPoolForeg Kdump: loaded Tainted: P OE 5.4.0-117-generic #132-Ubuntu
[104602.951273] Hardware name: System manufacturer System Product Name/P8P67 LE, BIOS 3801 09/12/2013
[104602.951278] RIP: 0010:unlink_anon_vmas+0x3e/0x1b0
[104602.951280] Code: 54 53 48 83 ec 08 48 8b 47 78 48 89 7d d0 48 8b 30 49 39 c5 0f 84 5e 01 00 00 4c 8d 78 f0 4c 8d 66 f0 31 db eb 21 49 8b 46 38 <83> 68 34 01 49 8b 44 24 10 49 8d 54 24 10 4d 89 e7 48 83 e8 10 49
[104602.951281] RSP: 0018:ffffc00908703bd8 EFLAGS: 00010246
[104602.951283] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[104602.951284] RDX: 0000000000000000 RSI: ffff99e8e5815c48 RDI: 0000000000000000
[104602.951286] RBP: ffffc00908703c08 R08: 0000000000000001 R09: ffffffffae665f00
[104602.951287] R10: ffff99eb808bd6c0 R11: 0000000000000001 R12: ffff99eb0fda27b8
[104602.951288] R13: ffff99eb0fda27c8 R14: ffff99e8e5815c08 R15: ffff99e7ee7af6c0
[104602.951290] FS: 0000000000000000(0000) GS:ffff99eb8eb80000(0000) knlGS:0000000000000000
[104602.951291] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[104602.951293] CR2: 0000000000000034 CR3: 000000037ae0a005 CR4: 00000000000606e0
[104602.951294] Call Trace:
[104602.951299] free_pgtables+0x93/0xf0
[104602.951301] exit_mmap+0xc7/0x1b0
[104602.951304] mmput+0x5d/0x130
[104602.951306] do_exit+0x31a/0xaf0
[104602.951309] do_group_exit+0x47/0xb0
[104602.951312] get_signal+0x169/0x890
[104602.951315] do_signal+0x34/0x6c0
[104602.951318] ? _copy_from_user+0x3e/0x60
[104602.951321] ? __x64_sys_futex+0x13f/0x170
[104602.951324] exit_to_usermode_loop+0xbf/0x160
[104602.951327] do_syscall_64+0x163/0x190
[104602.951330] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[104602.951332] RIP: 0033:0x7f58d1db27d1
[104602.951335] Code: Bad RIP value.
[104602.951336] RSP: 002b:00007f58c6987370 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[104602.951338] RAX: fffffffffffffdfc RBX: 00007f58c69875e8 RCX: 00007f58d1db27d1
[104602.951339] RDX: 0000000000000000 RSI: 0000000000000089 RDI: 00007f58c6987600
[104602.951340] RBP: 00007f58c69875d8 R08: 0000000000000000 R09: 00000000ffffffff
[104602.951341] R10: 00007f58c6987460 R11: 0000000000000246 R12: 00007f58c69875fc
[104602.951343] R13: 00007f58c69875b0 R14: 00007f58c6987600 R15: 00007f58c69873c0
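
The faulting bytes marked in the Code: line above (<83> 68 34 01) can be decoded by hand. The following is a minimal user-space sketch, not kernel code; the type and function names (struct decoded, decode_83_disp8, effective_address, check_oops_instruction) are made up for illustration. It handles only the one encoding form seen here (opcode 0x83 with a register base and an 8-bit displacement) and shows that the instruction is a 32-bit subtract of 1 at RAX + 0x34, which with RAX = 0 gives the address reported in CR2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: fields of one "83 /r disp8 imm8" instruction. */
struct decoded {
	int is_sub;        /* 1 if the ModRM reg field is 5, i.e. SUB */
	unsigned base_reg; /* ModRM rm field; 0 selects RAX */
	int8_t disp;       /* signed 8-bit displacement */
	int8_t imm;        /* signed 8-bit immediate */
};

/* Decode only the [reg + disp8] form of opcode 0x83; reject anything else. */
int decode_83_disp8(const uint8_t b[4], struct decoded *out)
{
	unsigned mod, reg, rm;

	if (b[0] != 0x83)
		return -1;
	mod = b[1] >> 6;
	reg = (b[1] >> 3) & 7;
	rm = b[1] & 7;
	if (mod != 1 || rm == 4) /* rm == 4 would need a SIB byte */
		return -1;
	out->is_sub = (reg == 5);
	out->base_reg = rm;
	out->disp = (int8_t)b[2];
	out->imm = (int8_t)b[3];
	return 0;
}

/* Address touched by the instruction for a given base register value. */
uint64_t effective_address(uint64_t base, int8_t disp)
{
	return base + (uint64_t)(int64_t)disp;
}

/* Check the bytes from the oops against RAX = 0 and CR2 = 0x34. */
void check_oops_instruction(void)
{
	const uint8_t code[4] = { 0x83, 0x68, 0x34, 0x01 };
	struct decoded d;

	assert(decode_83_disp8(code, &d) == 0);
	assert(d.is_sub && d.base_reg == 0 && d.imm == 1);
	assert(effective_address(0, d.disp) == 0x34);
}
```

This is consistent with a decrement of a 32-bit field at offset 0x34 through a NULL pointer; which C statement in unlink_anon_vmas() produced it is not determined by this sketch.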

The patch was posted to the Linux kernel mailing lists back in 2021: https://lore.kernel.org/linux-mm/20210224200449.hkU5GTEiH%25akpm@linux-foundation.org/

The fix is:
Date: Wed, 24 Feb 2021 12:04:49 -0800 [thread overview]
Message-ID: <20210224200449.hkU5GTEiH%akpm@linux-foundation.org> (raw)
In-Reply-To: <20210224115824.1e289a6895087f10c41dd8d6@linux-foundation.org>

From: Li Xinhai <lixinhai.lxh@gmail.com>
Subject: mm: rmap: explicitly reset vma->anon_vma in unlink_anon_vmas()

In case the vma will continue to be used after unlink its relevant
anon_vma, we need to reset the vma->anon_vma pointer to NULL. So, later
when fault happen within this vma again, a new anon_vma will be prepared.

By this way, the vma will only be checked for reverse mapping of pages
which been fault in after the unlink_anon_vmas call.

Currently, the mremap with MREMAP_DONTUNMAP scenario will continue use the
vma after moved its page table entries to a new vma. For other scenarios,
the vma itself will be freed after call unlink_anon_vmas.

Link: https://lkml.kernel.org/r/20210119075126.3513154-1-lixinhai.lxh@gmail.com
Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/rmap.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)


The latest Ubuntu 5.4 kernel package still has the following code:
 if (vma->anon_vma)
  vma->anon_vma->degree--;
 unlock_anon_vma_root(root);
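
The difference between the two versions of this tail can be modelled in user space. The sketch below is an illustration under stated assumptions, not kernel code: struct anon_vma_model and struct vma_model are simplified stand-ins that carry only the fields touched here, and all function names are hypothetical. It does not reproduce the crash; it only shows that without the reset a second pass over the same vma decrements degree again through the stale pointer, while with the reset the second pass is a no-op.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: only these fields matter here). */
struct anon_vma_model {
	unsigned degree;
};

struct vma_model {
	struct anon_vma_model *anon_vma;
};

/* Tail of unlink_anon_vmas() as quoted above: the pointer is left in place. */
void unlink_tail_before(struct vma_model *vma)
{
	if (vma->anon_vma)
		vma->anon_vma->degree--;
}

/* Tail with the upstream change applied: the pointer is reset to NULL. */
void unlink_tail_after(struct vma_model *vma)
{
	if (vma->anon_vma) {
		vma->anon_vma->degree--;
		vma->anon_vma = NULL;
	}
}

/* Run each variant twice on a vma whose anon_vma starts with degree == 1. */
void demo_double_unlink(void)
{
	struct anon_vma_model a = { .degree = 1 }, b = { .degree = 1 };
	struct vma_model stale = { .anon_vma = &a }, reset = { .anon_vma = &b };

	unlink_tail_before(&stale);
	unlink_tail_before(&stale);     /* decrements again via the stale pointer */
	assert(a.degree == UINT_MAX);   /* unsigned wrap-around from 0 */

	unlink_tail_after(&reset);
	unlink_tail_after(&reset);      /* no-op: pointer was cleared */
	assert(b.degree == 0);
	assert(reset.anon_vma == NULL);
}
```

In the model the anon_vma object outlives the vma, so the stale access is merely wrong; in the kernel the object may already have been freed or reused by the time the stale pointer is followed.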

This is the 3rd time I've encountered the crash.

root@lazarus:/var/crash/202206141315# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.4 LTS
Release: 20.04
Codename: focal

[Test Case]

Difficult to trigger.

[Where things could go wrong]

VMAs could be erroneously orphaned.

Comments

Khalid Elmously June 14, 2022, 8:15 p.m. UTC | #1
Acked-by: Khalid Elmously <khalid.elmously@canonical.com>


Cengiz Can June 15, 2022, 12:23 a.m. UTC | #2
Acked-by: Cengiz Can <cengiz.can@canonical.com>

Stefan Bader June 21, 2022, 5:17 p.m. UTC | #3

Applied to focal:linux/master-next. Thanks.

-Stefan
Patch

--- a/mm/rmap.c~mm-rmap-explicitly-reset-vma-anon_vma-in-unlink_anon_vmas
+++ a/mm/rmap.c
@@ -413,8 +413,15 @@  void unlink_anon_vmas(struct vm_area_str
   list_del(&avc->same_vma);
   anon_vma_chain_free(avc);
  }
- if (vma->anon_vma)
+ if (vma->anon_vma) {
   vma->anon_vma->degree--;
+
+ /*
+ * vma would still be needed after unlink, and anon_vma will be prepared
+ * when handle fault.
+ */
+ vma->anon_vma = NULL;
+ }
  unlock_anon_vma_root(root);

  /*