diff mbox series

[v4,17/21] mm/mmap: Drop arch_unmap() call from all archs

Message ID 20240710192250.4114783-18-Liam.Howlett@oracle.com (mailing list archive)
State Handled Elsewhere
Headers show
Series None | expand

Commit Message

Liam R. Howlett July 10, 2024, 7:22 p.m. UTC
From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>

The arch_unmap call was previously moved above the rbtree modifications
in commit 5a28fc94c914 ("x86/mpx, mm/core: Fix recursive munmap()
corruption").  The move was motivated by an issue with calling
arch_unmap() after the rbtree was modified.

Since the above commit, mpx was dropped from the kernel in 45fc24e89b7c
("x86/mpx: remove MPX from arch/x86"), so the motivation for calling
arch_unmap() prior to modifying the vma tree no longer exists
(regardless of rbtree or maple tree implementations).

Furthermore, the powerpc implementation is also no longer needed as per
[1] and [2].  So the arch_unmap() function can be completely removed.

Link: https://lore.kernel.org/lkml/20210611180242.711399-1-dima@arista.com/
Link: https://github.com/linuxppc/issues/issues/241
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: LEROY Christophe <christophe.leroy2@cs-soprasteria.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Dmitry Safonov <dima@arista.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/asm/mmu_context.h |  9 ---------
 arch/x86/include/asm/mmu_context.h     |  5 -----
 include/asm-generic/mm_hooks.h         | 11 +++--------
 mm/mmap.c                              | 12 ++----------
 4 files changed, 5 insertions(+), 32 deletions(-)

Comments

Dave Hansen July 10, 2024, 7:27 p.m. UTC | #1
On 7/10/24 12:22, Liam R. Howlett wrote:
> The arch_unmap call was previously moved above the rbtree modifications
> in commit 5a28fc94c914 ("x86/mpx, mm/core: Fix recursive munmap()
> corruption").  The move was motivated by an issue with calling
> arch_unmap() after the rbtree was modified.
> 
> Since the above commit, mpx was dropped from the kernel in 45fc24e89b7c
> ("x86/mpx: remove MPX from arch/x86"), so the motivation for calling
> arch_unmap() prior to modifying the vma tree no longer exists
> (regardless of rbtree or maple tree implementations).
> 
> Furthermore, the powerpc implementation is also no longer needed as per
> [1] and [2].  So the arch_unmap() function can be completely removed.

Thanks for doing this cleanup, Liam!

Acked-by: Dave Hansen <dave.hansen@intel.com>
LEROY Christophe July 10, 2024, 9:02 p.m. UTC | #2
Le 10/07/2024 à 21:22, Liam R. Howlett a écrit :
> From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
> 
> The arch_unmap call was previously moved above the rbtree modifications
> in commit 5a28fc94c914 ("x86/mpx, mm/core: Fix recursive munmap()
> corruption").  The move was motivated by an issue with calling
> arch_unmap() after the rbtree was modified.
> 
> Since the above commit, mpx was dropped from the kernel in 45fc24e89b7c
> ("x86/mpx: remove MPX from arch/x86"), so the motivation for calling
> arch_unmap() prior to modifying the vma tree no longer exists
> (regardless of rbtree or maple tree implementations).
> 
> Furthermore, the powerpc implementation is also no longer needed as per
> [1] and [2].  So the arch_unmap() function can be completely removed.

I'm not sure to understand. Is it replaced by something else ?
We wanted to get rid of arch_unmap() but it was supposed to be replaced 
by some core function because the functionnality itself is still 
required and indeed all the discussion around [2] demonstrated that not 
only powerpc but at least arm and probably others needed to properly 
clean-up reference to VDSO mappings on unmapping.

So as mentioned by Michael you can't just drop that without replacing it 
by something else. We need the VDSO signal handling to properly fallback 
on stack-based trampoline when the VDSO trampoline gets mapped out.

Or did I miss something ?

Christophe

> 
> Link: https://lore.kernel.org/lkml/20210611180242.711399-1-dima@arista.com/
> Link: https://github.com/linuxppc/issues/issues/241
Liam R. Howlett July 10, 2024, 11:26 p.m. UTC | #3
* LEROY Christophe <christophe.leroy2@cs-soprasteria.com> [240710 17:02]:
> 
> 
> Le 10/07/2024 à 21:22, Liam R. Howlett a écrit :
> > From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
> > 
> > The arch_unmap call was previously moved above the rbtree modifications
> > in commit 5a28fc94c914 ("x86/mpx, mm/core: Fix recursive munmap()
> > corruption").  The move was motivated by an issue with calling
> > arch_unmap() after the rbtree was modified.
> > 
> > Since the above commit, mpx was dropped from the kernel in 45fc24e89b7c
> > ("x86/mpx: remove MPX from arch/x86"), so the motivation for calling
> > arch_unmap() prior to modifying the vma tree no longer exists
> > (regardless of rbtree or maple tree implementations).
> > 
> > Furthermore, the powerpc implementation is also no longer needed as per
> > [1] and [2].  So the arch_unmap() function can be completely removed.
> 
> I'm not sure to understand. Is it replaced by something else ?
> We wanted to get rid of arch_unmap() but it was supposed to be replaced 
> by some core function because the functionnality itself is still 
> required and indeed all the discussion around [2] demonstrated that not 
> only powerpc but at least arm and probably others needed to properly 
> clean-up reference to VDSO mappings on unmapping.
> 
> So as mentioned by Michael you can't just drop that without replacing it 
> by something else. We need the VDSO signal handling to properly fallback 
> on stack-based trampoline when the VDSO trampoline gets mapped out.

I'll address this after the part I missed..

> 
> Or did I miss something ?
> 

I think I missed something in regards to what you need in ppc.

From what I understand, other platforms still map and use the vdso
(context.vdso is set), but unmap_arch() does nothing.  It is only the
powerpc version that clears the vdso pointer if it is unmapped.

git grep -w arch_unmap shows:
arch/powerpc/include/asm/mmu_context.h
arch/x86/include/asm/mmu_context.h
include/asm-generic/mm_hooks.h
mm/mmap.c

The generic and x86 versions are empty.

From the patch set you referenced, we see changes related to the files
modified, but I don't think any of them did anything with unmap_arch().

arm: a0d2fcd62ac2 ("vdso/ARM: Make union vdso_data_store available for all architectures")
arm64: d0fba04847ae ("arm64: vdso: Use generic union vdso_data_store")
mips: d697a9997a0d ("MIPS: vdso: Use generic union vdso_data_store")
s390: cb3444cfdb48 ("s390/vdso: Use generic union vdso_data_store")
riscv: eba755314fa7 ("riscv: vdso: Use generic union vdso_data_store")

ia64 is dead
nds32 is dead
hexagon has a bunch of vdso work in the logs as well.

There is also a6c19dfe3994 ("arm64,ia64,ppc,s390,sh,tile,um,x86,mm: remove default gate area")

I do not see sparc changing away from what the patches were doing, but
again, the arch_unmap() seems to do nothing there as well.

So, what I was looking to do is to avoid a call to arch specific
functions that does nothing but set the vdso pointer to NULL for
powerpc.

The thread referenced in the git bug [1] seems to indicate this is for
CRIU unmapping/restoring a task, but CRIU now just moves the vdso
mapping (or just works on ppc at this point?).  Since [2] hasn't landed,
isn't this still broken for CRIU on powerpc as it is?

So, are we keeping the unmap_arch() function around, which has errors
that were never fixed, for a single application that utilizes a newer
method of moving the vdso anyways?

On the note of CRIU, it seems it cannot handle tasks which don't have
the vdso mapped anymore [3], so setting it to NULL is probably a bad
plan even for that one application?


So, I think this just leaves the fallback when the VDSO is unmapped..
Well, it seems what people have been doing is unmap the vdso to stop
these functions from working [4]. At least this is what some users are
doing.  The ability to replace this vma with a guard vma leads me to
believe that other archs don't fall back at all - please correct me if
I'm wrong!

I also cannot find any reference to other archs clearing the
context.vdso (aside from failures in __setup_additional_pages).

But maybe I don't fully understand how this works?

Thanks,
Liam


[1] https://lore.kernel.org/lkml/87d0lht1c0.fsf@concordia.ellerman.id.au/
[2] https://lore.kernel.org/lkml/9c2b2826-4083-fc9c-5a4d-c101858dd560@linux.vnet.ibm.com/
[3] https://github.com/checkpoint-restore/criu/issues/488
[4] https://github.com/insanitybit/void-ship

Thanks,
Liam
LEROY Christophe July 11, 2024, 8:28 a.m. UTC | #4
Le 11/07/2024 à 01:26, Liam R. Howlett a écrit :
> * LEROY Christophe <christophe.leroy2@cs-soprasteria.com> [240710 17:02]:
>>
>>
>> Le 10/07/2024 à 21:22, Liam R. Howlett a écrit :
>>> From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
>>>
>>> The arch_unmap call was previously moved above the rbtree modifications
>>> in commit 5a28fc94c914 ("x86/mpx, mm/core: Fix recursive munmap()
>>> corruption").  The move was motivated by an issue with calling
>>> arch_unmap() after the rbtree was modified.
>>>
>>> Since the above commit, mpx was dropped from the kernel in 45fc24e89b7c
>>> ("x86/mpx: remove MPX from arch/x86"), so the motivation for calling
>>> arch_unmap() prior to modifying the vma tree no longer exists
>>> (regardless of rbtree or maple tree implementations).
>>>
>>> Furthermore, the powerpc implementation is also no longer needed as per
>>> [1] and [2].  So the arch_unmap() function can be completely removed.
>>
>> I'm not sure to understand. Is it replaced by something else ?
>> We wanted to get rid of arch_unmap() but it was supposed to be replaced
>> by some core function because the functionnality itself is still
>> required and indeed all the discussion around [2] demonstrated that not
>> only powerpc but at least arm and probably others needed to properly
>> clean-up reference to VDSO mappings on unmapping.
>>
>> So as mentioned by Michael you can't just drop that without replacing it
>> by something else. We need the VDSO signal handling to properly fallback
>> on stack-based trampoline when the VDSO trampoline gets mapped out.
>
> I'll address this after the part I missed..

After ? What do you mean ? It needs to be addressed _before_ removing
arch_unmap()

>
>>
>> Or did I miss something ?
>>
>
> I think I missed something in regards to what you need in ppc.

It is not only powerpc. Powerpc is the only one doing it at the moment
but investigation has demonstrated that other architectures are affected.

>
>  From what I understand, other platforms still map and use the vdso
> (context.vdso is set), but unmap_arch() does nothing.  It is only the
> powerpc version that clears the vdso pointer if it is unmapped.

Yes on powerpc it works. On other platforms like arm it segfaults so it
should be fixed
(https://lore.kernel.org/lkml/87imd5h5kb.fsf@mpe.ellerman.id.au/)

Could be fixed by properly implementing arch_unmap() on every arch, or
carry-on with Dmitry's series.

>
> git grep -w arch_unmap shows:
> arch/powerpc/include/asm/mmu_context.h
> arch/x86/include/asm/mmu_context.h
> include/asm-generic/mm_hooks.h
> mm/mmap.c
>
> The generic and x86 versions are empty.
>
>  From the patch set you referenced, we see changes related to the files
> modified, but I don't think any of them did anything with unmap_arch().

In the v3 series from Dmitry, [PATCH v3 16/23] mm: Add vdso_base in
mm_struct
(https://lore.kernel.org/all/20210611180242.711399-17-dima@arista.com/)
it is done via special_mapping_close()

>
> arm: a0d2fcd62ac2 ("vdso/ARM: Make union vdso_data_store available for all architectures")
> arm64: d0fba04847ae ("arm64: vdso: Use generic union vdso_data_store")
> mips: d697a9997a0d ("MIPS: vdso: Use generic union vdso_data_store")
> s390: cb3444cfdb48 ("s390/vdso: Use generic union vdso_data_store")
> riscv: eba755314fa7 ("riscv: vdso: Use generic union vdso_data_store")
>
> ia64 is dead
> nds32 is dead
> hexagon has a bunch of vdso work in the logs as well.
>
> There is also a6c19dfe3994 ("arm64,ia64,ppc,s390,sh,tile,um,x86,mm: remove default gate area")
>
> I do not see sparc changing away from what the patches were doing, but
> again, the arch_unmap() seems to do nothing there as well.
>
> So, what I was looking to do is to avoid a call to arch specific
> functions that does nothing but set the vdso pointer to NULL for
> powerpc.

That's what is doing Dmitry's series, removing arch_unmap() and replace
it with core handling. The advantage being that it addresses it for all
affected architectures, improving the current situation.

>
> The thread referenced in the git bug [1] seems to indicate this is for
> CRIU unmapping/restoring a task, but CRIU now just moves the vdso
> mapping (or just works on ppc at this point?).  Since [2] hasn't landed,
> isn't this still broken for CRIU on powerpc as it is?
>
> So, are we keeping the unmap_arch() function around, which has errors
> that were never fixed, for a single application that utilizes a newer
> method of moving the vdso anyways?

Again, we want to remove arch_unmap() but we want the core-mm to handle
it instead.

>
> On the note of CRIU, it seems it cannot handle tasks which don't have
> the vdso mapped anymore [3], so setting it to NULL is probably a bad
> plan even for that one application?

But as mentioned by Dmitry it is not only CRIU. There has also been
issues with Valgrind.

>
>
> So, I think this just leaves the fallback when the VDSO is unmapped..
> Well, it seems what people have been doing is unmap the vdso to stop
> these functions from working [4]. At least this is what some users are
> doing.  The ability to replace this vma with a guard vma leads me to
> believe that other archs don't fall back at all - please correct me if
> I'm wrong!
>
> I also cannot find any reference to other archs clearing the
> context.vdso (aside from failures in __setup_additional_pages).
>
> But maybe I don't fully understand how this works?

I think you fully understand that it doesn't work as it is except on
powerpc. Again the goal should be to make it work on all architectures.

Thanks
Christophe

>
> Thanks,
> Liam
>
>
> [1] https://lore.kernel.org/lkml/87d0lht1c0.fsf@concordia.ellerman.id.au/
> [2] https://lore.kernel.org/lkml/9c2b2826-4083-fc9c-5a4d-c101858dd560@linux.vnet.ibm.com/
> [3] https://github.com/checkpoint-restore/criu/issues/488
> [4] https://github.com/insanitybit/void-ship
>
> Thanks,
> Liam
>
>
Liam R. Howlett July 11, 2024, 3:59 p.m. UTC | #5
* LEROY Christophe <christophe.leroy2@cs-soprasteria.com> [240711 04:28]:
> 
> 
> Le 11/07/2024 à 01:26, Liam R. Howlett a écrit :
> > * LEROY Christophe <christophe.leroy2@cs-soprasteria.com> [240710 17:02]:
> >>
> >>
> >> Le 10/07/2024 à 21:22, Liam R. Howlett a écrit :
> >>> From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
> >>>
> >>> The arch_unmap call was previously moved above the rbtree modifications
> >>> in commit 5a28fc94c914 ("x86/mpx, mm/core: Fix recursive munmap()
> >>> corruption").  The move was motivated by an issue with calling
> >>> arch_unmap() after the rbtree was modified.
> >>>
> >>> Since the above commit, mpx was dropped from the kernel in 45fc24e89b7c
> >>> ("x86/mpx: remove MPX from arch/x86"), so the motivation for calling
> >>> arch_unmap() prior to modifying the vma tree no longer exists
> >>> (regardless of rbtree or maple tree implementations).
> >>>
> >>> Furthermore, the powerpc implementation is also no longer needed as per
> >>> [1] and [2].  So the arch_unmap() function can be completely removed.
> >>
> >> I'm not sure to understand. Is it replaced by something else ?
> >> We wanted to get rid of arch_unmap() but it was supposed to be replaced
> >> by some core function because the functionnality itself is still
> >> required and indeed all the discussion around [2] demonstrated that not
> >> only powerpc but at least arm and probably others needed to properly
> >> clean-up reference to VDSO mappings on unmapping.
> >>
> >> So as mentioned by Michael you can't just drop that without replacing it
> >> by something else. We need the VDSO signal handling to properly fallback
> >> on stack-based trampoline when the VDSO trampoline gets mapped out.
> >
> > I'll address this after the part I missed..
> 
> After ? What do you mean ? It needs to be addressed _before_ removing
> arch_unmap()

After the later comments in this email, sorry that wasn't clear.

> 
> >
> >>
> >> Or did I miss something ?
> >>
> >
> > I think I missed something in regards to what you need in ppc.
> 
> It is not only powerpc. Powerpc is the only one doing it at the moment
> but investigation has demonstrated that other architectures are affected.
> 
> >
> >  From what I understand, other platforms still map and use the vdso
> > (context.vdso is set), but unmap_arch() does nothing.  It is only the
> > powerpc version that clears the vdso pointer if it is unmapped.
> 
> Yes on powerpc it works. On other platforms like arm it segfaults so it
> should be fixed
> (https://lore.kernel.org/lkml/87imd5h5kb.fsf@mpe.ellerman.id.au/)
> 
> Could be fixed by properly implementing arch_unmap() on every arch, or
> carry-on with Dmitry's series.

Okay, I understand what you are saying now.  I'm not going to tackle
that change within this series, so I'll just relocate the arch_munmap()
back to where it was, after the removal of the vmas in v5.

> I think you fully understand that it doesn't work as it is except on
> powerpc. Again the goal should be to make it work on all architectures.

Got it, thanks for clarifying.

Regards,
Liam
diff mbox series

Patch

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 37bffa0f7918..a334a1368848 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -260,15 +260,6 @@  static inline void enter_lazy_tlb(struct mm_struct *mm,
 
 extern void arch_exit_mmap(struct mm_struct *mm);
 
-static inline void arch_unmap(struct mm_struct *mm,
-			      unsigned long start, unsigned long end)
-{
-	unsigned long vdso_base = (unsigned long)mm->context.vdso;
-
-	if (start <= vdso_base && vdso_base < end)
-		mm->context.vdso = NULL;
-}
-
 #ifdef CONFIG_PPC_MEM_KEYS
 bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write,
 			       bool execute, bool foreign);
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 8dac45a2c7fc..80f2a3187aa6 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -232,11 +232,6 @@  static inline bool is_64bit_mm(struct mm_struct *mm)
 }
 #endif
 
-static inline void arch_unmap(struct mm_struct *mm, unsigned long start,
-			      unsigned long end)
-{
-}
-
 /*
  * We only want to enforce protection keys on the current process
  * because we effectively have no access to PKRU for other
diff --git a/include/asm-generic/mm_hooks.h b/include/asm-generic/mm_hooks.h
index 4dbb177d1150..f7996376baf9 100644
--- a/include/asm-generic/mm_hooks.h
+++ b/include/asm-generic/mm_hooks.h
@@ -1,8 +1,8 @@ 
 /* SPDX-License-Identifier: GPL-2.0 */
 /*
- * Define generic no-op hooks for arch_dup_mmap, arch_exit_mmap
- * and arch_unmap to be included in asm-FOO/mmu_context.h for any
- * arch FOO which doesn't need to hook these.
+ * Define generic no-op hooks for arch_dup_mmap and arch_exit_mmap to be
+ * included in asm-FOO/mmu_context.h for any arch FOO which doesn't need to hook
+ * these.
  */
 #ifndef _ASM_GENERIC_MM_HOOKS_H
 #define _ASM_GENERIC_MM_HOOKS_H
@@ -17,11 +17,6 @@  static inline void arch_exit_mmap(struct mm_struct *mm)
 {
 }
 
-static inline void arch_unmap(struct mm_struct *mm,
-			unsigned long start, unsigned long end)
-{
-}
-
 static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 		bool write, bool execute, bool foreign)
 {
diff --git a/mm/mmap.c b/mm/mmap.c
index d5bd404893a8..df565f51971d 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2652,6 +2652,7 @@  static void vms_complete_munmap_vmas(struct vma_munmap_struct *vms,
 	mm = vms->mm;
 	mm->map_count -= vms->vma_count;
 	mm->locked_vm -= vms->locked_vm;
+
 	if (vms->unlock)
 		mmap_write_downgrade(mm);
 
@@ -2879,7 +2880,7 @@  do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
  *
  * This function takes a @mas that is either pointing to the previous VMA or set
  * to MA_START and sets it up to remove the mapping(s).  The @len will be
- * aligned and any arch_unmap work will be preformed.
+ * aligned.
  *
  * Return: 0 on success and drops the lock if so directed, error and leaves the
  * lock held otherwise.
@@ -2899,16 +2900,12 @@  int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
 		return -EINVAL;
 
 	/*
-	 * Check if memory is sealed before arch_unmap.
 	 * Prevent unmapping a sealed VMA.
 	 * can_modify_mm assumes we have acquired the lock on MM.
 	 */
 	if (unlikely(!can_modify_mm(mm, start, end)))
 		return -EPERM;
 
-	 /* arch_unmap() might do unmaps itself.  */
-	arch_unmap(mm, start, end);
-
 	/* Find the first overlapping VMA */
 	vma = vma_find(vmi, end);
 	if (!vma) {
@@ -2969,9 +2966,6 @@  unsigned long mmap_region(struct file *file, unsigned long addr,
 	if (unlikely(!can_modify_mm(mm, addr, end)))
 		return -EPERM;
 
-	 /* arch_unmap() might do unmaps itself.  */
-	arch_unmap(mm, addr, end);
-
 	/* Find the first overlapping VMA */
 	vma = vma_find(&vmi, end);
 	init_vma_munmap(&vms, &vmi, vma, addr, end, uf, /* unlock = */ false);
@@ -3348,14 +3342,12 @@  int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	struct mm_struct *mm = vma->vm_mm;
 
 	/*
-	 * Check if memory is sealed before arch_unmap.
 	 * Prevent unmapping a sealed VMA.
 	 * can_modify_mm assumes we have acquired the lock on MM.
 	 */
 	if (unlikely(!can_modify_mm(mm, start, end)))
 		return -EPERM;
 
-	arch_unmap(mm, start, end);
 	return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
 }