
powerpc/code-patching: Fix oops with DEBUG_VM enabled

Message ID 20221216125913.990972-1-mpe@ellerman.id.au (mailing list archive)
State Accepted
Commit 980411a4d1bb925d28cd9e8d8301dc982ece788d
Series powerpc/code-patching: Fix oops with DEBUG_VM enabled

Checks

Context Check Description
snowpatch_ozlabs/github-powerpc_clang success Successfully ran 6 jobs.
snowpatch_ozlabs/github-powerpc_ppctests success Successfully ran 8 jobs.
snowpatch_ozlabs/github-powerpc_sparse success Successfully ran 4 jobs.
snowpatch_ozlabs/github-powerpc_selftests success Successfully ran 8 jobs.
snowpatch_ozlabs/github-powerpc_kernel_qemu success Successfully ran 24 jobs.

Commit Message

Michael Ellerman Dec. 16, 2022, 12:59 p.m. UTC
Nathan reported that the new per-cpu mm patching oopses if DEBUG_VM is
enabled:

  ------------[ cut here ]------------
  kernel BUG at arch/powerpc/mm/pgtable.c:333!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc2+ #1
  Hardware name: IBM PowerNV (emulated by qemu) POWER9 0x4e1200 opal:v7.0 PowerNV
  ...
  NIP assert_pte_locked+0x180/0x1a0
  LR  assert_pte_locked+0x170/0x1a0
  Call Trace:
    0x60000000 (unreliable)
    patch_instruction+0x618/0x6d0
    arch_prepare_kprobe+0xfc/0x2d0
    register_kprobe+0x520/0x7c0
    arch_init_kprobes+0x28/0x3c
    init_kprobes+0x108/0x184
    do_one_initcall+0x60/0x2e0
    kernel_init_freeable+0x1f0/0x3e0
    kernel_init+0x34/0x1d0
    ret_from_kernel_thread+0x5c/0x64

It's caused by the assert_spin_locked() failing in assert_pte_locked().
The assert fails because the PTE was unlocked in text_area_cpu_up_mm(),
and never relocked.

The PTE page shouldn't be freed: the patching_mm is only used for
patching on this CPU, only that single PTE is ever mapped, and it's only
unmapped at CPU offline.

In fact, assert_pte_locked() has a special case to ignore init_mm
entirely, and the patching_mm is more-or-less like init_mm, so possibly
the check could be skipped for patching_mm too.

But for now be conservative, and use the proper PTE accessors at
patching time, so that the PTE lock is held while the PTE is used. That
also avoids the assertion failure in assert_pte_locked().

With that it's no longer necessary to save the PTE in
cpu_patching_context for the mm_patch_enabled() case.

Fixes: c28c15b6d28a ("powerpc/code-patching: Use temporary mm for Radix MMU")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/lib/code-patching.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

Comments

Michael Ellerman Dec. 19, 2022, 3:59 a.m. UTC | #1
On Fri, 16 Dec 2022 23:59:13 +1100, Michael Ellerman wrote:
> Nathan reported that the new per-cpu mm patching oopses if DEBUG_VM is
> enabled:
> 
>   ------------[ cut here ]------------
>   kernel BUG at arch/powerpc/mm/pgtable.c:333!
>   Oops: Exception in kernel mode, sig: 5 [#1]
>   LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
>   Modules linked in:
>   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc2+ #1
>   Hardware name: IBM PowerNV (emulated by qemu) POWER9 0x4e1200 opal:v7.0 PowerNV
>   ...
>   NIP assert_pte_locked+0x180/0x1a0
>   LR  assert_pte_locked+0x170/0x1a0
>   Call Trace:
>     0x60000000 (unreliable)
>     patch_instruction+0x618/0x6d0
>     arch_prepare_kprobe+0xfc/0x2d0
>     register_kprobe+0x520/0x7c0
>     arch_init_kprobes+0x28/0x3c
>     init_kprobes+0x108/0x184
>     do_one_initcall+0x60/0x2e0
>     kernel_init_freeable+0x1f0/0x3e0
>     kernel_init+0x34/0x1d0
>     ret_from_kernel_thread+0x5c/0x64
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/code-patching: Fix oops with DEBUG_VM enabled
      https://git.kernel.org/powerpc/c/980411a4d1bb925d28cd9e8d8301dc982ece788d

cheers

Patch

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 73ce4b90bb1b..b00112d7ad46 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -178,7 +178,6 @@  static int text_area_cpu_up_mm(unsigned int cpu)
 
 	this_cpu_write(cpu_patching_context.mm, mm);
 	this_cpu_write(cpu_patching_context.addr, addr);
-	this_cpu_write(cpu_patching_context.pte, pte);
 
 	return 0;
 
@@ -195,7 +194,6 @@  static int text_area_cpu_down_mm(unsigned int cpu)
 
 	this_cpu_write(cpu_patching_context.mm, NULL);
 	this_cpu_write(cpu_patching_context.addr, 0);
-	this_cpu_write(cpu_patching_context.pte, NULL);
 
 	return 0;
 }
@@ -289,12 +287,16 @@  static int __do_patch_instruction_mm(u32 *addr, ppc_inst_t instr)
 	unsigned long pfn = get_patch_pfn(addr);
 	struct mm_struct *patching_mm;
 	struct mm_struct *orig_mm;
+	spinlock_t *ptl;
 
 	patching_mm = __this_cpu_read(cpu_patching_context.mm);
-	pte = __this_cpu_read(cpu_patching_context.pte);
 	text_poke_addr = __this_cpu_read(cpu_patching_context.addr);
 	patch_addr = (u32 *)(text_poke_addr + offset_in_page(addr));
 
+	pte = get_locked_pte(patching_mm, text_poke_addr, &ptl);
+	if (!pte)
+		return -ENOMEM;
+
 	__set_pte_at(patching_mm, text_poke_addr, pte, pfn_pte(pfn, PAGE_KERNEL), 0);
 
 	/* order PTE update before use, also serves as the hwsync */
@@ -321,6 +323,8 @@  static int __do_patch_instruction_mm(u32 *addr, ppc_inst_t instr)
 	 */
 	local_flush_tlb_page_psize(patching_mm, text_poke_addr, mmu_virtual_psize);
 
+	pte_unmap_unlock(pte, ptl);
+
 	return err;
 }