Message ID | 20190313145717.46369-1-cai@lca.pw (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | [v2] kmemleak: skip scanning holes in the .bss section | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch next (9580b71b5a7863c24a9bd18bcd2ad759b86b1eff) |
snowpatch_ozlabs/build-ppc64le | success | build succeeded & removed 0 sparse warning(s) |
snowpatch_ozlabs/build-ppc64be | success | build succeeded & removed 0 sparse warning(s) |
snowpatch_ozlabs/build-ppc64e | success | build succeeded & removed 0 sparse warning(s) |
snowpatch_ozlabs/build-pmac32 | success | build succeeded & removed 0 sparse warning(s) |
snowpatch_ozlabs/checkpatch | warning | total: 0 errors, 0 warnings, 1 checks, 74 lines checked |
Hi Qian, On Wed, Mar 13, 2019 at 10:57:17AM -0400, Qian Cai wrote: > @@ -1531,7 +1547,14 @@ static void kmemleak_scan(void) > > /* data/bss scanning */ > scan_large_block(_sdata, _edata); > - scan_large_block(__bss_start, __bss_stop); > + > + if (bss_hole_start) { > + scan_large_block(__bss_start, bss_hole_start); > + scan_large_block(bss_hole_stop, __bss_stop); > + } else { > + scan_large_block(__bss_start, __bss_stop); > + } > + > scan_large_block(__start_ro_after_init, __end_ro_after_init); I'm not a fan of this approach but I couldn't come up with anything better. I was hoping we could check for PageReserved() in scan_block() but on arm64 it ends up not scanning the .bss at all. Until another user appears, I'm ok with this patch. Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Catalin Marinas <catalin.marinas@arm.com> writes: > Hi Qian, > > On Wed, Mar 13, 2019 at 10:57:17AM -0400, Qian Cai wrote: >> @@ -1531,7 +1547,14 @@ static void kmemleak_scan(void) >> >> /* data/bss scanning */ >> scan_large_block(_sdata, _edata); >> - scan_large_block(__bss_start, __bss_stop); >> + >> + if (bss_hole_start) { >> + scan_large_block(__bss_start, bss_hole_start); >> + scan_large_block(bss_hole_stop, __bss_stop); >> + } else { >> + scan_large_block(__bss_start, __bss_stop); >> + } >> + >> scan_large_block(__start_ro_after_init, __end_ro_after_init); > > I'm not a fan of this approach but I couldn't come up with anything > better. I was hoping we could check for PageReserved() in scan_block() > but on arm64 it ends up not scanning the .bss at all. > > Until another user appears, I'm ok with this patch. > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> I actually would like to rework this kvm_tmp thing to not be in bss at all. It's a bit of a hack and is incompatible with strict RWX. If we size it a bit more conservatively we can hopefully just reserve some space in the text section for it. I'm not going to have time to work on that immediately though, so if people want this fixed now then this patch could go in as a temporary solution. cheers
On Thu, Mar 21, 2019 at 12:15:46AM +1100, Michael Ellerman wrote: > Catalin Marinas <catalin.marinas@arm.com> writes: > > On Wed, Mar 13, 2019 at 10:57:17AM -0400, Qian Cai wrote: > >> @@ -1531,7 +1547,14 @@ static void kmemleak_scan(void) > >> > >> /* data/bss scanning */ > >> scan_large_block(_sdata, _edata); > >> - scan_large_block(__bss_start, __bss_stop); > >> + > >> + if (bss_hole_start) { > >> + scan_large_block(__bss_start, bss_hole_start); > >> + scan_large_block(bss_hole_stop, __bss_stop); > >> + } else { > >> + scan_large_block(__bss_start, __bss_stop); > >> + } > >> + > >> scan_large_block(__start_ro_after_init, __end_ro_after_init); > > > > I'm not a fan of this approach but I couldn't come up with anything > > better. I was hoping we could check for PageReserved() in scan_block() > > but on arm64 it ends up not scanning the .bss at all. > > > > Until another user appears, I'm ok with this patch. > > > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > > I actually would like to rework this kvm_tmp thing to not be in bss at > all. It's a bit of a hack and is incompatible with strict RWX. > > If we size it a bit more conservatively we can hopefully just reserve > some space in the text section for it. > > I'm not going to have time to work on that immediately though, so if > people want this fixed now then this patch could go in as a temporary > solution. I think I have a simpler idea. Kmemleak allows punching holes in allocated objects, so just turn the data/bss sections into dedicated kmemleak objects. This happens when kmemleak is initialised, before the initcalls are invoked. The kvm_free_tmp() would just free the corresponding part of the bss. Patch below, only tested briefly on arm64. Qian, could you give it a try on powerpc? Thanks. --------8<------------------------------ diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c index 683b5b3805bd..c4b8cb3c298d 100644 --- a/arch/powerpc/kernel/kvm.c +++ b/arch/powerpc/kernel/kvm.c @@ -712,6 +712,8 @@ static void kvm_use_magic_page(void) static __init void kvm_free_tmp(void) { + kmemleak_free_part(&kvm_tmp[kvm_tmp_index], + ARRAY_SIZE(kvm_tmp) - kvm_tmp_index); free_reserved_area(&kvm_tmp[kvm_tmp_index], &kvm_tmp[ARRAY_SIZE(kvm_tmp)], -1, NULL); } diff --git a/mm/kmemleak.c b/mm/kmemleak.c index 707fa5579f66..0f6adcbfc2c7 100644 --- a/mm/kmemleak.c +++ b/mm/kmemleak.c @@ -1529,11 +1529,6 @@ static void kmemleak_scan(void) } rcu_read_unlock(); - /* data/bss scanning */ - scan_large_block(_sdata, _edata); - scan_large_block(__bss_start, __bss_stop); - scan_large_block(__start_ro_after_init, __end_ro_after_init); - #ifdef CONFIG_SMP /* per-cpu sections scanning */ for_each_possible_cpu(i) @@ -2071,6 +2066,15 @@ void __init kmemleak_init(void) } local_irq_restore(flags); + /* register the data/bss sections */ + create_object((unsigned long)_sdata, _edata - _sdata, + KMEMLEAK_GREY, GFP_ATOMIC); + create_object((unsigned long)__bss_start, __bss_stop - __bss_start, + KMEMLEAK_GREY, GFP_ATOMIC); + create_object((unsigned long)__start_ro_after_init, + __end_ro_after_init - __start_ro_after_init, + KMEMLEAK_GREY, GFP_ATOMIC); + /* * This is the point where tracking allocations is safe. Automatic * scanning is started during the late initcall. Add the early logged
On Wed, 2019-03-20 at 18:16 +0000, Catalin Marinas wrote: > I think I have a simpler idea. Kmemleak allows punching holes in > allocated objects, so just turn the data/bss sections into dedicated > kmemleak objects. This happens when kmemleak is initialised, before the > initcalls are invoked. The kvm_free_tmp() would just free the > corresponding part of the bss. > > Patch below, only tested briefly on arm64. Qian, could you give it a try > on powerpc? Thanks. It works great so far!
Catalin Marinas <catalin.marinas@arm.com> writes: > On Thu, Mar 21, 2019 at 12:15:46AM +1100, Michael Ellerman wrote: >> Catalin Marinas <catalin.marinas@arm.com> writes: >> > On Wed, Mar 13, 2019 at 10:57:17AM -0400, Qian Cai wrote: >> >> @@ -1531,7 +1547,14 @@ static void kmemleak_scan(void) >> >> >> >> /* data/bss scanning */ >> >> scan_large_block(_sdata, _edata); >> >> - scan_large_block(__bss_start, __bss_stop); >> >> + >> >> + if (bss_hole_start) { >> >> + scan_large_block(__bss_start, bss_hole_start); >> >> + scan_large_block(bss_hole_stop, __bss_stop); >> >> + } else { >> >> + scan_large_block(__bss_start, __bss_stop); >> >> + } >> >> + >> >> scan_large_block(__start_ro_after_init, __end_ro_after_init); >> > >> > I'm not a fan of this approach but I couldn't come up with anything >> > better. I was hoping we could check for PageReserved() in scan_block() >> > but on arm64 it ends up not scanning the .bss at all. >> > >> > Until another user appears, I'm ok with this patch. >> > >> > Acked-by: Catalin Marinas <catalin.marinas@arm.com> >> >> I actually would like to rework this kvm_tmp thing to not be in bss at >> all. It's a bit of a hack and is incompatible with strict RWX. >> >> If we size it a bit more conservatively we can hopefully just reserve >> some space in the text section for it. >> >> I'm not going to have time to work on that immediately though, so if >> people want this fixed now then this patch could go in as a temporary >> solution. > > I think I have a simpler idea. Kmemleak allows punching holes in > allocated objects, so just turn the data/bss sections into dedicated > kmemleak objects. This happens when kmemleak is initialised, before the > initcalls are invoked. The kvm_free_tmp() would just free the > corresponding part of the bss. > > Patch below, only tested briefly on arm64. Qian, could you give it a try > on powerpc? Thanks. > > --------8<------------------------------ > diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c > index 683b5b3805bd..c4b8cb3c298d 100644 > --- a/arch/powerpc/kernel/kvm.c > +++ b/arch/powerpc/kernel/kvm.c > @@ -712,6 +712,8 @@ static void kvm_use_magic_page(void) > > static __init void kvm_free_tmp(void) > { > + kmemleak_free_part(&kvm_tmp[kvm_tmp_index], > + ARRAY_SIZE(kvm_tmp) - kvm_tmp_index); > free_reserved_area(&kvm_tmp[kvm_tmp_index], > &kvm_tmp[ARRAY_SIZE(kvm_tmp)], -1, NULL); > } Fine by me as long as it works (sounds like it does). Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) cheers
diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c index 683b5b3805bd..5cddc8fc56bb 100644 --- a/arch/powerpc/kernel/kvm.c +++ b/arch/powerpc/kernel/kvm.c @@ -26,6 +26,7 @@ #include <linux/slab.h> #include <linux/of.h> #include <linux/pagemap.h> +#include <linux/kmemleak.h> #include <asm/reg.h> #include <asm/sections.h> @@ -712,6 +713,8 @@ static void kvm_use_magic_page(void) static __init void kvm_free_tmp(void) { + kmemleak_bss_hole(&kvm_tmp[kvm_tmp_index], + &kvm_tmp[ARRAY_SIZE(kvm_tmp)]); free_reserved_area(&kvm_tmp[kvm_tmp_index], &kvm_tmp[ARRAY_SIZE(kvm_tmp)], -1, NULL); } diff --git a/include/linux/kmemleak.h b/include/linux/kmemleak.h index 5ac416e2d339..17d3684e81ab 100644 --- a/include/linux/kmemleak.h +++ b/include/linux/kmemleak.h @@ -46,6 +46,7 @@ extern void kmemleak_alloc_phys(phys_addr_t phys, size_t size, int min_count, extern void kmemleak_free_part_phys(phys_addr_t phys, size_t size) __ref; extern void kmemleak_not_leak_phys(phys_addr_t phys) __ref; extern void kmemleak_ignore_phys(phys_addr_t phys) __ref; +extern void kmemleak_bss_hole(void *start, void *stop) __init; static inline void kmemleak_alloc_recursive(const void *ptr, size_t size, int min_count, slab_flags_t flags, @@ -131,6 +132,9 @@ static inline void kmemleak_not_leak_phys(phys_addr_t phys) static inline void kmemleak_ignore_phys(phys_addr_t phys) { } +static inline void kmemleak_bss_hole(void *start, void *stop) +{ +} #endif /* CONFIG_DEBUG_KMEMLEAK */ diff --git a/mm/kmemleak.c b/mm/kmemleak.c index 707fa5579f66..a2d894d3de07 100644 --- a/mm/kmemleak.c +++ b/mm/kmemleak.c @@ -237,6 +237,10 @@ static int kmemleak_skip_disable; /* If there are leaks that can be reported */ static bool kmemleak_found_leaks; +/* Skip scanning of a range in the .bss section. */ +static void *bss_hole_start; +static void *bss_hole_stop; + static bool kmemleak_verbose; module_param_named(verbose, kmemleak_verbose, bool, 0600); @@ -1265,6 +1269,18 @@ void __ref kmemleak_ignore_phys(phys_addr_t phys) } EXPORT_SYMBOL(kmemleak_ignore_phys); +/** + * kmemleak_bss_hole - skip scanning a range in the .bss section + * + * @start: start of the range + * @stop: end of the range + */ +void __init kmemleak_bss_hole(void *start, void *stop) +{ + bss_hole_start = start; + bss_hole_stop = stop; +} + /* * Update an object's checksum and return true if it was modified. */ @@ -1531,7 +1547,14 @@ static void kmemleak_scan(void) /* data/bss scanning */ scan_large_block(_sdata, _edata); - scan_large_block(__bss_start, __bss_stop); + + if (bss_hole_start) { + scan_large_block(__bss_start, bss_hole_start); + scan_large_block(bss_hole_stop, __bss_stop); + } else { + scan_large_block(__bss_start, __bss_stop); + } + scan_large_block(__start_ro_after_init, __end_ro_after_init); #ifdef CONFIG_SMP
The commit 2d4f567103ff ("KVM: PPC: Introduce kvm_tmp framework") adds kvm_tmp[] into the .bss section and then free the rest of unused spaces back to the page allocator. kernel_init kvm_guest_init kvm_free_tmp free_reserved_area free_unref_page free_unref_page_prepare With DEBUG_PAGEALLOC=y, it will unmap those pages from kernel. As the result, kmemleak scan will trigger a panic below when it scans the .bss section with unmapped pages. Since this is done way before the first kmemleak_scan(), just go lockless to make the implementation simple and skip those pages when scanning the .bss section. Later, those pages could be tracked by kmemleak again once allocated by the page allocator. Overall, this is such a special case, so no need to make it a generic to let kmemleak gain an ability to skip blocks in scan_large_block() for now. BUG: Unable to handle kernel data access at 0xc000000001610000 Faulting instruction address: 0xc0000000003cc178 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA pSeries CPU: 3 PID: 130 Comm: kmemleak Kdump: loaded Not tainted 5.0.0+ #9 REGS: c0000004b05bf940 TRAP: 0300 Not tainted (5.0.0+) NIP [c0000000003cc178] scan_block+0xa8/0x190 LR [c0000000003cc170] scan_block+0xa0/0x190 Call Trace: [c0000004b05bfbd0] [c0000000003cc170] scan_block+0xa0/0x190 (unreliable) [c0000004b05bfc30] [c0000000003cc2c0] scan_large_block+0x60/0xa0 [c0000004b05bfc70] [c0000000003ccc64] kmemleak_scan+0x254/0x960 [c0000004b05bfd40] [c0000000003cdd50] kmemleak_scan_thread+0xec/0x12c [c0000004b05bfdb0] [c000000000104388] kthread+0x1b8/0x1c0 [c0000004b05bfe20] [c00000000000b364] ret_from_kernel_thread+0x5c/0x78 Instruction dump: 7fa3eb78 4844667d 60000000 60000000 60000000 60000000 3bff0008 7fbcf840 409d00b8 4bfffeed 2fa30000 409e00ac <e87f0000> e93e0128 7fa91840 419dffdc Signed-off-by: Qian Cai <cai@lca.pw> --- v2: make the function __init per Andrew. arch/powerpc/kernel/kvm.c | 3 +++ include/linux/kmemleak.h | 4 ++++ mm/kmemleak.c | 25 ++++++++++++++++++++++++- 3 files changed, 31 insertions(+), 1 deletion(-)