Message ID | 156166327993.13320.10788410344711883330.stgit@hbathini.in.ibm.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | [1/2] powerpc: reserve memory for capture kernel after hugepages init | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch next (c7d64b560ce80d8c44f082eee8352f0778a73195) |
snowpatch_ozlabs/build-ppc64le | success | Build succeeded |
snowpatch_ozlabs/build-ppc64be | success | Build succeeded |
snowpatch_ozlabs/build-ppc64e | success | Build succeeded |
snowpatch_ozlabs/build-pmac32 | success | Build succeeded |
snowpatch_ozlabs/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 67 lines checked |
On Fri, 28 Jun 2019 00:51:19 +0530 Hari Bathini <hbathini@linux.ibm.com> wrote: > Currently, if memory_limit is specified and it overlaps with memory to > be reserved for capture kernel, memory_limit is adjusted to accommodate > capture kernel. With memory reservation for capture kernel moved later > (after enforcing memory limit), this adjustment no longer holds water. > So, avoid adjusting memory_limit and error out instead. Can you split out the memory limit adjustment out of memory reservation so it can still be adjusted? Thanks Michal > > Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> > --- > arch/powerpc/kernel/fadump.c | 16 ---------------- > arch/powerpc/kernel/machine_kexec.c | 22 +++++++++++----------- > 2 files changed, 11 insertions(+), 27 deletions(-) > > diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c > index 4eab972..a784695 100644 > --- a/arch/powerpc/kernel/fadump.c > +++ b/arch/powerpc/kernel/fadump.c > @@ -476,22 +476,6 @@ int __init fadump_reserve_mem(void) > #endif > } > > - /* > - * Calculate the memory boundary. > - * If memory_limit is less than actual memory boundary then reserve > - * the memory for fadump beyond the memory_limit and adjust the > - * memory_limit accordingly, so that the running kernel can run with > - * specified memory_limit. > - */ > - if (memory_limit && memory_limit < memblock_end_of_DRAM()) { > - size = get_fadump_area_size(); > - if ((memory_limit + size) < memblock_end_of_DRAM()) > - memory_limit += size; > - else > - memory_limit = memblock_end_of_DRAM(); > - printk(KERN_INFO "Adjusted memory_limit for firmware-assisted" > - " dump, now %#016llx\n", memory_limit); > - } > if (memory_limit) > memory_boundary = memory_limit; > else > diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c > index c4ed328..fc5533b 100644 > --- a/arch/powerpc/kernel/machine_kexec.c > +++ b/arch/powerpc/kernel/machine_kexec.c > @@ -125,10 +125,8 @@ void __init reserve_crashkernel(void) > crashk_res.end = crash_base + crash_size - 1; > } > > - if (crashk_res.end == crashk_res.start) { > - crashk_res.start = crashk_res.end = 0; > - return; > - } > + if (crashk_res.end == crashk_res.start) > + goto error_out; > > /* We might have got these values via the command line or the > * device tree, either way sanitise them now. */ > @@ -170,15 +168,13 @@ void __init reserve_crashkernel(void) > if (overlaps_crashkernel(__pa(_stext), _end - _stext)) { > printk(KERN_WARNING > "Crash kernel can not overlap current kernel\n"); > - crashk_res.start = crashk_res.end = 0; > - return; > + goto error_out; > } > > /* Crash kernel trumps memory limit */ > if (memory_limit && memory_limit <= crashk_res.end) { > - memory_limit = crashk_res.end + 1; > - printk("Adjusted memory limit for crashkernel, now 0x%llx\n", > - memory_limit); > + pr_err("Crash kernel size can't exceed memory_limit\n"); > + goto error_out; > } > > printk(KERN_INFO "Reserving %ldMB of memory at %ldMB " > @@ -190,9 +186,13 @@ void __init reserve_crashkernel(void) > if (!memblock_is_region_memory(crashk_res.start, crash_size) || > memblock_reserve(crashk_res.start, crash_size)) { > pr_err("Failed to reserve memory for crashkernel!\n"); > - crashk_res.start = crashk_res.end = 0; > - return; > + goto error_out; > } > + > + return; > +error_out: > + crashk_res.start = crashk_res.end = 0; > + return; > } > > int overlaps_crashkernel(unsigned long start, unsigned long size) >
On 7/22/19 11:19 PM, Michal Suchánek wrote: > On Fri, 28 Jun 2019 00:51:19 +0530 > Hari Bathini <hbathini@linux.ibm.com> wrote: > >> Currently, if memory_limit is specified and it overlaps with memory to >> be reserved for capture kernel, memory_limit is adjusted to accommodate >> capture kernel. With memory reservation for capture kernel moved later >> (after enforcing memory limit), this adjustment no longer holds water. >> So, avoid adjusting memory_limit and error out instead. > > Can you split out the memory limit adjustment out of memory reservation > so it can still be adjusted? Do you mean adjust the memory limit before we do the actual reservation ? > > Thanks > > Michal >> >> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> >> --- >> arch/powerpc/kernel/fadump.c | 16 ---------------- >> arch/powerpc/kernel/machine_kexec.c | 22 +++++++++++----------- >> 2 files changed, 11 insertions(+), 27 deletions(-) >> >> diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c >> index 4eab972..a784695 100644 >> --- a/arch/powerpc/kernel/fadump.c >> +++ b/arch/powerpc/kernel/fadump.c >> @@ -476,22 +476,6 @@ int __init fadump_reserve_mem(void) >> #endif >> } >> >> - /* >> - * Calculate the memory boundary. >> - * If memory_limit is less than actual memory boundary then reserve >> - * the memory for fadump beyond the memory_limit and adjust the >> - * memory_limit accordingly, so that the running kernel can run with >> - * specified memory_limit. >> - */ >> - if (memory_limit && memory_limit < memblock_end_of_DRAM()) { >> - size = get_fadump_area_size(); >> - if ((memory_limit + size) < memblock_end_of_DRAM()) >> - memory_limit += size; >> - else >> - memory_limit = memblock_end_of_DRAM(); >> - printk(KERN_INFO "Adjusted memory_limit for firmware-assisted" >> - " dump, now %#016llx\n", memory_limit); >> - } >> if (memory_limit) >> memory_boundary = memory_limit; >> else >> diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c >> index c4ed328..fc5533b 100644 >> --- a/arch/powerpc/kernel/machine_kexec.c >> +++ b/arch/powerpc/kernel/machine_kexec.c >> @@ -125,10 +125,8 @@ void __init reserve_crashkernel(void) >> crashk_res.end = crash_base + crash_size - 1; >> } >> >> - if (crashk_res.end == crashk_res.start) { >> - crashk_res.start = crashk_res.end = 0; >> - return; >> - } >> + if (crashk_res.end == crashk_res.start) >> + goto error_out; >> >> /* We might have got these values via the command line or the >> * device tree, either way sanitise them now. */ >> @@ -170,15 +168,13 @@ void __init reserve_crashkernel(void) >> if (overlaps_crashkernel(__pa(_stext), _end - _stext)) { >> printk(KERN_WARNING >> "Crash kernel can not overlap current kernel\n"); >> - crashk_res.start = crashk_res.end = 0; >> - return; >> + goto error_out; >> } >> >> /* Crash kernel trumps memory limit */ >> if (memory_limit && memory_limit <= crashk_res.end) { >> - memory_limit = crashk_res.end + 1; >> - printk("Adjusted memory limit for crashkernel, now 0x%llx\n", >> - memory_limit); >> + pr_err("Crash kernel size can't exceed memory_limit\n"); >> + goto error_out; >> } >> >> printk(KERN_INFO "Reserving %ldMB of memory at %ldMB " >> @@ -190,9 +186,13 @@ void __init reserve_crashkernel(void) >> if (!memblock_is_region_memory(crashk_res.start, crash_size) || >> memblock_reserve(crashk_res.start, crash_size)) { >> pr_err("Failed to reserve memory for crashkernel!\n"); >> - crashk_res.start = crashk_res.end = 0; >> - return; >> + goto error_out; >> } >> + >> + return; >> +error_out: >> + crashk_res.start = crashk_res.end = 0; >> + return; >> } >> >> int overlaps_crashkernel(unsigned long start, unsigned long size) >> >
On Wed, Jul 24, 2019 at 11:26:59AM +0530, Mahesh Jagannath Salgaonkar wrote: > On 7/22/19 11:19 PM, Michal Suchánek wrote: > > On Fri, 28 Jun 2019 00:51:19 +0530 > > Hari Bathini <hbathini@linux.ibm.com> wrote: > > > >> Currently, if memory_limit is specified and it overlaps with memory to > >> be reserved for capture kernel, memory_limit is adjusted to accommodate > >> capture kernel. With memory reservation for capture kernel moved later > >> (after enforcing memory limit), this adjustment no longer holds water. > >> So, avoid adjusting memory_limit and error out instead. > > > > Can you split out the memory limit adjustment out of memory reservation > > so it can still be adjusted? > > Do you mean adjust the memory limit before we do the actual reservation ? Yes, without that you get a regression in ability to enable fadump with limited memory - something like the below patch should fix it. Then again, there is no code to un-move the memory_limit in case the allocation fails, and we now have cma allocation which is dubious to allocate beyond memory_limit. So maybe removing the memory_limit adjustment is a bugfix removing 'feature' that has bitrotten over time. Thanks Michal From: Michal Suchanek <msuchanek@suse.de> Date: Mon, 6 Jan 2020 14:55:40 +0100 Subject: [PATCH 2/2] powerpc/fadump: adjust memlimit before MMU early init Moving the farump memory reservation before early MMU init makes the memlimit adjustment to make room for fadump ineffective. Move the adjustment back before early MMU init. Signed-off-by: Michal Suchanek <msuchanek@suse.de> --- arch/powerpc/include/asm/fadump.h | 3 +- arch/powerpc/kernel/fadump.c | 80 +++++++++++++++++++++++-------- arch/powerpc/kernel/prom.c | 3 ++ 3 files changed, 66 insertions(+), 20 deletions(-) diff --git a/arch/powerpc/include/asm/fadump.h b/arch/powerpc/include/asm/fadump.h index 526a6a647312..76d3cbe1379c 100644 --- a/arch/powerpc/include/asm/fadump.h +++ b/arch/powerpc/include/asm/fadump.h @@ -30,6 +30,7 @@ static inline void fadump_cleanup(void) { } #if defined(CONFIG_FA_DUMP) || defined(CONFIG_PRESERVE_FA_DUMP) extern int early_init_dt_scan_fw_dump(unsigned long node, const char *uname, int depth, void *data); -extern int fadump_reserve_mem(void); +int fadump_adjust_memlimit(void); +int fadump_reserve_mem(void); #endif #endif /* _ASM_POWERPC_FADUMP_H */ diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c index 8ad6d8d1cdbe..4d76452dcb3d 100644 --- a/arch/powerpc/kernel/fadump.c +++ b/arch/powerpc/kernel/fadump.c @@ -431,19 +431,22 @@ static int __init fadump_get_boot_mem_regions(void) return ret; } -int __init fadump_reserve_mem(void) +static inline u64 fadump_get_reserve_alignment(void) { - u64 base, size, mem_boundary, bootmem_min, align = PAGE_SIZE; - bool is_memblock_bottom_up = memblock_bottom_up(); - int ret = 1; + u64 align = PAGE_SIZE; - if (!fw_dump.fadump_enabled) - return 0; +#ifdef CONFIG_CMA + if (!fw_dump.nocma) + align = FADUMP_CMA_ALIGNMENT; +#endif - if (!fw_dump.fadump_supported) { - pr_info("Firmware-Assisted Dump is not supported on this hardware\n"); - goto error_out; - } + return align; +} + +static inline u64 fadump_get_bootmem_min(void) +{ + u64 bootmem_min = 0; + u64 align = fadump_get_reserve_alignment(); /* * Initialize boot memory size @@ -455,7 +458,6 @@ int __init fadump_reserve_mem(void) PAGE_ALIGN(fadump_calculate_reserve_size()); #ifdef CONFIG_CMA if (!fw_dump.nocma) { - align = FADUMP_CMA_ALIGNMENT; fw_dump.boot_memory_size = ALIGN(fw_dump.boot_memory_size, align); } @@ -472,8 +474,43 @@ int __init fadump_reserve_mem(void) pr_err("Too many holes in boot memory area to enable fadump\n"); goto error_out; } + + } + + return bootmem_min; +error_out: + fw_dump.fadump_enabled = 0; + return 0; +} + +int __init fadump_adjust_memlimit(void) +{ + u64 size, bootmem_min; + + if (!fw_dump.fadump_enabled) + return 0; + + if (!fw_dump.fadump_supported) { + pr_info("Firmware-Assisted Dump is not supported on this hardware\n"); + fw_dump.fadump_enabled = 0; + return 0; } +#ifdef CONFIG_HUGETLB_PAGE + if (fw_dump.dump_active) { + /* + * FADump capture kernel doesn't care much about hugepages. + * In fact, handling hugepages in capture kernel is asking for + * trouble. So, disable HugeTLB support when fadump is active. + */ + hugetlb_disabled = true; + } +#endif + + bootmem_min = fadump_get_bootmem_min(); + if (!bootmem_min) + return 0; + /* * Calculate the memory boundary. * If memory_limit is less than actual memory boundary then reserve @@ -490,6 +527,19 @@ int __init fadump_reserve_mem(void) printk(KERN_INFO "Adjusted memory_limit for firmware-assisted" " dump, now %#016llx\n", memory_limit); } + + return 0; +} + +int __init fadump_reserve_mem(void) +{ + u64 base, size, mem_boundary, align = fadump_get_reserve_alignment(); + bool is_memblock_bottom_up = memblock_bottom_up(); + int ret = 1; + + if (!fw_dump.fadump_enabled) + return 0; + if (memory_limit) mem_boundary = memory_limit; else @@ -501,14 +551,6 @@ int __init fadump_reserve_mem(void) if (fw_dump.dump_active) { pr_info("Firmware-assisted dump is active.\n"); -#ifdef CONFIG_HUGETLB_PAGE - /* - * FADump capture kernel doesn't care much about hugepages. - * In fact, handling hugepages in capture kernel is asking for - * trouble. So, disable HugeTLB support when fadump is active. - */ - hugetlb_disabled = true; -#endif /* * If last boot has crashed then reserve all the memory * above boot memory size so that we don't touch it until diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c index 7b6cdd9bf78d..4fff8c2222b1 100644 --- a/arch/powerpc/kernel/prom.c +++ b/arch/powerpc/kernel/prom.c @@ -731,6 +731,9 @@ void __init early_init_devtree(void *params) if (PHYSICAL_START > MEMORY_START) memblock_reserve(MEMORY_START, 0x8000); reserve_kdump_trampoline(); +#if defined(CONFIG_FA_DUMP) || defined(CONFIG_PRESERVE_FA_DUMP) + fadump_adjust_memlimit(); +#endif early_reserve_mem(); /* Ensure that total memory size is page-aligned. */
On Fri, Jun 28, 2019 at 12:51:19AM +0530, Hari Bathini wrote: > Currently, if memory_limit is specified and it overlaps with memory to > be reserved for capture kernel, memory_limit is adjusted to accommodate > capture kernel. With memory reservation for capture kernel moved later > (after enforcing memory limit), this adjustment no longer holds water. > So, avoid adjusting memory_limit and error out instead. The adjustment of memory limit does not look quite sound - There is no code to undo the adjustment in case reservation fails - I don't think reservation is still forced to the end of memory causing the kernel to use memory it was supposed not to - The CMA reservation again causes teh reserved memory to be used - Finally the CMA reservation makes this obsolete because the reserved memory is can be used by the system > > Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Michal Suchanek <msuchanek@suse.de> > --- > arch/powerpc/kernel/fadump.c | 16 ---------------- > arch/powerpc/kernel/machine_kexec.c | 22 +++++++++++----------- > 2 files changed, 11 insertions(+), 27 deletions(-) > > diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c > index 4eab972..a784695 100644 > --- a/arch/powerpc/kernel/fadump.c > +++ b/arch/powerpc/kernel/fadump.c > @@ -476,22 +476,6 @@ int __init fadump_reserve_mem(void) > #endif > } > > - /* > - * Calculate the memory boundary. > - * If memory_limit is less than actual memory boundary then reserve > - * the memory for fadump beyond the memory_limit and adjust the > - * memory_limit accordingly, so that the running kernel can run with > - * specified memory_limit. > - */ > - if (memory_limit && memory_limit < memblock_end_of_DRAM()) { > - size = get_fadump_area_size(); > - if ((memory_limit + size) < memblock_end_of_DRAM()) > - memory_limit += size; > - else > - memory_limit = memblock_end_of_DRAM(); > - printk(KERN_INFO "Adjusted memory_limit for firmware-assisted" > - " dump, now %#016llx\n", memory_limit); > - } > if (memory_limit) > memory_boundary = memory_limit; > else > diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c > index c4ed328..fc5533b 100644 > --- a/arch/powerpc/kernel/machine_kexec.c > +++ b/arch/powerpc/kernel/machine_kexec.c > @@ -125,10 +125,8 @@ void __init reserve_crashkernel(void) > crashk_res.end = crash_base + crash_size - 1; > } > > - if (crashk_res.end == crashk_res.start) { > - crashk_res.start = crashk_res.end = 0; > - return; > - } > + if (crashk_res.end == crashk_res.start) > + goto error_out; > > /* We might have got these values via the command line or the > * device tree, either way sanitise them now. */ > @@ -170,15 +168,13 @@ void __init reserve_crashkernel(void) > if (overlaps_crashkernel(__pa(_stext), _end - _stext)) { > printk(KERN_WARNING > "Crash kernel can not overlap current kernel\n"); > - crashk_res.start = crashk_res.end = 0; > - return; > + goto error_out; > } > > /* Crash kernel trumps memory limit */ > if (memory_limit && memory_limit <= crashk_res.end) { > - memory_limit = crashk_res.end + 1; > - printk("Adjusted memory limit for crashkernel, now 0x%llx\n", > - memory_limit); > + pr_err("Crash kernel size can't exceed memory_limit\n"); > + goto error_out; > } > > printk(KERN_INFO "Reserving %ldMB of memory at %ldMB " > @@ -190,9 +186,13 @@ void __init reserve_crashkernel(void) > if (!memblock_is_region_memory(crashk_res.start, crash_size) || > memblock_reserve(crashk_res.start, crash_size)) { > pr_err("Failed to reserve memory for crashkernel!\n"); > - crashk_res.start = crashk_res.end = 0; > - return; > + goto error_out; > } > + > + return; > +error_out: > + crashk_res.start = crashk_res.end = 0; > + return; > } > > int overlaps_crashkernel(unsigned long start, unsigned long size) >
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c index 4eab972..a784695 100644 --- a/arch/powerpc/kernel/fadump.c +++ b/arch/powerpc/kernel/fadump.c @@ -476,22 +476,6 @@ int __init fadump_reserve_mem(void) #endif } - /* - * Calculate the memory boundary. - * If memory_limit is less than actual memory boundary then reserve - * the memory for fadump beyond the memory_limit and adjust the - * memory_limit accordingly, so that the running kernel can run with - * specified memory_limit. - */ - if (memory_limit && memory_limit < memblock_end_of_DRAM()) { - size = get_fadump_area_size(); - if ((memory_limit + size) < memblock_end_of_DRAM()) - memory_limit += size; - else - memory_limit = memblock_end_of_DRAM(); - printk(KERN_INFO "Adjusted memory_limit for firmware-assisted" - " dump, now %#016llx\n", memory_limit); - } if (memory_limit) memory_boundary = memory_limit; else diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c index c4ed328..fc5533b 100644 --- a/arch/powerpc/kernel/machine_kexec.c +++ b/arch/powerpc/kernel/machine_kexec.c @@ -125,10 +125,8 @@ void __init reserve_crashkernel(void) crashk_res.end = crash_base + crash_size - 1; } - if (crashk_res.end == crashk_res.start) { - crashk_res.start = crashk_res.end = 0; - return; - } + if (crashk_res.end == crashk_res.start) + goto error_out; /* We might have got these values via the command line or the * device tree, either way sanitise them now. */ @@ -170,15 +168,13 @@ void __init reserve_crashkernel(void) if (overlaps_crashkernel(__pa(_stext), _end - _stext)) { printk(KERN_WARNING "Crash kernel can not overlap current kernel\n"); - crashk_res.start = crashk_res.end = 0; - return; + goto error_out; } /* Crash kernel trumps memory limit */ if (memory_limit && memory_limit <= crashk_res.end) { - memory_limit = crashk_res.end + 1; - printk("Adjusted memory limit for crashkernel, now 0x%llx\n", - memory_limit); + pr_err("Crash kernel size can't exceed memory_limit\n"); + goto error_out; } printk(KERN_INFO "Reserving %ldMB of memory at %ldMB " @@ -190,9 +186,13 @@ void __init reserve_crashkernel(void) if (!memblock_is_region_memory(crashk_res.start, crash_size) || memblock_reserve(crashk_res.start, crash_size)) { pr_err("Failed to reserve memory for crashkernel!\n"); - crashk_res.start = crashk_res.end = 0; - return; + goto error_out; } + + return; +error_out: + crashk_res.start = crashk_res.end = 0; + return; } int overlaps_crashkernel(unsigned long start, unsigned long size)
Currently, if memory_limit is specified and it overlaps with memory to be reserved for capture kernel, memory_limit is adjusted to accommodate capture kernel. With memory reservation for capture kernel moved later (after enforcing memory limit), this adjustment no longer holds water. So, avoid adjusting memory_limit and error out instead. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> --- arch/powerpc/kernel/fadump.c | 16 ---------------- arch/powerpc/kernel/machine_kexec.c | 22 +++++++++++----------- 2 files changed, 11 insertions(+), 27 deletions(-)