Message ID | 20241206065545.14815-1-avnish@linux.ibm.com (mailing list archive) |
---|---|
State | New |
Series | [v2] powerpc: increase MIN RMA size for CAS negotiation |

Context | Check | Description |
---|---|---|
snowpatch_ozlabs/github-powerpc_ppctests | success | Successfully ran 8 jobs. |
snowpatch_ozlabs/github-powerpc_selftests | success | Successfully ran 8 jobs. |
snowpatch_ozlabs/github-powerpc_clang | success | Successfully ran 5 jobs. |
snowpatch_ozlabs/github-powerpc_sparse | success | Successfully ran 4 jobs. |
snowpatch_ozlabs/github-powerpc_kernel_qemu | success | Successfully ran 21 jobs. |

Avnish Chouhan <avnish@linux.ibm.com> writes:
> Change RMA size from 512 MB to 768 MB which will result
> in more RMA at boot time for PowerPC.

Did you consider just increasing it to 1GB?

It's possible there's some folks running LPARs with less than 1GB, but
they are unlikely to continue doing so by the time this change trickles
into distros. To be supported modern RHEL requires 2GB minimum RAM
anyway.

Can you also describe the behaviour users will see when they configure
an LPAR with less than 768MB of RAM.

cheers

Hi Michael,

Hope you're doing wonderful! Thank you so much for your response.

I have checked on your queries. Please find the findings below:

1. Did you consider just increasing it to 1GB?

In our recent out-of-memory issues, we observed a shortage of around 50-60 MB of space in the RMA, so we decided to increase the RMA by 256 MB. Please give me a couple of days; I'm analyzing the 1 GB change and will update you on it soon.

2. An LPAR with less than 768MB of RAM

I have analyzed multiple RAM scenarios. The behavior seems similar regardless of whether the RMA size is 512 MB or 768 MB, as the RMA region is used by PFW and GRUB2 for booting. Even if GRUB is able to load the kernel, the machine does not boot and behave well with a low amount of RAM. We observe kernel panics, mostly "Out of memory: Killed process....", when RAM is less than 3 GB.

The different RAM configs (via HMC LPAR properties) and behaviors are given below:

i. RAM (3 GB)
The system boots fine when RAM is at least 3 GB (this does depend on the system config as well).

ii. RAM (512 MB)
With 512 MB of RAM, the system fails to boot with a firmware error (e.g. B2006006).

iii. RAM (768 MB and 1 GB)
With 768 MB or 1 GB of RAM, the system boot hits a kernel panic: "Kernel panic - not syncing: System is deadlocked on memory".

iv. RAM (2 GB)
The system does boot fine, but behaves abnormally after the boot. I observed a system panic in one scenario while doing a reboot: "Out of memory: Killed process 167.....".

Regards,
Avnish Chouhan

On 2024-12-07 07:28, Michael Ellerman wrote:
> Avnish Chouhan <avnish@linux.ibm.com> writes:
>> Change RMA size from 512 MB to 768 MB which will result
>> in more RMA at boot time for PowerPC.
>
> Did you consider just increasing it to 1GB?
>
> It's possible there's some folks running LPARs with less than 1GB, but
> they are unlikely to continue doing so by the time this change trickles
> into distros. To be supported modern RHEL requires 2GB minimum RAM
> anyway.
>
> Can you also describe the behaviour users will see when they configure
> an LPAR with less than 768MB of RAM.
>
> cheers

Hello Michael,

On 07/12/24 07:28, Michael Ellerman wrote:
> Avnish Chouhan <avnish@linux.ibm.com> writes:
>> Change RMA size from 512 MB to 768 MB which will result
>> in more RMA at boot time for PowerPC.
> Did you consider just increasing it to 1GB?

I see an impact of setting RMA to 1GB on fadump. Here’s how:

The minimum reservation recommended by us for fadump is 768MB. In this case, the boot_mem_top in fadump.c will be 768MB, and the memory reservation for fadump will start from a 768MB offset.

Now, let’s say the production kernel crashes, and the firmware copies the 0–768MB region to the fadump reserved area and boots the fadump kernel to capture the dump. If RMA is set to 1GB, it will allow GRUB to use memory up to 1GB. In such a case, there is a possibility that GRUB could corrupt the fadump reserved area.

Thanks,
Sourabh Jain

diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index fbb68fc28ed3..c42fd5a826c0 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -1061,7 +1061,7 @@ static const struct ibm_arch_vec ibm_architecture_vec_template __initconst = {
 		.virt_base = cpu_to_be32(0xffffffff),
 		.virt_size = cpu_to_be32(0xffffffff),
 		.load_base = cpu_to_be32(0xffffffff),
-		.min_rma = cpu_to_be32(512),		/* 512MB min RMA */
+		.min_rma = cpu_to_be32(768),		/* 768MB min RMA */
 		.min_load = cpu_to_be32(0xffffffff),	/* full client load */
 		.min_rma_percent = 0,	/* min RMA percentage of total RAM */
 		.max_pft_size = 48,	/* max log_2(hash table size) */

Change RMA size from 512 MB to 768 MB which will result
in more RMA at boot time for PowerPC. When a PowerPC LPAR uses vTPM,
Secure Boot or FADump, the 512 MB RMA memory is not sufficient for
booting. With this 512 MB RMA, GRUB2 runs out of memory and is unable
to load the necessary. Sometimes even the use of a CDROM, which requires
more memory for installation, along with the options mentioned above,
strains the boot memory and results in boot failures. Increasing the
RMA size will resolve multiple out-of-memory issues observed on PowerPC.

Failure details:

1. GRUB2

kern/ieee1275/init.c:550: mm requested region of size 8513000, flags 1
kern/ieee1275/init.c:563: Cannot satisfy allocation and retain minimum runtime space
kern/ieee1275/init.c:550: mm requested region of size 8513000, flags 0
kern/ieee1275/init.c:563: Cannot satisfy allocation and retain minimum runtime space
kern/file.c:215: Closing `/ppc/ppc64/initrd.img' ...
kern/disk.c:297: Closing `ieee1275//vdevice/v-scsi @30000067/disk@8300000000000000'...
kern/disk.c:311: Closing `ieee1275//vdevice/v-scsi @30000067/disk@8300000000000000' succeeded.
kern/file.c:225: Closing `/ppc/ppc64/initrd.img' failed with 3.
kern/file.c:148: Opening `/ppc/ppc64/initrd.img' succeeded.
error: ../../grub-core/kern/mm.c:552:out of memory.

2. Kernel

[ 0.777633] List of all partitions:
[ 0.777639] No filesystem could mount root, tried:
[ 0.777640]
[ 0.777649] Kernel panic - not syncing: VFS: Unable to mount root fs on "" or unknown-block(0,0)
[ 0.777658] CPU: 17 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-0.rc4.20.el10.ppc64le #1
[ 0.777669] Hardware name: IBM,9009-22A POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW950.B0 (VL950_149) hv:phyp pSeries
[ 0.777678] Call Trace:
[ 0.777682] [c000000003db7b60] [c000000001119714] dump_stack_lvl+0x88/0xc4 (unreliable)
[ 0.777700] [c000000003db7b90] [c00000000016c274] panic+0x174/0x460
[ 0.777711] [c000000003db7c30] [c00000000200631c] mount_root_generic+0x320/0x354
[ 0.777724] [c000000003db7d00] [c0000000020066f8] prepare_namespace+0x27c/0x2f4
[ 0.777735] [c000000003db7d90] [c000000002005824] kernel_init_freeable+0x254/0x294
[ 0.777747] [c000000003db7df0] [c00000000001131c] kernel_init+0x30/0x1c4
[ 0.777757] [c000000003db7e50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c
[ 0.777768] --- interrupt: 0 at 0x0
[ 0.784238] pstore: backend (nvram) writing error (-1)
[ 0.790447] Rebooting in 10 seconds..

Signed-off-by: Avnish Chouhan <avnish@linux.ibm.com>
---
Change logs:

v2:
- Added GRUB2 debug logs and Kernel traces.
---
 arch/powerpc/kernel/prom_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)