From patchwork Thu Nov 25 18:09:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hari Bathini X-Patchwork-Id: 1559854 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=ibm.com header.i=@ibm.com header.a=rsa-sha256 header.s=pp1 header.b=q7uxVGYk; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4J0QsR6t2gz9sRN for ; Fri, 26 Nov 2021 05:10:35 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4J0QsR49k9z3cJS for ; Fri, 26 Nov 2021 05:10:35 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=ibm.com header.i=@ibm.com header.a=rsa-sha256 header.s=pp1 header.b=q7uxVGYk; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=linux.ibm.com (client-ip=148.163.156.1; helo=mx0a-001b2d01.pphosted.com; envelope-from=hbathini@linux.ibm.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=ibm.com header.i=@ibm.com header.a=rsa-sha256 header.s=pp1 header.b=q7uxVGYk; dkim-atps=neutral Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4J0Qs65YpXz2yxx for ; Fri, 26 Nov 2021 05:10:18 +1100 (AEDT) Received: from pps.filterd (m0098393.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1APEpeSf013701; Thu, 25 Nov 2021 18:10:12 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : mime-version : content-transfer-encoding; s=pp1; bh=OKWBU8LWk4qSTxMOhPHg+5vnYh/2X7klsg2iPCn1Sgw=; b=q7uxVGYkZxUF3LnREkzw2elomHWGYOMdzux9y0EcirDJbzT2Ka6ALdc71jgiL7K317Go RHxaXVwnAplEyUYQ9Kt1hSFsYpzdBFIjlVchr59UaxROCLUgwfA15LxmW4AUxHuPPHYS mKBExmskXQ/8HptaRsn1TK1CRpRhauWL4WGKcvjAbDhIohnIhoR/E9toel5Oxad1CkJQ 3nRpr84r7W78bT/S/9RdCZX8R/ltmYxcl6KNw8TCRcv3AAtOqem7fuj9vhbXbp4peR5y i9Ht1z6cSCAxO29SiXEYNLtIf6UCF5YWjGXYpYy0V+hDQ9CaiCijggNcXWyrGZ4znkcg PA== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3cjcmd3n28-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 25 Nov 2021 18:10:12 +0000 Received: from m0098393.ppops.net (m0098393.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 1APHuNCX008724; Thu, 25 Nov 2021 18:10:11 GMT Received: from ppma05fra.de.ibm.com (6c.4a.5195.ip4.static.sl-reverse.com [149.81.74.108]) by mx0a-001b2d01.pphosted.com with ESMTP id 3cjcmd3n1f-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 25 Nov 2021 18:10:11 +0000 Received: from pps.filterd (ppma05fra.de.ibm.com [127.0.0.1]) by ppma05fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 1API2RrH015460; Thu, 25 Nov 2021 18:10:09 GMT Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by ppma05fra.de.ibm.com with ESMTP id 3cernabe5q-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 25 Nov 2021 18:10:09 +0000 Received: from d06av26.portsmouth.uk.ibm.com (d06av26.portsmouth.uk.ibm.com [9.149.105.62]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 1APIA6iR27197918 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 25 Nov 2021 18:10:06 GMT Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id F3D72AE051; Thu, 25 Nov 2021 18:10:05 +0000 (GMT) Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id BE823AE067; Thu, 25 Nov 2021 18:10:02 +0000 (GMT) Received: from hbathini-workstation.ibm.com.com (unknown [9.211.79.35]) by d06av26.portsmouth.uk.ibm.com (Postfix) with ESMTP; Thu, 25 Nov 2021 18:10:02 +0000 (GMT) From: Hari Bathini To: mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org, npiggin@gmail.com Subject: [PATCH v2 1/2] powerpc: handle kdump appropriately with crash_kexec_post_notifiers option Date: Thu, 25 Nov 2021 23:39:55 +0530 Message-Id: <20211125180956.384886-1-hbathini@linux.ibm.com> X-Mailer: git-send-email 2.33.1 MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: hCxgoAVDVCNnaVAzkywXcxpujBLIDFSb X-Proofpoint-ORIG-GUID: 2iXpiJVN1NdIogJFJnAKcnHH_WLTkrK6 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-11-25_06,2021-11-25_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 lowpriorityscore=0 mlxscore=0 suspectscore=0 bulkscore=0 phishscore=0 mlxlogscore=999 spamscore=0 adultscore=0 malwarescore=0 clxscore=1011 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111250099 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Hari Bathini , mahesh@linux.ibm.com, sourabhjain@linux.ibm.com Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Kdump can be triggered after panic_notifers since commit f06e5153f4ae2 ("kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers") introduced crash_kexec_post_notifiers option. But using this option would mean smp_send_stop(), that marks all other CPUs as offline, gets called before kdump is triggered. As a result, kdump routines fail to save other CPUs' registers. To fix this, kdump friendly crash_smp_send_stop() function was introduced with kernel commit 0ee59413c967 ("x86/panic: replace smp_send_stop() with kdump friendly version in panic path"). Override this kdump friendly weak function to handle crash_kexec_post_notifiers option appropriately on powerpc. Signed-off-by: Hari Bathini Reported-by: kernel test robot --- * New patch to handle the case where kdump is triggered after panic notifiers. arch/powerpc/kernel/smp.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index c23ee842c4c3..d34e6b67684c 100644 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -620,6 +620,32 @@ void crash_send_ipi(void (*crash_ipi_callback)(struct pt_regs *)) } #endif +static void crash_stop_this_cpu(struct pt_regs *regs) +{ + /* + * Just busy wait here and avoid marking CPU as offline to ensure + * register data of all these CPUs is captured appropriately. + */ + while (1) + cpu_relax(); +} + +void crash_smp_send_stop(void) +{ + static bool stopped = false; + + if (stopped) + return; + + stopped = true; + +#ifdef CONFIG_NMI_IPI + smp_send_nmi_ipi(NMI_IPI_ALL_OTHERS, crash_stop_this_cpu, 1000000); +#else + smp_call_function(crash_stop_this_cpu, NULL, 0); +#endif /* CONFIG_NMI_IPI */ +} + #ifdef CONFIG_NMI_IPI static void nmi_stop_this_cpu(struct pt_regs *regs) { From patchwork Thu Nov 25 18:09:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hari Bathini X-Patchwork-Id: 1559855 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=ibm.com header.i=@ibm.com header.a=rsa-sha256 header.s=pp1 header.b=FwxDTqRi; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4J0QtD5svJz9sRN for ; Fri, 26 Nov 2021 05:11:16 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4J0QtD4KGTz3cks for ; Fri, 26 Nov 2021 05:11:16 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=ibm.com header.i=@ibm.com header.a=rsa-sha256 header.s=pp1 header.b=FwxDTqRi; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=linux.ibm.com (client-ip=148.163.158.5; helo=mx0a-001b2d01.pphosted.com; envelope-from=hbathini@linux.ibm.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=ibm.com header.i=@ibm.com header.a=rsa-sha256 header.s=pp1 header.b=FwxDTqRi; dkim-atps=neutral Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4J0QsC1Ysxz2yxx for ; Fri, 26 Nov 2021 05:10:23 +1100 (AEDT) Received: from pps.filterd (m0098419.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1APHmBqQ019608; Thu, 25 Nov 2021 18:10:18 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=xiFdOIbwOR+pc48g1L/jSJGeDHSncLZTbd7OdVTCKTg=; b=FwxDTqRip9sHj6ztBwqb+2n/U/tizHbdXEQxpebp1stTMASNENI2HtNOzIDwJmEfPdjf QYGVHFoLgzGU/AcrWEMSXbk2+J+zOmlsG9kc0mXEzz7wrzufHCEvIt/WZ/YoWkmsDxQA GFfrq3Y5HjqLg7WezWdYW2+MQ/GLNh6GmTtgRlaByXe3ps73g4leqlnpDMhdZSRsgYPH lVLfdwTZOMRYTGKfcCubUXVqeNF50klu3yEuYVOndx3r5iFqGE9TnEcBGgD5me7JzKZo QdjiflJIxVNf2Zzc9Z8iseywQGTd/9x7mG1i8mX8WdeNmcRYh+qnvgVfZHrXc6fJQeoU 6g== Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com with ESMTP id 3cjf748aku-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 25 Nov 2021 18:10:17 +0000 Received: from m0098419.ppops.net (m0098419.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 1APHmsCu020840; Thu, 25 Nov 2021 18:10:17 GMT Received: from ppma02fra.de.ibm.com (47.49.7a9f.ip4.static.sl-reverse.com [159.122.73.71]) by mx0b-001b2d01.pphosted.com with ESMTP id 3cjf748ak9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 25 Nov 2021 18:10:17 +0000 Received: from pps.filterd (ppma02fra.de.ibm.com [127.0.0.1]) by ppma02fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 1API2cME006090; Thu, 25 Nov 2021 18:10:15 GMT Received: from b06avi18878370.portsmouth.uk.ibm.com (b06avi18878370.portsmouth.uk.ibm.com [9.149.26.194]) by ppma02fra.de.ibm.com with ESMTP id 3cernabe8n-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 25 Nov 2021 18:10:15 +0000 Received: from d06av26.portsmouth.uk.ibm.com (d06av26.portsmouth.uk.ibm.com [9.149.105.62]) by b06avi18878370.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 1API2tjZ61931858 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 25 Nov 2021 18:02:55 GMT Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E2E6FAE053; Thu, 25 Nov 2021 18:10:11 +0000 (GMT) Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A3F2DAE056; Thu, 25 Nov 2021 18:10:08 +0000 (GMT) Received: from hbathini-workstation.ibm.com.com (unknown [9.211.79.35]) by d06av26.portsmouth.uk.ibm.com (Postfix) with ESMTP; Thu, 25 Nov 2021 18:10:08 +0000 (GMT) From: Hari Bathini To: mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org, npiggin@gmail.com Subject: [PATCH v2 2/2] ppc64/fadump: fix inaccurate CPU state info in vmcore generated with panic Date: Thu, 25 Nov 2021 23:39:56 +0530 Message-Id: <20211125180956.384886-2-hbathini@linux.ibm.com> X-Mailer: git-send-email 2.33.1 In-Reply-To: <20211125180956.384886-1-hbathini@linux.ibm.com> References: <20211125180956.384886-1-hbathini@linux.ibm.com> MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-GUID: r33ZbDCtCFN147qig8ZVdGl6_lUDYF6O X-Proofpoint-ORIG-GUID: 1uUqWAR_MRmjORb02m2YSprgwA4R2bAR X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-11-25_06,2021-11-25_02,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 bulkscore=0 impostorscore=0 phishscore=0 suspectscore=0 priorityscore=1501 spamscore=0 clxscore=1015 mlxlogscore=999 adultscore=0 malwarescore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111250099 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Hari Bathini , mahesh@linux.ibm.com, sourabhjain@linux.ibm.com Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" In panic path, fadump is triggered via a panic notifier function. Before calling panic notifier functions, smp_send_stop() gets called, which stops all CPUs except the panic'ing CPU. Commit 8389b37dffdc ("powerpc: stop_this_cpu: remove the cpu from the online map.") and again commit bab26238bbd4 ("powerpc: Offline CPU in stop_this_cpu()") started marking CPUs as offline while stopping them. So, if a kernel has either of the above commits, vmcore captured with fadump via panic path would not process register data for all CPUs except the panic'ing CPU. Sample output of crash-utility with such vmcore: # crash vmlinux vmcore ... KERNEL: vmlinux DUMPFILE: vmcore [PARTIAL DUMP] CPUS: 1 DATE: Wed Nov 10 09:56:34 EST 2021 UPTIME: 00:00:42 LOAD AVERAGE: 2.27, 0.69, 0.24 TASKS: 183 NODENAME: XXXXXXXXX RELEASE: 5.15.0+ VERSION: #974 SMP Wed Nov 10 04:18:19 CST 2021 MACHINE: ppc64le (2500 Mhz) MEMORY: 8 GB PANIC: "Kernel panic - not syncing: sysrq triggered crash" PID: 3394 COMMAND: "bash" TASK: c0000000150a5f80 [THREAD_INFO: c0000000150a5f80] CPU: 1 STATE: TASK_RUNNING (PANIC) crash> p -x __cpu_online_mask __cpu_online_mask = $1 = { bits = {0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0} } crash> crash> crash> p -x __cpu_active_mask __cpu_active_mask = $2 = { bits = {0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0} } crash> While this has been the case since fadump was introduced, the issue was not identified for two probable reasons: - In general, the bulk of the vmcores analyzed were from crash due to exception. - The above did change since commit 8341f2f222d7 ("sysrq: Use panic() to force a crash") started using panic() instead of deferencing NULL pointer to force a kernel crash. But then commit de6e5d38417e ("powerpc: smp_send_stop do not offline stopped CPUs") stopped marking CPUs as offline till kernel commit bab26238bbd4 ("powerpc: Offline CPU in stop_this_cpu()") reverted that change. To ensure post processing register data of all other CPUs happens as intended, let panic() function take the crash friendly path (read crash_smp_send_stop()) with the help of crash_kexec_post_notifiers option. Also, as register data for all CPUs is captured by f/w, skip IPI callbacks here for fadump, to avoid any complications in finding the right backtraces. Signed-off-by: Hari Bathini --- Changes in v2: * Let panic(0 take crash friendly path when fadump is enabled using crash_kexec_post_notifiers option. arch/powerpc/kernel/fadump.c | 8 ++++++++ arch/powerpc/kernel/smp.c | 10 ++++++++++ 2 files changed, 18 insertions(+) diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c index b7ceb041743c..60f5fc14aa23 100644 --- a/arch/powerpc/kernel/fadump.c +++ b/arch/powerpc/kernel/fadump.c @@ -1641,6 +1641,14 @@ int __init setup_fadump(void) else if (fw_dump.reserve_dump_area_size) fw_dump.ops->fadump_init_mem_struct(&fw_dump); + /* + * In case of panic, fadump is triggered via ppc_panic_event() + * panic notifier. Setting crash_kexec_post_notifiers to 'true' + * lets panic() function take crash friendly path before panic + * notifiers are invoked. + */ + crash_kexec_post_notifiers = true; + return 1; } subsys_initcall(setup_fadump); diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c index d34e6b67684c..00a52b6e3888 100644 --- a/arch/powerpc/kernel/smp.c +++ b/arch/powerpc/kernel/smp.c @@ -61,6 +61,7 @@ #include #include #include +#include #ifdef DEBUG #include @@ -634,6 +635,15 @@ void crash_smp_send_stop(void) { static bool stopped = false; + /* + * In case of fadump, register data for all CPUs is captured by f/w + * on ibm,os-term rtas call. Skip IPI callbacks to other CPUs before + * this rtas call to avoid tricky post processing of those CPUs' + * backtraces. + */ + if (should_fadump_crash()) + return; + if (stopped) return;