From patchwork Wed Jan 15 05:16:13 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gavin Shan X-Patchwork-Id: 310955 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from ozlabs.org (localhost [IPv6:::1]) by ozlabs.org (Postfix) with ESMTP id 3FB2A2C03A7 for ; Wed, 15 Jan 2014 16:16:55 +1100 (EST) Received: by ozlabs.org (Postfix) id 6E7732C009C; Wed, 15 Jan 2014 16:16:25 +1100 (EST) Delivered-To: linuxppc-dev@ozlabs.org Received: from e33.co.us.ibm.com (e33.co.us.ibm.com [32.97.110.151]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id D27DA2C008F for ; Wed, 15 Jan 2014 16:16:24 +1100 (EST) Received: from /spool/local by e33.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 14 Jan 2014 22:16:22 -0700 Received: from d03dlp03.boulder.ibm.com (9.17.202.179) by e33.co.us.ibm.com (192.168.1.133) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Tue, 14 Jan 2014 22:16:21 -0700 Received: from b03cxnp07028.gho.boulder.ibm.com (b03cxnp07028.gho.boulder.ibm.com [9.17.130.15]) by d03dlp03.boulder.ibm.com (Postfix) with ESMTP id 5361A19D8046 for ; Tue, 14 Jan 2014 22:16:11 -0700 (MST) Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by b03cxnp07028.gho.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id s0F5G6JM9306594 for ; Wed, 15 Jan 2014 06:16:06 +0100 Received: from d03av02.boulder.ibm.com (localhost [127.0.0.1]) by d03av02.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id s0F5GJFE007996 for ; Tue, 14 Jan 2014 22:16:20 -0700 Received: from shangw (shangw.cn.ibm.com [9.125.213.121]) by d03av02.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with SMTP id s0F5GHst007794; Tue, 14 Jan 2014 22:16:18 -0700 Received: by shangw (Postfix, from userid 1000) id 4AF6E300449; Wed, 15 Jan 2014 13:16:15 +0800 (CST) From: Gavin Shan To: linuxppc-dev@ozlabs.org Subject: [PATCH v2 3/3] powerpc/eeh: Escalate error on non-existing PE Date: Wed, 15 Jan 2014 13:16:13 +0800 Message-Id: <1389762974-14176-3-git-send-email-shangw@linux.vnet.ibm.com> X-Mailer: git-send-email 1.7.10.4 In-Reply-To: <1389762974-14176-1-git-send-email-shangw@linux.vnet.ibm.com> References: <1389762974-14176-1-git-send-email-shangw@linux.vnet.ibm.com> X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14011505-0928-0000-0000-0000056A37C4 Cc: Gavin Shan X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.16 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Sometimes, especially in sinario of loading another kernel with kdump, we got EEH error on non-existing PE. That means the PEEV / PEST in the corresponding PHB would be messy and we can't handle that case. The patch escalates the error to fenced PHB so that the PHB could be rested in order to revoer the errors on non-existing PEs. Reported-by: Mahesh Salgaonkar Signed-off-by: Gavin Shan Tested-by: Mahesh Salgaonkar --- arch/powerpc/platforms/powernv/eeh-ioda.c | 31 +++++++++++++++++++---------- 1 file changed, 21 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/platforms/powernv/eeh-ioda.c b/arch/powerpc/platforms/powernv/eeh-ioda.c index e0b12d0..92aa1f9 100644 --- a/arch/powerpc/platforms/powernv/eeh-ioda.c +++ b/arch/powerpc/platforms/powernv/eeh-ioda.c @@ -862,11 +862,7 @@ static int ioda_eeh_get_pe(struct pci_controller *hose, dev.phb = hose; dev.pe_config_addr = pe_no; dev_pe = eeh_pe_get(&dev); - if (!dev_pe) { - pr_warning("%s: Can't find PE for PHB#%x - PE#%x\n", - __func__, hose->global_number, pe_no); - return -EEXIST; - } + if (!dev_pe) return -EEXIST; *pe = dev_pe; return 0; @@ -980,12 +976,27 @@ static int ioda_eeh_next_error(struct eeh_pe **pe) break; case OPAL_EEH_PE_ERROR: - if (ioda_eeh_get_pe(hose, frozen_pe_no, pe)) - break; + /* + * If we can't find the corresponding PE, the + * PEEV / PEST would be messy. So we force an + * fenced PHB so that it can be recovered. + */ + if (ioda_eeh_get_pe(hose, frozen_pe_no, pe)) { + if (!ioda_eeh_get_phb_pe(hose, pe)) { + pr_err("EEH: Escalated fenced PHB#%x " + "detected for PE#%llx\n", + hose->global_number, + frozen_pe_no); + ret = EEH_NEXT_ERR_FENCED_PHB; + } else { + ret = EEH_NEXT_ERR_NONE; + } + } else { + pr_err("EEH: Frozen PE#%x on PHB#%x detected\n", + (*pe)->addr, (*pe)->phb->global_number); + ret = EEH_NEXT_ERR_FROZEN_PE; + } - pr_err("EEH: Frozen PE#%x on PHB#%x detected\n", - (*pe)->addr, (*pe)->phb->global_number); - ret = EEH_NEXT_ERR_FROZEN_PE; break; default: pr_warn("%s: Unexpected error type %d\n",