From patchwork Mon Mar 19 04:15:35 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vaibhav Jain
X-Patchwork-Id: 887570
From: Vaibhav Jain
To: Christophe Lombard, Frederic Barrat
Cc: Philippe Bergheaud, Andrew Donnellan, skiboot@lists.ozlabs.org
Date: Mon, 19 Mar 2018 09:45:35 +0530
Message-Id: <20180319041535.27251-1-vaibhav@linux.vnet.ibm.com>
X-Mailer: git-send-email 2.14.3
Subject: [Skiboot] [PATCH v3] capi: Poll Err/Status register during CAPP
 recovery
List-Id: Mailing list for skiboot development

This patch updates do_capp_recovery_scoms() to poll the CAPP Err/Status
control register, check whether CAPP recovery has completed or failed
based on bits 1, 5 and 9, and then proceed with the CAPP recovery scoms
only if recovery completed successfully. This prevents cases where we
bring up the PCIe link while the recovery sequencer on the CAPP is still
busy casting out cache lines.

If CAPP recovery did not complete successfully, do_capp_recovery_scoms()
returns an error, asking phb4_creset() to keep the PHB4 fenced and mark
it as broken.

The loop that polls the Err/Status register also logs an error on the
PHB when it runs for more than 168ms, which is the maximum time to
failure for CAPP recovery.

Signed-off-by: Vaibhav Jain
---
Changelog:

v3 -> Introduced a timeout for the Poll loop of 168ms [Christophe]
v2 -> Added an extra check for Bit(0) in Err/Status reg at the
      beginning to check if recovery mode was entered.
      [Christophe]
---
 hw/phb4.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 68 insertions(+), 17 deletions(-)

diff --git a/hw/phb4.c b/hw/phb4.c
index 47175df2..30f46f9a 100644
--- a/hw/phb4.c
+++ b/hw/phb4.c
@@ -2857,25 +2857,75 @@ static int64_t load_capp_ucode(struct phb4 *p)
 	return rc;
 }
 
-static void do_capp_recovery_scoms(struct phb4 *p)
+static int do_capp_recovery_scoms(struct phb4 *p)
 {
-	uint64_t reg;
-	uint32_t offset;
+	uint64_t rc, reg, end;
+	uint64_t offset = PHB4_CAPP_REG_OFFSET(p);
 
-	PHBDBG(p, "Doing CAPP recovery scoms\n");
-
-	offset = PHB4_CAPP_REG_OFFSET(p);
-	/* disable snoops */
-	xscom_write(p->chip_id, SNOOP_CAPI_CONFIG + offset, 0);
-	load_capp_ucode(p);
-	/* clear err rpt reg*/
-	xscom_write(p->chip_id, CAPP_ERR_RPT_CLR + offset, 0);
-	/* clear capp fir */
-	xscom_write(p->chip_id, CAPP_FIR + offset, 0);
 
+	/* Get the status of CAPP recovery */
 	xscom_read(p->chip_id, CAPP_ERR_STATUS_CTRL + offset, &reg);
-	reg &= ~(PPC_BIT(0) | PPC_BIT(1));
-	xscom_write(p->chip_id, CAPP_ERR_STATUS_CTRL + offset, reg);
+
+	/* No recovery in progress, ignore */
+	if ((reg & PPC_BIT(0)) == 0) {
+		PHBDBG(p, "CAPP: No recovery in progress\n");
+		return 0;
+	}
+
+	PHBDBG(p, "CAPP: Waiting for recovery to complete\n");
+	/* recovery timer failure period 168ms */
+	end = mftb() + msecs_to_tb(168);
+	while ((reg & (PPC_BIT(1) | PPC_BIT(5) | PPC_BIT(9))) == 0) {
+
+		time_wait_ms(5);
+		xscom_read(p->chip_id, CAPP_ERR_STATUS_CTRL + offset, &reg);
+
+		if (end && tb_compare(mftb(), end) == TB_AAFTERB) {
+			PHBERR(p, "CAPP: Capp recovery Timed-out.\n");
+			end = 0;
+			break;
+		}
+	}
+
+	/* Check if the recovery failed or passed */
+	if (reg & PPC_BIT(1)) {
+		PHBDBG(p, "Doing CAPP recovery scoms\n");
+		/* disable snoops */
+		xscom_write(p->chip_id, SNOOP_CAPI_CONFIG + offset, 0);
+		load_capp_ucode(p);
+
+		/* clear err rpt reg*/
+		xscom_write(p->chip_id, CAPP_ERR_RPT_CLR + offset, 0);
+
+		/* clear capp fir */
+		xscom_write(p->chip_id, CAPP_FIR + offset, 0);
+
+		/* Just reset Bit-0,1 and don't touch any other bit */
+		xscom_read(p->chip_id, CAPP_ERR_STATUS_CTRL + offset, &reg);
+		reg &= ~(PPC_BIT(0) | PPC_BIT(1));
+		xscom_write(p->chip_id, CAPP_ERR_STATUS_CTRL + offset, reg);
+
+		PHBDBG(p, "CAPP recovery complete\n");
+		rc = OPAL_SUCCESS;
+
+	} else {
+		/* Most likely will checkstop here due to FIR ACTION for
+		 * failed recovery. So this message would never be logged.
+		 * But if we still enter here then return an error forcing a
+		 * fence of the PHB.
+		 */
+		if (reg & PPC_BIT(5))
+			PHBERR(p, "CAPP: Capp recovery Failed\n");
+		else if (reg & PPC_BIT(9))
+			PHBERR(p, "CAPP: Capp recovery hang detected\n");
+		else if (end != 0)
+			PHBERR(p, "CAPP: Unknown recovery failure\n");
+
+		PHBDBG(p, "CAPP: Err/Status-reg=0x%016llx\n", reg);
+		rc = OPAL_HARDWARE;
+	}
+
+	return rc;
 }
 
 static int64_t phb4_creset(struct pci_slot *slot)
@@ -2934,8 +2984,9 @@ static int64_t phb4_creset(struct pci_slot *slot)
 	PHBDBG(p, "CRESET: No pending transactions\n");
 
 	/* capp recovery */
-	if (p->flags & PHB4_CAPP_RECOVERY)
-		do_capp_recovery_scoms(p);
+	if (p->flags & PHB4_CAPP_RECOVERY &&
+	    do_capp_recovery_scoms(p))
+		goto error;
 
 	/* Clear errors in PFIR and NFIR */
 	xscom_write(p->chip_id, p->pci_stk_xscom + 0x1,