From patchwork Tue Mar 15 07:33:55 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Russell Currey
X-Patchwork-Id: 597378
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id 3qPRG03fGGz9sDb
 for ; Tue, 15 Mar 2016 18:34:32 +1100 (AEDT)
Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])
 by lists.ozlabs.org (Postfix) with ESMTP id 3qPRG02P4vzDr2P
 for ; Tue, 15 Mar 2016 18:34:32 +1100 (AEDT)
X-Original-To: skiboot@lists.ozlabs.org
Delivered-To: skiboot@lists.ozlabs.org
Received: from russell.cc (russell.cc [IPv6:2404:9400:2:0:216:3eff:fee0:3370])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by lists.ozlabs.org (Postfix) with ESMTPS id 3qPRFf4CtYzDq5v
 for ; Tue, 15 Mar 2016 18:34:14 +1100 (AEDT)
Received: from snap.ozlabs.ibm.com (static-82-10.transact.net.au [122.99.82.10])
 by russell.cc (OpenSMTPD) with ESMTPSA id ff53bca6
 TLS version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES128-SHA256 bits=128 verify=NO;
 Tue, 15 Mar 2016 07:34:17 +0000 (UTC)
From: Russell Currey
To: skiboot@lists.ozlabs.org
Date: Tue, 15 Mar 2016 18:33:55 +1100
Message-Id: <1458027237-8926-6-git-send-email-ruscur@russell.cc>
X-Mailer: git-send-email 2.7.3
In-Reply-To: <1458027237-8926-1-git-send-email-ruscur@russell.cc>
References: <1458027237-8926-1-git-send-email-ruscur@russell.cc>
Subject: [Skiboot] [PATCH 5/7] hmi: Don't cause an unrecoverable HMI if a CPU
 is asleep
X-BeenThere: skiboot@lists.ozlabs.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Mailing list for skiboot development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Errors-To: skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org
Sender: "Skiboot"

decode_core_fir looks at the FIR for every CPU.  If the XSCOM read
fails, which it will if the CPU is asleep, the code path results in
raising an unrecoverable HMI.  This is incorrect: if the CPU is asleep,
it is unlikely to be the cause of any problems.

Resolve this by skipping the CPU if it's asleep.  If the sleeping CPU
was actually the cause of the HMI and no other components were found to
have errors (i.e. other CPUs, NX, CAPP), an unknown, unrecoverable HMI
is raised anyway.  This patch just prevents unrecoverable errors being
thrown when a recoverable component has an HMI and a CPU happens to be
sleeping.

Signed-off-by: Russell Currey
---
 core/hmi.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/core/hmi.c b/core/hmi.c
index 127686f..09bf610 100644
--- a/core/hmi.c
+++ b/core/hmi.c
@@ -299,6 +299,7 @@ static bool decode_core_fir(struct cpu_thread *cpu,
 	uint32_t core_id;
 	int i;
 	bool found = false;
+	int64_t ret;
 
 	/* Sanity check */
 	if (!cpu || !hmi_evt)
@@ -307,10 +308,18 @@ static bool decode_core_fir(struct cpu_thread *cpu,
 	core_id = pir_to_core_id(cpu->pir);
 
 	/* Get CORE FIR register value. */
-	if (xscom_read(cpu->chip_id, XSCOM_ADDR_P8_EX(core_id, CORE_FIR),
-			&core_fir) != 0) {
+	ret = xscom_read(cpu->chip_id, XSCOM_ADDR_P8_EX(core_id, CORE_FIR),
+			 &core_fir);
+
+	if (ret == OPAL_HARDWARE) {
 		prerror("HMI: XSCOM error reading CORE FIR\n");
 		return false;
+	} else if (ret == OPAL_WRONG_STATE) {
+		/* CPU is asleep, so it probably didn't cause the checkstop */
+		prlog(PR_DEBUG,
+		      "HMI: FIR read failed, chip %d core %d asleep\n",
+		      cpu->chip_id, core_id);
+		return true;
 	}
 
 	prlog(PR_INFO, "HMI: CHIP ID: %x, CORE ID: %x, FIR: %016llx\n",
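
For readers who want to reason about the new return-code handling
outside of skiboot, the three-way split introduced in the second hunk
can be modelled as a small standalone sketch.  Everything named here
(the RC_* constants, enum fir_read_outcome, classify_fir_read) is
hypothetical and exists only for illustration; the RC_* values are
stand-ins for the OPAL codes, whose real definitions live in skiboot's
headers.  It is not the code in core/hmi.c.

```c
#include <stdint.h>

/*
 * Stand-in return codes (assumption): the real OPAL_SUCCESS,
 * OPAL_HARDWARE and OPAL_WRONG_STATE are defined by skiboot, not here.
 */
enum {
	RC_SUCCESS     = 0,
	RC_HARDWARE    = -6,
	RC_WRONG_STATE = -14,
};

/* Hypothetical outcome of attempting to read a core's FIR. */
enum fir_read_outcome {
	FIR_READ_OK,      /* carry on and decode the FIR value */
	FIR_READ_FAILED,  /* XSCOM error: the patched code returns false */
	FIR_CORE_ASLEEP,  /* core asleep: the patched code returns true */
};

/*
 * Mirrors the if / else-if chain in the patch: only the hardware error
 * and the wrong-state (asleep) codes are special-cased, and any other
 * value falls through to FIR decoding.
 */
enum fir_read_outcome classify_fir_read(int64_t ret)
{
	if (ret == RC_HARDWARE)
		return FIR_READ_FAILED;
	if (ret == RC_WRONG_STATE)
		return FIR_CORE_ASLEEP;
	return FIR_READ_OK;
}
```

Note that, as in the patch, a return code other than the two named ones
is not treated as a failure by this chain.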