From patchwork Fri Nov 19 11:31:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1557047 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=qCT1kEhD; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4HwZKF1tSTz9sPf for ; Fri, 19 Nov 2021 22:32:49 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4HwZKF0F1Qz3bXW for ; Fri, 19 Nov 2021 22:32:49 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=qCT1kEhD; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::432; helo=mail-pf1-x432.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=qCT1kEhD; dkim-atps=neutral Received: from mail-pf1-x432.google.com (mail-pf1-x432.google.com [IPv6:2607:f8b0:4864:20::432]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4HwZJH1V2Xz2yw7 for ; Fri, 19 Nov 2021 22:31:58 +1100 (AEDT) Received: by mail-pf1-x432.google.com with SMTP id g18so9143257pfk.5 for ; Fri, 19 Nov 2021 03:31:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=P8cKd/ap7gRxKSNxNnKNhjT9PDh6i5QuBJKJNrshI1I=; b=qCT1kEhD3At4G7r33uxKXw3zuGWT9mVaOAmvgdy7CJHEVEtfkhROvFQmqoBFt6kdQs G44Whig+McXv/v0WI1n/fGYkHPQWcQazOHpHnguesZca1pWQVz87OYdnOR752tNqJ8fb zfxNOijCBVhr7dh8TfZcIVphHbUNW0UxOb07FpgTL0dzIg2URSc5xu4Y27OIlz/cA0SG MLqMlmf+pA2O6iOhEYvLr6/QFIOdiXAzVwVqKI4VEVtPJ8yMM33KK2KwKsPLNk4p8muv QOK0AmIEPfaml9290HAHLtdUJhQUGcr8sAmbAjUIrGGFEf1WzQCx3QX7OMIrDyLEZH4R qOOA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=P8cKd/ap7gRxKSNxNnKNhjT9PDh6i5QuBJKJNrshI1I=; b=I9vN4tsXDvBJVYwE1GcuY4GscTli0AZuAfoVvBk7PcglWnp0M/P44nvIn1/6PoPASb Ds/8/mkQ+el7KnLIC3zfBZJw8BtJdS1rzeN72xP/YmeSH/FY3Xzjf0N9pGcqze34+Lji 9bBqKwjxR5p6hULd8Zg7KtlDzkWL7Tgpg0QhCz9FDt4fMedKt8kNxsBRW9du6lu8I2xA 57OyWmwbAngjVHdUw99UwlXeeMuXTY3KAZd2xBM3jc9v+SniP0ij64mIU0KWec3TxEF7 u1SwTa0p5oZ05+i1FFw6of4bmBwC4bwdrtoP9PZxDJwUT93P1kG4vV/edlS8k2XQviBD x4/Q== X-Gm-Message-State: AOAM532jBZ0yJpFQg1s2L51RTyg/2JDqUf0irPS5TNYnGHCvwJyI/C3W n8FY9So9vvWCMInzaZn45HKoHRKpp+o= X-Google-Smtp-Source: ABdhPJwq2Wsowms5H9lkM6y9y2hMqSccgV0RTuXCYdo6+2nWCioTt3g6MEEUoNXVVFidm90JSNLD6A== X-Received: by 2002:a62:8f93:0:b0:49f:ed44:54ac with SMTP id n141-20020a628f93000000b0049fed4454acmr63589331pfd.72.1637321516298; Fri, 19 Nov 2021 03:31:56 -0800 (PST) Received: from bobo.ozlabs.ibm.com (60-240-2-228.tpgi.com.au. [60.240.2.228]) by smtp.gmail.com with ESMTPSA id g17sm2632626pfv.136.2021.11.19.03.31.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 19 Nov 2021 03:31:56 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v4 1/5] powerpc/watchdog: Fix missed watchdog reset due to memory ordering race Date: Fri, 19 Nov 2021 21:31:42 +1000 Message-Id: <20211119113146.752759-2-npiggin@gmail.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20211119113146.752759-1-npiggin@gmail.com> References: <20211119113146.752759-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Dufour , Nicholas Piggin , Daniel Axtens Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" It is possible for all CPUs to miss the pending cpumask becoming clear, and then nobody resetting it, which will cause the lockup detector to stop working. It will eventually expire, but watchdog_smp_panic will avoid doing anything if the pending mask is clear and it will never be reset. Order the cpumask clear vs the subsequent test to close this race. Add an extra check for an empty pending mask when the watchdog fires and finds its bit still clear, to try to catch any other possible races or bugs here and keep the watchdog working. The extra test in arch_touch_nmi_watchdog is required to prevent the new warning from firing off. Debugged-by: Laurent Dufour Reviewed-by: Laurent Dufour Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/watchdog.c | 41 +++++++++++++++++++++++++++++++++- 1 file changed, 40 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c index 3fa6d240bade..ad94a2c6b733 100644 --- a/arch/powerpc/kernel/watchdog.c +++ b/arch/powerpc/kernel/watchdog.c @@ -135,6 +135,10 @@ static void set_cpumask_stuck(const struct cpumask *cpumask, u64 tb) { cpumask_or(&wd_smp_cpus_stuck, &wd_smp_cpus_stuck, cpumask); cpumask_andnot(&wd_smp_cpus_pending, &wd_smp_cpus_pending, cpumask); + /* + * See wd_smp_clear_cpu_pending() + */ + smp_mb(); if (cpumask_empty(&wd_smp_cpus_pending)) { wd_smp_last_reset_tb = tb; cpumask_andnot(&wd_smp_cpus_pending, @@ -221,13 +225,44 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb) cpumask_clear_cpu(cpu, &wd_smp_cpus_stuck); wd_smp_unlock(&flags); + } else { + /* + * The last CPU to clear pending should have reset the + * watchdog so we generally should not find it empty + * here if our CPU was clear. However it could happen + * due to a rare race with another CPU taking the + * last CPU out of the mask concurrently. + * + * We can't add a warning for it. But just in case + * there is a problem with the watchdog that is causing + * the mask to not be reset, try to kick it along here. + */ + if (unlikely(cpumask_empty(&wd_smp_cpus_pending))) + goto none_pending; } return; } + cpumask_clear_cpu(cpu, &wd_smp_cpus_pending); + + /* + * Order the store to clear pending with the load(s) to check all + * words in the pending mask to check they are all empty. This orders + * with the same barrier on another CPU. This prevents two CPUs + * clearing the last 2 pending bits, but neither seeing the other's + * store when checking if the mask is empty, and missing an empty + * mask, which ends with a false positive. + */ + smp_mb(); if (cpumask_empty(&wd_smp_cpus_pending)) { unsigned long flags; +none_pending: + /* + * Double check under lock because more than one CPU could see + * a clear mask with the lockless check after clearing their + * pending bits. + */ wd_smp_lock(&flags); if (cpumask_empty(&wd_smp_cpus_pending)) { wd_smp_last_reset_tb = tb; @@ -318,8 +353,12 @@ void arch_touch_nmi_watchdog(void) { unsigned long ticks = tb_ticks_per_usec * wd_timer_period_ms * 1000; int cpu = smp_processor_id(); - u64 tb = get_tb(); + u64 tb; + if (!cpumask_test_cpu(cpu, &watchdog_cpumask)) + return; + + tb = get_tb(); if (tb - per_cpu(wd_timer_tb, cpu) >= ticks) { per_cpu(wd_timer_tb, cpu) = tb; wd_smp_clear_cpu_pending(cpu, tb); From patchwork Fri Nov 19 11:31:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1557049 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=URmJEXNO; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4HwZKy5lrLz9sR4 for ; Fri, 19 Nov 2021 22:33:26 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4HwZKy2XFDz3cCr for ; Fri, 19 Nov 2021 22:33:26 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=URmJEXNO; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::42f; helo=mail-pf1-x42f.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=URmJEXNO; dkim-atps=neutral Received: from mail-pf1-x42f.google.com (mail-pf1-x42f.google.com [IPv6:2607:f8b0:4864:20::42f]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4HwZJL0Wpsz2ynp for ; Fri, 19 Nov 2021 22:32:01 +1100 (AEDT) Received: by mail-pf1-x42f.google.com with SMTP id x5so9205371pfr.0 for ; Fri, 19 Nov 2021 03:32:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=HNWfPJEA+TK3OOf6ZR3iwJQRckjNFQtzKStHMcKosHg=; b=URmJEXNOclDW1ealeA2WsN0nInHSFowJsaKhpz/ZVDoKioP2WdCfjtQctCza90ugiw FnUvwOfS1JKLZbkcRCxlWujtg/U1QIeIGFxEpqGX0cTlKNLeVUJOnwqmOuakvGEO0nZp GvIvcaAAGpcxlnsyar5TbHHglp/s9NJIBR3/uV4VztiuQmldg6GaSF4RyOHO6gJKVwwf GQlE0LNjlam62/k/yHO3x6Ue3/noJwV10r6LtxO/ms1bwERYkJ8hBeBTkG/P8xRQ7l67 n+rIpM5BluREfHjNUGDTCw//AScFsh3MhsR8DDj2N/0pzqtqD08v7vsUTBnYkjUO6Cxb vj9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=HNWfPJEA+TK3OOf6ZR3iwJQRckjNFQtzKStHMcKosHg=; b=E4mgFj4yE40DpZCtEKS4IGyxmA6dyq5u0sf4w3H59i0dbyBHnmkLvom8vmBwR4LciX EgUoyJ3d0vLtDHJZhDfUB0HxrWkRejwvNY9L43TOF7jnpPv+YaVb6eEDOyO/Bh+nbwPO tPoBN/Clm5trCz9f3Twp/1mjQ72TNX93Su4yvd6I7jJ2cizgIxEQ0y5tpMwIg1lsB1GZ 84Lj6Q2oymiWfvZNs4IgqkooIjkQtUREbI+PAS4PepdCSKUS4Qkuj9mQncUpsrDOcp6h a0V/13qyPLZMp+7hTezDgBQAS0IkupO3o0hH4vW256Fj2ukjLepyFgFnXarqZVDWfVCy nYlA== X-Gm-Message-State: AOAM530P4ajXOoMZQLORG+fV6y3ZoFjlRLyJbkvI0hrueGC1d/uDK53A W7RSKkPwN9cboflSAg/LpxJK3Sr3QFA= X-Google-Smtp-Source: ABdhPJz0KmcKJ9xREgY6bbC//IExVboXL5OmX5lEX+jCjkKvflk5rHa6fKIeCs1vWgvorPDWOw/aUQ== X-Received: by 2002:a63:684:: with SMTP id 126mr16639968pgg.213.1637321519594; Fri, 19 Nov 2021 03:31:59 -0800 (PST) Received: from bobo.ozlabs.ibm.com (60-240-2-228.tpgi.com.au. [60.240.2.228]) by smtp.gmail.com with ESMTPSA id g17sm2632626pfv.136.2021.11.19.03.31.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 19 Nov 2021 03:31:59 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v4 2/5] powerpc/watchdog: tighten non-atomic read-modify-write access Date: Fri, 19 Nov 2021 21:31:43 +1000 Message-Id: <20211119113146.752759-3-npiggin@gmail.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20211119113146.752759-1-npiggin@gmail.com> References: <20211119113146.752759-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Dufour , Nicholas Piggin , Daniel Axtens Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Most updates to wd_smp_cpus_pending are under lock except the watchdog interrupt bit clear. This can race with non-atomic RMW updates to the mask under lock, which can happen in two instances: Firstly, if another CPU detects this one is stuck, removes it from the mask, mask becomes empty and is re-filled with non-atomic stores. This is okay because it would re-fill the mask with this CPU's bit clear anyway (because this CPU is now stuck), so it doesn't matter that the bit clear update got "lost". Add a comment for this. Secondly, if another CPU detects a different CPU is stuck and removes it from the pending mask with a non-atomic store to bytes which also include the bit of this CPU. This case can result in the bit clear being lost and the end result being the bit is set. This should be so rare it hardly matters, but to make things simpler to reason about just avoid the non-atomic access for that case. Reviewed-by: Laurent Dufour Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/watchdog.c | 36 ++++++++++++++++++++++++---------- 1 file changed, 26 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c index ad94a2c6b733..588f54350d19 100644 --- a/arch/powerpc/kernel/watchdog.c +++ b/arch/powerpc/kernel/watchdog.c @@ -131,10 +131,10 @@ static void wd_lockup_ipi(struct pt_regs *regs) /* Do not panic from here because that can recurse into NMI IPI layer */ } -static void set_cpumask_stuck(const struct cpumask *cpumask, u64 tb) +static bool set_cpu_stuck(int cpu, u64 tb) { - cpumask_or(&wd_smp_cpus_stuck, &wd_smp_cpus_stuck, cpumask); - cpumask_andnot(&wd_smp_cpus_pending, &wd_smp_cpus_pending, cpumask); + cpumask_set_cpu(cpu, &wd_smp_cpus_stuck); + cpumask_clear_cpu(cpu, &wd_smp_cpus_pending); /* * See wd_smp_clear_cpu_pending() */ @@ -144,11 +144,9 @@ static void set_cpumask_stuck(const struct cpumask *cpumask, u64 tb) cpumask_andnot(&wd_smp_cpus_pending, &wd_cpus_enabled, &wd_smp_cpus_stuck); + return true; } -} -static void set_cpu_stuck(int cpu, u64 tb) -{ - set_cpumask_stuck(cpumask_of(cpu), tb); + return false; } static void watchdog_smp_panic(int cpu, u64 tb) @@ -177,15 +175,17 @@ static void watchdog_smp_panic(int cpu, u64 tb) * get a backtrace on all of them anyway. */ for_each_cpu(c, &wd_smp_cpus_pending) { + bool empty; if (c == cpu) continue; + /* Take the stuck CPUs out of the watch group */ + empty = set_cpu_stuck(c, tb); smp_send_nmi_ipi(c, wd_lockup_ipi, 1000000); + if (empty) + break; } } - /* Take the stuck CPUs out of the watch group */ - set_cpumask_stuck(&wd_smp_cpus_pending, tb); - wd_smp_unlock(&flags); if (sysctl_hardlockup_all_cpu_backtrace) @@ -243,6 +243,22 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb) return; } + /* + * All other updates to wd_smp_cpus_pending are performed under + * wd_smp_lock. All of them are atomic except the case where the + * mask becomes empty and is reset. This will not happen here because + * cpu was tested to be in the bitmap (above), and a CPU only clears + * its own bit. _Except_ in the case where another CPU has detected a + * hard lockup on our CPU and takes us out of the pending mask. So in + * normal operation there will be no race here, no problem. + * + * In the lockup case, this atomic clear-bit vs a store that refills + * other bits in the accessed word wll not be a problem. The bit clear + * is atomic so it will not cause the store to get lost, and the store + * will never set this bit so it will not overwrite the bit clear. The + * only way for a stuck CPU to return to the pending bitmap is to + * become unstuck itself. + */ cpumask_clear_cpu(cpu, &wd_smp_cpus_pending); /* From patchwork Fri Nov 19 11:31:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1557050 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=ZZAR8gZB; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4HwZLg4FKJz9sPf for ; Fri, 19 Nov 2021 22:34:03 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4HwZLg0tgPz3cN9 for ; Fri, 19 Nov 2021 22:34:03 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=ZZAR8gZB; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::632; helo=mail-pl1-x632.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=ZZAR8gZB; dkim-atps=neutral Received: from mail-pl1-x632.google.com (mail-pl1-x632.google.com [IPv6:2607:f8b0:4864:20::632]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4HwZJP08fqz301g for ; Fri, 19 Nov 2021 22:32:04 +1100 (AEDT) Received: by mail-pl1-x632.google.com with SMTP id b11so7885276pld.12 for ; Fri, 19 Nov 2021 03:32:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=840qpGoOaQomLKQnTFv1++UtXLylpSHRhF54Gmo/a/E=; b=ZZAR8gZBxjpw1512A9JXX2SZCxzVITcK7tRXeCwkta/3dMyOhqveMSCZQB9PE1DWb6 xCYcYmrTH82SUdV5N/eFbY0eKLzYnWwWifp/Q0pr/2F+vIIYLYehNnjN+QM8n+OoOfXh 04vBd4VIx3pR1AyAMh2mrm6l1PRtyvKqXqAYKPgFFlUpe2ZT4MowXORAA0vulBNU8Lnr u65530UnyV7G5Z8YFLkj4xIJXitTmIY3j3SZkwqFgXakJ5JOBHlfTj8apTK+t+2yPWen fS+Usi8ksuL+xpaxx4851agM2Df1SAzCBFQvipQI9xUHgnFFUqHzB1RKb623vHEkwDiG /7lg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=840qpGoOaQomLKQnTFv1++UtXLylpSHRhF54Gmo/a/E=; b=A6t2c5SV1sd/iulF6qD0HtBkPPMjZcxHVJ0vn6ZbVmMIG+xkFh1gsFfGxLnlRzQWKy OxH7PzdzUUXWhwayQAiNV4s1YjxjgEjNbyLgmcp3SuoOUNLob0fHcMOOauNQ9N/4O7hx a7a4a0dQki17mbenMancrZQPyQno02t6/Uvt8VZknKkausDtQEc3DYqJW8TAp8qFd1br XUVM7aQPFxtc4h+mlqmFFecGMETbrn/h/VE4gN36KLnATCK5NEHv45bDSH8mplrEx+8k coSYbECORfeY+kKLKVzQH65CLWL62vc7/b6TB3lR0m0qChY2jrv6LOBpigyU50st28hw ODKg== X-Gm-Message-State: AOAM531YJXYHdL05aGNzxGf8msdxqhnqAkAxQswUV7EFzJ7Dzm9f8z7O C/hQrWYfAe1OoTsoIPtYWATAxSDQ9e8= X-Google-Smtp-Source: ABdhPJwXjzy4cpfFekxiGns2qm/W1s6yKUkjMuq64UFAs5lmUOcAzS0mISp8OHATeVCmd7k/UNW+OA== X-Received: by 2002:a17:90b:4c8c:: with SMTP id my12mr3724859pjb.157.1637321522392; Fri, 19 Nov 2021 03:32:02 -0800 (PST) Received: from bobo.ozlabs.ibm.com (60-240-2-228.tpgi.com.au. [60.240.2.228]) by smtp.gmail.com with ESMTPSA id g17sm2632626pfv.136.2021.11.19.03.31.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 19 Nov 2021 03:32:02 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v4 3/5] powerpc/watchdog: Avoid holding wd_smp_lock over printk and smp_send_nmi_ipi Date: Fri, 19 Nov 2021 21:31:44 +1000 Message-Id: <20211119113146.752759-4-npiggin@gmail.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20211119113146.752759-1-npiggin@gmail.com> References: <20211119113146.752759-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Dufour , Nicholas Piggin , Daniel Axtens Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" There is a deadlock with the console_owner lock and the wd_smp_lock: CPU x takes the console_owner lock CPU y takes a watchdog timer interrupt and takes __wd_smp_lock CPU x takes a soft-NMI interrupt, detects deadlock, spins on __wd_smp_lock CPU y detects deadlock, tries to print something and spins on console_owner -> deadlock Change the watchdog locking scheme so wd_smp_lock protects the watchdog internal data, but "reporting" (printing, issuing NMI IPIs, taking any action outside of watchdog) uses a non-waiting exclusion. If a CPU detects a problem but can not take the reporting lock, it just returns because something else is already reporting. It will try again at some point. Typically hard lockup watchdog report usefulness is not impacted due to failure to spewing a large enough amount of data in as short a time as possible, but by messages getting garbled. Laurent debugged this and found the deadlock, and this patch is based on his general approach to avoid expensive operations while holding the lock. With the addition of the reporting exclusion. Signed-off-by: Laurent Dufour [np: rework to add reporting exclusion update changelog] Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/watchdog.c | 100 +++++++++++++++++++++++++-------- 1 file changed, 78 insertions(+), 22 deletions(-) diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c index 588f54350d19..cfd45049ec7f 100644 --- a/arch/powerpc/kernel/watchdog.c +++ b/arch/powerpc/kernel/watchdog.c @@ -85,10 +85,36 @@ static DEFINE_PER_CPU(u64, wd_timer_tb); /* SMP checker bits */ static unsigned long __wd_smp_lock; +static unsigned long __wd_reporting; static cpumask_t wd_smp_cpus_pending; static cpumask_t wd_smp_cpus_stuck; static u64 wd_smp_last_reset_tb; +/* + * Try to take the exclusive watchdog action / NMI IPI / printing lock. + * wd_smp_lock must be held. If this fails, we should return and wait + * for the watchdog to kick in again (or another CPU to trigger it). + * + * Importantly, if hardlockup_panic is set, wd_try_report failure should + * not delay the panic, because whichever other CPU is reporting will + * call panic. + */ +static bool wd_try_report(void) +{ + if (__wd_reporting) + return false; + __wd_reporting = 1; + return true; +} + +/* End printing after successful wd_try_report. wd_smp_lock not required. */ +static void wd_end_reporting(void) +{ + smp_mb(); /* End printing "critical section" */ + WARN_ON_ONCE(__wd_reporting == 0); + WRITE_ONCE(__wd_reporting, 0); +} + static inline void wd_smp_lock(unsigned long *flags) { /* @@ -151,45 +177,54 @@ static bool set_cpu_stuck(int cpu, u64 tb) static void watchdog_smp_panic(int cpu, u64 tb) { + static cpumask_t wd_smp_cpus_ipi; // protected by reporting unsigned long flags; + u64 last_reset; int c; wd_smp_lock(&flags); /* Double check some things under lock */ - if ((s64)(tb - wd_smp_last_reset_tb) < (s64)wd_smp_panic_timeout_tb) + last_reset = wd_smp_last_reset_tb; + if ((s64)(tb - last_reset) < (s64)wd_smp_panic_timeout_tb) goto out; if (cpumask_test_cpu(cpu, &wd_smp_cpus_pending)) goto out; - if (cpumask_weight(&wd_smp_cpus_pending) == 0) + if (!wd_try_report()) goto out; + for_each_online_cpu(c) { + if (!cpumask_test_cpu(c, &wd_smp_cpus_pending)) + continue; + if (c == cpu) + continue; // should not happen + + __cpumask_set_cpu(c, &wd_smp_cpus_ipi); + if (set_cpu_stuck(c, tb)) + break; + } + if (cpumask_empty(&wd_smp_cpus_ipi)) { + wd_end_reporting(); + goto out; + } + wd_smp_unlock(&flags); pr_emerg("CPU %d detected hard LOCKUP on other CPUs %*pbl\n", - cpu, cpumask_pr_args(&wd_smp_cpus_pending)); + cpu, cpumask_pr_args(&wd_smp_cpus_ipi)); pr_emerg("CPU %d TB:%lld, last SMP heartbeat TB:%lld (%lldms ago)\n", - cpu, tb, wd_smp_last_reset_tb, - tb_to_ns(tb - wd_smp_last_reset_tb) / 1000000); + cpu, tb, last_reset, tb_to_ns(tb - last_reset) / 1000000); if (!sysctl_hardlockup_all_cpu_backtrace) { /* * Try to trigger the stuck CPUs, unless we are going to * get a backtrace on all of them anyway. */ - for_each_cpu(c, &wd_smp_cpus_pending) { - bool empty; - if (c == cpu) - continue; - /* Take the stuck CPUs out of the watch group */ - empty = set_cpu_stuck(c, tb); + for_each_cpu(c, &wd_smp_cpus_ipi) { smp_send_nmi_ipi(c, wd_lockup_ipi, 1000000); - if (empty) - break; + __cpumask_clear_cpu(c, &wd_smp_cpus_ipi); } - } - - wd_smp_unlock(&flags); - - if (sysctl_hardlockup_all_cpu_backtrace) + } else { trigger_allbutself_cpu_backtrace(); + cpumask_clear(&wd_smp_cpus_ipi); + } /* * Force flush any remote buffers that might be stuck in IRQ context @@ -200,6 +235,8 @@ static void watchdog_smp_panic(int cpu, u64 tb) if (hardlockup_panic) nmi_panic(NULL, "Hard LOCKUP"); + wd_end_reporting(); + return; out: @@ -213,8 +250,6 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb) struct pt_regs *regs = get_irq_regs(); unsigned long flags; - wd_smp_lock(&flags); - pr_emerg("CPU %d became unstuck TB:%lld\n", cpu, tb); print_irqtrace_events(current); @@ -223,6 +258,7 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb) else dump_stack(); + wd_smp_lock(&flags); cpumask_clear_cpu(cpu, &wd_smp_cpus_stuck); wd_smp_unlock(&flags); } else { @@ -318,13 +354,28 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt) tb = get_tb(); if (tb - per_cpu(wd_timer_tb, cpu) >= wd_panic_timeout_tb) { + /* + * Taking wd_smp_lock here means it is a soft-NMI lock, which + * means we can't take any regular or irqsafe spin locks while + * holding this lock. This is why timers can't printk while + * holding the lock. + */ wd_smp_lock(&flags); if (cpumask_test_cpu(cpu, &wd_smp_cpus_stuck)) { wd_smp_unlock(&flags); return 0; } + if (!wd_try_report()) { + wd_smp_unlock(&flags); + /* Couldn't report, try again in 100ms */ + mtspr(SPRN_DEC, 100 * tb_ticks_per_usec * 1000); + return 0; + } + set_cpu_stuck(cpu, tb); + wd_smp_unlock(&flags); + pr_emerg("CPU %d self-detected hard LOCKUP @ %pS\n", cpu, (void *)regs->nip); pr_emerg("CPU %d TB:%lld, last heartbeat TB:%lld (%lldms ago)\n", @@ -334,14 +385,19 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt) print_irqtrace_events(current); show_regs(regs); - wd_smp_unlock(&flags); - if (sysctl_hardlockup_all_cpu_backtrace) trigger_allbutself_cpu_backtrace(); if (hardlockup_panic) nmi_panic(regs, "Hard LOCKUP"); + + wd_end_reporting(); } + /* + * We are okay to change DEC in soft_nmi_interrupt because the masked + * handler has marked a DEC as pending, so the timer interrupt will be + * replayed as soon as local irqs are enabled again. + */ if (wd_panic_timeout_tb < 0x7fffffff) mtspr(SPRN_DEC, wd_panic_timeout_tb); From patchwork Fri Nov 19 11:31:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1557051 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=VLDY6lYW; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4HwZMX1PlKz9sR4 for ; Fri, 19 Nov 2021 22:34:48 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4HwZMX04l3z3cTJ for ; Fri, 19 Nov 2021 22:34:48 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=VLDY6lYW; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::629; helo=mail-pl1-x629.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=VLDY6lYW; dkim-atps=neutral Received: from mail-pl1-x629.google.com (mail-pl1-x629.google.com [IPv6:2607:f8b0:4864:20::629]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4HwZJR4C3Lz3bWX for ; Fri, 19 Nov 2021 22:32:07 +1100 (AEDT) Received: by mail-pl1-x629.google.com with SMTP id v19so7909350plo.7 for ; Fri, 19 Nov 2021 03:32:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=szKvYnxJqWq8uRHA9z21YAvL81GGQGx0+gaVFJ7anCA=; b=VLDY6lYW7USmNXlbMfMCHNIz8XC/wMU0hLo8ahPl7QBf5Wo4ev+q92lYif5YQXwKjY OnLr/Ltx924hmHFDF4e1cGAPvWKGdjXCz/UKOA1d0cYp3Zq9sAQeUiAo6ZEMwHk44I4u 4vdZi93EenNytb5xDNeLPuKXSO/OJdGpBmWMMqNRnduS1NW1DNkx8HfJEYFyfRrFKGoQ jeeME5O66VMoukSsV3jY8ZisSVjtavu7WYqFIWq8FW+y/Iebssy7gwHMcPf8Z6Kp0KE4 7gYlCDnvLH3rgX7aQs2BLO6NiK3qsKbKD0m05exeeZowYCHKPZCeLHbbOP3A8+KUSyS7 /qRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=szKvYnxJqWq8uRHA9z21YAvL81GGQGx0+gaVFJ7anCA=; b=EojHb/RN9Je10r1Lo6kCrRyLInBJDFEsb2RckIXdkq9qgrYoJIayOiy6t+8j7b78L0 keIUE5opfJbmUdtEiN3vB1VNnCImV/xM9Kj2wNWTy6o5+f/fu7dj+nH/PpxOkmtJzvM+ zZJmGcexx6LYlI+lo4zjUIBhFCCpcwTA9hzzrawy8ScgOBwlVrkxDvcO6Rb1iD4m48TI qvsjpnm5l79PLw4Hj8mEp+7w5KiTxZMH0H2hBsUSSjPzCtey8pb/Dc3bwXgCafx/8R7D qrgf7SCtmwCuUlbbGaebiLwoFXkPTaBzS0PO2jqM4IoZYfl+tsc7YB3ArxXk1FivZaUP 0yDg== X-Gm-Message-State: AOAM533J5Tis1EIBfDyd93MEaO2IjxnZ0edHC0kA95ZwQWjGt4VDIFSd 1t01KY/cph150cMxN3QPuK0db2QgQ3Q= X-Google-Smtp-Source: ABdhPJzqFcKFsrxIjTw46yIA+gNDYpBNr32h7kBr744wKcdFAU1Wd5gfK42b5ppcYFf/gYvFE651mg== X-Received: by 2002:a17:90b:1bc4:: with SMTP id oa4mr3811121pjb.179.1637321525097; Fri, 19 Nov 2021 03:32:05 -0800 (PST) Received: from bobo.ozlabs.ibm.com (60-240-2-228.tpgi.com.au. [60.240.2.228]) by smtp.gmail.com with ESMTPSA id g17sm2632626pfv.136.2021.11.19.03.32.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 19 Nov 2021 03:32:04 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v4 4/5] powerpc/watchdog: read TB close to where it is used Date: Fri, 19 Nov 2021 21:31:45 +1000 Message-Id: <20211119113146.752759-5-npiggin@gmail.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20211119113146.752759-1-npiggin@gmail.com> References: <20211119113146.752759-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Dufour , Nicholas Piggin , Daniel Axtens Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" When taking watchdog actions, printing messages, comparing and re-setting wd_smp_last_reset_tb, etc., read TB close to the point of use and under wd_smp_lock or printing lock (if applicable). This should keep timebase mostly monotonic with kernel log messages, and could prevent (in theory) a laggy CPU updating wd_smp_last_reset_tb to something a long way in the past, and causing other CPUs to appear to be stuck. These additional TB reads are all slowpath (lockup has been detected), so performance does not matter. Reviewed-by: Laurent Dufour Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/watchdog.c | 27 ++++++++++++++------------- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c index cfd45049ec7f..23745af38d62 100644 --- a/arch/powerpc/kernel/watchdog.c +++ b/arch/powerpc/kernel/watchdog.c @@ -157,7 +157,7 @@ static void wd_lockup_ipi(struct pt_regs *regs) /* Do not panic from here because that can recurse into NMI IPI layer */ } -static bool set_cpu_stuck(int cpu, u64 tb) +static bool set_cpu_stuck(int cpu) { cpumask_set_cpu(cpu, &wd_smp_cpus_stuck); cpumask_clear_cpu(cpu, &wd_smp_cpus_pending); @@ -166,7 +166,7 @@ static bool set_cpu_stuck(int cpu, u64 tb) */ smp_mb(); if (cpumask_empty(&wd_smp_cpus_pending)) { - wd_smp_last_reset_tb = tb; + wd_smp_last_reset_tb = get_tb(); cpumask_andnot(&wd_smp_cpus_pending, &wd_cpus_enabled, &wd_smp_cpus_stuck); @@ -175,15 +175,16 @@ static bool set_cpu_stuck(int cpu, u64 tb) return false; } -static void watchdog_smp_panic(int cpu, u64 tb) +static void watchdog_smp_panic(int cpu) { static cpumask_t wd_smp_cpus_ipi; // protected by reporting unsigned long flags; - u64 last_reset; + u64 tb, last_reset; int c; wd_smp_lock(&flags); /* Double check some things under lock */ + tb = get_tb(); last_reset = wd_smp_last_reset_tb; if ((s64)(tb - last_reset) < (s64)wd_smp_panic_timeout_tb) goto out; @@ -198,7 +199,7 @@ static void watchdog_smp_panic(int cpu, u64 tb) continue; // should not happen __cpumask_set_cpu(c, &wd_smp_cpus_ipi); - if (set_cpu_stuck(c, tb)) + if (set_cpu_stuck(c)) break; } if (cpumask_empty(&wd_smp_cpus_ipi)) { @@ -243,7 +244,7 @@ static void watchdog_smp_panic(int cpu, u64 tb) wd_smp_unlock(&flags); } -static void wd_smp_clear_cpu_pending(int cpu, u64 tb) +static void wd_smp_clear_cpu_pending(int cpu) { if (!cpumask_test_cpu(cpu, &wd_smp_cpus_pending)) { if (unlikely(cpumask_test_cpu(cpu, &wd_smp_cpus_stuck))) { @@ -251,7 +252,7 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb) unsigned long flags; pr_emerg("CPU %d became unstuck TB:%lld\n", - cpu, tb); + cpu, get_tb()); print_irqtrace_events(current); if (regs) show_regs(regs); @@ -317,7 +318,7 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb) */ wd_smp_lock(&flags); if (cpumask_empty(&wd_smp_cpus_pending)) { - wd_smp_last_reset_tb = tb; + wd_smp_last_reset_tb = get_tb(); cpumask_andnot(&wd_smp_cpus_pending, &wd_cpus_enabled, &wd_smp_cpus_stuck); @@ -332,10 +333,10 @@ static void watchdog_timer_interrupt(int cpu) per_cpu(wd_timer_tb, cpu) = tb; - wd_smp_clear_cpu_pending(cpu, tb); + wd_smp_clear_cpu_pending(cpu); if ((s64)(tb - wd_smp_last_reset_tb) >= (s64)wd_smp_panic_timeout_tb) - watchdog_smp_panic(cpu, tb); + watchdog_smp_panic(cpu); } DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt) @@ -372,7 +373,7 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt) return 0; } - set_cpu_stuck(cpu, tb); + set_cpu_stuck(cpu); wd_smp_unlock(&flags); @@ -433,7 +434,7 @@ void arch_touch_nmi_watchdog(void) tb = get_tb(); if (tb - per_cpu(wd_timer_tb, cpu) >= ticks) { per_cpu(wd_timer_tb, cpu) = tb; - wd_smp_clear_cpu_pending(cpu, tb); + wd_smp_clear_cpu_pending(cpu); } } EXPORT_SYMBOL(arch_touch_nmi_watchdog); @@ -491,7 +492,7 @@ static void stop_watchdog(void *arg) cpumask_clear_cpu(cpu, &wd_cpus_enabled); wd_smp_unlock(&flags); - wd_smp_clear_cpu_pending(cpu, get_tb()); + wd_smp_clear_cpu_pending(cpu); } static int stop_watchdog_on_cpu(unsigned int cpu) From patchwork Fri Nov 19 11:31:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1557052 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=dfd0fSIt; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4HwZND17ltz9sPf for ; Fri, 19 Nov 2021 22:35:23 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4HwZNC5kdNz3ddh for ; Fri, 19 Nov 2021 22:35:23 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=dfd0fSIt; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::629; helo=mail-pl1-x629.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=dfd0fSIt; dkim-atps=neutral Received: from mail-pl1-x629.google.com (mail-pl1-x629.google.com [IPv6:2607:f8b0:4864:20::629]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4HwZJT5T4Lz3bWb for ; Fri, 19 Nov 2021 22:32:09 +1100 (AEDT) Received: by mail-pl1-x629.google.com with SMTP id m24so7898176pls.10 for ; Fri, 19 Nov 2021 03:32:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=bBD6K8XoTAM7ZzTK9IxXY1jKAzzbF/adKK/RRa51jnw=; b=dfd0fSItSwBicPk++srUQxWANowTUia0On69KFPdnjJiXE4L+HBc8yV15Q9lP4J34G 1SDJ8cbGzp82GsjWOBfKmFdKqMongJFjmwWuh2cuOnwScnU7R1tm3kHYa60UGeSY2WFl 6OvMDp1/oHwpdLpy+Mfrdriez6CeDn11nRcndFutC02vOVUZMd6kwaubzVNjdCt4kk0d JuxRrGXJgIKv9yDKMBDJEyHOc3hNB25C0EisT0H5MsQQ1yq4vetUF/SlfeNC04JVp7Pp +cGu7mwBNv+Ncr8dkQNQa2nnkv2jSUUHr8EoPr9kXmOHA70Efpmb8kUWDWbxbTQ6CG8z sg7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=bBD6K8XoTAM7ZzTK9IxXY1jKAzzbF/adKK/RRa51jnw=; b=LvPU57X5LmLcx3jeAsfMlI13c2WjEY7Hmu4MfGH1c4l17uRkyv64K9sGDPhFD0W6e8 iiHNwoSyU6pARY3DwXlfCw6QKky8pNnQqp/LhqjKud+pRgdwUdLruDPnczyKQp65sWdo 56Eo7PMiPXq/Sb4dhpuGpm0jf0egX0gSJV/UpzBINVU4JddM/OYkJnXP9GDqZ03P5wJy cNq+k0sQqCKawE+R6oUTD8hYQZEfxj/S0NDYAq+aD1MzDhNk1Gawvsh1sowICXOGHKnl ZyDmHmb8yAE7qcOwPE7JImjThBeETMk+fepsyqOOKEISxmxg0kIUgBeC1em2TlM9kp/Q ZG8Q== X-Gm-Message-State: AOAM531bU3oAmJmJRKHQR0UkTnG1bR/4C9V4JMEpNNKucnFrSuEslqiv XROwwGidTcnB2/zjgDxNIsHNksa/7aI= X-Google-Smtp-Source: ABdhPJzF1Ag2X4gJsANTJUSS/F2YgDeoYSVWIukcnGF6+lJCQuyU6cO16szFyVvFpfqhCJyuIOsMGA== X-Received: by 2002:a17:90a:fe0a:: with SMTP id ck10mr3835851pjb.216.1637321527847; Fri, 19 Nov 2021 03:32:07 -0800 (PST) Received: from bobo.ozlabs.ibm.com (60-240-2-228.tpgi.com.au. [60.240.2.228]) by smtp.gmail.com with ESMTPSA id g17sm2632626pfv.136.2021.11.19.03.32.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 19 Nov 2021 03:32:07 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v4 5/5] powerpc/watchdog: help remote CPUs to flush NMI printk output Date: Fri, 19 Nov 2021 21:31:46 +1000 Message-Id: <20211119113146.752759-6-npiggin@gmail.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20211119113146.752759-1-npiggin@gmail.com> References: <20211119113146.752759-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Dufour , Nicholas Piggin , Daniel Axtens Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" The printk layer at the moment does not seem to have a good way to force flush printk messages that are created in NMI context, except in the panic path. NMI-context printk messages normally get to the console with irq_work, but that won't help if the CPU is stuck with irqs disabled, as can be the case for hard lockup watchdog messages. The watchdog currently flushes the printk buffers after detecting a lockup on remote CPUs, but they may not have processed their NMI IPI yet by that stage, or they may have self-detected a lockup in which case they won't go via this NMI IPI path. Improve the situation by having NMI-context mark a flag if it called printk, and have watchdog timer interrupts check if that flag was set and try to flush if it was. Latency is not a big problem because we were already stuck for a while, just need to try to make sure the messages eventually make it out. Cc: Laurent Dufour Signed-off-by: Nicholas Piggin Reviewed-by: Laurent Dufour --- This patch requires commit 5d5e4522a7f4 ("printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces"). If backporting this to a kernel without commit 93d102f094be ("printk: remove safe buffers"), then printk_safe_flush() should be used in place of printk_trigger_flush(). Thanks, Nick arch/powerpc/kernel/watchdog.c | 37 ++++++++++++++++++++++++++++------ 1 file changed, 31 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c index 23745af38d62..bfc27496fe7e 100644 --- a/arch/powerpc/kernel/watchdog.c +++ b/arch/powerpc/kernel/watchdog.c @@ -86,6 +86,7 @@ static DEFINE_PER_CPU(u64, wd_timer_tb); /* SMP checker bits */ static unsigned long __wd_smp_lock; static unsigned long __wd_reporting; +static unsigned long __wd_nmi_output; static cpumask_t wd_smp_cpus_pending; static cpumask_t wd_smp_cpus_stuck; static u64 wd_smp_last_reset_tb; @@ -154,6 +155,23 @@ static void wd_lockup_ipi(struct pt_regs *regs) else dump_stack(); + /* + * __wd_nmi_output must be set after we printk from NMI context. + * + * printk from NMI context defers printing to the console to irq_work. + * If that NMI was taken in some code that is hard-locked, then irqs + * are disabled so irq_work will never fire. That can result in the + * hard lockup messages being delayed (indefinitely, until something + * else kicks the console drivers). + * + * Setting __wd_nmi_output will cause another CPU to notice and kick + * the console drivers for us. + * + * xchg is not needed here (it could be a smp_mb and store), but xchg + * gives the memory ordering and atomicity required. + */ + xchg(&__wd_nmi_output, 1); + /* Do not panic from here because that can recurse into NMI IPI layer */ } @@ -227,12 +245,6 @@ static void watchdog_smp_panic(int cpu) cpumask_clear(&wd_smp_cpus_ipi); } - /* - * Force flush any remote buffers that might be stuck in IRQ context - * and therefore could not run their irq_work. - */ - printk_trigger_flush(); - if (hardlockup_panic) nmi_panic(NULL, "Hard LOCKUP"); @@ -337,6 +349,17 @@ static void watchdog_timer_interrupt(int cpu) if ((s64)(tb - wd_smp_last_reset_tb) >= (s64)wd_smp_panic_timeout_tb) watchdog_smp_panic(cpu); + + if (__wd_nmi_output && xchg(&__wd_nmi_output, 0)) { + /* + * Something has called printk from NMI context. It might be + * stuck, so this this triggers a flush that will get that + * printk output to the console. + * + * See wd_lockup_ipi. + */ + printk_trigger_flush(); + } } DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt) @@ -386,6 +409,8 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt) print_irqtrace_events(current); show_regs(regs); + xchg(&__wd_nmi_output, 1); // see wd_lockup_ipi + if (sysctl_hardlockup_all_cpu_backtrace) trigger_allbutself_cpu_backtrace();