From patchwork Sun Sep 6 02:55:36 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Chan
X-Patchwork-Id: 1358240
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org
 (client-ip=23.128.96.18; helo=vger.kernel.org;
 envelope-from=netdev-owner@vger.kernel.org; receiver=)
Authentication-Results: ozlabs.org;
 dmarc=pass (p=quarantine dis=none) header.from=broadcom.com
Authentication-Results: ozlabs.org;
 dkim=pass (1024-bit key; unprotected) header.d=broadcom.com
 header.i=@broadcom.com header.a=rsa-sha256 header.s=google
 header.b=T0hKDJjT; dkim-atps=neutral
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by ozlabs.org (Postfix) with ESMTP id 4Bkbdn6xCtz9sTS
 for ; Sun, 6 Sep 2020 12:56:13 +1000 (AEST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1728662AbgIFCz6 (ORCPT ); Sat, 5 Sep 2020 22:55:58 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56428 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1728257AbgIFCzv (ORCPT ); Sat, 5 Sep 2020 22:55:51 -0400
Received: from mail-pl1-x642.google.com (mail-pl1-x642.google.com
 [IPv6:2607:f8b0:4864:20::642]) by lindbergh.monkeyblade.net (Postfix)
 with ESMTPS id 68589C061573 for ; Sat, 5 Sep 2020 19:55:51 -0700 (PDT)
Received: by mail-pl1-x642.google.com with SMTP id s10so2915305plp.1
 for ; Sat, 05 Sep 2020 19:55:51 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=zFbvft9lBMSz2gGZ6CsDgcHMULuuJfQk4PukonXGLvo=;
 b=T0hKDJjTRrMRXZPTb5MXAi8IzQ2chN4UmqBumhDi72d3CI83AbSDZFgKz224oHfv9Q
 QVhE+Oa33d8OVRox16A5PNyTLNwpv1gVraIXGadDZ3X209wpVQkHM3tH+YMe/GAM0R9K
 JjOqofec1ouDZXr0iIHA9kdGEzazKvPOPoQ6k=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=zFbvft9lBMSz2gGZ6CsDgcHMULuuJfQk4PukonXGLvo=;
 b=nqE7NMZtUkdj4FQ+WYvMMkS9IvAI40aonK86cAvskwgm2U0B+sMUWaYUDukAIJFr99
 QZPNjZ7x9VMEWBIQJuZvtsMbvZQHw/HunrJLF9DLwio2DUzgbLFZPC3m/rpCPJz7DWSZ
 KeUQ9F7Dklt2uZ1mTXyj2rCdjaqDW6FQm/94dl79Na9yM9DZfzH8HUIp+yClEDlmTSwN
 bnwq3nD7AZ40HH5r9exFk5Jn4WSicmM44xM7A9xpHjImrKvoLZFpV2KazxEt0b4LHzOf
 H47D3voT0TosqQpf+mi44k7SRqKO57yLaVo2ibidxe1Ygxvv1QghqqcebuWb+Jl9l0zt
 u0sg==
X-Gm-Message-State: AOAM530TxPRDbJ3SO3JFd2+jAc85BdozlRWOwISR+SSp23U1i5z0A4iY
 yFGZ2IRd7RX9w3gR0/W5qTg25cNUCAyXGw==
X-Google-Smtp-Source: ABdhPJxDeRWRtuomi+z6TeFR1oOR9jvk7NpuIuctpdDm4nBfGUo+e0yTyfogZgroIxhJ3OD0ekYV7A==
X-Received: by 2002:a17:90a:9285:: with SMTP id n5mr15630875pjo.64.1599360949588;
 Sat, 05 Sep 2020 19:55:49 -0700 (PDT)
Received: from localhost.swdvt.lab.broadcom.com ([192.19.223.252])
 by smtp.gmail.com with ESMTPSA id h5sm1346959pgn.75.2020.09.05.19.55.48
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Sat, 05 Sep 2020 19:55:49 -0700 (PDT)
From: Michael Chan
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, kuba@kernel.org, Vasundhara Volam
Subject: [PATCH net 1/2] bnxt_en: Avoid sending firmware messages when AER error is detected.
Date: Sat, 5 Sep 2020 22:55:36 -0400
Message-Id: <1599360937-26197-2-git-send-email-michael.chan@broadcom.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1599360937-26197-1-git-send-email-michael.chan@broadcom.com>
References: <1599360937-26197-1-git-send-email-michael.chan@broadcom.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

From: Vasundhara Volam

When the driver goes through PCIe AER reset in error state, all firmware
messages will time out because the PCIe bus is no longer accessible.
This can lead to AER reset taking many minutes to complete as each
firmware command takes time to time out.

Define a new macro BNXT_NO_FW_ACCESS() to skip these firmware messages
when either firmware is in fatal error state or when
pci_channel_offline() is true.  It now takes a more reasonable 20 to
30 seconds to complete AER recovery.

Fixes: b4fff2079d10 ("bnxt_en: Do not send firmware messages if firmware is in error state.")
Signed-off-by: Vasundhara Volam
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 6 +++---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4 ++++
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index b167066..619eb55 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4305,7 +4305,7 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void *msg, u32 msg_len,
 	u32 bar_offset = BNXT_GRCPF_REG_CHIMP_COMM;
 	u16 dst = BNXT_HWRM_CHNL_CHIMP;
 
-	if (test_bit(BNXT_STATE_FW_FATAL_COND, &bp->state))
+	if (BNXT_NO_FW_ACCESS(bp))
 		return -EBUSY;
 
 	if (msg_len > BNXT_HWRM_MAX_REQ_LEN) {
@@ -5723,7 +5723,7 @@ static int hwrm_ring_free_send_msg(struct bnxt *bp,
 	struct hwrm_ring_free_output *resp = bp->hwrm_cmd_resp_addr;
 	u16 error_code;
 
-	if (test_bit(BNXT_STATE_FW_FATAL_COND, &bp->state))
+	if (BNXT_NO_FW_ACCESS(bp))
 		return 0;
 
 	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_RING_FREE, cmpl_ring_id, -1);
@@ -7817,7 +7817,7 @@ static int bnxt_set_tpa(struct bnxt *bp, bool set_tpa)
 
 	if (set_tpa)
 		tpa_flags = bp->flags & BNXT_FLAG_TPA;
-	else if (test_bit(BNXT_STATE_FW_FATAL_COND, &bp->state))
+	else if (BNXT_NO_FW_ACCESS(bp))
 		return 0;
 	for (i = 0; i < bp->nr_vnics; i++) {
 		rc = bnxt_hwrm_vnic_set_tpa(bp, i, tpa_flags);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 5a13eb6..0ef89da 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1737,6 +1737,10 @@ struct bnxt {
 #define BNXT_STATE_FW_FATAL_COND	6
 #define BNXT_STATE_DRV_REGISTERED	7
 
+#define BNXT_NO_FW_ACCESS(bp)					\
+	(test_bit(BNXT_STATE_FW_FATAL_COND, &(bp)->state) ||	\
+	 pci_channel_offline((bp)->pdev))
+
 	struct bnxt_irq *irq_tbl;
 	int total_irqs;
 	u8 mac_addr[ETH_ALEN];

From patchwork Sun Sep 6 02:55:37 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Chan
X-Patchwork-Id: 1358239
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming-netdev@ozlabs.org
Delivered-To: patchwork-incoming-netdev@ozlabs.org
Authentication-Results: ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org
 (client-ip=23.128.96.18; helo=vger.kernel.org;
 envelope-from=netdev-owner@vger.kernel.org; receiver=)
Authentication-Results: ozlabs.org;
 dmarc=pass (p=quarantine dis=none) header.from=broadcom.com
Authentication-Results: ozlabs.org;
 dkim=pass (1024-bit key; unprotected) header.d=broadcom.com
 header.i=@broadcom.com header.a=rsa-sha256 header.s=google
 header.b=NFefUQ6i; dkim-atps=neutral
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by ozlabs.org (Postfix) with ESMTP id 4Bkbdl6RF0z9sSP
 for ; Sun, 6 Sep 2020 12:56:11 +1000 (AEST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1728731AbgIFCz6 (ORCPT ); Sat, 5 Sep 2020 22:55:58 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56432 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1728400AbgIFCzv (ORCPT ); Sat, 5 Sep 2020 22:55:51 -0400
Received: from mail-pl1-x644.google.com (mail-pl1-x644.google.com
 [IPv6:2607:f8b0:4864:20::644]) by lindbergh.monkeyblade.net (Postfix)
 with ESMTPS id 7CCF1C061575 for ; Sat, 5 Sep 2020 19:55:51 -0700 (PDT)
Received: by mail-pl1-x644.google.com with SMTP id a8so2915064plm.2
 for ; Sat, 05 Sep 2020 19:55:51 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=CEP28nOtEo0ciAPmv7dW3hQUWMcebI/be1qrrf2Ssm0=;
 b=NFefUQ6ia05ontLqp8ekY/q/TyERKr6WLpdRgIaCoDJmRgUF46UQzkUoS5lEwvaroq
 qO42xRbFR+3jSmdTcccP/0dvq8cxRQXLNZlvsvcFM6Q6AC3AHq+G6JC3SRXQQBMDvrPT
 7z0qDnjWX1bVd/benY1UVAuDoUZ6RQoBUKPAY=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=CEP28nOtEo0ciAPmv7dW3hQUWMcebI/be1qrrf2Ssm0=;
 b=qvfXtFMMjVfh2XQb8FB5KgGgaSbyk7u+hEgNpJGQkPR47XCROZM/X3eJ8WEEAWU07L
 rEKALGQm4M1mAw/2UsN+DVZI44Q5ApAG2FpnaU8ajJtL7CsZynI2tsbxnO7LkXZl4Ku0
 GCkPyNsguL2/tqejdNWWB6Unt0hacFRkp7JW4GVevJqZ7gOi4jJXcN5ksBMzIXj+Kgte
 x4l3mO/HnXZW/YuVdzEVp26iwdSZFldeYBjET55nJDBWfIUu01RRgsUNfzL7NYOhshrU
 AekKY6mKdwKXStGkAzQY+hwyNfqnqT0MRd10a7tIkXk00UIIVYa38rYZ585+5aKSxa7R
 V1Kg==
X-Gm-Message-State: AOAM5316zblqUSeU72hK0f/39+0lK+Thz4Fds7R/EbLfn+f68LoHUXCG
 W1NH5RaWas3n2xlafBZ4oys2gQ==
X-Google-Smtp-Source: ABdhPJw39c5g+ha7dPzSF2We4+Y35Yqrli5Qw/NecAdriG2bi3MN2TUsM7TLUaTgOWHpG2OD+e6Nww==
X-Received: by 2002:a17:90b:3891:: with SMTP id mu17mr14239860pjb.160.1599360950865;
 Sat, 05 Sep 2020 19:55:50 -0700 (PDT)
Received: from localhost.swdvt.lab.broadcom.com ([192.19.223.252])
 by smtp.gmail.com with ESMTPSA id h5sm1346959pgn.75.2020.09.05.19.55.49
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Sat, 05 Sep 2020 19:55:50 -0700 (PDT)
From: Michael Chan
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, kuba@kernel.org, Vasundhara Volam
Subject: [PATCH net 2/2] bnxt_en: Fix NULL ptr dereference crash in bnxt_fw_reset_task()
Date: Sat, 5 Sep 2020 22:55:37 -0400
Message-Id: <1599360937-26197-3-git-send-email-michael.chan@broadcom.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1599360937-26197-1-git-send-email-michael.chan@broadcom.com>
References: <1599360937-26197-1-git-send-email-michael.chan@broadcom.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

From: Vasundhara Volam

bnxt_fw_reset_task(), which runs from a workqueue, can race with
bnxt_remove_one(), for example if firmware reset and VF FLR happen at
about the same time.

bnxt_remove_one() already cancels the workqueue and waits for it to
finish, but we need to do this earlier, before the devlink reporters
are destroyed.  This guarantees that the devlink reporters are always
valid while bnxt_fw_reset_task() is still running.

Fixes: b148bb238c02 ("bnxt_en: Fix possible crash in bnxt_fw_reset_task().")
Reviewed-by: Edwin Peer
Signed-off-by: Vasundhara Volam
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 619eb55..8eb73fe 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -11779,6 +11779,10 @@ static void bnxt_remove_one(struct pci_dev *pdev)
 	if (BNXT_PF(bp))
 		bnxt_sriov_disable(bp);
 
+	clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state);
+	bnxt_cancel_sp_work(bp);
+	bp->sp_event = 0;
+
 	bnxt_dl_fw_reporters_destroy(bp, true);
 	if (BNXT_PF(bp))
 		devlink_port_type_clear(&bp->dl_port);
@@ -11786,9 +11790,6 @@ static void bnxt_remove_one(struct pci_dev *pdev)
 	unregister_netdev(dev);
 	bnxt_dl_unregister(bp);
 	bnxt_shutdown_tc(bp);
-	clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state);
-	bnxt_cancel_sp_work(bp);
-	bp->sp_event = 0;
 
 	bnxt_clear_int_mode(bp);
 	bnxt_hwrm_func_drv_unrgtr(bp);