From patchwork Thu Jul 30 17:09:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tony Nguyen X-Patchwork-Id: 1338982 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=intel.com Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4BHcNC58bYz9sRX for ; Fri, 31 Jul 2020 03:09:47 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730085AbgG3RJq (ORCPT ); Thu, 30 Jul 2020 13:09:46 -0400 Received: from mga04.intel.com ([192.55.52.120]:53573 "EHLO mga04.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727072AbgG3RJp (ORCPT ); Thu, 30 Jul 2020 13:09:45 -0400 IronPort-SDR: kZjhnbVnSsKlFbOX/Sv4KOZIP5N/ZwEIjE7uxlDR0+ILR6EUuVVZAhMCilEuggXXqwlBce4d5l GM80S86204yw== X-IronPort-AV: E=McAfee;i="6000,8403,9698"; a="149111765" X-IronPort-AV: E=Sophos;i="5.75,415,1589266800"; d="scan'208";a="149111765" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Jul 2020 10:09:44 -0700 IronPort-SDR: y0nW39Z1i8VKzqXnB6B/8ud7Db11d4wnn/oQAZLWbGbCyWuz1goOt+l/XCBe4nspf7Y/VYUN4A YOUs3yaRvHFA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,415,1589266800"; d="scan'208";a="272979243" Received: from jtkirshe-desk1.jf.intel.com ([134.134.177.86]) by fmsmga007.fm.intel.com with ESMTP; 30 Jul 2020 10:09:44 -0700 From: Tony Nguyen To: davem@davemloft.net Cc: Francesco Ruggeri , netdev@vger.kernel.org, nhorman@redhat.com, sassmann@redhat.com, jeffrey.t.kirsher@intel.com, anthony.l.nguyen@intel.com, Aaron Brown Subject: [net 2/2] igb: reinit_locked() should be called with rtnl_lock Date: Thu, 30 Jul 2020 10:09:38 -0700 Message-Id: <20200730170938.3766899-3-anthony.l.nguyen@intel.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200730170938.3766899-1-anthony.l.nguyen@intel.com> References: <20200730170938.3766899-1-anthony.l.nguyen@intel.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Francesco Ruggeri We observed two panics involving races with igb_reset_task. The first panic is caused by this race condition: kworker reboot -f igb_reset_task igb_reinit_locked igb_down napi_synchronize __igb_shutdown igb_clear_interrupt_scheme igb_free_q_vectors igb_free_q_vector adapter->q_vector[v_idx] = NULL; napi_disable Panics trying to access adapter->q_vector[v_idx].napi_state The second panic (a divide error) is caused by this race: kworker reboot -f tx packet igb_reset_task __igb_shutdown rtnl_lock() ... igb_clear_interrupt_scheme igb_free_q_vectors adapter->num_tx_queues = 0 ... rtnl_unlock() rtnl_lock() igb_reinit_locked igb_down igb_up netif_tx_start_all_queues dev_hard_start_xmit igb_xmit_frame igb_tx_queue_mapping Panics on r_idx % adapter->num_tx_queues This commit applies to igb_reset_task the same changes that were applied to ixgbe in commit 2f90b8657ec9 ("ixgbe: this patch adds support for DCB to the kernel and ixgbe driver"), commit 8f4c5c9fb87a ("ixgbe: reinit_locked() should be called with rtnl_lock") and commit 88adce4ea8f9 ("ixgbe: fix possible race in reset subtask"). Signed-off-by: Francesco Ruggeri Tested-by: Aaron Brown Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/igb/igb_main.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 8bb3db2cbd41..6e5861bfb0fa 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -6224,9 +6224,18 @@ static void igb_reset_task(struct work_struct *work) struct igb_adapter *adapter; adapter = container_of(work, struct igb_adapter, reset_task); + rtnl_lock(); + /* If we're already down or resetting, just bail */ + if (test_bit(__IGB_DOWN, &adapter->state) || + test_bit(__IGB_RESETTING, &adapter->state)) { + rtnl_unlock(); + return; + } + igb_dump(adapter); netdev_err(adapter->netdev, "Reset adapter\n"); igb_reinit_locked(adapter); + rtnl_unlock(); } /**