From patchwork Fri Nov 6 23:26:02 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joshua Hay X-Patchwork-Id: 541145 X-Patchwork-Delegate: jeffrey.t.kirsher@intel.com Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from silver.osuosl.org (smtp3.osuosl.org [140.211.166.136]) by ozlabs.org (Postfix) with ESMTP id A85EF1402C0 for ; Sat, 7 Nov 2015 10:26:22 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by silver.osuosl.org (Postfix) with ESMTP id F227D332A9; Fri, 6 Nov 2015 23:26:21 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from silver.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id XICSojvVPCLx; Fri, 6 Nov 2015 23:26:20 +0000 (UTC) Received: from ash.osuosl.org (ash.osuosl.org [140.211.166.34]) by silver.osuosl.org (Postfix) with ESMTP id 4EBB033279; Fri, 6 Nov 2015 23:26:20 +0000 (UTC) X-Original-To: intel-wired-lan@lists.osuosl.org Delivered-To: intel-wired-lan@lists.osuosl.org Received: from fraxinus.osuosl.org (smtp4.osuosl.org [140.211.166.137]) by ash.osuosl.org (Postfix) with ESMTP id 8D7221C09CC for ; Fri, 6 Nov 2015 23:26:19 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by fraxinus.osuosl.org (Postfix) with ESMTP id 89D38A3EC8 for ; Fri, 6 Nov 2015 23:26:19 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from fraxinus.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id aoEOt5i-p2Kg for ; Fri, 6 Nov 2015 23:26:13 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by fraxinus.osuosl.org (Postfix) with ESMTP id 8EE70A3ECF for ; Fri, 6 Nov 2015 23:26:13 +0000 (UTC) Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga103.jf.intel.com with ESMTP; 06 Nov 2015 15:26:13 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.20,254,1444719600"; d="scan'208";a="595575984" Received: from jahay1-mobl2.amr.corp.intel.com (HELO localhost.localdomain.localdomain) ([134.134.176.151]) by FMSMGA003.fm.intel.com with ESMTP; 06 Nov 2015 15:26:13 -0800 From: Joshua Hay To: intel-wired-lan@lists.osuosl.org Date: Fri, 6 Nov 2015 15:26:02 -0800 Message-Id: <1446852372-25480-5-git-send-email-joshua.a.hay@intel.com> X-Mailer: git-send-email 2.1.0 In-Reply-To: <1446852372-25480-1-git-send-email-joshua.a.hay@intel.com> References: <1446852372-25480-1-git-send-email-joshua.a.hay@intel.com> Subject: [Intel-wired-lan] [next PATCH S21 04/14] i40e: Detection and recovery of TX queue hung logic moved to service_task from tx_timeout X-BeenThere: intel-wired-lan@lists.osuosl.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Intel Wired Ethernet Linux Kernel Driver Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-wired-lan-bounces@lists.osuosl.org Sender: "Intel-wired-lan" From: Kiran Patil This patch contains following changes: - detection and recovery logic (issue SW interrupt) has been moved to service_task from timeout function. - added some more debug info from tx_timeout. Logic to detect and recover TX queue hung is now two step process: - service_task detects TX queue hung and sets a bit(hung_detected) if it was not set. - if bit was set (means this is back-back hung condition detected), issue SW interrupt and clear the bit. - napi_poll clears the bit unconditionally since it cleans TX/RX queues. Signed-off-by: Kiran Patil Change-ID: Ieed03a48927c845a988b3ff375090bf37caeb903 Tested-by: Andrew Bowers --- drivers/net/ethernet/intel/i40e/i40e.h | 3 +++ drivers/net/ethernet/intel/i40e/i40e_main.c | 36 ++++++++++++++++++++++----- drivers/net/ethernet/intel/i40e/i40e_txrx.c | 2 ++ drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 21 ++++++++++------ drivers/net/ethernet/intel/i40evf/i40e_txrx.h | 15 +++++++++++ 5 files changed, 64 insertions(+), 13 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h index d854a46..768a19b 100644 --- a/drivers/net/ethernet/intel/i40e/i40e.h +++ b/drivers/net/ethernet/intel/i40e/i40e.h @@ -579,6 +579,9 @@ struct i40e_q_vector { u8 num_ringpairs; /* total number of ring pairs in vector */ +#define I40E_Q_VECTOR_HUNG_DETECT 0 /* Bit Index for hung detection logic */ + unsigned long hung_detected; /* Set/Reset for hung_detection logic */ + cpumask_t affinity_mask; struct rcu_head rcu; /* to avoid race with update stats on free */ char name[I40E_INT_NAME_STR_LEN]; diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 7759703..d748431 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -4366,17 +4366,41 @@ static void i40e_detect_recover_hung_queue(int q_idx, struct i40e_vsi *vsi) else val = rd32(&pf->hw, I40E_PFINT_DYN_CTL0); + /* Bail out if interrupts are disabled because napi_poll + * execution in-progress or will get scheduled soon. + * napi_poll cleans TX and RX queues and updates 'next_to_clean'. + */ + if (!(val & I40E_PFINT_DYN_CTLN_INTENA_MASK)) + return; + head = i40e_get_head(tx_ring); tx_pending = i40e_get_tx_pending(tx_ring); - /* Interrupts are disabled and TX pending is non-zero, - * trigger the SW interrupt (don't wait). Worst case - * there will be one extra interrupt which may result - * into not cleaning any queues because queues are cleaned. + /* HW is done executing descriptors, updated HEAD write back, + * but SW hasn't processed those descriptors. If interrupt is + * not generated from this point ON, it could result into + * dev_watchdog detecting timeout on those netdev_queue, + * hence proactively trigger SW interrupt. */ - if (tx_pending && (!(val & I40E_PFINT_DYN_CTLN_INTENA_MASK))) - i40e_force_wb(vsi, tx_ring->q_vector); + if (tx_pending) { + /* NAPI Poll didn't run and clear since it was set */ + if (test_and_clear_bit(I40E_Q_VECTOR_HUNG_DETECT, + &tx_ring->q_vector->hung_detected)) { + netdev_info(vsi->netdev, "VSI_seid %d, Hung TX queue %d, tx_pending: %d, NTC:0x%x, HWB: 0x%x, NTU: 0x%x, TAIL: 0x%x\n", + vsi->seid, q_idx, tx_pending, + tx_ring->next_to_clean, head, + tx_ring->next_to_use, + readl(tx_ring->tail)); + netdev_info(vsi->netdev, "VSI_seid %d, Issuing force_wb for TX queue %d, Interrupt Reg: 0x%x\n", + vsi->seid, q_idx, val); + i40e_force_wb(vsi, tx_ring->q_vector); + } else { + /* First Chance - detected possible hung */ + set_bit(I40E_Q_VECTOR_HUNG_DETECT, + &tx_ring->q_vector->hung_detected); + } + } } /** diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c index 494e6ef..9be0a90 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c @@ -1892,6 +1892,8 @@ int i40e_napi_poll(struct napi_struct *napi, int budget) return 0; } + /* Clear hung_detected bit */ + clear_bit(I40E_Q_VECTOR_HUNG_DETECT, &q_vector->hung_detected); /* Since the actual Tx work is minimal, we can give the Tx a larger * budget and be more aggressive about cleaning up the Tx descriptors. */ diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c index 3725744..3a531fe 100644 --- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c @@ -127,17 +127,24 @@ void i40evf_free_tx_resources(struct i40e_ring *tx_ring) } /** - * i40e_get_head - Retrieve head from head writeback - * @tx_ring: tx ring to fetch head of + * i40evf_get_tx_pending - how many tx descriptors not processed + * @tx_ring: the ring of descriptors * - * Returns value of Tx ring head based on value stored - * in head write-back location + * Since there is no access to the ring head register + * in XL710, we need to use our local copies **/ -static inline u32 i40e_get_head(struct i40e_ring *tx_ring) +u32 i40evf_get_tx_pending(struct i40e_ring *ring) { - void *head = (struct i40e_tx_desc *)tx_ring->desc + tx_ring->count; + u32 head, tail; - return le32_to_cpu(*(volatile __le32 *)head); + head = i40e_get_head(ring); + tail = readl(ring->tail); + + if (head != tail) + return (head < tail) ? + tail - head : (tail + ring->count - head); + + return 0; } #define WB_STRIDE 0x3 diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.h b/drivers/net/ethernet/intel/i40evf/i40e_txrx.h index 929ddd9..6640ea5 100644 --- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.h +++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.h @@ -324,4 +324,19 @@ int i40evf_setup_rx_descriptors(struct i40e_ring *rx_ring); void i40evf_free_tx_resources(struct i40e_ring *tx_ring); void i40evf_free_rx_resources(struct i40e_ring *rx_ring); int i40evf_napi_poll(struct napi_struct *napi, int budget); +u32 i40evf_get_tx_pending(struct i40e_ring *ring); + +/** + * i40e_get_head - Retrieve head from head writeback + * @tx_ring: tx ring to fetch head of + * + * Returns value of Tx ring head based on value stored + * in head write-back location + **/ +static inline u32 i40e_get_head(struct i40e_ring *tx_ring) +{ + void *head = (struct i40e_tx_desc *)tx_ring->desc + tx_ring->count; + + return le32_to_cpu(*(volatile __le32 *)head); +} #endif /* _I40E_TXRX_H_ */