From patchwork Tue Apr 28 19:01:52 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Prashant Sreedharan
X-Patchwork-Id: 465740
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 11EC014007D
 for ; Wed, 29 Apr 2015 05:11:45 +1000 (AEST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1030935AbbD1TLk (ORCPT ); Tue, 28 Apr 2015 15:11:40 -0400
Received: from mail-gw1-out.broadcom.com ([216.31.210.62]:44331 "EHLO
 mail-gw1-out.broadcom.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1030934AbbD1TLg (ORCPT ); Tue, 28 Apr 2015 15:11:36 -0400
X-IronPort-AV: E=Sophos;i="5.11,665,1422950400";
 d="scan'208,223";a="63532534"
Received: from irvexchcas06.broadcom.com (HELO
 IRVEXCHCAS06.corp.ad.broadcom.com) ([10.9.208.53]) by
 mail-gw1-out.broadcom.com with ESMTP; 28 Apr 2015 12:39:37 -0700
Received: from IRVEXCHSMTP3.corp.ad.broadcom.com (10.9.207.53) by
 IRVEXCHCAS06.corp.ad.broadcom.com (10.9.208.53) with Microsoft SMTP Server
 (TLS) id 14.3.235.1; Tue, 28 Apr 2015 12:11:35 -0700
Received: from mail-irva-13.broadcom.com (10.10.10.20) by
 IRVEXCHSMTP3.corp.ad.broadcom.com (10.9.207.53) with Microsoft SMTP Server
 id 14.3.235.1; Tue, 28 Apr 2015 12:11:35 -0700
Received: from [10.12.136.109] (unknown [10.12.136.109]) by
 mail-irva-13.broadcom.com (Postfix) with ESMTP id CB43B41067;
 Tue, 28 Apr 2015 12:09:25 -0700 (PDT)
Subject: Re: [Problem] broadcom tg3 network driver disconnects under high load
From: Prashant Sreedharan
To: Michael Chan
CC: Toan Pham , ,
In-Reply-To: <1430244665.6888.26.camel@LTIRV-MCHAN1.corp.ad.broadcom.com>
References: <1429908991.3920.2.camel@LTIRV-MCHAN1.corp.ad.broadcom.com>
 <1430244665.6888.26.camel@LTIRV-MCHAN1.corp.ad.broadcom.com>
Date: Tue, 28 Apr 2015 12:01:52 -0700
Message-ID: <1430247712.26841.18.camel@prashant>
X-Mailer: Evolution 2.32.3 (2.32.3-30.el6)
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

On Tue, 2015-04-28 at 11:11 -0700, Michael Chan wrote:
> On Mon, 2015-04-27 at 22:10 +0000, Toan Pham wrote:
> > Michael,
> >
> >
> > Please see attach files.
> >
> > BTW, I have also tested this bug on at least 8 different HP 705 PCs
> > with the 5762 NIC, so it is probably not a manufacturer defect. In
> > addition, I can never replicate the same issue on the older chipset,
> > BCM5761, which can be found on the HP model 6005. I hope this
> > information is helpful. Thanks
>
> Thanks for the data. The memory enable bit is cleared and there are
> some correctable error bits set. My colleague Sanjeev will look into
> this.
>
> Do you have PCIE Advanced Error Reporting (CONFIG_PCIEAER) enabled in
> your kernel?
>

The 5762 NIC has a hardware bug in which the chip detects a false 4G
boundary crossing and then stalls. With the data you have provided, it
is not clear whether we are hitting this problem. Bit 5 of register
0x4c04 is set when this condition occurs, but since the memory enable
bit is clear, the register dump collected before the chip was reset
contains only garbage. We were able to reproduce this issue internally
only with the IOMMU enabled, and I do not see the IOMMU enabled in your
dmesg logs. So unless we have a PCIe trace, we cannot confirm whether
this HW bug is indeed the problem you are seeing.

Meanwhile, can you try the attached patch and see if you are still able
to reproduce the problem? This patch restricts all DMA addresses given
to the chip to 31 bits.

Toan, thanks for bringing this to our notice. Also, please cc the
maintainers so that mails are not missed.

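For anyone reading a register dump against the description above, here is a minimal sketch of the check in plain user-space C. The register offset (0x4c04) and bit number (5) come from the mail, and the Memory Space Enable bit (bit 1 of the PCI command register) is standard PCI. The macro names, the enum, and the function are hypothetical and are not taken from tg3.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* PCI command register, bit 1: Memory Space Enable (standard PCI). */
#define PCI_CMD_MEMORY_ENABLE	(1u << 1)

/* Offset and bit number are from the mail; the names are made up here. */
#define REG_4C04_OFFSET		0x4c04u
#define REG_4C04_STALL_BIT	(1u << 5)

enum stall_verdict {
	STALL_UNKNOWN,	/* dump cannot be trusted */
	STALL_NOT_SEEN,	/* bit 5 clear in a trustworthy dump */
	STALL_SEEN,	/* bit 5 set in a trustworthy dump */
};

/*
 * Interpret a dumped value of register 0x4c04. If memory enable was
 * clear when the dump was taken, the register contents are garbage, so
 * the function refuses to draw a conclusion from bit 5.
 */
static enum stall_verdict check_false_4g_stall(uint16_t pci_command,
					       uint32_t reg_4c04)
{
	if (!(pci_command & PCI_CMD_MEMORY_ENABLE))
		return STALL_UNKNOWN;

	return (reg_4c04 & REG_4C04_STALL_BIT) ? STALL_SEEN : STALL_NOT_SEEN;
}
```

With the dump from this report (memory enable clear), such a check would return STALL_UNKNOWN whatever value was read back, which is why a PCIe trace is needed to confirm the bug.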
From 488fd699985f73d361d04d4788de48833c6442ca Mon Sep 17 00:00:00 2001
From: Prashant Sreedharan
Date: Tue, 28 Apr 2015 11:32:56 -0700
Subject: [PATCH] tg3: Restrict DMA address to 31 bits for 5762 device

---
 drivers/net/ethernet/broadcom/tg3.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 069952f..e980c96 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -17707,6 +17707,8 @@ static int tg3_init_one(struct pci_dev *pdev,
 	 */
 	if (tg3_flag(tp, IS_5788))
 		persist_dma_mask = dma_mask = DMA_BIT_MASK(32);
+	else if (tg3_asic_rev(tp) == ASIC_REV_5762)
+		persist_dma_mask = dma_mask = DMA_BIT_MASK(31);
 	else if (tg3_flag(tp, 40BIT_DMA_BUG)) {
 		persist_dma_mask = dma_mask = DMA_BIT_MASK(40);
 #ifdef CONFIG_HIGHMEM
@@ -17736,6 +17738,17 @@ static int tg3_init_one(struct pci_dev *pdev,
 				"No usable DMA configuration, aborting\n");
 			goto err_out_apeunmap;
 		}
+	} else {
+		err = pci_set_dma_mask(pdev, dma_mask);
+		if (!err) {
+			err = pci_set_consistent_dma_mask(pdev,
+							  persist_dma_mask);
+		}
+		if (err) {
+			dev_err(&pdev->dev,
+				"No usable DMA configuration, aborting\n");
+			goto err_out_apeunmap;
+		}
 	}
 
 	tg3_init_bufmgr_config(tp);
-- 
1.7.1
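
To illustrate what the 31-bit mask buys, the following user-space sketch works through the address arithmetic. DMA_BIT_MASK() is copied from the kernel's definition; fits_dma_mask() and crosses_4g() are hypothetical helpers written for this example and do not exist in tg3.c. The sketch only shows that a buffer confined to 31-bit addresses ends below 2 GiB and so cannot straddle a 4 GiB boundary. It does not model how the 5762 hardware arrives at its false detection, and the mail does not say why 31 bits was chosen rather than 32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same definition as the kernel's DMA_BIT_MASK() macro. */
#define DMA_BIT_MASK(n)	(((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Hypothetical helper: true if the whole buffer [addr, addr + len)
 * is addressable under the given DMA mask.
 */
static bool fits_dma_mask(uint64_t addr, uint64_t len, uint64_t mask)
{
	return len != 0 && addr <= mask && len - 1 <= mask - addr;
}

/*
 * Hypothetical helper: true if the first and last byte of the buffer
 * fall in different 4 GiB windows. Assumes addr + len does not wrap.
 */
static bool crosses_4g(uint64_t addr, uint64_t len)
{
	return len != 0 && (addr >> 32) != ((addr + len - 1) >> 32);
}
```

For example, a 4 KiB buffer at 0x7ffff000 is the highest page that fits DMA_BIT_MASK(31) and it ends at 0x7fffffff, while an 8 KiB buffer at 0xfffff000 genuinely crosses 4 GiB and would be rejected by the 31-bit mask long before it got there.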