From patchwork Mon Apr 27 23:59:22 2015
X-Patchwork-Submitter: Benjamin Herrenschmidt
X-Patchwork-Id: 465334
Message-ID: <1430179162.16571.136.camel@kernel.crashing.org>
From: Benjamin Herrenschmidt
To: skiboot@lists.ozlabs.org
Date: Tue, 28 Apr 2015 09:59:22 +1000
Subject: [Skiboot] [PATCH] phb3: Disable write scope group in PHB for certain adapters

A performance issue on HPC workloads was identified with some network
adapters: the specific DMA access patterns they use hit a worst-case
scenario in the PHB.

Disabling the write scope group feature in the PHB works around this,
so let's do that when we detect such an adapter in a PCIe direct slot.

Signed-off-by: Benjamin Herrenschmidt
---

Note: I've already pushed that in our internal 810.xx branch due to
the short runway to get it into the next release. It has been tested.

 core/pci.c |  2 +-
 hw/phb3.c  | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/core/pci.c b/core/pci.c
index 7447314..a773df0 100644
--- a/core/pci.c
+++ b/core/pci.c
@@ -165,6 +165,7 @@ static struct pci_device *pci_scan_one(struct phb *phb, struct pci_device *paren
 	pd->is_multifunction = !!(htype & 0x80);
 	pd->is_bridge = (htype & 0x7f) != 0;
 	pd->scan_map = 0xffffffff; /* Default */
+	pd->primary_bus = (bdfn >> 8);
 
 	ecap = pci_find_cap(phb, bdfn, PCI_CFG_CAP_ID_EXP);
 	if (ecap > 0) {
@@ -205,7 +206,6 @@ static struct pci_device *pci_scan_one(struct phb *phb, struct pci_device *paren
 	 * This will help when walking down those bridges later on
 	 */
 	if (pd->is_bridge) {
-		pd->primary_bus = (bdfn >> 8);
 		pci_cfg_write8(phb, bdfn, PCI_CFG_PRIMARY_BUS, pd->primary_bus);
 		pci_cfg_write8(phb, bdfn, PCI_CFG_SECONDARY_BUS, 0);
 		pci_cfg_write8(phb, bdfn, PCI_CFG_SUBORDINATE_BUS, 0);
diff --git a/hw/phb3.c b/hw/phb3.c
index 49e6b33..0172cb3 100644
--- a/hw/phb3.c
+++ b/hw/phb3.c
@@ -417,11 +417,44 @@ static void phb3_endpoint_init(struct phb *phb,
 	pci_cfg_write32(phb, bdfn, aercap + PCIECAP_AER_CAPCTL, val32);
 }
 
+static void phb3_check_device_quirks(struct phb *phb, struct pci_device *dev)
+{
+	struct phb3 *p = phb_to_phb3(phb);
+	u64 modectl;
+	u32 vdid;
+	u16 vendor, device;
+
+	/* For these adapters, if they are directly under the PHB, we
+	 * adjust some settings for performance
+	 */
+	xscom_read(p->chip_id, p->pe_xscom + 0x0b, &modectl);
+
+	pci_cfg_read32(phb, dev->bdfn, 0, &vdid);
+	vendor = vdid & 0xffff;
+	device = vdid >> 16;
+	if (vendor == 0x15b3 &&
+	    (device == 0x1003 || /* Travis3-EN (CX3) */
+	     device == 0x1011 || /* HydePark (ConnectIB) */
+	     device == 0x1013)) { /* GlacierPark (CX4) */
+		/* Set disable_wr_scope_group bit */
+		modectl |= PPC_BIT(14);
+	} else {
+		/* Clear disable_wr_scope_group bit */
+		modectl &= ~PPC_BIT(14);
+	}
+
+	xscom_write(p->chip_id, p->pe_xscom + 0x0b, modectl);
+}
+
 static void phb3_device_init(struct phb *phb, struct pci_device *dev)
 {
 	int ecap = 0;
 	int aercap = 0;
 
+	/* Some special adapter tweaks for devices directly under the PHB */
+	if (dev->primary_bus == 1)
+		phb3_check_device_quirks(phb, dev);
+
 	/* Figure out PCIe & AER capability */
 	if (pci_has_cap(dev, PCI_CFG_CAP_ID_EXP, false)) {
 		ecap = pci_cap(dev, PCI_CFG_CAP_ID_EXP, false);
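
For anyone reviewing the decode logic without a skiboot tree at hand, here is
a minimal standalone sketch of the same checks. It is not part of the patch.
The helper names (is_directly_under_phb, needs_wr_scope_quirk,
apply_wr_scope_quirk) are hypothetical, and PPC_BIT is redefined locally as a
stand-in, assuming the usual IBM convention where bit 0 is the most
significant bit of a 64-bit word. No XSCOM or config space access is done;
the functions only operate on values passed in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in (assumption): IBM bit numbering, bit 0 is the MSB. */
#define PPC_BIT(bit) (0x8000000000000000ULL >> (bit))

/* Hypothetical helper: the bus number is the upper byte of the BDFN,
 * and the patch treats bus 1 as "directly under the PHB".
 */
static bool is_directly_under_phb(uint16_t bdfn)
{
	return (bdfn >> 8) == 1;
}

/* Hypothetical helper: vdid is the 32-bit word at config offset 0,
 * vendor ID in the low half and device ID in the high half.
 */
static bool needs_wr_scope_quirk(uint32_t vdid)
{
	uint16_t vendor = vdid & 0xffff;
	uint16_t device = vdid >> 16;

	return vendor == 0x15b3 &&
	       (device == 0x1003 ||	/* Travis3-EN (CX3) */
		device == 0x1011 ||	/* HydePark (ConnectIB) */
		device == 0x1013);	/* GlacierPark (CX4) */
}

/* Hypothetical helper: returns the mode control value with the
 * disable_wr_scope_group bit (bit 14) set for matching adapters and
 * cleared for everything else, leaving all other bits untouched.
 */
static uint64_t apply_wr_scope_quirk(uint64_t modectl, uint32_t vdid)
{
	if (needs_wr_scope_quirk(vdid))
		return modectl | PPC_BIT(14);
	return modectl & ~PPC_BIT(14);
}
```

Keeping the set/clear decision in a pure function like this makes the
behaviour easy to check: a non-matching device under the PHB clears the bit
rather than leaving the register alone, which is what the patch does as well.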