From patchwork Wed Aug 1 06:20:07 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vaibhav Jain
X-Patchwork-Id: 951960
From: Vaibhav Jain
To: Stewart Smith, Frederic Barrat, Andrew Donnellan
Cc: Philippe Bergheaud, Christophe Lombard, skiboot@lists.ozlabs.org
Date: Wed, 1 Aug 2018 11:50:07 +0530
Message-Id: <20180801062007.32606-1-vaibhav@linux.ibm.com>
X-Mailer: git-send-email 2.17.1
List-Id: Mailing list for skiboot development
Subject: [Skiboot] [RESEND][PATCH] phb4, doc: Make GPU-Direct bandwidth optimizations Witherspoon specific

Commit 5690c5a8980f ("phb4: Reallocate PEC2 DMA-Read engines to improve
GPU-Direct bandwidth") introduced allocation of extra DMA-read engines to
improve Mellanox CX5 GPU-Direct bandwidth. At present the CX5 is the only
card using these optimizations, so the changes only affect Witherspoon
systems.

However, the hardware team has raised the possibility that future
non-Witherspoon systems may use a similar card for which these
optimizations won't be needed, and has asked us to make the changes
Witherspoon specific.

Hence this patch updates phb4_init_capp_regs() and enable_capi_mode() to
configure the extra DMA-read engine allocation if and only if skiboot is
running on a Witherspoon platform.

Cc: stable #6.0.6+
Fixes: 5690c5a8980f("phb4: Reallocate PEC2 DMA-Read engines to improve GPU-Direct bandwidth")
Signed-off-by: Vaibhav Jain
---
Change-log:
Resend -> Updated the request for merge to stable from 5.0.6+ to 6.0.6+
---
 .../opal-pci-set-phb-capi-mode-93.rst |  8 ++--
 hw/phb4.c                             | 47 ++++++++++++-------
 2 files changed, 35 insertions(+), 20 deletions(-)

diff --git a/doc/opal-api/opal-pci-set-phb-capi-mode-93.rst b/doc/opal-api/opal-pci-set-phb-capi-mode-93.rst
index 6a8d2be8..09ecd69c 100644
--- a/doc/opal-api/opal-pci-set-phb-capi-mode-93.rst
+++ b/doc/opal-api/opal-pci-set-phb-capi-mode-93.rst
@@ -44,10 +44,10 @@ CAPP-PSL transactions.
 Notes:
 -----
-* If PHB is in PEC2 then requesting mode `OPAL_PHB_CAPI_MODE_DMA_TVT1` will
-  allocate extra 16/8 dma read engines to the PHB depending on its stack
-  (stack 0/ stack 1). This is needed to improve the Direct-GPU DMA read
-  performance for the Mellanox CX5 card.
+* On a Witherspoon system if PHB is in PEC2 then requesting mode
+  `OPAL_PHB_CAPI_MODE_DMA_TVT1` will allocate extra 16/8 dma read engines to the
+  PHB depending on its stack (stack 0/ stack 1). This is needed to improve the
+  Direct-GPU DMA read performance for the Mellanox CX5 card.
 * Mode `OPAL_PHB_CAPI_MODE_PCIE` not yet supported on Power-9.
 * Requesting mode `OPAL_PHB_CAPI_MODE_CAPI` on Power-9 will disable fast-reboot.
 * Modes `OPAL_PHB_CAPI_MODE_DMA`, `OPAL_PHB_CAPI_MODE_SNOOP_OFF` are
diff --git a/hw/phb4.c b/hw/phb4.c
index a3aa8b80..ee238109 100644
--- a/hw/phb4.c
+++ b/hw/phb4.c
@@ -148,6 +148,9 @@ static void phb4_init_hw(struct phb4 *p);
 #define PHB4_CAN_STORE_EOI(p) \
 	(XIVE_STORE_EOI_ENABLED && ((p)->rev >= PHB4_REV_NIMBUS_DD20))

+/* Are we running on a Witherspoon system */
+#define IS_WITHERSPOON()	(strcmp(platform.name, "Witherspoon") == 0)
+
 static bool verbose_eeh;
 static bool pci_tracing;
 static bool pci_eeh_mmio;
@@ -3937,24 +3940,29 @@ static void phb4_init_capp_regs(struct phb4 *p, uint32_t capp_eng)
 			    0xDCE0280428000000);
 	}

-	/* capp owns PHB read buffers */
-	if (p->index == CAPP0_PHB_INDEX) {
+
+	/* assigned capp owned PHB read buffers */
+	reg = 0;
+	if (capp_eng & CAPP_MAX_DMA_READ_ENGINES) {
+		/* In case of Mellanox CX5 card on witherspoon assign
+		 * just 4 phb read buffers to CAPP. On other systems allocate
+		 * 8 read phb read buffers
+		 */
+		reg = IS_WITHERSPOON() ? 0xF000000000000000 : /*4 Read buffers*/
+			0xFF00000000000000; /*8 PHB Read buffers*/
+
+	} else if (p->index == CAPP0_PHB_INDEX) {
 		/* max PHB read buffers 0-47 */
 		reg = 0xFFFFFFFFFFFF0000;
-		if (capp_eng & CAPP_MAX_DMA_READ_ENGINES)
-			reg = 0xF000000000000000;
-		xscom_write(p->chip_id, APC_FSM_READ_MASK + offset, reg);
-		xscom_write(p->chip_id, XPT_FSM_RMM + offset, reg);
-	}
-	if (p->index == CAPP1_PHB_INDEX) {
+
+	} else if (p->index == CAPP1_PHB_INDEX) {
 		/* Set 30 Read machines for CAPP Minus 20-27 for DMA */
 		reg = 0xFFFFF00E00000000;
-		if (capp_eng & CAPP_MAX_DMA_READ_ENGINES)
-			reg = 0xF000000000000000;
-		xscom_write(p->chip_id, APC_FSM_READ_MASK + offset, reg);
-		xscom_write(p->chip_id, XPT_FSM_RMM + offset, reg);
 	}

+	xscom_write(p->chip_id, APC_FSM_READ_MASK + offset, reg);
+	xscom_write(p->chip_id, XPT_FSM_RMM + offset, reg);
+
 	/* CAPP FIR Action 0 */
 	xscom_write(p->chip_id, CAPP_FIR_ACTION0 + offset,
 		    0x0b1c000104060000);
@@ -4111,8 +4119,13 @@ static int64_t enable_capi_mode(struct phb4 *p, uint64_t pe_number,
 	/* CAPP Control Register. Enable CAPP Mode */
 	reg = 0x8000000000000000ULL; /* PEC works in CAPP Mode */
 	reg |= stq_eng;
-	if (capp_eng & CAPP_MAX_DMA_READ_ENGINES)
-		dma_eng = 0x0000F00000000000ULL; /* 4 CAPP Read machines */
+	if (capp_eng & CAPP_MAX_DMA_READ_ENGINES) {
+		/* For Mellanox CX5 running on witherspoon allocate 4 CAPP read
+		 * machines. On other systems allocate 8 CAPP Read machines
+		 */
+		dma_eng = IS_WITHERSPOON() ? 0x0000F00000000000ULL :
+			0x0000FF0000000000ULL;
+	}
 	reg |= dma_eng;
 	xscom_write(p->chip_id, p->pe_xscom + XPEC_NEST_CAPP_CNTL, reg);

@@ -4120,9 +4133,11 @@ static int64_t enable_capi_mode(struct phb4 *p, uint64_t pe_number,
 	 * x8+x8 (bifurcated) or x8+x4+x4 (trifurcated) mode. When
 	 * Mellanox CX5 card is attached to stack0 of this PEC, indicated by
 	 * request to allocate CAPP_MAX_DMA_READ_ENGINES; we tweak the default
-	 * dma-read engines allocations to maximize the DMA read performance
+	 * dma-read engines allocations to maximize the DMA read performance.
+	 * Do this only on a witherspoon system.
 	 */
-	if ((p->index == CAPP1_PHB_INDEX) &&
+	if (IS_WITHERSPOON() &&
+	    (p->index == CAPP1_PHB_INDEX) &&
 	    (capp_eng & CAPP_MAX_DMA_READ_ENGINES)) {
 		/*