From patchwork Mon Aug 13 16:51:28 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thomas Tai
X-Patchwork-Id: 957057
X-Patchwork-Delegate: bhelgaas@google.com
From: Thomas Tai
To: bhelgaas@google.com, keith.busch@intel.com, poza@codeaurora.org,
 thomas.tai@oracle.com
Cc: linux-pci@vger.kernel.org
Subject: [PATCH 1/1] PCI/AER: prevent pcie_do_fatal_recovery from using
 device after it is removed
Date: Mon, 13 Aug 2018 10:51:28 -0600
Message-Id: <1534179088-44219-2-git-send-email-thomas.tai@oracle.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1534179088-44219-1-git-send-email-thomas.tai@oracle.com>
References: <1534179088-44219-1-git-send-email-thomas.tai@oracle.com>
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-pci@vger.kernel.org

To prevent pcie_do_fatal_recovery() from using the device after it has
been removed, store the device's domain:bus:devfn on entry to
pcie_do_fatal_recovery(). After rescanning the bus, use the stored
domain:bus:devfn to look the device up again and report the result
with pci_info().

The original issue only happens on a non-bridge device. A local
variable records whether the device is a bridge, so dev->hdr_type is
not read again later in the function.

Signed-off-by: Thomas Tai
---
 drivers/pci/pcie/err.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index f02e334..3414445 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -287,15 +287,20 @@ void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service)
 	struct pci_bus *parent;
 	struct pci_dev *pdev, *temp;
 	pci_ers_result_t result;
+	bool is_bridge_device = false;
+	u16 domain = pci_domain_nr(dev->bus);
+	u8 bus = dev->bus->number;
+	u8 devfn = dev->devfn;
 
-	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
+	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
+		is_bridge_device = true;
 		udev = dev;
-	else
+	} else {
 		udev = dev->bus->self;
+	}
 
 	parent = udev->subordinate;
 	pci_lock_rescan_remove();
-	pci_dev_get(dev);
 	list_for_each_entry_safe_reverse(pdev, temp, &parent->devices,
 					 bus_list) {
 		pci_dev_get(pdev);
@@ -309,27 +314,35 @@ void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service)
 
 	result = reset_link(udev, service);
 
-	if ((service == PCIE_PORT_SERVICE_AER) &&
-	    (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)) {
+	if (service == PCIE_PORT_SERVICE_AER && is_bridge_device) {
 		/*
 		 * If the error is reported by a bridge, we think this error
 		 * is related to the downstream link of the bridge, so we
 		 * do error recovery on all subordinates of the bridge instead
 		 * of the bridge and clear the error status of the bridge.
 		 */
-		pci_cleanup_aer_uncorrect_error_status(dev);
+		pci_cleanup_aer_uncorrect_error_status(udev);
 	}
 
 	if (result == PCI_ERS_RESULT_RECOVERED) {
 		if (pcie_wait_for_link(udev, true))
 			pci_rescan_bus(udev->bus);
-		pci_info(dev, "Device recovery from fatal error successful\n");
+		/* find the pci_dev after rescanning the bus */
+		dev = pci_get_domain_bus_and_slot(domain, bus, devfn);
+		if (dev)
+			pci_info(dev, "Device recovery from fatal error successful\n");
+		else
+			pr_err("AER: Can not find pci_dev for %04x:%02x:%02x.%x\n",
+			       domain, bus,
+			       PCI_SLOT(devfn), PCI_FUNC(devfn));
+		pci_dev_put(dev);
 	} else {
-		pci_uevent_ers(dev, PCI_ERS_RESULT_DISCONNECT);
-		pci_info(dev, "Device recovery from fatal error failed\n");
+		if (is_bridge_device)
+			pci_uevent_ers(udev, PCI_ERS_RESULT_DISCONNECT);
+		pr_err("AER: Device %04x:%02x:%02x.%x recovery from fatal error failed\n",
+		       domain, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
 	}
 
-	pci_dev_put(dev);
 	pci_unlock_rescan_remove();
 }
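
For reference, the patch identifies the device by a saved domain:bus:devfn triple instead of the original pci_dev pointer, and prints it with PCI_SLOT() and PCI_FUNC(). The userspace sketch below illustrates only that encoding and the message format. The macros are copied from the kernel's definitions in include/uapi/linux/pci.h; struct saved_pci_addr and format_pci_addr() are hypothetical names made up for this illustration and are not part of the patch or the kernel.

```c
#include <stdint.h>
#include <stdio.h>

/* Userspace copies of the kernel's devfn helpers (include/uapi/linux/pci.h). */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/*
 * Hypothetical container for the address that the patch saves in local
 * variables on entry to pcie_do_fatal_recovery().
 */
struct saved_pci_addr {
	uint16_t domain;
	uint8_t bus;
	uint8_t devfn;	/* slot in bits 7..3, function in bits 2..0 */
};

/*
 * Hypothetical helper: format the saved address as domain:bus:slot.func,
 * using the same format string as the pr_err() calls in the patch.
 * Returns the snprintf() result.
 */
static int format_pci_addr(const struct saved_pci_addr *a, char *buf, size_t len)
{
	return snprintf(buf, len, "%04x:%02x:%02x.%x",
			a->domain, a->bus,
			PCI_SLOT(a->devfn), PCI_FUNC(a->devfn));
}
```

Because the triple is plain data, it stays valid even when the pci_dev it was read from has been removed, which is why it can be passed to pci_get_domain_bus_and_slot() after the rescan.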