diff mbox

powerpc/eeh: Fix invalid cached PE primary bus

Message ID 1466132722-13512-1-git-send-email-gwshan@linux.vnet.ibm.com (mailing list archive)
State Accepted
Headers show

Commit Message

Gavin Shan June 17, 2016, 3:05 a.m. UTC
The PE primary bus cannot be got from its child devices when having
full hotplug in error recovery. The PE primary bus is cached, which
is done in commit <05ba75f84864> ("powerpc/eeh: Fix stale cached primary
bus"). In eeh_reset_device(), the flag (EEH_PE_PRI_BUS) is cleared
before the PCI hot remove. eeh_pe_bus_get() then returns NULL as the
PE primary bus in pnv_eeh_reset() and it crashes the kernel eventually.

This fixes the issue by clearing the flag (EEH_PE_PRI_BUS) before the
PCI hot add. With it, the PowerNV EEH reset backend (pnv_eeh_reset())
can get valid PE primary bus through eeh_pe_bus_get().

Cc: stable@vger.kernel.org # v4.6+
Fixes: 67086e32b564 ("powerpc/eeh: powerpc/eeh: Support error recovery for VF PE")
Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/eeh_driver.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

Michael Ellerman June 23, 2016, 9:26 a.m. UTC | #1
On Fri, 2016-17-06 at 03:05:11 UTC, Gavin Shan wrote:
> The PE primary bus cannot be got from its child devices when having
> full hotplug in error recovery. The PE primary bus is cached, which
> is done in commit <05ba75f84864> ("powerpc/eeh: Fix stale cached primary
> bus"). In eeh_reset_device(), the flag (EEH_PE_PRI_BUS) is cleared
> before the PCI hot remove. eeh_pe_bus_get() then returns NULL as the
> PE primary bus in pnv_eeh_reset() and it crashes the kernel eventually.
> 
> This fixes the issue by clearing the flag (EEH_PE_PRI_BUS) before the
> PCI hot add. With it, the PowerNV EEH reset backend (pnv_eeh_reset())
> can get valid PE primary bus through eeh_pe_bus_get().
> 
> Fixes: 67086e32b564 ("powerpc/eeh: powerpc/eeh: Support error recovery for VF PE")
> Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@in.ibm.com>
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/a3aa256b7258b3d19f8b44557c

cheers
diff mbox

Patch

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 2714a3b..b5f73cb 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -642,7 +642,6 @@  static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 		if (pe->type & EEH_PE_VF) {
 			eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL);
 		} else {
-			eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
 			pci_lock_rescan_remove();
 			pci_hp_remove_devices(bus);
 			pci_unlock_rescan_remove();
@@ -692,10 +691,12 @@  static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus,
 		 */
 		edev = list_first_entry(&pe->edevs, struct eeh_dev, list);
 		eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL);
-		if (pe->type & EEH_PE_VF)
+		if (pe->type & EEH_PE_VF) {
 			eeh_add_virt_device(edev, NULL);
-		else
+		} else {
+			eeh_pe_state_clear(pe, EEH_PE_PRI_BUS);
 			pci_hp_add_devices(bus);
+		}
 	} else if (frozen_bus && rmv_data->removed) {
 		pr_info("EEH: Sleep 5s ahead of partial hotplug\n");
 		ssleep(5);