Message ID | 1466132722-13512-1-git-send-email-gwshan@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
On Fri, 2016-17-06 at 03:05:11 UTC, Gavin Shan wrote: > The PE primary bus cannot be got from its child devices when having > full hotplug in error recovery. The PE primary bus is cached, which > is done in commit <05ba75f84864> ("powerpc/eeh: Fix stale cached primary > bus"). In eeh_reset_device(), the flag (EEH_PE_PRI_BUS) is cleared > before the PCI hot remove. eeh_pe_bus_get() then returns NULL as the > PE primary bus in pnv_eeh_reset() and it crashes the kernel eventually. > > This fixes the issue by clearing the flag (EEH_PE_PRI_BUS) before the > PCI hot add. With it, the PowerNV EEH reset backend (pnv_eeh_reset()) > can get valid PE primary bus through eeh_pe_bus_get(). > > Fixes: 67086e32b564 ("powerpc/eeh: powerpc/eeh: Support error recovery for VF PE") > Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@in.ibm.com> > Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Applied to powerpc fixes, thanks. https://git.kernel.org/powerpc/c/a3aa256b7258b3d19f8b44557c cheers
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c index 2714a3b..b5f73cb 100644 --- a/arch/powerpc/kernel/eeh_driver.c +++ b/arch/powerpc/kernel/eeh_driver.c @@ -642,7 +642,6 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus, if (pe->type & EEH_PE_VF) { eeh_pe_dev_traverse(pe, eeh_rmv_device, NULL); } else { - eeh_pe_state_clear(pe, EEH_PE_PRI_BUS); pci_lock_rescan_remove(); pci_hp_remove_devices(bus); pci_unlock_rescan_remove(); @@ -692,10 +691,12 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus, */ edev = list_first_entry(&pe->edevs, struct eeh_dev, list); eeh_pe_traverse(pe, eeh_pe_detach_dev, NULL); - if (pe->type & EEH_PE_VF) + if (pe->type & EEH_PE_VF) { eeh_add_virt_device(edev, NULL); - else + } else { + eeh_pe_state_clear(pe, EEH_PE_PRI_BUS); pci_hp_add_devices(bus); + } } else if (frozen_bus && rmv_data->removed) { pr_info("EEH: Sleep 5s ahead of partial hotplug\n"); ssleep(5);
The PE primary bus cannot be got from its child devices when having full hotplug in error recovery. The PE primary bus is cached, which is done in commit <05ba75f84864> ("powerpc/eeh: Fix stale cached primary bus"). In eeh_reset_device(), the flag (EEH_PE_PRI_BUS) is cleared before the PCI hot remove. eeh_pe_bus_get() then returns NULL as the PE primary bus in pnv_eeh_reset() and it crashes the kernel eventually. This fixes the issue by clearing the flag (EEH_PE_PRI_BUS) before the PCI hot add. With it, the PowerNV EEH reset backend (pnv_eeh_reset()) can get valid PE primary bus through eeh_pe_bus_get(). Cc: stable@vger.kernel.org # v4.6+ Fixes: 67086e32b564 ("powerpc/eeh: powerpc/eeh: Support error recovery for VF PE") Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> --- arch/powerpc/kernel/eeh_driver.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)