Message ID | 20240624121052.233232-2-krishnak@linux.ibm.com |
---|---|
State | Changes Requested |
Series | PCI hotplug driver fixes |
I expect this series would go through the powerpc tree since that's
where most of the change is.

On Mon, Jun 24, 2024 at 05:39:27PM +0530, Krishna Kumar wrote:
> Description of the problem: The hotplug driver for powerpc
> (pci/hotplug/pnv_php.c) gives kernel crash when we try to
> hot-unplug/disable the PCIe switch/bridge from the PHB.
>
> Root Cause of Crash: The crash is due to the reason that, though the msi
> data structure has been released during disable/hot-unplug path and it
> has been assigned with NULL, still during unregistartion the code was
> again trying to explicitly disable the msi which causes the Null pointer
> dereference and kernel crash.

s/unregistartion/unregistration/
s/Null/NULL/ to match previous use
s/msi/MSI/ to match spec usage

> Proposed Fix : The fix is to correct the check during unregistration path
> so that the code should not try to invoke pci_disable_msi/msix() if its
> data structure is already freed.

s/Proposed Fix : The fix is to// ... Just say what the patch does.

If/when the powerpc folks like this, add my:

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Gaurav Batra <gbatra@linux.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Brian King <brking@linux.vnet.ibm.com>
>
> Signed-off-by: Krishna Kumar <krishnak@linux.ibm.com>
> ---
>  drivers/pci/hotplug/pnv_php.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
> index 694349be9d0a..573a41869c15 100644
> --- a/drivers/pci/hotplug/pnv_php.c
> +++ b/drivers/pci/hotplug/pnv_php.c
> @@ -40,7 +40,6 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
>  				bool disable_device)
>  {
>  	struct pci_dev *pdev = php_slot->pdev;
> -	int irq = php_slot->irq;
>  	u16 ctrl;
>
>  	if (php_slot->irq > 0) {
> @@ -59,7 +58,7 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
>  		php_slot->wq = NULL;
>  	}
>
> -	if (disable_device || irq > 0) {
> +	if (disable_device) {
>  		if (pdev->msix_enabled)
>  			pci_disable_msix(pdev);
>  		else if (pdev->msi_enabled)
> --
> 2.45.0
>
Hi Krishna,

On 6/24/24 7:09 AM, Krishna Kumar wrote:
> Description of the problem: The hotplug driver for powerpc
> (pci/hotplug/pnv_php.c) gives kernel crash when we try to
> hot-unplug/disable the PCIe switch/bridge from the PHB.
>
> Root Cause of Crash: The crash is due to the reason that, though the msi
> data structure has been released during disable/hot-unplug path and it
> has been assigned with NULL, still during unregistartion the code was
> again trying to explicitly disable the msi which causes the Null pointer
> dereference and kernel crash.
>
> Proposed Fix : The fix is to correct the check during unregistration path
> so that the code should not try to invoke pci_disable_msi/msix() if its
> data structure is already freed.
>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Gaurav Batra <gbatra@linux.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Brian King <brking@linux.vnet.ibm.com>
>
> Signed-off-by: Krishna Kumar <krishnak@linux.ibm.com>

As with v1, I can confirm that this patch solves the panic encountered
when hotplugging PCIe bridges on POWER9.

Tested-by: Shawn Anastasio <sanastasio@raptorengineering.com>

Thanks,
Shawn
Shawn Anastasio <sanastasio@raptorengineering.com> writes:
> Hi Krishna,
>
> On 6/24/24 7:09 AM, Krishna Kumar wrote:
>> Description of the problem: The hotplug driver for powerpc
>> (pci/hotplug/pnv_php.c) gives kernel crash when we try to
>> hot-unplug/disable the PCIe switch/bridge from the PHB.
>>
>> Root Cause of Crash: The crash is due to the reason that, though the msi
>> data structure has been released during disable/hot-unplug path and it
>> has been assigned with NULL, still during unregistartion the code was
>> again trying to explicitly disable the msi which causes the Null pointer
>> dereference and kernel crash.
>>
>> Proposed Fix : The fix is to correct the check during unregistration path
>> so that the code should not try to invoke pci_disable_msi/msix() if its
>> data structure is already freed.
>>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Nicholas Piggin <npiggin@gmail.com>
>> Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
>> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
>> Cc: Bjorn Helgaas <bhelgaas@google.com>
>> Cc: Gaurav Batra <gbatra@linux.ibm.com>
>> Cc: Nathan Lynch <nathanl@linux.ibm.com>
>> Cc: Brian King <brking@linux.vnet.ibm.com>
>>
>> Signed-off-by: Krishna Kumar <krishnak@linux.ibm.com>
>
> As with v1, I can confirm that this patch solves the panic encountered
> when hotplugging PCIe bridges on POWER9.

Was the panic reported anywhere? So we can link to the report in the
commit.

cheers
Bjorn Helgaas <helgaas@kernel.org> writes:
> I expect this series would go through the powerpc tree since that's
> where most of the change is.

Thanks, yeah I'll plan to merge v4 with your comments addressed.

cheers

> On Mon, Jun 24, 2024 at 05:39:27PM +0530, Krishna Kumar wrote:
>> Description of the problem: The hotplug driver for powerpc
>> (pci/hotplug/pnv_php.c) gives kernel crash when we try to
>> hot-unplug/disable the PCIe switch/bridge from the PHB.
>>
>> Root Cause of Crash: The crash is due to the reason that, though the msi
>> data structure has been released during disable/hot-unplug path and it
>> has been assigned with NULL, still during unregistartion the code was
>> again trying to explicitly disable the msi which causes the Null pointer
>> dereference and kernel crash.
>
> s/unregistartion/unregistration/
> s/Null/NULL/ to match previous use
> s/msi/MSI/ to match spec usage
>
>> Proposed Fix : The fix is to correct the check during unregistration path
>> so that the code should not try to invoke pci_disable_msi/msix() if its
>> data structure is already freed.
>
> s/Proposed Fix : The fix is to// ... Just say what the patch does.
>
> If/when the powerpc folks like this, add my:
>
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Nicholas Piggin <npiggin@gmail.com>
>> Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
>> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
>> Cc: Bjorn Helgaas <bhelgaas@google.com>
>> Cc: Gaurav Batra <gbatra@linux.ibm.com>
>> Cc: Nathan Lynch <nathanl@linux.ibm.com>
>> Cc: Brian King <brking@linux.vnet.ibm.com>
>>
>> Signed-off-by: Krishna Kumar <krishnak@linux.ibm.com>
>> ---
>>  drivers/pci/hotplug/pnv_php.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
>> index 694349be9d0a..573a41869c15 100644
>> --- a/drivers/pci/hotplug/pnv_php.c
>> +++ b/drivers/pci/hotplug/pnv_php.c
>> @@ -40,7 +40,6 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
>>  				bool disable_device)
>>  {
>>  	struct pci_dev *pdev = php_slot->pdev;
>> -	int irq = php_slot->irq;
>>  	u16 ctrl;
>>
>>  	if (php_slot->irq > 0) {
>> @@ -59,7 +58,7 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
>>  		php_slot->wq = NULL;
>>  	}
>>
>> -	if (disable_device || irq > 0) {
>> +	if (disable_device) {
>>  		if (pdev->msix_enabled)
>>  			pci_disable_msix(pdev);
>>  		else if (pdev->msi_enabled)
>> --
>> 2.45.0
>>
Hi Michael,

On 6/27/24 11:48 PM, Michael Ellerman wrote:
> Was the panic reported anywhere? So we can link to the report in the
> commit.

It was indeed -- here is the link to the thread from last December:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2023-December/267034.html

> cheers

Thanks,
Shawn
Shawn Anastasio <sanastasio@raptorengineering.com> writes:
> Hi Michael,
>
> On 6/27/24 11:48 PM, Michael Ellerman wrote:
>> Was the panic reported anywhere? So we can link to the report in the
>> commit.
>
> It was indeed -- here is the link to the thread from last December:
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2023-December/267034.html

Thanks.

cheers
diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 694349be9d0a..573a41869c15 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -40,7 +40,6 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
 				bool disable_device)
 {
 	struct pci_dev *pdev = php_slot->pdev;
-	int irq = php_slot->irq;
 	u16 ctrl;
 
 	if (php_slot->irq > 0) {
@@ -59,7 +58,7 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
 		php_slot->wq = NULL;
 	}
 
-	if (disable_device || irq > 0) {
+	if (disable_device) {
 		if (pdev->msix_enabled)
 			pci_disable_msix(pdev);
 		else if (pdev->msi_enabled)
Description of the problem: The hotplug driver for powerpc
(pci/hotplug/pnv_php.c) gives kernel crash when we try to
hot-unplug/disable the PCIe switch/bridge from the PHB.

Root Cause of Crash: The crash is due to the reason that, though the msi
data structure has been released during disable/hot-unplug path and it
has been assigned with NULL, still during unregistartion the code was
again trying to explicitly disable the msi which causes the Null pointer
dereference and kernel crash.

Proposed Fix : The fix is to correct the check during unregistration path
so that the code should not try to invoke pci_disable_msi/msix() if its
data structure is already freed.

Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Gaurav Batra <gbatra@linux.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>

Signed-off-by: Krishna Kumar <krishnak@linux.ibm.com>
---
 drivers/pci/hotplug/pnv_php.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
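
For readers who want to see the shape of the guard change outside the kernel, here is a minimal userspace sketch in C. It is not the pnv_php code. All of the `toy_*` names are hypothetical, and the starting state (MSI data already released and set to NULL while the enabled flag is still set) is an assumption chosen only to mirror the root cause described in the commit message. Where the real path would dereference NULL, the toy returns -1 so the example can be run safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the kernel's types. */
struct toy_msi_state {
	int nvec;
};

struct toy_dev {
	struct toy_msi_state *msi;	/* NULL once released */
	bool msi_enabled;		/* assumed to be stale in this toy model */
};

/* Returns 0 on success, -1 where real code would dereference NULL. */
static int toy_disable_msi(struct toy_dev *dev)
{
	assert(dev != NULL);
	if (dev->msi == NULL)
		return -1;
	dev->msi->nvec = 0;
	dev->msi = NULL;
	dev->msi_enabled = false;
	return 0;
}

/* Guard shape before the patch: also taken when an IRQ had been set up. */
static int toy_disable_irq_old(struct toy_dev *dev, bool disable_device, int irq)
{
	if (disable_device || irq > 0) {
		if (dev->msi_enabled)
			return toy_disable_msi(dev);
	}
	return 0;
}

/* Guard shape after the patch: only the disable_device path disables MSI. */
static int toy_disable_irq_new(struct toy_dev *dev, bool disable_device, int irq)
{
	(void)irq;	/* the saved irq no longer influences the decision */
	if (disable_device) {
		if (dev->msi_enabled)
			return toy_disable_msi(dev);
	}
	return 0;
}
```

With `disable_device` false and a previously set up IRQ, the old guard shape still reaches the disable call on already released state, while the new shape skips it. When `disable_device` is true and the MSI state is still live, both shapes disable it normally.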