
[v3,1/2] pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv

Message ID 20240624121052.233232-2-krishnak@linux.ibm.com (mailing list archive)
State Changes Requested
Series PCI hotplug driver fixes

Commit Message

krishna kumar June 24, 2024, 12:09 p.m. UTC
Description of the problem: The hotplug driver for powerpc
(pci/hotplug/pnv_php.c) gives kernel crash when we try to
hot-unplug/disable the PCIe switch/bridge from the PHB.

Root Cause of Crash: The crash is due to the reason that, though the msi
data structure has been released during disable/hot-unplug path and it
has been assigned with NULL, still during unregistartion the code was
again trying to explicitly disable the msi which causes the Null pointer
dereference and kernel crash.

Proposed Fix : The fix is to correct the check during unregistration path
so that the code should not  try to invoke pci_disable_msi/msix() if its
data structure is already freed.

Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Gaurav Batra <gbatra@linux.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>

Signed-off-by: Krishna Kumar <krishnak@linux.ibm.com>
---
 drivers/pci/hotplug/pnv_php.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Bjorn Helgaas June 26, 2024, 3:21 p.m. UTC | #1
I expect this series would go through the powerpc tree since that's
where most of the change is.

On Mon, Jun 24, 2024 at 05:39:27PM +0530, Krishna Kumar wrote:
> Description of the problem: The hotplug driver for powerpc
> (pci/hotplug/pnv_php.c) gives kernel crash when we try to
> hot-unplug/disable the PCIe switch/bridge from the PHB.
> 
> Root Cause of Crash: The crash is due to the reason that, though the msi
> data structure has been released during disable/hot-unplug path and it
> has been assigned with NULL, still during unregistartion the code was
> again trying to explicitly disable the msi which causes the Null pointer
> dereference and kernel crash.

s/unregistartion/unregistration/
s/Null/NULL/ to match previous use
s/msi/MSI/ to match spec usage

> Proposed Fix : The fix is to correct the check during unregistration path
> so that the code should not  try to invoke pci_disable_msi/msix() if its
> data structure is already freed.

s/Proposed Fix : The fix is to// ... Just say what the patch does.

If/when the powerpc folks like this, add my:

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Gaurav Batra <gbatra@linux.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Brian King <brking@linux.vnet.ibm.com>
> 
> Signed-off-by: Krishna Kumar <krishnak@linux.ibm.com>
> ---
>  drivers/pci/hotplug/pnv_php.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
> index 694349be9d0a..573a41869c15 100644
> --- a/drivers/pci/hotplug/pnv_php.c
> +++ b/drivers/pci/hotplug/pnv_php.c
> @@ -40,7 +40,6 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
>  				bool disable_device)
>  {
>  	struct pci_dev *pdev = php_slot->pdev;
> -	int irq = php_slot->irq;
>  	u16 ctrl;
>  
>  	if (php_slot->irq > 0) {
> @@ -59,7 +58,7 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
>  		php_slot->wq = NULL;
>  	}
>  
> -	if (disable_device || irq > 0) {
> +	if (disable_device) {
>  		if (pdev->msix_enabled)
>  			pci_disable_msix(pdev);
>  		else if (pdev->msi_enabled)
> -- 
> 2.45.0
>
Shawn Anastasio June 27, 2024, 5:08 p.m. UTC | #2
Hi Krishna,

On 6/24/24 7:09 AM, Krishna Kumar wrote:
> Description of the problem: The hotplug driver for powerpc
> (pci/hotplug/pnv_php.c) gives kernel crash when we try to
> hot-unplug/disable the PCIe switch/bridge from the PHB.
> 
> Root Cause of Crash: The crash is due to the reason that, though the msi
> data structure has been released during disable/hot-unplug path and it
> has been assigned with NULL, still during unregistartion the code was
> again trying to explicitly disable the msi which causes the Null pointer
> dereference and kernel crash.
> 
> Proposed Fix : The fix is to correct the check during unregistration path
> so that the code should not  try to invoke pci_disable_msi/msix() if its
> data structure is already freed.
> 
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Gaurav Batra <gbatra@linux.ibm.com>
> Cc: Nathan Lynch <nathanl@linux.ibm.com>
> Cc: Brian King <brking@linux.vnet.ibm.com>
> 
> Signed-off-by: Krishna Kumar <krishnak@linux.ibm.com>

As with v1, I can confirm that this patch solves the panic encountered
when hotplugging PCIe bridges on POWER9.

Tested-by: Shawn Anastasio <sanastasio@raptorengineering.com>

Thanks,
Shawn
Michael Ellerman June 28, 2024, 4:48 a.m. UTC | #3
Shawn Anastasio <sanastasio@raptorengineering.com> writes:
> Hi Krishna,
>
> On 6/24/24 7:09 AM, Krishna Kumar wrote:
>> Description of the problem: The hotplug driver for powerpc
>> (pci/hotplug/pnv_php.c) gives kernel crash when we try to
>> hot-unplug/disable the PCIe switch/bridge from the PHB.
>> 
>> Root Cause of Crash: The crash is due to the reason that, though the msi
>> data structure has been released during disable/hot-unplug path and it
>> has been assigned with NULL, still during unregistartion the code was
>> again trying to explicitly disable the msi which causes the Null pointer
>> dereference and kernel crash.
>> 
>> Proposed Fix : The fix is to correct the check during unregistration path
>> so that the code should not  try to invoke pci_disable_msi/msix() if its
>> data structure is already freed.
>> 
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Nicholas Piggin <npiggin@gmail.com>
>> Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
>> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
>> Cc: Bjorn Helgaas <bhelgaas@google.com>
>> Cc: Gaurav Batra <gbatra@linux.ibm.com>
>> Cc: Nathan Lynch <nathanl@linux.ibm.com>
>> Cc: Brian King <brking@linux.vnet.ibm.com>
>> 
>> Signed-off-by: Krishna Kumar <krishnak@linux.ibm.com>
>
> As with v1, I can confirm that this patch solves the panic encountered
> when hotplugging PCIe bridges on POWER9.

Was the panic reported anywhere? So we can link to the report in the
commit.

cheers
Michael Ellerman June 28, 2024, 5:34 a.m. UTC | #4
Bjorn Helgaas <helgaas@kernel.org> writes:
> I expect this series would go through the powerpc tree since that's
> where most of the change is.

Thanks, yeah I'll plan to merge v4 with your comments addressed.

cheers

> On Mon, Jun 24, 2024 at 05:39:27PM +0530, Krishna Kumar wrote:
>> Description of the problem: The hotplug driver for powerpc
>> (pci/hotplug/pnv_php.c) gives kernel crash when we try to
>> hot-unplug/disable the PCIe switch/bridge from the PHB.
>> 
>> Root Cause of Crash: The crash is due to the reason that, though the msi
>> data structure has been released during disable/hot-unplug path and it
>> has been assigned with NULL, still during unregistartion the code was
>> again trying to explicitly disable the msi which causes the Null pointer
>> dereference and kernel crash.
>
> s/unregistartion/unregistration/
> s/Null/NULL/ to match previous use
> s/msi/MSI/ to match spec usage
>
>> Proposed Fix : The fix is to correct the check during unregistration path
>> so that the code should not  try to invoke pci_disable_msi/msix() if its
>> data structure is already freed.
>
> s/Proposed Fix : The fix is to// ... Just say what the patch does.
>
> If/when the powerpc folks like this, add my:
>
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Nicholas Piggin <npiggin@gmail.com>
>> Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
>> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
>> Cc: Bjorn Helgaas <bhelgaas@google.com>
>> Cc: Gaurav Batra <gbatra@linux.ibm.com>
>> Cc: Nathan Lynch <nathanl@linux.ibm.com>
>> Cc: Brian King <brking@linux.vnet.ibm.com>
>> 
>> Signed-off-by: Krishna Kumar <krishnak@linux.ibm.com>
>> ---
>>  drivers/pci/hotplug/pnv_php.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>> 
>> diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
>> index 694349be9d0a..573a41869c15 100644
>> --- a/drivers/pci/hotplug/pnv_php.c
>> +++ b/drivers/pci/hotplug/pnv_php.c
>> @@ -40,7 +40,6 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
>>  				bool disable_device)
>>  {
>>  	struct pci_dev *pdev = php_slot->pdev;
>> -	int irq = php_slot->irq;
>>  	u16 ctrl;
>>  
>>  	if (php_slot->irq > 0) {
>> @@ -59,7 +58,7 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
>>  		php_slot->wq = NULL;
>>  	}
>>  
>> -	if (disable_device || irq > 0) {
>> +	if (disable_device) {
>>  		if (pdev->msix_enabled)
>>  			pci_disable_msix(pdev);
>>  		else if (pdev->msi_enabled)
>> -- 
>> 2.45.0
>>
Shawn Anastasio June 28, 2024, 7:22 p.m. UTC | #5
Hi Michael,

On 6/27/24 11:48 PM, Michael Ellerman wrote:
> Was the panic reported anywhere? So we can link to the report in the
> commit.

It was indeed -- here is the link to the thread from last December:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2023-December/267034.html

> cheers

Thanks,
Shawn
Michael Ellerman June 29, 2024, 7:30 a.m. UTC | #6
Shawn Anastasio <sanastasio@raptorengineering.com> writes:
> Hi Michael,
>
> On 6/27/24 11:48 PM, Michael Ellerman wrote:
>> Was the panic reported anywhere? So we can link to the report in the
>> commit.
>
> It was indeed -- here is the link to the thread from last December:
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2023-December/267034.html

Thanks.

cheers

Patch

diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 694349be9d0a..573a41869c15 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -40,7 +40,6 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
 				bool disable_device)
 {
 	struct pci_dev *pdev = php_slot->pdev;
-	int irq = php_slot->irq;
 	u16 ctrl;
 
 	if (php_slot->irq > 0) {
@@ -59,7 +58,7 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot,
 		php_slot->wq = NULL;
 	}
 
-	if (disable_device || irq > 0) {
+	if (disable_device) {
 		if (pdev->msix_enabled)
 			pci_disable_msix(pdev);
 		else if (pdev->msi_enabled)
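
The effect of the condition change in the second hunk can be modeled in plain user-space C. This is only a sketch under stated assumptions, not kernel code: `struct fake_slot`, its `msi_data` field, and the three helper functions are hypothetical names introduced for illustration, and the "MSI data already released" state is taken from the description in the commit message rather than from the driver source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the slot state; NOT the kernel's struct pnv_php_slot. */
struct fake_slot {
	int irq;          /* > 0 while the slot interrupt is set up */
	bool msi_enabled; /* models pdev->msi_enabled */
	void *msi_data;   /* models the MSI data that hot-unplug releases and sets to NULL */
};

/* Pre-patch condition: a saved irq > 0 is enough to reach the MSI disable calls. */
static bool old_wants_msi_disable(bool disable_device, int saved_irq)
{
	return disable_device || saved_irq > 0;
}

/* Post-patch condition: only an explicit disable_device request reaches them. */
static bool new_wants_msi_disable(bool disable_device)
{
	return disable_device;
}

/*
 * Assumption for this model: disabling MSI while the MSI data is already
 * NULL corresponds to the NULL pointer dereference described in the
 * commit message. Returns true when that would happen.
 */
static bool would_touch_freed_msi(const struct fake_slot *slot, bool wants_disable)
{
	return wants_disable && slot->msi_enabled && slot->msi_data == NULL;
}
```

With a slot whose `irq` is positive and whose `msi_data` is already NULL, calling the model with `disable_device == false` reaches the disable path under the old condition and skips it under the new one, which matches the behaviour the commit message describes for the unregistration path.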