Message ID | 20170720202858.1918-1-scott.bauer@intel.com |
---|---|
State | Changes Requested |
Headers | show |
On Tue, Jul 25, 2017 at 08:15:32PM -0600, Jon Derrick wrote: > On 07/20/2017 02:28 PM, Scott Bauer wrote: > > This patch frees up the IRQs we request on the suspend path, > > and reallocates them on the resume path. > > > > Fixes: > > [ 559.964386] CPU 111 disable failed: CPU has 9 vectors assigned and there are only 0 available. > > [ 559.966824] Error taking CPU111 down: -34 > > [ 559.966825] Non-boot CPUs are not disabled > > [ 559.966826] Enabling non-boot CPUs ... > > > > Signed-off-by: Scott Bauer <scott.bauer@intel.com> > > --- > > drivers/pci/host/vmd.c | 14 ++++++++++++++ > > 1 file changed, 14 insertions(+) > > > > diff --git a/drivers/pci/host/vmd.c b/drivers/pci/host/vmd.c > > index 6088c3083194..88bf0c754fb2 100644 > > --- a/drivers/pci/host/vmd.c > > +++ b/drivers/pci/host/vmd.c > > @@ -755,6 +755,11 @@ static void vmd_remove(struct pci_dev *dev) > > static int vmd_suspend(struct device *dev) > > { > > struct pci_dev *pdev = to_pci_dev(dev); > > + struct vmd_dev *vmd = pci_get_drvdata(pdev); > > + int i; > > + > > + for (i = 0; i < vmd->msix_count; i++) > > + devm_free_irq(dev, pci_irq_vector(pdev, i), &vmd->irqs[i]); > > > > pci_save_state(pdev); > > return 0; > > @@ -763,7 +768,16 @@ static int vmd_suspend(struct device *dev) > > static int vmd_resume(struct device *dev) > > { > > struct pci_dev *pdev = to_pci_dev(dev); > > + struct vmd_dev *vmd = pci_get_drvdata(pdev); > > + int err, i; > > > > + for (i = 0; i < vmd->msix_count; i++) { > > + err = devm_request_irq(dev, pci_irq_vector(pdev, i), > > + vmd_irq, IRQF_NO_THREAD, > The flag here looks redundant to me because the fn calls into > request_threaded_irq with no thread_fn. But since it's a bit more > verbose, can you change the other devm_request_irq to use IRQF_NO_THREAD? I actually yanked the request irq from the other call. It was changed to NO_THREAD in April: (3eefa790c968) PCI: host: Mark PCIe/PCI (MSI) cascade ISR as IRQF_NO_THREAD
On Tue, Jul 25, 2017 at 08:36:10PM -0600, Jon Derrick wrote: > On 07/20/2017 03:34 PM, Scott Bauer wrote: > [snip] > >>> + for (i = 0; i < vmd->msix_count; i++) { > >>> + err = devm_request_irq(dev, pci_irq_vector(pdev, i), > >>> + vmd_irq, IRQF_NO_THREAD, > >> The flag here looks redundant to me because the fn calls into > >> request_threaded_irq with no thread_fn. But since it's a bit more > >> verbose, can you change the other devm_request_irq to use IRQF_NO_THREAD? > > > > I actually yanked the request irq from the other call. It was changed > > to NO_THREAD in April: > > > > (3eefa790c968) PCI: host: Mark PCIe/PCI (MSI) cascade ISR as IRQF_NO_THREAD > > > > Ah so it was. Sorry was looking at v4.12 > > > I think we'll need to do more than free the irq handlers because if an > interrupt occurs after that, I think it'll get kicked to handle_bad_irq. > I think we just need to add pci_disable_device after save and > pcim_enable_device after restore. I orignally had this in the patch I was testing on SLES and took it out when it seemingly didn't make a difference. I didnt test yanking a drive or anything behind the domain while it was suspended, so I wouldn't have hit your scenario. I'll add it back in. Also, I think I need to CC stable on this one too since this is missing all the way since the original inception of this driver.
On Thu, Jul 20, 2017 at 03:56:54PM -0600, Scott Bauer wrote: > On Tue, Jul 25, 2017 at 08:36:10PM -0600, Jon Derrick wrote: > > > > I think we'll need to do more than free the irq handlers because if an > > interrupt occurs after that, I think it'll get kicked to handle_bad_irq. > > I think we just need to add pci_disable_device after save and > > pcim_enable_device after restore. > > I orignally had this in the patch I was testing on SLES and took it out when > it seemingly didn't make a difference. I didnt test yanking a drive or anything > behind the domain while it was suspended, so I wouldn't have hit your scenario. > > I'll add it back in. Also, I think I need to CC stable on this one too since > this is missing all the way since the original inception of this driver. I think your patch should be okay as-is. After freeing VMD's IRQs, the kernel will automatically clear the MSI-x control bit to disable it from firing.
On 07/20/2017 02:28 PM, Scott Bauer wrote: > This patch frees up the IRQs we request on the suspend path, > and reallocates them on the resume path. > > Fixes: > [ 559.964386] CPU 111 disable failed: CPU has 9 vectors assigned and there are only 0 available. > [ 559.966824] Error taking CPU111 down: -34 > [ 559.966825] Non-boot CPUs are not disabled > [ 559.966826] Enabling non-boot CPUs ... > > Signed-off-by: Scott Bauer <scott.bauer@intel.com> > --- > drivers/pci/host/vmd.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/drivers/pci/host/vmd.c b/drivers/pci/host/vmd.c > index 6088c3083194..88bf0c754fb2 100644 > --- a/drivers/pci/host/vmd.c > +++ b/drivers/pci/host/vmd.c > @@ -755,6 +755,11 @@ static void vmd_remove(struct pci_dev *dev) > static int vmd_suspend(struct device *dev) > { > struct pci_dev *pdev = to_pci_dev(dev); > + struct vmd_dev *vmd = pci_get_drvdata(pdev); > + int i; > + > + for (i = 0; i < vmd->msix_count; i++) > + devm_free_irq(dev, pci_irq_vector(pdev, i), &vmd->irqs[i]); > > pci_save_state(pdev); > return 0; > @@ -763,7 +768,16 @@ static int vmd_suspend(struct device *dev) > static int vmd_resume(struct device *dev) > { > struct pci_dev *pdev = to_pci_dev(dev); > + struct vmd_dev *vmd = pci_get_drvdata(pdev); > + int err, i; > > + for (i = 0; i < vmd->msix_count; i++) { > + err = devm_request_irq(dev, pci_irq_vector(pdev, i), > + vmd_irq, IRQF_NO_THREAD, The flag here looks redundant to me because the fn calls into request_threaded_irq with no thread_fn. But since it's a bit more verbose, can you change the other devm_request_irq to use IRQF_NO_THREAD? > + "vmd", &vmd->irqs[i]); > + if (err) > + return err; > + } > pci_restore_state(pdev); > return 0; > } >
On 07/20/2017 03:34 PM, Scott Bauer wrote: [snip] >>> + for (i = 0; i < vmd->msix_count; i++) { >>> + err = devm_request_irq(dev, pci_irq_vector(pdev, i), >>> + vmd_irq, IRQF_NO_THREAD, >> The flag here looks redundant to me because the fn calls into >> request_threaded_irq with no thread_fn. But since it's a bit more >> verbose, can you change the other devm_request_irq to use IRQF_NO_THREAD? > > I actually yanked the request irq from the other call. It was changed > to NO_THREAD in April: > > (3eefa790c968) PCI: host: Mark PCIe/PCI (MSI) cascade ISR as IRQF_NO_THREAD > Ah so it was. Sorry was looking at v4.12 I think we'll need to do more than free the irq handlers because if an interrupt occurs after that, I think it'll get kicked to handle_bad_irq. I think we just need to add pci_disable_device after save and pcim_enable_device after restore.
diff --git a/drivers/pci/host/vmd.c b/drivers/pci/host/vmd.c index 6088c3083194..88bf0c754fb2 100644 --- a/drivers/pci/host/vmd.c +++ b/drivers/pci/host/vmd.c @@ -755,6 +755,11 @@ static void vmd_remove(struct pci_dev *dev) static int vmd_suspend(struct device *dev) { struct pci_dev *pdev = to_pci_dev(dev); + struct vmd_dev *vmd = pci_get_drvdata(pdev); + int i; + + for (i = 0; i < vmd->msix_count; i++) + devm_free_irq(dev, pci_irq_vector(pdev, i), &vmd->irqs[i]); pci_save_state(pdev); return 0; @@ -763,7 +768,16 @@ static int vmd_suspend(struct device *dev) static int vmd_resume(struct device *dev) { struct pci_dev *pdev = to_pci_dev(dev); + struct vmd_dev *vmd = pci_get_drvdata(pdev); + int err, i; + for (i = 0; i < vmd->msix_count; i++) { + err = devm_request_irq(dev, pci_irq_vector(pdev, i), + vmd_irq, IRQF_NO_THREAD, + "vmd", &vmd->irqs[i]); + if (err) + return err; + } pci_restore_state(pdev); return 0; }
This patch frees up the IRQs we request on the suspend path, and reallocates them on the resume path. Fixes: [ 559.964386] CPU 111 disable failed: CPU has 9 vectors assigned and there are only 0 available. [ 559.966824] Error taking CPU111 down: -34 [ 559.966825] Non-boot CPUs are not disabled [ 559.966826] Enabling non-boot CPUs ... Signed-off-by: Scott Bauer <scott.bauer@intel.com> --- drivers/pci/host/vmd.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)