Message ID | 20241025222755.3756162-2-kbusch@meta.com |
---|---|
State | New |
Headers | show |
Series | [PATCHv2,1/2] pci: provide bus reset attribute | expand |
On Fri, 25 Oct 2024 15:27:55 -0700 Keith Busch <kbusch@meta.com> wrote: > From: Keith Busch <kbusch@kernel.org> > > If a reset is issued to a running device with a driver that didn't > register the notification callbacks, the driver may be unaware of this > event and have an inconsistent view of the device's state. Log a warning > of this event because there's nothing else indicating the event occured, > which could be confusing when debugging such situations. > > Signed-off-by: Keith Busch <kbusch@kernel.org> > --- > drivers/pci/pci.c | 4 ++++ > 1 file changed, 4 insertions(+) Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
On Fri, 2024-10-25 at 15:27 -0700, Keith Busch wrote: > From: Keith Busch <kbusch@kernel.org> > > If a reset is issued to a running device with a driver that didn't > register the notification callbacks, the driver may be unaware of this > event and have an inconsistent view of the device's state. Log a warning > of this event because there's nothing else indicating the event occured, > which could be confusing when debugging such situations. > > Signed-off-by: Keith Busch <kbusch@kernel.org> > --- > drivers/pci/pci.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 338dfcd983f1e..bbf12d4998269 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -5158,6 +5158,8 @@ static void pci_dev_save_and_disable(struct pci_dev *dev) > */ > if (err_handler && err_handler->reset_prepare) > err_handler->reset_prepare(dev); > + else if (dev->driver) > + pci_warn(dev, "resetting"); > > /* > * Wake-up device prior to save. PM registers default to D0 after > @@ -5191,6 +5193,8 @@ static void pci_dev_restore(struct pci_dev *dev) > */ > if (err_handler && err_handler->reset_done) > err_handler->reset_done(dev); > + else if (dev->driver) > + pci_warn(dev, "reset done"); > } > > /* dev->reset_methods[] is a 0-terminated list of indices into this array */ For what it's worth on s390 I think the previous proposal of having the attribute on the pci_bus would have been better in principle. On s390 the PCI bus probing is done by firmware and Linux doesn't see a pci_dev for bridges but we do create struct pci_bus for example for a PF and its child VFs. Then again we can't really do a reset on this level, other than manually going through the PCI functions and resetting them one by one. In fact we may see PCI functions on their own bus while another Linux instance (LPAR) sees other functions from that bus. So yeah, I guess it's fair not to have this attribute but still wanted to offer these thoughts. Thanks, Niklas
On 24/10/25 03:27pm, Keith Busch wrote: > From: Keith Busch <kbusch@kernel.org> > > If a reset is issued to a running device with a driver that didn't > register the notification callbacks, the driver may be unaware of this > event and have an inconsistent view of the device's state. Log a warning > of this event because there's nothing else indicating the event occured, > which could be confusing when debugging such situations. > > Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Amey Narkhede <ameynarkhede03@gmail.com> > --- > drivers/pci/pci.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 338dfcd983f1e..bbf12d4998269 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -5158,6 +5158,8 @@ static void pci_dev_save_and_disable(struct pci_dev *dev) > */ > if (err_handler && err_handler->reset_prepare) > err_handler->reset_prepare(dev); > + else if (dev->driver) > + pci_warn(dev, "resetting"); > > /* > * Wake-up device prior to save. PM registers default to D0 after > @@ -5191,6 +5193,8 @@ static void pci_dev_restore(struct pci_dev *dev) > */ > if (err_handler && err_handler->reset_done) > err_handler->reset_done(dev); > + else if (dev->driver) > + pci_warn(dev, "reset done"); > } > > /* dev->reset_methods[] is a 0-terminated list of indices into this array */ > -- > 2.43.5 >
On Tue, Oct 29, 2024 at 12:27:41PM +0100, Niklas Schnelle wrote: > For what it's worth on s390 I think the previous proposal of having the > attribute on the pci_bus would have been better in principle. On s390 > the PCI bus probing is done by firmware and Linux doesn't see a pci_dev > for bridges but we do create struct pci_bus for example for a PF and > its child VFs. > > Then again we can't really do a reset on this level, other than > manually going through the PCI functions and resetting them one by one. > In fact we may see PCI functions on their own bus while another Linux > instance (LPAR) sees other functions from that bus. So yeah, I guess > it's fair not to have this attribute but still wanted to offer these > thoughts. This attribute uses the pci_dev bridge control register. If you don't have the bridge device, I don't think this would be useful, so I guess you'd have to fallback to resetting individual functions. It seems people can navigate /sys/bus/pci/devices/ easier than finding a pci_bus under /sys/devices/, though, so that's a plus for pci_dev.
On Fri, 2024-11-01 at 13:21 -0600, Keith Busch wrote: > On Tue, Oct 29, 2024 at 12:27:41PM +0100, Niklas Schnelle wrote: > > For what it's worth on s390 I think the previous proposal of having the > > attribute on the pci_bus would have been better in principle. On s390 > > the PCI bus probing is done by firmware and Linux doesn't see a pci_dev > > for bridges but we do create struct pci_bus for example for a PF and > > its child VFs. > > > > Then again we can't really do a reset on this level, other than > > manually going through the PCI functions and resetting them one by one. > > In fact we may see PCI functions on their own bus while another Linux > > instance (LPAR) sees other functions from that bus. So yeah, I guess > > it's fair not to have this attribute but still wanted to offer these > > thoughts. > > This attribute uses the pci_dev bridge control register. If you don't > have the bridge device, I don't think this would be useful, so I guess > you'd have to fallback to resetting individual functions. > > It seems people can navigate /sys/bus/pci/devices/ easier than finding a > pci_bus under /sys/devices/, though, so that's a plus for pci_dev. > Yes you're right, since the reset via __pci_reset_bus() and then pci_bridge_secondary_bus_reset() requires the bridge device (unlike pci_reset_bus()) it makes more sense to have it on the bridge thus also requiring the bridge device to exist. I still kind of wish pci_bridge_secondary_bus_reset() and pcibios_reset_secondary_bus() would take a struct pci_bus and then we could do the fallback in the latter platform specific code and having the attribute on the bus would make sense, but I'm not sure it's all that useful. One more question though, what would happen with this reset for a bus with an SR-IOV device with more than 256 VFs i.e. where pci_iov_virtfn_bus() returns anything other than 0. I'm guessing since VFs are physically still controlled by the bridge all VFs would be reset but at the same time virtfn_add_bus() sets the bridge device for the added bus as NULL so I think it might look odd in sysfs, sadly I don't have such a device to test with. Still, this might actually be an argument for having the attribute on the bridge. Thanks, Niklas
On Mon, Nov 04, 2024 at 10:44:23AM +0100, Niklas Schnelle wrote: > > One more question though, what would happen with this reset for a bus > with an SR-IOV device with more than 256 VFs i.e. where > pci_iov_virtfn_bus() returns anything other than 0. I'm guessing since > VFs are physically still controlled by the bridge all VFs would be > reset but at the same time virtfn_add_bus() sets the bridge device for > the added bus as NULL so I think it might look odd in sysfs, sadly I > don't have such a device to test with. Still, this might actually be an > argument for having the attribute on the bridge. I assume everything is reset at the PCI level. Are you asking what the kernel does? I don't think it does anything special with SR-IOV functions. Those pci_dev's aren't attached to the bridge pci_dev; you have to go through the pci_bus' children instead.
On Mon, 2024-11-04 at 10:01 -0700, Keith Busch wrote: > On Mon, Nov 04, 2024 at 10:44:23AM +0100, Niklas Schnelle wrote: > > > > One more question though, what would happen with this reset for a bus > > with an SR-IOV device with more than 256 VFs i.e. where > > pci_iov_virtfn_bus() returns anything other than 0. I'm guessing since > > VFs are physically still controlled by the bridge all VFs would be > > reset but at the same time virtfn_add_bus() sets the bridge device for > > the added bus as NULL so I think it might look odd in sysfs, sadly I > > don't have such a device to test with. Still, this might actually be an > > argument for having the attribute on the bridge. > > I assume everything is reset at the PCI level. > > Are you asking what the kernel does? I don't think it does anything > special with SR-IOV functions. Those pci_dev's aren't attached to the > bridge pci_dev; you have to go through the pci_bus' children instead. > I just want to make sure we're okay with the behavior with such VFs as it seems like the one case where a reset via the bridge affects PCI functions which aren't attached to the bridge pci_dev otherwise. And for example as I understand it these would not be covered by the pci_bus_save_and_disable_locked(). Thanks, Niklas
On Tue, Nov 05, 2024 at 01:28:39PM +0100, Niklas Schnelle wrote: > On Mon, 2024-11-04 at 10:01 -0700, Keith Busch wrote: > > On Mon, Nov 04, 2024 at 10:44:23AM +0100, Niklas Schnelle wrote: > > > > > > One more question though, what would happen with this reset for a bus > > > with an SR-IOV device with more than 256 VFs i.e. where > > > pci_iov_virtfn_bus() returns anything other than 0. I'm guessing since > > > VFs are physically still controlled by the bridge all VFs would be > > > reset but at the same time virtfn_add_bus() sets the bridge device for > > > the added bus as NULL so I think it might look odd in sysfs, sadly I > > > don't have such a device to test with. Still, this might actually be an > > > argument for having the attribute on the bridge. > > > > I assume everything is reset at the PCI level. > > > > Are you asking what the kernel does? I don't think it does anything > > special with SR-IOV functions. Those pci_dev's aren't attached to the > > bridge pci_dev; you have to go through the pci_bus' children instead. > > > > I just want to make sure we're okay with the behavior with such VFs as > it seems like the one case where a reset via the bridge affects PCI > functions which aren't attached to the bridge pci_dev otherwise. And > for example as I understand it these would not be covered by the > pci_bus_save_and_disable_locked(). Well, it's no different than doing FLR to the PF while VFs are configured, which is existing behavior, so I think it's okay. The PF's SRIOV state still gets restored to the end of the reset. The VFs themselves don't get saved and locked, though. But again, this new patch is following how it already works for other reset methods.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 338dfcd983f1e..bbf12d4998269 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -5158,6 +5158,8 @@ static void pci_dev_save_and_disable(struct pci_dev *dev) */ if (err_handler && err_handler->reset_prepare) err_handler->reset_prepare(dev); + else if (dev->driver) + pci_warn(dev, "resetting"); /* * Wake-up device prior to save. PM registers default to D0 after @@ -5191,6 +5193,8 @@ static void pci_dev_restore(struct pci_dev *dev) */ if (err_handler && err_handler->reset_done) err_handler->reset_done(dev); + else if (dev->driver) + pci_warn(dev, "reset done"); } /* dev->reset_methods[] is a 0-terminated list of indices into this array */