Message ID | 20120427190033.GA17588@ldl.usa.hp.com |
---|---|
State | Accepted |
Headers | show |
On Fri, Apr 27, 2012 at 1:00 PM, Khalid Aziz <khalid.aziz@hp.com> wrote: > Disable Bus Master bit on the device in > pci_device_shutdown() to ensure PCI devices do not continue > to DMA data after shutdown. This can cause memory > corruption in case of a kexec where the current kernel > shuts down and transfers control to a new kernel while a > PCI device continues to DMA to memory that does not belong > to it any more in the new kernel. > > I have tested this code on two laptops, two workstations and > a 16-socket server. kexec worked correctly on all of them. > > > Signed-off-by: Khalid Aziz <khalid.aziz@hp.com> > --- > drivers/pci/pci-driver.c | 6 ++++++ > 1 files changed, 6 insertions(+), 0 deletions(-) > > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c > index 6b54b23..9db5940 100644 > --- a/drivers/pci/pci-driver.c > +++ b/drivers/pci/pci-driver.c > @@ -420,6 +420,12 @@ static void pci_device_shutdown(struct device *dev) > pci_msi_shutdown(pci_dev); > pci_msix_shutdown(pci_dev); > > + /* > + * Turn off Bus Master bit on the device to tell it to not > + * continue to do DMA > + */ > + pci_disable_device(pci_dev); > + > /* > * Devices may be enabled to wake up by runtime PM, but they need not > * be supposed to wake up the system from its "power off" state (e.g. Any comment on this, Eric? It seems reasonable to me. -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, May 3, 2012 at 5:52 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: > On Fri, Apr 27, 2012 at 1:00 PM, Khalid Aziz <khalid.aziz@hp.com> wrote: >> Disable Bus Master bit on the device in >> pci_device_shutdown() to ensure PCI devices do not continue >> to DMA data after shutdown. This can cause memory >> corruption in case of a kexec where the current kernel >> shuts down and transfers control to a new kernel while a >> PCI device continues to DMA to memory that does not belong >> to it any more in the new kernel. >> >> I have tested this code on two laptops, two workstations and >> a 16-socket server. kexec worked correctly on all of them. >> >> >> Signed-off-by: Khalid Aziz <khalid.aziz@hp.com> >> --- >> drivers/pci/pci-driver.c | 6 ++++++ >> 1 files changed, 6 insertions(+), 0 deletions(-) >> >> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c >> index 6b54b23..9db5940 100644 >> --- a/drivers/pci/pci-driver.c >> +++ b/drivers/pci/pci-driver.c >> @@ -420,6 +420,12 @@ static void pci_device_shutdown(struct device *dev) >> pci_msi_shutdown(pci_dev); >> pci_msix_shutdown(pci_dev); >> >> + /* >> + * Turn off Bus Master bit on the device to tell it to not >> + * continue to do DMA >> + */ >> + pci_disable_device(pci_dev); >> + >> /* >> * Devices may be enabled to wake up by runtime PM, but they need not >> * be supposed to wake up the system from its "power off" state (e.g. > > Any comment on this, Eric? It seems reasonable to me. Applied to my "next" branch, thanks. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Apr 27, 2012 at 01:00:33PM -0600, Khalid Aziz wrote: > Disable Bus Master bit on the device in > pci_device_shutdown() to ensure PCI devices do not continue > to DMA data after shutdown. This can cause memory > corruption in case of a kexec where the current kernel > shuts down and transfers control to a new kernel while a > PCI device continues to DMA to memory that does not belong > to it any more in the new kernel. This protects against the case where a piece of hardware is continuing to DMA even after the driver shutdown method has been called? I'm not convinced this is safe. Some Broadcom parts will crash if busmastering is disabled while they're still performing DMA, and they'll then hang the bus if reenabled. There's also the risk that the hardware will start DMAing again if it's reenabled after being shut down. It seems like you're covering over the case where the driver didn't correctly quiesce the hardware, but you risk triggering other bugs instead.
On Wed, 2012-06-06 at 14:50 +0100, Matthew Garrett wrote: > On Fri, Apr 27, 2012 at 01:00:33PM -0600, Khalid Aziz wrote: > > Disable Bus Master bit on the device in > > pci_device_shutdown() to ensure PCI devices do not continue > > to DMA data after shutdown. This can cause memory > > corruption in case of a kexec where the current kernel > > shuts down and transfers control to a new kernel while a > > PCI device continues to DMA to memory that does not belong > > to it any more in the new kernel. > > This protects against the case where a piece of hardware is continuing > to DMA even after the driver shutdown method has been called? I'm not > convinced this is safe. Some Broadcom parts will crash if busmastering > is disabled while they're still performing DMA, and they'll then hang > the bus if reenabled. There's also the risk that the hardware will start > DMAing again if it's reenabled after being shut down. It seems like > you're covering over the case where the driver didn't correctly quiesce > the hardware, but you risk triggering other bugs instead. Hi Matthew, That is a good piece of information. I see your concern and agree with it. My take is shutdown method for the drivers will end all active I/O and clear the I/O queue. This should take care of any DMA caused by an I/O request originating in the kernel. For devices like NIC, a DMA can be triggered by an incoming packet and I am trying to stop that by disabling Bus Master bit. This is the issue that was reported on kexec mailing list in July of last year and it involved qla driver. I observed similar problem with kexec on ia64 many years ago and had written a patch to disable Bus Master bit on kexec. This patch was in ia64 tree for some time before it was removed. HP shipped kernels with this patch for many years and those kernels have been in deployment in field for some 7+ years with no problems. So it seems we do have a real problem. I understand there are devices with quirks related to Bus Master bit and it really helps to know about those. I have found disabling Bus Master bit has worked very well for all of the systems I have deployed kernels with this patch on but I have not come even close to having tried all PCI devices out there. I am open to other suggestions on how to solve this problem and make kexec reliable. Thanks Matthew! I appreciate the feedback.
On Wed, Jun 06, 2012 at 10:17:43AM -0600, Khalid Aziz wrote: > That is a good piece of information. I see your concern and agree with > it. My take is shutdown method for the drivers will end all active I/O > and clear the I/O queue. This should take care of any DMA caused by an > I/O request originating in the kernel. For devices like NIC, a DMA can > be triggered by an incoming packet and I am trying to stop that by > disabling Bus Master bit. This is the issue that was reported on kexec > mailing list in July of last year and it involved qla driver. I observed > similar problem with kexec on ia64 many years ago and had written a > patch to disable Bus Master bit on kexec. This patch was in ia64 tree > for some time before it was removed. HP shipped kernels with this patch > for many years and those kernels have been in deployment in field for > some 7+ years with no problems. If the qla driver knows that this is safe, then can't this just be done in the qla driver?
On Wed, 2012-06-06 at 17:27 +0100, Matthew Garrett wrote: > If the qla driver knows that this is safe, then can't this just be done > in the qla driver? > That is one way of doing it. It makes for a safe change but potentially leaves out a number of other cases that could be helped by wider scope change of disabling Bus Master bit on all PCI devices, until we laboriously debug every one of those cases and then add code to disable Bus Master bit. Sounds to me like it is not a clear win in either case. Do we agree that if device shutdown routine cleanly shuts down all I/O, clearing PCI Bus Mster bit should be safe? If yes, then we only have to deal with broken devices. So the approach could be to disable Bus Master bit unless the device ID matches a blacklist which we update as we find broken devices. I really don't like the idea of maintaining blacklists in the kernel for such things but is that a more practical approach? If blacklist does not sound good, maybe we can ask drivers to tell PCI subsystem if they are not ok with clearing Bus Master bit and then PCI subsystem could skip those devices.
On Wed, Jun 06, 2012 at 11:32:36AM -0600, Khalid Aziz wrote: > Do we agree that if device shutdown routine cleanly shuts down all I/O, > clearing PCI Bus Mster bit should be safe? In the absence of hardware that dislikes the bus master bit ever being disabled, yes. Do we know if hardware is ever tested in that situation? > If yes, then we only have to deal with broken devices. So the approach > could be to disable Bus Master bit unless the device ID matches a > blacklist which we update as we find broken devices. I really don't > like the idea of maintaining blacklists in the kernel for such things > but is that a more practical approach? If blacklist does not sound > good, maybe we can ask drivers to tell PCI subsystem if they are not > ok with clearing Bus Master bit and then PCI subsystem could skip > those devices. Or we could just put responsibility on the drivers to ensure that the hardware won't continue doing any DMA, either by shutting down the engines or clearing the bit.
On Wed, 2012-06-06 at 18:42 +0100, Matthew Garrett wrote: > On Wed, Jun 06, 2012 at 11:32:36AM -0600, Khalid Aziz wrote: > > > Do we agree that if device shutdown routine cleanly shuts down all I/O, > > clearing PCI Bus Mster bit should be safe? > > In the absence of hardware that dislikes the bus master bit ever being > disabled, yes. Do we know if hardware is ever tested in that situation? I will wait for device vendors to comment on that. I can't claim I have tested more than a few devices that way. > > > If yes, then we only have to deal with broken devices. So the approach > > could be to disable Bus Master bit unless the device ID matches a > > blacklist which we update as we find broken devices. I really don't > > like the idea of maintaining blacklists in the kernel for such things > > but is that a more practical approach? If blacklist does not sound > > good, maybe we can ask drivers to tell PCI subsystem if they are not > > ok with clearing Bus Master bit and then PCI subsystem could skip > > those devices. > > Or we could just put responsibility on the drivers to ensure that the > hardware won't continue doing any DMA, either by shutting down the > engines or clearing the bit. > I assume device shutdown routine should stop all I/O and shutting down DMA engine. Disabling Bus Master bit is just an extra measure of safety. I do like the idea of disabling Bus Master bit in device shutdown routine. After all, drivers know their hardware best. On the other hand, it is change to lots of driver code to implement this which means it will end up happening slowly over period of time. I don't mind doing the work up front on a good number of drivers I feel comfortable modifying. I am ok with pulling out code to clear bus master bit from PCI subsystem and replacing it with modified shutdown routines for a few drivers to start with. Does any one see any other issues with modifying driver shutdown routines for disabling Bus Master bit? Bjorn, any opinions? ==================================================================== Khalid Aziz Unix Systems Lab (970)898-9214 Hewlett-Packard khalid.aziz@hp.com Fort Collins, CO -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Khalid Aziz <khalid.aziz@hp.com> writes: > On Wed, 2012-06-06 at 18:42 +0100, Matthew Garrett wrote: >> On Wed, Jun 06, 2012 at 11:32:36AM -0600, Khalid Aziz wrote: >> >> > Do we agree that if device shutdown routine cleanly shuts down all I/O, >> > clearing PCI Bus Mster bit should be safe? >> >> In the absence of hardware that dislikes the bus master bit ever being >> disabled, yes. Do we know if hardware is ever tested in that situation? > > I will wait for device vendors to comment on that. I can't claim I have > tested more than a few devices that way. Testing is easy. kexec into a new kernel. Shrug. A long standing useful kernel feature. In all other cases I expec the firmware triggers a board level reset of the hardware to avoid issues during reboot. >> > If yes, then we only have to deal with broken devices. So the approach >> > could be to disable Bus Master bit unless the device ID matches a >> > blacklist which we update as we find broken devices. I really don't >> > like the idea of maintaining blacklists in the kernel for such things >> > but is that a more practical approach? If blacklist does not sound >> > good, maybe we can ask drivers to tell PCI subsystem if they are not >> > ok with clearing Bus Master bit and then PCI subsystem could skip >> > those devices. >> >> Or we could just put responsibility on the drivers to ensure that the >> hardware won't continue doing any DMA, either by shutting down the >> engines or clearing the bit. Where the responsibily has squarely been for the last decade, and we still have issues in the common case. > I assume device shutdown routine should stop all I/O and shutting down > DMA engine. Disabling Bus Master bit is just an extra measure of safety. > I do like the idea of disabling Bus Master bit in device shutdown > routine. After all, drivers know their hardware best. On the other hand, > it is change to lots of driver code to implement this which means it > will end up happening slowly over period of time. I don't mind doing the > work up front on a good number of drivers I feel comfortable modifying. > I am ok with pulling out code to clear bus master bit from PCI subsystem > and replacing it with modified shutdown routines for a few drivers to > start with. Absent anyone even knowing if there are devices that exist that can not tolerate their bus master bit being flipped when DMA is not ongoing I think the current state of the code is good. When we find the broken hardware that can not tolerate a standard PCI bit being used in a standard way we can add a flag in the core to avoid doing that. pci_device_shutdown calls drv->shutdown before calling pci_device_disable. Which means that only devices that have trouble with this bit being flipped while DMA is ongoing and don't bother to stop their own DMA will have a problem. As for shifting problems I do think we have shifted the problem in a very positive way. Now instead of having a random failure at a random location caused by DMA happing at a random moment for no expected reason we have failures happening when we disable or enable a device, which should be much more debugable. If we encounter devices that can't have their bus master bit disabled at all we can move that functionality into the drivers or add some sort of flag so that pci_device_shutdown avoids this on real hardware. > Does any one see any other issues with modifying driver shutdown > routines for disabling Bus Master bit? Bjorn, any opinions? I don't have a problem with moving it all of the way into the drivers I just think it might be a little bit silly at this point. Ultimately I don't see the complaint raised by this thread. Either the drivers for the broadcom devices in questoin are buggy before we added the pci_disable_device or those drivers are not buggy. If we really want to do something to reduce the testing burden and make certain things work better in general we need to merge the device shutdown and the device remove methods. Shrug. People keep getting squeamish when I suggest that. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jun 06, 2012 at 12:42:07PM -0700, Eric W. Biederman wrote: > pci_device_shutdown calls drv->shutdown before calling > pci_device_disable. Which means that only devices that have trouble > with this bit being flipped while DMA is ongoing and don't bother > to stop their own DMA will have a problem. drv->shutdown should already be quiescing the hardware. If it isn't, it should be. If it is, what does this patch fix? Many drivers call pci_enable_device() early enough that they clearly expect the hardware to be quiescent when they do so, so this really does seem to simply handle the kexec case without handling any other cases that could be similarly problematic.
On Wed, Jun 6, 2012 at 12:07 PM, Khalid Aziz <khalid.aziz@hp.com> wrote: > On Wed, 2012-06-06 at 18:42 +0100, Matthew Garrett wrote: >> On Wed, Jun 06, 2012 at 11:32:36AM -0600, Khalid Aziz wrote: >> >> > Do we agree that if device shutdown routine cleanly shuts down all I/O, >> > clearing PCI Bus Mster bit should be safe? >> >> In the absence of hardware that dislikes the bus master bit ever being >> disabled, yes. Do we know if hardware is ever tested in that situation? > > I will wait for device vendors to comment on that. I can't claim I have > tested more than a few devices that way. > >> >> > If yes, then we only have to deal with broken devices. So the approach >> > could be to disable Bus Master bit unless the device ID matches a >> > blacklist which we update as we find broken devices. I really don't >> > like the idea of maintaining blacklists in the kernel for such things >> > but is that a more practical approach? If blacklist does not sound >> > good, maybe we can ask drivers to tell PCI subsystem if they are not >> > ok with clearing Bus Master bit and then PCI subsystem could skip >> > those devices. >> >> Or we could just put responsibility on the drivers to ensure that the >> hardware won't continue doing any DMA, either by shutting down the >> engines or clearing the bit. >> > > I assume device shutdown routine should stop all I/O and shutting down > DMA engine. Disabling Bus Master bit is just an extra measure of safety. > I do like the idea of disabling Bus Master bit in device shutdown > routine. After all, drivers know their hardware best. On the other hand, > it is change to lots of driver code to implement this which means it > will end up happening slowly over period of time. I don't mind doing the > work up front on a good number of drivers I feel comfortable modifying. > I am ok with pulling out code to clear bus master bit from PCI subsystem > and replacing it with modified shutdown routines for a few drivers to > start with. > > Does any one see any other issues with modifying driver shutdown > routines for disabling Bus Master bit? Bjorn, any opinions? Hay Khalid, I've been following this discussion but still do not understand what the perceived correct fix is for the situation you were originally trying to circumvent - namely, stopping NIC devices from triggering DMA activity upon receipt of an incoming packet. From the discussion it sounds like a driver, upon shutdown, can end all active I/O clearing the I/O queue, and also (?) shut down its DMA engine, and may also go as far as disabling the Bus Master bit. What was occurring in the original issue - was the driver in question not shutting down its DMA engine for some, possibly valid, reason? Thanks, Myron > > ==================================================================== > Khalid Aziz Unix Systems Lab > (970)898-9214 Hewlett-Packard > khalid.aziz@hp.com Fort Collins, CO > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> This protects against the case where a piece of hardware is continuing > to DMA even after the driver shutdown method has been called? I'm not It doesn't. We also have hardware which craps itself if you clear the bus mastering bit and we have platforms where the BIOS gets most upset if you do that on suspend paths. There are also lots of devices that simply ignore the bus mastering bit ! > convinced this is safe. Some Broadcom parts will crash if busmastering > is disabled while they're still performing DMA, and they'll then hang > the bus if reenabled. This is very common, disabling the bus master bit isn't exactly a tested or well defined pathway. A lot of device FIFOs will also happily belch out their queue when you flip the bit. So it might look like progress but it probably isn't. Unfortunately if you've got a device peeing into memory you need to fix the driver, or sometimes the firmware. You can probably use the IOMMU for some protection on newer systems. And then you get things like the CS5520 where disabling bus mastering on the IDE controller crashes the machine, and some UMA video devices that honour the bit for video fetch. From the IDE experience you really need to be careful here. Changing the default is asking for nasty and weird regressions. This isn't a fix, its a band aid to cover over broken driver shutdown methods. Those driver shutdown methods need fixing. Alan -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2012-06-06 at 14:16 -0600, Myron Stowe wrote: > Hay Khalid, > > I've been following this discussion but still do not understand what the > perceived correct fix is for the situation you were originally trying to > circumvent - namely, stopping NIC devices from triggering DMA activity > upon receipt of an incoming packet. > > From the discussion it sounds like a driver, upon shutdown, can end > all active I/O > clearing the I/O queue, and also (?) shut down its DMA engine, and may > also go as far as disabling the Bus Master bit. > > What was occurring in the original issue - was the driver in question not > shutting down its DMA engine for some, possibly valid, reason? > > Thanks, > Myron Hi Myron, The last driver issue with kexec was reported on kexec mailing list in July of last year and that was with qla driver. From what H Peter Anvin described, the driver left DMA engine running. For the issue I found on an ia64 platform, it was so many years ago that you are now taxing my old and decrepit memory :) I am digging through old internal bug reports and will let you know once I find that original bug report. ==================================================================== Khalid Aziz Unix Systems Lab (970)898-9214 Hewlett-Packard khalid.aziz@hp.com Fort Collins, CO -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jun 6, 2012 at 5:03 PM, Khalid Aziz <khalid.aziz@hp.com> wrote: > On Wed, 2012-06-06 at 14:16 -0600, Myron Stowe wrote: >> Hay Khalid, >> >> I've been following this discussion but still do not understand what the >> perceived correct fix is for the situation you were originally trying to >> circumvent - namely, stopping NIC devices from triggering DMA activity >> upon receipt of an incoming packet. >> >> From the discussion it sounds like a driver, upon shutdown, can end >> all active I/O >> clearing the I/O queue, and also (?) shut down its DMA engine, and may >> also go as far as disabling the Bus Master bit. >> >> What was occurring in the original issue - was the driver in question not >> shutting down its DMA engine for some, possibly valid, reason? >> >> Thanks, >> Myron > > Hi Myron, > > The last driver issue with kexec was reported on kexec mailing list in > July of last year and that was with qla driver. From what H Peter Anvin > described, the driver left DMA engine running. > > For the issue I found on an ia64 platform, it was so many years ago that > you are now taxing my old and decrepit memory :) Yeah, if it was more than a couple of weeks ago then I'm likely to have forgotten myself. If you can find it great but don't spend too much time digging. I just wanted some validation of my understanding - o That drivers must consider both "active I/O" and "DMA" (when shutting down), o That the Bus Master disablement was trying to compensate for a driver not quiescing "DMA" as it probably should have (I wasn't sure if these were separate capabilities and/or if there was possibly a valid reason for a driver leaving "DMA" in an enabled state upon shutdown - i.e. Wake on LAN or something similar). Myron > I am digging through > old internal bug reports and will let you know once I find that original > bug report. > > ==================================================================== > Khalid Aziz Unix Systems Lab > (970)898-9214 Hewlett-Packard > khalid.aziz@hp.com Fort Collins, CO > -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2012-06-06 at 12:42 -0700, Eric W. Biederman wrote: > Absent anyone even knowing if there are devices that exist that can not > tolerate their bus master bit being flipped when DMA is not ongoing I > think the current state of the code is good. When we find the broken > hardware that can not tolerate a standard PCI bit being used in a > standard way we can add a flag in the core to avoid doing that. > > pci_device_shutdown calls drv->shutdown before calling > pci_device_disable. Which means that only devices that have trouble > with this bit being flipped while DMA is ongoing and don't bother > to stop their own DMA will have a problem. > > As for shifting problems I do think we have shifted the problem in a > very positive way. Now instead of having a random failure at a random > location caused by DMA happing at a random moment for no expected reason > we have failures happening when we disable or enable a device, which > should be much more debugable. That is a very good point. Failure at a predictable point is much better than random failures with the root cause being elsewhere from the point of failure in time and code. This at least gives us a good shot at being able to debug buggy hardware and drivers.
Alan Cox <alan@lxorguk.ukuu.org.uk> writes: >> This protects against the case where a piece of hardware is continuing >> to DMA even after the driver shutdown method has been called? I'm not > > It doesn't. We also have hardware which craps itself if you clear the bus > mastering bit and we have platforms where the BIOS gets most upset if you > do that on suspend paths. There are also lots of devices that simply > ignore the bus mastering bit ! But it also makes my system do kexec successfully for the first time. Maybe can make it an option. n-Andi
Matthew Garrett <mjg@redhat.com> writes: > > This protects against the case where a piece of hardware is continuing > to DMA even after the driver shutdown method has been called? I'm not > convinced this is safe. Some Broadcom parts will crash if busmastering > is disabled while they're still performing DMA, and they'll then hang > the bus if reenabled. There's also the risk that the hardware will start > DMAing again if it's reenabled after being shut down. It seems like > you're covering over the case where the driver didn't correctly quiesce > the hardware, but you risk triggering other bugs instead. One alternative I've been pondering some time is to use AER link reset instead. But this is mainly on servers, a lot of clients don't have it. -Andi
> But it also makes my system do kexec successfully for the first time. > > Maybe can make it an option. More useful might be to report which devices have the bus master bit left on at the time you tried the kexec, then we might actually find the bug. Alan -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2012-06-07 at 10:07 -0700, Andi Kleen wrote: > Alan Cox <alan@lxorguk.ukuu.org.uk> writes: > > >> This protects against the case where a piece of hardware is continuing > >> to DMA even after the driver shutdown method has been called? I'm not > > > > It doesn't. We also have hardware which craps itself if you clear the bus > > mastering bit and we have platforms where the BIOS gets most upset if you > > do that on suspend paths. There are also lots of devices that simply > > ignore the bus mastering bit ! > > But it also makes my system do kexec successfully for the first time. > > Maybe can make it an option. > > n-Andi > Hi Andi, That makes my day :) It will be very helpful to see which PCI devices your system has. I was thinking about kernel command line option this morning and had started looking at reset_devices option to see if I could leverage that one in any way.
On Wed, 2012-06-06 at 21:09 +0100, Matthew Garrett wrote: > On Wed, Jun 06, 2012 at 12:42:07PM -0700, Eric W. Biederman wrote: > > > pci_device_shutdown calls drv->shutdown before calling > > pci_device_disable. Which means that only devices that have trouble > > with this bit being flipped while DMA is ongoing and don't bother > > to stop their own DMA will have a problem. > > drv->shutdown should already be quiescing the hardware. If it isn't, it > should be. If it is, what does this patch fix? Many drivers > call pci_enable_device() early enough that they clearly expect the > hardware to be quiescent when they do so, so this really does seem to > simply handle the kexec case without handling any other cases that could > be similarly problematic. > I agree drv->shutdown should quiesce the hardware (although anecdotal evidence suggests that is not happening), so turning Bus Master bit off is an additional safety measure. As Alan pointed out, doing that can be fraught with danger. Clearing Bus Master bit seems like the right thing to do, but due to buggy hardware/firmware it can cause problems, although those problems happen at predictable times and thus become easier to diagnose. I am considering other options. Andi mentioned command line option which does allow for some control instead of blindly clearing Bus Master bit on all system. I am also wondering if clearing Bus Master bit on just the PCI bridges is an option. If Bus Master bit is clear on a bridge, it is not supposed to allow DMA transactions through it. That can prevent random memory corruption by broken devices and also not upset the devices that do not like their Bus Master bit cleared. Any opinions on this option? I have also acquired a broadcom NIC that is causing what looks like random DMAs into kernel memory on a kexec, although this is with a kernel that does not have the Bus Master reset patch. I will run some experiments on this NIC and see what I can figure out. ==================================================================== Khalid Aziz Unix Systems Lab (970)898-9214 Hewlett-Packard khalid.aziz@hp.com Fort Collins, CO -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index 6b54b23..9db5940 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -420,6 +420,12 @@ static void pci_device_shutdown(struct device *dev) pci_msi_shutdown(pci_dev); pci_msix_shutdown(pci_dev); + /* + * Turn off Bus Master bit on the device to tell it to not + * continue to do DMA + */ + pci_disable_device(pci_dev); + /* * Devices may be enabled to wake up by runtime PM, but they need not * be supposed to wake up the system from its "power off" state (e.g.
Disable Bus Master bit on the device in pci_device_shutdown() to ensure PCI devices do not continue to DMA data after shutdown. This can cause memory corruption in case of a kexec where the current kernel shuts down and transfers control to a new kernel while a PCI device continues to DMA to memory that does not belong to it any more in the new kernel. I have tested this code on two laptops, two workstations and a 16-socket server. kexec worked correctly on all of them. Signed-off-by: Khalid Aziz <khalid.aziz@hp.com> --- drivers/pci/pci-driver.c | 6 ++++++ 1 files changed, 6 insertions(+), 0 deletions(-)