Message ID | 4664cf39-5dfe-4557-959d-149640ca36e1@roeck-us.net |
---|---|
State | New |
Headers | show |
Series | Problems (timeouts) when testing usb-ohci with qemu | expand |
On Tue, Apr 23, 2024 at 10:04:17AM -0700, Guenter Roeck wrote: > Hi, > > when testing usb-ohci What is usb-ohci? Do you mean ohci-hcd? > with qemu's pci-ohci emulation, I keep getting > random usb interface timeouts. Sometimes the usb_hub_wq times out. ... > Sometimes there is an i/o scheduling timeout such as ... > This is not a new problem; I have seen it forever. Recently I spent some > time trying to understand the problem and found that the linux driver does > not always handle all ohci interupts. Please be more specific: _Which_ interrupts aren't being handled? That is, which flags remain set in the intrstatus register when the handler returns? > Since the interrupt is shared and > thus level triggered, that means that interrupts are still pending when > ohci_irq() exits. The interrupt core in Linux does not re-enter the > interrupt handler, presumably because it is marked as shared interrupt > and returns that the interrupt has been handled. Isn't that behavior mistaken? A level-triggered IRQ that remains set when it is re-enabled (when the various shared handlers return) should cause another interrupt to occur right away. Edged-triggered interrupts would be a different story, of course. > I found two possible fixes for the problem. One essentially mirrors the > code from ehci_irq(), the other adds a (bad) kludge into qemu. Both "fix" > or work around the problem. > > Question is: What is actually wrong ? Something in the generic interrupt > handling code in Linux, something in the Linux usb-ohci driver, or > something in qemu ? Any idea how a proper fix might look like ? To answer these questions we need more information. Alan Stern
Hi Alan, On 4/23/24 10:30, Alan Stern wrote: > On Tue, Apr 23, 2024 at 10:04:17AM -0700, Guenter Roeck wrote: >> Hi, >> >> when testing usb-ohci > > What is usb-ohci? Do you mean ohci-hcd? > >> with qemu's pci-ohci emulation, I keep getting >> random usb interface timeouts. Sometimes the usb_hub_wq times out. > ... > >> Sometimes there is an i/o scheduling timeout such as > ... > >> This is not a new problem; I have seen it forever. Recently I spent some >> time trying to understand the problem and found that the linux driver does >> not always handle all ohci interupts. > > Please be more specific: _Which_ interrupts aren't being handled? That > is, which flags remain set in the intrstatus register when the handler > returns? > Sorry, I didn't think it was relevant. Here is the result of repeating the test with the linux patch several times and passing the log output through "sort | uniq". 1 pending 0x2 mask 0x8000005a 9 pending 0x4 mask 0x8000001e 208 pending 0x4 mask 0x8000005e So it is almost always OHCI_INTR_SF and rarely OHCI_INTR_WDH. For reference, this is repeatedly running CMDLINE="panic=-1 kunit.stats_enabled=2 kunit.filter=speed>slow root=/dev/sda rootwait console=ttyS0,115200 earlycon=uart8250,mmio,0x10000000,115200" qemu-system-riscv32 -M virt -m 512M \ -no-reboot -bios default -kernel arch/riscv/boot/Image -snapshot \ -device virtio-net-device,netdev=net0 -netdev user,id=net0 -usb \ -device pci-ohci,id=ohci -device usb-storage,bus=ohci.0,drive=d0 \ -drive file=rootfs.ext2,if=none,id=d0,format=raw \ -append "${CMDLINE}" \ -nographic -monitor none \ --trace 'usb*' -D "/tmp/usb-ohci.$$.${sequence}.trace.log" Tracing isn't really necessary but it changes timing enough that the problem is more likely to be seen if it is active. The problem is seen with various emulations; I just picked one of them. >> Since the interrupt is shared and >> thus level triggered, that means that interrupts are still pending when >> ohci_irq() exits. The interrupt core in Linux does not re-enter the >> interrupt handler, presumably because it is marked as shared interrupt >> and returns that the interrupt has been handled. > > Isn't that behavior mistaken? A level-triggered IRQ that remains set when > it is re-enabled (when the various shared handlers return) should cause > another interrupt to occur right away. > > Edged-triggered interrupts would be a different story, of course. > Maybe I got it wrong; I thought that shared interrupts would have to be level triggered. But then you are correct: one would think that level triggered interrupts would keep executing interrupt handlers until the interrupt is completely handled. I guess that means that I don't really understand what is happening. Sorry for jumping to conclusions. >> I found two possible fixes for the problem. One essentially mirrors the >> code from ehci_irq(), the other adds a (bad) kludge into qemu. Both "fix" >> or work around the problem. >> >> Question is: What is actually wrong ? Something in the generic interrupt >> handling code in Linux, something in the Linux usb-ohci driver, or >> something in qemu ? Any idea how a proper fix might look like ? > > To answer these questions we need more information. > What else would you need ? Thanks, Guenter
On Tue, Apr 23, 2024 at 02:10:13PM -0700, Guenter Roeck wrote: > Hi Alan, > > On 4/23/24 10:30, Alan Stern wrote: > > On Tue, Apr 23, 2024 at 10:04:17AM -0700, Guenter Roeck wrote: > > > Hi, > > > > > > when testing usb-ohci > > > > What is usb-ohci? Do you mean ohci-hcd? > > > > > with qemu's pci-ohci emulation, I keep getting > > > random usb interface timeouts. Sometimes the usb_hub_wq times out. > > ... > > > > > Sometimes there is an i/o scheduling timeout such as > > ... > > > > > This is not a new problem; I have seen it forever. Recently I spent some > > > time trying to understand the problem and found that the linux driver does > > > not always handle all ohci interupts. > > > > Please be more specific: _Which_ interrupts aren't being handled? That > > is, which flags remain set in the intrstatus register when the handler > > returns? > > > Sorry, I didn't think it was relevant. > > Here is the result of repeating the test with the linux patch several times > and passing the log output through "sort | uniq". > > 1 pending 0x2 mask 0x8000005a > 9 pending 0x4 mask 0x8000001e > 208 pending 0x4 mask 0x8000005e > > So it is almost always OHCI_INTR_SF and rarely OHCI_INTR_WDH. I see. Both of those flags can legitimately get turned on while the interrupt routine is running, and under heavy load this could happen fairly frequently. The SF (Start of Frame) interrupt occurs once every millisecond, and the WDH (Writeback Done Head) interrupt occurs more or less when a transfer completes. > For reference, this is repeatedly running > > CMDLINE="panic=-1 kunit.stats_enabled=2 kunit.filter=speed>slow root=/dev/sda rootwait console=ttyS0,115200 earlycon=uart8250,mmio,0x10000000,115200" > qemu-system-riscv32 -M virt -m 512M \ > -no-reboot -bios default -kernel arch/riscv/boot/Image -snapshot \ > -device virtio-net-device,netdev=net0 -netdev user,id=net0 -usb \ > -device pci-ohci,id=ohci -device usb-storage,bus=ohci.0,drive=d0 \ > -drive file=rootfs.ext2,if=none,id=d0,format=raw \ > -append "${CMDLINE}" \ > -nographic -monitor none \ > --trace 'usb*' -D "/tmp/usb-ohci.$$.${sequence}.trace.log" > > Tracing isn't really necessary but it changes timing enough that > the problem is more likely to be seen if it is active. The problem > is seen with various emulations; I just picked one of them. > > > > Since the interrupt is shared and > > > thus level triggered, that means that interrupts are still pending when > > > ohci_irq() exits. The interrupt core in Linux does not re-enter the > > > interrupt handler, presumably because it is marked as shared interrupt > > > and returns that the interrupt has been handled. > > > > Isn't that behavior mistaken? A level-triggered IRQ that remains set when > > it is re-enabled (when the various shared handlers return) should cause > > another interrupt to occur right away. > > > > Edged-triggered interrupts would be a different story, of course. > > > > Maybe I got it wrong; I thought that shared interrupts would have to be > level triggered. But then you are correct: one would think that level > triggered interrupts would keep executing interrupt handlers until the > interrupt is completely handled. I guess that means that I don't really > understand what is happening. Sorry for jumping to conclusions. To avoid the overhead of repeated interrupts, it would be best to check the interrupt status at the end of the routine and restart if any of the enabled bits are set, as in your first patch. If you would like to clean it up (get rid of the debugging stuff) and submit it, I'll review it. > > > I found two possible fixes for the problem. One essentially mirrors the > > > code from ehci_irq(), the other adds a (bad) kludge into qemu. Both "fix" > > > or work around the problem. > > > > > > Question is: What is actually wrong ? Something in the generic interrupt > > > handling code in Linux, something in the Linux usb-ohci driver, or > > > something in qemu ? Any idea how a proper fix might look like ? > > > > To answer these questions we need more information. > > > What else would you need ? Nothing else really, just the remaining interrupt flag bits. Alan Stern
> qemu hack: > > hw/usb/hcd-ohci.c | 11 +++++++++++ > hw/usb/hcd-ohci.h | 1 + > 2 files changed, 12 insertions(+) > > diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c > index fc8fc91a1d..99e52ad13a 100644 > --- a/hw/usb/hcd-ohci.c > +++ b/hw/usb/hcd-ohci.c > @@ -267,6 +267,10 @@ static inline void ohci_intr_update(OHCIState *ohci) > (ohci->intr_status & ohci->intr)) > level = 1; > > + if (level && ohci->level) > + qemu_set_irq(ohci->irq, 0); > + > + ohci->level = level; > qemu_set_irq(ohci->irq, level); > } > > diff --git a/hw/usb/hcd-ohci.h b/hw/usb/hcd-ohci.h > index e1827227ac..6f82e72bd9 100644 > --- a/hw/usb/hcd-ohci.h > +++ b/hw/usb/hcd-ohci.h > @@ -52,6 +52,7 @@ struct OHCIState { > uint32_t ctl, status; > uint32_t intr_status; > uint32_t intr; > + int level; > > /* memory pointer partition */ > uint32_t hcca; Phew. Disclaimer: Havn't looked at the ohci emulation code for years. It should not be needed to store the interrupt level that way. It is possible to calculate what the interrupt level should be, based on the interrupt status register and the interrupt mask register, and the code above seems to do exactly that (the "ohci->intr_status & ohci->intr" bit). ohci_intr_update() must be called if one of these two registers changes, i.e. if the guest changes the mask, if the guest acks an IRQ by clearing an status bit, if the device raises an IRQ by setting an status bit. Might be the ohci emulation has a bug here. Another possible trouble spot is that the timing behavior is different on virtual vs. physical hardware. Specifically with the emulated hardware some actions appear to complete instantly (when the vmexit to handle the mmio register write returns it's finished already), which will never complete that quickly on physical hardware. So drivers can have race conditions which only trigger on virtual hardware. The error pattern you are describing sounds like this could be the case here. HTH & take care, Gerd
On 4/23/24 19:11, Alan Stern wrote: [ ... ] > > To avoid the overhead of repeated interrupts, it would be best to check the > interrupt status at the end of the routine and restart if any of the > enabled bits are set, as in your first patch. > > If you would like to clean it up (get rid of the debugging stuff) and > submit it, I'll review it. > Sure, I'll do that. Thanks, Guenter
On 4/24/24 04:16, Gerd Hoffmann wrote: >> qemu hack: >> >> hw/usb/hcd-ohci.c | 11 +++++++++++ >> hw/usb/hcd-ohci.h | 1 + >> 2 files changed, 12 insertions(+) >> >> diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c >> index fc8fc91a1d..99e52ad13a 100644 >> --- a/hw/usb/hcd-ohci.c >> +++ b/hw/usb/hcd-ohci.c >> @@ -267,6 +267,10 @@ static inline void ohci_intr_update(OHCIState *ohci) >> (ohci->intr_status & ohci->intr)) >> level = 1; >> >> + if (level && ohci->level) >> + qemu_set_irq(ohci->irq, 0); >> + >> + ohci->level = level; >> qemu_set_irq(ohci->irq, level); >> } >> >> diff --git a/hw/usb/hcd-ohci.h b/hw/usb/hcd-ohci.h >> index e1827227ac..6f82e72bd9 100644 >> --- a/hw/usb/hcd-ohci.h >> +++ b/hw/usb/hcd-ohci.h >> @@ -52,6 +52,7 @@ struct OHCIState { >> uint32_t ctl, status; >> uint32_t intr_status; >> uint32_t intr; >> + int level; >> >> /* memory pointer partition */ >> uint32_t hcca; > > Phew. Disclaimer: Havn't looked at the ohci emulation code for years. > > It should not be needed to store the interrupt level that way. It is > possible to calculate what the interrupt level should be, based on the > interrupt status register and the interrupt mask register, and the code > above seems to do exactly that (the "ohci->intr_status & ohci->intr" > bit). > You are correct. For the purpose of this kludge a simpler + qemu_set_irq(ohci->irq, 0); qemu_set_irq(ohci->irq, level); would have been sufficient. My original code added tracing, where this generated a lot of noise. I didn't completely simplify the kludge. Sorry for that and for any confusion it may have caused. > ohci_intr_update() must be called if one of these two registers changes, > i.e. if the guest changes the mask, if the guest acks an IRQ by clearing > an status bit, if the device raises an IRQ by setting an status bit. > Might be the ohci emulation has a bug here. > > Another possible trouble spot is that the timing behavior is different > on virtual vs. physical hardware. Specifically with the emulated > hardware some actions appear to complete instantly (when the vmexit to > handle the mmio register write returns it's finished already), which > will never complete that quickly on physical hardware. So drivers can > have race conditions which only trigger on virtual hardware. The error > pattern you are describing sounds like this could be the case here. > I think the underlying problem is that both the qemu emulation and the Linux kernel driver expect that the interrupt is level triggered. It looks like some entity in between handles the interrupts as edge triggered. This makes the kludge necessary: All it does is to generate an artificial interrupt edge. This can be worked around in the Linux interrupt handler by checking if another interrupt arrived while the original interrupt was handled. This will ensure that all interrupts are handled and cleared when the handler exits, and that a later arriving interrupt will generate the necessary edge and thus another interrupt. That doesn't fix the edge<->level triggered interrupt confusion (if that is indeed the root cause of the problem), but it does address its consequences. If anyone has an idea how to find out where the interrupt confusion happens, please let me know, and I'll be happy to do some more testing. I am quite curious myself, and it would make sense to solve the problem at its root. Thanks, Guenter
On 4/24/24 08:23, Guenter Roeck wrote: > On 4/24/24 04:16, Gerd Hoffmann wrote: >>> qemu hack: >>> >>> hw/usb/hcd-ohci.c | 11 +++++++++++ >>> hw/usb/hcd-ohci.h | 1 + >>> 2 files changed, 12 insertions(+) >>> >>> diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c >>> index fc8fc91a1d..99e52ad13a 100644 >>> --- a/hw/usb/hcd-ohci.c >>> +++ b/hw/usb/hcd-ohci.c >>> @@ -267,6 +267,10 @@ static inline void ohci_intr_update(OHCIState *ohci) >>> (ohci->intr_status & ohci->intr)) >>> level = 1; >>> + if (level && ohci->level) >>> + qemu_set_irq(ohci->irq, 0); >>> + >>> + ohci->level = level; >>> qemu_set_irq(ohci->irq, level); >>> } >>> diff --git a/hw/usb/hcd-ohci.h b/hw/usb/hcd-ohci.h >>> index e1827227ac..6f82e72bd9 100644 >>> --- a/hw/usb/hcd-ohci.h >>> +++ b/hw/usb/hcd-ohci.h >>> @@ -52,6 +52,7 @@ struct OHCIState { >>> uint32_t ctl, status; >>> uint32_t intr_status; >>> uint32_t intr; >>> + int level; >>> /* memory pointer partition */ >>> uint32_t hcca; >> >> Phew. Disclaimer: Havn't looked at the ohci emulation code for years. >> >> It should not be needed to store the interrupt level that way. It is >> possible to calculate what the interrupt level should be, based on the >> interrupt status register and the interrupt mask register, and the code >> above seems to do exactly that (the "ohci->intr_status & ohci->intr" >> bit). >> > > You are correct. For the purpose of this kludge a simpler > + qemu_set_irq(ohci->irq, 0); > qemu_set_irq(ohci->irq, level); > > would have been sufficient. My original code added tracing, > where this generated a lot of noise. I didn't completely simplify > the kludge. Sorry for that and for any confusion it may have caused. > >> ohci_intr_update() must be called if one of these two registers changes, >> i.e. if the guest changes the mask, if the guest acks an IRQ by clearing >> an status bit, if the device raises an IRQ by setting an status bit. >> Might be the ohci emulation has a bug here. >> >> Another possible trouble spot is that the timing behavior is different >> on virtual vs. physical hardware. Specifically with the emulated >> hardware some actions appear to complete instantly (when the vmexit to >> handle the mmio register write returns it's finished already), which >> will never complete that quickly on physical hardware. So drivers can >> have race conditions which only trigger on virtual hardware. The error >> pattern you are describing sounds like this could be the case here. >> > > I think the underlying problem is that both the qemu emulation and > the Linux kernel driver expect that the interrupt is level triggered. > It looks like some entity in between handles the interrupts as edge > triggered. This makes the kludge necessary: All it does is to generate > an artificial interrupt edge. > > This can be worked around in the Linux interrupt handler by checking > if another interrupt arrived while the original interrupt was handled. > This will ensure that all interrupts are handled and cleared when the > handler exits, and that a later arriving interrupt will generate the > necessary edge and thus another interrupt. That doesn't fix the > edge<->level triggered interrupt confusion (if that is indeed the root > cause of the problem), but it does address its consequences. > > If anyone has an idea how to find out where the interrupt confusion > happens, please let me know, and I'll be happy to do some more testing. > I am quite curious myself, and it would make sense to solve the problem > at its root. > Update: I found upstream commit 0b60557230ad ("usb: ehci: Prevent missed ehci interrupts with edge-triggered MSI") which I think explains the problem. Guenter
diff --git a/drivers/usb/host/ohci-hcd.c b/drivers/usb/host/ohci-hcd.c index 4f9982ecfb58..48d523e71ea0 100644 --- a/drivers/usb/host/ohci-hcd.c +++ b/drivers/usb/host/ohci-hcd.c @@ -883,6 +883,7 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd) * optimization of checking the LSB of hcca->done_head; it doesn't * work on all systems (edge triggering for OHCI can be a factor). */ +retry: ints = ohci_readl(ohci, ®s->intrstatus); /* Check for an all 1's result which is a typical consequence @@ -982,6 +983,14 @@ static irqreturn_t ohci_irq (struct usb_hcd *hcd) } spin_unlock(&ohci->lock); + ints = ohci_readl(ohci, ®s->intrstatus); + ints &= ohci_readl(ohci, ®s->intrenable); + if (ints) { + pr_err("######## Interrupts still pending 0x%x mask 0x%x\n", ints, + ohci_readl(ohci, ®s->intrenable)); + goto retry; + } + return IRQ_HANDLED; } --- qemu hack: hw/usb/hcd-ohci.c | 11 +++++++++++ hw/usb/hcd-ohci.h | 1 + 2 files changed, 12 insertions(+) diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c index fc8fc91a1d..99e52ad13a 100644 --- a/hw/usb/hcd-ohci.c +++ b/hw/usb/hcd-ohci.c @@ -267,6 +267,10 @@ static inline void ohci_intr_update(OHCIState *ohci) (ohci->intr_status & ohci->intr)) level = 1; + if (level && ohci->level) + qemu_set_irq(ohci->irq, 0); + + ohci->level = level; qemu_set_irq(ohci->irq, level); } diff --git a/hw/usb/hcd-ohci.h b/hw/usb/hcd-ohci.h index e1827227ac..6f82e72bd9 100644 --- a/hw/usb/hcd-ohci.h +++ b/hw/usb/hcd-ohci.h @@ -52,6 +52,7 @@ struct OHCIState { uint32_t ctl, status; uint32_t intr_status; uint32_t intr; + int level; /* memory pointer partition */ uint32_t hcca;