Problems (timeouts) when testing usb-ohci with qemu

Message ID	4664cf39-5dfe-4557-959d-149640ca36e1@roeck-us.net
State	New
Headers	show Return-Path: <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org> Date: Tue, 23 Apr 2024 10:04:17 -0700 From: Guenter Roeck <linux@roeck-us.net> To: Alan Stern <stern@rowland.harvard.edu> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>, linux-usb@vger.kernel.org, Gerd Hoffmann <kraxel@redhat.com>, qemu-devel@nongnu.org Subject: Problems (timeouts) when testing usb-ohci with qemu Message-ID: <4664cf39-5dfe-4557-959d-149640ca36e1@roeck-us.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Received-SPF: pass client-ip=2607:f8b0:4864:20::42e; envelope-from=groeck7@gmail.com; helo=mail-pf1-x42e.google.com X-Spam_score_int: -14 X-Spam_score: -1.5 X-Spam_bar: - X-Spam_report: (-1.5 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FORGED_FROMDOMAIN=0.001, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action Precedence: list Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Series	Problems (timeouts) when testing usb-ohci with qemu \| expand Problems (timeouts) when testing usb-ohci with qemu

Guenter Roeck April 23, 2024, 5:04 p.m. UTC

Hi,

when testing usb-ohci with qemu's pci-ohci emulation, I keep getting
random usb interface timeouts. Sometimes the usb_hub_wq times out.

[    9.555666] Waiting for root device /dev/sda...
[   62.452625] INFO: task kworker/0:2:42 blocked for more than 30 seconds.
[   62.453036]       Tainted: G                 N 6.9.0-rc1-00305-geae7a41d2233 #1
[   62.453393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[   62.453778] task:kworker/0:2     state:D stack:0     pid:42    tgid:42    ppid:2      flags:0x00000000
[   62.454700] Workqueue: usb_hub_wq hub_event
[   62.455137] Call Trace:
[   62.455416] [<c0b47836>] __schedule+0x35c/0xe00
[   62.455708] [<c0b4830c>] schedule+0x32/0x178
[   62.455906] [<c07da782>] usb_kill_urb+0xa8/0xda
[   62.456220] [<c07db53c>] usb_start_wait_urb+0xca/0xe2
[   62.456441] [<c07db5ee>] usb_control_msg+0x9a/0x102
[   62.456648] [<c07d3970>] hub_port_init+0x5de/0xb40
[   62.456851] [<c07d67fe>] hub_event+0xb90/0x1364
[   62.457049] [<c002ead6>] process_one_work+0x200/0x564

Sometimes there is an i/o scheduling timeout such as

[    6.361811] Run /sbin/init as init process
[   93.167039] INFO: task kworker/u4:0:10 blocked for more than 30 seconds.
[   93.167715]       Tainted: G                 N 6.9.0-rc5-00036-gaece0dd54838 #4
[   93.168169] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[   93.168607] task:kworker/u4:0    state:D stack:0     pid:10    tgid:10    ppid:2      flags:0x00000000
[   93.169602] Workqueue: scsi_tmf_0 scmd_eh_abort_handler
[   93.170278] Call Trace:
[   93.170584] [<c0b3a70e>] __schedule+0x358/0xd4e
[   93.170904] [<c0b3b136>] schedule+0x32/0x166
[   93.171161] [<c0b41d6c>] schedule_timeout+0xd8/0x10a
[   93.171420] [<c0b3ba0e>] __wait_for_common+0xce/0x1ce
[   93.171604] [<c0b3bb2a>] wait_for_completion+0x1c/0x24
[   93.171716] [<c0810a70>] command_abort_matching.part.0+0x38/0x52
[   93.171841] [<c0810ac0>] command_abort+0x36/0x70
[   93.171946] [<c06970a4>] scmd_eh_abort_handler+0xa6/0x192
...

This is not a new problem; I have seen it forever. Recently I spent some
time trying to understand the problem and found that the linux driver does
not always handle all ohci interupts. Since the interrupt is shared and
thus level triggered, that means that interrupts are still pending when
ohci_irq() exits. The interrupt core in Linux does not re-enter the
interrupt handler, presumably because it is marked as shared interrupt
and returns that the interrupt has been handled.

I found two possible fixes for the problem. One essentially mirrors the
code from ehci_irq(), the other adds a (bad) kludge into qemu. Both "fix"
or work around the problem.

Question is: What is actually wrong ? Something in the generic interrupt
handling code in Linux, something in the Linux usb-ohci driver, or
something in qemu ? Any idea how a proper fix might look like ?

Thanks,
Guenter

---
Linux hack:

Alan Stern April 23, 2024, 5:30 p.m. UTC | #1

On Tue, Apr 23, 2024 at 10:04:17AM -0700, Guenter Roeck wrote:
> Hi,
> 
> when testing usb-ohci

What is usb-ohci?  Do you mean ohci-hcd?

>  with qemu's pci-ohci emulation, I keep getting
> random usb interface timeouts. Sometimes the usb_hub_wq times out.
...

> Sometimes there is an i/o scheduling timeout such as
...

> This is not a new problem; I have seen it forever. Recently I spent some
> time trying to understand the problem and found that the linux driver does
> not always handle all ohci interupts.

Please be more specific: _Which_ interrupts aren't being handled?  That 
is, which flags remain set in the intrstatus register when the handler 
returns?

>  Since the interrupt is shared and
> thus level triggered, that means that interrupts are still pending when
> ohci_irq() exits. The interrupt core in Linux does not re-enter the
> interrupt handler, presumably because it is marked as shared interrupt
> and returns that the interrupt has been handled.

Isn't that behavior mistaken?  A level-triggered IRQ that remains set when 
it is re-enabled (when the various shared handlers return) should cause 
another interrupt to occur right away.

Edged-triggered interrupts would be a different story, of course.

> I found two possible fixes for the problem. One essentially mirrors the
> code from ehci_irq(), the other adds a (bad) kludge into qemu. Both "fix"
> or work around the problem.
> 
> Question is: What is actually wrong ? Something in the generic interrupt
> handling code in Linux, something in the Linux usb-ohci driver, or
> something in qemu ? Any idea how a proper fix might look like ?

To answer these questions we need more information.

Alan Stern

Guenter Roeck April 23, 2024, 9:10 p.m. UTC | #2

Hi Alan,

On 4/23/24 10:30, Alan Stern wrote:
> On Tue, Apr 23, 2024 at 10:04:17AM -0700, Guenter Roeck wrote:
>> Hi,
>>
>> when testing usb-ohci
> 
> What is usb-ohci?  Do you mean ohci-hcd?
> 
>>   with qemu's pci-ohci emulation, I keep getting
>> random usb interface timeouts. Sometimes the usb_hub_wq times out.
> ...
> 
>> Sometimes there is an i/o scheduling timeout such as
> ...
> 
>> This is not a new problem; I have seen it forever. Recently I spent some
>> time trying to understand the problem and found that the linux driver does
>> not always handle all ohci interupts.
> 
> Please be more specific: _Which_ interrupts aren't being handled?  That
> is, which flags remain set in the intrstatus register when the handler
> returns?
> 
Sorry, I didn't think it was relevant.

Here is the result of repeating the test with the linux patch several times
and passing the log output through "sort | uniq".

       1 pending 0x2 mask 0x8000005a
       9 pending 0x4 mask 0x8000001e
     208 pending 0x4 mask 0x8000005e

So it is almost always OHCI_INTR_SF and rarely OHCI_INTR_WDH.
For reference, this is repeatedly running

CMDLINE="panic=-1 kunit.stats_enabled=2 kunit.filter=speed>slow root=/dev/sda rootwait console=ttyS0,115200 earlycon=uart8250,mmio,0x10000000,115200"
qemu-system-riscv32 -M virt -m 512M \
      -no-reboot -bios default -kernel arch/riscv/boot/Image -snapshot \
      -device virtio-net-device,netdev=net0 -netdev user,id=net0 -usb \
      -device pci-ohci,id=ohci -device usb-storage,bus=ohci.0,drive=d0 \
      -drive file=rootfs.ext2,if=none,id=d0,format=raw \
      -append "${CMDLINE}" \
      -nographic -monitor none \
      --trace 'usb*' -D "/tmp/usb-ohci.$$.${sequence}.trace.log"

Tracing isn't really necessary but it changes timing enough that
the problem is more likely to be seen if it is active. The problem
is seen with various emulations; I just picked one of them.

>>   Since the interrupt is shared and
>> thus level triggered, that means that interrupts are still pending when
>> ohci_irq() exits. The interrupt core in Linux does not re-enter the
>> interrupt handler, presumably because it is marked as shared interrupt
>> and returns that the interrupt has been handled.
> 
> Isn't that behavior mistaken?  A level-triggered IRQ that remains set when
> it is re-enabled (when the various shared handlers return) should cause
> another interrupt to occur right away.
> 
> Edged-triggered interrupts would be a different story, of course.
> 

Maybe I got it wrong; I thought that shared interrupts would have to be
level triggered. But then you are correct: one would think that level
triggered interrupts would keep executing interrupt handlers until the
interrupt is completely handled. I guess that means that I don't really
understand what is happening. Sorry for jumping to conclusions.

>> I found two possible fixes for the problem. One essentially mirrors the
>> code from ehci_irq(), the other adds a (bad) kludge into qemu. Both "fix"
>> or work around the problem.
>>
>> Question is: What is actually wrong ? Something in the generic interrupt
>> handling code in Linux, something in the Linux usb-ohci driver, or
>> something in qemu ? Any idea how a proper fix might look like ?
> 
> To answer these questions we need more information.
> 
What else would you need ?

Thanks,
Guenter

Alan Stern April 24, 2024, 2:11 a.m. UTC | #3

On Tue, Apr 23, 2024 at 02:10:13PM -0700, Guenter Roeck wrote:
> Hi Alan,
> 
> On 4/23/24 10:30, Alan Stern wrote:
> > On Tue, Apr 23, 2024 at 10:04:17AM -0700, Guenter Roeck wrote:
> > > Hi,
> > > 
> > > when testing usb-ohci
> > 
> > What is usb-ohci?  Do you mean ohci-hcd?
> > 
> > >   with qemu's pci-ohci emulation, I keep getting
> > > random usb interface timeouts. Sometimes the usb_hub_wq times out.
> > ...
> > 
> > > Sometimes there is an i/o scheduling timeout such as
> > ...
> > 
> > > This is not a new problem; I have seen it forever. Recently I spent some
> > > time trying to understand the problem and found that the linux driver does
> > > not always handle all ohci interupts.
> > 
> > Please be more specific: _Which_ interrupts aren't being handled?  That
> > is, which flags remain set in the intrstatus register when the handler
> > returns?
> > 
> Sorry, I didn't think it was relevant.
> 
> Here is the result of repeating the test with the linux patch several times
> and passing the log output through "sort | uniq".
> 
>       1 pending 0x2 mask 0x8000005a
>       9 pending 0x4 mask 0x8000001e
>     208 pending 0x4 mask 0x8000005e
> 
> So it is almost always OHCI_INTR_SF and rarely OHCI_INTR_WDH.

I see.  Both of those flags can legitimately get turned on while the 
interrupt routine is running, and under heavy load this could happen fairly 
frequently.  The SF (Start of Frame) interrupt occurs once every 
millisecond, and the WDH (Writeback Done Head) interrupt occurs more or 
less when a transfer completes.

> For reference, this is repeatedly running
> 
> CMDLINE="panic=-1 kunit.stats_enabled=2 kunit.filter=speed>slow root=/dev/sda rootwait console=ttyS0,115200 earlycon=uart8250,mmio,0x10000000,115200"
> qemu-system-riscv32 -M virt -m 512M \
>      -no-reboot -bios default -kernel arch/riscv/boot/Image -snapshot \
>      -device virtio-net-device,netdev=net0 -netdev user,id=net0 -usb \
>      -device pci-ohci,id=ohci -device usb-storage,bus=ohci.0,drive=d0 \
>      -drive file=rootfs.ext2,if=none,id=d0,format=raw \
>      -append "${CMDLINE}" \
>      -nographic -monitor none \
>      --trace 'usb*' -D "/tmp/usb-ohci.$$.${sequence}.trace.log"
> 
> Tracing isn't really necessary but it changes timing enough that
> the problem is more likely to be seen if it is active. The problem
> is seen with various emulations; I just picked one of them.
> 
> > >   Since the interrupt is shared and
> > > thus level triggered, that means that interrupts are still pending when
> > > ohci_irq() exits. The interrupt core in Linux does not re-enter the
> > > interrupt handler, presumably because it is marked as shared interrupt
> > > and returns that the interrupt has been handled.
> > 
> > Isn't that behavior mistaken?  A level-triggered IRQ that remains set when
> > it is re-enabled (when the various shared handlers return) should cause
> > another interrupt to occur right away.
> > 
> > Edged-triggered interrupts would be a different story, of course.
> > 
> 
> Maybe I got it wrong; I thought that shared interrupts would have to be
> level triggered. But then you are correct: one would think that level
> triggered interrupts would keep executing interrupt handlers until the
> interrupt is completely handled. I guess that means that I don't really
> understand what is happening. Sorry for jumping to conclusions.

To avoid the overhead of repeated interrupts, it would be best to check the 
interrupt status at the end of the routine and restart if any of the 
enabled bits are set, as in your first patch.

If you would like to clean it up (get rid of the debugging stuff) and 
submit it, I'll review it.

> > > I found two possible fixes for the problem. One essentially mirrors the
> > > code from ehci_irq(), the other adds a (bad) kludge into qemu. Both "fix"
> > > or work around the problem.
> > > 
> > > Question is: What is actually wrong ? Something in the generic interrupt
> > > handling code in Linux, something in the Linux usb-ohci driver, or
> > > something in qemu ? Any idea how a proper fix might look like ?
> > 
> > To answer these questions we need more information.
> > 
> What else would you need ?

Nothing else really, just the remaining interrupt flag bits.

Alan Stern

Gerd Hoffmann April 24, 2024, 11:16 a.m. UTC | #4

> qemu hack:
> 
>  hw/usb/hcd-ohci.c | 11 +++++++++++
>  hw/usb/hcd-ohci.h |  1 +
>  2 files changed, 12 insertions(+)
> 
> diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
> index fc8fc91a1d..99e52ad13a 100644
> --- a/hw/usb/hcd-ohci.c
> +++ b/hw/usb/hcd-ohci.c
> @@ -267,6 +267,10 @@ static inline void ohci_intr_update(OHCIState *ohci)
>          (ohci->intr_status & ohci->intr))
>          level = 1;
>  
> +    if (level && ohci->level)
> +        qemu_set_irq(ohci->irq, 0);
> +
> +    ohci->level = level;
>      qemu_set_irq(ohci->irq, level);
>  }
>  
> diff --git a/hw/usb/hcd-ohci.h b/hw/usb/hcd-ohci.h
> index e1827227ac..6f82e72bd9 100644
> --- a/hw/usb/hcd-ohci.h
> +++ b/hw/usb/hcd-ohci.h
> @@ -52,6 +52,7 @@ struct OHCIState {
>      uint32_t ctl, status;
>      uint32_t intr_status;
>      uint32_t intr;
> +    int level;
>  
>      /* memory pointer partition */
>      uint32_t hcca;

Phew.  Disclaimer: Havn't looked at the ohci emulation code for years.

It should not be needed to store the interrupt level that way.  It is
possible to calculate what the interrupt level should be, based on the
interrupt status register and the interrupt mask register, and the code
above seems to do exactly that (the "ohci->intr_status & ohci->intr"
bit).

ohci_intr_update() must be called if one of these two registers changes,
i.e. if the guest changes the mask, if the guest acks an IRQ by clearing
an status bit, if the device raises an IRQ by setting an status bit.
Might be the ohci emulation has a bug here.

Another possible trouble spot is that the timing behavior is different
on virtual vs. physical hardware.  Specifically with the emulated
hardware some actions appear to complete instantly (when the vmexit to
handle the mmio register write returns it's finished already), which
will never complete that quickly on physical hardware.  So drivers can
have race conditions which only trigger on virtual hardware.  The error
pattern you are describing sounds like this could be the case here.

HTH & take care,
  Gerd

Guenter Roeck April 24, 2024, 3:11 p.m. UTC | #5

On 4/23/24 19:11, Alan Stern wrote:
[ ... ]
> 
> To avoid the overhead of repeated interrupts, it would be best to check the
> interrupt status at the end of the routine and restart if any of the
> enabled bits are set, as in your first patch.
> 
> If you would like to clean it up (get rid of the debugging stuff) and
> submit it, I'll review it.
> 
Sure, I'll do that.

Thanks,
Guenter

Guenter Roeck April 24, 2024, 3:23 p.m. UTC | #6

On 4/24/24 04:16, Gerd Hoffmann wrote:
>> qemu hack:
>>
>>   hw/usb/hcd-ohci.c | 11 +++++++++++
>>   hw/usb/hcd-ohci.h |  1 +
>>   2 files changed, 12 insertions(+)
>>
>> diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
>> index fc8fc91a1d..99e52ad13a 100644
>> --- a/hw/usb/hcd-ohci.c
>> +++ b/hw/usb/hcd-ohci.c
>> @@ -267,6 +267,10 @@ static inline void ohci_intr_update(OHCIState *ohci)
>>           (ohci->intr_status & ohci->intr))
>>           level = 1;
>>   
>> +    if (level && ohci->level)
>> +        qemu_set_irq(ohci->irq, 0);
>> +
>> +    ohci->level = level;
>>       qemu_set_irq(ohci->irq, level);
>>   }
>>   
>> diff --git a/hw/usb/hcd-ohci.h b/hw/usb/hcd-ohci.h
>> index e1827227ac..6f82e72bd9 100644
>> --- a/hw/usb/hcd-ohci.h
>> +++ b/hw/usb/hcd-ohci.h
>> @@ -52,6 +52,7 @@ struct OHCIState {
>>       uint32_t ctl, status;
>>       uint32_t intr_status;
>>       uint32_t intr;
>> +    int level;
>>   
>>       /* memory pointer partition */
>>       uint32_t hcca;
> 
> Phew.  Disclaimer: Havn't looked at the ohci emulation code for years.
> 
> It should not be needed to store the interrupt level that way.  It is
> possible to calculate what the interrupt level should be, based on the
> interrupt status register and the interrupt mask register, and the code
> above seems to do exactly that (the "ohci->intr_status & ohci->intr"
> bit).
> 

You are correct. For the purpose of this kludge a simpler
+	qemu_set_irq(ohci->irq, 0);
	qemu_set_irq(ohci->irq, level);

would have been sufficient. My original code added tracing,
where this generated a lot of noise. I didn't completely simplify
the kludge. Sorry for that and for any confusion it may have caused.

> ohci_intr_update() must be called if one of these two registers changes,
> i.e. if the guest changes the mask, if the guest acks an IRQ by clearing
> an status bit, if the device raises an IRQ by setting an status bit.
> Might be the ohci emulation has a bug here.
> 
> Another possible trouble spot is that the timing behavior is different
> on virtual vs. physical hardware.  Specifically with the emulated
> hardware some actions appear to complete instantly (when the vmexit to
> handle the mmio register write returns it's finished already), which
> will never complete that quickly on physical hardware.  So drivers can
> have race conditions which only trigger on virtual hardware.  The error
> pattern you are describing sounds like this could be the case here.
> 

I think the underlying problem is that both the qemu emulation and
the Linux kernel driver expect that the interrupt is level triggered.
It looks like some entity in between handles the interrupts as edge
triggered. This makes the kludge necessary: All it does is to generate
an artificial interrupt edge.

This can be worked around in the Linux interrupt handler by checking
if another interrupt arrived while the original interrupt was handled.
This will ensure that all interrupts are handled and cleared when the
handler exits, and that a later arriving interrupt will generate the
necessary edge and thus another interrupt. That doesn't fix the
edge<->level triggered interrupt confusion (if that is indeed the root
cause of the problem), but it does address its consequences.

If anyone has an idea how to find out where the interrupt confusion
happens, please let me know, and I'll be happy to do some more testing.
I am quite curious myself, and it would make sense to solve the problem
at its root.

Thanks,
Guenter

Guenter Roeck April 24, 2024, 5:05 p.m. UTC | #7

On 4/24/24 08:23, Guenter Roeck wrote:
> On 4/24/24 04:16, Gerd Hoffmann wrote:
>>> qemu hack:
>>>
>>>   hw/usb/hcd-ohci.c | 11 +++++++++++
>>>   hw/usb/hcd-ohci.h |  1 +
>>>   2 files changed, 12 insertions(+)
>>>
>>> diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
>>> index fc8fc91a1d..99e52ad13a 100644
>>> --- a/hw/usb/hcd-ohci.c
>>> +++ b/hw/usb/hcd-ohci.c
>>> @@ -267,6 +267,10 @@ static inline void ohci_intr_update(OHCIState *ohci)
>>>           (ohci->intr_status & ohci->intr))
>>>           level = 1;
>>> +    if (level && ohci->level)
>>> +        qemu_set_irq(ohci->irq, 0);
>>> +
>>> +    ohci->level = level;
>>>       qemu_set_irq(ohci->irq, level);
>>>   }
>>> diff --git a/hw/usb/hcd-ohci.h b/hw/usb/hcd-ohci.h
>>> index e1827227ac..6f82e72bd9 100644
>>> --- a/hw/usb/hcd-ohci.h
>>> +++ b/hw/usb/hcd-ohci.h
>>> @@ -52,6 +52,7 @@ struct OHCIState {
>>>       uint32_t ctl, status;
>>>       uint32_t intr_status;
>>>       uint32_t intr;
>>> +    int level;
>>>       /* memory pointer partition */
>>>       uint32_t hcca;
>>
>> Phew.  Disclaimer: Havn't looked at the ohci emulation code for years.
>>
>> It should not be needed to store the interrupt level that way.  It is
>> possible to calculate what the interrupt level should be, based on the
>> interrupt status register and the interrupt mask register, and the code
>> above seems to do exactly that (the "ohci->intr_status & ohci->intr"
>> bit).
>>
> 
> You are correct. For the purpose of this kludge a simpler
> +    qemu_set_irq(ohci->irq, 0);
>      qemu_set_irq(ohci->irq, level);
> 
> would have been sufficient. My original code added tracing,
> where this generated a lot of noise. I didn't completely simplify
> the kludge. Sorry for that and for any confusion it may have caused.
> 
>> ohci_intr_update() must be called if one of these two registers changes,
>> i.e. if the guest changes the mask, if the guest acks an IRQ by clearing
>> an status bit, if the device raises an IRQ by setting an status bit.
>> Might be the ohci emulation has a bug here.
>>
>> Another possible trouble spot is that the timing behavior is different
>> on virtual vs. physical hardware.  Specifically with the emulated
>> hardware some actions appear to complete instantly (when the vmexit to
>> handle the mmio register write returns it's finished already), which
>> will never complete that quickly on physical hardware.  So drivers can
>> have race conditions which only trigger on virtual hardware.  The error
>> pattern you are describing sounds like this could be the case here.
>>
> 
> I think the underlying problem is that both the qemu emulation and
> the Linux kernel driver expect that the interrupt is level triggered.
> It looks like some entity in between handles the interrupts as edge
> triggered. This makes the kludge necessary: All it does is to generate
> an artificial interrupt edge.
> 
> This can be worked around in the Linux interrupt handler by checking
> if another interrupt arrived while the original interrupt was handled.
> This will ensure that all interrupts are handled and cleared when the
> handler exits, and that a later arriving interrupt will generate the
> necessary edge and thus another interrupt. That doesn't fix the
> edge<->level triggered interrupt confusion (if that is indeed the root
> cause of the problem), but it does address its consequences.
> 
> If anyone has an idea how to find out where the interrupt confusion
> happens, please let me know, and I'll be happy to do some more testing.
> I am quite curious myself, and it would make sense to solve the problem
> at its root.
> 
Update:

I found upstream commit 0b60557230ad ("usb: ehci: Prevent missed ehci
interrupts with edge-triggered MSI") which I think explains the problem.

Guenter

Problems (timeouts) when testing usb-ohci with qemu

Commit Message

Comments

Patch