[RFT,v1,0/5] usb: Improve robustness of ehci-hcd controller operation

Message ID 20200322130031.10455-1-lukma@denx.de

Message

Lukasz Majewski March 22, 2020, 1 p.m. UTC
This patch set is primarily a request for testing (and a starting point for
discussion), as it may improve the robustness of USB with some pendrives - and
yes, it sacrifices some performance for reliability.
The previous version of this patch: https://patchwork.ozlabs.org/patch/1244928/
fixed an issue with some USB network adapters and improved stability on TI
boards. This patch also provides a very detailed explanation of the problem in
the commit message.

With the async support patch applied
(SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6), the qhtoken variable
has the value 0x00 when the token shows errors. As a result, the error handling
path is not executed. This looks like missing or broken cache flushing - for
easier bisecting, this patch has been reverted for now.
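
To illustrate why a read-back value of 0x00 hides the failure, here is a
minimal sketch of a status check on a qTD token. It is not the code from
ehci-hcd.c; the names classify_token, xfer_result and TOK_STATUS_* are
hypothetical. Only the bit positions are taken from the EHCI specification
(status field in bits 7:0 of the token).

```c
#include <stdint.h>

/* qTD token status bits (bits 7:0) as laid out in the EHCI specification. */
#define TOK_STATUS_ACTIVE	0x80u
#define TOK_STATUS_HALTED	0x40u
#define TOK_STATUS_DBUFERR	0x20u
#define TOK_STATUS_BABBLE	0x10u
#define TOK_STATUS_XACTERR	0x08u
#define TOK_STATUS_MISSEDUF	0x04u

#define TOK_STATUS_ERRORS	(TOK_STATUS_HALTED | TOK_STATUS_DBUFERR | \
				 TOK_STATUS_BABBLE | TOK_STATUS_XACTERR | \
				 TOK_STATUS_MISSEDUF)

enum xfer_result { XFER_OK, XFER_PENDING, XFER_ERROR };

/* Hypothetical helper: classify a transfer from the token the CPU read back. */
static enum xfer_result classify_token(uint32_t token)
{
	uint32_t status = token & 0xffu;

	if (status & TOK_STATUS_ACTIVE)
		return XFER_PENDING;	/* controller still owns the qTD */
	if (status & TOK_STATUS_ERRORS)
		return XFER_ERROR;	/* error handling path would run here */
	return XFER_OK;			/* no status bit set: looks like success */
}
```

With a check of this shape, a token of 0x00 has neither the Active bit nor any
error bit set, so it is indistinguishable from a clean completion. If the CPU
sees 0x00 instead of what the controller wrote, the error branch cannot be
taken.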


Test setup:
===========
Data:
VFAT on the pendrive
16MiB FitImage on USB pen drive (to be read in U-Boot)
400 MiB recovery image (but not read)

Pen drives:
-----------
1. ID 0930:6544 Toshiba Corp. TransMemory-Mini / Kingston DataTraveler 2.0 Stick
2. ID 21c4:8005 GOODRAM 8GB

HW:
---
IMX6Q -> TPC70 board

Procedure:
----------
Boot U-Boot, execute: loadusb=usb start; fatload usb 0 ${loadaddr} ${upd_image}
(The image has been crafted to oops, and the WDT causes a reset after 4 seconds.)
When the load fails, the "normal" execution path is followed and we boot up to
the prompt.

Results:
========

1. Current mainline - SHA1: 63b2ef407de2f0997deef3b54b7e7ab9c7a7cb27
- The USB error (EHCI timed out on TD - token=0x80008d80) is present after ~10
  minutes - pendrive [1]

- Error after a few minutes (1,2) - pendrive [2]

2. When USB async support is reverted (commit
SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6 [*])
- Error is detected after ~1h - pendrive [1]


3. With this patchset applied
- 12h of testing - no error - pendrive [1]
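
For readers who want to interpret the token value reported in result 1
(token=0x80008d80), the following sketch splits a qTD token into its fields.
The field layout follows the EHCI specification; the struct and function names
(qtd_token_fields, decode_qtd_token) are hypothetical and do not exist in
U-Boot.

```c
#include <stdint.h>

/* Fields of an EHCI qTD token, per the EHCI specification. */
struct qtd_token_fields {
	unsigned int status;		/* bits 7:0, bit 7 = Active */
	unsigned int pid;		/* bits 9:8, 0 = OUT, 1 = IN, 2 = SETUP */
	unsigned int cerr;		/* bits 11:10, error counter */
	unsigned int c_page;		/* bits 14:12, current page */
	unsigned int ioc;		/* bit 15, interrupt on complete */
	unsigned int total_bytes;	/* bits 30:16, bytes left to transfer */
	unsigned int data_toggle;	/* bit 31 */
};

/* Hypothetical helper: split a raw 32-bit token into its fields. */
static struct qtd_token_fields decode_qtd_token(uint32_t token)
{
	struct qtd_token_fields f;

	f.status = token & 0xffu;
	f.pid = (token >> 8) & 0x3u;
	f.cerr = (token >> 10) & 0x3u;
	f.c_page = (token >> 12) & 0x7u;
	f.ioc = (token >> 15) & 0x1u;
	f.total_bytes = (token >> 16) & 0x7fffu;
	f.data_toggle = (token >> 31) & 0x1u;

	return f;
}
```

Applied to 0x80008d80 this gives status 0x80 (Active set, no error bits), PID
IN, CERR 3, IOC set, a Total Bytes field of 0 and data toggle 1. In other
words, the token carries no error flag; the qTD was simply still marked active
when the timeout expired.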

With pendrive [2] I observe that it is very often not recognized at all.
Even stranger - the reliability of being recognized differs between identical
pendrives (a used one vs. a freshly unboxed one).



Lukasz Majewski (5):
  Revert "usb: ehci-hcd: Keep async schedule running"
  usb: Handle XACTERR error in DATA phase of USB storage
  usb: Add some delay to wait for slow USB devices to be operational
  usb: Provide code to handle spinup of USB usb devices (mostly HDDs)
  usb: Handle QT_TOKEN_STATUS_XACTERR error when sending data

 common/usb.c                | 10 ++++-
 common/usb_storage.c        | 46 ++++++++++++++++++++++
 drivers/usb/host/ehci-hcd.c | 78 ++++++++++++++++++++++++++-----------
 include/usb.h               |  1 +
 include/usb_defs.h          |  1 +
 5 files changed, 112 insertions(+), 24 deletions(-)

Comments

Tom Rini March 23, 2020, 8:58 p.m. UTC | #1
On Sun, Mar 22, 2020 at 02:00:26PM +0100, Lukasz Majewski wrote:

> This patch set is rather a request for testing (and a starting point for the
> discussion), as it may improve the robustness of USB with some pendrives - and
> yes sacrifice some performance for reliability.
> The previous version of this patch: https://patchwork.ozlabs.org/patch/1244928/
> fixed issue for some network USB adapters and improved stability on TI boards.
> This patch also provides very detailed explanation of the problem in the commit
> message.
> 
> With the async support patch applied (
> SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6 ), the qhtoken variable
> has value 0x00 when token shows errors. As a result the error handling path
> is not executed. This looks like some missing/broken cache flushing - for easier
> bisecting this patch has been reverted for now

Note that while the original patch returned USB ethernet on the
Beagleboard to a functional state, with this series applied it's back to
non-functional.
Lukasz Majewski March 23, 2020, 10:11 p.m. UTC | #2
Hi Tom,

> On Sun, Mar 22, 2020 at 02:00:26PM +0100, Lukasz Majewski wrote:
> 
> > This patch set is rather a request for testing (and a starting
> > point for the discussion), as it may improve the robustness of USB
> > with some pendrives - and yes sacrifice some performance for
> > reliability. The previous version of this patch:
> > https://patchwork.ozlabs.org/patch/1244928/ fixed issue for some
> > network USB adapters and improved stability on TI boards. This
> > patch also provides very detailed explanation of the problem in the
> > commit message.
> > 
> > With the async support patch applied (
> > SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6 ), the qhtoken
> > variable has value 0x00 when token shows errors. As a result the
> > error handling path is not executed. This looks like some
> > missing/broken cache flushing - for easier bisecting this patch has
> > been reverted for now  
> 
> Note that while the original patch returned USB ethernet on the
> Beagleboard to a functional state with this series applied it's back
> to non-functional.
> 

Thanks for testing.

The _only_ difference between the first version of this patch and this
one is that the latter lacks the dynamic reduction of the transfer size.
The former reduces the transfer size to 64 blocks (instead of the default
240), and with 64 blocks it retries the transmission twice.
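
The behaviour of the first version can be sketched roughly as below. This is
an illustration under assumptions, not the code from that patch: read_blocks,
xfer_fn, fake_dev and fake_xfer are hypothetical names, and the exact retry
accounting (one attempt at 64 blocks plus two retries per chunk) is my reading
of the description above.

```c
#include <stddef.h>

#define DEFAULT_MAX_XFER_BLK	240	/* default transfer size in blocks */
#define REDUCED_MAX_XFER_BLK	64	/* size used after the first failure */
#define REDUCED_RETRIES		2	/* retries once the size is reduced */

/* Hypothetical transport callback: returns 0 on success, non-zero on error. */
typedef int (*xfer_fn)(unsigned long start, unsigned int blocks, void *ctx);

/* Read 'count' blocks, shrinking the chunk size after the first failure. */
static int read_blocks(xfer_fn xfer, void *ctx, unsigned long start,
		       unsigned long count)
{
	unsigned int max_blk = DEFAULT_MAX_XFER_BLK;
	int retries_left = REDUCED_RETRIES;

	while (count > 0) {
		unsigned int n = count < max_blk ? (unsigned int)count : max_blk;

		if (xfer(start, n, ctx) == 0) {
			start += n;
			count -= n;
			retries_left = REDUCED_RETRIES;	/* fresh budget per chunk */
			continue;
		}
		if (max_blk > REDUCED_MAX_XFER_BLK) {
			max_blk = REDUCED_MAX_XFER_BLK;	/* first failure: shrink */
			continue;
		}
		if (retries_left-- > 0)
			continue;			/* retry the same chunk */
		return -1;				/* give up */
	}
	return 0;
}

/* Stand-in device for experiments: fails any transfer above 'max_ok' blocks. */
struct fake_dev {
	unsigned int max_ok;
	unsigned int calls;
};

static int fake_xfer(unsigned long start, unsigned int blocks, void *ctx)
{
	struct fake_dev *dev = ctx;

	(void)start;
	dev->calls++;
	return blocks > dev->max_ok ? -1 : 0;
}
```

With a stand-in device that only accepts up to 64 blocks per transfer, a
240-block read fails once at the default size and then completes in chunks of
64, 64, 64 and 48 blocks. The series under test here lacks this fallback.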


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de