[0/4,SRU,Jammy/Unstable/OEM-5.17] Bolt doesn't work with native USB4 hosts

Message ID 20220407171739.1176275-1-vicamo.yang@canonical.com

Message

You-Sheng Yang April 7, 2022, 5:17 p.m. UTC
BugLink: https://bugs.launchpad.net/bugs/1962349

[Impact]

 * AMD Yellow Carp provides integrated USB4 host controllers
 * When plugging in a Thunderbolt3 or USB4 device, users are unable to authorize
   it using the GUI; authorization fails with the error message
   "parent not authorized, deferring"

[Test Plan]

AMD Yellow Carp Host (issue this bug is about)
----------------------------------------------
 * Plug a USB4 or TBT3 device into the AMD Yellow Carp host
 * Ensure that the PCI topology has been populated
 * Observe that /sys/bus/thunderbolt/devices/DEVICE/authorized is "0"
 * Try to run `boltctl enroll $UUID`
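
The sysfs check in the steps above can be sketched as a small script. This is
only an illustration, assuming each device directory under
/sys/bus/thunderbolt/devices exposes an "authorized" attribute as described in
the test plan; the function name read_authorized and its sysfs_root parameter
are hypothetical and not part of bolt or the kernel.

```python
from pathlib import Path


def read_authorized(sysfs_root="/sys/bus/thunderbolt/devices"):
    """Map each device name to the content of its 'authorized' attribute.

    Entries without an 'authorized' file (for example domain entries)
    are skipped. sysfs_root is a parameter so the function can be
    pointed at a scratch directory instead of the real sysfs tree.
    """
    states = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        # No Thunderbolt/USB4 bus registered on this machine.
        return states
    for dev in sorted(root.iterdir()):
        attr = dev / "authorized"
        if attr.is_file():
            states[dev.name] = attr.read_text().strip()
    return states


if __name__ == "__main__":
    for name, value in read_authorized().items():
        print(f"{name}: authorized={value}")
```

On an affected host, the newly plugged device would be expected to show
authorized=0 before `boltctl enroll $UUID` is attempted.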

Alpine Ridge / Titan Ridge host (discrete controller)
------------------------------------------------------
Start on a host with a discrete controller (Alpine Ridge or Titan Ridge).
1. sudo boltctl forget -a
2. Plug in the dock.
3. Make sure 'boltctl list' enumerates the dock.
4. Check /sys/bus/thunderbolt/devices/domain0/iommu_dma_protection (the value
   depends on the host):
   - If 0, try to enroll manually using 'boltctl enroll $UUID'.
   - If 1, ensure that the device was enrolled automatically by bolt.
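
The branch in step 4 can be sketched as follows. This is a minimal
illustration, assuming the iommu_dma_protection attribute holds "0" or "1" as
the test plan describes; the function name expected_enrollment, its domain_dir
parameter, and the returned labels are hypothetical.

```python
from pathlib import Path


def expected_enrollment(domain_dir="/sys/bus/thunderbolt/devices/domain0"):
    """Return which enrollment path the test plan expects for this host.

    'automatic' -> iommu_dma_protection is 1, bolt should enroll the device
    'manual'    -> iommu_dma_protection is 0, run 'boltctl enroll $UUID'
    'unknown'   -> attribute missing, unreadable, or holding another value
    """
    attr = Path(domain_dir) / "iommu_dma_protection"
    try:
        value = attr.read_text().strip()
    except OSError:
        return "unknown"
    if value == "1":
        return "automatic"
    if value == "0":
        return "manual"
    return "unknown"


if __name__ == "__main__":
    print(f"expected enrollment: {expected_enrollment()}")
```

The directory is a parameter so the check can be exercised against a scratch
directory rather than the real sysfs tree.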

GUI Check
---------
Ensure that devices show up in the Settings GUI and can now be authorized.
Note: on AMD platforms, enumerating PCIe devices is a separate problem from
bolt and is handled by the kernel. The GUI check is only about "authorization".

[Where problems could occur]

 * Intel USB4 or TBT3 hosts also use bolt. They could have a problem with the
   new version of bolt.
 * This is very unlikely, however, since there is a thorough test suite, and
   the entire industry has been using bolt on Intel controllers for a long
   time.
 * There haven't been any significant bugs reported upstream or in Ubuntu since
   the 0.9.1 release.

[Other Info]
 * This bug also occurs on Intel controllers from ICL, TGL or ADL, but in many
   cases devices are automatically authorized under an IOMMU DMA protection
   policy.
 * It is fixed in the bolt 0.9.1 release and later.
 * For the SRU, the 0.9.1 release will be backported from Impish.
 * I did look into backporting just the commit(s) for fixing this, but it's not
   a trivial backport. Quoting the changelog
   (https://gitlab.freedesktop.org/bolt/bolt/-/blob/master/CHANGELOG.md):
   "Additionally the unique_id of said host controller changes with every boot,
   which breaks one of the fundamental assumptions in boltd".

Mario Limonciello (4):
  thunderbolt: Retry DROM reads for more failure scenarios
  thunderbolt: Do not resume routers if UID is not set
  thunderbolt: Do not make DROM read success compulsory
  PCI/ACPI: Allow D3 only if Root Port can signal and wake from D3

 drivers/pci/pci-acpi.c       | 41 ++++++++++++++++++++++++++----------
 drivers/thunderbolt/eeprom.c | 17 +++++++++------
 drivers/thunderbolt/switch.c | 10 +++++----
 3 files changed, 46 insertions(+), 22 deletions(-)

Comments

Tim Gardner April 15, 2022, 3:52 p.m. UTC | #1
Acked-by: Tim Gardner <tim.gardner@canonical.com>

Patches 1-3 are merged upstream. Patch 4 is still in linux-next.

On 4/7/22 11:17, You-Sheng Yang wrote:
> BugLink: https://bugs.launchpad.net/bugs/1962349
Paolo Pisati April 19, 2022, 10:10 a.m. UTC | #2
On Fri, Apr 08, 2022 at 01:17:34AM +0800, You-Sheng Yang wrote:
> BugLink: https://bugs.launchpad.net/bugs/1962349
Timo Aaltonen April 22, 2022, 8:24 a.m. UTC | #3
You-Sheng Yang wrote on 7.4.2022 at 20:17:
> BugLink: https://bugs.launchpad.net/bugs/1962349

applied to oem-5.17, thanks
Andrea Righi July 19, 2022, 5:21 a.m. UTC | #4
On Fri, Apr 08, 2022 at 01:17:34AM +0800, You-Sheng Yang wrote:
> BugLink: https://bugs.launchpad.net/bugs/1962349

Already applied to kinetic/linux via upstream updates.

Thanks,
-Andrea