mbox series

[v5,00/12] Fix USB suspend on TI J7200 (cdns3-ti, cdns3, xhci)

Message ID 20240726-s2r-cdns-v5-0-8664bfb032ac@bootlin.com
Headers show
Series Fix USB suspend on TI J7200 (cdns3-ti, cdns3, xhci) | expand

Message

Théo Lebrun July 26, 2024, 6:17 p.m. UTC
Currently, system-wide suspend is broken on J7200 because of a
controller reset. The TI wrapper does not get re-initialised at resume
and the first register access from cdns core fails.

We address that in a few ways:
 - In cdns3-ti, if a reset has occured at resume, we reconfigure the HW.
 - We pass the XHCI_RESET_ON_RESUME quirk, meaning the XHCI core expects
   a resume.
 - We add a xhci->lost_power flag.

The previous revision had one big issue: we had to know if
reset-on-resume was true, at probe-time. This is where the main
difference with previous revisions is. We now pass the information from
wrapper devices back up into XHCI. The xhci->lost_power flag gets its
default value from the XHCI_RESET_ON_RESUME quirk. It however allows
wrappers to signal *at resume* if they still expect a reset.

That means wrappers that are unsure if they will reset should:
 - (1) set the quirk at probe and,
 - (2) potentially set xhci->lost_power to false at resume.

We implement that for cdns3, by piggybacking on the host role ->resume()
callback already receives the information from its caller.

Have a nice day,
Théo

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
Changes in v5:
- dt-bindings: take Reviewed-by Rob and Conor for the first
  patch: "dt-bindings: usb: ti,j721e-usb: fix compatible list".
- cdns3-ti:
  - We now do have HW init code inside cdns_ti_reset_and_init_hw().
  - It gets called at probe unconditionally and from ->runtime_resume()
    if a reset is detected (using the W1 register).
  - Auxdata patches have been reworked now that there is default auxdata
    since commit b50a2da03bd9 ("usb: cdns3-ti: Add workaround for
    Errata i2409"). We now have a patch that moves auxdata to match
    data: "usb: cdns3-ti: grab auxdata from match data".
- cdns3/xhci: those are three new patches.
  - First, we rename "hibernated" to "lost_power" in arguments to
    the role ->resume() callbacks.
  - Then we add the xhci->lost_power flag, and only have it always copy
    the value from XHCI_RESET_ON_RESUME.
  - Finally, we set the flag from the host role driver.
- Link to v4: https://lore.kernel.org/lkml/20240307-j7200-usb-suspend-v4-0-5ec7615431f3@bootlin.com/

Changes in v4:
- dt-bindings: usb: ti,j721e-usb:
  - Remove ti,am64-usb single compatible entry.
  - Reverse ordering of compatible pair j721e + am64
    (becoming am64 + j721e).
  - Add j7200 + j721e compatible pair (versus only j7200). It is the
    same thing as am64: only the integration differs with base j721e
    compatible.
  - NOT taking trailers from Conor as patches changed substantially.
- arm64: dts: ti: j3-j7200:
  - Use j7200 + j721e compatible pair (versus only j7200 previously).
- arm64: dts: ti: j3-am64:
  - Fix to use am64 + j721e compatible pair (versus only am64).
    This is a new patch.
- Link to v3: https://lore.kernel.org/r/20240223-j7200-usb-suspend-v3-0-b41c9893a130@bootlin.com

Changes in v3:
- dt-bindings: use an enum to list compatibles instead of the previous
  odd construct. This is done in a separate patch from the one adding
  J7200 compatible.
- dt-bindings: dropped Acked-by Conor as the changes were modified a lot.
- Add runtime PM back. Put the init sequence in ->runtime_resume(). It
  gets called at probe for all compatibles and at resume for J7200.
- Introduce a cdns_ti_match_data struct rather than rely on compatible
  from code.
- Reorder code changes. Add infrastructure based on match data THEN add
  compatible and its match data.
- DTSI: use only J7200 compatible rather than both J7200 then J721E.
- Link to v2: https://lore.kernel.org/r/20231120-j7200-usb-suspend-v2-0-038c7e4a3df4@bootlin.com

Changes in v2:
- Remove runtime PM from cdns3-ti; it brings nothing. That means our
  cdns3-ti suspend/resume patch is simpler; there is no need to handle
  runtime PM at suspend/resume.
- Do not add cdns3 host role suspend/resume callbacks; they are not
  needed as core detects reset on resume & calls cdns_drd_host_on when
  needed.
- cdns3-ti: Move usb2_refclk_rate_code assignment closer to the value
  computation.
- cdns3/host.c: do not pass XHCI_SUSPEND_RESUME_CLKS quirk to xHCI; it
  is unneeded on our platform.
- Link to v1: https://lore.kernel.org/r/20231113-j7200-usb-suspend-v1-0-ad1ee714835c@bootlin.com

---
Théo Lebrun (12):
      dt-bindings: usb: ti,j721e-usb: fix compatible list
      dt-bindings: usb: ti,j721e-usb: add ti,j7200-usb compatible
      usb: cdns3-ti: move reg writes to separate function
      usb: cdns3-ti: run HW init at resume() if HW was reset
      usb: cdns3: add quirk to platform data for reset-on-resume
      usb: cdns3-ti: grab auxdata from match data
      usb: cdns3-ti: add J7200 support with reset-on-resume behavior
      usb: cdns3: rename hibernated argument of role->resume() to lost_power
      xhci: introduce xhci->lost_power flag
      usb: cdns3: host: transmit lost_power signal from wrapper to XHCI
      arm64: dts: ti: k3-j7200: use J7200-specific USB compatible
      arm64: dts: ti: k3-am64: add USB fallback compatible to J721E

 .../devicetree/bindings/usb/ti,j721e-usb.yaml      |   5 +-
 arch/arm64/boot/dts/ti/k3-am64-main.dtsi           |   2 +-
 arch/arm64/boot/dts/ti/k3-j7200-main.dtsi          |   2 +-
 drivers/usb/cdns3/cdns3-gadget.c                   |   4 +-
 drivers/usb/cdns3/cdns3-ti.c                       | 151 ++++++++++++++-------
 drivers/usb/cdns3/cdnsp-gadget.c                   |   2 +-
 drivers/usb/cdns3/core.h                           |   3 +-
 drivers/usb/cdns3/host.c                           |  13 ++
 drivers/usb/host/xhci.c                            |   8 +-
 drivers/usb/host/xhci.h                            |   6 +
 10 files changed, 136 insertions(+), 60 deletions(-)
---
base-commit: c33ffdb70cc6df4105160f991288e7d2567d7ffa
change-id: 20240726-s2r-cdns-4b180cd960ff

Best regards,

Comments

Roger Quadros Aug. 3, 2024, 3:14 p.m. UTC | #1
Hi,

On 26/07/2024 21:17, Théo Lebrun wrote:
> Currently, system-wide suspend is broken on J7200 because of a
> controller reset. The TI wrapper does not get re-initialised at resume
> and the first register access from cdns core fails.
> 
> We address that in a few ways:
>  - In cdns3-ti, if a reset has occured at resume, we reconfigure the HW.
>  - We pass the XHCI_RESET_ON_RESUME quirk, meaning the XHCI core expects
>    a resume.

OK.
>  - We add a xhci->lost_power flag.

Why?

> 
> The previous revision had one big issue: we had to know if
> reset-on-resume was true, at probe-time. This is where the main

Don't we already know this at probe-time? why not just rely on the existing
XHCI_RESET_ON_RESUME qurik, than add a new mechanism?

> difference with previous revisions is. We now pass the information from
> wrapper devices back up into XHCI. The xhci->lost_power flag gets its
> default value from the XHCI_RESET_ON_RESUME quirk. It however allows
> wrappers to signal *at resume* if they still expect a reset.
> 
> That means wrappers that are unsure if they will reset should:
>  - (1) set the quirk at probe and,
>  - (2) potentially set xhci->lost_power to false at resume.
> 
> We implement that for cdns3, by piggybacking on the host role ->resume()
> callback already receives the information from its caller.
> 
> Have a nice day,
> Théo
> 
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
> ---
> Changes in v5:
> - dt-bindings: take Reviewed-by Rob and Conor for the first
>   patch: "dt-bindings: usb: ti,j721e-usb: fix compatible list".
> - cdns3-ti:
>   - We now do have HW init code inside cdns_ti_reset_and_init_hw().
>   - It gets called at probe unconditionally and from ->runtime_resume()
>     if a reset is detected (using the W1 register).
>   - Auxdata patches have been reworked now that there is default auxdata
>     since commit b50a2da03bd9 ("usb: cdns3-ti: Add workaround for
>     Errata i2409"). We now have a patch that moves auxdata to match
>     data: "usb: cdns3-ti: grab auxdata from match data".
> - cdns3/xhci: those are three new patches.
>   - First, we rename "hibernated" to "lost_power" in arguments to
>     the role ->resume() callbacks.
>   - Then we add the xhci->lost_power flag, and only have it always copy
>     the value from XHCI_RESET_ON_RESUME.
>   - Finally, we set the flag from the host role driver.
> - Link to v4: https://lore.kernel.org/lkml/20240307-j7200-usb-suspend-v4-0-5ec7615431f3@bootlin.com/
> 
> Changes in v4:
> - dt-bindings: usb: ti,j721e-usb:
>   - Remove ti,am64-usb single compatible entry.
>   - Reverse ordering of compatible pair j721e + am64
>     (becoming am64 + j721e).
>   - Add j7200 + j721e compatible pair (versus only j7200). It is the
>     same thing as am64: only the integration differs with base j721e
>     compatible.
>   - NOT taking trailers from Conor as patches changed substantially.
> - arm64: dts: ti: j3-j7200:
>   - Use j7200 + j721e compatible pair (versus only j7200 previously).
> - arm64: dts: ti: j3-am64:
>   - Fix to use am64 + j721e compatible pair (versus only am64).
>     This is a new patch.
> - Link to v3: https://lore.kernel.org/r/20240223-j7200-usb-suspend-v3-0-b41c9893a130@bootlin.com
> 
> Changes in v3:
> - dt-bindings: use an enum to list compatibles instead of the previous
>   odd construct. This is done in a separate patch from the one adding
>   J7200 compatible.
> - dt-bindings: dropped Acked-by Conor as the changes were modified a lot.
> - Add runtime PM back. Put the init sequence in ->runtime_resume(). It
>   gets called at probe for all compatibles and at resume for J7200.
> - Introduce a cdns_ti_match_data struct rather than rely on compatible
>   from code.
> - Reorder code changes. Add infrastructure based on match data THEN add
>   compatible and its match data.
> - DTSI: use only J7200 compatible rather than both J7200 then J721E.
> - Link to v2: https://lore.kernel.org/r/20231120-j7200-usb-suspend-v2-0-038c7e4a3df4@bootlin.com
> 
> Changes in v2:
> - Remove runtime PM from cdns3-ti; it brings nothing. That means our
>   cdns3-ti suspend/resume patch is simpler; there is no need to handle
>   runtime PM at suspend/resume.
> - Do not add cdns3 host role suspend/resume callbacks; they are not
>   needed as core detects reset on resume & calls cdns_drd_host_on when
>   needed.
> - cdns3-ti: Move usb2_refclk_rate_code assignment closer to the value
>   computation.
> - cdns3/host.c: do not pass XHCI_SUSPEND_RESUME_CLKS quirk to xHCI; it
>   is unneeded on our platform.
> - Link to v1: https://lore.kernel.org/r/20231113-j7200-usb-suspend-v1-0-ad1ee714835c@bootlin.com
> 
> ---
> Théo Lebrun (12):
>       dt-bindings: usb: ti,j721e-usb: fix compatible list
>       dt-bindings: usb: ti,j721e-usb: add ti,j7200-usb compatible
>       usb: cdns3-ti: move reg writes to separate function
>       usb: cdns3-ti: run HW init at resume() if HW was reset
>       usb: cdns3: add quirk to platform data for reset-on-resume
>       usb: cdns3-ti: grab auxdata from match data
>       usb: cdns3-ti: add J7200 support with reset-on-resume behavior
>       usb: cdns3: rename hibernated argument of role->resume() to lost_power
>       xhci: introduce xhci->lost_power flag
>       usb: cdns3: host: transmit lost_power signal from wrapper to XHCI
>       arm64: dts: ti: k3-j7200: use J7200-specific USB compatible
>       arm64: dts: ti: k3-am64: add USB fallback compatible to J721E
> 
>  .../devicetree/bindings/usb/ti,j721e-usb.yaml      |   5 +-
>  arch/arm64/boot/dts/ti/k3-am64-main.dtsi           |   2 +-
>  arch/arm64/boot/dts/ti/k3-j7200-main.dtsi          |   2 +-
>  drivers/usb/cdns3/cdns3-gadget.c                   |   4 +-
>  drivers/usb/cdns3/cdns3-ti.c                       | 151 ++++++++++++++-------
>  drivers/usb/cdns3/cdnsp-gadget.c                   |   2 +-
>  drivers/usb/cdns3/core.h                           |   3 +-
>  drivers/usb/cdns3/host.c                           |  13 ++
>  drivers/usb/host/xhci.c                            |   8 +-
>  drivers/usb/host/xhci.h                            |   6 +
>  10 files changed, 136 insertions(+), 60 deletions(-)
> ---
> base-commit: c33ffdb70cc6df4105160f991288e7d2567d7ffa
> change-id: 20240726-s2r-cdns-4b180cd960ff
> 
> Best regards,
Théo Lebrun Aug. 5, 2024, 8:58 a.m. UTC | #2
Hello Roger,

On Sat Aug 3, 2024 at 5:14 PM CEST, Roger Quadros wrote:
> On 26/07/2024 21:17, Théo Lebrun wrote:
> > Currently, system-wide suspend is broken on J7200 because of a
> > controller reset. The TI wrapper does not get re-initialised at resume
> > and the first register access from cdns core fails.
> > 
> > We address that in a few ways:
> >  - In cdns3-ti, if a reset has occured at resume, we reconfigure the HW.
> >  - We pass the XHCI_RESET_ON_RESUME quirk, meaning the XHCI core expects
> >    a resume.
>
> OK.
> >  - We add a xhci->lost_power flag.
>
> Why?
>
> > 
> > The previous revision had one big issue: we had to know if
> > reset-on-resume was true, at probe-time. This is where the main
>
> Don't we already know this at probe-time? why not just rely on the existing
> XHCI_RESET_ON_RESUME qurik, than add a new mechanism?

Some TI platforms cannot tell, before going to suspend, if their USB
controller will reset. Suspend behavior is defined by (at least) two
features:

 - Power domains. See arch/arm64/boot/dts/ti/k3-j7200-main.dtsi:

   usbss0: cdns-usb@4104000 {
      compatible = "ti,j7200-usb", "ti,j721e-usb";
      // ...
      power-domains = <&k3_pds 288 TI_SCI_PD_EXCLUSIVE>;
      // ...
   };

   This `power-domains` property implies that even s2idle will reset
   the controller.

 - Deep suspend. The type of suspend defines what will happen to various
   controllers. Currently deep suspend is suspend-to-RAM, with the SoC
   being turned off.

   This might evolve over time with more complex rules: the chosen
   suspend behavior could depend on wakeup source and/or wakeup target
   latencies. That information might not be available to Linux, being
   decided upon by firmwares. We need to be able to resume successfully
   without being surprised by a reset.

I am sorry the precise usecase wasn't clear from the get-go.

Thanks,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
Roger Quadros Aug. 5, 2024, 1:41 p.m. UTC | #3
On 26/07/2024 21:17, Théo Lebrun wrote:
> The XHCI_RESET_ON_RESUME quirk allows wrappers to signal that they
> expect a reset after resume. It is also used by some to enforce a XHCI
> reset on resume (see needs-reset-on-resume DT prop).
> 
> Some wrappers are unsure beforehands if they will reset. Add a mechanism
> to signal *at resume* if power has been lost. Parent devices can set
> this flag, that defaults to the XHCI_RESET_ON_RESUME value.
> 
> The XHCI_RESET_ON_RESUME quirk still triggers a runtime_pm_get() on the
> controller. This is required as we do not know if a suspend will
> trigger a reset, so the best guess is to avoid runtime PM.
> 
> Reset the xhci->lost_power value each time in xhci_resume(), making it
> safe for devices to only set lost_power on some resumes.
> 
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
> ---
>  drivers/usb/host/xhci.c | 8 +++++++-
>  drivers/usb/host/xhci.h | 6 ++++++
>  2 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 0a8cf6c17f82..2c9b32d339f9 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -1029,9 +1029,12 @@ int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg)
>  
>  	spin_lock_irq(&xhci->lock);
>  
> -	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME || xhci->broken_suspend)
> +	if (hibernated || xhci->lost_power || xhci->broken_suspend)

Why not treat xhci->lost_power and xhci->quriks & XHCI_RESET_ON_RESUME independently?

XHCI_RESET_ON_RESUME is sued by devices that know they always need to be reset on resume.

xhci->lost_power is used by devices that don't have consistent behavior.


>  		reinit_xhc = true;
>  
> +	/* Reset to default value, parent devices might correct it at next resume. */
> +	xhci->lost_power = !!(xhci->quirks & XHCI_RESET_ON_RESUME);
> +

then you don't need to do this.

>  	if (!reinit_xhc) {
>  		/*
>  		 * Some controllers might lose power during suspend, so wait
> @@ -5228,6 +5231,9 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
>  	if (get_quirks)
>  		get_quirks(dev, xhci);
>  
> +	/* Default value, that can be corrected at resume. */
> +	xhci->lost_power = !!(xhci->quirks & XHCI_RESET_ON_RESUME);
> + 

nor this.

>  	/* In xhci controllers which follow xhci 1.0 spec gives a spurious
>  	 * success event after a short transfer. This quirk will ignore such
>  	 * spurious event.
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index ebd0afd59a60..ec7c6061363f 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -1640,6 +1640,12 @@ struct xhci_hcd {
>  	unsigned		broken_suspend:1;
>  	/* Indicates that omitting hcd is supported if root hub has no ports */
>  	unsigned		allow_single_roothub:1;
> +	/*
> +	 * Signal from upper stacks that we lost power during system-wide
> +	 * suspend. Its default value is based on XHCI_RESET_ON_RESUME, meaning
> +	 * it is safe for wrappers to not modify lost_power at resume.
> +	 */
> +	unsigned                lost_power:1;
>  	/* cached extended protocol port capabilities */
>  	struct xhci_port_cap	*port_caps;
>  	unsigned int		num_port_caps;
>
Roger Quadros Aug. 5, 2024, 1:49 p.m. UTC | #4
On 26/07/2024 21:17, Théo Lebrun wrote:
> The cdns3 host role does not care about reset-on-resume. xHCI however
> reconfigures itself in silence rather than printing a warning about a
> resume error. Related warning example:
> 
>   [   16.017462] xhci-hcd xhci-hcd.1.auto: xHC error in resume, USBSTS 0x401, Reinit
> 
> Allow passing a CDNS3_RESET_ON_RESUME quirk flag from cdns3 pdata down
> to xHCI pdata. The goal is to allow signaling about reset-on-resume
> behavior from platform wrapper drivers.
> 
> When used, remote wakeup is not expected to work.
> 
> Acked-by: Peter Chen <peter.chen@kernel.org>
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>

Reviewed-by: Roger Quadros <rogerq@kernel.org>
Roger Quadros Aug. 5, 2024, 1:51 p.m. UTC | #5
On 26/07/2024 21:17, Théo Lebrun wrote:
> Current code uses the global `cdns_ti_auxdata` variable as auxiliary
> data passed to of_platform_populate(). Use match data to store a
> pointer to auxdata.
> 
> Current behavior is not changed; it allows future compatibles to provide
> different auxiliary data.
> 
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>

Reviewed-by: Roger Quadros <rogerq@kernel.org>
Roger Quadros Aug. 5, 2024, 1:54 p.m. UTC | #6
On 26/07/2024 21:17, Théo Lebrun wrote:
> Add ti,j7200-usb compatible. Match data indicates the controller resets
> on resume which tells that to the cdns3 core. This in turn silences a
> xHCI warning visible in cases of unexpected resets.
> 
> We also inherit the errata quirk CDNS3_DRD_SUSPEND_RESIDENCY_ENABLE from
> the default `cdns_ti_auxdata` configuration.
> 
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
> ---
>  drivers/usb/cdns3/cdns3-ti.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/drivers/usb/cdns3/cdns3-ti.c b/drivers/usb/cdns3/cdns3-ti.c
> index 159814dfc856..65b8b6f4c654 100644
> --- a/drivers/usb/cdns3/cdns3-ti.c
> +++ b/drivers/usb/cdns3/cdns3-ti.c
> @@ -258,7 +258,21 @@ static const struct of_dev_auxdata cdns_ti_auxdata[] = {
>  	{},
>  };
>  
> +static struct cdns3_platform_data cdns_ti_j7200_pdata = {
> +	.quirks = CDNS3_RESET_ON_RESUME |

But you mentioned that the behavior can be different based on which
idle state the system went into.
Setting this flag will means Reset is required on every resume.


Instead, you just need to rely on the runtime check and set the xhci->lost_power flag at resume.


> +		  CDNS3_DRD_SUSPEND_RESIDENCY_ENABLE,   /* Errata i2409 */
> +};
> +
> +static const struct of_dev_auxdata cdns_ti_j7200_auxdata[] = {
> +	{
> +		.compatible = "cdns,usb3",
> +		.platform_data = &cdns_ti_j7200_pdata,
> +	},
> +	{},
> +};
> +
>  static const struct of_device_id cdns_ti_of_match[] = {
> +	{ .compatible = "ti,j7200-usb", .data = cdns_ti_j7200_auxdata },
>  	{ .compatible = "ti,j721e-usb", .data = cdns_ti_auxdata },
>  	{ .compatible = "ti,am64-usb", .data = cdns_ti_auxdata },
>  	{},
>
Roger Quadros Aug. 5, 2024, 2:01 p.m. UTC | #7
+Vibhore & Vishal

On 05/08/2024 11:58, Théo Lebrun wrote:
> Hello Roger,
> 
> On Sat Aug 3, 2024 at 5:14 PM CEST, Roger Quadros wrote:
>> On 26/07/2024 21:17, Théo Lebrun wrote:
>>> Currently, system-wide suspend is broken on J7200 because of a
>>> controller reset. The TI wrapper does not get re-initialised at resume
>>> and the first register access from cdns core fails.
>>>
>>> We address that in a few ways:
>>>  - In cdns3-ti, if a reset has occured at resume, we reconfigure the HW.
>>>  - We pass the XHCI_RESET_ON_RESUME quirk, meaning the XHCI core expects
>>>    a resume.
>>
>> OK.
>>>  - We add a xhci->lost_power flag.
>>
>> Why?
>>
>>>
>>> The previous revision had one big issue: we had to know if
>>> reset-on-resume was true, at probe-time. This is where the main
>>
>> Don't we already know this at probe-time? why not just rely on the existing
>> XHCI_RESET_ON_RESUME qurik, than add a new mechanism?
> 
> Some TI platforms cannot tell, before going to suspend, if their USB
> controller will reset. Suspend behavior is defined by (at least) two
> features:
> 
>  - Power domains. See arch/arm64/boot/dts/ti/k3-j7200-main.dtsi:
> 
>    usbss0: cdns-usb@4104000 {
>       compatible = "ti,j7200-usb", "ti,j721e-usb";
>       // ...
>       power-domains = <&k3_pds 288 TI_SCI_PD_EXCLUSIVE>;
>       // ...
>    };
> 
>    This `power-domains` property implies that even s2idle will reset
>    the controller.

I'm not so sure. All K3 platforms have the power-domains property for
the USB wrapper nodes.

> 
>  - Deep suspend. The type of suspend defines what will happen to various
>    controllers. Currently deep suspend is suspend-to-RAM, with the SoC
>    being turned off.
> 
>    This might evolve over time with more complex rules: the chosen
>    suspend behavior could depend on wakeup source and/or wakeup target
>    latencies. That information might not be available to Linux, being
>    decided upon by firmwares. We need to be able to resume successfully
>    without being surprised by a reset.
> 

Got it. Might be worth to mention this in the patch description.

> I am sorry the precise usecase wasn't clear from the get-go.
> 
> Thanks,
> 
> --
> Théo Lebrun, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com
>
Roger Quadros Aug. 5, 2024, 2:01 p.m. UTC | #8
On 26/07/2024 21:17, Théo Lebrun wrote:
> On our platform, suspend-to-idle or suspend-to-RAM turn the controller
> off. This compatible triggers reset-on-resume behavior to reconfigure
> the hardware.
> 
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Roger Quadros Aug. 5, 2024, 2:03 p.m. UTC | #9
On 26/07/2024 21:18, Théo Lebrun wrote:
> USB on AM64 is the same peripheral as on J721E. It has a specific
> compatible for potential integration details. Express this
> relationship, matching what the dt-bindings indicate.
> 
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>

Reviewed-by: Roger Quadros <rogerq@kernel.org>
Kevin Hilman Aug. 6, 2024, 11:12 p.m. UTC | #10
Théo Lebrun <theo.lebrun@bootlin.com> writes:

[...]

> Some TI platforms cannot tell, before going to suspend, if their USB
> controller will reset. Suspend behavior is defined by (at least) two
> features:
>
>  - Power domains. See arch/arm64/boot/dts/ti/k3-j7200-main.dtsi:
>
>    usbss0: cdns-usb@4104000 {
>       compatible = "ti,j7200-usb", "ti,j721e-usb";
>       // ...
>       power-domains = <&k3_pds 288 TI_SCI_PD_EXCLUSIVE>;
>       // ...
>    };
>
>    This `power-domains` property implies that even s2idle will reset
>    the controller.

minor but important clarification: not *will* reset, but *might* reset the controller.

Kevin
Peter Chen Aug. 9, 2024, 1:19 a.m. UTC | #11
On 24-07-26 20:17:48, Théo Lebrun wrote:
> Currently, system-wide suspend is broken on J7200 because of a
> controller reset. The TI wrapper does not get re-initialised at resume
> and the first register access from cdns core fails.
> 
> We address that in a few ways:
>  - In cdns3-ti, if a reset has occured at resume, we reconfigure the HW.
>  - We pass the XHCI_RESET_ON_RESUME quirk, meaning the XHCI core expects
>    a resume.
>  - We add a xhci->lost_power flag.
> 
> The previous revision had one big issue: we had to know if
> reset-on-resume was true, at probe-time. This is where the main
> difference with previous revisions is. We now pass the information from
> wrapper devices back up into XHCI. The xhci->lost_power flag gets its
> default value from the XHCI_RESET_ON_RESUME quirk. It however allows
> wrappers to signal *at resume* if they still expect a reset.
> 
> That means wrappers that are unsure if they will reset should:
>  - (1) set the quirk at probe and,
>  - (2) potentially set xhci->lost_power to false at resume.

Judge if controller is power lost has implemented at cdns_power_is_lost
Please check if you could use that.

Peter

> 
> We implement that for cdns3, by piggybacking on the host role ->resume()
> callback already receives the information from its caller.
> 
> Have a nice day,
> Théo
> 
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
> ---
> Changes in v5:
> - dt-bindings: take Reviewed-by Rob and Conor for the first
>   patch: "dt-bindings: usb: ti,j721e-usb: fix compatible list".
> - cdns3-ti:
>   - We now do have HW init code inside cdns_ti_reset_and_init_hw().
>   - It gets called at probe unconditionally and from ->runtime_resume()
>     if a reset is detected (using the W1 register).
>   - Auxdata patches have been reworked now that there is default auxdata
>     since commit b50a2da03bd9 ("usb: cdns3-ti: Add workaround for
>     Errata i2409"). We now have a patch that moves auxdata to match
>     data: "usb: cdns3-ti: grab auxdata from match data".
> - cdns3/xhci: those are three new patches.
>   - First, we rename "hibernated" to "lost_power" in arguments to
>     the role ->resume() callbacks.
>   - Then we add the xhci->lost_power flag, and only have it always copy
>     the value from XHCI_RESET_ON_RESUME.
>   - Finally, we set the flag from the host role driver.
> - Link to v4: https://lore.kernel.org/lkml/20240307-j7200-usb-suspend-v4-0-5ec7615431f3@bootlin.com/
> 
> Changes in v4:
> - dt-bindings: usb: ti,j721e-usb:
>   - Remove ti,am64-usb single compatible entry.
>   - Reverse ordering of compatible pair j721e + am64
>     (becoming am64 + j721e).
>   - Add j7200 + j721e compatible pair (versus only j7200). It is the
>     same thing as am64: only the integration differs with base j721e
>     compatible.
>   - NOT taking trailers from Conor as patches changed substantially.
> - arm64: dts: ti: j3-j7200:
>   - Use j7200 + j721e compatible pair (versus only j7200 previously).
> - arm64: dts: ti: j3-am64:
>   - Fix to use am64 + j721e compatible pair (versus only am64).
>     This is a new patch.
> - Link to v3: https://lore.kernel.org/r/20240223-j7200-usb-suspend-v3-0-b41c9893a130@bootlin.com
> 
> Changes in v3:
> - dt-bindings: use an enum to list compatibles instead of the previous
>   odd construct. This is done in a separate patch from the one adding
>   J7200 compatible.
> - dt-bindings: dropped Acked-by Conor as the changes were modified a lot.
> - Add runtime PM back. Put the init sequence in ->runtime_resume(). It
>   gets called at probe for all compatibles and at resume for J7200.
> - Introduce a cdns_ti_match_data struct rather than rely on compatible
>   from code.
> - Reorder code changes. Add infrastructure based on match data THEN add
>   compatible and its match data.
> - DTSI: use only J7200 compatible rather than both J7200 then J721E.
> - Link to v2: https://lore.kernel.org/r/20231120-j7200-usb-suspend-v2-0-038c7e4a3df4@bootlin.com
> 
> Changes in v2:
> - Remove runtime PM from cdns3-ti; it brings nothing. That means our
>   cdns3-ti suspend/resume patch is simpler; there is no need to handle
>   runtime PM at suspend/resume.
> - Do not add cdns3 host role suspend/resume callbacks; they are not
>   needed as core detects reset on resume & calls cdns_drd_host_on when
>   needed.
> - cdns3-ti: Move usb2_refclk_rate_code assignment closer to the value
>   computation.
> - cdns3/host.c: do not pass XHCI_SUSPEND_RESUME_CLKS quirk to xHCI; it
>   is unneeded on our platform.
> - Link to v1: https://lore.kernel.org/r/20231113-j7200-usb-suspend-v1-0-ad1ee714835c@bootlin.com
> 
> ---
> Théo Lebrun (12):
>       dt-bindings: usb: ti,j721e-usb: fix compatible list
>       dt-bindings: usb: ti,j721e-usb: add ti,j7200-usb compatible
>       usb: cdns3-ti: move reg writes to separate function
>       usb: cdns3-ti: run HW init at resume() if HW was reset
>       usb: cdns3: add quirk to platform data for reset-on-resume
>       usb: cdns3-ti: grab auxdata from match data
>       usb: cdns3-ti: add J7200 support with reset-on-resume behavior
>       usb: cdns3: rename hibernated argument of role->resume() to lost_power
>       xhci: introduce xhci->lost_power flag
>       usb: cdns3: host: transmit lost_power signal from wrapper to XHCI
>       arm64: dts: ti: k3-j7200: use J7200-specific USB compatible
>       arm64: dts: ti: k3-am64: add USB fallback compatible to J721E
> 
>  .../devicetree/bindings/usb/ti,j721e-usb.yaml      |   5 +-
>  arch/arm64/boot/dts/ti/k3-am64-main.dtsi           |   2 +-
>  arch/arm64/boot/dts/ti/k3-j7200-main.dtsi          |   2 +-
>  drivers/usb/cdns3/cdns3-gadget.c                   |   4 +-
>  drivers/usb/cdns3/cdns3-ti.c                       | 151 ++++++++++++++-------
>  drivers/usb/cdns3/cdnsp-gadget.c                   |   2 +-
>  drivers/usb/cdns3/core.h                           |   3 +-
>  drivers/usb/cdns3/host.c                           |  13 ++
>  drivers/usb/host/xhci.c                            |   8 +-
>  drivers/usb/host/xhci.h                            |   6 +
>  10 files changed, 136 insertions(+), 60 deletions(-)
> ---
> base-commit: c33ffdb70cc6df4105160f991288e7d2567d7ffa
> change-id: 20240726-s2r-cdns-4b180cd960ff
> 
> Best regards,
> -- 
> Théo Lebrun <theo.lebrun@bootlin.com>
>
Nishanth Menon Sept. 1, 2024, 8:20 p.m. UTC | #12
Hi Théo Lebrun,

On Fri, 26 Jul 2024 20:17:48 +0200, Théo Lebrun wrote:
> Currently, system-wide suspend is broken on J7200 because of a
> controller reset. The TI wrapper does not get re-initialised at resume
> and the first register access from cdns core fails.
> 
> We address that in a few ways:
>  - In cdns3-ti, if a reset has occured at resume, we reconfigure the HW.
>  - We pass the XHCI_RESET_ON_RESUME quirk, meaning the XHCI core expects
>    a resume.
>  - We add a xhci->lost_power flag.
> 
> [...]

Since Greg has picked f7fd939e805672417bbf418f6035dec9400230fd ("dt-bindings:
usb: ti,j721e-usb: fix compatible list"), the corresponding
patch needs to go via the soc dt tree, so picking just that.

I have applied the following to branch ti-k3-dts-next on [1].
Thank you!

[12/12] arm64: dts: ti: k3-am64: add USB fallback compatible to J721E
        commit: 99ced42d6f3ebcae52c2c6d1207d3f96d7cf88ac

Theo: Do let me know if Greg decides to drop the said patch, and I will drop
this off my PR as well. But, no action at the moment for this.

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent up the chain during
the next merge window (or sooner if it is a relevant bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/ti/linux.git
Théo Lebrun Sept. 10, 2024, 1:50 p.m. UTC | #13
On Mon Aug 5, 2024 at 3:41 PM CEST, Roger Quadros wrote:
> On 26/07/2024 21:17, Théo Lebrun wrote:
> > The XHCI_RESET_ON_RESUME quirk allows wrappers to signal that they
> > expect a reset after resume. It is also used by some to enforce a XHCI
> > reset on resume (see needs-reset-on-resume DT prop).
> > 
> > Some wrappers are unsure beforehands if they will reset. Add a mechanism
> > to signal *at resume* if power has been lost. Parent devices can set
> > this flag, that defaults to the XHCI_RESET_ON_RESUME value.
> > 
> > The XHCI_RESET_ON_RESUME quirk still triggers a runtime_pm_get() on the
> > controller. This is required as we do not know if a suspend will
> > trigger a reset, so the best guess is to avoid runtime PM.
> > 
> > Reset the xhci->lost_power value each time in xhci_resume(), making it
> > safe for devices to only set lost_power on some resumes.
> > 
> > Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
> > ---
> >  drivers/usb/host/xhci.c | 8 +++++++-
> >  drivers/usb/host/xhci.h | 6 ++++++
> >  2 files changed, 13 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> > index 0a8cf6c17f82..2c9b32d339f9 100644
> > --- a/drivers/usb/host/xhci.c
> > +++ b/drivers/usb/host/xhci.c
> > @@ -1029,9 +1029,12 @@ int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg)
> >  
> >  	spin_lock_irq(&xhci->lock);
> >  
> > -	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME || xhci->broken_suspend)
> > +	if (hibernated || xhci->lost_power || xhci->broken_suspend)
>
> Why not treat xhci->lost_power and xhci->quriks & XHCI_RESET_ON_RESUME independently?
>
> XHCI_RESET_ON_RESUME is sued by devices that know they always need to be reset on resume.
>
> xhci->lost_power is used by devices that don't have consistent behavior.

The goal is to avoid almost-duplicate functionality. I feel like:

    XHCI_RESET_ON_RESUME is the default value of xhci->lost_power,
    which might be modified at resume.

Is a more straight forward solution than:

    Both XHCI_RESET_ON_RESUME and xhci->lost_power define if power was
    lost at resume. First must be statically known, second can be
    updated during runtime. If second is used, first one must NOT be
    set.

Indeed, the first solution brings two additional lines of code as you
commented below. I'd argue the easier-to-wrap-your-head-around logic is
more important.

Tell me if you are convinced the second approach is better.

>
>
> >  		reinit_xhc = true;
> >  
> > +	/* Reset to default value, parent devices might correct it at next resume. */
> > +	xhci->lost_power = !!(xhci->quirks & XHCI_RESET_ON_RESUME);
> > +
>
> then you don't need to do this.

To be honest, I added this line out of rigor. We could remove it and say
that any device that modifies xhci->lost_power once at resume must set
it at each later resume.

The above line felt like a small safety net to avoid logic mistakes.

>
> >  	if (!reinit_xhc) {
> >  		/*
> >  		 * Some controllers might lose power during suspend, so wait
> > @@ -5228,6 +5231,9 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
> >  	if (get_quirks)
> >  		get_quirks(dev, xhci);
> >  
> > +	/* Default value, that can be corrected at resume. */
> > +	xhci->lost_power = !!(xhci->quirks & XHCI_RESET_ON_RESUME);
> > + 
>
> nor this.

Regards,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
Théo Lebrun Sept. 10, 2024, 1:57 p.m. UTC | #14
On Mon Aug 5, 2024 at 3:54 PM CEST, Roger Quadros wrote:
> On 26/07/2024 21:17, Théo Lebrun wrote:
> > Add ti,j7200-usb compatible. Match data indicates the controller resets
> > on resume which tells that to the cdns3 core. This in turn silences a
> > xHCI warning visible in cases of unexpected resets.
> > 
> > We also inherit the errata quirk CDNS3_DRD_SUSPEND_RESIDENCY_ENABLE from
> > the default `cdns_ti_auxdata` configuration.
> > 
> > Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
> > ---
> >  drivers/usb/cdns3/cdns3-ti.c | 14 ++++++++++++++
> >  1 file changed, 14 insertions(+)
> > 
> > diff --git a/drivers/usb/cdns3/cdns3-ti.c b/drivers/usb/cdns3/cdns3-ti.c
> > index 159814dfc856..65b8b6f4c654 100644
> > --- a/drivers/usb/cdns3/cdns3-ti.c
> > +++ b/drivers/usb/cdns3/cdns3-ti.c
> > @@ -258,7 +258,21 @@ static const struct of_dev_auxdata cdns_ti_auxdata[] = {
> >  	{},
> >  };
> >  
> > +static struct cdns3_platform_data cdns_ti_j7200_pdata = {
> > +	.quirks = CDNS3_RESET_ON_RESUME |
>
> But you mentioned that the behavior can be different based on which
> idle state the system went into.
> Setting this flag will means Reset is required on every resume.

No, this flag's only behavior is to enable XHCI_RESET_ON_RESUME in the
lower xHCI stack. Code is in __cdns_host_init():

	if (cdns->pdata && (cdns->pdata->quirks & CDNS3_RESET_ON_RESUME))
		cdns->xhci_plat_data->quirks |= XHCI_RESET_ON_RESUME;

> Instead, you just need to rely on the runtime check and set the xhci->lost_power flag at resume.

I argued about my view of the XHCI_RESET_ON_RESUME quirk under
[PATCH v5 09/12].
https://lore.kernel.org/lkml/D42NIH63EHZG.KKWZR2WZB68L@bootlin.com/

Regards,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
Théo Lebrun Sept. 10, 2024, 2:04 p.m. UTC | #15
Hello Peter,

On Fri Aug 9, 2024 at 3:19 AM CEST, Peter Chen wrote:
> On 24-07-26 20:17:48, Théo Lebrun wrote:
> > Currently, system-wide suspend is broken on J7200 because of a
> > controller reset. The TI wrapper does not get re-initialised at resume
> > and the first register access from cdns core fails.
> > 
> > We address that in a few ways:
> >  - In cdns3-ti, if a reset has occured at resume, we reconfigure the HW.
> >  - We pass the XHCI_RESET_ON_RESUME quirk, meaning the XHCI core expects
> >    a resume.
> >  - We add a xhci->lost_power flag.
> > 
> > The previous revision had one big issue: we had to know if
> > reset-on-resume was true, at probe-time. This is where the main
> > difference with previous revisions is. We now pass the information from
> > wrapper devices back up into XHCI. The xhci->lost_power flag gets its
> > default value from the XHCI_RESET_ON_RESUME quirk. It however allows
> > wrappers to signal *at resume* if they still expect a reset.
> > 
> > That means wrappers that are unsure if they will reset should:
> >  - (1) set the quirk at probe and,
> >  - (2) potentially set xhci->lost_power to false at resume.
>
> Judge if controller is power lost has implemented at cdns_power_is_lost
> Please check if you could use that.

That function is being used! Its return value is passed as second
argument to the resume() callback in struct cdns_role_driver. We set
xhci->lost_power using that exact value.

My cover letter explanation was slightly off, as it is not wrappers that
set xhci->lost_power, but instead role drivers. Wrappers don't have any
reason to touch the xhci struct directly, they are one layer above.

Related: [PATCH v5 08/15] commit message looks like this:

------------------------------------------------------------------------

The cdns_role_driver->resume() callback takes a second boolean argument
named `hibernated` in its implementations. This is mistaken; the only
potential caller is:

int cdns_resume(struct cdns *cdns)
{
	/* ... */

	if (cdns->roles[cdns->role]->resume)
		cdns->roles[cdns->role]->resume(cdns, cdns_power_is_lost(cdns));

	return 0;
}

The argument can be true in cases outside of return from hibernation.
Reflect the true meaning by renaming both arguments to `lost_power`.

------------------------------------------------------------------------

Regards,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com