[SRU,OEM-5.14/OEM-5.17/J,0/2] Fix system hangs after s2idle on AMD A+A GPU

Message ID 20220526083458.321813-1-aaron.ma@canonical.com

Message

Aaron Ma May 26, 2022, 8:34 a.m. UTC
BugLink: https://bugs.launchpad.net/bugs/1975804

[Impact]
System may hang after s2idle on AMD A+A GPU config due to the following
commits:
8b328c64c7082278c888927f00e526786eec2880 ("drm/amdgpu: don't use BACO
for reset in S3") https://bugs.launchpad.net/bugs/1968475
a26d80ba0d9e67ea11cbfc25618320163497d3fe ("drm/amd/pm: keep the BACO
feature enabled for suspend") https://bugs.launchpad.net/bugs/1958371

[Fix]
Revert one commit and don't reset the dGPU when the system is going to s2idle.

[Test]
Verified on AMD RMB, stress s2idle passed.

[Where problems could occur]
Low risk; it may break s2idle on AMD platforms.

Alex Deucher (1):
  Revert "drm/amd/pm: keep the BACO feature enabled for suspend"

Mario Limonciello (1):
  drm/amd: Don't reset dGPUs if the system is going to s2idle

 drivers/gpu/drm/amd/amdgpu/amdgpu.h       |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c  | 14 ++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  2 +-
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  8 +-------
 4 files changed, 18 insertions(+), 8 deletions(-)
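
The second patch's behaviour, as described by its subject line, can be sketched roughly as below. This is a minimal user-space illustration, not the driver code: the enum, its values, and the function name are hypothetical stand-ins, and the only rule it encodes is the one stated above (skip the dGPU reset when the suspend target is s2idle, otherwise reset as before).

```c
#include <stdbool.h>

/* Hypothetical model of the suspend target; not the kernel's own type. */
enum suspend_target {
	TARGET_S2IDLE,	/* suspend-to-idle */
	TARGET_S3	/* suspend-to-RAM */
};

/*
 * Hypothetical helper: decide whether a dGPU should be reset on suspend.
 * Assumption taken from the patch subject: no reset when going to s2idle.
 */
bool dgpu_should_reset(enum suspend_target target)
{
	return target != TARGET_S2IDLE;
}
```

With this model, an S3 suspend still resets the dGPU, while an s2idle suspend leaves it alone.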

Comments

Tim Gardner May 26, 2022, 12:37 p.m. UTC | #1
Acked-by: Tim Gardner <tim.gardner@canonical.com>

Cengiz Can May 26, 2022, 3:41 p.m. UTC | #2
On Thu, 2022-05-26 at 16:34 +0800, Aaron Ma wrote:
> BugLink: https://bugs.launchpad.net/bugs/1975804
> 
> [Impact]
> Sytesm may hang after s2idle on AMD A+A GPU config due to the following

There's a little typo here. It should be "System may hang..".

It's wrong on LP so it propagated here.

I've fixed the description on LP. 

Other than that, patches look good to me. Thank you!

Acked-by: Cengiz Can <cengiz.can@canonical.com>

Kleber Souza May 27, 2022, 10:30 a.m. UTC | #3
Applied to jammy:linux.

Thanks,
Kleber
Timo Aaltonen June 7, 2022, 9:59 a.m. UTC | #4
applied to oem-kernels, thanks