[SRU,OEM-5.10,0/4] Fix unexpected AER/DPC on PCH400 and PCH500

Message ID 20210129080245.979511-1-kai.heng.feng@canonical.com

Message

Kai-Heng Feng Jan. 29, 2021, 8:02 a.m. UTC
BugLink: https://bugs.launchpad.net/bugs/1913691

[Impact]
On PCH400 and PCH500, the root port trips AER and DPC on S3 resume, and
the NVMe device is dropped after a failed reset.

[Fix]
Disable AER/DPC interrupt on suspend.
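
As a rough illustration of what "disable the interrupt on suspend" means
at the register level, here is a minimal userspace sketch. It is not the
code from this series: struct fake_root_port, port_suspend() and
port_resume() are hypothetical names, and the struct merely stands in
for the AER Root Error Command and DPC Control registers that a kernel
driver would reach through PCI config-space accessors. Only the bit
values are taken from the Linux header include/uapi/linux/pci_regs.h.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h. */
#define PCI_ERR_ROOT_CMD_COR_EN      0x00000001u /* correctable error reporting */
#define PCI_ERR_ROOT_CMD_NONFATAL_EN 0x00000002u /* non-fatal error reporting */
#define PCI_ERR_ROOT_CMD_FATAL_EN    0x00000004u /* fatal error reporting */
#define PCI_EXP_DPC_CTL_INT_EN       0x0008u     /* DPC interrupt enable */

/* All three AER root-port interrupt enable bits. */
#define AER_ROOT_INTR_MASK (PCI_ERR_ROOT_CMD_COR_EN | \
                            PCI_ERR_ROOT_CMD_NONFATAL_EN | \
                            PCI_ERR_ROOT_CMD_FATAL_EN)

/* Hypothetical stand-in for a root port's registers (not a kernel type). */
struct fake_root_port {
    uint32_t aer_root_command; /* models the AER Root Error Command register */
    uint16_t dpc_control;      /* models the DPC Control register */
};

/* On suspend: clear only the interrupt enable bits, keep everything else. */
static void port_suspend(struct fake_root_port *p)
{
    assert(p);
    p->aer_root_command &= ~AER_ROOT_INTR_MASK;
    p->dpc_control &= (uint16_t)~PCI_EXP_DPC_CTL_INT_EN;
}

/* On resume: set the interrupt enable bits again. */
static void port_resume(struct fake_root_port *p)
{
    assert(p);
    p->aer_root_command |= AER_ROOT_INTR_MASK;
    p->dpc_control |= PCI_EXP_DPC_CTL_INT_EN;
}
```

Calling port_suspend() before entering S3 and port_resume() afterwards
leaves the other control bits untouched, so errors raised while the
platform is resuming would not trigger an AER or DPC interrupt in this
model.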

[Test]
Apply the patches and run an S3 stress test 100 times on CML/PCH400 and
RKL/PCH500; the issue is no longer reproducible.

[Where problems could occur]
If any device depends on tripping AER/DPC to reset after system resume
to work properly, this patch series will break them.

Kai-Heng Feng (4):
  Revert "UBUNTU: SAUCE: PCI: Enable ACS quirk on all CML root ports"
  Revert "UBUNTU: SAUCE: PCI: Enable ACS quirk on CML root port"
  UBUNTU: SAUCE: PCI/AER: Disable AER interrupt during suspend
  UBUNTU: SAUCE: PCI/DPC: Disable DPC interrupt during suspend

 drivers/pci/pcie/aer.c | 18 ++++++++++++++++
 drivers/pci/pcie/dpc.c | 49 ++++++++++++++++++++++++++++++++----------
 drivers/pci/quirks.c   |  1 -
 3 files changed, 56 insertions(+), 12 deletions(-)

Comments

Timo Aaltonen Jan. 29, 2021, 8:32 a.m. UTC | #1
On 29.1.2021 10.02, Kai-Heng Feng wrote:
> BugLink: https://bugs.launchpad.net/bugs/1913691

applied to oem-5.10, thanks

Andrea Righi Jan. 29, 2021, 9:47 a.m. UTC | #2
On Fri, Jan 29, 2021 at 04:02:41PM +0800, Kai-Heng Feng wrote:
> BugLink: https://bugs.launchpad.net/bugs/1913691

Applied to unstable/5.11.

Thanks,
-Andrea

Paolo Pisati Jan. 29, 2021, 10:10 a.m. UTC | #3
On Fri, Jan 29, 2021 at 04:02:41PM +0800, Kai-Heng Feng wrote:
> BugLink: https://bugs.launchpad.net/bugs/1913691

Have you tried to upstream these patches?

BTW, the bugzilla bug points to a different patch, and you say it's just a
placebo (i.e. the issue is still reproducible).

Kai-Heng Feng Jan. 29, 2021, 11:26 a.m. UTC | #4
On Fri, Jan 29, 2021 at 6:10 PM Paolo Pisati <paolo.pisati@canonical.com> wrote:
>
> On Fri, Jan 29, 2021 at 04:02:41PM +0800, Kai-Heng Feng wrote:
> > BugLink: https://bugs.launchpad.net/bugs/1913691
>
> Have you tried to upstream these patches?

Yes, there's a discussion going on.

>
> BTW, the bugzilla bug points to a different patch, and you say it's just a
> placebo (i.e. the issue is still reproducible).

That patch is wrong.

Kai-Heng
