[SRU,N/O/unstable,v2,0/2] Enable ASPM for nvme controller when working in RAID on mode

Message ID 20240819025908.50667-1-hui.wang@canonical.com

Message

Hui Wang Aug. 19, 2024, 2:59 a.m. UTC
BugLink: https://bugs.launchpad.net/bugs/2072679

Changes in v2:
Thanks to Timo for pointing out that the lunar-generic and
mantic-generic kernels had the same problem. We applied similar
UBUNTU SAUCE patches to those kernels, but forgot to apply them to
unstable at that time, hence the regression for this issue in
N/O/... The tracking bug for L/M is:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2034504

Short-term plan:
SRU the SAUCE patches to N/O/unstable to fix the regression on those
Dell machines ASAP.
Long-term plan:
Kai-Heng will ping the linux-pci maintainers, discuss the patches
with them and revise the patches as they request. Once the formal
patches are merged upstream, I will revert the SAUCE patches from
N/O/unstable and SRU the formal patches to these kernels.


[Impact]
On some Dell machines the NVMe controller works in "RAID On" mode by
default. In this case PCIe ASPM cannot be enabled, and as a result
the system cannot enter deep idle states. This issue affects not
only Ubuntu users but also our Dell OEM projects.


[Fix]
Pick 2 commits from the linux-pci mailing list.

[Test]
After booting the patched kernel, run 'sudo lspci -nnvv' and check
the "Non-Volatile memory controller" entry:
               LnkCtl: ASPM L1 Enabled;
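
For anyone who wants to automate this check, here is a minimal sketch
in Python. It is not part of the patchset. The function name is
hypothetical, and it assumes the usual 'lspci -vv' layout: device
header lines start in column 0 and the "LnkCtl:" line is indented
under its device.

```python
import re


def nvme_aspm_l1_status(lspci_text):
    """Map each NVMe controller's PCI address to True if LnkCtl shows ASPM L1 enabled.

    Hypothetical helper: expects the text printed by 'sudo lspci -nnvv'.
    """
    status = {}
    current = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # A non-indented line starts a new device block.
            if "Non-Volatile memory controller" in line:
                current = line.split()[0]
                status[current] = False
            else:
                current = None
        elif current and "LnkCtl:" in line:
            # e.g. "LnkCtl: ASPM L1 Enabled; ..." or "LnkCtl: ASPM Disabled; ..."
            match = re.search(r"LnkCtl:\s*ASPM\s+([^;]*)", line)
            if match:
                field = match.group(1)
                status[current] = "L1" in field and "Enabled" in field
    return status
```

Feed it the captured output, for example from
subprocess.run(["lspci", "-nnvv"], capture_output=True, text=True).stdout
when run as root. A False value means the controller still has ASPM L1
disabled.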

Then check the idle states; the system should now enter deep idle states:
$ sudo cat /sys/kernel/debug/pmc_core/package_cstate_show
Package C2 : 55740989
Package C3 : 4656373
Package C6 : 43325041
Package C7 : 6687655
Package C8 : 44948950
Package C9 : 1693
Package C10 : 92865596
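
A single reading only shows totals since boot, so one way to confirm
the deep states are still being entered is to take two readings some
time apart and see which counters advanced. The sketch below is
illustrative only; both function names are hypothetical, and it
assumes the "Package Cn : value" format shown above with decimal
counters.

```python
def parse_package_cstates(text):
    """Parse 'Package Cn : value' lines into a {state name: counter} dict."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name.startswith("Package C"):
            counters[name] = int(value.strip())
    return counters


def advanced_states(before, after):
    """Return the states whose counter grew between two readings, shallowest first."""
    grown = [s for s in after if after[s] > before.get(s, 0)]
    return sorted(grown, key=lambda s: int(s.rsplit("C", 1)[1]))
```

If "Package C10" appears in the result for two readings taken around
an idle period, the package reached its deepest state during that
interval.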

[Where problems could occur]
Because the patchset has not been accepted upstream yet, merging it
into the Ubuntu kernel carries some risk. The patchset only affects
the vmd driver, so any regression would be limited to the nvme
driver in RAID On mode. The likelihood of a regression is very low:
we have already tested the patches on many Dell and Lenovo machines
and have seen no regression so far.


Kai-Heng Feng (2):
  UBUNTU: SAUCE: PCI: ASPM: Allow OS to configure ASPM where BIOS is
    incapable of
  UBUNTU: SAUCE: PCI: vmd: Let OS control ASPM for devices under VMD
    domain

 drivers/pci/controller/vmd.c | 2 ++
 drivers/pci/pcie/aspm.c      | 8 ++++++--
 include/linux/pci.h          | 1 +
 3 files changed, 9 insertions(+), 2 deletions(-)

Comments

Aaron Jauregui Aug. 20, 2024, 5:43 a.m. UTC | #1
Acked-by: Aaron Jauregui <aaron.jauregui@canonical.com>

Kuan-Ying Lee Aug. 23, 2024, 3:29 a.m. UTC | #2
Acked-by: Kuan-Ying Lee <kuan-ying.lee@canonical.com>

Roxana Nicolescu Aug. 26, 2024, 7:14 a.m. UTC | #3
Applied to noble:linux master-next branch. Thanks!

Timo Aaltonen Sept. 2, 2024, 12:18 p.m. UTC | #4
applied to oracular, thanks