Message ID | 20240815032259.27719-1-hui.wang@canonical.com |
---|---|
Series | Enable ASPM for nvme controller when working in RAID on mode |
On 15.08.24 05:22, Hui Wang wrote:
> BugLink: https://bugs.launchpad.net/bugs/2072679
>
> [Impact]
> The NVMe controller works in RAID on mode by default on some Dell
> machines, and in this case PCIe ASPM can't be enabled, so an idle
> system can't enter deep idle states. This issue not only impacts
> Ubuntu users but also impacts our Dell OEM projects.
>
> [Fix]
> Pick 2 commits from the linux-pci mailing list.
>
> [Test]
> After booting the patched kernel, run 'sudo lspci -nnvv'
> and check the "Non-Volatile memory controller" entry:
> LnkCtl: ASPM L1 Enabled;
>
> Then check the idle states; the system can now enter deep idle:
> $ sudo cat /sys/kernel/debug/pmc_core/package_cstate_show
> Package C2 : 55740989
> Package C3 : 4656373
> Package C6 : 43325041
> Package C7 : 6687655
> Package C8 : 44948950
> Package C9 : 1693
> Package C10 : 92865596
>
> [Where problems could occur]
> Because the patchset is not accepted upstream yet, it is a bit
> risky to merge it into the Ubuntu kernel. The patch only impacts
> the vmd driver, hence if there is a regression, it could only be
> in the nvme driver with RAID on mode. The regression possibility is
> very low because we already tested the patch on many Dell and Lenovo
> machines, and there is no regression so far.
>
>
> Kai-Heng Feng (2):
>   UBUNTU: SAUCE: PCI: ASPM: Allow OS to configure ASPM where BIOS is
>     incapable of
>   UBUNTU: SAUCE: PCI: vmd: Let OS control ASPM for devices under VMD
>     domain
>
>  drivers/pci/controller/vmd.c | 2 ++
>  drivers/pci/pcie/aspm.c      | 8 ++++++--
>  include/linux/pci.h          | 1 +
>  3 files changed, 9 insertions(+), 2 deletions(-)
>

Rejected for the following reasons:
- Patches which are not at least in linux-next are not material for SRU

-Stefan
On 8/15/24 20:38, Stefan Bader wrote:
> On 15.08.24 05:22, Hui Wang wrote:
>> BugLink: https://bugs.launchpad.net/bugs/2072679
>>
>> [...]
>
> Rejected for the following reasons:
> - Patches which are not at least in linux-next are not material for SRU
>
> -Stefan

Hi Stefan,

Understood, that is the general rule for SRU.

In this case, it impacts our Dell OEM project. This issue is a
regression in the hwe-6.8 kernel: we don't have this issue with the
oem-6.5 kernel, but oem-6.5 is EOL. Hence we need to merge this
patchset into the -generic kernel ASAP.

Kai-Heng will continue working on this patchset and make sure it gets
accepted by the PCI maintainers. So could we get an exception in this
case?

Thanks,

Hui.
On 16.08.24 12:16, Hui Wang wrote:
> On 8/15/24 20:38, Stefan Bader wrote:
>> Rejected for the following reasons:
>> - Patches which are not at least in linux-next are not material for SRU
>>
>> -Stefan
>
> Hi Stefan,
>
> Understood, that is the general rule for SRU.
>
> In this case, it impacts our Dell OEM project. This issue is a
> regression in the hwe-6.8 kernel: we don't have this issue with the
> oem-6.5 kernel, but oem-6.5 is EOL. Hence we need to merge this
> patchset into the -generic kernel ASAP.
>
> Kai-Heng will continue working on this patchset and make sure it gets
> accepted by the PCI maintainers. So could we get an exception in this
> case?
>
> Thanks,
>
> Hui.
>

Please re-submit with the reasoning for why this is needed now and the
plan going forward. If this does not get accepted as is, we need to make
sure things get updated to the actual solution.
Stefan Bader wrote on 16.8.2024 at 14.36:
> On 16.08.24 12:16, Hui Wang wrote:
>> [...]
>> Kai-Heng will continue working on this patchset and make sure it gets
>> accepted by the PCI maintainers. So could we get an exception in this
>> case?
>
> Please re-submit with the reasoning for why this is needed now and the
> plan going forward. If this does not get accepted as is, we need to
> make sure things get updated to the actual solution.

FTR, this bug was originally

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2034504

and it got applied to the mantic kernel but not unstable (even though
the bug suggests that it was), which is the reason for the regression.

It would've been nice if the new bug had a link to the old one, to save
a lot of time chasing where things went wrong.
On 8/16/24 19:36, Stefan Bader wrote:
> On 16.08.24 12:16, Hui Wang wrote:
>> [...]
>
> Please re-submit with the reasoning for why this is needed now and the
> plan going forward. If this does not get accepted as is, we need to
> make sure things get updated to the actual solution.

OK, got it. Thanks.
On 8/17/24 03:58, Timo Aaltonen wrote:
> Stefan Bader wrote on 16.8.2024 at 14.36:
>> [...]
>
> FTR, this bug was originally
>
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2034504
>
> and it got applied to the mantic kernel but not unstable (even though
> the bug suggests that it was), which is the reason for the regression.
>
> It would've been nice if the new bug had a link to the old one, to save
> a lot of time chasing where things went wrong.

Thanks for sharing this. I will add the link as you pointed out.

Thanks,

Hui.
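For anyone re-running the [Test] step from the cover letter, the two manual checks could be scripted roughly as follows. This is only a sketch, not part of the patchset: the function names are hypothetical, the sample counters are copied from the cover letter, and the regular expressions assume the output formats quoted there (plus the "L0s L1 Enabled" variant that lspci can print, as far as I know). It parses text that has already been captured; it does not run lspci or read debugfs itself.

```python
import re

# Sample copied from the [Test] section of the cover letter.
SAMPLE_CSTATES = """\
Package C2 : 55740989
Package C3 : 4656373
Package C6 : 43325041
Package C7 : 6687655
Package C8 : 44948950
Package C9 : 1693
Package C10 : 92865596
"""


def parse_package_cstates(text):
    """Return {state name: residency counter} from package_cstate_show text."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*Package (C\d+)\s*:\s*(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def deepest_state_entered(counters):
    """Return the deepest package C-state with a non-zero counter, or None."""
    entered = [name for name, value in counters.items() if value > 0]
    # Compare numerically so that C10 ranks deeper than C9.
    return max(entered, key=lambda name: int(name[1:]), default=None)


def aspm_l1_enabled(lnkctl_line):
    """Check one 'LnkCtl:' line from 'lspci -vv' output for ASPM L1."""
    pattern = r"LnkCtl:\s*ASPM\s+(?:L0s\s+)?L1\s+Enabled"
    return re.search(pattern, lnkctl_line) is not None


if __name__ == "__main__":
    states = parse_package_cstates(SAMPLE_CSTATES)
    print("deepest package C-state entered:", deepest_state_entered(states))
    print("ASPM L1:", aspm_l1_enabled("LnkCtl: ASPM L1 Enabled;"))
```

On a real machine the input would come from the two commands in the cover letter, both of which need root; a non-zero counter only shows that the state was entered at some point since boot, so comparing two readings taken a while apart is probably the more meaningful check.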