Message ID | 20220610150131.6256-1-Zhiqiang.Hou@nxp.com |
---|---|
State | New |
Headers | show |
Series | PCI: Align MPS to upstream bridge for SAFE and PERFORMANCE mode | expand |
Hi Bjorn and Lorenzo, Any comments on this change? -----Original Message----- From: Z.Q. Hou <zhiqiang.hou@nxp.com> Sent: 2022年6月10日 23:02 To: linux-pci@vger.kernel.org; bhelgaas@google.com Cc: Z.Q. Hou <zhiqiang.hou@nxp.com> Subject: [PATCH] PCI: Align MPS to upstream bridge for SAFE and PERFORMANCE mode From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge") made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT mode, so that it's more likely that a hot-added device will work in a system with an optimized MPS configuration. Obviously, the Linux itself optimizes the MPS settings in the PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also for these modes. Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> --- drivers/pci/probe.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 17a969942d37..2c5a1aefd9cb 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2034,7 +2034,9 @@ static void pci_configure_mps(struct pci_dev *dev) * Fancier MPS configuration is done later by * pcie_bus_configure_settings() */ - if (pcie_bus_config != PCIE_BUS_DEFAULT) + if (pcie_bus_config != PCIE_BUS_DEFAULT && + pcie_bus_config != PCIE_BUS_SAFE && + pcie_bus_config != PCIE_BUS_PERFORMANCE) return; mpss = 128 << dev->pcie_mpss; @@ -2047,7 +2049,7 @@ static void pci_configure_mps(struct pci_dev *dev) rc = pcie_set_mps(dev, p_mps); if (rc) { - pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n", + pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use +\"pci=pcie_bus_peer2peer\" and report a bug\n", p_mps); return; } -- 2.17.1 Thanks, Zhiqiang
On 2022-06-10 23:01:31, Zhiqiang Hou wrote: > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge") > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT > mode, so that it's more likely that a hot-added device will work in > a system with an optimized MPS configuration. > > Obviously, the Linux itself optimizes the MPS settings in the > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also > for these modes. > > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > --- Hi Bjorn - We have some interest in this patch and I am hoping it can be considered in preparation for v6.1. I took a look at it and it makes sense to me but I'm not an expert in this area. Thanks! Tyler > drivers/pci/probe.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index 17a969942d37..2c5a1aefd9cb 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -2034,7 +2034,9 @@ static void pci_configure_mps(struct pci_dev *dev) > * Fancier MPS configuration is done later by > * pcie_bus_configure_settings() > */ > - if (pcie_bus_config != PCIE_BUS_DEFAULT) > + if (pcie_bus_config != PCIE_BUS_DEFAULT && > + pcie_bus_config != PCIE_BUS_SAFE && > + pcie_bus_config != PCIE_BUS_PERFORMANCE) > return; > > mpss = 128 << dev->pcie_mpss; > @@ -2047,7 +2049,7 @@ static void pci_configure_mps(struct pci_dev *dev) > > rc = pcie_set_mps(dev, p_mps); > if (rc) { > - pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n", > + pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_peer2peer\" and report a bug\n", > p_mps); > return; > } > -- > 2.17.1 >
On 2022-08-26 10:49:39, Tyler Hicks wrote: > On 2022-06-10 23:01:31, Zhiqiang Hou wrote: > > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > > > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge") > > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT > > mode, so that it's more likely that a hot-added device will work in > > a system with an optimized MPS configuration. > > > > Obviously, the Linux itself optimizes the MPS settings in the > > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also > > for these modes. > > > > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > --- > > Hi Bjorn - We have some interest in this patch and I am hoping it can be > considered in preparation for v6.1. I took a look at it and it makes > sense to me but I'm not an expert in this area. Thanks! Ping. Do you expect to get any free cycles to review this in prep for v6.1? Tyler > > Tyler > > > drivers/pci/probe.c | 6 ++++-- > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > > index 17a969942d37..2c5a1aefd9cb 100644 > > --- a/drivers/pci/probe.c > > +++ b/drivers/pci/probe.c > > @@ -2034,7 +2034,9 @@ static void pci_configure_mps(struct pci_dev *dev) > > * Fancier MPS configuration is done later by > > * pcie_bus_configure_settings() > > */ > > - if (pcie_bus_config != PCIE_BUS_DEFAULT) > > + if (pcie_bus_config != PCIE_BUS_DEFAULT && > > + pcie_bus_config != PCIE_BUS_SAFE && > > + pcie_bus_config != PCIE_BUS_PERFORMANCE) > > return; > > > > mpss = 128 << dev->pcie_mpss; > > @@ -2047,7 +2049,7 @@ static void pci_configure_mps(struct pci_dev *dev) > > > > rc = pcie_set_mps(dev, p_mps); > > if (rc) { > > - pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n", > > + pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_peer2peer\" and report a bug\n", > > p_mps); > > return; > > } > > -- > > 2.17.1 > >
On 2022-09-14 09:41:05, Tyler Hicks wrote: > On 2022-08-26 10:49:39, Tyler Hicks wrote: > > On 2022-06-10 23:01:31, Zhiqiang Hou wrote: > > > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > > > > > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge") > > > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT > > > mode, so that it's more likely that a hot-added device will work in > > > a system with an optimized MPS configuration. > > > > > > Obviously, the Linux itself optimizes the MPS settings in the > > > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also > > > for these modes. > > > > > > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > > --- > > > > Hi Bjorn - We have some interest in this patch and I am hoping it can be > > considered in preparation for v6.1. I took a look at it and it makes > > sense to me but I'm not an expert in this area. Thanks! > > Ping. Do you expect to get any free cycles to review this in prep for > v6.1? Ping now that you've got the v6.1 PR out for the PCI subsystem. Thanks. > > Tyler > > > > > Tyler > > > > > drivers/pci/probe.c | 6 ++++-- > > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > > > index 17a969942d37..2c5a1aefd9cb 100644 > > > --- a/drivers/pci/probe.c > > > +++ b/drivers/pci/probe.c > > > @@ -2034,7 +2034,9 @@ static void pci_configure_mps(struct pci_dev *dev) > > > * Fancier MPS configuration is done later by > > > * pcie_bus_configure_settings() > > > */ > > > - if (pcie_bus_config != PCIE_BUS_DEFAULT) > > > + if (pcie_bus_config != PCIE_BUS_DEFAULT && > > > + pcie_bus_config != PCIE_BUS_SAFE && > > > + pcie_bus_config != PCIE_BUS_PERFORMANCE) > > > return; > > > > > > mpss = 128 << dev->pcie_mpss; > > > @@ -2047,7 +2049,7 @@ static void pci_configure_mps(struct pci_dev *dev) > > > > > > rc = pcie_set_mps(dev, p_mps); > > > if (rc) { > > > - pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n", > > > + pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_peer2peer\" and report a bug\n", > > > p_mps); > > > return; > > > } > > > -- > > > 2.17.1 > > >
On 2022-06-10 23:01:31, Zhiqiang Hou wrote: > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge") Adding Keith, as the author of that commit. > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT > mode, so that it's more likely that a hot-added device will work in > a system with an optimized MPS configuration. > > Obviously, the Linux itself optimizes the MPS settings in the > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also > for these modes. > > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > --- I wanted to give a little more information about the issue we're seeing. We're having trouble retaining the optimized Max Payload Size (MPS) value of a PCIe endpoint after hotplug/rescan. In this case, `pcie_ports=native` and `pci=pcie_bus_safe` are set on the cmdline and we expect the Linux kernel to retain the MPS value. Commit 27d868b5e6cf preserved the MPS value when using the default PCIe bus mode (PCIE_BUS_DEFAULT) and we're hopeful that this can be extended to other modes, as well. Tyler > drivers/pci/probe.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index 17a969942d37..2c5a1aefd9cb 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -2034,7 +2034,9 @@ static void pci_configure_mps(struct pci_dev *dev) > * Fancier MPS configuration is done later by > * pcie_bus_configure_settings() > */ > - if (pcie_bus_config != PCIE_BUS_DEFAULT) > + if (pcie_bus_config != PCIE_BUS_DEFAULT && > + pcie_bus_config != PCIE_BUS_SAFE && > + pcie_bus_config != PCIE_BUS_PERFORMANCE) > return; > > mpss = 128 << dev->pcie_mpss; > @@ -2047,7 +2049,7 @@ static void pci_configure_mps(struct pci_dev *dev) > > rc = pcie_set_mps(dev, p_mps); > if (rc) { > - pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n", > + pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_peer2peer\" and report a bug\n", > p_mps); > return; > } > -- > 2.17.1 >
On Wed, Oct 19, 2022 at 01:25:59PM -0500, Tyler Hicks wrote: > On 2022-06-10 23:01:31, Zhiqiang Hou wrote: > > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > > > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge") > > Adding Keith, as the author of that commit. > > > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT > > mode, so that it's more likely that a hot-added device will work in > > a system with an optimized MPS configuration. > > > > Obviously, the Linux itself optimizes the MPS settings in the > > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also > > for these modes. > > > > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > --- > > I wanted to give a little more information about the issue we're seeing. > We're having trouble retaining the optimized Max Payload Size (MPS) > value of a PCIe endpoint after hotplug/rescan. In this case, > `pcie_ports=native` and `pci=pcie_bus_safe` are set on the cmdline and > we expect the Linux kernel to retain the MPS value. Commit 27d868b5e6cf > preserved the MPS value when using the default PCIe bus mode > (PCIE_BUS_DEFAULT) and we're hopeful that this can be extended to other > modes, as well. As I recall, the PCIE_BUS_DEFAULT mode was created specifically because we didn't want to change the behavior of PCIE_BUS_SAFE. What reason do you have for using that mode, anyway? That's specifically saying not to retune bridges after the initial scan.
On Wed, Oct 19, 2022 at 01:25:59PM -0500, Tyler Hicks wrote: > On 2022-06-10 23:01:31, Zhiqiang Hou wrote: > > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > > > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge") > > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT > > mode, so that it's more likely that a hot-added device will work in > > a system with an optimized MPS configuration. > > > > Obviously, the Linux itself optimizes the MPS settings in the > > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also > > for these modes. > > > > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > I wanted to give a little more information about the issue we're seeing. > We're having trouble retaining the optimized Max Payload Size (MPS) > value of a PCIe endpoint after hotplug/rescan. In this case, > `pcie_ports=native` and `pci=pcie_bus_safe` are set on the cmdline and > we expect the Linux kernel to retain the MPS value. Commit 27d868b5e6cf > preserved the MPS value when using the default PCIe bus mode > (PCIE_BUS_DEFAULT) and we're hopeful that this can be extended to other > modes, as well. Thanks, Tyler. I need help understanding what's going on here. An example of the topology and what happens and what *should* happen might help. Some MPS configuration is done per-device in pci_configure_mps(), and some considers a hierarchy in pcie_bus_configure_settings(). In the current tree, in the PCIE_BUS_SAFE case: - pci_configure_mps() does nothing (except for RCiEPs). - pcie_bus_configure_settings(bus) looks at the hierarchy rooted at the bridge leading to "bus". If the hierarchy contains a hotplug Switch Downstream Port, it sets MPS and MRRS to 128 for everything. If it does not contain such a bridge, it finds the smallest MPS_Supported ("smpss") of any device in the hierarchy and sets MPS and MRRS to that for everything. If you boot with a hotplug Root Port leading to an empty slot, I think the RP MPS will end up being whatever BIOS put there. A subsequent hot-add will do nothing in pci_configure_mps(), and pcie_bus_configure_settings() looks like it would set the RP and EP MPS to the minimum of the RP and EP MPS_Supported. Is that what you see? What would you like to see instead? Bjorn
On 2022-10-20 15:24:37, Bjorn Helgaas wrote: > On Wed, Oct 19, 2022 at 01:25:59PM -0500, Tyler Hicks wrote: > > On 2022-06-10 23:01:31, Zhiqiang Hou wrote: > > > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > > > > > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge") > > > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT > > > mode, so that it's more likely that a hot-added device will work in > > > a system with an optimized MPS configuration. > > > > > > Obviously, the Linux itself optimizes the MPS settings in the > > > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also > > > for these modes. > > > > > > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > > > I wanted to give a little more information about the issue we're seeing. > > We're having trouble retaining the optimized Max Payload Size (MPS) > > value of a PCIe endpoint after hotplug/rescan. In this case, > > `pcie_ports=native` and `pci=pcie_bus_safe` are set on the cmdline and > > we expect the Linux kernel to retain the MPS value. Commit 27d868b5e6cf > > preserved the MPS value when using the default PCIe bus mode > > (PCIE_BUS_DEFAULT) and we're hopeful that this can be extended to other > > modes, as well. > > Thanks, Tyler. I need help understanding what's going on here. An > example of the topology and what happens and what *should* happen > might help. Hey Bjorn and Keith! Thanks for both of your responses and your patience. They spurred good investigations on my side and I'm learning a lot as I go. > Some MPS configuration is done per-device in pci_configure_mps(), and > some considers a hierarchy in pcie_bus_configure_settings(). In the > current tree, in the PCIE_BUS_SAFE case: > > - pci_configure_mps() does nothing (except for RCiEPs). > > - pcie_bus_configure_settings(bus) looks at the hierarchy rooted at > the bridge leading to "bus". If the hierarchy contains a hotplug > Switch Downstream Port, it sets MPS and MRRS to 128 for > everything. > > If it does not contain such a bridge, it finds the smallest > MPS_Supported ("smpss") of any device in the hierarchy and sets > MPS and MRRS to that for everything. > > If you boot with a hotplug Root Port leading to an empty slot, I think > the RP MPS will end up being whatever BIOS put there. I've been talking to the firmware folks on why SAFE mode was selected, based on Keith's question from Wednesday. From what I've been told, U-Boot doesn't seed the RP MPS with a value so the kernel sees a value of 128 for p_mps in pci_configure_mps() when using the DEFAULT mode. Apparently UEFI does seed the RP MPS but we don't get that with U-Boot. Therefore, SAFE mode was selected to get an initial, tuned RP MPS value set to 256. > A subsequent hot-add will do nothing in pci_configure_mps(), and > pcie_bus_configure_settings() looks like it would set the RP and EP > MPS to the minimum of the RP and EP MPS_Supported. > > Is that what you see? What would you like to see instead? No, not quite. After hot-add, we see the EP MPS set to 128 with SAFE mode even though the RP MPS is 256. So, the question is if SAFE/PERFORMANCE modes are expected to tune the EP when it is added? We would have thought that EP's would be tuned based on the algorithm selected as they're hot-added. Tyler > > Bjorn
On Tue, Oct 25, 2022 at 12:07:47AM -0500, Tyler Hicks wrote: > On 2022-10-20 15:24:37, Bjorn Helgaas wrote: > > On Wed, Oct 19, 2022 at 01:25:59PM -0500, Tyler Hicks wrote: > > > On 2022-06-10 23:01:31, Zhiqiang Hou wrote: > > > > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > > > > > > > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge") > > > > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT > > > > mode, so that it's more likely that a hot-added device will work in > > > > a system with an optimized MPS configuration. > > > > > > > > Obviously, the Linux itself optimizes the MPS settings in the > > > > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also > > > > for these modes. > > > > > > > > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > > > > > I wanted to give a little more information about the issue we're seeing. > > > We're having trouble retaining the optimized Max Payload Size (MPS) > > > value of a PCIe endpoint after hotplug/rescan. In this case, > > > `pcie_ports=native` and `pci=pcie_bus_safe` are set on the cmdline and > > > we expect the Linux kernel to retain the MPS value. Commit 27d868b5e6cf > > > preserved the MPS value when using the default PCIe bus mode > > > (PCIE_BUS_DEFAULT) and we're hopeful that this can be extended to other > > > modes, as well. > > > > Thanks, Tyler. I need help understanding what's going on here. An > > example of the topology and what happens and what *should* happen > > might help. > > Hey Bjorn and Keith! Thanks for both of your responses and your > patience. They spurred good investigations on my side and I'm learning a > lot as I go. > > > Some MPS configuration is done per-device in pci_configure_mps(), and > > some considers a hierarchy in pcie_bus_configure_settings(). In the > > current tree, in the PCIE_BUS_SAFE case: > > > > - pci_configure_mps() does nothing (except for RCiEPs). > > > > - pcie_bus_configure_settings(bus) looks at the hierarchy rooted at > > the bridge leading to "bus". If the hierarchy contains a hotplug > > Switch Downstream Port, it sets MPS and MRRS to 128 for > > everything. > > > > If it does not contain such a bridge, it finds the smallest > > MPS_Supported ("smpss") of any device in the hierarchy and sets > > MPS and MRRS to that for everything. > > > > If you boot with a hotplug Root Port leading to an empty slot, I think > > the RP MPS will end up being whatever BIOS put there. > > I've been talking to the firmware folks on why SAFE mode was selected, > based on Keith's question from Wednesday. From what I've been told, > U-Boot doesn't seed the RP MPS with a value so the kernel sees a value > of 128 for p_mps in pci_configure_mps() when using the DEFAULT mode. > Apparently UEFI does seed the RP MPS but we don't get that with U-Boot. > Therefore, SAFE mode was selected to get an initial, tuned RP MPS value > set to 256. Are there any devices below the RP at the time we set MPS=256? > > A subsequent hot-add will do nothing in pci_configure_mps(), and > > pcie_bus_configure_settings() looks like it would set the RP and EP > > MPS to the minimum of the RP and EP MPS_Supported. > > > > Is that what you see? What would you like to see instead? > > No, not quite. After hot-add, we see the EP MPS set to 128 with SAFE > mode even though the RP MPS is 256. Can you add the relevant topology here so we can work through the concrete details? Is the endpoint hot-added directly below a Root Port? Or is there a switch involved? What are the MPS_Supported values for all the devices? If there's a switch in the picture, it looks like we currently restrict to 128, I think because it's possible an endpoint that can only do 128 may be added. Bjorn
On Thu, Oct 27, 2022 at 05:51:49PM -0500, Bjorn Helgaas wrote: > On Tue, Oct 25, 2022 at 12:07:47AM -0500, Tyler Hicks wrote: > > On 2022-10-20 15:24:37, Bjorn Helgaas wrote: > > I've been talking to the firmware folks on why SAFE mode was selected, > > based on Keith's question from Wednesday. From what I've been told, > > U-Boot doesn't seed the RP MPS with a value so the kernel sees a value > > of 128 for p_mps in pci_configure_mps() when using the DEFAULT mode. > > Apparently UEFI does seed the RP MPS but we don't get that with U-Boot. > > Therefore, SAFE mode was selected to get an initial, tuned RP MPS value > > set to 256. > > Are there any devices below the RP at the time we set MPS=256? > > > > A subsequent hot-add will do nothing in pci_configure_mps(), and > > > pcie_bus_configure_settings() looks like it would set the RP and EP > > > MPS to the minimum of the RP and EP MPS_Supported. > > > > > > Is that what you see? What would you like to see instead? > > > > No, not quite. After hot-add, we see the EP MPS set to 128 with SAFE > > mode even though the RP MPS is 256. > > Can you add the relevant topology here so we can work through the > concrete details? Is the endpoint hot-added directly below a Root > Port? Or is there a switch involved? What are the MPS_Supported > values for all the devices? If there's a switch in the picture, it > looks like we currently restrict to 128, I think because it's possible > an endpoint that can only do 128 may be added. Ping? I'd like to talk about a concrete scenario. It's too hard for me to imagine all the possible things that could go wrong. I guess part of what's interesting here is that things work better when firmware has already configured MPS? It doesn't seem like we should *depend* on firmware having done that. Bjorn
On 2022-11-03 17:14:29, Bjorn Helgaas wrote: > On Thu, Oct 27, 2022 at 05:51:49PM -0500, Bjorn Helgaas wrote: > > On Tue, Oct 25, 2022 at 12:07:47AM -0500, Tyler Hicks wrote: > > > On 2022-10-20 15:24:37, Bjorn Helgaas wrote: > > > > I've been talking to the firmware folks on why SAFE mode was selected, > > > based on Keith's question from Wednesday. From what I've been told, > > > U-Boot doesn't seed the RP MPS with a value so the kernel sees a value > > > of 128 for p_mps in pci_configure_mps() when using the DEFAULT mode. > > > Apparently UEFI does seed the RP MPS but we don't get that with U-Boot. > > > Therefore, SAFE mode was selected to get an initial, tuned RP MPS value > > > set to 256. > > > > Are there any devices below the RP at the time we set MPS=256? > > > > > > A subsequent hot-add will do nothing in pci_configure_mps(), and > > > > pcie_bus_configure_settings() looks like it would set the RP and EP > > > > MPS to the minimum of the RP and EP MPS_Supported. > > > > > > > > Is that what you see? What would you like to see instead? > > > > > > No, not quite. After hot-add, we see the EP MPS set to 128 with SAFE > > > mode even though the RP MPS is 256. > > > > Can you add the relevant topology here so we can work through the > > concrete details? # lspci -t -[0000:00]---00.0-[01-ff]--+-00.0 +-00.1 +-00.2 +-00.3 \-00.4 > > Is the endpoint hot-added directly below a Root Port? Or is there a > > switch involved? There's not a switch involved. The multifunction endpoint is hot-added directly below the root port. > > What are the MPS_Supported values for all the devices? If there's a switch > > in the picture, it looks like we currently restrict to 128, I think because > > it's possible an endpoint that can only do 128 may be added. 0000:00's MPS_Supported value is 256. The multifunction endpoint's MPS_Supported is 512. > Ping? I'd like to talk about a concrete scenario. It's too hard for > me to imagine all the possible things that could go wrong. Sorry for the slow reply here. Thanks for your interest in getting more details. > I guess part of what's interesting here is that things work better > when firmware has already configured MPS? It doesn't seem like we > should *depend* on firmware having done that. Our firmware folks felt the same way. Tyler > > Bjorn
> -----Original Message----- > From: Bjorn Helgaas <helgaas@kernel.org> > Sent: Friday, October 21, 2022 4:25 AM > To: Tyler Hicks <code@tyhicks.com> > Cc: bhelgaas@google.com; Keith Busch <kbusch@kernel.org>; > linux-pci@vger.kernel.org; Z.Q. Hou <zhiqiang.hou@nxp.com>; Lorenzo > Pieralisi <lpieralisi@kernel.org> > Subject: Re: [PATCH] PCI: Align MPS to upstream bridge for SAFE and > PERFORMANCE mode > > On Wed, Oct 19, 2022 at 01:25:59PM -0500, Tyler Hicks wrote: > > On 2022-06-10 23:01:31, Zhiqiang Hou wrote: > > > From: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > > > > > The commit 27d868b5e6cf ("PCI: Set MPS to match upstream bridge") > > > made the device's MPS matches upstream bridge for PCIE_BUS_DEFAULT > > > mode, so that it's more likely that a hot-added device will work in > > > a system with an optimized MPS configuration. > > > > > > Obviously, the Linux itself optimizes the MPS settings in the > > > PCIE_BUS_SAFE and PCIE_BUS_PERFORMANCE mode, so let's do this also > > > for these modes. > > > > > > Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> > > > > I wanted to give a little more information about the issue we're seeing. > > We're having trouble retaining the optimized Max Payload Size (MPS) > > value of a PCIe endpoint after hotplug/rescan. In this case, > > `pcie_ports=native` and `pci=pcie_bus_safe` are set on the cmdline and > > we expect the Linux kernel to retain the MPS value. Commit > > 27d868b5e6cf preserved the MPS value when using the default PCIe bus > > mode > > (PCIE_BUS_DEFAULT) and we're hopeful that this can be extended to > > other modes, as well. > > Thanks, Tyler. I need help understanding what's going on here. An example > of the topology and what happens and what *should* happen might help. > > Some MPS configuration is done per-device in pci_configure_mps(), and some > considers a hierarchy in pcie_bus_configure_settings(). In the current tree, > in the PCIE_BUS_SAFE case: > > - pci_configure_mps() does nothing (except for RCiEPs). > > - pcie_bus_configure_settings(bus) looks at the hierarchy rooted at > the bridge leading to "bus". If the hierarchy contains a hotplug > Switch Downstream Port, it sets MPS and MRRS to 128 for > everything. > > If it does not contain such a bridge, it finds the smallest > MPS_Supported ("smpss") of any device in the hierarchy and sets > MPS and MRRS to that for everything. > > If you boot with a hotplug Root Port leading to an empty slot, I think the RP > MPS will end up being whatever BIOS put there. > > A subsequent hot-add will do nothing in pci_configure_mps(), and > pcie_bus_configure_settings() looks like it would set the RP and EP MPS to the > minimum of the RP and EP MPS_Supported. > > Is that what you see? What would you like to see instead? > Hi Bjorn, Thanks for your comments! This patch is for the case that kernel boot with 'pci=pcie_buf_perf', the MPS is tuned during the enumeration, but if the EP is removed and then rescan via the sysfs, the MPS won't be tuned in the rescan process. And the MPS is also tuned in the 'pcie_bus_safe' mode (see the Documentation/admin-guide/kernel-parameters.txt, I pasted the 2 options below for your reference). Is this expected behavior? If yes, can you help understand the reason. pcie_bus_safe Set every device's MPS to the largest value supported by all devices below the root complex. pcie_bus_perf Set device MPS to the largest allowable MPS based on its parent bus. Also set MRRS (Max Read Request Size) to the largest supported value (no larger than the MPS that the device or bus can support) for best performance. Thanks, Zhiqiang > Bjorn
On 2022-11-09 17:42:10, Tyler Hicks (Microsoft) wrote: > On 2022-11-03 17:14:29, Bjorn Helgaas wrote: > > On Thu, Oct 27, 2022 at 05:51:49PM -0500, Bjorn Helgaas wrote: > > > On Tue, Oct 25, 2022 at 12:07:47AM -0500, Tyler Hicks wrote: > > > > On 2022-10-20 15:24:37, Bjorn Helgaas wrote: > > > > > > I've been talking to the firmware folks on why SAFE mode was selected, > > > > based on Keith's question from Wednesday. From what I've been told, > > > > U-Boot doesn't seed the RP MPS with a value so the kernel sees a value > > > > of 128 for p_mps in pci_configure_mps() when using the DEFAULT mode. > > > > Apparently UEFI does seed the RP MPS but we don't get that with U-Boot. > > > > Therefore, SAFE mode was selected to get an initial, tuned RP MPS value > > > > set to 256. > > > > > > Are there any devices below the RP at the time we set MPS=256? > > > > > > > > A subsequent hot-add will do nothing in pci_configure_mps(), and > > > > > pcie_bus_configure_settings() looks like it would set the RP and EP > > > > > MPS to the minimum of the RP and EP MPS_Supported. > > > > > > > > > > Is that what you see? What would you like to see instead? > > > > > > > > No, not quite. After hot-add, we see the EP MPS set to 128 with SAFE > > > > mode even though the RP MPS is 256. > > > > > > Can you add the relevant topology here so we can work through the > > > concrete details? > > # lspci -t > -[0000:00]---00.0-[01-ff]--+-00.0 > +-00.1 > +-00.2 > +-00.3 > \-00.4 > > > > > Is the endpoint hot-added directly below a Root Port? Or is there a > > > switch involved? > > There's not a switch involved. The multifunction endpoint is hot-added directly > below the root port. > > > > What are the MPS_Supported values for all the devices? If there's a switch > > > in the picture, it looks like we currently restrict to 128, I think because > > > it's possible an endpoint that can only do 128 may be added. > > 0000:00's MPS_Supported value is 256. > > The multifunction endpoint's MPS_Supported is 512. > > > Ping? I'd like to talk about a concrete scenario. It's too hard for > > me to imagine all the possible things that could go wrong. > > Sorry for the slow reply here. Thanks for your interest in getting more > details. Hey Bjorn - I wanted to re-ping you on this discussion since we're on the other side of the merge window now. Let me know if you need anymore details. Thanks! Tyler > > > I guess part of what's interesting here is that things work better > > when firmware has already configured MPS? It doesn't seem like we > > should *depend* on firmware having done that. > > Our firmware folks felt the same way. > > Tyler > > > > > Bjorn
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 17a969942d37..2c5a1aefd9cb 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2034,7 +2034,9 @@ static void pci_configure_mps(struct pci_dev *dev) * Fancier MPS configuration is done later by * pcie_bus_configure_settings() */ - if (pcie_bus_config != PCIE_BUS_DEFAULT) + if (pcie_bus_config != PCIE_BUS_DEFAULT && + pcie_bus_config != PCIE_BUS_SAFE && + pcie_bus_config != PCIE_BUS_PERFORMANCE) return; mpss = 128 << dev->pcie_mpss; @@ -2047,7 +2049,7 @@ static void pci_configure_mps(struct pci_dev *dev) rc = pcie_set_mps(dev, p_mps); if (rc) { - pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_safe\" and report a bug\n", + pci_warn(dev, "can't set Max Payload Size to %d; if necessary, use \"pci=pcie_bus_peer2peer\" and report a bug\n", p_mps); return; }