Message ID | 1407371998-11437-1-git-send-email-festevam@gmail.com |
---|---|
State | New |
Headers | show |
On Wed, Aug 06, 2014 at 09:39:58PM -0300, Fabio Estevam wrote: > From: Fabio Estevam <fabio.estevam@freescale.com> > > When PCI is used and a suspend/resume sequence is done we see the following > kernel hang: > > root@freescale /$ echo mem > /sys/power/state > [ 16.099018] PM: Syncing filesystems ... done. > [ 16.141010] Freezing user space processes ... (elapsed 0.002 seconds) done. > [ 16.150840] Freezing remaining freezable tasks ... (elapsed 0.003 seconds) done. > [ 16.199438] random: nonblocking pool is initialized > [ 16.229639] PM: suspend of devices complete after 64.793 msecs > [ 16.235488] PM: suspend devices took 0.070 seconds > [ 16.245301] PM: late suspend of devices complete after 4.968 msecs > [ 16.257063] PM: noirq suspend of devices complete after 5.538 msecs > [ 16.263425] Disabling non-boot CPUs ... > [ 16.274666] CPU1: shutdown > [ 16.286351] CPU2: shutdown > [ 16.294169] CPU3: shutdown > [ 16.299551] Enabling non-boot CPUs ... > [ 16.304155] CPU1: Booted secondary processor > [ 16.305717] CPU1 is up > [ 16.313078] CPU2: Booted secondary processor > [ 16.313456] CPU2 is up > [ 16.320778] CPU3: Booted secondary processor > [ 16.321174] CPU3 is up > (hangs here) > > Implement a workaround for the erratum ERR005723: "PCIe does not support L2 > Power Down", which consists in toggling bit 18 (TEST_POWERDOWN) of GPR1 register. > > Tested on a mx6qsabresd TO1.2 revC2. I intend to agree with Lucas that we should align our testing and understanding on the issue before trying to fix it. Again, I do not see this issue on my TO1.2 chip, but only TO1.5, aka Rev 1.3, one. Shawn > > Reported-by: Shawn Guo <shawn.guo@freescale.com> > Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
On Wed, Aug 6, 2014 at 11:18 PM, Shawn Guo <shawn.guo@freescale.com> wrote: > I intend to agree with Lucas that we should align our testing and > understanding on the issue before trying to fix it. > > Again, I do not see this issue on my TO1.2 chip, but only TO1.5, > aka Rev 1.3, one. Which version is your sabresd with TO1.2? Mine is rev C2 and the suspend/resume issue happens 100%.
On Wed, Aug 06, 2014 at 11:52:29PM -0300, Fabio Estevam wrote: > On Wed, Aug 6, 2014 at 11:18 PM, Shawn Guo <shawn.guo@freescale.com> wrote: > > > I intend to agree with Lucas that we should align our testing and > > understanding on the issue before trying to fix it. > > > > Again, I do not see this issue on my TO1.2 chip, but only TO1.5, > > aka Rev 1.3, one. > > Which version is your sabresd with TO1.2? Mine is rev C2 and the > suspend/resume issue happens 100%. The label on the board says it's Rev B. Shawn
On Thu, Aug 7, 2014 at 12:08 AM, Shawn Guo <shawn.guo@freescale.com> wrote: > On Wed, Aug 06, 2014 at 11:52:29PM -0300, Fabio Estevam wrote: >> On Wed, Aug 6, 2014 at 11:18 PM, Shawn Guo <shawn.guo@freescale.com> wrote: >> >> > I intend to agree with Lucas that we should align our testing and >> > understanding on the issue before trying to fix it. >> > >> > Again, I do not see this issue on my TO1.2 chip, but only TO1.5, >> > aka Rev 1.3, one. >> >> Which version is your sabresd with TO1.2? Mine is rev C2 and the >> suspend/resume issue happens 100%. > > The label on the board says it's Rev B. If I remember correctly you are testing suspend/resume with no PCI card connected to the TO1.2 board, right? I have just tried suspend/resume without any PCI card connect and it does work fine. Only when I have the PCI card connected I get the hang.
On Thu, Aug 07, 2014 at 12:24:37AM -0300, Fabio Estevam wrote: > On Thu, Aug 7, 2014 at 12:08 AM, Shawn Guo <shawn.guo@freescale.com> wrote: > > On Wed, Aug 06, 2014 at 11:52:29PM -0300, Fabio Estevam wrote: > >> On Wed, Aug 6, 2014 at 11:18 PM, Shawn Guo <shawn.guo@freescale.com> wrote: > >> > >> > I intend to agree with Lucas that we should align our testing and > >> > understanding on the issue before trying to fix it. > >> > > >> > Again, I do not see this issue on my TO1.2 chip, but only TO1.5, > >> > aka Rev 1.3, one. > >> > >> Which version is your sabresd with TO1.2? Mine is rev C2 and the > >> suspend/resume issue happens 100%. > > > > The label on the board says it's Rev B. > > If I remember correctly you are testing suspend/resume with no PCI > card connected to the TO1.2 board, right? Right. I do not have any PCIe device on hands, so all my testing are without PCIe device connected, and TO1.2 works while TO1.5 doesn't. Shawn > > I have just tried suspend/resume without any PCI card connect and it > does work fine. > > Only when I have the PCI card connected I get the hang.
Am Mittwoch, den 06.08.2014, 21:39 -0300 schrieb Fabio Estevam: > From: Fabio Estevam <fabio.estevam@freescale.com> > > When PCI is used and a suspend/resume sequence is done we see the following > kernel hang: > > root@freescale /$ echo mem > /sys/power/state > [ 16.099018] PM: Syncing filesystems ... done. > [ 16.141010] Freezing user space processes ... (elapsed 0.002 seconds) done. > [ 16.150840] Freezing remaining freezable tasks ... (elapsed 0.003 seconds) done. > [ 16.199438] random: nonblocking pool is initialized > [ 16.229639] PM: suspend of devices complete after 64.793 msecs > [ 16.235488] PM: suspend devices took 0.070 seconds > [ 16.245301] PM: late suspend of devices complete after 4.968 msecs > [ 16.257063] PM: noirq suspend of devices complete after 5.538 msecs > [ 16.263425] Disabling non-boot CPUs ... > [ 16.274666] CPU1: shutdown > [ 16.286351] CPU2: shutdown > [ 16.294169] CPU3: shutdown > [ 16.299551] Enabling non-boot CPUs ... > [ 16.304155] CPU1: Booted secondary processor > [ 16.305717] CPU1 is up > [ 16.313078] CPU2: Booted secondary processor > [ 16.313456] CPU2 is up > [ 16.320778] CPU3: Booted secondary processor > [ 16.321174] CPU3 is up > (hangs here) > > Implement a workaround for the erratum ERR005723: "PCIe does not support L2 > Power Down", which consists in toggling bit 18 (TEST_POWERDOWN) of GPR1 register. > > Tested on a mx6qsabresd TO1.2 revC2. > > Reported-by: Shawn Guo <shawn.guo@freescale.com> > Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> NACK To my understanding the DW PCIe core needs a specific sequence to enter/exit L2 at suspend that we currently ignore. Contrary to the errata description I think it can be done without a dedicated core reset signal. Hitting the PHY PD is a sledgehammer without proper understanding of the problem. Also hiding such things in the platform code seems really bad. Please keep in mind that this is the mainline kernel, not a vendor downstream tree where you are free to smash things all over the place. Regards, Lucas
On Thu, Aug 7, 2014 at 7:04 AM, Lucas Stach <l.stach@pengutronix.de> wrote: > Also hiding such things in the platform code seems really bad. Please > keep in mind that this is the mainline kernel, not a vendor downstream > tree where you are free to smash things all over the place. Sure, I should have marked it as RFC. I also asked for a better way to handle this issue below the --- line.
Hi Fabio, On Wed, 2014-08-06 at 21:39 -0300, an unknown sender wrote: > From: Fabio Estevam <fabio.estevam at freescale.com> > > When PCI is used and a suspend/resume sequence is done we see the following > kernel hang: > > root at freescale /$ echo mem > /sys/power/state > [ 16.099018] PM: Syncing filesystems ... done. > [ 16.141010] Freezing user space processes ... (elapsed 0.002 seconds) done. > [ 16.150840] Freezing remaining freezable tasks ... (elapsed 0.003 seconds) done. > [ 16.199438] random: nonblocking pool is initialized > [ 16.229639] PM: suspend of devices complete after 64.793 msecs > [ 16.235488] PM: suspend devices took 0.070 seconds > [ 16.245301] PM: late suspend of devices complete after 4.968 msecs > [ 16.257063] PM: noirq suspend of devices complete after 5.538 msecs > [ 16.263425] Disabling non-boot CPUs ... > [ 16.274666] CPU1: shutdown > [ 16.286351] CPU2: shutdown > [ 16.294169] CPU3: shutdown > [ 16.299551] Enabling non-boot CPUs ... > [ 16.304155] CPU1: Booted secondary processor > [ 16.305717] CPU1 is up > [ 16.313078] CPU2: Booted secondary processor > [ 16.313456] CPU2 is up > [ 16.320778] CPU3: Booted secondary processor > [ 16.321174] CPU3 is up > (hangs here) > > Implement a workaround for the erratum ERR005723: "PCIe does not support L2 > Power Down", which consists in toggling bit 18 (TEST_POWERDOWN) of GPR1 register. > > Tested on a mx6qsabresd TO1.2 revC2. > > Reported-by: Shawn Guo <shawn.guo at freescale.com> > Signed-off-by: Fabio Estevam <fabio.estevam at freescale.com> Tested-by: Bjørn Erik Nilsen <ben@datarespons.no> I encountered exactly the same problem and can confirm your patch works. Thank you! I hope this patch (or one that is considered a proper solution) will make it to the mainline kernel any time soon. Best regards, Bjørn Erik Nilsen > --- > Richard, > > Is there any better method to fix this issue? Or should we do the same > as in the FSL kernel? > > I also tried to add pm ops into the imx6 pci driver and did the toggle there, > but I still got the kernel hang. > > arch/arm/mach-imx/pm-imx6.c | 27 +++++++++++++++++++++++++++ > 1 file changed, 27 insertions(+) > > diff --git a/arch/arm/mach-imx/pm-imx6.c b/arch/arm/mach-imx/pm-imx6.c > index 5c3af8f..92511cf 100644 > --- a/arch/arm/mach-imx/pm-imx6.c > +++ b/arch/arm/mach-imx/pm-imx6.c > @@ -341,6 +341,23 @@ static int imx6q_suspend_finish(unsigned long val) > > static int imx6q_pm_enter(suspend_state_t state) > { > + struct regmap *gpr; > + gpr = syscon_regmap_lookup_by_compatible("fsl,imx6q-iomuxc-gpr"); > + if (IS_ERR(gpr)) { > + pr_err("failed to find fsl,imx6q-iomux-gpr regmap\n"); > + return PTR_ERR(gpr); > + } > + > + /* > + * L2 can exit by 'reset' or Inband beacon (from remote EP) > + * toggling phy_powerdown has same effect as 'inband beacon' > + * So, toggle bit18 of GPR1 as a workaround of erratum > + * ERR005723 - "PCIe does not support L2 Power Down" > + */ > + if (IS_ENABLED(CONFIG_PCI_IMX6) && !cpu_is_imx6sx()) > + regmap_update_bits(gpr, IOMUXC_GPR1, IMX6Q_GPR1_PCIE_TEST_PD, > + IMX6Q_GPR1_PCIE_TEST_PD); > + > switch (state) { > case PM_SUSPEND_STANDBY: > imx6q_set_lpm(STOP_POWER_ON); > @@ -383,6 +400,16 @@ static int imx6q_pm_enter(suspend_state_t state) > return -EINVAL; > } > > + /* > + * L2 can exit by 'reset' or Inband beacon (from remote EP) > + * toggling phy_powerdown has same effect as 'inband beacon' > + * So, toggle bit18 of GPR1 as a workaround of erratum > + * ERR005723 - "PCIe does not support L2 Power Down" > + */ > + if (IS_ENABLED(CONFIG_PCI_IMX6) && !cpu_is_imx6sx()) > + regmap_update_bits(gpr, IOMUXC_GPR1, IMX6Q_GPR1_PCIE_TEST_PD, > + !IMX6Q_GPR1_PCIE_TEST_PD); > + > return 0; > } >
Hi Bjørn, On Tue, Sep 30, 2014 at 7:12 AM, Bjørn Erik Nilsen <ben@datarespons.no> wrote: > Hi Fabio, > > On Wed, 2014-08-06 at 21:39 -0300, an unknown sender wrote: >> From: Fabio Estevam <fabio.estevam at freescale.com> >> >> When PCI is used and a suspend/resume sequence is done we see the following >> kernel hang: >> >> root at freescale /$ echo mem > /sys/power/state >> [ 16.099018] PM: Syncing filesystems ... done. >> [ 16.141010] Freezing user space processes ... (elapsed 0.002 seconds) done. >> [ 16.150840] Freezing remaining freezable tasks ... (elapsed 0.003 seconds) done. >> [ 16.199438] random: nonblocking pool is initialized >> [ 16.229639] PM: suspend of devices complete after 64.793 msecs >> [ 16.235488] PM: suspend devices took 0.070 seconds >> [ 16.245301] PM: late suspend of devices complete after 4.968 msecs >> [ 16.257063] PM: noirq suspend of devices complete after 5.538 msecs >> [ 16.263425] Disabling non-boot CPUs ... >> [ 16.274666] CPU1: shutdown >> [ 16.286351] CPU2: shutdown >> [ 16.294169] CPU3: shutdown >> [ 16.299551] Enabling non-boot CPUs ... >> [ 16.304155] CPU1: Booted secondary processor >> [ 16.305717] CPU1 is up >> [ 16.313078] CPU2: Booted secondary processor >> [ 16.313456] CPU2 is up >> [ 16.320778] CPU3: Booted secondary processor >> [ 16.321174] CPU3 is up >> (hangs here) >> >> Implement a workaround for the erratum ERR005723: "PCIe does not support L2 >> Power Down", which consists in toggling bit 18 (TEST_POWERDOWN) of GPR1 register. >> >> Tested on a mx6qsabresd TO1.2 revC2. >> >> Reported-by: Shawn Guo <shawn.guo at freescale.com> >> Signed-off-by: Fabio Estevam <fabio.estevam at freescale.com> > > Tested-by: Bjørn Erik Nilsen <ben@datarespons.no> > > I encountered exactly the same problem and can confirm your patch works. > > Thank you! > > I hope this patch (or one that is considered a proper solution) will > make it to the mainline kernel any time soon. Thanks for testing. This is a workaround found on FSL kernel. We still need to come up with a proper fix for this PCI suspend/resume issue.
Hi Bjørn, On Tue, Sep 30, 2014 at 8:42 AM, Fabio Estevam <festevam@gmail.com> wrote: > Thanks for testing. > > This is a workaround found on FSL kernel. We still need to come up > with a proper fix for this PCI suspend/resume issue. Just tried 3.18.0-rc1 and I don't see this PCI suspend/resume hang anymore on a mx6qsabresd revC2 with silicon TO1.2. It does hang on an old mx6sabresd revB with TO1.1 silicon though. Can you try with 3.18-rc1?
diff --git a/arch/arm/mach-imx/pm-imx6.c b/arch/arm/mach-imx/pm-imx6.c index 5c3af8f..92511cf 100644 --- a/arch/arm/mach-imx/pm-imx6.c +++ b/arch/arm/mach-imx/pm-imx6.c @@ -341,6 +341,23 @@ static int imx6q_suspend_finish(unsigned long val) static int imx6q_pm_enter(suspend_state_t state) { + struct regmap *gpr; + gpr = syscon_regmap_lookup_by_compatible("fsl,imx6q-iomuxc-gpr"); + if (IS_ERR(gpr)) { + pr_err("failed to find fsl,imx6q-iomux-gpr regmap\n"); + return PTR_ERR(gpr); + } + + /* + * L2 can exit by 'reset' or Inband beacon (from remote EP) + * toggling phy_powerdown has same effect as 'inband beacon' + * So, toggle bit18 of GPR1 as a workaround of erratum + * ERR005723 - "PCIe does not support L2 Power Down" + */ + if (IS_ENABLED(CONFIG_PCI_IMX6) && !cpu_is_imx6sx()) + regmap_update_bits(gpr, IOMUXC_GPR1, IMX6Q_GPR1_PCIE_TEST_PD, + IMX6Q_GPR1_PCIE_TEST_PD); + switch (state) { case PM_SUSPEND_STANDBY: imx6q_set_lpm(STOP_POWER_ON); @@ -383,6 +400,16 @@ static int imx6q_pm_enter(suspend_state_t state) return -EINVAL; } + /* + * L2 can exit by 'reset' or Inband beacon (from remote EP) + * toggling phy_powerdown has same effect as 'inband beacon' + * So, toggle bit18 of GPR1 as a workaround of erratum + * ERR005723 - "PCIe does not support L2 Power Down" + */ + if (IS_ENABLED(CONFIG_PCI_IMX6) && !cpu_is_imx6sx()) + regmap_update_bits(gpr, IOMUXC_GPR1, IMX6Q_GPR1_PCIE_TEST_PD, + !IMX6Q_GPR1_PCIE_TEST_PD); + return 0; }