diff mbox series

[RFC] gpio: Fix probing of gpio-hogs

Message ID 0b24ac9c0f40ba2410529ee6e0ffd603c9453cff.1718275638.git.chris@arachsys.com
State RFC
Delegated to: Tom Rini
Headers show
Series [RFC] gpio: Fix probing of gpio-hogs | expand

Commit Message

Chris Webb June 13, 2024, 10:59 a.m. UTC
48b3ecbe replumbed the gpio-hog probing to use DM_FLAG_PROBE_AFTER_BIND.

Unfortunately gpio_post_bind is called after the non-preloc recursive
dm_probe_devices completes, so setting this flag does not have the intended
effect and the gpio-hogs never get probed. With instrumentation:

  [...]
  CPU:   MediaTek MT7981
  Model: GL.iNet GL-X3000
  DRAM:  512 MiB
  <mtk_pinctrl_mt7981_bind called>
  <dm_probe_devices called: root root_driver root_driver [+] [ ]>
  <dm_probe_devices called: clk fixed_clock gpt_dummy20m [ ] [ ]>
  [...]
  <dm_probe_devices called: led gpio_led signal-4 [ ] [ ]>
  Core:  34 devices, 14 uclasses, devicetree: separate
  MMC:   <gpio_post_bind called>
  mmc@11230000: 0
  [...]

Probe them directly in gpio_post_bind instead.

Signed-off-by: Chris Webb <chris@arachsys.com>
---
 drivers/gpio/gpio-uclass.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Tom Rini June 20, 2024, 8:16 p.m. UTC | #1
On Thu, Jun 13, 2024 at 11:59:05AM +0100, Chris Webb wrote:

> 48b3ecbe replumbed the gpio-hog probing to use DM_FLAG_PROBE_AFTER_BIND.
> 
> Unfortunately gpio_post_bind is called after the non-preloc recursive
> dm_probe_devices completes, so setting this flag does not have the intended
> effect and the gpio-hogs never get probed. With instrumentation:
> 
>   [...]
>   CPU:   MediaTek MT7981
>   Model: GL.iNet GL-X3000
>   DRAM:  512 MiB
>   <mtk_pinctrl_mt7981_bind called>
>   <dm_probe_devices called: root root_driver root_driver [+] [ ]>
>   <dm_probe_devices called: clk fixed_clock gpt_dummy20m [ ] [ ]>
>   [...]
>   <dm_probe_devices called: led gpio_led signal-4 [ ] [ ]>
>   Core:  34 devices, 14 uclasses, devicetree: separate
>   MMC:   <gpio_post_bind called>
>   mmc@11230000: 0
>   [...]
> 
> Probe them directly in gpio_post_bind instead.
> 
> Signed-off-by: Chris Webb <chris@arachsys.com>
> ---
>  drivers/gpio/gpio-uclass.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpio/gpio-uclass.c b/drivers/gpio/gpio-uclass.c
> index 4234cd91..1c6e1715 100644
> --- a/drivers/gpio/gpio-uclass.c
> +++ b/drivers/gpio/gpio-uclass.c
> @@ -1539,7 +1539,9 @@ static int gpio_post_bind(struct udevice *dev)
>  				 * since hogs can be essential to the hardware
>  				 * system.
>  				 */
> -				dev_or_flags(child, DM_FLAG_PROBE_AFTER_BIND);
> +				ret = device_probe(child);
> +				if (ret)
> +					return ret;
>  			}
>  		}
>  	}

Adding Marek, as the author of commit 48b3ecbedf82 ("gpio: Get rid of
gpio_hog_probe_all()").
Chris Webb June 22, 2024, 11:04 a.m. UTC | #2
Tom Rini <trini@konsulko.com> wrote:

> Adding Marek, as the author of commit 48b3ecbedf82 ("gpio: Get rid of
> gpio_hog_probe_all()").

Thanks! I don't claim this is the correct way to fix this, just that it  
works.

Specifically, the two things I found that got gpio-hog working were

   (a) adding an explicit probe instead of DM_FLAG_PROBE_AFTER_BIND in gpio_post_bind(), or

   (b) adding a .bind function in U_BOOT_DRIVER(mt7981_pinctrl) like

     static int mtk_pinctrl_mt7981_bind(struct udevice *dev)
     {
     	dev_or_flags(dev, DM_FLAG_PROBE_AFTER_BIND);
     	return 0;
     }

However, presumably (b) isn't right as it would (presumably) need  
repeating in lots of other pinctrl drivers?

Best wishes,

Chris.
Chris Webb July 3, 2024, 9:49 a.m. UTC | #3
Chris Webb <chris@arachsys.com> wrote:

> Tom Rini <trini@konsulko.com> wrote:
>
>> Adding Marek, as the author of commit 48b3ecbedf82 ("gpio: Get rid of
>> gpio_hog_probe_all()").
>
> Thanks! I don't claim this is the correct way to fix this, just that it  
> works.
>
> Specifically, the two things I found that got gpio-hog working were
>
>   (a) adding an explicit probe instead of DM_FLAG_PROBE_AFTER_BIND in gpio_post_bind(), or
>
>   (b) adding a .bind function in U_BOOT_DRIVER(mt7981_pinctrl) like
>
>     static int mtk_pinctrl_mt7981_bind(struct udevice *dev)
>     {
>     	dev_or_flags(dev, DM_FLAG_PROBE_AFTER_BIND);
>     	return 0;
>     }
>
> However, presumably (b) isn't right as it would (presumably) need  
> repeating in lots of other pinctrl drivers?

Now the release is out, I'd be really keen to pick this one up and get it  
fixed upstream if possible.

The device I originally discovered this on is now deployed, but I could  
probably grab it back for a bit and resolder a serial console onto it for  
further testing if neither of the above is correct and a third alternative  
I didn't try needs confirming on real hardware.

Best wishes,

Chris.
Chris Webb July 29, 2024, 7:45 a.m. UTC | #4
Chris Webb <chris@arachsys.com> wrote:

> Now the release is out, I'd be really keen to pick this one up and get  
> it fixed upstream if possible.

Hi Tom, is there anything more I can do to help out here? I'd love  
upstream 2024.10 to ship with gpio-hog that works again.

Best wishes,

Chris.
Simon Glass July 29, 2024, 3:25 p.m. UTC | #5
Hi Chris,

On Thu, 13 Jun 2024 at 04:59, Chris Webb <chris@arachsys.com> wrote:
>
> 48b3ecbe replumbed the gpio-hog probing to use DM_FLAG_PROBE_AFTER_BIND.
>
> Unfortunately gpio_post_bind is called after the non-preloc recursive
> dm_probe_devices completes, so setting this flag does not have the intended
> effect and the gpio-hogs never get probed. With instrumentation:
>
>   [...]
>   CPU:   MediaTek MT7981
>   Model: GL.iNet GL-X3000
>   DRAM:  512 MiB
>   <mtk_pinctrl_mt7981_bind called>
>   <dm_probe_devices called: root root_driver root_driver [+] [ ]>
>   <dm_probe_devices called: clk fixed_clock gpt_dummy20m [ ] [ ]>
>   [...]
>   <dm_probe_devices called: led gpio_led signal-4 [ ] [ ]>
>   Core:  34 devices, 14 uclasses, devicetree: separate
>   MMC:   <gpio_post_bind called>
>   mmc@11230000: 0
>   [...]
>
> Probe them directly in gpio_post_bind instead.

We cannot probe devices when they are bound since it breaks the
ordering of driver model.

From your trace it looks like everything is happening after
relocation. I can't quite see what is actually going wrong. But if you
look at dm_init_and_scan(), it does the probe at the end, immediately
after all devices have been bound. So it should do what you want.

Is the GPIO device not being bound? There is something strange here.

>
> Signed-off-by: Chris Webb <chris@arachsys.com>
> ---
>  drivers/gpio/gpio-uclass.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpio/gpio-uclass.c b/drivers/gpio/gpio-uclass.c
> index 4234cd91..1c6e1715 100644
> --- a/drivers/gpio/gpio-uclass.c
> +++ b/drivers/gpio/gpio-uclass.c
> @@ -1539,7 +1539,9 @@ static int gpio_post_bind(struct udevice *dev)
>                                  * since hogs can be essential to the hardware
>                                  * system.
>                                  */
> -                               dev_or_flags(child, DM_FLAG_PROBE_AFTER_BIND);
> +                               ret = device_probe(child);
> +                               if (ret)
> +                                       return ret;
>                         }
>                 }
>         }

Regards,
Simon
Chris Webb July 29, 2024, 3:44 p.m. UTC | #6
Simon Glass <sjg@chromium.org> wrote:

> We cannot probe devices when they are bound since it breaks the
> ordering of driver model.
>
> From your trace it looks like everything is happening after
> relocation. I can't quite see what is actually going wrong. But if you
> look at dm_init_and_scan(), it does the probe at the end, immediately
> after all devices have been bound. So it should do what you want.
>
> Is the GPIO device not being bound? There is something strange here.

Hi Simon, many thanks for your follow up. Yes I wasn't convinced the patch  
was the correct fix (hence the RFC) but posted as it was one of the two  
ways I found to make gpio-hog work, the other being adding a .bind  
function in U_BOOT_DRIVER(mt7981_pinctrl) like

     static int mtk_pinctrl_mt7981_bind(struct udevice *dev)
     {
     	dev_or_flags(dev, DM_FLAG_PROBE_AFTER_BIND);
     	return 0;
     }

to force a probe after bind in the parent pinctrl device. I was hoping  
someone with more clue than me might go 'Aha! This is just...'  :)

The device I tested on has been deployed but I can probably get it back  
for a bit and resolder a serial console on to test again if that would be  
helpful. Are there other significant places I should be adding some traces  
that would make the problem clearer?

Is it significant/relevant that the gpio device is a child of the pinctrl  
device in the mt7981 device tree?

I think the gpio device must be getting bound, because otherwise my trace  
in gpio_post_bind() wouldn't get called at all, but perhaps it's bound too  
late somehow?

Best wishes,

Chris.
Simon Glass July 29, 2024, 4:18 p.m. UTC | #7
Hi Chris,

On Mon, 29 Jul 2024 at 09:44, Chris Webb <chris@arachsys.com> wrote:
>
> Simon Glass <sjg@chromium.org> wrote:
>
> > We cannot probe devices when they are bound since it breaks the
> > ordering of driver model.
> >
> > From your trace it looks like everything is happening after
> > relocation. I can't quite see what is actually going wrong. But if you
> > look at dm_init_and_scan(), it does the probe at the end, immediately
> > after all devices have been bound. So it should do what you want.
> >
> > Is the GPIO device not being bound? There is something strange here.
>
> Hi Simon, many thanks for your follow up. Yes I wasn't convinced the patch
> was the correct fix (hence the RFC) but posted as it was one of the two
> ways I found to make gpio-hog work, the other being adding a .bind
> function in U_BOOT_DRIVER(mt7981_pinctrl) like
>
>      static int mtk_pinctrl_mt7981_bind(struct udevice *dev)
>      {
>         dev_or_flags(dev, DM_FLAG_PROBE_AFTER_BIND);
>         return 0;
>      }
>
> to force a probe after bind in the parent pinctrl device. I was hoping
> someone with more clue than me might go 'Aha! This is just...'  :)
?
>
> The device I tested on has been deployed but I can probably get it back
> for a bit and resolder a serial console on to test again if that would be
> helpful. Are there other significant places I should be adding some traces
> that would make the problem clearer?
>
> Is it significant/relevant that the gpio device is a child of the pinctrl
> device in the mt7981 device tree?
>
> I think the gpio device must be getting bound, because otherwise my trace
> in gpio_post_bind() wouldn't get called at all, but perhaps it's bound too
> late somehow?

Well, yes, mt7981_pinctrl is wrong since it is not actually binding
the GPIO devices until it itself is probed. It should do it when it is
bound.

Better still, those GPIO devices should be in the devicetree and bound
automatically by driver model. But, sigh, I see that there is no
compatible string in the gpio subnode of pinctrl@11d00000. It should
really have one and avoid all this pointless code and problems.

mtk_pinctrl_common_probe() is misnamed, as it actually binds and then probes.

So (unless Linux allows a patch to add a compatible string) it needs a
new mtk_pinctrl_common_bind() (called from mtk_pinctrl_mt7981_bind())
which calls mtk_gpiochip_register(). Then you won't need to add your
dev_or_flags() into mtk_pinctrl_mt7981_bind().

Regards,
Simon
Chris Webb July 29, 2024, 4:45 p.m. UTC | #8
Hi Simon,

Simon Glass <sjg@chromium.org> wrote:

> Well, yes, mt7981_pinctrl is wrong since it is not actually binding
> the GPIO devices until it itself is probed. It should do it when it is
> bound.

Oh I see! Yes, I can see the mtk_gpiochip_register(dev) in  
mtk_pinctrl_common_probe() exactly as you say.

> Better still, those GPIO devices should be in the devicetree and bound
> automatically by driver model. But, sigh, I see that there is no
> compatible string in the gpio subnode of pinctrl@11d00000. It should
> really have one and avoid all this pointless code and problems.
>
> mtk_pinctrl_common_probe() is misnamed, as it actually binds and then  
> probes.
>
> So (unless Linux allows a patch to add a compatible string) it needs a
> new mtk_pinctrl_common_bind() (called from mtk_pinctrl_mt7981_bind())
> which calls mtk_gpiochip_register(). Then you won't need to add your
> dev_or_flags() into mtk_pinctrl_mt7981_bind().

Yes, that makes complete sense. Many thanks! I'm very happy to write that  
patch and grab back the physical hardware to double-check on if you like?  
(Or equally happy to leave it if you'd prefer to fix yourself?)

Presumably it needs to apply to every mtk soc that uses  
mtk_pinctrl_common_probe() as they'll all be affected by this problem.

Best wishes,

Chris.
Simon Glass July 29, 2024, 5:21 p.m. UTC | #9
Hi Chris,

On Mon, 29 Jul 2024 at 10:45, Chris Webb <chris@arachsys.com> wrote:
>
> Hi Simon,
>
> Simon Glass <sjg@chromium.org> wrote:
>
> > Well, yes, mt7981_pinctrl is wrong since it is not actually binding
> > the GPIO devices until it itself is probed. It should do it when it is
> > bound.
>
> Oh I see! Yes, I can see the mtk_gpiochip_register(dev) in
> mtk_pinctrl_common_probe() exactly as you say.
>
> > Better still, those GPIO devices should be in the devicetree and bound
> > automatically by driver model. But, sigh, I see that there is no
> > compatible string in the gpio subnode of pinctrl@11d00000. It should
> > really have one and avoid all this pointless code and problems.
> >
> > mtk_pinctrl_common_probe() is misnamed, as it actually binds and then
> > probes.
> >
> > So (unless Linux allows a patch to add a compatible string) it needs a
> > new mtk_pinctrl_common_bind() (called from mtk_pinctrl_mt7981_bind())
> > which calls mtk_gpiochip_register(). Then you won't need to add your
> > dev_or_flags() into mtk_pinctrl_mt7981_bind().
>
> Yes, that makes complete sense. Many thanks! I'm very happy to write that
> patch and grab back the physical hardware to double-check on if you like?
> (Or equally happy to leave it if you'd prefer to fix yourself?)

OK good. Please go ahead!

>
> Presumably it needs to apply to every mtk soc that uses
> mtk_pinctrl_common_probe() as they'll all be affected by this problem.

Yes I suppose so.

Regards,
Simon
Chris Webb July 31, 2024, 10:14 a.m. UTC | #10
Hi Simon,


Simon Glass <sjg@chromium.org> wrote:

>> Presumably it needs to apply to every mtk soc that uses
>> mtk_pinctrl_common_probe() as they'll all be affected by this problem.
>
> Yes I suppose so.

As well as the mediatek case (patch just sent), I thought I should look  
through the other pinctrl drivers for other examples of this problem you  
explained to me.

Both starfive/pinctrl-starfive.c and mvebu/pinctrl-armada-37xx.c do the  
same thing, calling their gpiochip_register as part of the driver probe  
method. The pinctrl-armada-37xx.c driver also has a bind action:

   static int armada_37xx_pinctrl_bind(struct udevice *dev)
   {
           /*
            * Make sure that the pinctrl driver gets probed after binding
            * as on A37XX the pinctrl driver is the one that is also
            * registering the GPIO one during probe, so if its not probed
            * GPIO-s are not registered as well.
            */
           dev_or_flags(dev, DM_FLAG_PROBE_AFTER_BIND);

           return 0;
   }

which presumably wouldn't be needed if the gpiochip were bound at pinctrl  
bind time instead of pinctrl probe time?

Alas I don't have any boards to test on for either of these platforms.

Best wishes,

Chris.
Simon Glass July 31, 2024, 2:38 p.m. UTC | #11
Hi Chris,

On Wed, 31 Jul 2024 at 04:14, Chris Webb <chris@arachsys.com> wrote:
>
> Hi Simon,
>
>
> Simon Glass <sjg@chromium.org> wrote:
>
> >> Presumably it needs to apply to every mtk soc that uses
> >> mtk_pinctrl_common_probe() as they'll all be affected by this problem.
> >
> > Yes I suppose so.
>
> As well as the mediatek case (patch just sent), I thought I should look
> through the other pinctrl drivers for other examples of this problem you
> explained to me.
>
> Both starfive/pinctrl-starfive.c and mvebu/pinctrl-armada-37xx.c do the
> same thing, calling their gpiochip_register as part of the driver probe
> method. The pinctrl-armada-37xx.c driver also has a bind action:
>
>    static int armada_37xx_pinctrl_bind(struct udevice *dev)
>    {
>            /*
>             * Make sure that the pinctrl driver gets probed after binding
>             * as on A37XX the pinctrl driver is the one that is also
>             * registering the GPIO one during probe, so if its not probed
>             * GPIO-s are not registered as well.
>             */
>            dev_or_flags(dev, DM_FLAG_PROBE_AFTER_BIND);
>
>            return 0;
>    }
>
> which presumably wouldn't be needed if the gpiochip were bound at pinctrl
> bind time instead of pinctrl probe time?
>
> Alas I don't have any boards to test on for either of these platforms.

If you have the inclination it is still worth sending a patch. The
maintainer can check it. These sorts of counter-examples can be copied
and soon everyone is making the same mistake!

Regards,
Simon
Chris Webb July 31, 2024, 6:14 p.m. UTC | #12
Hi Simon,

Simon Glass <sjg@chromium.org> wrote:

> On Wed, 31 Jul 2024 at 04:14, Chris Webb <chris@arachsys.com> wrote:
>> Alas I don't have any boards to test on for either of these platforms.
>
> If you have the inclination it is still worth sending a patch. The
> maintainer can check it. These sorts of counter-examples can be copied
> and soon everyone is making the same mistake!

Makes sense! If the Mediatek patch is okay, I'll write the equivalents for  
the other two platforms and put a comment after the --- to warn the  
maintainers that I haven't been able to test on real hardware. I can do a  
compile test of them both at least, and they're simple, easy to verify  
changes.

Best wishes,

Chris.
diff mbox series

Patch

diff --git a/drivers/gpio/gpio-uclass.c b/drivers/gpio/gpio-uclass.c
index 4234cd91..1c6e1715 100644
--- a/drivers/gpio/gpio-uclass.c
+++ b/drivers/gpio/gpio-uclass.c
@@ -1539,7 +1539,9 @@  static int gpio_post_bind(struct udevice *dev)
 				 * since hogs can be essential to the hardware
 				 * system.
 				 */
-				dev_or_flags(child, DM_FLAG_PROBE_AFTER_BIND);
+				ret = device_probe(child);
+				if (ret)
+					return ret;
 			}
 		}
 	}