Message ID | 20210802150016.588750-8-rasmus.villemoes@prevas.dk |
---|---|
State | Superseded |
Delegated to: | Stefan Roese |
Headers | show |
Series | handling all DM watchdogs in watchdog_reset() | expand |
Hi Rasmus, On Mon, 2 Aug 2021 at 09:00, Rasmus Villemoes <rasmus.villemoes@prevas.dk> wrote: > > A board can have and make use of more than one watchdog device, say > one built into the SOC and an external gpio-petted one. Having > wdt-uclass only handle the first is both a little arbitrary and > unexpected. > > So change initr_watchdog() so we visit (probe) all DM watchdog > devices, and call the init_watchdog_dev helper for each. > > Similarly let watchdog_reset() loop over the whole uclass - each > having their own ratelimiting metadata, and a separate "is this device > running" flag. > > This gets rid of the watchdog_dev member of struct global_data. We > do, however, still need the GD_FLG_WDT_READY set in > initr_watchdog(). This is because watchdog_reset() can get called > before DM is ready, and I don't think we can call uclass_get() that > early. > > The current code just returns 0 if "getting" the first device fails - > that can of course happen because there are no devices, but it could > also happen if its ->probe call failed. In keeping with that, continue > with the handling of the remaining devices even if one fails to > probe. This is also why we cannot use uclass_probe_all(). > > If desired, it's possible to later add a per-device "u-boot,autostart" > boolean property, so that one can do CONFIG_WATCHDOG_AUTOSTART > per-device. > > Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> > --- > drivers/watchdog/wdt-uclass.c | 56 ++++++++++++++++++++----------- > include/asm-generic/global_data.h | 6 ---- > 2 files changed, 36 insertions(+), 26 deletions(-) Reviewed-by: Simon Glass <sjg@chromium.org>
On 02.08.21 17:00, Rasmus Villemoes wrote: > A board can have and make use of more than one watchdog device, say > one built into the SOC and an external gpio-petted one. Having > wdt-uclass only handle the first is both a little arbitrary and > unexpected. > > So change initr_watchdog() so we visit (probe) all DM watchdog > devices, and call the init_watchdog_dev helper for each. > > Similarly let watchdog_reset() loop over the whole uclass - each > having their own ratelimiting metadata, and a separate "is this device > running" flag. > > This gets rid of the watchdog_dev member of struct global_data. We > do, however, still need the GD_FLG_WDT_READY set in > initr_watchdog(). This is because watchdog_reset() can get called > before DM is ready, and I don't think we can call uclass_get() that > early. > > The current code just returns 0 if "getting" the first device fails - > that can of course happen because there are no devices, but it could > also happen if its ->probe call failed. In keeping with that, continue > with the handling of the remaining devices even if one fails to > probe. This is also why we cannot use uclass_probe_all(). > > If desired, it's possible to later add a per-device "u-boot,autostart" > boolean property, so that one can do CONFIG_WATCHDOG_AUTOSTART > per-device. > > Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Reviewed-by: Stefan Roese <sr@denx.de> Thanks, Stefan > --- > drivers/watchdog/wdt-uclass.c | 56 ++++++++++++++++++++----------- > include/asm-generic/global_data.h | 6 ---- > 2 files changed, 36 insertions(+), 26 deletions(-) > > diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c > index 358fc68e27..0ce8b3a425 100644 > --- a/drivers/watchdog/wdt-uclass.c > +++ b/drivers/watchdog/wdt-uclass.c > @@ -61,20 +61,24 @@ static void init_watchdog_dev(struct udevice *dev) > > int initr_watchdog(void) > { > - /* > - * Init watchdog: This will call the probe function of the > - * watchdog driver, enabling the use of the device > - */ > - if (uclass_get_device_by_seq(UCLASS_WDT, 0, > - (struct udevice **)&gd->watchdog_dev)) { > - debug("WDT: Not found by seq!\n"); > - if (uclass_get_device(UCLASS_WDT, 0, > - (struct udevice **)&gd->watchdog_dev)) { > - printf("WDT: Not found!\n"); > - return 0; > + struct udevice *dev; > + struct uclass *uc; > + int ret; > + > + ret = uclass_get(UCLASS_WDT, &uc); > + if (ret) { > + log_debug("Error getting UCLASS_WDT: %d\n", ret); > + return 0; > + } > + > + uclass_foreach_dev(dev, uc) { > + ret = device_probe(dev); > + if (ret) { > + log_debug("Error probing %s: %d\n", dev->name, ret); > + continue; > } > + init_watchdog_dev(dev); > } > - init_watchdog_dev(gd->watchdog_dev); > > gd->flags |= GD_FLG_WDT_READY; > return 0; > @@ -157,22 +161,34 @@ void watchdog_reset(void) > { > struct wdt_priv *priv; > struct udevice *dev; > + struct uclass *uc; > ulong now; > > /* Exit if GD is not ready or watchdog is not initialized yet */ > if (!gd || !(gd->flags & GD_FLG_WDT_READY)) > return; > > - dev = gd->watchdog_dev; > - priv = dev_get_uclass_priv(dev); > - if (!priv->running) > + if (uclass_get(UCLASS_WDT, &uc)) > return; > > - /* Do not reset the watchdog too often */ > - now = get_timer(0); > - if (time_after_eq(now, priv->next_reset)) { > - priv->next_reset = now + priv->reset_period; > - wdt_reset(dev); > + /* > + * All devices bound to the wdt uclass should have been probed > + * in initr_watchdog(). But just in case something went wrong, > + * check device_active() before accessing the uclass private > + * data. > + */ > + uclass_foreach_dev(dev, uc) { > + if (!device_active(dev)) > + continue; > + priv = dev_get_uclass_priv(dev); > + if (!priv->running) > + continue; > + /* Do not reset the watchdog too often */ > + now = get_timer(0); > + if (time_after_eq(now, priv->next_reset)) { > + priv->next_reset = now + priv->reset_period; > + wdt_reset(dev); > + } > } > } > #endif > diff --git a/include/asm-generic/global_data.h b/include/asm-generic/global_data.h > index e55070303f..28d749538c 100644 > --- a/include/asm-generic/global_data.h > +++ b/include/asm-generic/global_data.h > @@ -447,12 +447,6 @@ struct global_data { > */ > fdt_addr_t translation_offset; > #endif > -#if CONFIG_IS_ENABLED(WDT) > - /** > - * @watchdog_dev: watchdog device > - */ > - struct udevice *watchdog_dev; > -#endif > #ifdef CONFIG_GENERATE_ACPI_TABLE > /** > * @acpi_ctx: ACPI context pointer > Viele Grüße, Stefan
Hi Rasmus, On 02.08.21 17:00, Rasmus Villemoes wrote: > A board can have and make use of more than one watchdog device, say > one built into the SOC and an external gpio-petted one. Having > wdt-uclass only handle the first is both a little arbitrary and > unexpected. > > So change initr_watchdog() so we visit (probe) all DM watchdog > devices, and call the init_watchdog_dev helper for each. > > Similarly let watchdog_reset() loop over the whole uclass - each > having their own ratelimiting metadata, and a separate "is this device > running" flag. > > This gets rid of the watchdog_dev member of struct global_data. We > do, however, still need the GD_FLG_WDT_READY set in > initr_watchdog(). This is because watchdog_reset() can get called > before DM is ready, and I don't think we can call uclass_get() that > early. > > The current code just returns 0 if "getting" the first device fails - > that can of course happen because there are no devices, but it could > also happen if its ->probe call failed. In keeping with that, continue > with the handling of the remaining devices even if one fails to > probe. This is also why we cannot use uclass_probe_all(). > > If desired, it's possible to later add a per-device "u-boot,autostart" > boolean property, so that one can do CONFIG_WATCHDOG_AUTOSTART > per-device. > > Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> > --- > drivers/watchdog/wdt-uclass.c | 56 ++++++++++++++++++++----------- > include/asm-generic/global_data.h | 6 ---- > 2 files changed, 36 insertions(+), 26 deletions(-) > > diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c > index 358fc68e27..0ce8b3a425 100644 > --- a/drivers/watchdog/wdt-uclass.c > +++ b/drivers/watchdog/wdt-uclass.c > @@ -61,20 +61,24 @@ static void init_watchdog_dev(struct udevice *dev) > > int initr_watchdog(void) > { > - /* > - * Init watchdog: This will call the probe function of the > - * watchdog driver, enabling the use of the device > - */ > - if (uclass_get_device_by_seq(UCLASS_WDT, 0, > - (struct udevice **)&gd->watchdog_dev)) { > - debug("WDT: Not found by seq!\n"); > - if (uclass_get_device(UCLASS_WDT, 0, > - (struct udevice **)&gd->watchdog_dev)) { > - printf("WDT: Not found!\n"); > - return 0; > + struct udevice *dev; > + struct uclass *uc; > + int ret; > + > + ret = uclass_get(UCLASS_WDT, &uc); > + if (ret) { > + log_debug("Error getting UCLASS_WDT: %d\n", ret); > + return 0; > + } > + > + uclass_foreach_dev(dev, uc) { > + ret = device_probe(dev); > + if (ret) { > + log_debug("Error probing %s: %d\n", dev->name, ret); > + continue; > } > + init_watchdog_dev(dev); > } > - init_watchdog_dev(gd->watchdog_dev); > > gd->flags |= GD_FLG_WDT_READY; > return 0; > @@ -157,22 +161,34 @@ void watchdog_reset(void) > { > struct wdt_priv *priv; > struct udevice *dev; > + struct uclass *uc; > ulong now; > > /* Exit if GD is not ready or watchdog is not initialized yet */ > if (!gd || !(gd->flags & GD_FLG_WDT_READY)) > return; > > - dev = gd->watchdog_dev; > - priv = dev_get_uclass_priv(dev); > - if (!priv->running) > + if (uclass_get(UCLASS_WDT, &uc)) > return; > > - /* Do not reset the watchdog too often */ > - now = get_timer(0); > - if (time_after_eq(now, priv->next_reset)) { > - priv->next_reset = now + priv->reset_period; > - wdt_reset(dev); > + /* > + * All devices bound to the wdt uclass should have been probed > + * in initr_watchdog(). But just in case something went wrong, > + * check device_active() before accessing the uclass private > + * data. > + */ > + uclass_foreach_dev(dev, uc) { > + if (!device_active(dev)) > + continue; > + priv = dev_get_uclass_priv(dev); > + if (!priv->running) > + continue; > + /* Do not reset the watchdog too often */ > + now = get_timer(0); > + if (time_after_eq(now, priv->next_reset)) { > + priv->next_reset = now + priv->reset_period; > + wdt_reset(dev); > + } > } > } > #endif > diff --git a/include/asm-generic/global_data.h b/include/asm-generic/global_data.h > index e55070303f..28d749538c 100644 > --- a/include/asm-generic/global_data.h > +++ b/include/asm-generic/global_data.h > @@ -447,12 +447,6 @@ struct global_data { > */ > fdt_addr_t translation_offset; > #endif > -#if CONFIG_IS_ENABLED(WDT) > - /** > - * @watchdog_dev: watchdog device > - */ > - struct udevice *watchdog_dev; > -#endif After applying the patchset v4 to current master, I see this error when building for "x530": board/alliedtelesis/x530/x530.c: In function 'arch_preboot_os': board/alliedtelesis/x530/x530.c:125:13: error: 'gd_t' {aka 'volatile struct global_data'} has no member named 'watchdog_dev' 125 | wdt_stop(gd->watchdog_dev); | ^~ make[1]: *** [scripts/Makefile.build:254: board/alliedtelesis/x530/x530.o] Error 1 Perhaps we need a common function now to stop all watchdogs, which can be called from such places? Thanks, Stefan
On 03/08/2021 10.28, Stefan Roese wrote: > Hi Rasmus, > >> #endif >> diff --git a/include/asm-generic/global_data.h >> b/include/asm-generic/global_data.h >> index e55070303f..28d749538c 100644 >> --- a/include/asm-generic/global_data.h >> +++ b/include/asm-generic/global_data.h >> @@ -447,12 +447,6 @@ struct global_data { >> */ >> fdt_addr_t translation_offset; >> #endif >> -#if CONFIG_IS_ENABLED(WDT) >> - /** >> - * @watchdog_dev: watchdog device >> - */ >> - struct udevice *watchdog_dev; >> -#endif > > After applying the patchset v4 to current master, I see this error when > building for "x530": I'm not sure how I missed this, I was convinced I had done a grep and seen that all references to ->watchdog_dev were gone. > board/alliedtelesis/x530/x530.c: In function 'arch_preboot_os': > board/alliedtelesis/x530/x530.c:125:13: error: 'gd_t' {aka 'volatile > struct global_data'} has no member named 'watchdog_dev' > 125 | wdt_stop(gd->watchdog_dev); > | ^~ > make[1]: *** [scripts/Makefile.build:254: > board/alliedtelesis/x530/x530.o] Error 1 > > Perhaps we need a common function now to stop all watchdogs, which can > be called from such places? Yes, I think that's the right thing, even if there's just one single caller. Dead code elimination should remove that global function for all other boards. Here is what I have right now: watchdog: wdt-uclass: add wdt_stop_all() helper Since the watchdog_dev member of struct global_data is going away in favor of the wdt-uclass handling all watchdog devices, prepare for that by adding a helper to call wdt_stop() on all known devices. Initially, this will only be used in one single place (board/alliedtelesis/x530/x530.c), and that user currently ignores the return value of wdt_stop(). It's not clear what one should do if we encounter an error (remember the error but still stop the remaining ones? return immediately? try to unwind and restart the devices already stopped?). Moreover, some watchdogs are by design always-running (and don't have a ->stop method at all), so at the very least some exception for -ENOSYS would be in order. So for now, and until a real use case appears from which we can decide the desired semantics, have the function return void and just emit a log_debug() if an error is encountered. diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c index 358fc68e27..75ff4c2a6c 100644 --- a/drivers/watchdog/wdt-uclass.c +++ b/drivers/watchdog/wdt-uclass.c @@ -116,6 +116,31 @@ int wdt_stop(struct udevice *dev) return ret; } +void wdt_stop_all(void) +{ + struct wdt_priv *priv; + struct udevice *dev; + struct uclass *uc; + int ret; + + ret = uclass_get(UCLASS_WDT, &uc); + if (ret) { + log_debug("Error getting UCLASS_WDT: %d\n", ret); + return; + } + + uclass_foreach_dev(dev, uc) { + if (!device_active(dev)) + continue; + priv = dev_get_uclass_priv(dev); + if (!priv->running) + continue; + ret = wdt_stop(dev); + if (ret) + log_debug("wdt_stop(%s) failed: %d\n", dev->name, ret); + } +} + int wdt_reset(struct udevice *dev) { const struct wdt_ops *ops = device_get_ops(dev); (along with a declaration in wdt.h, and another patch for x530 to make use of it). Something like that? Rasmus
Hi Rasmus, On 11.08.21 13:32, Rasmus Villemoes wrote: > On 03/08/2021 10.28, Stefan Roese wrote: >> Hi Rasmus, >> >>> #endif >>> diff --git a/include/asm-generic/global_data.h >>> b/include/asm-generic/global_data.h >>> index e55070303f..28d749538c 100644 >>> --- a/include/asm-generic/global_data.h >>> +++ b/include/asm-generic/global_data.h >>> @@ -447,12 +447,6 @@ struct global_data { >>> */ >>> fdt_addr_t translation_offset; >>> #endif >>> -#if CONFIG_IS_ENABLED(WDT) >>> - /** >>> - * @watchdog_dev: watchdog device >>> - */ >>> - struct udevice *watchdog_dev; >>> -#endif >> >> After applying the patchset v4 to current master, I see this error when >> building for "x530": > > I'm not sure how I missed this, I was convinced I had done a grep and > seen that all references to ->watchdog_dev were gone. > >> board/alliedtelesis/x530/x530.c: In function 'arch_preboot_os': >> board/alliedtelesis/x530/x530.c:125:13: error: 'gd_t' {aka 'volatile >> struct global_data'} has no member named 'watchdog_dev' >> 125 | wdt_stop(gd->watchdog_dev); >> | ^~ >> make[1]: *** [scripts/Makefile.build:254: >> board/alliedtelesis/x530/x530.o] Error 1 >> >> Perhaps we need a common function now to stop all watchdogs, which can >> be called from such places? > > Yes, I think that's the right thing, even if there's just one single > caller. Dead code elimination should remove that global function for all > other boards. Here is what I have right now: > > watchdog: wdt-uclass: add wdt_stop_all() helper > > Since the watchdog_dev member of struct global_data is going away in > favor of the wdt-uclass handling all watchdog devices, prepare for > that by adding a helper to call wdt_stop() on all known devices. > > Initially, this will only be used in one single > place (board/alliedtelesis/x530/x530.c), and that user currently > ignores the return value of wdt_stop(). It's not clear what one should > do if we encounter an error (remember the error but still stop the > remaining ones? return immediately? try to unwind and > restart the devices already stopped?). Moreover, some watchdogs are by > design always-running (and don't have a ->stop method at all), so at > the very least some exception for -ENOSYS would be in order. > > So for now, and until a real use case appears from which we can decide > the desired semantics, have the function return void and just emit a > log_debug() if an error is encountered. > > diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c > index 358fc68e27..75ff4c2a6c 100644 > --- a/drivers/watchdog/wdt-uclass.c > +++ b/drivers/watchdog/wdt-uclass.c > @@ -116,6 +116,31 @@ int wdt_stop(struct udevice *dev) > return ret; > } > > +void wdt_stop_all(void) > +{ > + struct wdt_priv *priv; > + struct udevice *dev; > + struct uclass *uc; > + int ret; > + > + ret = uclass_get(UCLASS_WDT, &uc); > + if (ret) { > + log_debug("Error getting UCLASS_WDT: %d\n", ret); Perhaps log_err()? > + return; > + } > + > + uclass_foreach_dev(dev, uc) { > + if (!device_active(dev)) > + continue; > + priv = dev_get_uclass_priv(dev); > + if (!priv->running) > + continue; > + ret = wdt_stop(dev); > + if (ret) > + log_debug("wdt_stop(%s) failed: %d\n", > dev->name, ret); Again, log_err() might be better? > + } > +} > + > int wdt_reset(struct udevice *dev) > { > const struct wdt_ops *ops = device_get_ops(dev); > > (along with a declaration in wdt.h, and another patch for x530 to make > use of it). Something like that? Looks good, thanks for quickly working on this. Not sure, if this new function should be "void" or better "int" so that the error can be returned. A follow-up patch on-top your patchset v4 will cause git bisect problems. So you need to integrate this into your patches and send a v5 I'm afraid. Thanks, Stefan
On 11/08/2021 13.49, Stefan Roese wrote: > Hi Rasmus, > > On 11.08.21 13:32, Rasmus Villemoes wrote: >> On 03/08/2021 10.28, Stefan Roese wrote: >>> Hi Rasmus, >>> >>>> #endif >>>> diff --git a/include/asm-generic/global_data.h >>>> b/include/asm-generic/global_data.h >>>> index e55070303f..28d749538c 100644 >>>> --- a/include/asm-generic/global_data.h >>>> +++ b/include/asm-generic/global_data.h >>>> @@ -447,12 +447,6 @@ struct global_data { >>>> */ >>>> fdt_addr_t translation_offset; >>>> #endif >>>> -#if CONFIG_IS_ENABLED(WDT) >>>> - /** >>>> - * @watchdog_dev: watchdog device >>>> - */ >>>> - struct udevice *watchdog_dev; >>>> -#endif >>> >>> After applying the patchset v4 to current master, I see this error when >>> building for "x530": >> >> I'm not sure how I missed this, I was convinced I had done a grep and >> seen that all references to ->watchdog_dev were gone. >> >>> board/alliedtelesis/x530/x530.c: In function 'arch_preboot_os': >>> board/alliedtelesis/x530/x530.c:125:13: error: 'gd_t' {aka 'volatile >>> struct global_data'} has no member named 'watchdog_dev' >>> 125 | wdt_stop(gd->watchdog_dev); >>> | ^~ >>> make[1]: *** [scripts/Makefile.build:254: >>> board/alliedtelesis/x530/x530.o] Error 1 >>> >>> Perhaps we need a common function now to stop all watchdogs, which can >>> be called from such places? >> >> Yes, I think that's the right thing, even if there's just one single >> caller. Dead code elimination should remove that global function for all >> other boards. Here is what I have right now: >> >> watchdog: wdt-uclass: add wdt_stop_all() helper >> >> Since the watchdog_dev member of struct global_data is going away in >> favor of the wdt-uclass handling all watchdog devices, prepare for >> that by adding a helper to call wdt_stop() on all known devices. >> >> Initially, this will only be used in one single >> place (board/alliedtelesis/x530/x530.c), and that user currently >> ignores the return value of wdt_stop(). It's not clear what one >> should >> do if we encounter an error (remember the error but still stop the >> remaining ones? return immediately? try to unwind and >> restart the devices already stopped?). Moreover, some watchdogs >> are by >> design always-running (and don't have a ->stop method at all), so at >> the very least some exception for -ENOSYS would be in order. >> >> So for now, and until a real use case appears from which we can >> decide >> the desired semantics, have the function return void and just emit a >> log_debug() if an error is encountered. >> >> diff --git a/drivers/watchdog/wdt-uclass.c >> b/drivers/watchdog/wdt-uclass.c >> index 358fc68e27..75ff4c2a6c 100644 >> --- a/drivers/watchdog/wdt-uclass.c >> +++ b/drivers/watchdog/wdt-uclass.c >> @@ -116,6 +116,31 @@ int wdt_stop(struct udevice *dev) >> return ret; >> } >> >> +void wdt_stop_all(void) >> +{ >> + struct wdt_priv *priv; >> + struct udevice *dev; >> + struct uclass *uc; >> + int ret; >> + >> + ret = uclass_get(UCLASS_WDT, &uc); >> + if (ret) { >> + log_debug("Error getting UCLASS_WDT: %d\n", ret); > > Perhaps log_err()? No, we've already been over this in earlier discussions (it's the exact same pattern and reasoning as initr_watchdog). If I made it log_err(), it would cost .text for something that never-ever happens in practice, while log_debug() is usually a no-op, but can be compiled in if something truly fishy seems to be going on. >> int wdt_reset(struct udevice *dev) >> { >> const struct wdt_ops *ops = device_get_ops(dev); >> >> (along with a declaration in wdt.h, and another patch for x530 to make >> use of it). Something like that? > > Looks good, thanks for quickly working on this. Not sure, if this new > function should be "void" or better "int" so that the error can be > returned. That's why I included my tentative commit log, so you could see my explanation for why I made it void. Until some user shows up that _wants_ a return value, there's no point making it return int. When that user shows up, we can discuss which int (return early on failure? remember that an error was seen but still call wdt_stop on remaining devices? etc. etc.). > A follow-up patch on-top your patchset v4 will cause git bisect > problems. So you need to integrate this into your patches and send a > v5 I'm afraid. Definitely, it's already reworked in between 6 and 7 in my local branch, I just didn't want to send a 12-patch series without some "yeah, something like that would work". Rasmus
On 11.08.21 14:13, Rasmus Villemoes wrote: > On 11/08/2021 13.49, Stefan Roese wrote: >> Hi Rasmus, >> >> On 11.08.21 13:32, Rasmus Villemoes wrote: >>> On 03/08/2021 10.28, Stefan Roese wrote: >>>> Hi Rasmus, >>>> >>>>> #endif >>>>> diff --git a/include/asm-generic/global_data.h >>>>> b/include/asm-generic/global_data.h >>>>> index e55070303f..28d749538c 100644 >>>>> --- a/include/asm-generic/global_data.h >>>>> +++ b/include/asm-generic/global_data.h >>>>> @@ -447,12 +447,6 @@ struct global_data { >>>>> */ >>>>> fdt_addr_t translation_offset; >>>>> #endif >>>>> -#if CONFIG_IS_ENABLED(WDT) >>>>> - /** >>>>> - * @watchdog_dev: watchdog device >>>>> - */ >>>>> - struct udevice *watchdog_dev; >>>>> -#endif >>>> >>>> After applying the patchset v4 to current master, I see this error when >>>> building for "x530": >>> >>> I'm not sure how I missed this, I was convinced I had done a grep and >>> seen that all references to ->watchdog_dev were gone. >>> >>>> board/alliedtelesis/x530/x530.c: In function 'arch_preboot_os': >>>> board/alliedtelesis/x530/x530.c:125:13: error: 'gd_t' {aka 'volatile >>>> struct global_data'} has no member named 'watchdog_dev' >>>> 125 | wdt_stop(gd->watchdog_dev); >>>> | ^~ >>>> make[1]: *** [scripts/Makefile.build:254: >>>> board/alliedtelesis/x530/x530.o] Error 1 >>>> >>>> Perhaps we need a common function now to stop all watchdogs, which can >>>> be called from such places? >>> >>> Yes, I think that's the right thing, even if there's just one single >>> caller. Dead code elimination should remove that global function for all >>> other boards. Here is what I have right now: >>> >>> watchdog: wdt-uclass: add wdt_stop_all() helper >>> >>> Since the watchdog_dev member of struct global_data is going away in >>> favor of the wdt-uclass handling all watchdog devices, prepare for >>> that by adding a helper to call wdt_stop() on all known devices. >>> >>> Initially, this will only be used in one single >>> place (board/alliedtelesis/x530/x530.c), and that user currently >>> ignores the return value of wdt_stop(). It's not clear what one >>> should >>> do if we encounter an error (remember the error but still stop the >>> remaining ones? return immediately? try to unwind and >>> restart the devices already stopped?). Moreover, some watchdogs >>> are by >>> design always-running (and don't have a ->stop method at all), so at >>> the very least some exception for -ENOSYS would be in order. >>> >>> So for now, and until a real use case appears from which we can >>> decide >>> the desired semantics, have the function return void and just emit a >>> log_debug() if an error is encountered. >>> >>> diff --git a/drivers/watchdog/wdt-uclass.c >>> b/drivers/watchdog/wdt-uclass.c >>> index 358fc68e27..75ff4c2a6c 100644 >>> --- a/drivers/watchdog/wdt-uclass.c >>> +++ b/drivers/watchdog/wdt-uclass.c >>> @@ -116,6 +116,31 @@ int wdt_stop(struct udevice *dev) >>> return ret; >>> } >>> >>> +void wdt_stop_all(void) >>> +{ >>> + struct wdt_priv *priv; >>> + struct udevice *dev; >>> + struct uclass *uc; >>> + int ret; >>> + >>> + ret = uclass_get(UCLASS_WDT, &uc); >>> + if (ret) { >>> + log_debug("Error getting UCLASS_WDT: %d\n", ret); >> >> Perhaps log_err()? > > No, we've already been over this in earlier discussions (it's the exact > same pattern and reasoning as initr_watchdog). If I made it log_err(), > it would cost .text for something that never-ever happens in practice, > while log_debug() is usually a no-op, but can be compiled in if > something truly fishy seems to be going on. Okay, then please stick with log_debug(). >>> int wdt_reset(struct udevice *dev) >>> { >>> const struct wdt_ops *ops = device_get_ops(dev); >>> >>> (along with a declaration in wdt.h, and another patch for x530 to make >>> use of it). Something like that? >> >> Looks good, thanks for quickly working on this. Not sure, if this new >> function should be "void" or better "int" so that the error can be >> returned. > > That's why I included my tentative commit log, so you could see my > explanation for why I made it void. Ah, sorry for not reading it thoroughly. My bad. > Until some user shows up that > _wants_ a return value, there's no point making it return int. When that > user shows up, we can discuss which int (return early on failure? > remember that an error was seen but still call wdt_stop on remaining > devices? etc. etc.). I'm fine with it. >> A follow-up patch on-top your patchset v4 will cause git bisect >> problems. So you need to integrate this into your patches and send a >> v5 I'm afraid. > > Definitely, it's already reworked in between 6 and 7 in my local branch, > I just didn't want to send a 12-patch series without some "yeah, > something like that would work". Perfect. Please go ahead then. Thanks, Stefan
Dear Rasmus, In message <3d48015a-07d3-e296-b9ba-a1edd455ce9e@prevas.dk> you wrote: > > >> + if (ret) { > >> + log_debug("Error getting UCLASS_WDT: %d\n", ret); > > > > Perhaps log_err()? > > No, we've already been over this in earlier discussions (it's the exact > same pattern and reasoning as initr_watchdog). If I made it log_err(), > it would cost .text for something that never-ever happens in practice, > while log_debug() is usually a no-op, but can be compiled in if > something truly fishy seems to be going on. This argument fits on all types or effors: they are supposed to never ever happen - at least in theory; in reality they do, and more often than we like. And a proper error message is mandatory for correct error handling. > > Looks good, thanks for quickly working on this. Not sure, if this new > > function should be "void" or better "int" so that the error can be > > returned. > > That's why I included my tentative commit log, so you could see my > explanation for why I made it void. Until some user shows up that > _wants_ a return value, there's no point making it return int. When that > user shows up, we can discuss which int (return early on failure? > remember that an error was seen but still call wdt_stop on remaining > devices? etc. etc.). Returning an error code is always a good ide, no matter if current users check it or not. Best regards, Wolfgang Denk
On Wed, Aug 11, 2021 at 02:29:12PM +0200, Wolfgang Denk wrote: > Dear Rasmus, > > In message <3d48015a-07d3-e296-b9ba-a1edd455ce9e@prevas.dk> you wrote: > > > > >> + if (ret) { > > >> + log_debug("Error getting UCLASS_WDT: %d\n", ret); > > > > > > Perhaps log_err()? > > > > No, we've already been over this in earlier discussions (it's the exact > > same pattern and reasoning as initr_watchdog). If I made it log_err(), > > it would cost .text for something that never-ever happens in practice, > > while log_debug() is usually a no-op, but can be compiled in if > > something truly fishy seems to be going on. > > This argument fits on all types or effors: they are supposed to > never ever happen - at least in theory; in reality they do, and more > often than we like. > > And a proper error message is mandatory for correct error handling. Error messages are a fine line to walk. We've got to have been very badly corrupted to go down this error path. There's going to be lots of other error messages popping out. Saving a bit of .text is good. It makes it easier to justify spending a little .text later. > > > Looks good, thanks for quickly working on this. Not sure, if this new > > > function should be "void" or better "int" so that the error can be > > > returned. > > > > That's why I included my tentative commit log, so you could see my > > explanation for why I made it void. Until some user shows up that > > _wants_ a return value, there's no point making it return int. When that > > user shows up, we can discuss which int (return early on failure? > > remember that an error was seen but still call wdt_stop on remaining > > devices? etc. etc.). > > Returning an error code is always a good ide, no matter if > current users check it or not. And here I agree, catch an error code, pass the error code back to the caller. That's far more important than making sure that every error code we catch logs a message by default every time.
Dear Tom, In message <20210811124318.GT858@bill-the-cat> you wrote: > > > This argument fits on all types or effors: they are supposed to > > never ever happen - at least in theory; in reality they do, and more > > often than we like. > > > > And a proper error message is mandatory for correct error handling. > > Error messages are a fine line to walk. We've got to have been very > badly corrupted to go down this error path. There's going to be lots of > other error messages popping out. Saving a bit of .text is good. It > makes it easier to justify spending a little .text later. Letting errors slip through unnoticed when there is a trival way to at least inform the user of the problem is simply unacceptable. Please do not let U-Boot degrade into such a crappy piece of code. There are tons of other places where we don't even mention code size, so if you want to save on that, there are many bette4r places to save than error handling. > And here I agree, catch an error code, pass the error code back to the > caller. That's far more important than making sure that every error > code we catch logs a message by default every time. It does not matter where the error is reported - in the called function, or in some caller firther up the call tree. But it _must_ be reportet at least once. So if we don't issue an error message here, we need to check and fix the callers, too. Best regards, Wolfgang Denk
On Thu, Aug 12, 2021 at 08:40:21AM +0200, Wolfgang Denk wrote: > Dear Tom, > > In message <20210811124318.GT858@bill-the-cat> you wrote: > > > > > This argument fits on all types or effors: they are supposed to > > > never ever happen - at least in theory; in reality they do, and more > > > often than we like. > > > > > > And a proper error message is mandatory for correct error handling. > > > > Error messages are a fine line to walk. We've got to have been very > > badly corrupted to go down this error path. There's going to be lots of > > other error messages popping out. Saving a bit of .text is good. It > > makes it easier to justify spending a little .text later. > > Letting errors slip through unnoticed when there is a trival way to > at least inform the user of the problem is simply unacceptable. > > Please do not let U-Boot degrade into such a crappy piece of code. > > There are tons of other places where we don't even mention code > size, so if you want to save on that, there are many bette4r places > to save than error handling. Alright, lets take a look at what kind of area of the code we're talking about. uclass_get is a pretty fundamental thing. If that fails, your system is on fire. Things are massively corrupt. Lets look at other existing callers to see what happens. Most callers check the return code, like you need to, and pass it up the chain to deal with. We have a few board specific ones such as board/Marvell/octeontx2/board.c::board_quiesce_devices() that is also conceptually like the x530 case in the next part of the series. That does print on failure. The rest of the ones that print an error message are about commands and it's somewhat helpful there. So yes, return codes need to be checked and passed. But no, not every single error path needs to print to the user along every part of an error path either. > > And here I agree, catch an error code, pass the error code back to the > > caller. That's far more important than making sure that every error > > code we catch logs a message by default every time. > > It does not matter where the error is reported - in the called > function, or in some caller firther up the call tree. But it _must_ > be reportet at least once. > > So if we don't issue an error message here, we need to check and fix > the callers, too. That would be the next patch in the series where the BSP author isn't currently checking the return value, and this series doesn't change that. Perhaps it should, and CC the maintainer. But I think has been said a few times over the course of this series, what exactly is one going to do about the failure? Getting specific for a moment, if you're in the case of "shutdown the watchdog" and the watchdog doesn't shutdown like you want it to, do you hang and hope the watchdog is alive to kick things still? hang and lock the system? Figure the system is on fire anyhow but add another message to the failure spew? Again, I think the change that's needed to this patch is to make it return the error code to the caller. Let the caller decide. And make sure to CC the board maintainer on the next go-round so they can chime in about that.
Hi Tom, On Thu, 12 Aug 2021 at 07:48, Tom Rini <trini@konsulko.com> wrote: > > On Thu, Aug 12, 2021 at 08:40:21AM +0200, Wolfgang Denk wrote: > > Dear Tom, > > > > In message <20210811124318.GT858@bill-the-cat> you wrote: > > > > > > > This argument fits on all types or effors: they are supposed to > > > > never ever happen - at least in theory; in reality they do, and more > > > > often than we like. > > > > > > > > And a proper error message is mandatory for correct error handling. > > > > > > Error messages are a fine line to walk. We've got to have been very > > > badly corrupted to go down this error path. There's going to be lots of > > > other error messages popping out. Saving a bit of .text is good. It > > > makes it easier to justify spending a little .text later. > > > > Letting errors slip through unnoticed when there is a trival way to > > at least inform the user of the problem is simply unacceptable. > > > > Please do not let U-Boot degrade into such a crappy piece of code. > > > > There are tons of other places where we don't even mention code > > size, so if you want to save on that, there are many bette4r places > > to save than error handling. > > Alright, lets take a look at what kind of area of the code we're talking > about. uclass_get is a pretty fundamental thing. If that fails, your > system is on fire. Things are massively corrupt. Lets look at other > existing callers to see what happens. Most callers check the return > code, like you need to, and pass it up the chain to deal with. We have > a few board specific ones such as > board/Marvell/octeontx2/board.c::board_quiesce_devices() that is also > conceptually like the x530 case in the next part of the series. That > does print on failure. The rest of the ones that print an error message > are about commands and it's somewhat helpful there. > > So yes, return codes need to be checked and passed. But no, not every > single error path needs to print to the user along every part of an > error path either. > > > > And here I agree, catch an error code, pass the error code back to the > > > caller. That's far more important than making sure that every error > > > code we catch logs a message by default every time. > > > > It does not matter where the error is reported - in the called > > function, or in some caller firther up the call tree. But it _must_ > > be reportet at least once. > > > > So if we don't issue an error message here, we need to check and fix > > the callers, too. > > That would be the next patch in the series where the BSP author isn't > currently checking the return value, and this series doesn't change > that. Perhaps it should, and CC the maintainer. But I think has been > said a few times over the course of this series, what exactly is one > going to do about the failure? Getting specific for a moment, if you're > in the case of "shutdown the watchdog" and the watchdog doesn't shutdown > like you want it to, do you hang and hope the watchdog is alive to kick > things still? hang and lock the system? Figure the system is on fire > anyhow but add another message to the failure spew? > > Again, I think the change that's needed to this patch is to make it > return the error code to the caller. Let the caller decide. And make > sure to CC the board maintainer on the next go-round so they can chime > in about that. I strongly agree with this.I try to encourage people not to report errors inside drivers, since the caller should be able to deal with it, particularly if the error number provides useful information. It bloats the code. But we do have these strange cases where there is no obvious thing to do. Where we are probing several devices to fire them up and one fails, people worry that this means that some of them won't get probed. In that case I tend to continue on and probe them all, but then return error if any of them failed. At least the caller knows. Also I like the new log_ret() and log_msg_ret() functions, which can log an error as it goes up the stack if LOG_ERROR_RETURN is enabled. It is nice to be able to see where an error came from. We could perhaps improve this by logging the error when it is created (the first time some calls 'return -Exxx'). I'm not a fan of board code which calls a function and doesn't check the error. The board may not operate correctly, or it may limp along, but the board author should be able to get all the bugs of it at the start so that we are not requested invalid clocks, etc. Regards, Simon
Dear Tom, In message <20210812134833.GU858@bill-the-cat> you wrote: > > Alright, lets take a look at what kind of area of the code we're talking > about. uclass_get is a pretty fundamental thing. If that fails, your > system is on fire. Things are massively corrupt. Full agreement here. > So yes, return codes need to be checked and passed. But no, not every > single error path needs to print to the user along every part of an > error path either. So if "the system is on fire" is one of the cases where an error message should be omitted to save maybe 50 or 100 bytes of image size? This sounds wrong to me. When a system crashes or hangs, it is extremely helpful to gen an indication of what happened. Printing messages only with debug enabled is pretty worthless, as in the real world the failures will happen in the field when running the production image with debug not enabled. And as soon as you do enable debug, you will have a different image, which may or may not show the problem. We have all been there before. > That would be the next patch in the series where the BSP author isn't > currently checking the return value, and this series doesn't change > that. Perhaps it should, and CC the maintainer. Indeed. > But I think has been > said a few times over the course of this series, what exactly is one > going to do about the failure? Getting specific for a moment, if you're > in the case of "shutdown the watchdog" and the watchdog doesn't shutdown > like you want it to, do you hang and hope the watchdog is alive to kick > things still? hang and lock the system? Figure the system is on fire > anyhow but add another message to the failure spew? Adding a message about the _cause_ of the failure, i. e. at least information about the place where it was first detected, is what will be helpful to the engineer who is supposed to analyze the problem. And yes, if such a fundamental operation fails, it is wise to abort the operation of this device - either by hard resetting it or by hanging it, depending of what the chances are that a reboot might fix the problem. IMO it is better to fail a broken device in a reliable mode instead of letting it run and having more spectacular crashes (likely with more serious consequences) happen later. > Again, I think the change that's needed to this patch is to make it > return the error code to the caller. Let the caller decide. And make > sure to CC the board maintainer on the next go-round so they can chime > in about that. If we agree that in the disputed case "the system is on fire", then there is actually very little to decide. There should be only one possible action: stop here, before more damage happens. Best regards, Wolfgang Denk
On Thu, Aug 12, 2021 at 04:21:29PM +0200, Wolfgang Denk wrote: > Dear Tom, > > In message <20210812134833.GU858@bill-the-cat> you wrote: > > > > Alright, lets take a look at what kind of area of the code we're talking > > about. uclass_get is a pretty fundamental thing. If that fails, your > > system is on fire. Things are massively corrupt. > > Full agreement here. > > > So yes, return codes need to be checked and passed. But no, not every > > single error path needs to print to the user along every part of an > > error path either. > > So if "the system is on fire" is one of the cases where an error > message should be omitted to save maybe 50 or 100 bytes of image > size? This sounds wrong to me. It sounds right to me because it's unlikely everything caught fire because of this call right here and likely it's because of one of the messages much further up on the console log. Hopefully we haven't caused that message to be unavailable now due to unhelpful failure messages. A log message needs to have value to it above and beyond boiling down to "%s: %d", __func__, __LINE__ having been reached. This, right here, is not a log message that matters. With DM we've made a great deal of progress in being able to populate meaningful errors back up to our callers rather than -1 for everything. So yes, in sum, these functions need to return a value. The BSP ought to care (in the next patch), even if it doesn't today when it could. But that's on the BSP author as they know better than you or I what that system is being used for.
Dear Tom, In message <20210812162034.GY858@bill-the-cat> you wrote: > > > So if "the system is on fire" is one of the cases where an error > > message should be omitted to save maybe 50 or 100 bytes of image > > size? This sounds wrong to me. > > It sounds right to me because it's unlikely everything caught fire > because of this call right here and likely it's because of one of the > messages much further up on the console log. Hopefully we haven't > caused that message to be unavailable now due to unhelpful failure > messages. Omitting code to handle situations that are unlikely to happen is definitely not what I consider robust programming, and nothing which I would let pass a code review. But if you insist, there is no more to do for me here that to note that fact that we disagree. Wolfgang Denk
On 12.08.21 15:48, Tom Rini wrote: > On Thu, Aug 12, 2021 at 08:40:21AM +0200, Wolfgang Denk wrote: >> Dear Tom, >> >> In message <20210811124318.GT858@bill-the-cat> you wrote: >>> >>>> This argument fits on all types or effors: they are supposed to >>>> never ever happen - at least in theory; in reality they do, and more >>>> often than we like. >>>> >>>> And a proper error message is mandatory for correct error handling. >>> >>> Error messages are a fine line to walk. We've got to have been very >>> badly corrupted to go down this error path. There's going to be lots of >>> other error messages popping out. Saving a bit of .text is good. It >>> makes it easier to justify spending a little .text later. >> >> Letting errors slip through unnoticed when there is a trival way to >> at least inform the user of the problem is simply unacceptable. >> >> Please do not let U-Boot degrade into such a crappy piece of code. >> >> There are tons of other places where we don't even mention code >> size, so if you want to save on that, there are many bette4r places >> to save than error handling. > > Alright, lets take a look at what kind of area of the code we're talking > about. uclass_get is a pretty fundamental thing. If that fails, your > system is on fire. Things are massively corrupt. Lets look at other > existing callers to see what happens. Most callers check the return > code, like you need to, and pass it up the chain to deal with. We have > a few board specific ones such as > board/Marvell/octeontx2/board.c::board_quiesce_devices() that is also > conceptually like the x530 case in the next part of the series. That > does print on failure. The rest of the ones that print an error message > are about commands and it's somewhat helpful there. > > So yes, return codes need to be checked and passed. But no, not every > single error path needs to print to the user along every part of an > error path either. > >>> And here I agree, catch an error code, pass the error code back to the >>> caller. That's far more important than making sure that every error >>> code we catch logs a message by default every time. >> >> It does not matter where the error is reported - in the called >> function, or in some caller firther up the call tree. But it _must_ >> be reportet at least once. >> >> So if we don't issue an error message here, we need to check and fix >> the callers, too. > > That would be the next patch in the series where the BSP author isn't > currently checking the return value, and this series doesn't change > that. Perhaps it should, and CC the maintainer. But I think has been > said a few times over the course of this series, what exactly is one > going to do about the failure? Getting specific for a moment, if you're > in the case of "shutdown the watchdog" and the watchdog doesn't shutdown > like you want it to, do you hang and hope the watchdog is alive to kick > things still? hang and lock the system? Figure the system is on fire > anyhow but add another message to the failure spew? > > Again, I think the change that's needed to this patch is to make it > return the error code to the caller. Let the caller decide. And make > sure to CC the board maintainer on the next go-round so they can chime > in about that. Getting back to this to hopefully get this decided: It seems that we (most of us?) agree on this change, that wdt_stop_all() shall be changed to return an error code and the caller can decide what to do with it? If yes, then Rasmus, could you please re-spin this patchset accordingly and send v6? Thanks, Stefan
On Tue, Aug 17, 2021 at 11:28:39AM +0200, Stefan Roese wrote: > On 12.08.21 15:48, Tom Rini wrote: > > On Thu, Aug 12, 2021 at 08:40:21AM +0200, Wolfgang Denk wrote: > > > Dear Tom, > > > > > > In message <20210811124318.GT858@bill-the-cat> you wrote: > > > > > > > > > This argument fits on all types or effors: they are supposed to > > > > > never ever happen - at least in theory; in reality they do, and more > > > > > often than we like. > > > > > > > > > > And a proper error message is mandatory for correct error handling. > > > > > > > > Error messages are a fine line to walk. We've got to have been very > > > > badly corrupted to go down this error path. There's going to be lots of > > > > other error messages popping out. Saving a bit of .text is good. It > > > > makes it easier to justify spending a little .text later. > > > > > > Letting errors slip through unnoticed when there is a trival way to > > > at least inform the user of the problem is simply unacceptable. > > > > > > Please do not let U-Boot degrade into such a crappy piece of code. > > > > > > There are tons of other places where we don't even mention code > > > size, so if you want to save on that, there are many bette4r places > > > to save than error handling. > > > > Alright, lets take a look at what kind of area of the code we're talking > > about. uclass_get is a pretty fundamental thing. If that fails, your > > system is on fire. Things are massively corrupt. Lets look at other > > existing callers to see what happens. Most callers check the return > > code, like you need to, and pass it up the chain to deal with. We have > > a few board specific ones such as > > board/Marvell/octeontx2/board.c::board_quiesce_devices() that is also > > conceptually like the x530 case in the next part of the series. That > > does print on failure. The rest of the ones that print an error message > > are about commands and it's somewhat helpful there. > > > > So yes, return codes need to be checked and passed. But no, not every > > single error path needs to print to the user along every part of an > > error path either. > > > > > > And here I agree, catch an error code, pass the error code back to the > > > > caller. That's far more important than making sure that every error > > > > code we catch logs a message by default every time. > > > > > > It does not matter where the error is reported - in the called > > > function, or in some caller firther up the call tree. But it _must_ > > > be reportet at least once. > > > > > > So if we don't issue an error message here, we need to check and fix > > > the callers, too. > > > > That would be the next patch in the series where the BSP author isn't > > currently checking the return value, and this series doesn't change > > that. Perhaps it should, and CC the maintainer. But I think has been > > said a few times over the course of this series, what exactly is one > > going to do about the failure? Getting specific for a moment, if you're > > in the case of "shutdown the watchdog" and the watchdog doesn't shutdown > > like you want it to, do you hang and hope the watchdog is alive to kick > > things still? hang and lock the system? Figure the system is on fire > > anyhow but add another message to the failure spew? > > > > Again, I think the change that's needed to this patch is to make it > > return the error code to the caller. Let the caller decide. And make > > sure to CC the board maintainer on the next go-round so they can chime > > in about that. > > Getting back to this to hopefully get this decided: > > It seems that we (most of us?) agree on this change, that wdt_stop_all() > shall be changed to return an error code and the caller can decide what > to do with it? > > If yes, then Rasmus, could you please re-spin this patchset accordingly > and send v6? Yes, please and thanks.
Hi Tom, On 17.08.21 14:35, Tom Rini wrote: <snip> >> Getting back to this to hopefully get this decided: >> >> It seems that we (most of us?) agree on this change, that wdt_stop_all() >> shall be changed to return an error code and the caller can decide what >> to do with it? >> >> If yes, then Rasmus, could you please re-spin this patchset accordingly >> and send v6? > > Yes, please and thanks. Tom, would you like me to push this patchset in at this stage (rc2), or better defer to the next merge window? Thanks, Stefan
On Fri, Aug 27, 2021 at 08:30:48AM +0200, Stefan Roese wrote: > Hi Tom, > > On 17.08.21 14:35, Tom Rini wrote: > > <snip> > > > > Getting back to this to hopefully get this decided: > > > > > > It seems that we (most of us?) agree on this change, that wdt_stop_all() > > > shall be changed to return an error code and the caller can decide what > > > to do with it? > > > > > > If yes, then Rasmus, could you please re-spin this patchset accordingly > > > and send v6? > > > > Yes, please and thanks. > > Tom, would you like me to push this patchset in at this stage (rc2), or > better defer to the next merge window? I'm going to open -next on Monday, so lets put it there. Thanks!
diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c index 358fc68e27..0ce8b3a425 100644 --- a/drivers/watchdog/wdt-uclass.c +++ b/drivers/watchdog/wdt-uclass.c @@ -61,20 +61,24 @@ static void init_watchdog_dev(struct udevice *dev) int initr_watchdog(void) { - /* - * Init watchdog: This will call the probe function of the - * watchdog driver, enabling the use of the device - */ - if (uclass_get_device_by_seq(UCLASS_WDT, 0, - (struct udevice **)&gd->watchdog_dev)) { - debug("WDT: Not found by seq!\n"); - if (uclass_get_device(UCLASS_WDT, 0, - (struct udevice **)&gd->watchdog_dev)) { - printf("WDT: Not found!\n"); - return 0; + struct udevice *dev; + struct uclass *uc; + int ret; + + ret = uclass_get(UCLASS_WDT, &uc); + if (ret) { + log_debug("Error getting UCLASS_WDT: %d\n", ret); + return 0; + } + + uclass_foreach_dev(dev, uc) { + ret = device_probe(dev); + if (ret) { + log_debug("Error probing %s: %d\n", dev->name, ret); + continue; } + init_watchdog_dev(dev); } - init_watchdog_dev(gd->watchdog_dev); gd->flags |= GD_FLG_WDT_READY; return 0; @@ -157,22 +161,34 @@ void watchdog_reset(void) { struct wdt_priv *priv; struct udevice *dev; + struct uclass *uc; ulong now; /* Exit if GD is not ready or watchdog is not initialized yet */ if (!gd || !(gd->flags & GD_FLG_WDT_READY)) return; - dev = gd->watchdog_dev; - priv = dev_get_uclass_priv(dev); - if (!priv->running) + if (uclass_get(UCLASS_WDT, &uc)) return; - /* Do not reset the watchdog too often */ - now = get_timer(0); - if (time_after_eq(now, priv->next_reset)) { - priv->next_reset = now + priv->reset_period; - wdt_reset(dev); + /* + * All devices bound to the wdt uclass should have been probed + * in initr_watchdog(). But just in case something went wrong, + * check device_active() before accessing the uclass private + * data. + */ + uclass_foreach_dev(dev, uc) { + if (!device_active(dev)) + continue; + priv = dev_get_uclass_priv(dev); + if (!priv->running) + continue; + /* Do not reset the watchdog too often */ + now = get_timer(0); + if (time_after_eq(now, priv->next_reset)) { + priv->next_reset = now + priv->reset_period; + wdt_reset(dev); + } } } #endif diff --git a/include/asm-generic/global_data.h b/include/asm-generic/global_data.h index e55070303f..28d749538c 100644 --- a/include/asm-generic/global_data.h +++ b/include/asm-generic/global_data.h @@ -447,12 +447,6 @@ struct global_data { */ fdt_addr_t translation_offset; #endif -#if CONFIG_IS_ENABLED(WDT) - /** - * @watchdog_dev: watchdog device - */ - struct udevice *watchdog_dev; -#endif #ifdef CONFIG_GENERATE_ACPI_TABLE /** * @acpi_ctx: ACPI context pointer
A board can have and make use of more than one watchdog device, say one built into the SOC and an external gpio-petted one. Having wdt-uclass only handle the first is both a little arbitrary and unexpected. So change initr_watchdog() so we visit (probe) all DM watchdog devices, and call the init_watchdog_dev helper for each. Similarly let watchdog_reset() loop over the whole uclass - each having their own ratelimiting metadata, and a separate "is this device running" flag. This gets rid of the watchdog_dev member of struct global_data. We do, however, still need the GD_FLG_WDT_READY set in initr_watchdog(). This is because watchdog_reset() can get called before DM is ready, and I don't think we can call uclass_get() that early. The current code just returns 0 if "getting" the first device fails - that can of course happen because there are no devices, but it could also happen if its ->probe call failed. In keeping with that, continue with the handling of the remaining devices even if one fails to probe. This is also why we cannot use uclass_probe_all(). If desired, it's possible to later add a per-device "u-boot,autostart" boolean property, so that one can do CONFIG_WATCHDOG_AUTOSTART per-device. Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> --- drivers/watchdog/wdt-uclass.c | 56 ++++++++++++++++++++----------- include/asm-generic/global_data.h | 6 ---- 2 files changed, 36 insertions(+), 26 deletions(-)