Message ID | 20241101121201.2464091-2-chin-ting_kuo@aspeedtech.com |
---|---|
State | New |
Headers | show |
Series | Update ASPEED WDT bootstatus | expand |
On Fri, Nov 01, 2024 at 08:11:59PM +0800, Chin-Ting Kuo wrote: > The boot status mapping rule follows the latest design guide from > the OpenBMC shown as below. > https://github.com/openbmc/docs/blob/master/designs/bmc-reboot-cause-update.md#proposed-design > - WDIOF_EXTERN1 => system is reset by Software > - WDIOF_CARDRESET => system is reset by WDT SoC reset > - Others => other reset events, e.g., power on reset. I'm quite surprised that the above is relevant for a kernel driver at all. Isn't "EXTERN1" a name of a real watchdog signal from your hardware (my recollection is that there are 2 external watchdogs). I think the point of this referenced design document was that most users of BMCs have "EXTERN1" used a for software reset conditions. `CARDRESET` should be representing resets by the watchdog itself. The purpose of this design proposal was not to require very specific changes to individual watchdog drivers, but to align the userspace use with the best practices already from other watchdog drivers. I don't think the kernel driver should be bending to match a particular userspace implementation; you should be exposing the information available to your hardware. Having said that, it was known that there would need to be changes to the driver because some of these conditions were not adequately exposed at all. I'm just still surprised that we're needing to reference that document as part of these changes. > > On ASPEED platform, the boot status is recorded in the SCU registers. > - AST2400: Only a bit represents for any WDT reset. > - AST2500: The reset triggered by different WDT controllers can be > distinguished by different SCU bits. But, WDIOF_EXTERN1 or > WDIOF_CARDRESET still cannot be identified due to > HW limitation. > - AST2600: Different from AST2500, additional HW bits are added for > distinguishing WDIOF_EXTERN1 and WDIOF_CARDRESET.
On 11/1/24 05:11, Chin-Ting Kuo wrote: > The boot status in the watchdog device struct is updated during > controller probe stage. Application layer can get the boot status > through the command, cat /sys/class/watchdog/watchdogX/bootstatus. > > The boot status mapping rule follows the latest design guide from > the OpenBMC shown as below. > https://github.com/openbmc/docs/blob/master/designs/bmc-reboot-cause-update.md#proposed-design > - WDIOF_EXTERN1 => system is reset by Software > - WDIOF_CARDRESET => system is reset by WDT SoC reset > - Others => other reset events, e.g., power on reset. > > On ASPEED platform, the boot status is recorded in the SCU registers. > - AST2400: Only a bit represents for any WDT reset. > - AST2500: The reset triggered by different WDT controllers can be > distinguished by different SCU bits. But, WDIOF_EXTERN1 or > WDIOF_CARDRESET still cannot be identified due to > HW limitation. > - AST2600: Different from AST2500, additional HW bits are added for > distinguishing WDIOF_EXTERN1 and WDIOF_CARDRESET. > > Besides, since alternating boot event is triggered by WDT SoC reset, > it is classified as WDIOF_CARDRESET. > > Signed-off-by: Chin-Ting Kuo <chin-ting_kuo@aspeedtech.com> > --- > drivers/watchdog/aspeed_wdt.c | 83 ++++++++++++++++++++++++++++++++++- > 1 file changed, 81 insertions(+), 2 deletions(-) > > diff --git a/drivers/watchdog/aspeed_wdt.c b/drivers/watchdog/aspeed_wdt.c > index b4773a6aaf8c..4ad6335ff25b 100644 > --- a/drivers/watchdog/aspeed_wdt.c > +++ b/drivers/watchdog/aspeed_wdt.c > @@ -11,21 +11,31 @@ > #include <linux/io.h> > #include <linux/kernel.h> > #include <linux/kstrtox.h> > +#include <linux/mfd/syscon.h> > #include <linux/module.h> > #include <linux/of.h> > #include <linux/of_irq.h> > #include <linux/platform_device.h> > +#include <linux/regmap.h> > #include <linux/watchdog.h> > > static bool nowayout = WATCHDOG_NOWAYOUT; > module_param(nowayout, bool, 0); > MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default=" > __MODULE_STRING(WATCHDOG_NOWAYOUT) ")"); > +struct aspeed_wdt_scu { > + const char *compatible; > + u32 reset_status_reg; > + u32 wdt_reset_mask; > + u32 wdt_sw_reset_mask; > + u32 wdt_reset_mask_shift; > +}; > > struct aspeed_wdt_config { > u32 ext_pulse_width_mask; > u32 irq_shift; > u32 irq_mask; > + struct aspeed_wdt_scu scu; > }; > > struct aspeed_wdt { > @@ -39,18 +49,39 @@ static const struct aspeed_wdt_config ast2400_config = { > .ext_pulse_width_mask = 0xff, > .irq_shift = 0, > .irq_mask = 0, > + .scu = { > + .compatible = "aspeed,ast2400-scu", > + .reset_status_reg = 0x3c, > + .wdt_reset_mask = 0x1, > + .wdt_sw_reset_mask = 0, > + .wdt_reset_mask_shift = 1, > + }, > }; > > static const struct aspeed_wdt_config ast2500_config = { > .ext_pulse_width_mask = 0xfffff, > .irq_shift = 12, > .irq_mask = GENMASK(31, 12), > + .scu = { > + .compatible = "aspeed,ast2500-scu", > + .reset_status_reg = 0x3c, > + .wdt_reset_mask = 0x1, > + .wdt_sw_reset_mask = 0, > + .wdt_reset_mask_shift = 2, > + }, > }; > > static const struct aspeed_wdt_config ast2600_config = { > .ext_pulse_width_mask = 0xfffff, > .irq_shift = 0, > .irq_mask = GENMASK(31, 10), > + .scu = { > + .compatible = "aspeed,ast2600-scu", > + .reset_status_reg = 0x74, > + .wdt_reset_mask = 0xf, > + .wdt_sw_reset_mask = 0x8, > + .wdt_reset_mask_shift = 16, > + }, > }; > > static const struct of_device_id aspeed_wdt_of_table[] = { > @@ -213,6 +244,52 @@ static int aspeed_wdt_restart(struct watchdog_device *wdd, > return 0; > } > > +static int aspeed_wdt_update_bootstatus(struct platform_device *pdev, > + struct aspeed_wdt *wdt) > +{ > + struct resource *res; > + struct aspeed_wdt_scu scu = wdt->cfg->scu; > + struct regmap *scu_base; > + u32 reset_mask_width; > + u32 reset_mask_shift; > + u32 reg_size = 0; Please no unnecesary initializations. > + u32 idx = 0; > + u32 status; > + int ret; > + > + res = platform_get_resource(pdev, IORESOURCE_MEM, 0); > + reg_size = res->end - res->start; > + > + if (reg_size != 0) > + idx = ((intptr_t)wdt->base & 0x00000fff) / reg_size; > + > + /* On ast2400, only a bit is used to represent WDT reset */ > + if (of_device_is_compatible(pdev->dev.of_node, "aspeed,ast2400-wdt")) > + idx = 0; > + There is some redundancy in the above code, and platform_get_resource() can return NULL. If idx==0 for aspeed,ast2400-wdt anyway, the code can be rewritten as if (!of_device_is_compatible(pdev->dev.of_node, "aspeed,ast2400-wdt")) { res = platform_get_resource(pdev, IORESOURCE_MEM, 0); if (res) { reg_size = res->end - res->start; if (reg_size) idx = ((intptr_t)wdt->base & 0x00000fff) / reg_size; } } > + scu_base = syscon_regmap_lookup_by_compatible(scu.compatible); > + if (IS_ERR(scu_base)) > + return PTR_ERR(scu_base); > + > + ret = regmap_read(scu_base, scu.reset_status_reg, &status); > + if (ret) > + return ret; The above only affects bootstatus. Why fail to load the driver just because bootstatus can not be read ? > + > + reset_mask_width = hweight32(scu.wdt_reset_mask); > + reset_mask_shift = scu.wdt_reset_mask_shift + > + reset_mask_width * idx; > + > + if (status & (scu.wdt_sw_reset_mask << reset_mask_shift)) > + wdt->wdd.bootstatus = WDIOF_EXTERN1; > + else if (status & (scu.wdt_reset_mask << reset_mask_shift)) > + wdt->wdd.bootstatus = WDIOF_CARDRESET; > + else > + wdt->wdd.bootstatus = 0; That is already 0. > + > + return regmap_write(scu_base, scu.reset_status_reg, > + scu.wdt_reset_mask << reset_mask_shift); > +} > + > /* access_cs0 shows if cs0 is accessible, hence the reverted bit */ > static ssize_t access_cs0_show(struct device *dev, > struct device_attribute *attr, char *buf) > @@ -458,10 +535,12 @@ static int aspeed_wdt_probe(struct platform_device *pdev) > writel(duration - 1, wdt->base + WDT_RESET_WIDTH); > } > > + ret = aspeed_wdt_update_bootstatus(pdev, wdt); > + if (ret) > + return ret; > + > status = readl(wdt->base + WDT_TIMEOUT_STATUS); > if (status & WDT_TIMEOUT_STATUS_BOOT_SECONDARY) { > - wdt->wdd.bootstatus = WDIOF_CARDRESET; > - > if (of_device_is_compatible(np, "aspeed,ast2400-wdt") || > of_device_is_compatible(np, "aspeed,ast2500-wdt")) > wdt->wdd.groups = bswitch_groups;
On Fri, 2024-11-01 at 14:21 -0400, Patrick Williams wrote: > On Fri, Nov 01, 2024 at 08:11:59PM +0800, Chin-Ting Kuo wrote: > > The boot status mapping rule follows the latest design guide from > > the OpenBMC shown as below. > > https://github.com/openbmc/docs/blob/master/designs/bmc-reboot-cause-update.md#proposed-design > > - WDIOF_EXTERN1 => system is reset by Software > > - WDIOF_CARDRESET => system is reset by WDT SoC reset > > - Others => other reset events, e.g., power on reset. > > I'm quite surprised that the above is relevant for a kernel driver at > all. Isn't "EXTERN1" a name of a real watchdog signal from your > hardware (my recollection is that there are 2 external watchdogs). I think you may be referring to WDTRST1 (and WDTRST2) here. WDIOF_EXTERN1 and WDIOF_EXTERN2 have an unrelated history: https://github.com/torvalds/linux/blame/master/include/uapi/linux/watchdog.h > I > think the point of this referenced design document was that most > users > of BMCs have "EXTERN1" used a for software reset conditions. > `CARDRESET` should be representing resets by the watchdog itself. I think this is what Chin-Ting is proposing for the Aspeed driver? > > The purpose of this design proposal was not to require very specific > changes to individual watchdog drivers, but to align the userspace > use > with the best practices already from other watchdog drivers. I don't > think the kernel driver should be bending to match a particular > userspace implementation; you should be exposing the information > available to your hardware. I agree with this in principle, but I'm not sure what else is meant to be done here. > > Having said that, it was known that there would need to be changes to > the driver because some of these conditions were not adequately > exposed > at all. I'm just still surprised that we're needing to reference > that > document as part of these changes. I think the main question is whether an internal, graceful (userspace- requested) reset is a reasonable use of WDIOF_EXTERN[12]. My feeling no. I wonder whether defining a new flag (WDIOF_REBOOT? WDIOF_GRACEFUL?) in the UAPI would be acceptable? Andrew
diff --git a/drivers/watchdog/aspeed_wdt.c b/drivers/watchdog/aspeed_wdt.c index b4773a6aaf8c..4ad6335ff25b 100644 --- a/drivers/watchdog/aspeed_wdt.c +++ b/drivers/watchdog/aspeed_wdt.c @@ -11,21 +11,31 @@ #include <linux/io.h> #include <linux/kernel.h> #include <linux/kstrtox.h> +#include <linux/mfd/syscon.h> #include <linux/module.h> #include <linux/of.h> #include <linux/of_irq.h> #include <linux/platform_device.h> +#include <linux/regmap.h> #include <linux/watchdog.h> static bool nowayout = WATCHDOG_NOWAYOUT; module_param(nowayout, bool, 0); MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default=" __MODULE_STRING(WATCHDOG_NOWAYOUT) ")"); +struct aspeed_wdt_scu { + const char *compatible; + u32 reset_status_reg; + u32 wdt_reset_mask; + u32 wdt_sw_reset_mask; + u32 wdt_reset_mask_shift; +}; struct aspeed_wdt_config { u32 ext_pulse_width_mask; u32 irq_shift; u32 irq_mask; + struct aspeed_wdt_scu scu; }; struct aspeed_wdt { @@ -39,18 +49,39 @@ static const struct aspeed_wdt_config ast2400_config = { .ext_pulse_width_mask = 0xff, .irq_shift = 0, .irq_mask = 0, + .scu = { + .compatible = "aspeed,ast2400-scu", + .reset_status_reg = 0x3c, + .wdt_reset_mask = 0x1, + .wdt_sw_reset_mask = 0, + .wdt_reset_mask_shift = 1, + }, }; static const struct aspeed_wdt_config ast2500_config = { .ext_pulse_width_mask = 0xfffff, .irq_shift = 12, .irq_mask = GENMASK(31, 12), + .scu = { + .compatible = "aspeed,ast2500-scu", + .reset_status_reg = 0x3c, + .wdt_reset_mask = 0x1, + .wdt_sw_reset_mask = 0, + .wdt_reset_mask_shift = 2, + }, }; static const struct aspeed_wdt_config ast2600_config = { .ext_pulse_width_mask = 0xfffff, .irq_shift = 0, .irq_mask = GENMASK(31, 10), + .scu = { + .compatible = "aspeed,ast2600-scu", + .reset_status_reg = 0x74, + .wdt_reset_mask = 0xf, + .wdt_sw_reset_mask = 0x8, + .wdt_reset_mask_shift = 16, + }, }; static const struct of_device_id aspeed_wdt_of_table[] = { @@ -213,6 +244,52 @@ static int aspeed_wdt_restart(struct watchdog_device *wdd, return 0; } +static int aspeed_wdt_update_bootstatus(struct platform_device *pdev, + struct aspeed_wdt *wdt) +{ + struct resource *res; + struct aspeed_wdt_scu scu = wdt->cfg->scu; + struct regmap *scu_base; + u32 reset_mask_width; + u32 reset_mask_shift; + u32 reg_size = 0; + u32 idx = 0; + u32 status; + int ret; + + res = platform_get_resource(pdev, IORESOURCE_MEM, 0); + reg_size = res->end - res->start; + + if (reg_size != 0) + idx = ((intptr_t)wdt->base & 0x00000fff) / reg_size; + + /* On ast2400, only a bit is used to represent WDT reset */ + if (of_device_is_compatible(pdev->dev.of_node, "aspeed,ast2400-wdt")) + idx = 0; + + scu_base = syscon_regmap_lookup_by_compatible(scu.compatible); + if (IS_ERR(scu_base)) + return PTR_ERR(scu_base); + + ret = regmap_read(scu_base, scu.reset_status_reg, &status); + if (ret) + return ret; + + reset_mask_width = hweight32(scu.wdt_reset_mask); + reset_mask_shift = scu.wdt_reset_mask_shift + + reset_mask_width * idx; + + if (status & (scu.wdt_sw_reset_mask << reset_mask_shift)) + wdt->wdd.bootstatus = WDIOF_EXTERN1; + else if (status & (scu.wdt_reset_mask << reset_mask_shift)) + wdt->wdd.bootstatus = WDIOF_CARDRESET; + else + wdt->wdd.bootstatus = 0; + + return regmap_write(scu_base, scu.reset_status_reg, + scu.wdt_reset_mask << reset_mask_shift); +} + /* access_cs0 shows if cs0 is accessible, hence the reverted bit */ static ssize_t access_cs0_show(struct device *dev, struct device_attribute *attr, char *buf) @@ -458,10 +535,12 @@ static int aspeed_wdt_probe(struct platform_device *pdev) writel(duration - 1, wdt->base + WDT_RESET_WIDTH); } + ret = aspeed_wdt_update_bootstatus(pdev, wdt); + if (ret) + return ret; + status = readl(wdt->base + WDT_TIMEOUT_STATUS); if (status & WDT_TIMEOUT_STATUS_BOOT_SECONDARY) { - wdt->wdd.bootstatus = WDIOF_CARDRESET; - if (of_device_is_compatible(np, "aspeed,ast2400-wdt") || of_device_is_compatible(np, "aspeed,ast2500-wdt")) wdt->wdd.groups = bswitch_groups;
The boot status in the watchdog device struct is updated during controller probe stage. Application layer can get the boot status through the command, cat /sys/class/watchdog/watchdogX/bootstatus. The boot status mapping rule follows the latest design guide from the OpenBMC shown as below. https://github.com/openbmc/docs/blob/master/designs/bmc-reboot-cause-update.md#proposed-design - WDIOF_EXTERN1 => system is reset by Software - WDIOF_CARDRESET => system is reset by WDT SoC reset - Others => other reset events, e.g., power on reset. On ASPEED platform, the boot status is recorded in the SCU registers. - AST2400: Only a bit represents for any WDT reset. - AST2500: The reset triggered by different WDT controllers can be distinguished by different SCU bits. But, WDIOF_EXTERN1 or WDIOF_CARDRESET still cannot be identified due to HW limitation. - AST2600: Different from AST2500, additional HW bits are added for distinguishing WDIOF_EXTERN1 and WDIOF_CARDRESET. Besides, since alternating boot event is triggered by WDT SoC reset, it is classified as WDIOF_CARDRESET. Signed-off-by: Chin-Ting Kuo <chin-ting_kuo@aspeedtech.com> --- drivers/watchdog/aspeed_wdt.c | 83 ++++++++++++++++++++++++++++++++++- 1 file changed, 81 insertions(+), 2 deletions(-)