diff mbox series

[v4,1/3] watchdog: aspeed: Update bootstatus handling

Message ID 20241101121201.2464091-2-chin-ting_kuo@aspeedtech.com
State New
Headers show
Series Update ASPEED WDT bootstatus | expand

Commit Message

Chin-Ting Kuo Nov. 1, 2024, 12:11 p.m. UTC
The boot status in the watchdog device struct is updated during
controller probe stage. Application layer can get the boot status
through the command, cat /sys/class/watchdog/watchdogX/bootstatus.

The boot status mapping rule follows the latest design guide from
the OpenBMC shown as below.
https://github.com/openbmc/docs/blob/master/designs/bmc-reboot-cause-update.md#proposed-design
- WDIOF_EXTERN1   => system is reset by Software
- WDIOF_CARDRESET => system is reset by WDT SoC reset
- Others          => other reset events, e.g., power on reset.

On ASPEED platform, the boot status is recorded in the SCU registers.
- AST2400: Only a bit represents for any WDT reset.
- AST2500: The reset triggered by different WDT controllers can be
           distinguished by different SCU bits. But, WDIOF_EXTERN1 or
           WDIOF_CARDRESET still cannot be identified due to
           HW limitation.
- AST2600: Different from AST2500, additional HW bits are added for
           distinguishing WDIOF_EXTERN1 and WDIOF_CARDRESET.

Besides, since alternating boot event is triggered by WDT SoC reset,
it is classified as WDIOF_CARDRESET.

Signed-off-by: Chin-Ting Kuo <chin-ting_kuo@aspeedtech.com>
---
 drivers/watchdog/aspeed_wdt.c | 83 ++++++++++++++++++++++++++++++++++-
 1 file changed, 81 insertions(+), 2 deletions(-)

Comments

Patrick Williams Nov. 1, 2024, 6:21 p.m. UTC | #1
On Fri, Nov 01, 2024 at 08:11:59PM +0800, Chin-Ting Kuo wrote:
> The boot status mapping rule follows the latest design guide from
> the OpenBMC shown as below.
> https://github.com/openbmc/docs/blob/master/designs/bmc-reboot-cause-update.md#proposed-design
> - WDIOF_EXTERN1   => system is reset by Software
> - WDIOF_CARDRESET => system is reset by WDT SoC reset
> - Others          => other reset events, e.g., power on reset.

I'm quite surprised that the above is relevant for a kernel driver at
all.  Isn't "EXTERN1" a name of a real watchdog signal from your
hardware (my recollection is that there are 2 external watchdogs).  I
think the point of this referenced design document was that most users
of BMCs have "EXTERN1" used a for software reset conditions.
`CARDRESET` should be representing resets by the watchdog itself.

The purpose of this design proposal was not to require very specific
changes to individual watchdog drivers, but to align the userspace use
with the best practices already from other watchdog drivers.  I don't
think the kernel driver should be bending to match a particular
userspace implementation; you should be exposing the information
available to your hardware.

Having said that, it was known that there would need to be changes to
the driver because some of these conditions were not adequately exposed
at all.  I'm just still surprised that we're needing to reference that
document as part of these changes.

> 
> On ASPEED platform, the boot status is recorded in the SCU registers.
> - AST2400: Only a bit represents for any WDT reset.
> - AST2500: The reset triggered by different WDT controllers can be
>            distinguished by different SCU bits. But, WDIOF_EXTERN1 or
>            WDIOF_CARDRESET still cannot be identified due to
>            HW limitation.
> - AST2600: Different from AST2500, additional HW bits are added for
>            distinguishing WDIOF_EXTERN1 and WDIOF_CARDRESET.
Guenter Roeck Nov. 1, 2024, 10:16 p.m. UTC | #2
On 11/1/24 05:11, Chin-Ting Kuo wrote:
> The boot status in the watchdog device struct is updated during
> controller probe stage. Application layer can get the boot status
> through the command, cat /sys/class/watchdog/watchdogX/bootstatus.
> 
> The boot status mapping rule follows the latest design guide from
> the OpenBMC shown as below.
> https://github.com/openbmc/docs/blob/master/designs/bmc-reboot-cause-update.md#proposed-design
> - WDIOF_EXTERN1   => system is reset by Software
> - WDIOF_CARDRESET => system is reset by WDT SoC reset
> - Others          => other reset events, e.g., power on reset.
> 
> On ASPEED platform, the boot status is recorded in the SCU registers.
> - AST2400: Only a bit represents for any WDT reset.
> - AST2500: The reset triggered by different WDT controllers can be
>             distinguished by different SCU bits. But, WDIOF_EXTERN1 or
>             WDIOF_CARDRESET still cannot be identified due to
>             HW limitation.
> - AST2600: Different from AST2500, additional HW bits are added for
>             distinguishing WDIOF_EXTERN1 and WDIOF_CARDRESET.
> 
> Besides, since alternating boot event is triggered by WDT SoC reset,
> it is classified as WDIOF_CARDRESET.
> 
> Signed-off-by: Chin-Ting Kuo <chin-ting_kuo@aspeedtech.com>
> ---
>   drivers/watchdog/aspeed_wdt.c | 83 ++++++++++++++++++++++++++++++++++-
>   1 file changed, 81 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/watchdog/aspeed_wdt.c b/drivers/watchdog/aspeed_wdt.c
> index b4773a6aaf8c..4ad6335ff25b 100644
> --- a/drivers/watchdog/aspeed_wdt.c
> +++ b/drivers/watchdog/aspeed_wdt.c
> @@ -11,21 +11,31 @@
>   #include <linux/io.h>
>   #include <linux/kernel.h>
>   #include <linux/kstrtox.h>
> +#include <linux/mfd/syscon.h>
>   #include <linux/module.h>
>   #include <linux/of.h>
>   #include <linux/of_irq.h>
>   #include <linux/platform_device.h>
> +#include <linux/regmap.h>
>   #include <linux/watchdog.h>
>   
>   static bool nowayout = WATCHDOG_NOWAYOUT;
>   module_param(nowayout, bool, 0);
>   MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default="
>   				__MODULE_STRING(WATCHDOG_NOWAYOUT) ")");
> +struct aspeed_wdt_scu {
> +	const char *compatible;
> +	u32 reset_status_reg;
> +	u32 wdt_reset_mask;
> +	u32 wdt_sw_reset_mask;
> +	u32 wdt_reset_mask_shift;
> +};
>   
>   struct aspeed_wdt_config {
>   	u32 ext_pulse_width_mask;
>   	u32 irq_shift;
>   	u32 irq_mask;
> +	struct aspeed_wdt_scu scu;
>   };
>   
>   struct aspeed_wdt {
> @@ -39,18 +49,39 @@ static const struct aspeed_wdt_config ast2400_config = {
>   	.ext_pulse_width_mask = 0xff,
>   	.irq_shift = 0,
>   	.irq_mask = 0,
> +	.scu = {
> +		.compatible = "aspeed,ast2400-scu",
> +		.reset_status_reg = 0x3c,
> +		.wdt_reset_mask = 0x1,
> +		.wdt_sw_reset_mask = 0,
> +		.wdt_reset_mask_shift = 1,
> +	},
>   };
>   
>   static const struct aspeed_wdt_config ast2500_config = {
>   	.ext_pulse_width_mask = 0xfffff,
>   	.irq_shift = 12,
>   	.irq_mask = GENMASK(31, 12),
> +	.scu = {
> +		.compatible = "aspeed,ast2500-scu",
> +		.reset_status_reg = 0x3c,
> +		.wdt_reset_mask = 0x1,
> +		.wdt_sw_reset_mask = 0,
> +		.wdt_reset_mask_shift = 2,
> +	},
>   };
>   
>   static const struct aspeed_wdt_config ast2600_config = {
>   	.ext_pulse_width_mask = 0xfffff,
>   	.irq_shift = 0,
>   	.irq_mask = GENMASK(31, 10),
> +	.scu = {
> +		.compatible = "aspeed,ast2600-scu",
> +		.reset_status_reg = 0x74,
> +		.wdt_reset_mask = 0xf,
> +		.wdt_sw_reset_mask = 0x8,
> +		.wdt_reset_mask_shift = 16,
> +	},
>   };
>   
>   static const struct of_device_id aspeed_wdt_of_table[] = {
> @@ -213,6 +244,52 @@ static int aspeed_wdt_restart(struct watchdog_device *wdd,
>   	return 0;
>   }
>   
> +static int aspeed_wdt_update_bootstatus(struct platform_device *pdev,
> +					struct aspeed_wdt *wdt)
> +{
> +	struct resource *res;
> +	struct aspeed_wdt_scu scu = wdt->cfg->scu;
> +	struct regmap *scu_base;
> +	u32 reset_mask_width;
> +	u32 reset_mask_shift;
> +	u32 reg_size = 0;

Please no unnecesary initializations.

> +	u32 idx = 0;
> +	u32 status;
> +	int ret;
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	reg_size = res->end - res->start;
> +
> +	if (reg_size != 0)
> +		idx = ((intptr_t)wdt->base & 0x00000fff) / reg_size;
> +
> +	/* On ast2400, only a bit is used to represent WDT reset */
> +	if (of_device_is_compatible(pdev->dev.of_node, "aspeed,ast2400-wdt"))
> +		idx = 0;
> +

There is some redundancy in the above code, and platform_get_resource()
can return NULL. If idx==0 for aspeed,ast2400-wdt anyway, the code can be
rewritten as

	if (!of_device_is_compatible(pdev->dev.of_node, "aspeed,ast2400-wdt")) {
		res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
		if (res) {
			reg_size = res->end - res->start;
			if (reg_size)
				idx = ((intptr_t)wdt->base & 0x00000fff) / reg_size;
		}
	}

> +	scu_base = syscon_regmap_lookup_by_compatible(scu.compatible);
> +	if (IS_ERR(scu_base))
> +		return PTR_ERR(scu_base);
> +
> +	ret = regmap_read(scu_base, scu.reset_status_reg, &status);
> +	if (ret)
> +		return ret;

The above only affects bootstatus. Why fail to load the driver just because
bootstatus can not be read ?

> +
> +	reset_mask_width = hweight32(scu.wdt_reset_mask);
> +	reset_mask_shift = scu.wdt_reset_mask_shift +
> +			   reset_mask_width * idx;
> +
> +	if (status & (scu.wdt_sw_reset_mask << reset_mask_shift))
> +		wdt->wdd.bootstatus = WDIOF_EXTERN1;
> +	else if (status & (scu.wdt_reset_mask << reset_mask_shift))
> +		wdt->wdd.bootstatus = WDIOF_CARDRESET;
> +	else
> +		wdt->wdd.bootstatus = 0;

That is already 0.

> +
> +	return regmap_write(scu_base, scu.reset_status_reg,
> +			    scu.wdt_reset_mask << reset_mask_shift);
> +}
> +
>   /* access_cs0 shows if cs0 is accessible, hence the reverted bit */
>   static ssize_t access_cs0_show(struct device *dev,
>   			       struct device_attribute *attr, char *buf)
> @@ -458,10 +535,12 @@ static int aspeed_wdt_probe(struct platform_device *pdev)
>   		writel(duration - 1, wdt->base + WDT_RESET_WIDTH);
>   	}
>   
> +	ret = aspeed_wdt_update_bootstatus(pdev, wdt);
> +	if (ret)
> +		return ret;
> +
>   	status = readl(wdt->base + WDT_TIMEOUT_STATUS);
>   	if (status & WDT_TIMEOUT_STATUS_BOOT_SECONDARY) {
> -		wdt->wdd.bootstatus = WDIOF_CARDRESET;
> -
>   		if (of_device_is_compatible(np, "aspeed,ast2400-wdt") ||
>   		    of_device_is_compatible(np, "aspeed,ast2500-wdt"))
>   			wdt->wdd.groups = bswitch_groups;
Andrew Jeffery Nov. 4, 2024, 12:01 a.m. UTC | #3
On Fri, 2024-11-01 at 14:21 -0400, Patrick Williams wrote:
> On Fri, Nov 01, 2024 at 08:11:59PM +0800, Chin-Ting Kuo wrote:
> > The boot status mapping rule follows the latest design guide from
> > the OpenBMC shown as below.
> > https://github.com/openbmc/docs/blob/master/designs/bmc-reboot-cause-update.md#proposed-design
> > - WDIOF_EXTERN1   => system is reset by Software
> > - WDIOF_CARDRESET => system is reset by WDT SoC reset
> > - Others          => other reset events, e.g., power on reset.
> 
> I'm quite surprised that the above is relevant for a kernel driver at
> all.  Isn't "EXTERN1" a name of a real watchdog signal from your
> hardware (my recollection is that there are 2 external watchdogs).

I think you may be referring to WDTRST1 (and WDTRST2) here.

WDIOF_EXTERN1 and WDIOF_EXTERN2 have an unrelated history:

https://github.com/torvalds/linux/blame/master/include/uapi/linux/watchdog.h

>   I
> think the point of this referenced design document was that most
> users
> of BMCs have "EXTERN1" used a for software reset conditions.
> `CARDRESET` should be representing resets by the watchdog itself.

I think this is what Chin-Ting is proposing for the Aspeed driver?

> 
> The purpose of this design proposal was not to require very specific
> changes to individual watchdog drivers, but to align the userspace
> use
> with the best practices already from other watchdog drivers.  I don't
> think the kernel driver should be bending to match a particular
> userspace implementation; you should be exposing the information
> available to your hardware.

I agree with this in principle, but I'm not sure what else is meant to
be done here.

> 
> Having said that, it was known that there would need to be changes to
> the driver because some of these conditions were not adequately
> exposed
> at all.  I'm just still surprised that we're needing to reference
> that
> document as part of these changes.

I think the main question is whether an internal, graceful (userspace-
requested) reset is a reasonable use of WDIOF_EXTERN[12]. My feeling
no. I wonder whether defining a new flag (WDIOF_REBOOT?
WDIOF_GRACEFUL?) in the UAPI would be acceptable?

Andrew
diff mbox series

Patch

diff --git a/drivers/watchdog/aspeed_wdt.c b/drivers/watchdog/aspeed_wdt.c
index b4773a6aaf8c..4ad6335ff25b 100644
--- a/drivers/watchdog/aspeed_wdt.c
+++ b/drivers/watchdog/aspeed_wdt.c
@@ -11,21 +11,31 @@ 
 #include <linux/io.h>
 #include <linux/kernel.h>
 #include <linux/kstrtox.h>
+#include <linux/mfd/syscon.h>
 #include <linux/module.h>
 #include <linux/of.h>
 #include <linux/of_irq.h>
 #include <linux/platform_device.h>
+#include <linux/regmap.h>
 #include <linux/watchdog.h>
 
 static bool nowayout = WATCHDOG_NOWAYOUT;
 module_param(nowayout, bool, 0);
 MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default="
 				__MODULE_STRING(WATCHDOG_NOWAYOUT) ")");
+struct aspeed_wdt_scu {
+	const char *compatible;
+	u32 reset_status_reg;
+	u32 wdt_reset_mask;
+	u32 wdt_sw_reset_mask;
+	u32 wdt_reset_mask_shift;
+};
 
 struct aspeed_wdt_config {
 	u32 ext_pulse_width_mask;
 	u32 irq_shift;
 	u32 irq_mask;
+	struct aspeed_wdt_scu scu;
 };
 
 struct aspeed_wdt {
@@ -39,18 +49,39 @@  static const struct aspeed_wdt_config ast2400_config = {
 	.ext_pulse_width_mask = 0xff,
 	.irq_shift = 0,
 	.irq_mask = 0,
+	.scu = {
+		.compatible = "aspeed,ast2400-scu",
+		.reset_status_reg = 0x3c,
+		.wdt_reset_mask = 0x1,
+		.wdt_sw_reset_mask = 0,
+		.wdt_reset_mask_shift = 1,
+	},
 };
 
 static const struct aspeed_wdt_config ast2500_config = {
 	.ext_pulse_width_mask = 0xfffff,
 	.irq_shift = 12,
 	.irq_mask = GENMASK(31, 12),
+	.scu = {
+		.compatible = "aspeed,ast2500-scu",
+		.reset_status_reg = 0x3c,
+		.wdt_reset_mask = 0x1,
+		.wdt_sw_reset_mask = 0,
+		.wdt_reset_mask_shift = 2,
+	},
 };
 
 static const struct aspeed_wdt_config ast2600_config = {
 	.ext_pulse_width_mask = 0xfffff,
 	.irq_shift = 0,
 	.irq_mask = GENMASK(31, 10),
+	.scu = {
+		.compatible = "aspeed,ast2600-scu",
+		.reset_status_reg = 0x74,
+		.wdt_reset_mask = 0xf,
+		.wdt_sw_reset_mask = 0x8,
+		.wdt_reset_mask_shift = 16,
+	},
 };
 
 static const struct of_device_id aspeed_wdt_of_table[] = {
@@ -213,6 +244,52 @@  static int aspeed_wdt_restart(struct watchdog_device *wdd,
 	return 0;
 }
 
+static int aspeed_wdt_update_bootstatus(struct platform_device *pdev,
+					struct aspeed_wdt *wdt)
+{
+	struct resource *res;
+	struct aspeed_wdt_scu scu = wdt->cfg->scu;
+	struct regmap *scu_base;
+	u32 reset_mask_width;
+	u32 reset_mask_shift;
+	u32 reg_size = 0;
+	u32 idx = 0;
+	u32 status;
+	int ret;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	reg_size = res->end - res->start;
+
+	if (reg_size != 0)
+		idx = ((intptr_t)wdt->base & 0x00000fff) / reg_size;
+
+	/* On ast2400, only a bit is used to represent WDT reset */
+	if (of_device_is_compatible(pdev->dev.of_node, "aspeed,ast2400-wdt"))
+		idx = 0;
+
+	scu_base = syscon_regmap_lookup_by_compatible(scu.compatible);
+	if (IS_ERR(scu_base))
+		return PTR_ERR(scu_base);
+
+	ret = regmap_read(scu_base, scu.reset_status_reg, &status);
+	if (ret)
+		return ret;
+
+	reset_mask_width = hweight32(scu.wdt_reset_mask);
+	reset_mask_shift = scu.wdt_reset_mask_shift +
+			   reset_mask_width * idx;
+
+	if (status & (scu.wdt_sw_reset_mask << reset_mask_shift))
+		wdt->wdd.bootstatus = WDIOF_EXTERN1;
+	else if (status & (scu.wdt_reset_mask << reset_mask_shift))
+		wdt->wdd.bootstatus = WDIOF_CARDRESET;
+	else
+		wdt->wdd.bootstatus = 0;
+
+	return regmap_write(scu_base, scu.reset_status_reg,
+			    scu.wdt_reset_mask << reset_mask_shift);
+}
+
 /* access_cs0 shows if cs0 is accessible, hence the reverted bit */
 static ssize_t access_cs0_show(struct device *dev,
 			       struct device_attribute *attr, char *buf)
@@ -458,10 +535,12 @@  static int aspeed_wdt_probe(struct platform_device *pdev)
 		writel(duration - 1, wdt->base + WDT_RESET_WIDTH);
 	}
 
+	ret = aspeed_wdt_update_bootstatus(pdev, wdt);
+	if (ret)
+		return ret;
+
 	status = readl(wdt->base + WDT_TIMEOUT_STATUS);
 	if (status & WDT_TIMEOUT_STATUS_BOOT_SECONDARY) {
-		wdt->wdd.bootstatus = WDIOF_CARDRESET;
-
 		if (of_device_is_compatible(np, "aspeed,ast2400-wdt") ||
 		    of_device_is_compatible(np, "aspeed,ast2500-wdt"))
 			wdt->wdd.groups = bswitch_groups;