Message ID | 1467412897-15220-1-git-send-email-oss@buserror.net |
---|---|
State | Not Applicable, archived |
Headers | show |
On Fri, Jul 01, 2016 at 05:41:37PM -0500, Scott Wood wrote: > Erratum A-008585 says that the ARM generic timer counter "has the > potential to contain an erroneous value for a small number of core > clock cycles every time the timer value changes". Accesses to TVAL > (both read and write) are also affected due to the implicit counter > read. Accesses to CVAL are not affected. > > The workaround is to reread TVAL and count registers until successive reads > return the same value, and when writing TVAL to retry until counter > reads before and after the write return the same value. > > This erratum can be found on LS1043A and LS2080A. > > Signed-off-by: Scott Wood <oss@buserror.net> > --- > v3: > - Used cval rather than a loop for the write side of the erratum > - Added a Kconfig control > - Moved the device tree binding into its own patch > - Added erratum to silicon-errata.txt > - Changed function names to contain the erratum name > - Factored out the setting of erratum versions of set_next_event > to improve readability > - Added a comment clarifying that the timeout is arbitrary > > v2: > Significant rework based on feedback, including using static_key, > disabling VDSO counter access rather than adding the workaround to the > VDSO, and uninlining the loops. > > Dropped the separate property for indicating that writes to TVAL are > affected, as I believe that's just a side effect of the implicit > counter read being corrupted, and thus a chip that is affected by one > will always be affected by the other. > > Dropped the arm32 portion as it seems there was confusion about whether > LS1021A is affected. Currently I am being told that it is not > affected. > > I considered writing to CVAL rather than looping on TVAL writes, but > that would still have required separate set_next_event() code for the > erratum, and adding CVAL to the enum would have required a bunch of > extra handlers in switch statements (even where unused, due to compiler > warnings about unhandled enum values) including in an arm32 header. It > seemed better to avoid the arm32 interaction and new untested > accessors. > --- > Documentation/arm64/silicon-errata.txt | 2 + > arch/arm64/include/asm/arch_timer.h | 48 ++++++++++++--- > drivers/clocksource/Kconfig | 10 ++++ > drivers/clocksource/arm_arch_timer.c | 103 +++++++++++++++++++++++++++++++++ > 4 files changed, 154 insertions(+), 9 deletions(-) > > diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt > index ba4b6ac..5778f62 100644 > --- a/Documentation/arm64/silicon-errata.txt > +++ b/Documentation/arm64/silicon-errata.txt > @@ -57,3 +57,5 @@ stable kernels. > | Cavium | ThunderX ITS | #22375, #24313 | CAVIUM_ERRATUM_22375 | > | Cavium | ThunderX GICv3 | #23154 | CAVIUM_ERRATUM_23154 | > | Cavium | ThunderX Core | #27456 | CAVIUM_ERRATUM_27456 | > +| | | | | > +| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 | > diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h > index fbe0ca3..70fbad9 100644 > --- a/arch/arm64/include/asm/arch_timer.h > +++ b/arch/arm64/include/asm/arch_timer.h > @@ -23,10 +23,34 @@ > > #include <linux/bug.h> > #include <linux/init.h> > +#include <linux/jump_label.h> > #include <linux/types.h> > > #include <clocksource/arm_arch_timer.h> > > +extern struct static_key_false arch_timer_read_ool_enabled; > + > +#define ARCH_TIMER_REG_READ(reg, func) \ > +extern u64 func##_ool(void); \ > +static inline u64 __##func(void) \ > +{ \ > + u64 val; \ > + asm volatile("mrs %0, " reg : "=r" (val)); \ > + return val; \ > +} \ > +static inline u64 _##func(void) \ > +{ \ > + if (IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) && \ > + static_branch_unlikely(&arch_timer_read_ool_enabled)) \ > + return func##_ool(); \ > + else \ > + return __##func(); \ > +} > + > +ARCH_TIMER_REG_READ("cntp_tval_el0", arch_timer_get_ptval) > +ARCH_TIMER_REG_READ("cntv_tval_el0", arch_timer_get_vtval) > +ARCH_TIMER_REG_READ("cntvct_el0", arch_counter_get_cntvct) It's probably worth #undefing the macro here, since it really shouldn't grow any users outside of this header file. > + > /* > * These register accessors are marked inline so the compiler can > * nicely work out which register we want, and chuck away the rest of > @@ -58,6 +82,16 @@ void arch_timer_reg_write_cp15(int access, enum arch_timer_reg reg, u32 val) > isb(); > } > > +static __always_inline void arch_timer_cval_write_cp15(int access, u64 val) > +{ > + if (access == ARCH_TIMER_PHYS_ACCESS) > + asm volatile("msr cntp_cval_el0, %0" : : "r" (val)); > + else if (access == ARCH_TIMER_VIRT_ACCESS) > + asm volatile("msr cntv_cval_el0, %0" : : "r" (val)); > + > + isb(); > +} > + > static __always_inline > u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg) > { > @@ -66,19 +100,19 @@ u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg) > if (access == ARCH_TIMER_PHYS_ACCESS) { > switch (reg) { > case ARCH_TIMER_REG_CTRL: > - asm volatile("mrs %0, cntp_ctl_el0" : "=r" (val)); > + asm volatile("mrs %0, cntp_ctl_el0" : "=r" (val)); > break; > case ARCH_TIMER_REG_TVAL: > - asm volatile("mrs %0, cntp_tval_el0" : "=r" (val)); > + val = _arch_timer_get_ptval(); > break; > } > } else if (access == ARCH_TIMER_VIRT_ACCESS) { > switch (reg) { > case ARCH_TIMER_REG_CTRL: > - asm volatile("mrs %0, cntv_ctl_el0" : "=r" (val)); > + asm volatile("mrs %0, cntv_ctl_el0" : "=r" (val)); > break; > case ARCH_TIMER_REG_TVAL: > - asm volatile("mrs %0, cntv_tval_el0" : "=r" (val)); > + val = _arch_timer_get_vtval(); > break; > } > } > @@ -116,12 +150,8 @@ static inline u64 arch_counter_get_cntpct(void) > > static inline u64 arch_counter_get_cntvct(void) > { > - u64 cval; > - > isb(); > - asm volatile("mrs %0, cntvct_el0" : "=r" (cval)); > - > - return cval; > + return _arch_counter_get_cntvct(); > } > > static inline int arch_timer_arch_init(void) > diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig > index c346be6..672ddc3 100644 > --- a/drivers/clocksource/Kconfig > +++ b/drivers/clocksource/Kconfig > @@ -207,6 +207,16 @@ config ARM_ARCH_TIMER_EVTSTREAM > This must be disabled for hardware validation purposes to detect any > hardware anomalies of missing events. > > +config FSL_ERRATUM_A008585 > + bool "Workaround for Freescale/NXP Erratum A-008585" > + default y > + depends on ARM_ARCH_TIMER && ARM64 > + help > + This option enables a workaround for Freescale/NXP Erratum > + A-008585 ("ARM generic timer may contain an erroneous > + value"). The workaround will only be active if the > + fsl,erratum-a008585 property is found in the timer node. > + > config ARM_GLOBAL_TIMER > bool > select CLKSRC_OF if OF > diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c > index 5152b38..7ead4eb 100644 > --- a/drivers/clocksource/arm_arch_timer.c > +++ b/drivers/clocksource/arm_arch_timer.c > @@ -83,6 +83,51 @@ static bool arch_timer_mem_use_virtual; > * Architected system timer support. > */ > > +#ifdef CONFIG_FSL_ERRATUM_A008585 > +DEFINE_STATIC_KEY_FALSE(arch_timer_read_ool_enabled); > +EXPORT_SYMBOL_GPL(arch_timer_read_ool_enabled); > + > +/* > + * __always_inline is used to ensure that func() is not an actual function > + * pointer, which would result in the register accesses potentially being too > + * far apart for the loop to work. > + * > + * The timeout is an arbitrary value well beyond the highest number > + * of iterations the loop has been observed to take. > + */ > +static __always_inline u64 fsl_a008585_reread_counter(u64 (*func)(void)) > +{ > + u64 cval_old, cval_new; > + int timeout = 200; > + > + do { > + isb(); > + cval_old = func(); > + cval_new = func(); > + timeout--; > + } while (unlikely(cval_old != cval_new) && timeout); > + > + WARN_ON_ONCE(!timeout); > + return cval_new; > +} > + > +u64 arch_counter_get_cntvct_ool(void) > +{ > + return fsl_a008585_reread_counter(__arch_counter_get_cntvct); > +} > + > +u64 arch_timer_get_vtval_ool(void) > +{ > + return fsl_a008585_reread_counter(__arch_timer_get_vtval); > +} > + > +u64 arch_timer_get_ptval_ool(void) > +{ > + return fsl_a008585_reread_counter(__arch_timer_get_ptval); > +} > + > +#endif /* CONFIG_FSL_ERRATUM_A008585 */ > + > static __always_inline > void arch_timer_reg_write(int access, enum arch_timer_reg reg, u32 val, > struct clock_event_device *clk) > @@ -232,6 +277,35 @@ static __always_inline void set_next_event(const int access, unsigned long evt, > arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl, clk); > } > > +#ifdef CONFIG_FSL_ERRATUM_A008585 > +static __always_inline void fsl_a008585_set_next_event(const int access, > + unsigned long evt, struct clock_event_device *clk) > +{ > + unsigned long ctrl; > + u64 cval = evt + arch_counter_get_cntvct(); > + > + ctrl = arch_timer_reg_read(access, ARCH_TIMER_REG_CTRL, clk); > + ctrl |= ARCH_TIMER_CTRL_ENABLE; > + ctrl &= ~ARCH_TIMER_CTRL_IT_MASK; > + arch_timer_cval_write_cp15(access, cval); > + arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl, clk); > +} > + > +static int fsl_a008585_set_next_event_virt(unsigned long evt, > + struct clock_event_device *clk) > +{ > + fsl_a008585_set_next_event(ARCH_TIMER_VIRT_ACCESS, evt, clk); > + return 0; > +} > + > +static int fsl_a008585_set_next_event_phys(unsigned long evt, > + struct clock_event_device *clk) > +{ > + fsl_a008585_set_next_event(ARCH_TIMER_PHYS_ACCESS, evt, clk); > + return 0; > +} > +#endif /* CONFIG_FSL_ERRATUM_A008585 */ > + > static int arch_timer_set_next_event_virt(unsigned long evt, > struct clock_event_device *clk) > { > @@ -260,6 +334,19 @@ static int arch_timer_set_next_event_phys_mem(unsigned long evt, > return 0; > } > > +static void fsl_a008585_set_sne(struct clock_event_device *clk) > +{ > +#ifdef CONFIG_FSL_ERRATUM_A008585 > + if (!static_branch_unlikely(&arch_timer_read_ool_enabled)) > + return; > + > + if (arch_timer_uses_ppi == VIRT_PPI) > + clk->set_next_event = fsl_a008585_set_next_event_virt; > + else > + clk->set_next_event = fsl_a008585_set_next_event_phys; > +#endif > +} > + > static void __arch_timer_setup(unsigned type, > struct clock_event_device *clk) > { > @@ -288,6 +375,8 @@ static void __arch_timer_setup(unsigned type, > default: > BUG(); > } > + > + fsl_a008585_set_sne(clk); > } else { > clk->features |= CLOCK_EVT_FEAT_DYNIRQ; > clk->name = "arch_mem_timer"; > @@ -485,6 +574,15 @@ static void __init arch_counter_register(unsigned type) > arch_timer_read_counter = arch_counter_get_cntvct; > else > arch_timer_read_counter = arch_counter_get_cntpct; > + > +#ifdef CONFIG_FSL_ERRATUM_A008585 > + /* > + * Don't use the vdso fastpath if errata require using > + * the out-of-line counter accessor. > + */ > + if (static_branch_unlikely(&arch_timer_read_ool_enabled)) > + clocksource_counter.name = "arch_sys_counter_ool"; We could avoid this entirely by providing an arch_clocksource_data structure like x86 does, and get rid of the ugly strcmp magic from update_vsyscall. Will -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jul 01, 2016 at 05:41:35PM -0500, Scott Wood wrote: > This erratum describes a bug in logic outside the core, so MIDR can't be > used to identify its presence, and reading an SoC-specific revision > register from common arch timer code would be awkward. So, describe it > in the device tree. > > Signed-off-by: Scott Wood <oss@buserror.net> > --- > Documentation/devicetree/bindings/arm/arch_timer.txt | 6 ++++++ > 1 file changed, 6 insertions(+) Acked-by: Rob Herring <robh@kernel.org> -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2016/7/2 6:41, Scott Wood wrote: > Erratum A-008585 says that the ARM generic timer counter "has the > potential to contain an erroneous value for a small number of core > clock cycles every time the timer value changes". Accesses to TVAL > (both read and write) are also affected due to the implicit counter > read. Accesses to CVAL are not affected. > > The workaround is to reread TVAL and count registers until successive reads > return the same value, and when writing TVAL to retry until counter > reads before and after the write return the same value. > > This erratum can be found on LS1043A and LS2080A. > > Signed-off-by: Scott Wood <oss@buserror.net> > --- > v3: > - Used cval rather than a loop for the write side of the erratum > - Added a Kconfig control > - Moved the device tree binding into its own patch > - Added erratum to silicon-errata.txt > - Changed function names to contain the erratum name > - Factored out the setting of erratum versions of set_next_event > to improve readability > - Added a comment clarifying that the timeout is arbitrary > > v2: > Significant rework based on feedback, including using static_key, > disabling VDSO counter access rather than adding the workaround to the > VDSO, and uninlining the loops. > > Dropped the separate property for indicating that writes to TVAL are > affected, as I believe that's just a side effect of the implicit > counter read being corrupted, and thus a chip that is affected by one > will always be affected by the other. > > Dropped the arm32 portion as it seems there was confusion about whether > LS1021A is affected. Currently I am being told that it is not > affected. > > I considered writing to CVAL rather than looping on TVAL writes, but > that would still have required separate set_next_event() code for the > erratum, and adding CVAL to the enum would have required a bunch of > extra handlers in switch statements (even where unused, due to compiler > warnings about unhandled enum values) including in an arm32 header. It > seemed better to avoid the arm32 interaction and new untested > accessors. > --- > Documentation/arm64/silicon-errata.txt | 2 + > arch/arm64/include/asm/arch_timer.h | 48 ++++++++++++--- > drivers/clocksource/Kconfig | 10 ++++ > drivers/clocksource/arm_arch_timer.c | 103 +++++++++++++++++++++++++++++++++ > 4 files changed, 154 insertions(+), 9 deletions(-) > > diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt > index ba4b6ac..5778f62 100644 > --- a/Documentation/arm64/silicon-errata.txt > +++ b/Documentation/arm64/silicon-errata.txt > @@ -57,3 +57,5 @@ stable kernels. > | Cavium | ThunderX ITS | #22375, #24313 | CAVIUM_ERRATUM_22375 | > | Cavium | ThunderX GICv3 | #23154 | CAVIUM_ERRATUM_23154 | > | Cavium | ThunderX Core | #27456 | CAVIUM_ERRATUM_27456 | > +| | | | | > +| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 | > diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h > index fbe0ca3..70fbad9 100644 > --- a/arch/arm64/include/asm/arch_timer.h > +++ b/arch/arm64/include/asm/arch_timer.h > @@ -23,10 +23,34 @@ > > #include <linux/bug.h> > #include <linux/init.h> > +#include <linux/jump_label.h> > #include <linux/types.h> > > #include <clocksource/arm_arch_timer.h> > > +extern struct static_key_false arch_timer_read_ool_enabled; > + > +#define ARCH_TIMER_REG_READ(reg, func) \ > +extern u64 func##_ool(void); \ > +static inline u64 __##func(void) \ > +{ \ > + u64 val; \ > + asm volatile("mrs %0, " reg : "=r" (val)); \ > + return val; \ > +} \ > +static inline u64 _##func(void) \ > +{ \ > + if (IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) && \ > + static_branch_unlikely(&arch_timer_read_ool_enabled)) \ > + return func##_ool(); \ > + else \ > + return __##func(); \ > +} > + > +ARCH_TIMER_REG_READ("cntp_tval_el0", arch_timer_get_ptval) > +ARCH_TIMER_REG_READ("cntv_tval_el0", arch_timer_get_vtval) > +ARCH_TIMER_REG_READ("cntvct_el0", arch_counter_get_cntvct) > + > /* > * These register accessors are marked inline so the compiler can > * nicely work out which register we want, and chuck away the rest of > @@ -58,6 +82,16 @@ void arch_timer_reg_write_cp15(int access, enum arch_timer_reg reg, u32 val) > isb(); > } > > +static __always_inline void arch_timer_cval_write_cp15(int access, u64 val) > +{ > + if (access == ARCH_TIMER_PHYS_ACCESS) > + asm volatile("msr cntp_cval_el0, %0" : : "r" (val)); > + else if (access == ARCH_TIMER_VIRT_ACCESS) > + asm volatile("msr cntv_cval_el0, %0" : : "r" (val)); > + > + isb(); > +} > + > static __always_inline > u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg) > { > @@ -66,19 +100,19 @@ u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg) > if (access == ARCH_TIMER_PHYS_ACCESS) { > switch (reg) { > case ARCH_TIMER_REG_CTRL: > - asm volatile("mrs %0, cntp_ctl_el0" : "=r" (val)); > + asm volatile("mrs %0, cntp_ctl_el0" : "=r" (val)); > break; > case ARCH_TIMER_REG_TVAL: > - asm volatile("mrs %0, cntp_tval_el0" : "=r" (val)); > + val = _arch_timer_get_ptval(); > break; > } > } else if (access == ARCH_TIMER_VIRT_ACCESS) { > switch (reg) { > case ARCH_TIMER_REG_CTRL: > - asm volatile("mrs %0, cntv_ctl_el0" : "=r" (val)); > + asm volatile("mrs %0, cntv_ctl_el0" : "=r" (val)); > break; > case ARCH_TIMER_REG_TVAL: > - asm volatile("mrs %0, cntv_tval_el0" : "=r" (val)); > + val = _arch_timer_get_vtval(); > break; > } > } > @@ -116,12 +150,8 @@ static inline u64 arch_counter_get_cntpct(void) > > static inline u64 arch_counter_get_cntvct(void) > { > - u64 cval; > - > isb(); > - asm volatile("mrs %0, cntvct_el0" : "=r" (cval)); > - > - return cval; > + return _arch_counter_get_cntvct(); > } > > static inline int arch_timer_arch_init(void) > diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig > index c346be6..672ddc3 100644 > --- a/drivers/clocksource/Kconfig > +++ b/drivers/clocksource/Kconfig > @@ -207,6 +207,16 @@ config ARM_ARCH_TIMER_EVTSTREAM > This must be disabled for hardware validation purposes to detect any > hardware anomalies of missing events. > > +config FSL_ERRATUM_A008585 > + bool "Workaround for Freescale/NXP Erratum A-008585" > + default y > + depends on ARM_ARCH_TIMER && ARM64 > + help > + This option enables a workaround for Freescale/NXP Erratum > + A-008585 ("ARM generic timer may contain an erroneous > + value"). The workaround will only be active if the > + fsl,erratum-a008585 property is found in the timer node. > + > config ARM_GLOBAL_TIMER > bool > select CLKSRC_OF if OF > diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c > index 5152b38..7ead4eb 100644 > --- a/drivers/clocksource/arm_arch_timer.c > +++ b/drivers/clocksource/arm_arch_timer.c > @@ -83,6 +83,51 @@ static bool arch_timer_mem_use_virtual; > * Architected system timer support. > */ > > +#ifdef CONFIG_FSL_ERRATUM_A008585 > +DEFINE_STATIC_KEY_FALSE(arch_timer_read_ool_enabled); > +EXPORT_SYMBOL_GPL(arch_timer_read_ool_enabled); > + > +/* > + * __always_inline is used to ensure that func() is not an actual function > + * pointer, which would result in the register accesses potentially being too > + * far apart for the loop to work. > + * > + * The timeout is an arbitrary value well beyond the highest number > + * of iterations the loop has been observed to take. > + */ > +static __always_inline u64 fsl_a008585_reread_counter(u64 (*func)(void)) > +{ > + u64 cval_old, cval_new; > + int timeout = 200; > + > + do { > + isb(); > + cval_old = func(); > + cval_new = func(); > + timeout--; > + } while (unlikely(cval_old != cval_new) && timeout); > + > + WARN_ON_ONCE(!timeout); > + return cval_new; > +} Hi Scott: I have test this patch, this solution looks will break the performance a little more than I expected. it will have more than 10% that the cval will read again, we could sure that the cval_old always equal to the cval_new in the normal circumstances, so I prefer this way: do { isb(); cval_old = func(); cval_new = func(); timeout--; } while (unlikely((cval_new - cval_old) >> 2) && timeout); Thanks. Ding > + > +u64 arch_counter_get_cntvct_ool(void) > +{ > + return fsl_a008585_reread_counter(__arch_counter_get_cntvct); > +} > + > +u64 arch_timer_get_vtval_ool(void) > +{ > + return fsl_a008585_reread_counter(__arch_timer_get_vtval); > +} > + > +u64 arch_timer_get_ptval_ool(void) > +{ > + return fsl_a008585_reread_counter(__arch_timer_get_ptval); > +} > + > +#endif /* CONFIG_FSL_ERRATUM_A008585 */ > + > static __always_inline > void arch_timer_reg_write(int access, enum arch_timer_reg reg, u32 val, > struct clock_event_device *clk) > @@ -232,6 +277,35 @@ static __always_inline void set_next_event(const int access, unsigned long evt, > arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl, clk); > } > > +#ifdef CONFIG_FSL_ERRATUM_A008585 > +static __always_inline void fsl_a008585_set_next_event(const int access, > + unsigned long evt, struct clock_event_device *clk) > +{ > + unsigned long ctrl; > + u64 cval = evt + arch_counter_get_cntvct(); > + > + ctrl = arch_timer_reg_read(access, ARCH_TIMER_REG_CTRL, clk); > + ctrl |= ARCH_TIMER_CTRL_ENABLE; > + ctrl &= ~ARCH_TIMER_CTRL_IT_MASK; > + arch_timer_cval_write_cp15(access, cval); > + arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl, clk); > +} > + > +static int fsl_a008585_set_next_event_virt(unsigned long evt, > + struct clock_event_device *clk) > +{ > + fsl_a008585_set_next_event(ARCH_TIMER_VIRT_ACCESS, evt, clk); > + return 0; > +} > + > +static int fsl_a008585_set_next_event_phys(unsigned long evt, > + struct clock_event_device *clk) > +{ > + fsl_a008585_set_next_event(ARCH_TIMER_PHYS_ACCESS, evt, clk); > + return 0; > +} > +#endif /* CONFIG_FSL_ERRATUM_A008585 */ > + > static int arch_timer_set_next_event_virt(unsigned long evt, > struct clock_event_device *clk) > { > @@ -260,6 +334,19 @@ static int arch_timer_set_next_event_phys_mem(unsigned long evt, > return 0; > } > > +static void fsl_a008585_set_sne(struct clock_event_device *clk) > +{ > +#ifdef CONFIG_FSL_ERRATUM_A008585 > + if (!static_branch_unlikely(&arch_timer_read_ool_enabled)) > + return; > + > + if (arch_timer_uses_ppi == VIRT_PPI) > + clk->set_next_event = fsl_a008585_set_next_event_virt; > + else > + clk->set_next_event = fsl_a008585_set_next_event_phys; > +#endif > +} > + > static void __arch_timer_setup(unsigned type, > struct clock_event_device *clk) > { > @@ -288,6 +375,8 @@ static void __arch_timer_setup(unsigned type, > default: > BUG(); > } > + > + fsl_a008585_set_sne(clk); > } else { > clk->features |= CLOCK_EVT_FEAT_DYNIRQ; > clk->name = "arch_mem_timer"; > @@ -485,6 +574,15 @@ static void __init arch_counter_register(unsigned type) > arch_timer_read_counter = arch_counter_get_cntvct; > else > arch_timer_read_counter = arch_counter_get_cntpct; > + > +#ifdef CONFIG_FSL_ERRATUM_A008585 > + /* > + * Don't use the vdso fastpath if errata require using > + * the out-of-line counter accessor. > + */ > + if (static_branch_unlikely(&arch_timer_read_ool_enabled)) > + clocksource_counter.name = "arch_sys_counter_ool"; > +#endif > } else { > arch_timer_read_counter = arch_counter_get_cntvct_mem; > > @@ -763,6 +861,11 @@ static void __init arch_timer_of_init(struct device_node *np) > > arch_timer_c3stop = !of_property_read_bool(np, "always-on"); > > +#ifdef CONFIG_FSL_ERRATUM_A008585 > + if (of_property_read_bool(np, "fsl,erratum-a008585")) > + static_branch_enable(&arch_timer_read_ool_enabled); > +#endif > + > /* > * If we cannot rely on firmware initializing the timer registers then > * we should use the physical timers instead. > -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/07/16 10:34, Ding Tianhong wrote: > On 2016/7/2 6:41, Scott Wood wrote: >> Erratum A-008585 says that the ARM generic timer counter "has the >> potential to contain an erroneous value for a small number of core >> clock cycles every time the timer value changes". Accesses to TVAL >> (both read and write) are also affected due to the implicit counter >> read. Accesses to CVAL are not affected. >> >> The workaround is to reread TVAL and count registers until successive reads >> return the same value, and when writing TVAL to retry until counter >> reads before and after the write return the same value. >> >> This erratum can be found on LS1043A and LS2080A. >> >> Signed-off-by: Scott Wood <oss@buserror.net> >> --- >> v3: >> - Used cval rather than a loop for the write side of the erratum >> - Added a Kconfig control >> - Moved the device tree binding into its own patch >> - Added erratum to silicon-errata.txt >> - Changed function names to contain the erratum name >> - Factored out the setting of erratum versions of set_next_event >> to improve readability >> - Added a comment clarifying that the timeout is arbitrary >> >> v2: >> Significant rework based on feedback, including using static_key, >> disabling VDSO counter access rather than adding the workaround to the >> VDSO, and uninlining the loops. >> >> Dropped the separate property for indicating that writes to TVAL are >> affected, as I believe that's just a side effect of the implicit >> counter read being corrupted, and thus a chip that is affected by one >> will always be affected by the other. >> >> Dropped the arm32 portion as it seems there was confusion about whether >> LS1021A is affected. Currently I am being told that it is not >> affected. >> >> I considered writing to CVAL rather than looping on TVAL writes, but >> that would still have required separate set_next_event() code for the >> erratum, and adding CVAL to the enum would have required a bunch of >> extra handlers in switch statements (even where unused, due to compiler >> warnings about unhandled enum values) including in an arm32 header. It >> seemed better to avoid the arm32 interaction and new untested >> accessors. >> --- >> Documentation/arm64/silicon-errata.txt | 2 + >> arch/arm64/include/asm/arch_timer.h | 48 ++++++++++++--- >> drivers/clocksource/Kconfig | 10 ++++ >> drivers/clocksource/arm_arch_timer.c | 103 +++++++++++++++++++++++++++++++++ >> 4 files changed, 154 insertions(+), 9 deletions(-) >> >> diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt >> index ba4b6ac..5778f62 100644 >> --- a/Documentation/arm64/silicon-errata.txt >> +++ b/Documentation/arm64/silicon-errata.txt >> @@ -57,3 +57,5 @@ stable kernels. >> | Cavium | ThunderX ITS | #22375, #24313 | CAVIUM_ERRATUM_22375 | >> | Cavium | ThunderX GICv3 | #23154 | CAVIUM_ERRATUM_23154 | >> | Cavium | ThunderX Core | #27456 | CAVIUM_ERRATUM_27456 | >> +| | | | | >> +| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 | >> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h >> index fbe0ca3..70fbad9 100644 >> --- a/arch/arm64/include/asm/arch_timer.h >> +++ b/arch/arm64/include/asm/arch_timer.h >> @@ -23,10 +23,34 @@ >> >> #include <linux/bug.h> >> #include <linux/init.h> >> +#include <linux/jump_label.h> >> #include <linux/types.h> >> >> #include <clocksource/arm_arch_timer.h> >> >> +extern struct static_key_false arch_timer_read_ool_enabled; >> + >> +#define ARCH_TIMER_REG_READ(reg, func) \ >> +extern u64 func##_ool(void); \ >> +static inline u64 __##func(void) \ >> +{ \ >> + u64 val; \ >> + asm volatile("mrs %0, " reg : "=r" (val)); \ >> + return val; \ >> +} \ >> +static inline u64 _##func(void) \ >> +{ \ >> + if (IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) && \ >> + static_branch_unlikely(&arch_timer_read_ool_enabled)) \ >> + return func##_ool(); \ >> + else \ >> + return __##func(); \ >> +} >> + >> +ARCH_TIMER_REG_READ("cntp_tval_el0", arch_timer_get_ptval) >> +ARCH_TIMER_REG_READ("cntv_tval_el0", arch_timer_get_vtval) >> +ARCH_TIMER_REG_READ("cntvct_el0", arch_counter_get_cntvct) >> + >> /* >> * These register accessors are marked inline so the compiler can >> * nicely work out which register we want, and chuck away the rest of >> @@ -58,6 +82,16 @@ void arch_timer_reg_write_cp15(int access, enum arch_timer_reg reg, u32 val) >> isb(); >> } >> >> +static __always_inline void arch_timer_cval_write_cp15(int access, u64 val) >> +{ >> + if (access == ARCH_TIMER_PHYS_ACCESS) >> + asm volatile("msr cntp_cval_el0, %0" : : "r" (val)); >> + else if (access == ARCH_TIMER_VIRT_ACCESS) >> + asm volatile("msr cntv_cval_el0, %0" : : "r" (val)); >> + >> + isb(); >> +} >> + >> static __always_inline >> u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg) >> { >> @@ -66,19 +100,19 @@ u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg) >> if (access == ARCH_TIMER_PHYS_ACCESS) { >> switch (reg) { >> case ARCH_TIMER_REG_CTRL: >> - asm volatile("mrs %0, cntp_ctl_el0" : "=r" (val)); >> + asm volatile("mrs %0, cntp_ctl_el0" : "=r" (val)); >> break; >> case ARCH_TIMER_REG_TVAL: >> - asm volatile("mrs %0, cntp_tval_el0" : "=r" (val)); >> + val = _arch_timer_get_ptval(); >> break; >> } >> } else if (access == ARCH_TIMER_VIRT_ACCESS) { >> switch (reg) { >> case ARCH_TIMER_REG_CTRL: >> - asm volatile("mrs %0, cntv_ctl_el0" : "=r" (val)); >> + asm volatile("mrs %0, cntv_ctl_el0" : "=r" (val)); >> break; >> case ARCH_TIMER_REG_TVAL: >> - asm volatile("mrs %0, cntv_tval_el0" : "=r" (val)); >> + val = _arch_timer_get_vtval(); >> break; >> } >> } >> @@ -116,12 +150,8 @@ static inline u64 arch_counter_get_cntpct(void) >> >> static inline u64 arch_counter_get_cntvct(void) >> { >> - u64 cval; >> - >> isb(); >> - asm volatile("mrs %0, cntvct_el0" : "=r" (cval)); >> - >> - return cval; >> + return _arch_counter_get_cntvct(); >> } >> >> static inline int arch_timer_arch_init(void) >> diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig >> index c346be6..672ddc3 100644 >> --- a/drivers/clocksource/Kconfig >> +++ b/drivers/clocksource/Kconfig >> @@ -207,6 +207,16 @@ config ARM_ARCH_TIMER_EVTSTREAM >> This must be disabled for hardware validation purposes to detect any >> hardware anomalies of missing events. >> >> +config FSL_ERRATUM_A008585 >> + bool "Workaround for Freescale/NXP Erratum A-008585" >> + default y >> + depends on ARM_ARCH_TIMER && ARM64 >> + help >> + This option enables a workaround for Freescale/NXP Erratum >> + A-008585 ("ARM generic timer may contain an erroneous >> + value"). The workaround will only be active if the >> + fsl,erratum-a008585 property is found in the timer node. >> + >> config ARM_GLOBAL_TIMER >> bool >> select CLKSRC_OF if OF >> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c >> index 5152b38..7ead4eb 100644 >> --- a/drivers/clocksource/arm_arch_timer.c >> +++ b/drivers/clocksource/arm_arch_timer.c >> @@ -83,6 +83,51 @@ static bool arch_timer_mem_use_virtual; >> * Architected system timer support. >> */ >> >> +#ifdef CONFIG_FSL_ERRATUM_A008585 >> +DEFINE_STATIC_KEY_FALSE(arch_timer_read_ool_enabled); >> +EXPORT_SYMBOL_GPL(arch_timer_read_ool_enabled); >> + >> +/* >> + * __always_inline is used to ensure that func() is not an actual function >> + * pointer, which would result in the register accesses potentially being too >> + * far apart for the loop to work. >> + * >> + * The timeout is an arbitrary value well beyond the highest number >> + * of iterations the loop has been observed to take. >> + */ >> +static __always_inline u64 fsl_a008585_reread_counter(u64 (*func)(void)) >> +{ >> + u64 cval_old, cval_new; >> + int timeout = 200; >> + >> + do { >> + isb(); >> + cval_old = func(); >> + cval_new = func(); >> + timeout--; >> + } while (unlikely(cval_old != cval_new) && timeout); >> + >> + WARN_ON_ONCE(!timeout); >> + return cval_new; >> +} > Hi Scott: > > I have test this patch, this solution looks will break the performance a little more than I expected. > it will have more than 10% that the cval will read again, we could sure that the cval_old always equal to the > cval_new in the normal circumstances, so I prefer this way: > > do { > isb(); > cval_old = func(); > cval_new = func(); > timeout--; > } while (unlikely((cval_new - cval_old) >> 2) && timeout); What makes you think that ignoring the two bottom bits is a safe thing to do? Talking about performance when the HW has such a dramatic bug is like putting a bigger engine on a car that has no brakes: you just hit the wall quicker. Thanks, M.
On 2016/7/7 17:49, Marc Zyngier wrote: > On 07/07/16 10:34, Ding Tianhong wrote: >> On 2016/7/2 6:41, Scott Wood wrote: >>> Erratum A-008585 says that the ARM generic timer counter "has the >>> potential to contain an erroneous value for a small number of core >>> clock cycles every time the timer value changes". Accesses to TVAL >>> (both read and write) are also affected due to the implicit counter >>> read. Accesses to CVAL are not affected. >>> >>> The workaround is to reread TVAL and count registers until successive reads >>> return the same value, and when writing TVAL to retry until counter >>> reads before and after the write return the same value. >>> >>> This erratum can be found on LS1043A and LS2080A. >>> >>> Signed-off-by: Scott Wood <oss@buserror.net> >>> --- >>> v3: >>> - Used cval rather than a loop for the write side of the erratum >>> - Added a Kconfig control >>> - Moved the device tree binding into its own patch >>> - Added erratum to silicon-errata.txt >>> - Changed function names to contain the erratum name >>> - Factored out the setting of erratum versions of set_next_event >>> to improve readability >>> - Added a comment clarifying that the timeout is arbitrary >>> >>> v2: >>> Significant rework based on feedback, including using static_key, >>> disabling VDSO counter access rather than adding the workaround to the >>> VDSO, and uninlining the loops. >>> >>> Dropped the separate property for indicating that writes to TVAL are >>> affected, as I believe that's just a side effect of the implicit >>> counter read being corrupted, and thus a chip that is affected by one >>> will always be affected by the other. >>> >>> Dropped the arm32 portion as it seems there was confusion about whether >>> LS1021A is affected. Currently I am being told that it is not >>> affected. >>> >>> I considered writing to CVAL rather than looping on TVAL writes, but >>> that would still have required separate set_next_event() code for the >>> erratum, and adding CVAL to the enum would have required a bunch of >>> extra handlers in switch statements (even where unused, due to compiler >>> warnings about unhandled enum values) including in an arm32 header. It >>> seemed better to avoid the arm32 interaction and new untested >>> accessors. >>> --- >>> Documentation/arm64/silicon-errata.txt | 2 + >>> arch/arm64/include/asm/arch_timer.h | 48 ++++++++++++--- >>> drivers/clocksource/Kconfig | 10 ++++ >>> drivers/clocksource/arm_arch_timer.c | 103 +++++++++++++++++++++++++++++++++ >>> 4 files changed, 154 insertions(+), 9 deletions(-) >>> >>> diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt >>> index ba4b6ac..5778f62 100644 >>> --- a/Documentation/arm64/silicon-errata.txt >>> +++ b/Documentation/arm64/silicon-errata.txt >>> @@ -57,3 +57,5 @@ stable kernels. >>> | Cavium | ThunderX ITS | #22375, #24313 | CAVIUM_ERRATUM_22375 | >>> | Cavium | ThunderX GICv3 | #23154 | CAVIUM_ERRATUM_23154 | >>> | Cavium | ThunderX Core | #27456 | CAVIUM_ERRATUM_27456 | >>> +| | | | | >>> +| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 | >>> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h >>> index fbe0ca3..70fbad9 100644 >>> --- a/arch/arm64/include/asm/arch_timer.h >>> +++ b/arch/arm64/include/asm/arch_timer.h >>> @@ -23,10 +23,34 @@ >>> >>> #include <linux/bug.h> >>> #include <linux/init.h> >>> +#include <linux/jump_label.h> >>> #include <linux/types.h> >>> >>> #include <clocksource/arm_arch_timer.h> >>> >>> +extern struct static_key_false arch_timer_read_ool_enabled; >>> + >>> +#define ARCH_TIMER_REG_READ(reg, func) \ >>> +extern u64 func##_ool(void); \ >>> +static inline u64 __##func(void) \ >>> +{ \ >>> + u64 val; \ >>> + asm volatile("mrs %0, " reg : "=r" (val)); \ >>> + return val; \ >>> +} \ >>> +static inline u64 _##func(void) \ >>> +{ \ >>> + if (IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) && \ >>> + static_branch_unlikely(&arch_timer_read_ool_enabled)) \ >>> + return func##_ool(); \ >>> + else \ >>> + return __##func(); \ >>> +} >>> + >>> +ARCH_TIMER_REG_READ("cntp_tval_el0", arch_timer_get_ptval) >>> +ARCH_TIMER_REG_READ("cntv_tval_el0", arch_timer_get_vtval) >>> +ARCH_TIMER_REG_READ("cntvct_el0", arch_counter_get_cntvct) >>> + >>> /* >>> * These register accessors are marked inline so the compiler can >>> * nicely work out which register we want, and chuck away the rest of >>> @@ -58,6 +82,16 @@ void arch_timer_reg_write_cp15(int access, enum arch_timer_reg reg, u32 val) >>> isb(); >>> } >>> >>> +static __always_inline void arch_timer_cval_write_cp15(int access, u64 val) >>> +{ >>> + if (access == ARCH_TIMER_PHYS_ACCESS) >>> + asm volatile("msr cntp_cval_el0, %0" : : "r" (val)); >>> + else if (access == ARCH_TIMER_VIRT_ACCESS) >>> + asm volatile("msr cntv_cval_el0, %0" : : "r" (val)); >>> + >>> + isb(); >>> +} >>> + >>> static __always_inline >>> u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg) >>> { >>> @@ -66,19 +100,19 @@ u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg) >>> if (access == ARCH_TIMER_PHYS_ACCESS) { >>> switch (reg) { >>> case ARCH_TIMER_REG_CTRL: >>> - asm volatile("mrs %0, cntp_ctl_el0" : "=r" (val)); >>> + asm volatile("mrs %0, cntp_ctl_el0" : "=r" (val)); >>> break; >>> case ARCH_TIMER_REG_TVAL: >>> - asm volatile("mrs %0, cntp_tval_el0" : "=r" (val)); >>> + val = _arch_timer_get_ptval(); >>> break; >>> } >>> } else if (access == ARCH_TIMER_VIRT_ACCESS) { >>> switch (reg) { >>> case ARCH_TIMER_REG_CTRL: >>> - asm volatile("mrs %0, cntv_ctl_el0" : "=r" (val)); >>> + asm volatile("mrs %0, cntv_ctl_el0" : "=r" (val)); >>> break; >>> case ARCH_TIMER_REG_TVAL: >>> - asm volatile("mrs %0, cntv_tval_el0" : "=r" (val)); >>> + val = _arch_timer_get_vtval(); >>> break; >>> } >>> } >>> @@ -116,12 +150,8 @@ static inline u64 arch_counter_get_cntpct(void) >>> >>> static inline u64 arch_counter_get_cntvct(void) >>> { >>> - u64 cval; >>> - >>> isb(); >>> - asm volatile("mrs %0, cntvct_el0" : "=r" (cval)); >>> - >>> - return cval; >>> + return _arch_counter_get_cntvct(); >>> } >>> >>> static inline int arch_timer_arch_init(void) >>> diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig >>> index c346be6..672ddc3 100644 >>> --- a/drivers/clocksource/Kconfig >>> +++ b/drivers/clocksource/Kconfig >>> @@ -207,6 +207,16 @@ config ARM_ARCH_TIMER_EVTSTREAM >>> This must be disabled for hardware validation purposes to detect any >>> hardware anomalies of missing events. >>> >>> +config FSL_ERRATUM_A008585 >>> + bool "Workaround for Freescale/NXP Erratum A-008585" >>> + default y >>> + depends on ARM_ARCH_TIMER && ARM64 >>> + help >>> + This option enables a workaround for Freescale/NXP Erratum >>> + A-008585 ("ARM generic timer may contain an erroneous >>> + value"). The workaround will only be active if the >>> + fsl,erratum-a008585 property is found in the timer node. >>> + >>> config ARM_GLOBAL_TIMER >>> bool >>> select CLKSRC_OF if OF >>> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c >>> index 5152b38..7ead4eb 100644 >>> --- a/drivers/clocksource/arm_arch_timer.c >>> +++ b/drivers/clocksource/arm_arch_timer.c >>> @@ -83,6 +83,51 @@ static bool arch_timer_mem_use_virtual; >>> * Architected system timer support. >>> */ >>> >>> +#ifdef CONFIG_FSL_ERRATUM_A008585 >>> +DEFINE_STATIC_KEY_FALSE(arch_timer_read_ool_enabled); >>> +EXPORT_SYMBOL_GPL(arch_timer_read_ool_enabled); >>> + >>> +/* >>> + * __always_inline is used to ensure that func() is not an actual function >>> + * pointer, which would result in the register accesses potentially being too >>> + * far apart for the loop to work. >>> + * >>> + * The timeout is an arbitrary value well beyond the highest number >>> + * of iterations the loop has been observed to take. >>> + */ >>> +static __always_inline u64 fsl_a008585_reread_counter(u64 (*func)(void)) >>> +{ >>> + u64 cval_old, cval_new; >>> + int timeout = 200; >>> + >>> + do { >>> + isb(); >>> + cval_old = func(); >>> + cval_new = func(); >>> + timeout--; >>> + } while (unlikely(cval_old != cval_new) && timeout); >>> + >>> + WARN_ON_ONCE(!timeout); >>> + return cval_new; >>> +} >> Hi Scott: >> >> I have test this patch, this solution looks will break the performance a little more than I expected. >> it will have more than 10% that the cval will read again, we could sure that the cval_old always equal to the >> cval_new in the normal circumstances, so I prefer this way: >> >> do { >> isb(); >> cval_old = func(); >> cval_new = func(); >> timeout--; >> } while (unlikely((cval_new - cval_old) >> 2) && timeout); > > What makes you think that ignoring the two bottom bits is a safe thing > to do? Talking about performance when the HW has such a dramatic bug is > like putting a bigger engine on a car that has no brakes: you just hit > the wall quicker. > > Thanks, > I have a chip which has the same problem like Scott's chip, and I wish to solve this problem in the same way, our chip designer told me that if you got a wrong value from the cntvct_el0, you would not get a wrong value until 8 cycles later, so I could ignoring the lowest 3 bits if I reading twice together. The key problem is the probability of this bug, my chip has 1/100000 chance to met this bug, so use 10% performance to fix this bug looks more expensive. Thanks. Ding > M. > -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/07/16 12:37, Ding Tianhong wrote: > On 2016/7/7 17:49, Marc Zyngier wrote: >> On 07/07/16 10:34, Ding Tianhong wrote: >>> On 2016/7/2 6:41, Scott Wood wrote: >>>> Erratum A-008585 says that the ARM generic timer counter "has the >>>> potential to contain an erroneous value for a small number of core >>>> clock cycles every time the timer value changes". Accesses to TVAL >>>> (both read and write) are also affected due to the implicit counter >>>> read. Accesses to CVAL are not affected. >>>> >>>> The workaround is to reread TVAL and count registers until successive reads >>>> return the same value, and when writing TVAL to retry until counter >>>> reads before and after the write return the same value. >>>> >>>> This erratum can be found on LS1043A and LS2080A. >>>> >>>> Signed-off-by: Scott Wood <oss@buserror.net> >>>> --- >>>> v3: >>>> - Used cval rather than a loop for the write side of the erratum >>>> - Added a Kconfig control >>>> - Moved the device tree binding into its own patch >>>> - Added erratum to silicon-errata.txt >>>> - Changed function names to contain the erratum name >>>> - Factored out the setting of erratum versions of set_next_event >>>> to improve readability >>>> - Added a comment clarifying that the timeout is arbitrary >>>> >>>> v2: >>>> Significant rework based on feedback, including using static_key, >>>> disabling VDSO counter access rather than adding the workaround to the >>>> VDSO, and uninlining the loops. >>>> >>>> Dropped the separate property for indicating that writes to TVAL are >>>> affected, as I believe that's just a side effect of the implicit >>>> counter read being corrupted, and thus a chip that is affected by one >>>> will always be affected by the other. >>>> >>>> Dropped the arm32 portion as it seems there was confusion about whether >>>> LS1021A is affected. Currently I am being told that it is not >>>> affected. >>>> >>>> I considered writing to CVAL rather than looping on TVAL writes, but >>>> that would still have required separate set_next_event() code for the >>>> erratum, and adding CVAL to the enum would have required a bunch of >>>> extra handlers in switch statements (even where unused, due to compiler >>>> warnings about unhandled enum values) including in an arm32 header. It >>>> seemed better to avoid the arm32 interaction and new untested >>>> accessors. >>>> --- >>>> Documentation/arm64/silicon-errata.txt | 2 + >>>> arch/arm64/include/asm/arch_timer.h | 48 ++++++++++++--- >>>> drivers/clocksource/Kconfig | 10 ++++ >>>> drivers/clocksource/arm_arch_timer.c | 103 +++++++++++++++++++++++++++++++++ >>>> 4 files changed, 154 insertions(+), 9 deletions(-) >>>> >>>> diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt >>>> index ba4b6ac..5778f62 100644 >>>> --- a/Documentation/arm64/silicon-errata.txt >>>> +++ b/Documentation/arm64/silicon-errata.txt >>>> @@ -57,3 +57,5 @@ stable kernels. >>>> | Cavium | ThunderX ITS | #22375, #24313 | CAVIUM_ERRATUM_22375 | >>>> | Cavium | ThunderX GICv3 | #23154 | CAVIUM_ERRATUM_23154 | >>>> | Cavium | ThunderX Core | #27456 | CAVIUM_ERRATUM_27456 | >>>> +| | | | | >>>> +| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 | >>>> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h >>>> index fbe0ca3..70fbad9 100644 >>>> --- a/arch/arm64/include/asm/arch_timer.h >>>> +++ b/arch/arm64/include/asm/arch_timer.h >>>> @@ -23,10 +23,34 @@ >>>> >>>> #include <linux/bug.h> >>>> #include <linux/init.h> >>>> +#include <linux/jump_label.h> >>>> #include <linux/types.h> >>>> >>>> #include <clocksource/arm_arch_timer.h> >>>> >>>> +extern struct static_key_false arch_timer_read_ool_enabled; >>>> + >>>> +#define ARCH_TIMER_REG_READ(reg, func) \ >>>> +extern u64 func##_ool(void); \ >>>> +static inline u64 __##func(void) \ >>>> +{ \ >>>> + u64 val; \ >>>> + asm volatile("mrs %0, " reg : "=r" (val)); \ >>>> + return val; \ >>>> +} \ >>>> +static inline u64 _##func(void) \ >>>> +{ \ >>>> + if (IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) && \ >>>> + static_branch_unlikely(&arch_timer_read_ool_enabled)) \ >>>> + return func##_ool(); \ >>>> + else \ >>>> + return __##func(); \ >>>> +} >>>> + >>>> +ARCH_TIMER_REG_READ("cntp_tval_el0", arch_timer_get_ptval) >>>> +ARCH_TIMER_REG_READ("cntv_tval_el0", arch_timer_get_vtval) >>>> +ARCH_TIMER_REG_READ("cntvct_el0", arch_counter_get_cntvct) >>>> + >>>> /* >>>> * These register accessors are marked inline so the compiler can >>>> * nicely work out which register we want, and chuck away the rest of >>>> @@ -58,6 +82,16 @@ void arch_timer_reg_write_cp15(int access, enum arch_timer_reg reg, u32 val) >>>> isb(); >>>> } >>>> >>>> +static __always_inline void arch_timer_cval_write_cp15(int access, u64 val) >>>> +{ >>>> + if (access == ARCH_TIMER_PHYS_ACCESS) >>>> + asm volatile("msr cntp_cval_el0, %0" : : "r" (val)); >>>> + else if (access == ARCH_TIMER_VIRT_ACCESS) >>>> + asm volatile("msr cntv_cval_el0, %0" : : "r" (val)); >>>> + >>>> + isb(); >>>> +} >>>> + >>>> static __always_inline >>>> u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg) >>>> { >>>> @@ -66,19 +100,19 @@ u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg) >>>> if (access == ARCH_TIMER_PHYS_ACCESS) { >>>> switch (reg) { >>>> case ARCH_TIMER_REG_CTRL: >>>> - asm volatile("mrs %0, cntp_ctl_el0" : "=r" (val)); >>>> + asm volatile("mrs %0, cntp_ctl_el0" : "=r" (val)); >>>> break; >>>> case ARCH_TIMER_REG_TVAL: >>>> - asm volatile("mrs %0, cntp_tval_el0" : "=r" (val)); >>>> + val = _arch_timer_get_ptval(); >>>> break; >>>> } >>>> } else if (access == ARCH_TIMER_VIRT_ACCESS) { >>>> switch (reg) { >>>> case ARCH_TIMER_REG_CTRL: >>>> - asm volatile("mrs %0, cntv_ctl_el0" : "=r" (val)); >>>> + asm volatile("mrs %0, cntv_ctl_el0" : "=r" (val)); >>>> break; >>>> case ARCH_TIMER_REG_TVAL: >>>> - asm volatile("mrs %0, cntv_tval_el0" : "=r" (val)); >>>> + val = _arch_timer_get_vtval(); >>>> break; >>>> } >>>> } >>>> @@ -116,12 +150,8 @@ static inline u64 arch_counter_get_cntpct(void) >>>> >>>> static inline u64 arch_counter_get_cntvct(void) >>>> { >>>> - u64 cval; >>>> - >>>> isb(); >>>> - asm volatile("mrs %0, cntvct_el0" : "=r" (cval)); >>>> - >>>> - return cval; >>>> + return _arch_counter_get_cntvct(); >>>> } >>>> >>>> static inline int arch_timer_arch_init(void) >>>> diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig >>>> index c346be6..672ddc3 100644 >>>> --- a/drivers/clocksource/Kconfig >>>> +++ b/drivers/clocksource/Kconfig >>>> @@ -207,6 +207,16 @@ config ARM_ARCH_TIMER_EVTSTREAM >>>> This must be disabled for hardware validation purposes to detect any >>>> hardware anomalies of missing events. >>>> >>>> +config FSL_ERRATUM_A008585 >>>> + bool "Workaround for Freescale/NXP Erratum A-008585" >>>> + default y >>>> + depends on ARM_ARCH_TIMER && ARM64 >>>> + help >>>> + This option enables a workaround for Freescale/NXP Erratum >>>> + A-008585 ("ARM generic timer may contain an erroneous >>>> + value"). The workaround will only be active if the >>>> + fsl,erratum-a008585 property is found in the timer node. >>>> + >>>> config ARM_GLOBAL_TIMER >>>> bool >>>> select CLKSRC_OF if OF >>>> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c >>>> index 5152b38..7ead4eb 100644 >>>> --- a/drivers/clocksource/arm_arch_timer.c >>>> +++ b/drivers/clocksource/arm_arch_timer.c >>>> @@ -83,6 +83,51 @@ static bool arch_timer_mem_use_virtual; >>>> * Architected system timer support. >>>> */ >>>> >>>> +#ifdef CONFIG_FSL_ERRATUM_A008585 >>>> +DEFINE_STATIC_KEY_FALSE(arch_timer_read_ool_enabled); >>>> +EXPORT_SYMBOL_GPL(arch_timer_read_ool_enabled); >>>> + >>>> +/* >>>> + * __always_inline is used to ensure that func() is not an actual function >>>> + * pointer, which would result in the register accesses potentially being too >>>> + * far apart for the loop to work. >>>> + * >>>> + * The timeout is an arbitrary value well beyond the highest number >>>> + * of iterations the loop has been observed to take. >>>> + */ >>>> +static __always_inline u64 fsl_a008585_reread_counter(u64 (*func)(void)) >>>> +{ >>>> + u64 cval_old, cval_new; >>>> + int timeout = 200; >>>> + >>>> + do { >>>> + isb(); >>>> + cval_old = func(); >>>> + cval_new = func(); >>>> + timeout--; >>>> + } while (unlikely(cval_old != cval_new) && timeout); >>>> + >>>> + WARN_ON_ONCE(!timeout); >>>> + return cval_new; >>>> +} >>> Hi Scott: >>> >>> I have test this patch, this solution looks will break the performance a little more than I expected. >>> it will have more than 10% that the cval will read again, we could sure that the cval_old always equal to the >>> cval_new in the normal circumstances, so I prefer this way: >>> >>> do { >>> isb(); >>> cval_old = func(); >>> cval_new = func(); >>> timeout--; >>> } while (unlikely((cval_new - cval_old) >> 2) && timeout); >> >> What makes you think that ignoring the two bottom bits is a safe thing >> to do? Talking about performance when the HW has such a dramatic bug is >> like putting a bigger engine on a car that has no brakes: you just hit >> the wall quicker. >> >> Thanks, >> > > I have a chip which has the same problem like Scott's chip, and I > wish to solve this problem in the same way, our chip designer told me > that if you got a wrong value from the cntvct_el0, you would not get > a wrong value until 8 cycles later, so I could ignoring the lowest 3 > bits if I reading twice together. Is that CPU cycles? Or timer cycles? What guarantees do you have that the two reads are *always* done in the right timing window? > The key problem is the probability of this bug, my chip has 1/100000 > chance to met this bug, so use 10% performance to fix this bug looks > more expensive. You care about performance, I care about correctness. If 10% of your CPU is wasted on a correct workaround, tough. Next time, your HW guy will spend more time getting it right. Anyway, what you are describing is a different bug from what FSL has, so don't try and shoehorn your own constraints in another workaround. Please implement your own. Thanks, M.
On 2016/7/7 19:51, Marc Zyngier wrote: > On 07/07/16 12:37, Ding Tianhong wrote: >> On 2016/7/7 17:49, Marc Zyngier wrote: >>> On 07/07/16 10:34, Ding Tianhong wrote: >>>> On 2016/7/2 6:41, Scott Wood wrote: >>>>> Erratum A-008585 says that the ARM generic timer counter "has the >>>>> potential to contain an erroneous value for a small number of core >>>>> clock cycles every time the timer value changes". Accesses to TVAL >>>>> (both read and write) are also affected due to the implicit counter >>>>> read. Accesses to CVAL are not affected. >>>>> >>>>> The workaround is to reread TVAL and count registers until successive reads >>>>> return the same value, and when writing TVAL to retry until counter >>>>> reads before and after the write return the same value. >>>>> >>>>> This erratum can be found on LS1043A and LS2080A. >>>>> >>>>> Signed-off-by: Scott Wood <oss@buserror.net> >>>>> --- >>>>> v3: >>>>> - Used cval rather than a loop for the write side of the erratum >>>>> - Added a Kconfig control >>>>> - Moved the device tree binding into its own patch >>>>> - Added erratum to silicon-errata.txt >>>>> - Changed function names to contain the erratum name >>>>> - Factored out the setting of erratum versions of set_next_event >>>>> to improve readability >>>>> - Added a comment clarifying that the timeout is arbitrary >>>>> >>>>> v2: >>>>> Significant rework based on feedback, including using static_key, >>>>> disabling VDSO counter access rather than adding the workaround to the >>>>> VDSO, and uninlining the loops. >>>>> >>>>> Dropped the separate property for indicating that writes to TVAL are >>>>> affected, as I believe that's just a side effect of the implicit >>>>> counter read being corrupted, and thus a chip that is affected by one >>>>> will always be affected by the other. >>>>> >>>>> Dropped the arm32 portion as it seems there was confusion about whether >>>>> LS1021A is affected. Currently I am being told that it is not >>>>> affected. >>>>> >>>>> I considered writing to CVAL rather than looping on TVAL writes, but >>>>> that would still have required separate set_next_event() code for the >>>>> erratum, and adding CVAL to the enum would have required a bunch of >>>>> extra handlers in switch statements (even where unused, due to compiler >>>>> warnings about unhandled enum values) including in an arm32 header. It >>>>> seemed better to avoid the arm32 interaction and new untested >>>>> accessors. >>>>> --- >>>>> Documentation/arm64/silicon-errata.txt | 2 + >>>>> arch/arm64/include/asm/arch_timer.h | 48 ++++++++++++--- >>>>> drivers/clocksource/Kconfig | 10 ++++ >>>>> drivers/clocksource/arm_arch_timer.c | 103 +++++++++++++++++++++++++++++++++ >>>>> 4 files changed, 154 insertions(+), 9 deletions(-) >>>>> >>>>> diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt >>>>> index ba4b6ac..5778f62 100644 >>>>> --- a/Documentation/arm64/silicon-errata.txt >>>>> +++ b/Documentation/arm64/silicon-errata.txt >>>>> @@ -57,3 +57,5 @@ stable kernels. >>>>> | Cavium | ThunderX ITS | #22375, #24313 | CAVIUM_ERRATUM_22375 | >>>>> | Cavium | ThunderX GICv3 | #23154 | CAVIUM_ERRATUM_23154 | >>>>> | Cavium | ThunderX Core | #27456 | CAVIUM_ERRATUM_27456 | >>>>> +| | | | | >>>>> +| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 | >>>>> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h >>>>> index fbe0ca3..70fbad9 100644 >>>>> --- a/arch/arm64/include/asm/arch_timer.h >>>>> +++ b/arch/arm64/include/asm/arch_timer.h >>>>> @@ -23,10 +23,34 @@ >>>>> >>>>> #include <linux/bug.h> >>>>> #include <linux/init.h> >>>>> +#include <linux/jump_label.h> >>>>> #include <linux/types.h> >>>>> >>>>> #include <clocksource/arm_arch_timer.h> >>>>> >>>>> +extern struct static_key_false arch_timer_read_ool_enabled; >>>>> + >>>>> +#define ARCH_TIMER_REG_READ(reg, func) \ >>>>> +extern u64 func##_ool(void); \ >>>>> +static inline u64 __##func(void) \ >>>>> +{ \ >>>>> + u64 val; \ >>>>> + asm volatile("mrs %0, " reg : "=r" (val)); \ >>>>> + return val; \ >>>>> +} \ >>>>> +static inline u64 _##func(void) \ >>>>> +{ \ >>>>> + if (IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) && \ >>>>> + static_branch_unlikely(&arch_timer_read_ool_enabled)) \ >>>>> + return func##_ool(); \ >>>>> + else \ >>>>> + return __##func(); \ >>>>> +} >>>>> + >>>>> +ARCH_TIMER_REG_READ("cntp_tval_el0", arch_timer_get_ptval) >>>>> +ARCH_TIMER_REG_READ("cntv_tval_el0", arch_timer_get_vtval) >>>>> +ARCH_TIMER_REG_READ("cntvct_el0", arch_counter_get_cntvct) >>>>> + >>>>> /* >>>>> * These register accessors are marked inline so the compiler can >>>>> * nicely work out which register we want, and chuck away the rest of >>>>> @@ -58,6 +82,16 @@ void arch_timer_reg_write_cp15(int access, enum arch_timer_reg reg, u32 val) >>>>> isb(); >>>>> } >>>>> >>>>> +static __always_inline void arch_timer_cval_write_cp15(int access, u64 val) >>>>> +{ >>>>> + if (access == ARCH_TIMER_PHYS_ACCESS) >>>>> + asm volatile("msr cntp_cval_el0, %0" : : "r" (val)); >>>>> + else if (access == ARCH_TIMER_VIRT_ACCESS) >>>>> + asm volatile("msr cntv_cval_el0, %0" : : "r" (val)); >>>>> + >>>>> + isb(); >>>>> +} >>>>> + >>>>> static __always_inline >>>>> u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg) >>>>> { >>>>> @@ -66,19 +100,19 @@ u32 arch_timer_reg_read_cp15(int access, enum arch_timer_reg reg) >>>>> if (access == ARCH_TIMER_PHYS_ACCESS) { >>>>> switch (reg) { >>>>> case ARCH_TIMER_REG_CTRL: >>>>> - asm volatile("mrs %0, cntp_ctl_el0" : "=r" (val)); >>>>> + asm volatile("mrs %0, cntp_ctl_el0" : "=r" (val)); >>>>> break; >>>>> case ARCH_TIMER_REG_TVAL: >>>>> - asm volatile("mrs %0, cntp_tval_el0" : "=r" (val)); >>>>> + val = _arch_timer_get_ptval(); >>>>> break; >>>>> } >>>>> } else if (access == ARCH_TIMER_VIRT_ACCESS) { >>>>> switch (reg) { >>>>> case ARCH_TIMER_REG_CTRL: >>>>> - asm volatile("mrs %0, cntv_ctl_el0" : "=r" (val)); >>>>> + asm volatile("mrs %0, cntv_ctl_el0" : "=r" (val)); >>>>> break; >>>>> case ARCH_TIMER_REG_TVAL: >>>>> - asm volatile("mrs %0, cntv_tval_el0" : "=r" (val)); >>>>> + val = _arch_timer_get_vtval(); >>>>> break; >>>>> } >>>>> } >>>>> @@ -116,12 +150,8 @@ static inline u64 arch_counter_get_cntpct(void) >>>>> >>>>> static inline u64 arch_counter_get_cntvct(void) >>>>> { >>>>> - u64 cval; >>>>> - >>>>> isb(); >>>>> - asm volatile("mrs %0, cntvct_el0" : "=r" (cval)); >>>>> - >>>>> - return cval; >>>>> + return _arch_counter_get_cntvct(); >>>>> } >>>>> >>>>> static inline int arch_timer_arch_init(void) >>>>> diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig >>>>> index c346be6..672ddc3 100644 >>>>> --- a/drivers/clocksource/Kconfig >>>>> +++ b/drivers/clocksource/Kconfig >>>>> @@ -207,6 +207,16 @@ config ARM_ARCH_TIMER_EVTSTREAM >>>>> This must be disabled for hardware validation purposes to detect any >>>>> hardware anomalies of missing events. >>>>> >>>>> +config FSL_ERRATUM_A008585 >>>>> + bool "Workaround for Freescale/NXP Erratum A-008585" >>>>> + default y >>>>> + depends on ARM_ARCH_TIMER && ARM64 >>>>> + help >>>>> + This option enables a workaround for Freescale/NXP Erratum >>>>> + A-008585 ("ARM generic timer may contain an erroneous >>>>> + value"). The workaround will only be active if the >>>>> + fsl,erratum-a008585 property is found in the timer node. >>>>> + >>>>> config ARM_GLOBAL_TIMER >>>>> bool >>>>> select CLKSRC_OF if OF >>>>> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c >>>>> index 5152b38..7ead4eb 100644 >>>>> --- a/drivers/clocksource/arm_arch_timer.c >>>>> +++ b/drivers/clocksource/arm_arch_timer.c >>>>> @@ -83,6 +83,51 @@ static bool arch_timer_mem_use_virtual; >>>>> * Architected system timer support. >>>>> */ >>>>> >>>>> +#ifdef CONFIG_FSL_ERRATUM_A008585 >>>>> +DEFINE_STATIC_KEY_FALSE(arch_timer_read_ool_enabled); >>>>> +EXPORT_SYMBOL_GPL(arch_timer_read_ool_enabled); >>>>> + >>>>> +/* >>>>> + * __always_inline is used to ensure that func() is not an actual function >>>>> + * pointer, which would result in the register accesses potentially being too >>>>> + * far apart for the loop to work. >>>>> + * >>>>> + * The timeout is an arbitrary value well beyond the highest number >>>>> + * of iterations the loop has been observed to take. >>>>> + */ >>>>> +static __always_inline u64 fsl_a008585_reread_counter(u64 (*func)(void)) >>>>> +{ >>>>> + u64 cval_old, cval_new; >>>>> + int timeout = 200; >>>>> + >>>>> + do { >>>>> + isb(); >>>>> + cval_old = func(); >>>>> + cval_new = func(); >>>>> + timeout--; >>>>> + } while (unlikely(cval_old != cval_new) && timeout); >>>>> + >>>>> + WARN_ON_ONCE(!timeout); >>>>> + return cval_new; >>>>> +} >>>> Hi Scott: >>>> >>>> I have test this patch, this solution looks will break the performance a little more than I expected. >>>> it will have more than 10% that the cval will read again, we could sure that the cval_old always equal to the >>>> cval_new in the normal circumstances, so I prefer this way: >>>> >>>> do { >>>> isb(); >>>> cval_old = func(); >>>> cval_new = func(); >>>> timeout--; >>>> } while (unlikely((cval_new - cval_old) >> 2) && timeout); >>> >>> What makes you think that ignoring the two bottom bits is a safe thing >>> to do? Talking about performance when the HW has such a dramatic bug is >>> like putting a bigger engine on a car that has no brakes: you just hit >>> the wall quicker. >>> >>> Thanks, >>> >> >> I have a chip which has the same problem like Scott's chip, and I >> wish to solve this problem in the same way, our chip designer told me >> that if you got a wrong value from the cntvct_el0, you would not get >> a wrong value until 8 cycles later, so I could ignoring the lowest 3 >> bits if I reading twice together. > > Is that CPU cycles? Or timer cycles? What guarantees do you have that > the two reads are *always* done in the right timing window? > The timer counter only use 56 bits in aarch64, my chip would change one of the higher bit(55 to 3) to a wrong value when occur bug, so there will be more than 8 cycles between correct value and wrong value from the timer counter. Maybe Scott's problem is not just like mine. >> The key problem is the probability of this bug, my chip has 1/100000 >> chance to met this bug, so use 10% performance to fix this bug looks >> more expensive. > > You care about performance, I care about correctness. If 10% of your CPU > is wasted on a correct workaround, tough. Next time, your HW guy will > spend more time getting it right. > > Anyway, what you are describing is a different bug from what FSL has, so > don't try and shoehorn your own constraints in another workaround. > Please implement your own. > OK, In deed I should, thanks. Ding > Thanks, > > M. > -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2016-07-07 at 20:59 +0800, Ding Tianhong wrote: > On 2016/7/7 19:51, Marc Zyngier wrote: > > > > On 07/07/16 12:37, Ding Tianhong wrote: > > > > > > On 2016/7/7 17:49, Marc Zyngier wrote: > > > > > > > > What makes you think that ignoring the two bottom bits is a safe thing > > > > to do? Talking about performance when the HW has such a dramatic bug > > > > is > > > > like putting a bigger engine on a car that has no brakes: you just hit > > > > the wall quicker. > > > > > > > > Thanks, > > > > > > > I have a chip which has the same problem like Scott's chip, and I > > > wish to solve this problem in the same way, our chip designer told me > > > that if you got a wrong value from the cntvct_el0, you would not get > > > a wrong value until 8 cycles later, so I could ignoring the lowest 3 > > > bits if I reading twice together. > > Is that CPU cycles? Or timer cycles? What guarantees do you have that > > the two reads are *always* done in the right timing window? > > > The timer counter only use 56 bits in aarch64, my chip would change one of > the higher > bit(55 to 3) to a wrong value when occur bug, so there will be more than 8 > cycles between > correct value and wrong value from the timer counter. Maybe Scott's problem > is not just like > mine. It's not like yours. Most errors I saw were time going backwards by 1, 3, or 7 cycles (with occasional larger errors). -Scott -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2016/7/8 1:39, Scott Wood wrote: > On Thu, 2016-07-07 at 20:59 +0800, Ding Tianhong wrote: >> On 2016/7/7 19:51, Marc Zyngier wrote: >>> >>> On 07/07/16 12:37, Ding Tianhong wrote: >>>> >>>> On 2016/7/7 17:49, Marc Zyngier wrote: >>>>> >>>>> What makes you think that ignoring the two bottom bits is a safe thing >>>>> to do? Talking about performance when the HW has such a dramatic bug >>>>> is >>>>> like putting a bigger engine on a car that has no brakes: you just hit >>>>> the wall quicker. >>>>> >>>>> Thanks, >>>>> >>>> I have a chip which has the same problem like Scott's chip, and I >>>> wish to solve this problem in the same way, our chip designer told me >>>> that if you got a wrong value from the cntvct_el0, you would not get >>>> a wrong value until 8 cycles later, so I could ignoring the lowest 3 >>>> bits if I reading twice together. >>> Is that CPU cycles? Or timer cycles? What guarantees do you have that >>> the two reads are *always* done in the right timing window? >>> >> The timer counter only use 56 bits in aarch64, my chip would change one of >> the higher >> bit(55 to 3) to a wrong value when occur bug, so there will be more than 8 >> cycles between >> correct value and wrong value from the timer counter. Maybe Scott's problem >> is not just like >> mine. > > It's not like yours. Most errors I saw were time going backwards by 1, 3, or > 7 cycles (with occasional larger errors). > Looks more series, agree with your solution, thanks。 Thanks. Ding > -Scott > > > . > -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/Documentation/devicetree/bindings/arm/arch_timer.txt b/Documentation/devicetree/bindings/arm/arch_timer.txt index e774128..ef5fbe9 100644 --- a/Documentation/devicetree/bindings/arm/arch_timer.txt +++ b/Documentation/devicetree/bindings/arm/arch_timer.txt @@ -25,6 +25,12 @@ to deliver its interrupts via SPIs. - always-on : a boolean property. If present, the timer is powered through an always-on power domain, therefore it never loses context. +- fsl,erratum-a008585 : A boolean property. Indicates the presence of + QorIQ erratum A-008585, which says that reading the counter is + unreliable unless the same value is returned by back-to-back reads. + This also affects writes to the tval register, due to the implicit + counter read. + ** Optional properties: - arm,cpu-registers-not-fw-configured : Firmware does not initialize
This erratum describes a bug in logic outside the core, so MIDR can't be used to identify its presence, and reading an SoC-specific revision register from common arch timer code would be awkward. So, describe it in the device tree. Signed-off-by: Scott Wood <oss@buserror.net> --- Documentation/devicetree/bindings/arm/arch_timer.txt | 6 ++++++ 1 file changed, 6 insertions(+)