| Message ID | 20180309054708.14538-1-stewart@linux.vnet.ibm.com |
|---|---|
| State | Accepted |
| Series | Don't detect lock timeouts when timebase is invalid |
On Fri, Mar 9, 2018 at 4:47 PM, Stewart Smith <stewart@linux.vnet.ibm.com> wrote:
> We can have threads waiting on hmi_lock who have an
> invalid timebase. Because of this, we want to go poke
> the register directly rather than rely on
> this_cpu()->tb_invalid (which won't have been set yet).

Maybe we should just switch the polarity of tb_invalid to tb_valid instead?

> Without this patch, you get something like this when
> you're injecting timebase errors:
> [10976.202052846,4] WARNING: Lock has been spinning for 10976394ms
>
> Fixes: 84186ef0944c9413262f0974ddab3fb1343ccfe8
> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
> ---
>  core/lock.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/core/lock.c b/core/lock.c
> index 2d94ebb4e285..318f42d92d51 100644
> --- a/core/lock.c
> +++ b/core/lock.c
> @@ -140,6 +140,13 @@ static inline bool lock_timeout(unsigned long start)
>  	unsigned long wait = tb_to_msecs(mftb());
>
>  	if (wait - start > LOCK_TIMEOUT_MS) {
> +		/*
> +		 * If the timebase is invalid, we shouldn't
> +		 * throw an error. This is possible with pending HMIs
> +		 * that need to recover TB.
> +		 */
> +		if( !(mfspr(SPR_TFMR) & SPR_TFMR_TB_VALID))
> +			return false;
>  		prlog(PR_WARNING, "WARNING: Lock has been "\
>  			"spinning for %lums\n", wait - start);
>  		backtrace();
> --
> 2.14.3
Oliver <oohall@gmail.com> writes:
> On Fri, Mar 9, 2018 at 4:47 PM, Stewart Smith
> <stewart@linux.vnet.ibm.com> wrote:
>> We can have threads waiting on hmi_lock who have an
>> invalid timebase. Because of this, we want to go poke
>> the register directly rather than rely on
>> this_cpu()->tb_invalid (which won't have been set yet).
>
> Maybe we should just switch the polarity of tb_invalid to tb_valid
> instead?

Possibly? Either way, there's going to be a window where it lies. I do kind of like the "if (this_cpu()->tb_valid)" pattern more than the invalid one.
Stewart Smith <stewart@linux.vnet.ibm.com> writes:
> We can have threads waiting on hmi_lock who have an
> invalid timebase. Because of this, we want to go poke
> the register directly rather than rely on
> this_cpu()->tb_invalid (which won't have been set yet).
>
> Without this patch, you get something like this when
> you're injecting timebase errors:
> [10976.202052846,4] WARNING: Lock has been spinning for 10976394ms
>
> Fixes: 84186ef0944c9413262f0974ddab3fb1343ccfe8
> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
> ---
>  core/lock.c | 7 +++++++
>  1 file changed, 7 insertions(+)

Merged to master as of a8e6cc3f47525f86ef1d69d69a477b6264d0f8ee
diff --git a/core/lock.c b/core/lock.c
index 2d94ebb4e285..318f42d92d51 100644
--- a/core/lock.c
+++ b/core/lock.c
@@ -140,6 +140,13 @@ static inline bool lock_timeout(unsigned long start)
 	unsigned long wait = tb_to_msecs(mftb());

 	if (wait - start > LOCK_TIMEOUT_MS) {
+		/*
+		 * If the timebase is invalid, we shouldn't
+		 * throw an error. This is possible with pending HMIs
+		 * that need to recover TB.
+		 */
+		if( !(mfspr(SPR_TFMR) & SPR_TFMR_TB_VALID))
+			return false;
 		prlog(PR_WARNING, "WARNING: Lock has been "\
 			"spinning for %lums\n", wait - start);
 		backtrace();
We can have threads waiting on hmi_lock who have an
invalid timebase. Because of this, we want to go poke
the register directly rather than rely on
this_cpu()->tb_invalid (which won't have been set yet).

Without this patch, you get something like this when
you're injecting timebase errors:
[10976.202052846,4] WARNING: Lock has been spinning for 10976394ms

Fixes: 84186ef0944c9413262f0974ddab3fb1343ccfe8
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
---
 core/lock.c | 7 +++++++
 1 file changed, 7 insertions(+)