Message ID | 20100414014930.GI2538@linux.vnet.ibm.com |
---|---|
State | RFC |
Delegated to: | David Miller |
Headers | show |
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Date: Tue, 13 Apr 2010 18:49:30 -0700 > On Tue, Apr 13, 2010 at 05:13:06PM -0700, David Miller wrote: >> From: Frederic Weisbecker <fweisbec@gmail.com> >> Date: Wed, 14 Apr 2010 02:02:27 +0200 >> >> > I just have a guess though.... >> > This seems to always happen from NMI path, and lockdep is disabled on NMI. >> > I suspect the lock_acquire() performed by rcu_read_lock() is just ignored >> > and then the rcu_read_lock_held() check has the wrong result... >> >> Yeah, I bet that's it too. >> >> lock_is_held() can't return anything meaningful while lockdep is >> disabled, which it is during NMIs. > > Ah! So I just need to add a "current->lockdep_recursion" > check to debug_lockdep_rcu_enabled(). And move the function to > kernel/rcutree_plugin.h to avoid #include hell. > > See below for (untested) patch. Looks good to me, Frederic please test :-) -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Paul E. McKenney wrote: > On Tue, Apr 13, 2010 at 05:13:06PM -0700, David Miller wrote: >> From: Frederic Weisbecker <fweisbec@gmail.com> >> Date: Wed, 14 Apr 2010 02:02:27 +0200 >> >>> I just have a guess though.... >>> This seems to always happen from NMI path, and lockdep is disabled on NMI. >>> I suspect the lock_acquire() performed by rcu_read_lock() is just ignored >>> and then the rcu_read_lock_held() check has the wrong result... >> Yeah, I bet that's it too. >> >> lock_is_held() can't return anything meaningful while lockdep is >> disabled, which it is during NMIs. > > Ah! So I just need to add a "current->lockdep_recursion" > check to debug_lockdep_rcu_enabled(). And move the function to > kernel/rcutree_plugin.h to avoid #include hell. > > See below for (untested) patch. > > Thanx, Paul > > ------------------------------------------------------------------------ > > include/linux/rcupdate.h | 5 +---- > kernel/rcutree_plugin.h | 11 +++++++++++ > 2 files changed, 12 insertions(+), 4 deletions(-) > > commit 304d8da6cd791a81ce3164f867e1b3ef4f9af1d1 > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > Date: Tue Apr 13 18:45:51 2010 -0700 > > rcu: Make RCU lockdep check the lockdep_recursion variable > > The lockdep facility temporarily disables lockdep checking by incrementing > the current->lockdep_recursion variable. Such disabling happens in NMIs > and in other situations where lockdep might expect to recurse on itself. > This patch therefore checks current->lockdep_recursion, disabling RCU > lockdep splats when this variable is non-zero. > > Reported-by: Frederic Weisbecker <fweisbec@gmail.com> > Reported-by: David Miller <davem@davemloft.net> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h > index 9f1ddfe..07db2fe 100644 > --- a/include/linux/rcupdate.h > +++ b/include/linux/rcupdate.h > @@ -101,10 +101,7 @@ extern struct lockdep_map rcu_sched_lock_map; > # define rcu_read_release_sched() \ > lock_release(&rcu_sched_lock_map, 1, _THIS_IP_) > > -static inline int debug_lockdep_rcu_enabled(void) > -{ > - return likely(rcu_scheduler_active && debug_locks); > -} > +extern int debug_lockdep_rcu_enabled(void); > > /** > * rcu_read_lock_held - might we be in RCU read-side critical section? > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h > index 79b53bd..2169abe 100644 > --- a/kernel/rcutree_plugin.h > +++ b/kernel/rcutree_plugin.h > @@ -1067,3 +1067,14 @@ static void rcu_needs_cpu_flush(void) > } > > #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */ > + > +#ifdef CONFIG_DEBUG_LOCK_ALLOC > + > +int debug_lockdep_rcu_enabled(void) > +{ > + return likely(rcu_scheduler_active && > + debug_locks && > + current->lockdep_recursion == 0); > +} > + Looks good to me too, but I think 'likely' is needless since the function is not inline. -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 14, 2010 at 11:34:33AM +0800, Lai Jiangshan wrote: > Paul E. McKenney wrote: > > On Tue, Apr 13, 2010 at 05:13:06PM -0700, David Miller wrote: > >> From: Frederic Weisbecker <fweisbec@gmail.com> > >> Date: Wed, 14 Apr 2010 02:02:27 +0200 > >> > >>> I just have a guess though.... > >>> This seems to always happen from NMI path, and lockdep is disabled on NMI. > >>> I suspect the lock_acquire() performed by rcu_read_lock() is just ignored > >>> and then the rcu_read_lock_held() check has the wrong result... > >> Yeah, I bet that's it too. > >> > >> lock_is_held() can't return anything meaningful while lockdep is > >> disabled, which it is during NMIs. > > > > Ah! So I just need to add a "current->lockdep_recursion" > > check to debug_lockdep_rcu_enabled(). And move the function to > > kernel/rcutree_plugin.h to avoid #include hell. > > > > See below for (untested) patch. > > > > Thanx, Paul > > > > ------------------------------------------------------------------------ > > > > include/linux/rcupdate.h | 5 +---- > > kernel/rcutree_plugin.h | 11 +++++++++++ > > 2 files changed, 12 insertions(+), 4 deletions(-) > > > > commit 304d8da6cd791a81ce3164f867e1b3ef4f9af1d1 > > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > Date: Tue Apr 13 18:45:51 2010 -0700 > > > > rcu: Make RCU lockdep check the lockdep_recursion variable > > > > The lockdep facility temporarily disables lockdep checking by incrementing > > the current->lockdep_recursion variable. Such disabling happens in NMIs > > and in other situations where lockdep might expect to recurse on itself. > > This patch therefore checks current->lockdep_recursion, disabling RCU > > lockdep splats when this variable is non-zero. > > > > Reported-by: Frederic Weisbecker <fweisbec@gmail.com> > > Reported-by: David Miller <davem@davemloft.net> > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h > > index 9f1ddfe..07db2fe 100644 > > --- a/include/linux/rcupdate.h > > +++ b/include/linux/rcupdate.h > > @@ -101,10 +101,7 @@ extern struct lockdep_map rcu_sched_lock_map; > > # define rcu_read_release_sched() \ > > lock_release(&rcu_sched_lock_map, 1, _THIS_IP_) > > > > -static inline int debug_lockdep_rcu_enabled(void) > > -{ > > - return likely(rcu_scheduler_active && debug_locks); > > -} > > +extern int debug_lockdep_rcu_enabled(void); > > > > /** > > * rcu_read_lock_held - might we be in RCU read-side critical section? > > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h > > index 79b53bd..2169abe 100644 > > --- a/kernel/rcutree_plugin.h > > +++ b/kernel/rcutree_plugin.h > > @@ -1067,3 +1067,14 @@ static void rcu_needs_cpu_flush(void) > > } > > > > #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */ > > + > > +#ifdef CONFIG_DEBUG_LOCK_ALLOC > > + > > +int debug_lockdep_rcu_enabled(void) > > +{ > > + return likely(rcu_scheduler_active && > > + debug_locks && > > + current->lockdep_recursion == 0); > > +} > > + > > Looks good to me too, but I think > 'likely' is needless since the function is not inline. Good point. And to add injury to insult, I forgot EXPORT_SYMBOL_GPL(). Updated patch in the works. Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 14, 2010 at 08:43:02AM -0700, Paul E. McKenney wrote: > On Wed, Apr 14, 2010 at 11:34:33AM +0800, Lai Jiangshan wrote: > > Paul E. McKenney wrote: > > > On Tue, Apr 13, 2010 at 05:13:06PM -0700, David Miller wrote: > > >> From: Frederic Weisbecker <fweisbec@gmail.com> > > >> Date: Wed, 14 Apr 2010 02:02:27 +0200 > > >> > > >>> I just have a guess though.... > > >>> This seems to always happen from NMI path, and lockdep is disabled on NMI. > > >>> I suspect the lock_acquire() performed by rcu_read_lock() is just ignored > > >>> and then the rcu_read_lock_held() check has the wrong result... > > >> Yeah, I bet that's it too. > > >> > > >> lock_is_held() can't return anything meaningful while lockdep is > > >> disabled, which it is during NMIs. > > > > > > Ah! So I just need to add a "current->lockdep_recursion" > > > check to debug_lockdep_rcu_enabled(). And move the function to > > > kernel/rcutree_plugin.h to avoid #include hell. > > > > > > See below for (untested) patch. > > > > > > Thanx, Paul > > > > > > ------------------------------------------------------------------------ > > > > > > include/linux/rcupdate.h | 5 +---- > > > kernel/rcutree_plugin.h | 11 +++++++++++ > > > 2 files changed, 12 insertions(+), 4 deletions(-) > > > > > > commit 304d8da6cd791a81ce3164f867e1b3ef4f9af1d1 > > > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > > Date: Tue Apr 13 18:45:51 2010 -0700 > > > > > > rcu: Make RCU lockdep check the lockdep_recursion variable > > > > > > The lockdep facility temporarily disables lockdep checking by incrementing > > > the current->lockdep_recursion variable. Such disabling happens in NMIs > > > and in other situations where lockdep might expect to recurse on itself. > > > This patch therefore checks current->lockdep_recursion, disabling RCU > > > lockdep splats when this variable is non-zero. > > > > > > Reported-by: Frederic Weisbecker <fweisbec@gmail.com> > > > Reported-by: David Miller <davem@davemloft.net> > > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > > > > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h > > > index 9f1ddfe..07db2fe 100644 > > > --- a/include/linux/rcupdate.h > > > +++ b/include/linux/rcupdate.h > > > @@ -101,10 +101,7 @@ extern struct lockdep_map rcu_sched_lock_map; > > > # define rcu_read_release_sched() \ > > > lock_release(&rcu_sched_lock_map, 1, _THIS_IP_) > > > > > > -static inline int debug_lockdep_rcu_enabled(void) > > > -{ > > > - return likely(rcu_scheduler_active && debug_locks); > > > -} > > > +extern int debug_lockdep_rcu_enabled(void); > > > > > > /** > > > * rcu_read_lock_held - might we be in RCU read-side critical section? > > > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h > > > index 79b53bd..2169abe 100644 > > > --- a/kernel/rcutree_plugin.h > > > +++ b/kernel/rcutree_plugin.h > > > @@ -1067,3 +1067,14 @@ static void rcu_needs_cpu_flush(void) > > > } > > > > > > #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */ > > > + > > > +#ifdef CONFIG_DEBUG_LOCK_ALLOC > > > + > > > +int debug_lockdep_rcu_enabled(void) > > > +{ > > > + return likely(rcu_scheduler_active && > > > + debug_locks && > > > + current->lockdep_recursion == 0); > > > +} > > > + > > > > Looks good to me too, but I think > > 'likely' is needless since the function is not inline. > > Good point. And to add injury to insult, I forgot EXPORT_SYMBOL_GPL(). > > Updated patch in the works. Note I just tested the patch the previous one and it looks fine now. You can then safely consider the "general idea" fixes the problem :) Thanks. -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 14, 2010 at 05:51:11PM +0200, Frederic Weisbecker wrote: > On Wed, Apr 14, 2010 at 08:43:02AM -0700, Paul E. McKenney wrote: > > On Wed, Apr 14, 2010 at 11:34:33AM +0800, Lai Jiangshan wrote: > > > Paul E. McKenney wrote: > > > > On Tue, Apr 13, 2010 at 05:13:06PM -0700, David Miller wrote: > > > >> From: Frederic Weisbecker <fweisbec@gmail.com> > > > >> Date: Wed, 14 Apr 2010 02:02:27 +0200 > > > >> > > > >>> I just have a guess though.... > > > >>> This seems to always happen from NMI path, and lockdep is disabled on NMI. > > > >>> I suspect the lock_acquire() performed by rcu_read_lock() is just ignored > > > >>> and then the rcu_read_lock_held() check has the wrong result... > > > >> Yeah, I bet that's it too. > > > >> > > > >> lock_is_held() can't return anything meaningful while lockdep is > > > >> disabled, which it is during NMIs. > > > > > > > > Ah! So I just need to add a "current->lockdep_recursion" > > > > check to debug_lockdep_rcu_enabled(). And move the function to > > > > kernel/rcutree_plugin.h to avoid #include hell. > > > > > > > > See below for (untested) patch. > > > > > > > > Thanx, Paul > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > include/linux/rcupdate.h | 5 +---- > > > > kernel/rcutree_plugin.h | 11 +++++++++++ > > > > 2 files changed, 12 insertions(+), 4 deletions(-) > > > > > > > > commit 304d8da6cd791a81ce3164f867e1b3ef4f9af1d1 > > > > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > > > Date: Tue Apr 13 18:45:51 2010 -0700 > > > > > > > > rcu: Make RCU lockdep check the lockdep_recursion variable > > > > > > > > The lockdep facility temporarily disables lockdep checking by incrementing > > > > the current->lockdep_recursion variable. Such disabling happens in NMIs > > > > and in other situations where lockdep might expect to recurse on itself. > > > > This patch therefore checks current->lockdep_recursion, disabling RCU > > > > lockdep splats when this variable is non-zero. > > > > > > > > Reported-by: Frederic Weisbecker <fweisbec@gmail.com> > > > > Reported-by: David Miller <davem@davemloft.net> > > > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > > > > > > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h > > > > index 9f1ddfe..07db2fe 100644 > > > > --- a/include/linux/rcupdate.h > > > > +++ b/include/linux/rcupdate.h > > > > @@ -101,10 +101,7 @@ extern struct lockdep_map rcu_sched_lock_map; > > > > # define rcu_read_release_sched() \ > > > > lock_release(&rcu_sched_lock_map, 1, _THIS_IP_) > > > > > > > > -static inline int debug_lockdep_rcu_enabled(void) > > > > -{ > > > > - return likely(rcu_scheduler_active && debug_locks); > > > > -} > > > > +extern int debug_lockdep_rcu_enabled(void); > > > > > > > > /** > > > > * rcu_read_lock_held - might we be in RCU read-side critical section? > > > > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h > > > > index 79b53bd..2169abe 100644 > > > > --- a/kernel/rcutree_plugin.h > > > > +++ b/kernel/rcutree_plugin.h > > > > @@ -1067,3 +1067,14 @@ static void rcu_needs_cpu_flush(void) > > > > } > > > > > > > > #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */ > > > > + > > > > +#ifdef CONFIG_DEBUG_LOCK_ALLOC > > > > + > > > > +int debug_lockdep_rcu_enabled(void) > > > > +{ > > > > + return likely(rcu_scheduler_active && > > > > + debug_locks && > > > > + current->lockdep_recursion == 0); > > > > +} > > > > + > > > > > > Looks good to me too, but I think > > > 'likely' is needless since the function is not inline. > > > > Good point. And to add injury to insult, I forgot EXPORT_SYMBOL_GPL(). > > > > Updated patch in the works. > > > Note I just tested the patch the previous one and it looks fine now. > You can then safely consider the "general idea" fixes the problem :) Thank you, Frederic!!! Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 9f1ddfe..07db2fe 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -101,10 +101,7 @@ extern struct lockdep_map rcu_sched_lock_map; # define rcu_read_release_sched() \ lock_release(&rcu_sched_lock_map, 1, _THIS_IP_) -static inline int debug_lockdep_rcu_enabled(void) -{ - return likely(rcu_scheduler_active && debug_locks); -} +extern int debug_lockdep_rcu_enabled(void); /** * rcu_read_lock_held - might we be in RCU read-side critical section? diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h index 79b53bd..2169abe 100644 --- a/kernel/rcutree_plugin.h +++ b/kernel/rcutree_plugin.h @@ -1067,3 +1067,14 @@ static void rcu_needs_cpu_flush(void) } #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */ + +#ifdef CONFIG_DEBUG_LOCK_ALLOC + +int debug_lockdep_rcu_enabled(void) +{ + return likely(rcu_scheduler_active && + debug_locks && + current->lockdep_recursion == 0); +} + +#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */