Message ID | 20150714114305.17434.53731.stgit@buzz |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
On Tue, 2015-07-14 at 14:43 +0300, Konstantin Khlebnikov wrote:
> Kernel generates a lot of warnings when dst entry reference counter
> overflows and becomes negative. This patch prints address of dst entry,
> its refcount and then resets reference counter to INT_MAX/2.
>
> That bug was seen several times at machines with outdated 3.10.y kernels.
> Most like it's already fixed in upstream. Anyway flood of that warnings
> completely kills machine and makes further debugging impossible.
>
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> ---
>  net/core/dst.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/dst.c b/net/core/dst.c
> index e956ce6d1378..2ed91082b3cf 100644
> --- a/net/core/dst.c
> +++ b/net/core/dst.c
> @@ -284,7 +284,8 @@ void dst_release(struct dst_entry *dst)
>  		int newrefcnt;
>
>  		newrefcnt = atomic_dec_return(&dst->__refcnt);
> -		WARN_ON(newrefcnt < 0);
> +		if (WARN(newrefcnt < 0, "dst: %p refcnt: %d\n", dst, newrefcnt))
> +			atomic_set(&dst->__refcnt, INT_MAX / 2);
>  		if (unlikely(dst->flags & DST_NOCACHE) && !newrefcnt)
>  			call_rcu(&dst->rcu_head, dst_destroy_rcu);
>  	}

WARN_ON_ONCE() if you want, but setting __refcnt like this is absolutely
a dirty hack.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
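As an aside on the warn-once suggestion above, here is a minimal user-space sketch in C of what "warn once" means. It is not kernel code: `model_warn_once`, `model_dst`, `model_dst_release` and `warnings_printed` are hypothetical names invented for this illustration, and a plain `int` stands in for the kernel's atomic counter (so the model is single-threaded only).

```c
#include <stdbool.h>
#include <stdio.h>

/* Number of warnings actually printed (for inspection only). */
static int warnings_printed;

/*
 * Hypothetical stand-in for a warn-once helper: reports the condition
 * the first time it is true for a given flag, stays silent afterwards,
 * and always returns the condition so callers can still branch on it.
 */
static bool model_warn_once(bool cond, bool *already_warned, const char *what)
{
    if (cond && !*already_warned) {
        *already_warned = true;
        warnings_printed++;
        fprintf(stderr, "WARNING (once): %s\n", what);
    }
    return cond;
}

/* Simplified dst entry: only the reference counter is modelled. */
struct model_dst {
    int refcnt;
};

/* Drops one reference; warns once per call site if the counter underflows. */
static void model_dst_release(struct model_dst *dst)
{
    static bool warned; /* one flag per call site */
    int newrefcnt = --dst->refcnt;

    /* The counter is only inspected here, never rewritten. */
    model_warn_once(newrefcnt < 0, &warned, "dst refcnt went negative");
}
```

Calling `model_dst_release()` a thousand times on an entry whose counter starts at 0 prints a single warning and leaves the counter at -1000. The flood is gone, but so are the details of every later underflow.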
On 14.07.2015 15:04, Eric Dumazet wrote:
> On Tue, 2015-07-14 at 14:43 +0300, Konstantin Khlebnikov wrote:
>> Kernel generates a lot of warnings when dst entry reference counter
>> overflows and becomes negative. This patch prints address of dst entry,
>> its refcount and then resets reference counter to INT_MAX/2.
>>
>> That bug was seen several times at machines with outdated 3.10.y kernels.
>> Most like it's already fixed in upstream. Anyway flood of that warnings
>> completely kills machine and makes further debugging impossible.
>>
>> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>> ---
>>  net/core/dst.c |    3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/core/dst.c b/net/core/dst.c
>> index e956ce6d1378..2ed91082b3cf 100644
>> --- a/net/core/dst.c
>> +++ b/net/core/dst.c
>> @@ -284,7 +284,8 @@ void dst_release(struct dst_entry *dst)
>>  		int newrefcnt;
>>
>>  		newrefcnt = atomic_dec_return(&dst->__refcnt);
>> -		WARN_ON(newrefcnt < 0);
>> +		if (WARN(newrefcnt < 0, "dst: %p refcnt: %d\n", dst, newrefcnt))
>> +			atomic_set(&dst->__refcnt, INT_MAX / 2);
>>  		if (unlikely(dst->flags & DST_NOCACHE) && !newrefcnt)
>>  			call_rcu(&dst->rcu_head, dst_destroy_rcu);
>>  	}
>
> WARN_ON_ONCE() if you want, but setting __refcnt like this is absolutely
> a dirty hack.

Simple warn-once will hide a lot of information which could be useful.
Also dst entry leak is better than freeing actually active entry.
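The "leak is better than freeing an active entry" argument can be made concrete with a small user-space model. This is a sketch under stated assumptions, not the kernel implementation: `model_dst`, `model_hold`, `model_release_plain` and `model_release_with_reset` are hypothetical names, the counter is a plain `int` rather than an atomic, and the entry is treated as if `DST_NOCACHE` were always set, so reaching zero frees it.

```c
#include <limits.h>
#include <stdbool.h>

/* Simplified dst entry: a counter plus a flag recording that it was freed. */
struct model_dst {
    int refcnt;
    bool freed;
};

/* Takes one reference. */
static void model_hold(struct model_dst *dst)
{
    dst->refcnt++;
}

/* Unpatched shape: decrement, free when the new value is exactly zero. */
static void model_release_plain(struct model_dst *dst)
{
    int newrefcnt = --dst->refcnt;

    if (newrefcnt == 0)
        dst->freed = true;
}

/*
 * Patched shape: on underflow, park the counter at INT_MAX / 2.
 * The free decision still uses the local newrefcnt, as in the patch,
 * so the release that detects the underflow never frees the entry.
 */
static void model_release_with_reset(struct model_dst *dst)
{
    int newrefcnt = --dst->refcnt;

    if (newrefcnt < 0)
        dst->refcnt = INT_MAX / 2;
    if (newrefcnt == 0)
        dst->freed = true;
}
```

Starting from a counter of 0, one stray release followed by two holds and one release frees the entry in the plain variant while one holder is still using it. In the reset variant the same sequence leaves the counter at `INT_MAX / 2 + 1`, so the entry is never freed, which is the leak being traded for safety.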
On Tue, 2015-07-14 at 15:15 +0300, Konstantin Khlebnikov wrote:

> Simple warn-once will hide a lot of information which could be useful.
> Also dst entry leak is better than freeing actually active entry.

Then BUG_ON() .

Really, we need to fix leaks, not brown paper them.
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 14 Jul 2015 14:26:07 +0200

> On Tue, 2015-07-14 at 15:15 +0300, Konstantin Khlebnikov wrote:
>
>> Simple warn-once will hide a lot of information which could be useful.
>> Also dst entry leak is better than freeing actually active entry.
>
> Then BUG_ON() .
>
> Really, we need to fix leaks, not brown paper them.

No, killing the machine is not the answer.

If you want to rate limit this message, do it on a per-device basis,
but without corrupting the netdev state in the process.
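One way to read the per-device rate-limiting suggestion is sketched below as a user-space C model. Everything here is an assumption made for illustration: `model_ratelimit`, `model_netdev`, `model_ratelimit_ok` and `model_should_warn` are hypothetical names, time is a caller-supplied tick count, and the kernel's own rate-limiting helpers are deliberately not used. The point is only that the limiter keeps its own state per device and never writes to the reference counter.

```c
#include <stdbool.h>

/* Hypothetical limiter state: at most `burst` messages per `interval` ticks. */
struct model_ratelimit {
    long interval; /* window length, in caller-defined ticks */
    int burst;     /* messages allowed per window */
    long begin;    /* tick at which the current window started */
    int printed;   /* messages emitted in the current window */
    int missed;    /* messages suppressed in the current window */
};

/* Hypothetical device: each one carries its own limiter. */
struct model_netdev {
    const char *name;
    struct model_ratelimit refcnt_warn_rl;
};

/* Returns true if a message may be emitted at tick `now`. */
static bool model_ratelimit_ok(struct model_ratelimit *rl, long now)
{
    if (now - rl->begin >= rl->interval) {
        /* Window expired: start a new one. */
        rl->begin = now;
        rl->printed = 0;
        rl->missed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    rl->missed++;
    return false;
}

/*
 * Decides whether to warn about a negative refcount on this device.
 * The refcount is passed by value and only read, never modified.
 */
static bool model_should_warn(struct model_netdev *dev, int refcnt, long now)
{
    return refcnt < 0 && model_ratelimit_ok(&dev->refcnt_warn_rl, now);
}
```

With `interval = 5` and `burst = 2`, ten underflows on one device within the same window produce two warnings and eight suppressed ones, while a second device still gets its own budget.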
diff --git a/net/core/dst.c b/net/core/dst.c
index e956ce6d1378..2ed91082b3cf 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -284,7 +284,8 @@ void dst_release(struct dst_entry *dst)
 		int newrefcnt;

 		newrefcnt = atomic_dec_return(&dst->__refcnt);
-		WARN_ON(newrefcnt < 0);
+		if (WARN(newrefcnt < 0, "dst: %p refcnt: %d\n", dst, newrefcnt))
+			atomic_set(&dst->__refcnt, INT_MAX / 2);
 		if (unlikely(dst->flags & DST_NOCACHE) && !newrefcnt)
 			call_rcu(&dst->rcu_head, dst_destroy_rcu);
 	}
Kernel generates a lot of warnings when dst entry reference counter
overflows and becomes negative. This patch prints address of dst entry,
its refcount and then resets reference counter to INT_MAX/2.

That bug was seen several times at machines with outdated 3.10.y kernels.
Most like it's already fixed in upstream. Anyway flood of that warnings
completely kills machine and makes further debugging impossible.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
 net/core/dst.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)