Message ID | 1357753459-12872-1-git-send-email-roland@kernel.org |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
From: Roland Dreier <roland@kernel.org> Date: Wed, 9 Jan 2013 09:44:19 -0800 > When a bonding interface changes slaves (or goes from no active slaves > to having a slave with link), the bonding driver generates a > NETDEV_CHANGEADDR notification. In this case, the ipv6 neighbour > discovery code calls neigh_changeaddr(), which basically just calls > neigh_flush_dev(). > > Now, neigh_flush_dev() just goes through the neighbour hash table and > tries to free every neighbour from the device being flushed. However, > if someone else has an additional reference on the neighbour, we hit ... > which leaves the final freeing of the "stray" neighbour until the last > reference is dropped; in the meantime the output function is set to > neigh_blackhole, which does nothing but free the skb and return ENETDOWN. Another reason we must make ipv6 like ipv4, which looks up neighbours on demand at packet output time rather than caching them in the route entries. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> Another reason we must make ipv6 like ipv4, which looks up neighbours > on demand at packet output time rather than caching them in the route > entries. Presumable the neighbour could be left cached in the route entry provided the code checks that the cached entry is still valid and does a relookup if not (rather than a send-blackhole). David -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 09, 2013 at 09:44:19AM -0800, Roland Dreier wrote: > From: Roland Dreier <roland@purestorage.com> > > This patch is as much bug report as it is a proposal to merge this > specific patch. The problem is definitely real; we hit it in a > situation where we have two systems connected back-to-back with two > bonded links between them, one system reboots, and the other system > gets NETDEV_CHANGEADDR. This patch definitely fixes that case for us, > but I'm not sure it's the right place to fix this, or if it covers all > the cases where this could happen. Anyway... Great analysis, thanks! There is bug report which indicates a fix to this problem could also solve some other corner cases regarding neighbour discovery: https://bugzilla.kernel.org/show_bug.cgi?id=42991 I will try to reproduce it at the weekend. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jan 11, 2013 at 02:40:58PM +0100, Hannes Frederic Sowa wrote: > On Wed, Jan 09, 2013 at 09:44:19AM -0800, Roland Dreier wrote: > > From: Roland Dreier <roland@purestorage.com> > > > > This patch is as much bug report as it is a proposal to merge this > > specific patch. The problem is definitely real; we hit it in a > > situation where we have two systems connected back-to-back with two > > bonded links between them, one system reboots, and the other system > > gets NETDEV_CHANGEADDR. This patch definitely fixes that case for us, > > but I'm not sure it's the right place to fix this, or if it covers all > > the cases where this could happen. Anyway... > > Great analysis, thanks! > > There is bug report which indicates a fix to this problem could > also solve some other corner cases regarding neighbour discovery: > https://bugzilla.kernel.org/show_bug.cgi?id=42991 The report I meant was actually not referred in this bug report: http://article.gmane.org/gmane.linux.network/224832 ENETDOWN was also observed via bridges. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jan 11, 2013 at 6:01 AM, Hannes Frederic Sowa <hannes@stressinduktion.org> wrote: > The report I meant was actually not referred in this bug report: > http://article.gmane.org/gmane.linux.network/224832 > > ENETDOWN was also observed via bridges. Yes, I would have to think any report of ENETDOWN with ipv6 and bridging is quite likely to be the same issue I hit. - R. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jan 10, 2013 at 2:18 PM, David Miller <davem@davemloft.net> wrote: > Another reason we must make ipv6 like ipv4, which looks up neighbours > on demand at packet output time rather than caching them in the route > entries. Not sure I'm qualified to perform that level of surgery, but I'll take a look at ipv4 and try to understand how that works. In the meantime does it make sense to put a smaller bandaid in 3.8? I'm pretty sure this is a regression (we never saw it with older kernels), probably due to some part of the route cache improvements you've been doing. - R. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Roland Dreier <roland@kernel.org> Date: Fri, 11 Jan 2013 13:13:42 -0800 > In the meantime does it make sense to put a smaller bandaid in 3.8? I want it fixed properly from the start, sorry. > I'm pretty sure this is a regression (we never saw it with older > kernels), probably due to some part of the route cache improvements > you've been doing. I do not believe this is the situation at all. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 710cafd..5895b1c 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -1602,8 +1602,9 @@ static int fib6_age(struct rt6_info *rt, void *arg) } gc_args.more++; } else if (rt->rt6i_flags & RTF_CACHE) { - if (atomic_read(&rt->dst.__refcnt) == 0 && - time_after_eq(now, rt->dst.lastuse + gc_args.timeout)) { + if ((rt->n && rt->n->dead) || + (atomic_read(&rt->dst.__refcnt) == 0 && + time_after_eq(now, rt->dst.lastuse + gc_args.timeout))) { RT6_TRACE("aging clone %p\n", rt); return -1; } else if (rt->rt6i_flags & RTF_GATEWAY) {