Message ID | 20200505185723.191944-1-zenczykowski@gmail.com |
---|---|
State | Accepted |
Delegated to: | David Miller |
Headers | show |
Series | Revert "ipv6: add mtu lock check in __ip6_rt_update_pmtu" | expand |
From: Maciej Żenczykowski <zenczykowski@gmail.com> Date: Tue, 5 May 2020 11:57:23 -0700 > The above reasoning is incorrect. IPv6 *requires* icmp based pmtu to work. > There's already a comment to this effect elsewhere in the kernel: How can an internet standard specify a system local policy decision on this level? If I want to lock the MTU value on my ipv6 routes, I have a reason and I should be able to do it.
I don't buy your argument at all. There's *lots* of places where internet standards prevent Linux from doing various things. Trying to prevent users from shooting themselves in the foot, or trying to be a good netizen. If users require their computers to be broken, they can patch and build their own kernels. Indeed, the entire point of internet standards is interoperability and specifying things that must or must not be done. To quote from https://tools.ietf.org/html/rfc8201 Nodes not implementing Path MTU Discovery must use the IPv6 minimum link MTU defined in [RFC8200] as the maximum packet size. (my comment: ie. 1280) ... Note that Path MTU Discovery must be performed even in cases where a node "thinks" a destination is attached to the same link as itself, as it might have a PMTU lower than the link MTU. In a situation such as when a neighboring router acts as proxy [ND] for some destination, the destination can appear to be directly connected, but it is in fact more than one hop away. ... When a node receives a Packet Too Big message, it must reduce its estimate of the PMTU for the relevant path, based on the value of the MTU field in the message. ... After receiving a Packet Too Big message, a node must attempt to avoid eliciting more such messages in the near future. The node must reduce the size of the packets it is sending along the path ... Because each of these messages (and the dropped packets they respond to) consume network resources, nodes using Path MTU Discovery must detect decreases in PMTU as fast as possible. -- Furthermore, as we're finally upgrading to 4.9+ kernels, we now have customers complaining about broken ipv6 pmtud. This is a userspace visible regression in previously correct behaviour. And we do have a reason for locking the mtu with the old pre-4.9 behaviour: So we can change the mtu of the interfaces without it affecting the mtu of the routes through those interfaces.
From: Maciej Żenczykowski <zenczykowski@gmail.com> Date: Tue, 5 May 2020 14:56:03 -0700 > There's *lots* of places where internet standards prevent Linux from > doing various things. "Linux" generally speaking? That's true only if "rm -rf net/netfilter/ net/ipv4/netfilter net/ipv6/netfilter" Also, insert an XDP program... --> "non compliant" And so on and so forth. :-) It's local system policy, how do I react to packets. If it doesn't violate the min/max limits for ipv6 packets it emits onto the internet I don't see this as something that can be seen as mandatory.
> It's local system policy, how do I react to packets. If it doesn't > violate the min/max limits for ipv6 packets it emits onto the internet > I don't see this as something that can be seen as mandatory. And if you *truly* do want to violate internet standards you can indeed already achieve this behaviour by dropping incoming icmpv6 packet too big errors (and there's lots of reasons why that is a bad idea...). I'll repeat what I said previously: this is a userspace visible regression in behaviour, of none or very questionable benefit. It results in TCP over IPv6 simply not working to destinations to which your locked mtu is higher then the real path mtu. This is why 'locked mtu' on IPv4 turns of the Don't Fragment bit - to allow fragmentation at intermediate routers. There is no such thing in IPv6. There is no DF bit, and there is no router fragmentation - all ipv6 fragmentation is supposed to happen at the source host. This is why hosts must either use 1280 min guaranteed mtu or be responsive to pmtu errors. Otherwise things simply don't work.
> > It's local system policy, how do I react to packets. If it doesn't > > violate the min/max limits for ipv6 packets it emits onto the internet > > I don't see this as something that can be seen as mandatory. It does violate the max limit for ipv6 packets it emits onto the internet. You're not allowed to emit > 1280 mtu packets without also supporting pmtu. > > And if you *truly* do want to violate internet standards you can > indeed already achieve this behaviour by dropping incoming icmpv6 > packet too big errors (and there's lots of reasons why that is a bad > idea...). > > I'll repeat what I said previously: this is a userspace visible > regression in behaviour, of none or very questionable benefit. > > It results in TCP over IPv6 simply not working to destinations to > which your locked mtu is higher then the real path mtu. This is why > 'locked mtu' on IPv4 turns of the Don't Fragment bit - to allow > fragmentation at intermediate routers. There is no such thing in > IPv6. > There is no DF bit, and there is no router fragmentation - all ipv6 > fragmentation is supposed to happen at the source host. > This is why hosts must either use 1280 min guaranteed mtu or be > responsive to pmtu errors. Otherwise things simply don't work.
From: Maciej Żenczykowski <zenczykowski@gmail.com> Date: Tue, 5 May 2020 11:57:23 -0700 > From: Maciej Żenczykowski <maze@google.com> > > This reverts commit 19bda36c4299ce3d7e5bce10bebe01764a655a6d: > > | ipv6: add mtu lock check in __ip6_rt_update_pmtu > | > | Prior to this patch, ipv6 didn't do mtu lock check in ip6_update_pmtu. > | It leaded to that mtu lock doesn't really work when receiving the pkt > | of ICMPV6_PKT_TOOBIG. > | > | This patch is to add mtu lock check in __ip6_rt_update_pmtu just as ipv4 > | did in __ip_rt_update_pmtu. > > The above reasoning is incorrect. IPv6 *requires* icmp based pmtu to work. > There's already a comment to this effect elsewhere in the kernel: > > $ git grep -p -B1 -A3 'RTAX_MTU lock' > net/ipv6/route.c=4813= > > static int rt6_mtu_change_route(struct fib6_info *f6i, void *p_arg) > ... > /* In IPv6 pmtu discovery is not optional, > so that RTAX_MTU lock cannot disable it. > We still use this lock to block changes > caused by addrconf/ndisc. > */ > > This reverts to the pre-4.9 behaviour. > > Signed-off-by: Maciej Żenczykowski <maze@google.com> > Fixes: 19bda36c4299 ("ipv6: add mtu lock check in __ip6_rt_update_pmtu") I've thought about this some more and decided to apply this and queue it up for -stable, thank you.
> I've thought about this some more and decided to apply this and > queue it up for -stable, thank you. Thank you!
diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 8d418038fe32..ff847a324220 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -2722,8 +2722,10 @@ static void __ip6_rt_update_pmtu(struct dst_entry *dst, const struct sock *sk, const struct in6_addr *daddr, *saddr; struct rt6_info *rt6 = (struct rt6_info *)dst; - if (dst_metric_locked(dst, RTAX_MTU)) - return; + /* Note: do *NOT* check dst_metric_locked(dst, RTAX_MTU) + * IPv6 pmtu discovery isn't optional, so 'mtu lock' cannot disable it. + * [see also comment in rt6_mtu_change_route()] + */ if (iph) { daddr = &iph->daddr;