Message ID | 20200923045143.3755128-1-zenczykowski@gmail.com |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Headers | show |
Series | [v2] net/ipv4: always honour route mtu during forwarding | expand |
On 9/23/20 6:51 AM, Maciej Żenczykowski wrote: > From: Maciej Żenczykowski <maze@google.com> > > Documentation/networking/ip-sysctl.txt:46 says: > ip_forward_use_pmtu - BOOLEAN > By default we don't trust protocol path MTUs while forwarding > because they could be easily forged and can lead to unwanted > fragmentation by the router. > You only need to enable this if you have user-space software > which tries to discover path mtus by itself and depends on the > kernel honoring this information. This is normally not the case. > Default: 0 (disabled) > Possible values: > 0 - disabled > 1 - enabled > > Which makes it pretty clear that setting it to 1 is a potential > security/safety/DoS issue, and yet it is entirely reasonable to want > forwarded traffic to honour explicitly administrator configured > route mtus (instead of defaulting to device mtu). > > Indeed, I can't think of a single reason why you wouldn't want to. > Since you configured a route mtu you probably know better... > > It is pretty common to have a higher device mtu to allow receiving > large (jumbo) frames, while having some routes via that interface > (potentially including the default route to the internet) specify > a lower mtu. > > Note that ipv6 forwarding uses device mtu unless the route is locked > (in which case it will use the route mtu). > > This approach is not usable for IPv4 where an 'mtu lock' on a route > also has the side effect of disabling TCP path mtu discovery via > disabling the IPv4 DF (don't frag) bit on all outgoing frames. > > I'm not aware of a way to lock a route from an IPv6 RA, so that also > potentially seems wrong. > > Signed-off-by: Maciej Żenczykowski <maze@google.com> > Cc: Eric Dumazet <maze@google.com> Note that my email address is more like : <edumazet@google.com> > Cc: Willem de Bruijn <willemb@google.com> > Cc: Lorenzo Colitti <lorenzo@google.com> > Cc: Sunmeet Gill (Sunny) <sgill@quicinc.com> > Cc: Vinay Paradkar <vparadka@qti.qualcomm.com> > Cc: Tyler Wear <twear@quicinc.com> > Cc: David Ahern <dsahern@kernel.org> > --- > include/net/ip.h | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/include/net/ip.h b/include/net/ip.h > index b09c48d862cc..c2188bebbc54 100644 > --- a/include/net/ip.h > +++ b/include/net/ip.h > @@ -436,12 +436,17 @@ static inline unsigned int ip_dst_mtu_maybe_forward(const struct dst_entry *dst, > bool forwarding) > { > struct net *net = dev_net(dst->dev); > + unsigned int mtu; > > if (net->ipv4.sysctl_ip_fwd_use_pmtu || > ip_mtu_locked(dst) || > !forwarding) > return dst_mtu(dst); > > + /* 'forwarding = true' case should always honour route mtu */ > + mtu = dst_metric_raw(dst, RTAX_MTU); > + if (mtu) return mtu; if (mtu) return mtu; Apparently route mtu are capped to 65520, not sure where it is done exactly IP_MAX_MTU being 65535) # ip ro add 1.1.1.4 dev wlp2s0 mtu 100000 # ip ro get 1.1.1.4 1.1.1.4 dev wlp2s0 src 192.168.8.147 uid 0 cache mtu 65520 > + > return min(READ_ONCE(dst->dev->mtu), IP_MAX_MTU); > } > >
On 9/23/20 2:46 AM, Eric Dumazet wrote: >> diff --git a/include/net/ip.h b/include/net/ip.h >> index b09c48d862cc..c2188bebbc54 100644 >> --- a/include/net/ip.h >> +++ b/include/net/ip.h >> @@ -436,12 +436,17 @@ static inline unsigned int ip_dst_mtu_maybe_forward(const struct dst_entry *dst, >> bool forwarding) >> { >> struct net *net = dev_net(dst->dev); >> + unsigned int mtu; >> >> if (net->ipv4.sysctl_ip_fwd_use_pmtu || >> ip_mtu_locked(dst) || >> !forwarding) >> return dst_mtu(dst); >> >> + /* 'forwarding = true' case should always honour route mtu */ >> + mtu = dst_metric_raw(dst, RTAX_MTU); >> + if (mtu) return mtu; > > > if (mtu) > return mtu; > > Apparently route mtu are capped to 65520, not sure where it is done exactly IP_MAX_MTU being 65535) ip_metrics_convert seems to be the place" if (type == RTAX_MTU && val > 65535 - 15) val = 65535 - 15; going back through the code moves, and it was added by Dave with 710ab6c03122c > > # ip ro add 1.1.1.4 dev wlp2s0 mtu 100000 > # ip ro get 1.1.1.4 > 1.1.1.4 dev wlp2s0 src 192.168.8.147 uid 0 > cache mtu 65520 > > > > >> + >> return min(READ_ONCE(dst->dev->mtu), IP_MAX_MTU); >> } >> >>
From: David Ahern <dsahern@gmail.com> Date: Wed, 23 Sep 2020 08:36:50 -0600 > On 9/23/20 2:46 AM, Eric Dumazet wrote: >> Apparently route mtu are capped to 65520, not sure where it is done exactly IP_MAX_MTU being 65535) > > ip_metrics_convert seems to be the place" > if (type == RTAX_MTU && val > 65535 - 15) > val = 65535 - 15; > > going back through the code moves, and it was added by Dave with > 710ab6c03122c At the time of that commit IP_MAX_MTU only existed in net/ipv4/route.c and it's value was 0xFFF0. So I didn't add anything :-) Just moving stuff around.
diff --git a/include/net/ip.h b/include/net/ip.h index b09c48d862cc..c2188bebbc54 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -436,12 +436,17 @@ static inline unsigned int ip_dst_mtu_maybe_forward(const struct dst_entry *dst, bool forwarding) { struct net *net = dev_net(dst->dev); + unsigned int mtu; if (net->ipv4.sysctl_ip_fwd_use_pmtu || ip_mtu_locked(dst) || !forwarding) return dst_mtu(dst); + /* 'forwarding = true' case should always honour route mtu */ + mtu = dst_metric_raw(dst, RTAX_MTU); + if (mtu) return mtu; + return min(READ_ONCE(dst->dev->mtu), IP_MAX_MTU); }