Message ID | 86a082ace1356cebc4430ea38256069e6e2966c3.1596487323.git.sbrivio@redhat.com |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Series | Support PMTU discovery with bridged UDP tunnels |
On 8/3/20 2:52 PM, Stefano Brivio wrote:
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index a01efa062f6b..c14fd8124f57 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -1050,6 +1050,7 @@ static void ip_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
>  	struct flowi4 fl4;
>
>  	ip_rt_build_flow_key(&fl4, sk, skb);
> +	fl4.flowi4_oif = 0; /* Don't make lookup fail for encapsulations */
>  	__ip_rt_update_pmtu(rt, &fl4, mtu);
>  }
>

Can this be limited to:
	if (skb &&
	    netif_is_bridge_port(skb->dev) || netif_is_ovs_port(skb->dev))
		fl4.flowi4_oif = 0;

I'm not sure we want to reset oif for all MTU updates.

On Mon, 3 Aug 2020 17:30:46 -0600
David Ahern <dsahern@gmail.com> wrote:

> On 8/3/20 2:52 PM, Stefano Brivio wrote:
> > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > index a01efa062f6b..c14fd8124f57 100644
> > --- a/net/ipv4/route.c
> > +++ b/net/ipv4/route.c
> > @@ -1050,6 +1050,7 @@ static void ip_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
> >  	struct flowi4 fl4;
> >
> >  	ip_rt_build_flow_key(&fl4, sk, skb);
> > +	fl4.flowi4_oif = 0; /* Don't make lookup fail for encapsulations */
> >  	__ip_rt_update_pmtu(rt, &fl4, mtu);
> >  }
> >
>
> Can this be limited to:
> 	if (skb &&
> 	    netif_is_bridge_port(skb->dev) || netif_is_ovs_port(skb->dev))
> 		fl4.flowi4_oif = 0;
>
> I'm not sure we want to reset oif for all MTU updates.

I think that generally speaking we might, because this is about the
*path* MTU after all, so the output interface doesn't look very
relevant.

On the other hand, I couldn't find any other case where this makes a
difference, and I guess it's better to eventually find out about those
other cases if any, rather than fixing things by accident possibly in
the wrong way.

Changed in v2, thanks.

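A note on the condition quoted above: in C, `&&` binds tighter than `||`, so as written the check would still evaluate netif_is_ovs_port(skb->dev) when skb is NULL. Presumably the intent is to guard both device tests with the skb check. The following is a minimal userspace sketch of that grouping, not the v2 code: the struct definitions, the two helper bodies, and the function name maybe_clear_oif() are hypothetical stand-ins for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
struct net_device {
	bool bridge_port;	/* device is enslaved to a Linux bridge */
	bool ovs_port;		/* device is an Open vSwitch port */
};

struct sk_buff {
	struct net_device *dev;
};

struct flowi4 {
	int flowi4_oif;
};

/* Simplified models of the kernel helpers of the same name. */
static bool netif_is_bridge_port(const struct net_device *dev)
{
	return dev->bridge_port;
}

static bool netif_is_ovs_port(const struct net_device *dev)
{
	return dev->ovs_port;
}

/*
 * Clear the output interface only for packets received on a bridge or
 * OVS port. The explicit parentheses ensure skb->dev is never
 * dereferenced when skb is NULL.
 */
static void maybe_clear_oif(struct flowi4 *fl4, const struct sk_buff *skb)
{
	if (skb &&
	    (netif_is_bridge_port(skb->dev) || netif_is_ovs_port(skb->dev)))
		fl4->flowi4_oif = 0;
}
```

With this grouping, a NULL skb or a device that is neither kind of port leaves flowi4_oif untouched.
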
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index a01efa062f6b..c14fd8124f57 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1050,6 +1050,7 @@ static void ip_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
 	struct flowi4 fl4;
 
 	ip_rt_build_flow_key(&fl4, sk, skb);
+	fl4.flowi4_oif = 0; /* Don't make lookup fail for encapsulations */
 	__ip_rt_update_pmtu(rt, &fl4, mtu);
 }
 

Currently, processes sending traffic to a local bridge with an
encapsulation device as a port don't get ICMP errors if they exceed
the PMTU of the encapsulated link.

David Ahern suggested this as a hack, but it actually looks like
the correct solution: when we update the PMTU for a given destination
by means of updating or creating a route exception, the encapsulation
might trigger this because of PMTU discovery happening either on the
encapsulation device itself, or its lower layer. The output interface
shouldn't matter, because we already have a valid destination. Drop
the output interface restriction from the associated route lookup.

For UDP tunnels, we will now have a route exception created for the
encapsulation itself, with a MTU value reflecting its headroom, which
allows a bridge forwarding IP packets originated locally to deliver
errors back to the sending socket.

The behaviour is now consistent with IPv6 and verified with selftests
pmtu_ipv{4,6}_br_{geneve,vxlan}{4,6}_exception introduced later in
this series.

Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
---
 net/ipv4/route.c | 1 +
 1 file changed, 1 insertion(+)
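
To illustrate why the output interface restriction can make the lookup fail, here is a toy userspace model, offered only as a sketch. It assumes a lookup where a nonzero oif restricts matches to routes through that device, while zero means "any device". The toy_route structure and toy_lookup() are hypothetical, use first-match rather than longest-prefix semantics, and are far simpler than the kernel's FIB.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy route entry; hypothetical, not a kernel structure. */
struct toy_route {
	uint32_t prefix;	/* network address, host byte order */
	uint32_t mask;		/* network mask, host byte order */
	int oif;		/* ifindex of the device the route points to */
};

/*
 * Return the first route whose prefix covers daddr.
 * A nonzero oif additionally requires the route to use that device;
 * oif == 0 places no restriction on the device.
 */
static const struct toy_route *toy_lookup(const struct toy_route *tbl,
					  size_t n, uint32_t daddr, int oif)
{
	for (size_t i = 0; i < n; i++) {
		if ((daddr & tbl[i].mask) != tbl[i].prefix)
			continue;
		if (oif && tbl[i].oif != oif)
			continue;
		return &tbl[i];
	}
	return NULL;
}
```

In this model, if the only route to the destination goes through device 2 but the flow key carries the ifindex of a different device (say, the encapsulation device the packet arrived on), the lookup returns NULL and nothing is found to attach an exception to. Calling it with oif set to 0 finds the route regardless of device, which is the effect the one-line change aims for.
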