Message ID | 20200720221118.26148-1-mathieu.desnoyers@efficios.com |
---|---|
State | RFC |
Delegated to: | David Miller |
Headers | show |
Series | [RFC] Fix: ipv4/icmp: icmp error route lookup performed on wrong routing table | expand |
Adding IPv4/IPv6 maintainers in CC, along with David Ahern's k.org email address. ----- On Jul 20, 2020, at 6:11 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote: > As per RFC792, ICMP errors should be sent to the source host. > > However, in configurations with Virtual Forwarding and Routing tables, > looking up which routing table to use is currently done by using the > destination net_device. > > commit 9d1a6c4ea43e ("net: icmp_route_lookup should use rt dev to > determine L3 domain") changes the interfaces passed to > l3mdev_master_ifindex() and inet_addr_type_dev_table() from skb_in->dev > to skb_dst(skb_in)->dev in order to fix a NULL pointer dereference. This > changes the interface used for routing table lookup from source to > destination. Therefore, if the source and destination interfaces are > within separate VFR, or one in the global routing table and the other in > a VFR, looking up the source host in the destination interface's routing > table is likely to fail. > > One observable effect of this issue is that traceroute does not work in > the following cases: > > - Route leaking between global routing table and VRF > - Route leaking between VRFs > > [ Note 1: I'm not entirely sure what routing table should be used when > param->replyopts.opt.opt.srr is set ? Is it valid to honor Strict > Source Route when sending an ICMP error ? ] > > [ Note 2: I notice that ipv6 icmp6_send() uses skb_dst(skb)->dev as > argument to l3mdev_master_ifindex(). I'm not sure if it is correct ? ] > > [ This patch is only compile-tested. ] > > Fixes: 9d1a6c4ea43e ("net: icmp_route_lookup should use rt dev to determine L3 > domain") > Link: https://tools.ietf.org/html/rfc792 > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > Cc: David Ahern <dsa@cumulusnetworks.com> > Cc: David S. Miller <davem@davemloft.net> > Cc: netdev@vger.kernel.org > --- > net/ipv4/icmp.c | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c > index e30515f89802..3d1da70c7293 100644 > --- a/net/ipv4/icmp.c > +++ b/net/ipv4/icmp.c > @@ -465,6 +465,7 @@ static struct rtable *icmp_route_lookup(struct net *net, > int type, int code, > struct icmp_bxm *param) > { > + struct net_device *route_lookup_dev; > struct rtable *rt, *rt2; > struct flowi4 fl4_dec; > int err; > @@ -479,7 +480,14 @@ static struct rtable *icmp_route_lookup(struct net *net, > fl4->flowi4_proto = IPPROTO_ICMP; > fl4->fl4_icmp_type = type; > fl4->fl4_icmp_code = code; > - fl4->flowi4_oif = l3mdev_master_ifindex(skb_dst(skb_in)->dev); > + /* > + * The device used for looking up which routing table to use is > + * preferably the source whenever it is set, which should ensure > + * the icmp error can be sent to the source host, else fallback > + * on the destination device. > + */ > + route_lookup_dev = skb_in->dev ? skb_in->dev : skb_dst(skb_in)->dev; > + fl4->flowi4_oif = l3mdev_master_ifindex(route_lookup_dev); > > security_skb_classify_flow(skb_in, flowi4_to_flowi(fl4)); > rt = ip_route_output_key_hash(net, fl4, skb_in); > @@ -503,7 +511,7 @@ static struct rtable *icmp_route_lookup(struct net *net, > if (err) > goto relookup_failed; > > - if (inet_addr_type_dev_table(net, skb_dst(skb_in)->dev, > + if (inet_addr_type_dev_table(net, route_lookup_dev, > fl4_dec.saddr) == RTN_LOCAL) { > rt2 = __ip_route_output_key(net, &fl4_dec); > if (IS_ERR(rt2)) > -- > 2.11.0
On 7/22/20 11:26 AM, Mathieu Desnoyers wrote: > Adding IPv4/IPv6 maintainers in CC, along with David Ahern's k.org email address. > > ----- On Jul 20, 2020, at 6:11 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote: > >> As per RFC792, ICMP errors should be sent to the source host. >> >> However, in configurations with Virtual Forwarding and Routing tables, FYI: you have it backwards; it is VRF as in Virtual Routing and Forwarding, not VFR. The commit message should be update accordingly. >> looking up which routing table to use is currently done by using the >> destination net_device. >> >> commit 9d1a6c4ea43e ("net: icmp_route_lookup should use rt dev to >> determine L3 domain") changes the interfaces passed to >> l3mdev_master_ifindex() and inet_addr_type_dev_table() from skb_in->dev >> to skb_dst(skb_in)->dev in order to fix a NULL pointer dereference. This >> changes the interface used for routing table lookup from source to >> destination. Therefore, if the source and destination interfaces are >> within separate VFR, or one in the global routing table and the other in >> a VFR, looking up the source host in the destination interface's routing >> table is likely to fail. >> >> One observable effect of this issue is that traceroute does not work in >> the following cases: >> >> - Route leaking between global routing table and VRF >> - Route leaking between VRFs >> >> [ Note 1: I'm not entirely sure what routing table should be used when >> param->replyopts.opt.opt.srr is set ? Is it valid to honor Strict >> Source Route when sending an ICMP error ? ] >> >> [ Note 2: I notice that ipv6 icmp6_send() uses skb_dst(skb)->dev as >> argument to l3mdev_master_ifindex(). I'm not sure if it is correct ? ] >> >> [ This patch is only compile-tested. ] please devise a test using namespaces which demonstrates the problem and proves the change fixes it. The test can be added to tools/testing/selftests/net/fcnal-test.sh, use_cases(). VRF route leaking is only useful and relevant for the forwarding case, not locally generated packets, so the test case should be based on forwarding packets across VRFs. >> >> Fixes: 9d1a6c4ea43e ("net: icmp_route_lookup should use rt dev to determine L3 >> domain") >> Link: https://tools.ietf.org/html/rfc792 >> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >> Cc: David Ahern <dsa@cumulusnetworks.com> >> Cc: David S. Miller <davem@davemloft.net> >> Cc: netdev@vger.kernel.org >> --- >> net/ipv4/icmp.c | 12 ++++++++++-- >> 1 file changed, 10 insertions(+), 2 deletions(-) >> >> diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c >> index e30515f89802..3d1da70c7293 100644 >> --- a/net/ipv4/icmp.c >> +++ b/net/ipv4/icmp.c >> @@ -465,6 +465,7 @@ static struct rtable *icmp_route_lookup(struct net *net, >> int type, int code, >> struct icmp_bxm *param) >> { >> + struct net_device *route_lookup_dev; >> struct rtable *rt, *rt2; >> struct flowi4 fl4_dec; >> int err; >> @@ -479,7 +480,14 @@ static struct rtable *icmp_route_lookup(struct net *net, >> fl4->flowi4_proto = IPPROTO_ICMP; >> fl4->fl4_icmp_type = type; >> fl4->fl4_icmp_code = code; >> - fl4->flowi4_oif = l3mdev_master_ifindex(skb_dst(skb_in)->dev); >> + /* >> + * The device used for looking up which routing table to use is >> + * preferably the source whenever it is set, which should ensure >> + * the icmp error can be sent to the source host, else fallback >> + * on the destination device. >> + */ >> + route_lookup_dev = skb_in->dev ? skb_in->dev : skb_dst(skb_in)->dev; >> + fl4->flowi4_oif = l3mdev_master_ifindex(route_lookup_dev); >> >> security_skb_classify_flow(skb_in, flowi4_to_flowi(fl4)); >> rt = ip_route_output_key_hash(net, fl4, skb_in); >> @@ -503,7 +511,7 @@ static struct rtable *icmp_route_lookup(struct net *net, >> if (err) >> goto relookup_failed; >> >> - if (inet_addr_type_dev_table(net, skb_dst(skb_in)->dev, >> + if (inet_addr_type_dev_table(net, route_lookup_dev, >> fl4_dec.saddr) == RTN_LOCAL) { >> rt2 = __ip_route_output_key(net, &fl4_dec); >> if (IS_ERR(rt2)) I *think* this is ok, but a test case and running all of the IPv4 existing tests in fcnal-test.sh (-4 arg) would help.
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index e30515f89802..3d1da70c7293 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -465,6 +465,7 @@ static struct rtable *icmp_route_lookup(struct net *net, int type, int code, struct icmp_bxm *param) { + struct net_device *route_lookup_dev; struct rtable *rt, *rt2; struct flowi4 fl4_dec; int err; @@ -479,7 +480,14 @@ static struct rtable *icmp_route_lookup(struct net *net, fl4->flowi4_proto = IPPROTO_ICMP; fl4->fl4_icmp_type = type; fl4->fl4_icmp_code = code; - fl4->flowi4_oif = l3mdev_master_ifindex(skb_dst(skb_in)->dev); + /* + * The device used for looking up which routing table to use is + * preferably the source whenever it is set, which should ensure + * the icmp error can be sent to the source host, else fallback + * on the destination device. + */ + route_lookup_dev = skb_in->dev ? skb_in->dev : skb_dst(skb_in)->dev; + fl4->flowi4_oif = l3mdev_master_ifindex(route_lookup_dev); security_skb_classify_flow(skb_in, flowi4_to_flowi(fl4)); rt = ip_route_output_key_hash(net, fl4, skb_in); @@ -503,7 +511,7 @@ static struct rtable *icmp_route_lookup(struct net *net, if (err) goto relookup_failed; - if (inet_addr_type_dev_table(net, skb_dst(skb_in)->dev, + if (inet_addr_type_dev_table(net, route_lookup_dev, fl4_dec.saddr) == RTN_LOCAL) { rt2 = __ip_route_output_key(net, &fl4_dec); if (IS_ERR(rt2))
As per RFC792, ICMP errors should be sent to the source host. However, in configurations with Virtual Forwarding and Routing tables, looking up which routing table to use is currently done by using the destination net_device. commit 9d1a6c4ea43e ("net: icmp_route_lookup should use rt dev to determine L3 domain") changes the interfaces passed to l3mdev_master_ifindex() and inet_addr_type_dev_table() from skb_in->dev to skb_dst(skb_in)->dev in order to fix a NULL pointer dereference. This changes the interface used for routing table lookup from source to destination. Therefore, if the source and destination interfaces are within separate VFR, or one in the global routing table and the other in a VFR, looking up the source host in the destination interface's routing table is likely to fail. One observable effect of this issue is that traceroute does not work in the following cases: - Route leaking between global routing table and VRF - Route leaking between VRFs [ Note 1: I'm not entirely sure what routing table should be used when param->replyopts.opt.opt.srr is set ? Is it valid to honor Strict Source Route when sending an ICMP error ? ] [ Note 2: I notice that ipv6 icmp6_send() uses skb_dst(skb)->dev as argument to l3mdev_master_ifindex(). I'm not sure if it is correct ? ] [ This patch is only compile-tested. ] Fixes: 9d1a6c4ea43e ("net: icmp_route_lookup should use rt dev to determine L3 domain") Link: https://tools.ietf.org/html/rfc792 Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: David Ahern <dsa@cumulusnetworks.com> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org --- net/ipv4/icmp.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)