diff mbox series

net-ipv6: fix excessive RTF_ADDRCONF flag on ::1/128 local route (and others)

Message ID 20190901174759.257032-1-zenczykowski@gmail.com
State Superseded, archived
Delegated to: David Miller
Headers show
Series net-ipv6: fix excessive RTF_ADDRCONF flag on ::1/128 local route (and others) | expand

Commit Message

Maciej Żenczykowski Sept. 1, 2019, 5:47 p.m. UTC
From: Maciej Żenczykowski <maze@google.com>

There is a subtle change in behaviour introduced by:
  commit c7a1ce397adacaf5d4bb2eab0a738b5f80dc3e43
  'ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create'

Before that patch /proc/net/ipv6_route includes:
00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000003 00000000 80200001 lo

Afterwards /proc/net/ipv6_route includes:
00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000002 00000000 80240001 lo

ie. the above commit causes the ::/128 local (automatic) route to be flagged with RTF_ADDRCONF (0x040000).

AFAICT, this is incorrect since these routes are *not* coming from RA's.

As such, this patch restores the old behaviour.

Fixes: c7a1ce397adacaf5d4bb2eab0a738b5f80dc3e43
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
---
 net/ipv6/route.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Maciej Żenczykowski Sept. 1, 2019, 5:55 p.m. UTC | #1
Some background:

This was found due to bad interactions with one of the few remaining
Android common kernel networking patches.
(The one that makes it possible for RA's to create routes in interface
specific tables)

The cleanup portion of it scours all tables and deletes all relevant
ADDRCONF routes, which in 5.2-rc1+ now includes ::1/128 and thus
terribly breaks things (in the Android Kernel Networking tests).

However, it *is* a userspace visible change in behaviour (since it's
visible via the above /proc file),
so one could argue for the above patch (or something similar).

The Android patch *could* also probably be adjusted to handle this
case (and thus prevent the breakage).

It's not immediately clear to me what is the better approach as I'm
not immediately certain what RTF_ADDRCONF truly means.
However the in kernel header file comment does explicitly mention this
being used to flag routes derived from RA's, and very clearly ::1/128
is not RA generated, so I *think* the correct fix is to return to the
old way the kernel used to do things and not flag with ADDRCONF...

Opinions?
Lorenzo Colitti Sept. 2, 2019, 2:12 a.m. UTC | #2
On Mon, Sep 2, 2019 at 2:55 AM Maciej Żenczykowski
<zenczykowski@gmail.com> wrote:
> It's not immediately clear to me what is the better approach as I'm
> not immediately certain what RTF_ADDRCONF truly means.
> However the in kernel header file comment does explicitly mention this
> being used to flag routes derived from RA's, and very clearly ::1/128
> is not RA generated, so I *think* the correct fix is to return to the
> old way the kernel used to do things and not flag with ADDRCONF...

AIUI, "addrconf" has always meant stateless address autoconfiguration
as per RFC 4862, i.e., addresses autoconfigured when getting an RA, or
autoconfigured based on adding the link-local prefix. Looking at 5.1
(the most recent release before c7a1ce397ada which you're fixing here)
confirms this interpretation, because RTF_ADDRCONF is only used by:

- addrconf_prefix_rcv: receiving a PIO from an RA
- rt6_add_route_info: receiving an RIO from an RA
- rt6_add_dflt_router, rt6_get_dflt_router: receiving the default
router from an RA
- __rt6_purge_dflt_routers: removing all routes received from RAs,
when enabling forwarding (i.e., switching from being a host to being a
router)


So, if I'm reading c7a1ce397ada right, I would say it's incorrect.
That patch changes things so that RTF_ADDRCONF is set for pretty much
all routes created by adding IPv6 addresses. That includes not only
IPv6 addresses created by RAs, which has always been the case, but
also IPv6 addresses created manually from userspace, or the loopback
address, and even multicast and anycast addresses created by
IPV6_JOIN_GROUP and IPV6_JOIN_ANYCAST. That's userspace-visible
breakage and should be reverted.

Not sure if this patch is the right fix, though, because it breaks
things in the opposite direction: even routes created by an IPv6
address added by receiving an RA will no longer have RTF_ADDRCONF.
Perhaps add something like this as well?

 struct fib6_info *addrconf_f6i_alloc(struct net *net, struct inet6_dev *idev,
-                                     const struct in6_addr *addr, bool anycast,
-                                     const struct in6_addr *addr, u8 flags,
                                      gfp_t gfp_flags);

flags would be RTF_ANYCAST iff the code previously called with true,
and RTF_ADDRCONF if called by a function that is adding an IPv6
address coming from an RA.
David Ahern Sept. 2, 2019, 1:37 p.m. UTC | #3
On 9/1/19 11:47 AM, Maciej Żenczykowski wrote:
> From: Maciej Żenczykowski <maze@google.com>
> 
> There is a subtle change in behaviour introduced by:
>   commit c7a1ce397adacaf5d4bb2eab0a738b5f80dc3e43
>   'ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create'
> 
> Before that patch /proc/net/ipv6_route includes:
> 00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000003 00000000 80200001 lo
> 
> Afterwards /proc/net/ipv6_route includes:
> 00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000002 00000000 80240001 lo
> 
> ie. the above commit causes the ::/128 local (automatic) route to be flagged with RTF_ADDRCONF (0x040000).

The ADDRCONF flag is wrong.

> 
> AFAICT, this is incorrect since these routes are *not* coming from RA's.
> 
> As such, this patch restores the old behaviour.
> 
> Fixes: c7a1ce397adacaf5d4bb2eab0a738b5f80dc3e43
> Cc: David Ahern <dsahern@gmail.com>
> Signed-off-by: Maciej Żenczykowski <maze@google.com>
> ---
>  net/ipv6/route.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 558c6c68855f..cee977e52d34 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -4365,13 +4365,14 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net,
>  	struct fib6_config cfg = {
>  		.fc_table = l3mdev_fib_table(idev->dev) ? : RT6_TABLE_LOCAL,
>  		.fc_ifindex = idev->dev->ifindex,
> -		.fc_flags = RTF_UP | RTF_ADDRCONF | RTF_NONEXTHOP,
> +		.fc_flags = RTF_UP | RTF_NONEXTHOP,
>  		.fc_dst = *addr,
>  		.fc_dst_len = 128,
>  		.fc_protocol = RTPROT_KERNEL,
>  		.fc_nlinfo.nl_net = net,
>  		.fc_ignore_dev_down = true,
>  	};
> +	struct fib6_info *f6i;
>  
>  	if (anycast) {
>  		cfg.fc_type = RTN_ANYCAST;
> @@ -4381,7 +4382,9 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net,
>  		cfg.fc_flags |= RTF_LOCAL;
>  	}
>  
> -	return ip6_route_info_create(&cfg, gfp_flags, NULL);
> +	f6i = ip6_route_info_create(&cfg, gfp_flags, NULL);

ip6_route_info_create can return NULL.



> +	f6i->dst_nocount = true;
> +	return f6i;
>  }
>  
>  /* remove deleted ip from prefsrc entries */
>
David Ahern Sept. 3, 2019, 2:18 a.m. UTC | #4
On 9/1/19 8:12 PM, Lorenzo Colitti wrote:
> Not sure if this patch is the right fix, though, because it breaks
> things in the opposite direction: even routes created by an IPv6
> address added by receiving an RA will no longer have RTF_ADDRCONF.
> Perhaps add something like this as well?
> 
>  struct fib6_info *addrconf_f6i_alloc(struct net *net, struct inet6_dev *idev,
> -                                     const struct in6_addr *addr, bool anycast,
> -                                     const struct in6_addr *addr, u8 flags,
>                                       gfp_t gfp_flags);
> 
> flags would be RTF_ANYCAST iff the code previously called with true,
> and RTF_ADDRCONF if called by a function that is adding an IPv6
> address coming from an RA.

addrconf_f6i_alloc is used for addresses added by userspace
(ipv6_add_addr) and anycast. ie., from what I can see it is not used for RAs
Lorenzo Colitti Sept. 3, 2019, 4:58 a.m. UTC | #5
On Tue, Sep 3, 2019 at 11:18 AM David Ahern <dsahern@gmail.com> wrote:
> addrconf_f6i_alloc is used for addresses added by userspace
> (ipv6_add_addr) and anycast. ie., from what I can see it is not used for RAs

Isn't ipv6_add_addr called by addrconf_prefix_rcv_add_addr, which is
called by addrconf_prefix_rcv, which is called by
ndisc_router_discovery? That is what happens when we process an RA;
AFAICS manual configuration is inet6_addr_add, not ipv6_add_addr.

Maciej, with this patch, do SLAAC addresses still have RTF_ADDRCONF?
Per my previous message, my assumption would be no, but I might be
misreading the code.
Maciej Żenczykowski Sept. 3, 2019, 12:17 p.m. UTC | #6
Well, if you look at the commit my commit is fixing, ie.
  commit c7a1ce397adacaf5d4bb2eab0a738b5f80dc3e43
then you'll see this in the commit description:
  "- dst_nocount is handled by the RTF_ADDRCONF flag"
and the patch diff itself is from
  "f6i->fib6_flags = RTF_UP | RTF_NONEXTHOP;
   f6i->dst_nocount = true;"
to
  " .fc_flags = RTF_UP | RTF_ADDRCONF | RTF_NONEXTHOP,"

(and RTF_ANYCAST or RTF_LOCAL is later or'ed in in both versions of the code)

so I'm pretty sure that patch adds ADDRCONF unconditionally to that
function, and my commit unconditionally removes it.

Perhaps since then the call graph has changed???

Unfortunately I'm already in Europe on ancient ipv6 free networks and
since the office move my ipv6 lab is still not up (something about the
newly wired office jacks being blue which means corp, instead of any
other colour for a lab...) so I don't actually have an easy way to
test ipv6 slaac behaviour. :-(

On Tue, Sep 3, 2019 at 6:58 AM Lorenzo Colitti <lorenzo@google.com> wrote:
>
> On Tue, Sep 3, 2019 at 11:18 AM David Ahern <dsahern@gmail.com> wrote:
> > addrconf_f6i_alloc is used for addresses added by userspace
> > (ipv6_add_addr) and anycast. ie., from what I can see it is not used for RAs
>
> Isn't ipv6_add_addr called by addrconf_prefix_rcv_add_addr, which is
> called by addrconf_prefix_rcv, which is called by
> ndisc_router_discovery? That is what happens when we process an RA;
> AFAICS manual configuration is inet6_addr_add, not ipv6_add_addr.
>
> Maciej, with this patch, do SLAAC addresses still have RTF_ADDRCONF?
> Per my previous message, my assumption would be no, but I might be
> misreading the code.
David Ahern Sept. 3, 2019, 7:45 p.m. UTC | #7
On 9/3/19 6:17 AM, Maciej Żenczykowski wrote:
> Well, if you look at the commit my commit is fixing, ie.
>   commit c7a1ce397adacaf5d4bb2eab0a738b5f80dc3e43
> then you'll see this in the commit description:
>   "- dst_nocount is handled by the RTF_ADDRCONF flag"
> and the patch diff itself is from
>   "f6i->fib6_flags = RTF_UP | RTF_NONEXTHOP;
>    f6i->dst_nocount = true;"
> to
>   " .fc_flags = RTF_UP | RTF_ADDRCONF | RTF_NONEXTHOP,"
> 
> (and RTF_ANYCAST or RTF_LOCAL is later or'ed in in both versions of the code)
> 
> so I'm pretty sure that patch adds ADDRCONF unconditionally to that
> function, and my commit unconditionally removes it.
> 

exactly. It was shortsighted of me to add the ADDRCONF flag and removing
it reverts back to the previous behavior.

When I enable radvd, I do see the flag set when it should be and not for
other addresses. I believe the patch is correct.
Lorenzo Colitti Sept. 4, 2019, 3:17 a.m. UTC | #8
On Wed, Sep 4, 2019 at 4:45 AM David Ahern <dsahern@gmail.com> wrote:
>
> exactly. It was shortsighted of me to add the ADDRCONF flag and removing
> it reverts back to the previous behavior.
>
> When I enable radvd, I do see the flag set when it should be and not for
> other addresses. I believe the patch is correct.

Ah, wait, I was confused. Sorry.

What I was saying is that RTF_ADDRCONF flag should be set on the local
table /128 route for the IPv6 addresses configured by RAs. The way
those are configured is ndisc_router_discovery -> addrconf_prefix_rcv
-> addrconf_prefix_rcv_add_addr -> ipv6_add_addr ->
addrconf_f6i_alloc. Because in this patch, addrconf_f6i_alloc
unconditionally clears RTF_ADDRCONF, I didn't see how the flag could
be set. But I now realize that that was not happening before David's
commit, either: in 5.1 those routes were not created with RTF_ADDRCONF
either.

In other words: before 5.1, the /128 routes in the local table to IPv6
addresses created by SLAAC did not have RTF_ADDRCONF. After David's
commit, they did, because *all* /128 routes to IPv6 addresses had
RTF_ADDRCONF (correct for SLAAC adresses, but definitely incorrect for
manual addresses, loopback, etc.). If this commit is applied, we'll go
back to the 5.1 state. In the future it would be good to ensure that
at least the /128 routes created by SLAAC do have RTF_ADDRCONF, but no
need to do so in this commit.
diff mbox series

Patch

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 558c6c68855f..cee977e52d34 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -4365,13 +4365,14 @@  struct fib6_info *addrconf_f6i_alloc(struct net *net,
 	struct fib6_config cfg = {
 		.fc_table = l3mdev_fib_table(idev->dev) ? : RT6_TABLE_LOCAL,
 		.fc_ifindex = idev->dev->ifindex,
-		.fc_flags = RTF_UP | RTF_ADDRCONF | RTF_NONEXTHOP,
+		.fc_flags = RTF_UP | RTF_NONEXTHOP,
 		.fc_dst = *addr,
 		.fc_dst_len = 128,
 		.fc_protocol = RTPROT_KERNEL,
 		.fc_nlinfo.nl_net = net,
 		.fc_ignore_dev_down = true,
 	};
+	struct fib6_info *f6i;
 
 	if (anycast) {
 		cfg.fc_type = RTN_ANYCAST;
@@ -4381,7 +4382,9 @@  struct fib6_info *addrconf_f6i_alloc(struct net *net,
 		cfg.fc_flags |= RTF_LOCAL;
 	}
 
-	return ip6_route_info_create(&cfg, gfp_flags, NULL);
+	f6i = ip6_route_info_create(&cfg, gfp_flags, NULL);
+	f6i->dst_nocount = true;
+	return f6i;
 }
 
 /* remove deleted ip from prefsrc entries */