diff mbox series

[net] net: route dump netlink NLM_F_MULTI flag missing

Message ID 156660549861.5753.7912871726096518275.stgit@john-XPS-13-9370
State Accepted
Delegated to: David Miller
Headers show
Series [net] net: route dump netlink NLM_F_MULTI flag missing | expand

Commit Message

John Fastabend Aug. 24, 2019, 12:11 a.m. UTC
An excerpt from netlink(7) man page,

  In multipart messages (multiple nlmsghdr headers with associated payload
  in one byte stream) the first and all following headers have the
  NLM_F_MULTI flag set, except for the last  header  which  has the type
  NLMSG_DONE.

but, after (ee28906) there is a missing NLM_F_MULTI flag in the middle of a
FIB dump. The result is user space applications following above man page
excerpt may get confused and may stop parsing msg believing something went
wrong.

In the golang netlink lib [0] the library logic stops parsing believing the
message is not a multipart message. Found this running Cilium[1] against
net-next while adding a feature to auto-detect routes. I noticed with
multiple route tables we no longer could detect the default routes on net
tree kernels because the library logic was not returning them.

Fix this by handling the fib_dump_info_fnhe() case the same way the
fib_dump_info() handles it by passing the flags argument through the
call chain and adding a flags argument to rt_fill_info().

Tested with Cilium stack and auto-detection of routes works again. Also
annotated libs to dump netlink msgs and inspected NLM_F_MULTI and
NLMSG_DONE flags look correct after this.

Note: In inet_rtm_getroute() pass rt_fill_info() '0' for flags the same
as is done for fib_dump_info() so this looks correct to me.

[0] https://github.com/vishvananda/netlink/
[1] https://github.com/cilium/

Fixes: ee28906fd7a14 ("ipv4: Dump route exceptions if requested")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 include/net/route.h |    2 +-
 net/ipv4/fib_trie.c |    2 +-
 net/ipv4/route.c    |   17 ++++++++++-------
 3 files changed, 12 insertions(+), 9 deletions(-)

Comments

Stefano Brivio Aug. 24, 2019, 1:06 a.m. UTC | #1
On Fri, 23 Aug 2019 17:11:38 -0700
John Fastabend <john.fastabend@gmail.com> wrote:

> An excerpt from netlink(7) man page,
> 
>   In multipart messages (multiple nlmsghdr headers with associated payload
>   in one byte stream) the first and all following headers have the
>   NLM_F_MULTI flag set, except for the last  header  which  has the type
>   NLMSG_DONE.
> 
> but, after (ee28906) there is a missing NLM_F_MULTI flag in the middle of a
> FIB dump.

In your case (see below), it can be zero or more, depending on how many
exception routes you have.

> The result is user space applications following above man page
> excerpt may get confused and may stop parsing msg believing something went
> wrong.

Worse yet, also RFC 3459 says:

	[...] For multipart
	messages, the first and all following headers have the NLM_F_MULTI
	Netlink header flag set, except for the last header which has the
	Netlink header type NLMSG_DONE.

But iproute2 doesn't check for this, so the selftests I added didn't
notice. Thanks for fixing this!

> In the golang netlink lib [0] the library logic stops parsing believing the
> message is not a multipart message. Found this running Cilium[1] against
> net-next while adding a feature to auto-detect routes. I noticed with
> multiple route tables we no longer could detect the default routes on net
> tree kernels because the library logic was not returning them.

However, note that if strict netlink checking is requested (I think the
library should be updated), and RTM_F_CLONED is not set (which should
be the case if you are just looking for "regular" routes), you won't
hit this.

> Fix this by handling the fib_dump_info_fnhe() case the same way the
> fib_dump_info() handles it by passing the flags argument through the
> call chain and adding a flags argument to rt_fill_info().
> 
> Tested with Cilium stack and auto-detection of routes works again. Also
> annotated libs to dump netlink msgs and inspected NLM_F_MULTI and
> NLMSG_DONE flags look correct after this.
> 
> Note: In inet_rtm_getroute() pass rt_fill_info() '0' for flags the same
> as is done for fib_dump_info() so this looks correct to me.

Yes, that's correct, because if the buffer is too small for a single
route dumped by a single rt_fill_info() call, we'll just fail, so that
will never be a multipart message.

> [0] https://github.com/vishvananda/netlink/
> [1] https://github.com/cilium/
> 
> Fixes: ee28906fd7a14 ("ipv4: Dump route exceptions if requested")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>

Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
David Miller Aug. 24, 2019, 11:50 p.m. UTC | #2
From: John Fastabend <john.fastabend@gmail.com>
Date: Fri, 23 Aug 2019 17:11:38 -0700

> An excerpt from netlink(7) man page,
> 
>   In multipart messages (multiple nlmsghdr headers with associated payload
>   in one byte stream) the first and all following headers have the
>   NLM_F_MULTI flag set, except for the last  header  which  has the type
>   NLMSG_DONE.
> 
> but, after (ee28906) there is a missing NLM_F_MULTI flag in the middle of a
> FIB dump. The result is user space applications following above man page
> excerpt may get confused and may stop parsing msg believing something went
> wrong.
> 
> In the golang netlink lib [0] the library logic stops parsing believing the
> message is not a multipart message. Found this running Cilium[1] against
> net-next while adding a feature to auto-detect routes. I noticed with
> multiple route tables we no longer could detect the default routes on net
> tree kernels because the library logic was not returning them.
> 
> Fix this by handling the fib_dump_info_fnhe() case the same way the
> fib_dump_info() handles it by passing the flags argument through the
> call chain and adding a flags argument to rt_fill_info().
> 
> Tested with Cilium stack and auto-detection of routes works again. Also
> annotated libs to dump netlink msgs and inspected NLM_F_MULTI and
> NLMSG_DONE flags look correct after this.
> 
> Note: In inet_rtm_getroute() pass rt_fill_info() '0' for flags the same
> as is done for fib_dump_info() so this looks correct to me.
> 
> [0] https://github.com/vishvananda/netlink/
> [1] https://github.com/cilium/
> 
> Fixes: ee28906fd7a14 ("ipv4: Dump route exceptions if requested")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>

Applied, thanks John.
diff mbox series

Patch

diff --git a/include/net/route.h b/include/net/route.h
index 630a0493f1f3..dfce19c9fa96 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -233,7 +233,7 @@  void rt_del_uncached_list(struct rtable *rt);
 
 int fib_dump_info_fnhe(struct sk_buff *skb, struct netlink_callback *cb,
 		       u32 table_id, struct fib_info *fi,
-		       int *fa_index, int fa_start);
+		       int *fa_index, int fa_start, unsigned int flags);
 
 static inline void ip_rt_put(struct rtable *rt)
 {
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 2b2b3d291ab0..1ab2fb6bb37d 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -2145,7 +2145,7 @@  static int fn_trie_dump_leaf(struct key_vector *l, struct fib_table *tb,
 
 		if (filter->dump_exceptions) {
 			err = fib_dump_info_fnhe(skb, cb, tb->tb_id, fi,
-						 &i_fa, s_fa);
+						 &i_fa, s_fa, flags);
 			if (err < 0)
 				goto stop;
 		}
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 517300d587a7..b6a6f18c3dd1 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2728,7 +2728,8 @@  EXPORT_SYMBOL_GPL(ip_route_output_flow);
 /* called with rcu_read_lock held */
 static int rt_fill_info(struct net *net, __be32 dst, __be32 src,
 			struct rtable *rt, u32 table_id, struct flowi4 *fl4,
-			struct sk_buff *skb, u32 portid, u32 seq)
+			struct sk_buff *skb, u32 portid, u32 seq,
+			unsigned int flags)
 {
 	struct rtmsg *r;
 	struct nlmsghdr *nlh;
@@ -2736,7 +2737,7 @@  static int rt_fill_info(struct net *net, __be32 dst, __be32 src,
 	u32 error;
 	u32 metrics[RTAX_MAX];
 
-	nlh = nlmsg_put(skb, portid, seq, RTM_NEWROUTE, sizeof(*r), 0);
+	nlh = nlmsg_put(skb, portid, seq, RTM_NEWROUTE, sizeof(*r), flags);
 	if (!nlh)
 		return -EMSGSIZE;
 
@@ -2860,7 +2861,7 @@  static int rt_fill_info(struct net *net, __be32 dst, __be32 src,
 static int fnhe_dump_bucket(struct net *net, struct sk_buff *skb,
 			    struct netlink_callback *cb, u32 table_id,
 			    struct fnhe_hash_bucket *bucket, int genid,
-			    int *fa_index, int fa_start)
+			    int *fa_index, int fa_start, unsigned int flags)
 {
 	int i;
 
@@ -2891,7 +2892,7 @@  static int fnhe_dump_bucket(struct net *net, struct sk_buff *skb,
 			err = rt_fill_info(net, fnhe->fnhe_daddr, 0, rt,
 					   table_id, NULL, skb,
 					   NETLINK_CB(cb->skb).portid,
-					   cb->nlh->nlmsg_seq);
+					   cb->nlh->nlmsg_seq, flags);
 			if (err)
 				return err;
 next:
@@ -2904,7 +2905,7 @@  static int fnhe_dump_bucket(struct net *net, struct sk_buff *skb,
 
 int fib_dump_info_fnhe(struct sk_buff *skb, struct netlink_callback *cb,
 		       u32 table_id, struct fib_info *fi,
-		       int *fa_index, int fa_start)
+		       int *fa_index, int fa_start, unsigned int flags)
 {
 	struct net *net = sock_net(cb->skb->sk);
 	int nhsel, genid = fnhe_genid(net);
@@ -2922,7 +2923,8 @@  int fib_dump_info_fnhe(struct sk_buff *skb, struct netlink_callback *cb,
 		err = 0;
 		if (bucket)
 			err = fnhe_dump_bucket(net, skb, cb, table_id, bucket,
-					       genid, fa_index, fa_start);
+					       genid, fa_index, fa_start,
+					       flags);
 		rcu_read_unlock();
 		if (err)
 			return err;
@@ -3183,7 +3185,8 @@  static int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 				    fl4.flowi4_tos, res.fi, 0);
 	} else {
 		err = rt_fill_info(net, dst, src, rt, table_id, &fl4, skb,
-				   NETLINK_CB(in_skb).portid, nlh->nlmsg_seq);
+				   NETLINK_CB(in_skb).portid,
+				   nlh->nlmsg_seq, 0);
 	}
 	if (err < 0)
 		goto errout_rcu;