diff mbox series

[net-next,v1,3/5] ipv4: enable IFA_IF_NETNSID for RTM_GETADDR

Message ID 20180903043717.20136-4-christian@brauner.io
State Changes Requested, archived
Delegated to: David Miller
Headers show
Series rtnetlink: add IFA_IF_NETNSID for RTM_GETADDR | expand

Commit Message

Christian Brauner Sept. 3, 2018, 4:37 a.m. UTC
- Backwards Compatibility:
  If userspace wants to determine whether ipv4 RTM_GETADDR requests support
  the new IFA_IF_NETNSID property they should verify that the reply after
  sending a request includes the IFA_IF_NETNSID property. If it does not
  userspace should assume that IFA_IF_NETNSID is not supported for ipv4
  RTM_GETADDR requests on this kernel.
- From what I gather from current userspace tools that make use of
  RTM_GETADDR requests some of them pass down struct ifinfomsg when they
  should actually pass down struct ifaddrmsg. To not break existing
  tools that pass down the wrong struct we will do the same as for
  RTM_GETLINK | NLM_F_DUMP requests and not error out when the
  nlmsg_parse() fails.

- Security:
  Callers must have CAP_NET_ADMIN in the owning user namespace of the
  target network namespace.

Signed-off-by: Christian Brauner <christian@brauner.io>
---
v0->v1:
- unchanged
---
 net/ipv4/devinet.c | 38 ++++++++++++++++++++++++++++++--------
 1 file changed, 30 insertions(+), 8 deletions(-)

Comments

David Ahern Sept. 4, 2018, 3:11 a.m. UTC | #1
On 9/2/18 10:37 PM, Christian Brauner wrote:
> - Backwards Compatibility:
>   If userspace wants to determine whether ipv4 RTM_GETADDR requests support
>   the new IFA_IF_NETNSID property they should verify that the reply after
>   sending a request includes the IFA_IF_NETNSID property. If it does not
>   userspace should assume that IFA_IF_NETNSID is not supported for ipv4
>   RTM_GETADDR requests on this kernel.

Can only use it once per message type, but NLM_F_DUMP_FILTERED is a flag
that can be set to explicitly say the request is filtered as requested.
See 21fdd092acc7e. I would like to see other filters added for addresses
in the same release this gets used. The only one that comes to mind for
addresses is to only return addresses for devices with master device
index N (same intent as 21fdd092acc7e for neighbors).
Jiri Benc Sept. 4, 2018, 6:50 a.m. UTC | #2
On Mon, 3 Sep 2018 21:11:30 -0600, David Ahern wrote:
> Can only use it once per message type, but NLM_F_DUMP_FILTERED is a flag
> that can be set to explicitly say the request is filtered as requested.

The problem is that NLM_F_DUMP_FILTERED is too coarse. There's no way
to determine whether the netnsid was honored or whether it was not but
other filtering took effect.

This is a general problem with netlink: unknown attributes are ignored.
We need a way to detect that certain attribute was understood by the
kernel or was not. And it needs to work retroactively, i.e. the
application has to be able to determine the currently running kernel
does not support the feature (because it's too old).

That's why we return back the attribute in responses to a request with
IFLA_IF_NETNSID present and why we should do the same for
IFA_IF_NETNSID.

> See 21fdd092acc7e. I would like to see other filters added for addresses
> in the same release this gets used. The only one that comes to mind for
> addresses is to only return addresses for devices with master device
> index N (same intent as 21fdd092acc7e for neighbors).

I also question the statement that IFA_F_NETNSID is a filter: my
understanding of "filter" is something that limits the output to a
certain subset. I.e., unfiltered results always contain everything that
is in a filtered result. While with IFA_F_NETNSID, we get a completely
different set of data. Does that really constitute a filter? Note that
we can still filter in the target netns.

Thanks,

 Jiri
Nicolas Dichtel Sept. 4, 2018, 7:20 a.m. UTC | #3
Le 04/09/2018 à 08:50, Jiri Benc a écrit :
> On Mon, 3 Sep 2018 21:11:30 -0600, David Ahern wrote:
>> Can only use it once per message type, but NLM_F_DUMP_FILTERED is a flag
>> that can be set to explicitly say the request is filtered as requested.
> 
> The problem is that NLM_F_DUMP_FILTERED is too coarse. There's no way
> to determine whether the netnsid was honored or whether it was not but
> other filtering took effect.
> 
> This is a general problem with netlink: unknown attributes are ignored.
> We need a way to detect that certain attribute was understood by the
> kernel or was not. And it needs to work retroactively, i.e. the
> application has to be able to determine the currently running kernel
> does not support the feature (because it's too old).
> 
> That's why we return back the attribute in responses to a request with
> IFLA_IF_NETNSID present and why we should do the same for
> IFA_IF_NETNSID.
+1

> 
>> See 21fdd092acc7e. I would like to see other filters added for addresses
>> in the same release this gets used. The only one that comes to mind for
>> addresses is to only return addresses for devices with master device
>> index N (same intent as 21fdd092acc7e for neighbors).
> 
> I also question the statement that IFA_F_NETNSID is a filter: my
> understanding of "filter" is something that limits the output to a
> certain subset. I.e., unfiltered results always contain everything that
> is in a filtered result. While with IFA_F_NETNSID, we get a completely
> different set of data. Does that really constitute a filter? Note that
> we can still filter in the target netns.
+1


Regards,
Nicolas
David Ahern Sept. 4, 2018, 4:27 p.m. UTC | #4
On 9/4/18 12:50 AM, Jiri Benc wrote:
> 
> This is a general problem with netlink: unknown attributes are ignored.
> We need a way to detect that certain attribute was understood by the
> kernel or was not. And it needs to work retroactively, i.e. the
> application has to be able to determine the currently running kernel
> does not support the feature (because it's too old).

sure, and that has been discussed before.

> 
> That's why we return back the attribute in responses to a request with
> IFLA_IF_NETNSID present and why we should do the same for
> IFA_IF_NETNSID.
> 
>> See 21fdd092acc7e. I would like to see other filters added for addresses
>> in the same release this gets used. The only one that comes to mind for
>> addresses is to only return addresses for devices with master device
>> index N (same intent as 21fdd092acc7e for neighbors).
> 
> I also question the statement that IFA_F_NETNSID is a filter: my
> understanding of "filter" is something that limits the output to a
> certain subset. I.e., unfiltered results always contain everything that
> is in a filtered result. While with IFA_F_NETNSID, we get a completely
> different set of data. Does that really constitute a filter? Note that
> we can still filter in the target netns.
> 

I'll buy that argument over the 'too coarse' one. Looking at the link
version of this flag, the NLM_F_DUMP_FILTERED flag is not used there so
for consistency the address one should follow suit.
diff mbox series

Patch

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index ea4bd8a52422..c52271309a1f 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -100,6 +100,7 @@  static const struct nla_policy ifa_ipv4_policy[IFA_MAX+1] = {
 	[IFA_CACHEINFO]		= { .len = sizeof(struct ifa_cacheinfo) },
 	[IFA_FLAGS]		= { .type = NLA_U32 },
 	[IFA_RT_PRIORITY]	= { .type = NLA_U32 },
+	[IFA_IF_NETNSID]	= { .type = NLA_S32 },
 };
 
 #define IN4_ADDR_HSIZE_SHIFT	8
@@ -1584,7 +1585,8 @@  static int put_cacheinfo(struct sk_buff *skb, unsigned long cstamp,
 }
 
 static int inet_fill_ifaddr(struct sk_buff *skb, struct in_ifaddr *ifa,
-			    u32 portid, u32 seq, int event, unsigned int flags)
+			    u32 portid, u32 seq, int event, unsigned int flags,
+			    int netnsid)
 {
 	struct ifaddrmsg *ifm;
 	struct nlmsghdr  *nlh;
@@ -1601,6 +1603,9 @@  static int inet_fill_ifaddr(struct sk_buff *skb, struct in_ifaddr *ifa,
 	ifm->ifa_scope = ifa->ifa_scope;
 	ifm->ifa_index = ifa->ifa_dev->dev->ifindex;
 
+	if (netnsid >= 0 && nla_put_s32(skb, IFA_IF_NETNSID, netnsid))
+		goto nla_put_failure;
+
 	if (!(ifm->ifa_flags & IFA_F_PERMANENT)) {
 		preferred = ifa->ifa_preferred_lft;
 		valid = ifa->ifa_valid_lft;
@@ -1648,6 +1653,9 @@  static int inet_fill_ifaddr(struct sk_buff *skb, struct in_ifaddr *ifa,
 static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
 {
 	struct net *net = sock_net(skb->sk);
+	struct nlattr *tb[IFA_MAX+1];
+	struct net *tgt_net = net;
+	int netnsid = -1;
 	int h, s_h;
 	int idx, s_idx;
 	int ip_idx, s_ip_idx;
@@ -1660,12 +1668,23 @@  static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
 	s_idx = idx = cb->args[1];
 	s_ip_idx = ip_idx = cb->args[2];
 
+	if (nlmsg_parse(cb->nlh, sizeof(struct ifaddrmsg), tb, IFA_MAX,
+			ifa_ipv4_policy, NULL) >= 0) {
+		if (tb[IFA_IF_NETNSID]) {
+			netnsid = nla_get_s32(tb[IFA_IF_NETNSID]);
+
+			tgt_net = rtnl_get_net_ns_capable(skb->sk, netnsid);
+			if (IS_ERR(tgt_net))
+				return PTR_ERR(tgt_net);
+		}
+	}
+
 	for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
 		idx = 0;
-		head = &net->dev_index_head[h];
+		head = &tgt_net->dev_index_head[h];
 		rcu_read_lock();
-		cb->seq = atomic_read(&net->ipv4.dev_addr_genid) ^
-			  net->dev_base_seq;
+		cb->seq = atomic_read(&tgt_net->ipv4.dev_addr_genid) ^
+			  tgt_net->dev_base_seq;
 		hlist_for_each_entry_rcu(dev, head, index_hlist) {
 			if (idx < s_idx)
 				goto cont;
@@ -1680,9 +1699,10 @@  static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
 				if (ip_idx < s_ip_idx)
 					continue;
 				if (inet_fill_ifaddr(skb, ifa,
-					     NETLINK_CB(cb->skb).portid,
-					     cb->nlh->nlmsg_seq,
-					     RTM_NEWADDR, NLM_F_MULTI) < 0) {
+						     NETLINK_CB(cb->skb).portid,
+						     cb->nlh->nlmsg_seq,
+						     RTM_NEWADDR, NLM_F_MULTI,
+						     netnsid) < 0) {
 					rcu_read_unlock();
 					goto done;
 				}
@@ -1698,6 +1718,8 @@  static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
 	cb->args[0] = h;
 	cb->args[1] = idx;
 	cb->args[2] = ip_idx;
+	if (netnsid >= 0)
+		put_net(tgt_net);
 
 	return skb->len;
 }
@@ -1715,7 +1737,7 @@  static void rtmsg_ifa(int event, struct in_ifaddr *ifa, struct nlmsghdr *nlh,
 	if (!skb)
 		goto errout;
 
-	err = inet_fill_ifaddr(skb, ifa, portid, seq, event, 0);
+	err = inet_fill_ifaddr(skb, ifa, portid, seq, event, 0, -1);
 	if (err < 0) {
 		/* -EMSGSIZE implies BUG in inet_nlmsg_size() */
 		WARN_ON(err == -EMSGSIZE);