diff mbox

[RFC] netlink: Increase netlink dump skb message size

Message ID 20110425220157.2012.96707.stgit@gitlad.jf.intel.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Rose, Gregory V April 25, 2011, 10:01 p.m. UTC
The message size allocated for rtnl info dumps was limited to a single page.
This is not enough for additional interface info available with devices
that support SR-IOV.  Check that the amount of data allocated is sufficient
for the amount of data requested.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
---

 include/linux/rtnetlink.h |    1 +
 net/core/rtnetlink.c      |    6 ++++++
 net/netlink/af_netlink.c  |   37 +++++++++++++++++++++++++++++++------
 3 files changed, 38 insertions(+), 6 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller April 29, 2011, 7:29 p.m. UTC | #1
From: Greg Rose <gregory.v.rose@intel.com>
Date: Mon, 25 Apr 2011 15:01:57 -0700

> The message size allocated for rtnl info dumps was limited to a single page.
> This is not enough for additional interface info available with devices
> that support SR-IOV.  Check that the amount of data allocated is sufficient
> for the amount of data requested.
> 
> Signed-off-by: Greg Rose <gregory.v.rose@intel.com>

Actually, thinking about this problem some more I think your approach
is close to what we should actually do.

The only problem is that you've specialized netlink_dump() too much,
hide the issue in the rtnetlink device dumping code and provide
the necessary information via the control block.

1) Add some information to struct netlink_callback.

   	"u16	min_dump_alloc;"

   I think you can change "int family;" there to "u16 family;"
   so that there is no new space required to add this functionality.

2) In netlink_dump().

	int alloc_size;

	mutex_lock(nlk->cb_mutex);
	cb = nlk->cb;
	if (cb == NULL) {
		...
	}
	alloc_size = max_t(int, cb->min_dump_alloc, NLMSG_GOODSIZE);
	skb = sock_rmalloc(sk, alloc_size, 0, GFP_KERNEL);
	if (!skb) {
	...

3) In net/core/rtnetlink.c add a new method pointer to struct rtnl_link,
   "u16 (*calc_min_dump_alloc)(struct sk_buff *);"
   Add an implementation for the RTM_GETLINK slot.

   This function does the logic to compute the maximum space needed
   for the devices currently configured, as you hacked directly into
   the netlink_dump() logic in your match.

4) Make netlink_dump_start() take a new argument, this is where you
   pass the new min_dump_alloc value that will get stored in the
   newly allocated callback object.

	calcit = rtnl_get_calcit(family, type);

   	min_dump_alloc = 0;
	if (calcit)
		min_dump_alloc = calcit(skb);
	err = netlink_dump_start(rtnl, skb, nlh, dumpit, min_dump_alloc, NULL);

Anyways, something like that.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rose, Gregory V April 29, 2011, 7:37 p.m. UTC | #2
> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Friday, April 29, 2011 12:30 PM
> To: Rose, Gregory V
> Cc: netdev@vger.kernel.org; bhutchings@solarflare.com
> Subject: Re: [RFC PATCH] netlink: Increase netlink dump skb message size
> 
> From: Greg Rose <gregory.v.rose@intel.com>
> Date: Mon, 25 Apr 2011 15:01:57 -0700
> 
> > The message size allocated for rtnl info dumps was limited to a single
> page.
> > This is not enough for additional interface info available with devices
> > that support SR-IOV.  Check that the amount of data allocated is
> sufficient
> > for the amount of data requested.
> >
> > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> 
> Actually, thinking about this problem some more I think your approach
> is close to what we should actually do.
> 
> The only problem is that you've specialized netlink_dump() too much,
> hide the issue in the rtnetlink device dumping code and provide
> the necessary information via the control block.
> 
> 1) Add some information to struct netlink_callback.
> 
>    	"u16	min_dump_alloc;"
> 
>    I think you can change "int family;" there to "u16 family;"
>    so that there is no new space required to add this functionality.
> 
> 2) In netlink_dump().
> 
> 	int alloc_size;
> 
> 	mutex_lock(nlk->cb_mutex);
> 	cb = nlk->cb;
> 	if (cb == NULL) {
> 		...
> 	}
> 	alloc_size = max_t(int, cb->min_dump_alloc, NLMSG_GOODSIZE);
> 	skb = sock_rmalloc(sk, alloc_size, 0, GFP_KERNEL);
> 	if (!skb) {
> 	...
> 
> 3) In net/core/rtnetlink.c add a new method pointer to struct rtnl_link,
>    "u16 (*calc_min_dump_alloc)(struct sk_buff *);"
>    Add an implementation for the RTM_GETLINK slot.
> 
>    This function does the logic to compute the maximum space needed
>    for the devices currently configured, as you hacked directly into
>    the netlink_dump() logic in your match.
> 
> 4) Make netlink_dump_start() take a new argument, this is where you
>    pass the new min_dump_alloc value that will get stored in the
>    newly allocated callback object.
> 
> 	calcit = rtnl_get_calcit(family, type);
> 
>    	min_dump_alloc = 0;
> 	if (calcit)
> 		min_dump_alloc = calcit(skb);
> 	err = netlink_dump_start(rtnl, skb, nlh, dumpit, min_dump_alloc,
> NULL);
> 
> Anyways, something like that.
[Greg Rose] 

I'll look into this and try to get a revised patch done by early next week.

The feedback and suggestions are much appreciated.

- Greg


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index bbad657..d1ff937 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -622,6 +622,7 @@  extern int rtnetlink_put_metrics(struct sk_buff *skb, u32 *metrics);
 extern int rtnl_put_cacheinfo(struct sk_buff *skb, struct dst_entry *dst,
 			      u32 id, u32 ts, u32 tsage, long expires,
 			      u32 error);
+extern size_t rtnl_get_nlmsg_size(const struct net_device *dev);
 
 extern void __rta_fill(struct sk_buff *skb, int attrtype, int attrlen, const void *data);
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index d7c4bb4..001c947 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -764,6 +764,12 @@  static noinline size_t if_nlmsg_size(const struct net_device *dev)
 	       + rtnl_link_get_af_size(dev); /* IFLA_AF_SPEC */
 }
 
+size_t rtnl_get_nlmsg_size(const struct net_device *dev)
+{
+	return if_nlmsg_size(dev);
+}
+EXPORT_SYMBOL(rtnl_get_nlmsg_size);
+
 static int rtnl_vf_ports_fill(struct sk_buff *skb, struct net_device *dev)
 {
 	struct nlattr *vf_ports;
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index c8f35b5..5b1106c 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1664,23 +1664,48 @@  static void netlink_destroy_callback(struct netlink_callback *cb)
 static int netlink_dump(struct sock *sk)
 {
 	struct netlink_sock *nlk = nlk_sk(sk);
+	struct net *net = sock_net(sk);
 	struct netlink_callback *cb;
 	struct sk_buff *skb;
 	struct nlmsghdr *nlh;
+	struct net_device *dev;
+	struct hlist_head *head;
+	struct hlist_node *node;
 	int len, err = -ENOBUFS;
-
-	skb = sock_rmalloc(sk, NLMSG_GOODSIZE, 0, GFP_KERNEL);
-	if (!skb)
-		goto errout;
+	int h, s_h;
+	int idx = 0, s_idx;
+	size_t alloc_size = NLMSG_GOODSIZE;
 
 	mutex_lock(nlk->cb_mutex);
 
 	cb = nlk->cb;
 	if (cb == NULL) {
 		err = -EINVAL;
-		goto errout_skb;
+		goto errout;
 	}
 
+	s_h = cb->args[0];
+	s_idx = cb->args[1];
+
+	for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
+		idx = 0;
+		head = &net->dev_index_head[h];
+		hlist_for_each_entry(dev, node, head, index_hlist) {
+			if (idx < s_idx) {
+				idx++;
+				continue;
+			}
+			alloc_size = rtnl_get_nlmsg_size(dev);
+			if (alloc_size < NLMSG_GOODSIZE)
+				alloc_size = NLMSG_GOODSIZE;
+			break;
+		}
+	}
+
+	skb = sock_rmalloc(sk, alloc_size, 0, GFP_KERNEL);
+	if (!skb)
+		goto errout;
+
 	len = cb->dump(skb, cb);
 
 	if (len > 0) {
@@ -1717,9 +1742,9 @@  static int netlink_dump(struct sock *sk)
 	return 0;
 
 errout_skb:
-	mutex_unlock(nlk->cb_mutex);
 	kfree_skb(skb);
 errout:
+	mutex_unlock(nlk->cb_mutex);
 	return err;
 }