diff mbox

[net-next] sockopt: Introduce the SO_BINDTOIFINDEX

Message ID 5086A7FD.3020503@parallels.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Pavel Emelyanov Oct. 23, 2012, 2:21 p.m. UTC
Gentlemen,

here's how the symmetrical sk_bound_dev set/get would look like. If it's OK
for you, I would ask David for inclusion.



It was noted that it was not good to have the non-symmetrical SO_BINDTODEVICE
sockoption (set works with name, get reports index) and we'd rather implement
another option to set and get socket bound device that work on ifindex.

The SO_BINDTONETDEV get-er is removed, as getting a device name is tricky and
having one doesn't worth that pain.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

---
 include/uapi/asm-generic/socket.h |    2 +
 net/core/sock.c                   |   41 ++++++++++++++++++++++++++++++++++--
 2 files changed, 40 insertions(+), 3 deletions(-)

Comments

David Miller Oct. 23, 2012, 5:36 p.m. UTC | #1
This is really a huge ball of confusion.

The user asks for a device to bind to, and they want the device which
had a particular name at the time of the call.

If you allow setting this via ifindex, the issues and races are the
same whether the name-->ifindex translation is done before the
sockopt() call or during it.

Furthermore both the name and the ifindex can change (that latter
via module unload/load).

If you want me to consider these changes seriously, talk less about
obtuse issues like symmetry and more about what problems it actually
solves.  As far as I can tell, all the same real issues still exist
even if we had this new interface.

In fact, I wouldn't mind if a getsockopt() on SO_BINDTODEVICE returned
BOTH the name and the ifindex in a special structure.  Then you could
actually construct a more foolproof mechanism on the user side to try
various ways to get the same device bound to during restart.

Symmetry is over-rated.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Brian Haley Oct. 23, 2012, 9:43 p.m. UTC | #2
On 10/23/2012 01:36 PM, David Miller wrote:
> 
> This is really a huge ball of confusion.
> 
> The user asks for a device to bind to, and they want the device which
> had a particular name at the time of the call.
> 
> If you allow setting this via ifindex, the issues and races are the
> same whether the name-->ifindex translation is done before the
> sockopt() call or during it.

Yes, there are race conditions in the socket APIs, even IP addresses can go away
right before a bind() call.

> Furthermore both the name and the ifindex can change (that latter
> via module unload/load).
> 
> If you want me to consider these changes seriously, talk less about
> obtuse issues like symmetry and more about what problems it actually
> solves.  As far as I can tell, all the same real issues still exist
> even if we had this new interface.

Noone's complained about not having this getsockopt() for over 7+ years, so in
that sense I'm not sure what problem it solved adding it.

> In fact, I wouldn't mind if a getsockopt() on SO_BINDTODEVICE returned
> BOTH the name and the ifindex in a special structure.  Then you could
> actually construct a more foolproof mechanism on the user side to try
> various ways to get the same device bound to during restart.

There is no fool-proof way to do any of this since we can agree device names and
indexes can change.  But people aren't typically running daemons listening on
interfaces that are constantly changing like taps or tunnels, instead it's on
eth1 so they can run a private DHCP server.

> Symmetry is over-rated.

If any of the other SO_* options didn't set and get using the same structure I'd
agree with you, but I can't find any that do.

-Brian

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index b1bea03..c99c843 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -72,4 +72,6 @@ 
 /* Instruct lower device to use last 4-bytes of skb data as FCS */
 #define SO_NOFCS		43
 
+#define SO_BINDTOIFINDEX	44
+
 #endif /* __ASM_GENERIC_SOCKET_H */
diff --git a/net/core/sock.c b/net/core/sock.c
index c49412c..c4f56e9 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -505,6 +505,39 @@  struct dst_entry *sk_dst_check(struct sock *sk, u32 cookie)
 }
 EXPORT_SYMBOL(sk_dst_check);
 
+static void sock_set_bound_dev(struct sock *sk, int idx)
+{
+	sk->sk_bound_dev_if = idx;
+	sk_dst_reset(sk);
+}
+
+static int sock_bindtoindex(struct sock *sk, int idx)
+{
+	int ret = -ENOPROTOOPT;
+#ifdef CONFIG_NETDEVICES
+	struct net *net = sock_net(sk);
+
+	if (!capable(CAP_NET_RAW))
+		return -EPERM;
+
+	if (idx != 0) {
+		struct net_device *dev;
+
+		rcu_read_lock();
+		dev = dev_get_by_index_rcu(net, idx);
+		rcu_read_unlock();
+		ret = -ENODEV;
+		if (!dev)
+			goto out;
+	}
+
+	sock_set_bound_dev(sk, idx);
+	ret = 0;
+out:
+#endif
+	return ret;
+}
+
 static int sock_bindtodevice(struct sock *sk, char __user *optval, int optlen)
 {
 	int ret = -ENOPROTOOPT;
@@ -550,8 +583,7 @@  static int sock_bindtodevice(struct sock *sk, char __user *optval, int optlen)
 	}
 
 	lock_sock(sk);
-	sk->sk_bound_dev_if = index;
-	sk_dst_reset(sk);
+	sock_set_bound_dev(sk, index);
 	release_sock(sk);
 
 	ret = 0;
@@ -839,6 +871,9 @@  set_rcvbuf:
 	case SO_NOFCS:
 		sock_valbool_flag(sk, SOCK_NOFCS, valbool);
 		break;
+	case SO_BINDTOIFINDEX:
+		ret = sock_bindtoindex(sk, val);
+		break;
 
 	default:
 		ret = -ENOPROTOOPT;
@@ -1074,7 +1109,7 @@  int sock_getsockopt(struct socket *sock, int level, int optname,
 	case SO_NOFCS:
 		v.val = sock_flag(sk, SOCK_NOFCS);
 		break;
-	case SO_BINDTODEVICE:
+	case SO_BINDTOIFINDEX:
 		v.val = sk->sk_bound_dev_if;
 		break;
 	default: