Message ID | 5086A7FD.3020503@parallels.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
This is really a huge ball of confusion. The user asks for a device to bind to, and they want the device which had a particular name at the time of the call. If you allow setting this via ifindex, the issues and races are the same whether the name-->ifindex translation is done before the sockopt() call or during it. Furthermore both the name and the ifindex can change (that latter via module unload/load). If you want me to consider these changes seriously, talk less about obtuse issues like symmetry and more about what problems it actually solves. As far as I can tell, all the same real issues still exist even if we had this new interface. In fact, I wouldn't mind if a getsockopt() on SO_BINDTODEVICE returned BOTH the name and the ifindex in a special structure. Then you could actually construct a more foolproof mechanism on the user side to try various ways to get the same device bound to during restart. Symmetry is over-rated. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/23/2012 01:36 PM, David Miller wrote: > > This is really a huge ball of confusion. > > The user asks for a device to bind to, and they want the device which > had a particular name at the time of the call. > > If you allow setting this via ifindex, the issues and races are the > same whether the name-->ifindex translation is done before the > sockopt() call or during it. Yes, there are race conditions in the socket APIs, even IP addresses can go away right before a bind() call. > Furthermore both the name and the ifindex can change (that latter > via module unload/load). > > If you want me to consider these changes seriously, talk less about > obtuse issues like symmetry and more about what problems it actually > solves. As far as I can tell, all the same real issues still exist > even if we had this new interface. Noone's complained about not having this getsockopt() for over 7+ years, so in that sense I'm not sure what problem it solved adding it. > In fact, I wouldn't mind if a getsockopt() on SO_BINDTODEVICE returned > BOTH the name and the ifindex in a special structure. Then you could > actually construct a more foolproof mechanism on the user side to try > various ways to get the same device bound to during restart. There is no fool-proof way to do any of this since we can agree device names and indexes can change. But people aren't typically running daemons listening on interfaces that are constantly changing like taps or tunnels, instead it's on eth1 so they can run a private DHCP server. > Symmetry is over-rated. If any of the other SO_* options didn't set and get using the same structure I'd agree with you, but I can't find any that do. -Brian -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h index b1bea03..c99c843 100644 --- a/include/uapi/asm-generic/socket.h +++ b/include/uapi/asm-generic/socket.h @@ -72,4 +72,6 @@ /* Instruct lower device to use last 4-bytes of skb data as FCS */ #define SO_NOFCS 43 +#define SO_BINDTOIFINDEX 44 + #endif /* __ASM_GENERIC_SOCKET_H */ diff --git a/net/core/sock.c b/net/core/sock.c index c49412c..c4f56e9 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -505,6 +505,39 @@ struct dst_entry *sk_dst_check(struct sock *sk, u32 cookie) } EXPORT_SYMBOL(sk_dst_check); +static void sock_set_bound_dev(struct sock *sk, int idx) +{ + sk->sk_bound_dev_if = idx; + sk_dst_reset(sk); +} + +static int sock_bindtoindex(struct sock *sk, int idx) +{ + int ret = -ENOPROTOOPT; +#ifdef CONFIG_NETDEVICES + struct net *net = sock_net(sk); + + if (!capable(CAP_NET_RAW)) + return -EPERM; + + if (idx != 0) { + struct net_device *dev; + + rcu_read_lock(); + dev = dev_get_by_index_rcu(net, idx); + rcu_read_unlock(); + ret = -ENODEV; + if (!dev) + goto out; + } + + sock_set_bound_dev(sk, idx); + ret = 0; +out: +#endif + return ret; +} + static int sock_bindtodevice(struct sock *sk, char __user *optval, int optlen) { int ret = -ENOPROTOOPT; @@ -550,8 +583,7 @@ static int sock_bindtodevice(struct sock *sk, char __user *optval, int optlen) } lock_sock(sk); - sk->sk_bound_dev_if = index; - sk_dst_reset(sk); + sock_set_bound_dev(sk, index); release_sock(sk); ret = 0; @@ -839,6 +871,9 @@ set_rcvbuf: case SO_NOFCS: sock_valbool_flag(sk, SOCK_NOFCS, valbool); break; + case SO_BINDTOIFINDEX: + ret = sock_bindtoindex(sk, val); + break; default: ret = -ENOPROTOOPT; @@ -1074,7 +1109,7 @@ int sock_getsockopt(struct socket *sock, int level, int optname, case SO_NOFCS: v.val = sock_flag(sk, SOCK_NOFCS); break; - case SO_BINDTODEVICE: + case SO_BINDTOIFINDEX: v.val = sk->sk_bound_dev_if; break; default:
Gentlemen, here's how the symmetrical sk_bound_dev set/get would look like. If it's OK for you, I would ask David for inclusion. It was noted that it was not good to have the non-symmetrical SO_BINDTODEVICE sockoption (set works with name, get reports index) and we'd rather implement another option to set and get socket bound device that work on ifindex. The SO_BINDTONETDEV get-er is removed, as getting a device name is tricky and having one doesn't worth that pain. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> --- include/uapi/asm-generic/socket.h | 2 + net/core/sock.c | 41 ++++++++++++++++++++++++++++++++++-- 2 files changed, 40 insertions(+), 3 deletions(-)