Message ID | 509184D9.8030103@hp.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
Brian Haley <brian.haley@hp.com> writes: > Instead of having the getsockopt() of SO_BINDTODEVICE return an index, which > will then require another call like if_indextoname() to get the actual interface > name, have it return the name directly. > > This also matches the existing man page description on socket(7) which mentions > the argument being an interface name. > > If the value has not been set, zero is returned and optlen will be set to zero > to indicate there is no interface name present. That will break all existing programs using the return value, right? Better to fix the manpage -Andi
On 10/31/2012 04:47 PM, Andi Kleen wrote: > Brian Haley <brian.haley@hp.com> writes: > >> Instead of having the getsockopt() of SO_BINDTODEVICE return an index, which >> will then require another call like if_indextoname() to get the actual interface >> name, have it return the name directly. >> >> This also matches the existing man page description on socket(7) which mentions >> the argument being an interface name. >> >> If the value has not been set, zero is returned and optlen will be set to zero >> to indicate there is no interface name present. > > That will break all existing programs using the return value, right? > Better to fix the manpage Dave just accepted the original code for this into net-next last week, so I don't think it's too late to change it to be correct. -Brian -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Andi Kleen <andi@firstfloor.org> Date: Wed, 31 Oct 2012 13:47:01 -0700 > Brian Haley <brian.haley@hp.com> writes: > >> Instead of having the getsockopt() of SO_BINDTODEVICE return an index, which >> will then require another call like if_indextoname() to get the actual interface >> name, have it return the name directly. >> >> This also matches the existing man page description on socket(7) which mentions >> the argument being an interface name. >> >> If the value has not been set, zero is returned and optlen will be set to zero >> to indicate there is no interface name present. > > That will break all existing programs using the return value, right? > Better to fix the manpage It never returned a value before, you could not getsockopt() on SO_BINDTODEVICE previously. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> +static int sock_getbindtodevice(struct sock *sk, char __user *optval, > + int __user *optlen, int len) > +{ > + int ret = -ENOPROTOOPT; > +#ifdef CONFIG_NETDEVICES > + struct net *net = sock_net(sk); > + struct net_device *dev; > + char devname[IFNAMSIZ]; > + > + if (sk->sk_bound_dev_if == 0) { > + len = 0; > + goto zero; > + } > + > + ret = -EINVAL; > + if (len < IFNAMSIZ) > + goto out; > + > + rcu_read_lock(); > + dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if); > + if (dev) > + strcpy(devname, dev->name); This still races with the device name change, potentially providing a name which never existed in the system, doesn't it? > + rcu_read_unlock(); > + ret = -ENODEV; > + if (!dev) > + goto out; > + > + len = strlen(devname) + 1; > + > + ret = -EFAULT; > + if (copy_to_user(optval, devname, len)) > + goto out; > + > +zero: > + ret = -EFAULT; > + if (put_user(len, optlen)) > + goto out; > + > + ret = 0; > + > +out: > +#endif > + > + return ret; > +} Thanks, Pavel -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11/02/2012 05:36 AM, Pavel Emelyanov wrote: >> +static int sock_getbindtodevice(struct sock *sk, char __user *optval, >> + int __user *optlen, int len) >> +{ >> + int ret = -ENOPROTOOPT; >> +#ifdef CONFIG_NETDEVICES >> + struct net *net = sock_net(sk); >> + struct net_device *dev; >> + char devname[IFNAMSIZ]; >> + >> + if (sk->sk_bound_dev_if == 0) { >> + len = 0; >> + goto zero; >> + } >> + >> + ret = -EINVAL; >> + if (len < IFNAMSIZ) >> + goto out; >> + >> + rcu_read_lock(); >> + dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if); >> + if (dev) >> + strcpy(devname, dev->name); > > This still races with the device name change, potentially providing > a name which never existed in the system, doesn't it? My only argument here is that SIOCGIFNAME has had this same code forever, and noone has ever complained about that returning a garbled name. Even dev_get_by_name() only holds an rcu lock when doing a strncmp(). We'd need to audit the whole kernel to catch all the places where we potentially look at dev->name while it could change. Is it really worth it? -Brian -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2012-11-02 at 11:02 -0400, Brian Haley wrote: > On 11/02/2012 05:36 AM, Pavel Emelyanov wrote: > >> +static int sock_getbindtodevice(struct sock *sk, char __user *optval, > >> + int __user *optlen, int len) > >> +{ > >> + int ret = -ENOPROTOOPT; > >> +#ifdef CONFIG_NETDEVICES > >> + struct net *net = sock_net(sk); > >> + struct net_device *dev; > >> + char devname[IFNAMSIZ]; > >> + > >> + if (sk->sk_bound_dev_if == 0) { > >> + len = 0; > >> + goto zero; > >> + } > >> + > >> + ret = -EINVAL; > >> + if (len < IFNAMSIZ) > >> + goto out; > >> + > >> + rcu_read_lock(); > >> + dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if); > >> + if (dev) > >> + strcpy(devname, dev->name); > > > > This still races with the device name change, potentially providing > > a name which never existed in the system, doesn't it? > > My only argument here is that SIOCGIFNAME has had this same code forever, and > noone has ever complained about that returning a garbled name. Even > dev_get_by_name() only holds an rcu lock when doing a strncmp(). > > We'd need to audit the whole kernel to catch all the places where we potentially > look at dev->name while it could change. Is it really worth it? A net device name can't be changed while the device is up, or while another task holds the RTNL lock. I think that covers almost all uses. I don't know whether it's worth going out to look for exceptions, but we might as well fix the cases we know about. Ben.
On 11/02/2012 05:34 PM, Ben Hutchings wrote: > On Fri, 2012-11-02 at 11:02 -0400, Brian Haley wrote: >> On 11/02/2012 05:36 AM, Pavel Emelyanov wrote: >>>> +static int sock_getbindtodevice(struct sock *sk, char __user *optval, >>>> + int __user *optlen, int len) >>>> +{ >>>> + int ret = -ENOPROTOOPT; >>>> +#ifdef CONFIG_NETDEVICES >>>> + struct net *net = sock_net(sk); >>>> + struct net_device *dev; >>>> + char devname[IFNAMSIZ]; >>>> + >>>> + if (sk->sk_bound_dev_if == 0) { >>>> + len = 0; >>>> + goto zero; >>>> + } >>>> + >>>> + ret = -EINVAL; >>>> + if (len < IFNAMSIZ) >>>> + goto out; >>>> + >>>> + rcu_read_lock(); >>>> + dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if); >>>> + if (dev) >>>> + strcpy(devname, dev->name); >>> >>> This still races with the device name change, potentially providing >>> a name which never existed in the system, doesn't it? >> >> My only argument here is that SIOCGIFNAME has had this same code forever, and >> noone has ever complained about that returning a garbled name. Even >> dev_get_by_name() only holds an rcu lock when doing a strncmp(). >> >> We'd need to audit the whole kernel to catch all the places where we potentially >> look at dev->name while it could change. Is it really worth it? > > A net device name can't be changed while the device is up, or while > another task holds the RTNL lock. I think that covers almost all uses. > I don't know whether it's worth going out to look for exceptions, but we > might as well fix the cases we know about. So do you think we can fix these corner cases later and get the API right first? -Brian -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2012-11-05 at 21:01 -0700, Brian Haley wrote: > On 11/02/2012 05:34 PM, Ben Hutchings wrote: > > On Fri, 2012-11-02 at 11:02 -0400, Brian Haley wrote: > >> On 11/02/2012 05:36 AM, Pavel Emelyanov wrote: > >>>> +static int sock_getbindtodevice(struct sock *sk, char __user *optval, > >>>> + int __user *optlen, int len) > >>>> +{ > >>>> + int ret = -ENOPROTOOPT; > >>>> +#ifdef CONFIG_NETDEVICES > >>>> + struct net *net = sock_net(sk); > >>>> + struct net_device *dev; > >>>> + char devname[IFNAMSIZ]; > >>>> + > >>>> + if (sk->sk_bound_dev_if == 0) { > >>>> + len = 0; > >>>> + goto zero; > >>>> + } > >>>> + > >>>> + ret = -EINVAL; > >>>> + if (len < IFNAMSIZ) > >>>> + goto out; > >>>> + > >>>> + rcu_read_lock(); > >>>> + dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if); > >>>> + if (dev) > >>>> + strcpy(devname, dev->name); > >>> > >>> This still races with the device name change, potentially providing > >>> a name which never existed in the system, doesn't it? > >> > >> My only argument here is that SIOCGIFNAME has had this same code forever, and > >> noone has ever complained about that returning a garbled name. Even > >> dev_get_by_name() only holds an rcu lock when doing a strncmp(). > >> > >> We'd need to audit the whole kernel to catch all the places where we potentially > >> look at dev->name while it could change. Is it really worth it? > > > > A net device name can't be changed while the device is up, or while > > another task holds the RTNL lock. I think that covers almost all uses. > > I don't know whether it's worth going out to look for exceptions, but we > > might as well fix the cases we know about. > > So do you think we can fix these corner cases later and get the API > right first? Well you can avoid the problem here by using rtnl_{,un}lock() and __dev_get_by_index(). But if that's too heavyweight (it depends on how often you think this might need to be called) then I suppose it can be left until we have a general solution for races with renaming. Ben.
On 11/05/2012 09:01 PM, Brian Haley wrote: >>>>> + rcu_read_lock(); >>>>> + dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if); >>>>> + if (dev) >>>>> + strcpy(devname, dev->name); >>>> >>>> This still races with the device name change, potentially providing >>>> a name which never existed in the system, doesn't it? >>> >>> My only argument here is that SIOCGIFNAME has had this same code >>> forever, and >>> noone has ever complained about that returning a garbled name. Even >>> dev_get_by_name() only holds an rcu lock when doing a strncmp(). >>> >>> We'd need to audit the whole kernel to catch all the places where we >>> potentially >>> look at dev->name while it could change. Is it really worth it? >> >> A net device name can't be changed while the device is up, or while >> another task holds the RTNL lock. I think that covers almost all uses. >> I don't know whether it's worth going out to look for exceptions, but we >> might as well fix the cases we know about. > > So do you think we can fix these corner cases later and get the API > right first? Hi Dave, I noticed this isn't in patchwork anymore. Is it possible to get this in and then work on the other issue in a separate patch since it's not just this code that has this problem? Of course I'm still not convinced that extra check is necessary (but I'll do it to make others happy) because if you do a setsockopt("eth0"), all you care is that a getsockopt() returns "eth0" - if it returns anything else, eth0_renamed, eth0_foo, etc you'll notice it's not the same and take action. And the fact that the name can't change when UP means the odds are small it can happen anyways. Thanks, -Brian -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Brian Haley <brian.haley@hp.com> Date: Thu, 08 Nov 2012 08:36:02 -0700 > I noticed this isn't in patchwork anymore. I want all the races resolved, and in any event the thing to do when something isn't in patchwork IS TO RESUBMIT IT! Every time you ask me stuff like this you take up more of my already low amount of time. Today I'm again without internet access other than my cell phone, so everything you can do TO REDUCE my overhead would be greatly appreciated. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/core/sock.c b/net/core/sock.c index 0a023b8..9172ff4 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -505,7 +505,8 @@ struct dst_entry *sk_dst_check(struct sock *sk, u32 cookie) } EXPORT_SYMBOL(sk_dst_check); -static int sock_bindtodevice(struct sock *sk, char __user *optval, int optlen) +static int sock_setbindtodevice(struct sock *sk, char __user *optval, + int optlen) { int ret = -ENOPROTOOPT; #ifdef CONFIG_NETDEVICES @@ -562,6 +563,52 @@ out: return ret; } +static int sock_getbindtodevice(struct sock *sk, char __user *optval, + int __user *optlen, int len) +{ + int ret = -ENOPROTOOPT; +#ifdef CONFIG_NETDEVICES + struct net *net = sock_net(sk); + struct net_device *dev; + char devname[IFNAMSIZ]; + + if (sk->sk_bound_dev_if == 0) { + len = 0; + goto zero; + } + + ret = -EINVAL; + if (len < IFNAMSIZ) + goto out; + + rcu_read_lock(); + dev = dev_get_by_index_rcu(net, sk->sk_bound_dev_if); + if (dev) + strcpy(devname, dev->name); + rcu_read_unlock(); + ret = -ENODEV; + if (!dev) + goto out; + + len = strlen(devname) + 1; + + ret = -EFAULT; + if (copy_to_user(optval, devname, len)) + goto out; + +zero: + ret = -EFAULT; + if (put_user(len, optlen)) + goto out; + + ret = 0; + +out: +#endif + + return ret; +} + static inline void sock_valbool_flag(struct sock *sk, int bit, int valbool) { if (valbool) @@ -589,7 +636,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname, */ if (optname == SO_BINDTODEVICE) - return sock_bindtodevice(sk, optval, optlen); + return sock_setbindtodevice(sk, optval, optlen); if (optlen < sizeof(int)) return -EINVAL; @@ -1074,9 +1121,10 @@ int sock_getsockopt(struct socket *sock, int level, int optname, case SO_NOFCS: v.val = sock_flag(sk, SOCK_NOFCS); break; + case SO_BINDTODEVICE: - v.val = sk->sk_bound_dev_if; - break; + return sock_getbindtodevice(sk, optval, optlen, len); + default: return -ENOPROTOOPT;
Instead of having the getsockopt() of SO_BINDTODEVICE return an index, which will then require another call like if_indextoname() to get the actual interface name, have it return the name directly. This also matches the existing man page description on socket(7) which mentions the argument being an interface name. If the value has not been set, zero is returned and optlen will be set to zero to indicate there is no interface name present. Signed-off-by: Brian Haley <brian.haley@hp.com> -- } -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html