[RFC,net-next,v1] net: core: enable SO_BINDTODEVICE for non-root users

Message ID 20200315155910.3262015-1-vincent@bernat.ch
State RFC
Delegated to: David Miller

Commit Message

Vincent Bernat March 15, 2020, 3:59 p.m. UTC
Currently, SO_BINDTODEVICE requires CAP_NET_RAW. This change allows a
non-root user to bind a socket to an interface if it is not already
bound. This is useful to allow an application to bind itself to a
specific VRF for outgoing or incoming connections. Currently, an
application wanting to manage connections through several VRFs needs to
be privileged. Moreover, I don't see a reason why an application
couldn't restrict its own scope. Such a privilege is already possible
with UDP through IP_UNICAST_IF.

When an application is restricted to a VRF (with `ip vrf exec`), the
socket is bound to an interface at creation and therefore, a
non-privileged call to SO_BINDTODEVICE to escape the VRF fails.

When an application binds a socket with SO_BINDTODEVICE and passes it
to a non-privileged process through a Unix socket, an attempt to
change the bound device also fails.
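
The rule described above can be modelled outside the kernel. The
following Python sketch is only an illustration: the function name
`may_set_bound_device` and its parameters are hypothetical, not kernel
symbols. It mirrors the condition this patch puts in
sock_setbindtodevice_locked().

```python
import errno


def may_set_bound_device(current_ifindex: int, has_cap_net_raw: bool) -> int:
    """Return 0 if the permission check would pass, else -EPERM.

    Hypothetical model of the patched check: CAP_NET_RAW is only
    required once the socket already has a bound device, that is,
    once sk_bound_dev_if is non-zero.
    """
    if current_ifindex != 0 and not has_cap_net_raw:
        return -errno.EPERM
    return 0
```

Both cases above (a socket created under `ip vrf exec`, and a bound
socket received over a Unix socket) start with a non-zero index, so
the model returns -EPERM for an unprivileged caller.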

Before:

    >>> import socket
    >>> s=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    >>> s.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, b"dummy0")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    PermissionError: [Errno 1] Operation not permitted

After:

    >>> import socket
    >>> s=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    >>> s.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, b"dummy0")
    >>> s.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, b"dummy0")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    PermissionError: [Errno 1] Operation not permitted
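
An application that has to run on kernels with and without this
change could wrap the call as sketched below. The helper name
`try_bind_to_device` is hypothetical, and the fallback value 25 for
SO_BINDTODEVICE is an assumption that holds for the generic Linux
socket headers.

```python
import socket

# Fallback for builds of the socket module that lack the constant
# (25 is the asm-generic Linux value; an assumption for other arches).
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)


def try_bind_to_device(sock, ifname: str) -> bool:
    """Bind sock to ifname; return False if the kernel answers EPERM."""
    try:
        sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE, ifname.encode())
    except PermissionError:
        # Unprivileged caller on an unpatched kernel, or the socket
        # is already bound to a device.
        return False
    return True
```

Other errors, such as a device name that does not exist, are left to
propagate as OSError.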

Signed-off-by: Vincent Bernat <vincent@bernat.ch>
---
 net/core/sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

David Miller March 16, 2020, 12:02 a.m. UTC | #1
From: Vincent Bernat <vincent@bernat.ch>
Date: Sun, 15 Mar 2020 16:59:11 +0100

> Currently, SO_BINDTODEVICE requires CAP_NET_RAW. This change allows a
> non-root user to bind a socket to an interface if it is not already
> bound. This is useful to allow an application to bind itself to a
> specific VRF for outgoing or incoming connections. Currently, an
> application wanting to manage connections through several VRF need to
> be privileged. Moreover, I don't see a reason why an application
> couldn't restrict its own scope. Such a privilege is already possible
> with UDP through IP_UNICAST_IF.

It could be argued that IP_UNICAST_IF and similar should be privileged
as well.

When the administrator sets up the routes, they don't expect that
arbitrary user applications can "escape" the route configuration by
specifying the interface so readily.

David Ahern March 16, 2020, 2:36 a.m. UTC | #2
On 3/15/20 6:02 PM, David Miller wrote:
> From: Vincent Bernat <vincent@bernat.ch>
> Date: Sun, 15 Mar 2020 16:59:11 +0100
> 
>> Currently, SO_BINDTODEVICE requires CAP_NET_RAW. This change allows a
>> non-root user to bind a socket to an interface if it is not already
>> bound. This is useful to allow an application to bind itself to a
>> specific VRF for outgoing or incoming connections. Currently, an
>> application wanting to manage connections through several VRF need to
>> be privileged. Moreover, I don't see a reason why an application
>> couldn't restrict its own scope. Such a privilege is already possible
>> with UDP through IP_UNICAST_IF.
> 
> It could be argued that IP_UNICAST_IF and similar should be privileged
> as well.
> 
> When the administrator sets up the routes, they don't expect that
> arbitrary user applications can "escape" the route configuration by
> specifying the interface so readily.
> 

Hi Dave:

As a reminder, there are currently 3 APIs to specify a preferred device
association which influences route lookups:

1. SO_BINDTODEVICE - sets sk_bound_dev_if and is the strongest binding
(i.e., cannot be overridden),

2. IP_UNICAST_IF / IPV6_UNICAST_IF - sets uc_index / ucast_oif and is
sticky for a socket, and

3. IP_PKTINFO / IPV6_PKTINFO - which is per message.

The first, SO_BINDTODEVICE, requires root privileges. The last 2 do not
require root privileges but only apply to raw and UDP sockets making TCP
the outlier.

Further, a downside to the last 2 is that they work for sendmsg only;
there is no way to definitively match a response to the sending socket.
The key point is that UDP and raw have multiple non-root APIs to dictate
a preferred device for sending messages.
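
For reference, the three APIs listed above look roughly like this from
Python. This is a sketch, not tested against every kernel: the helper
names are hypothetical, and the numeric fallbacks (25, 50, 8) are the
Linux header values, used only when the socket module does not export
the constant.

```python
import socket
import struct

# Linux header values, used as fallbacks (assumption: generic arch).
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)
IP_UNICAST_IF = getattr(socket, "IP_UNICAST_IF", 50)
IP_PKTINFO = getattr(socket, "IP_PKTINFO", 8)


def bind_to_device(sock, ifname):
    """1. Strongest binding: sets sk_bound_dev_if."""
    sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE, ifname.encode())


def unicast_if_value(ifindex):
    """Option value for IP_UNICAST_IF: the index in network byte order."""
    return struct.pack("!I", ifindex)


def set_unicast_if(sock, ifindex):
    """2. Sticky per-socket preference (UDP and raw sockets only)."""
    sock.setsockopt(socket.IPPROTO_IP, IP_UNICAST_IF, unicast_if_value(ifindex))


def pktinfo_cmsg(ifindex):
    """3. Per-message preference: ancillary data for sendmsg().

    Packs struct in_pktinfo { int ipi_ifindex; struct in_addr
    ipi_spec_dst; struct in_addr ipi_addr; } with both addresses
    left unspecified.
    """
    data = struct.pack("@i4s4s", ifindex, bytes(4), bytes(4))
    return (socket.IPPROTO_IP, IP_PKTINFO, data)
```

With `idx = socket.if_nametoindex("dummy0")`, a UDP sender would pass
`[pktinfo_cmsg(idx)]` as the ancillary data argument of
`sock.sendmsg()`.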

Vincent's patch simplifies things quite a bit - allowing consistency
across the protocols and directions - but without overriding any
administrator settings (e.g., inherited bindings via ebpf programs).

David Miller March 16, 2020, 9:13 a.m. UTC | #3
From: David Ahern <dsahern@gmail.com>
Date: Sun, 15 Mar 2020 20:36:10 -0600

> As a reminder, there are currently 3 APIs to specify a preferred device
> association which influences route lookups:
> 
> 1. SO_BINDTODEVICE - sets sk_bound_dev_if and is the strongest binding
> (ie., can not be overridden),
> 
> 2. IP_UNICAST_IF / IPV6_UNICAST_IF - sets uc_index / ucast_oif and is
> sticky for a socket, and
> 
> 3. IP_PKTINFO / IPV6_PKTINFO - which is per message.
> 
> The first, SO_BINDTODEVICE, requires root privileges. The last 2 do not
> require root privileges but only apply to raw and UDP sockets making TCP
> the outlier.
> 
> Further, a downside to the last 2 is that they work for sendmsg only;
> there is no way to definitively match a response to the sending socket.
> The key point is that UDP and raw have multiple non-root APIs to dictate
> a preferred device for sending messages.
> 
> Vincent's patch simplifies things quite a bit - allowing consistency
> across the protocols and directions - but without overriding any
> administrator settings (e.g., inherited bindings via ebpf programs).

Understood, but I still wonder if this mismatch of privilege
requirements was by design or unintentional.

Allowing arbitrary users to specify SO_BINDTODEVICE has broad and
far-reaching consequences, so if we are going to remove the
restriction we should at least discuss the implications.

Vincent Bernat March 16, 2020, 1:58 p.m. UTC | #4
❦ 16 March 2020 02:13 -07, David Miller:

>> As a reminder, there are currently 3 APIs to specify a preferred device
>> association which influences route lookups:
>> 
>> 1. SO_BINDTODEVICE - sets sk_bound_dev_if and is the strongest binding
>> (ie., can not be overridden),
>> 
>> 2. IP_UNICAST_IF / IPV6_UNICAST_IF - sets uc_index / ucast_oif and is
>> sticky for a socket, and
>> 
>> 3. IP_PKTINFO / IPV6_PKTINFO - which is per message.
>> 
>> The first, SO_BINDTODEVICE, requires root privileges. The last 2 do not
>> require root privileges but only apply to raw and UDP sockets making TCP
>> the outlier.
>> 
>> Further, a downside to the last 2 is that they work for sendmsg only;
>> there is no way to definitively match a response to the sending socket.
>> The key point is that UDP and raw have multiple non-root APIs to dictate
>> a preferred device for sending messages.
>> 
>> Vincent's patch simplifies things quite a bit - allowing consistency
>> across the protocols and directions - but without overriding any
>> administrator settings (e.g., inherited bindings via ebpf programs).
>
> Understood, but I still wonder if this mis-match of privilege
> requirements was by design or unintentional.
>
> Allowing arbitrary users to specify SO_BINDTODEVICE has broad and far
> reaching consequences, so at a minimum if we are going to remove the
> restriction we should at least discuss the implications.

Without VRF, it's hard to build a case where a process could "evade" its
setup using SO_BINDTODEVICE. It could be used to select an alternative
routing table when ip rules match on interfaces (which is not uncommon),
but I think almost all such setups use this to get some isolation in
routing (not for local processes), something that could be replaced by
VRF.

With VRF, removing the restriction gives a process more possibilities
than before, provided it has not been restricted. This is my use case:
it is actually difficult to use VRF for anything other than routing
because local processes may want to receive connections in one VRF and
forward them to another one.

In summary, unless I am missing something, the main implication is when
using VRF. Without VRF, no real change.

An alternative would be to use a sysctl to decide the behaviour.

David Ahern March 20, 2020, 3:21 p.m. UTC | #5
[ sorry for the delay; out of commission for a few days ]

On 3/16/20 7:58 AM, Vincent Bernat wrote:
>  ❦ 16 March 2020 02:13 -07, David Miller:
> 
>>> As a reminder, there are currently 3 APIs to specify a preferred device
>>> association which influences route lookups:
>>>
>>> 1. SO_BINDTODEVICE - sets sk_bound_dev_if and is the strongest binding
>>> (ie., can not be overridden),
>>>
>>> 2. IP_UNICAST_IF / IPV6_UNICAST_IF - sets uc_index / ucast_oif and is
>>> sticky for a socket, and
>>>
>>> 3. IP_PKTINFO / IPV6_PKTINFO - which is per message.
>>>
>>> The first, SO_BINDTODEVICE, requires root privileges. The last 2 do not
>>> require root privileges but only apply to raw and UDP sockets making TCP
>>> the outlier.
>>>
>>> Further, a downside to the last 2 is that they work for sendmsg only;
>>> there is no way to definitively match a response to the sending socket.
>>> The key point is that UDP and raw have multiple non-root APIs to dictate
>>> a preferred device for sending messages.
>>>
>>> Vincent's patch simplifies things quite a bit - allowing consistency
>>> across the protocols and directions - but without overriding any
>>> administrator settings (e.g., inherited bindings via ebpf programs).
>>
>> Understood, but I still wonder if this mis-match of privilege
>> requirements was by design or unintentional.

IP_UNICAST_IF and IPV6_UNICAST_IF were added for wine (76e21053b5bf3 and
c4062dfc425e9) specifically for use by non-root processes. It is not
clear in the commit message why only sendmsg was needed and why only
udp/raw. I have not touched wine in 15+ years, so no easy way for me to
look into how the response side worked. My guess is just relying on the
socket matching when sk_bound_dev_if is not set.

>>
>> Allowing arbitrary users to specify SO_BINDTODEVICE has broad and far
>> reaching consequences, so at a minimum if we are going to remove the
>> restriction we should at least discuss the implications.

Certainly. I brought up this need for non-privileged binding at netconf
a few years ago.[0] My thought at that time was to make IP_UNICAST_IF /
IPV6_UNICAST_IF work for TCP. What I like about Vincent's proposed
change is that it checks for an existing setting and does not override.

As I mentioned before, UDP and RAW both have APIs to get around the root
requirement for sends, and responses can match sockets for non-VRF
scenarios. So, the biggest change for this patch is to TCP (and VRF use
cases).

wrt TCP, allowing the setting for daemons narrows the scope of
bind(), so that should be ok. The daemon could, with additional work,
just ignore connection requests it does not believe come through the
preferred device, though routing changes could move an established
connection to a different interface. Overall, this should not be an
impact to servers.

TCP clients are the biggest question mark wrt evading an admin's
routing setup. I can't say I can make a persuasive argument here beyond
consistency - a concern for TCP connections but not UDP.

> 
> Without VRF, it's hard to build a case where a process could "evade" its
> setup using SO_BINDTODEVICE. It could be used to use alternative routing
> table when ip rules are using interfaces (which is not uncommon), but I
> think almost all such setups are using this to have some isolation in
> routing (not for local processes), something that could be replaced by
> VRF.
> 
> With VRF, removing the restriction allows a process to have more
> possibilities than previously if it has not been restricted. This is my
> use case: it is actually difficult to use VRF for anything else than
> routing because local processes may want to receive connections in a VRF
> and forward them to another one.
> 
> In summary, unless I am missing something, the main implication is when
> using VRF. Without VRF, no real change.
> 
> An alternative would be to use a sysctl to decide the behaviour.

That is consistent with other l3mdev/vrf features.

[0] https://lwn.net/Articles/719393/
Patch

diff --git a/net/core/sock.c b/net/core/sock.c
index 0fc8937a7ff4..e89c6148177b 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -574,7 +574,7 @@  static int sock_setbindtodevice_locked(struct sock *sk, int ifindex)
 
 	/* Sorry... */
 	ret = -EPERM;
-	if (!ns_capable(net->user_ns, CAP_NET_RAW))
+	if (sk->sk_bound_dev_if && !ns_capable(net->user_ns, CAP_NET_RAW))
 		goto out;
 
 	ret = -EINVAL;