[RFC,v2,2/4] net: enables interface option to skip IP

Message ID 1392433180-16052-3-git-send-email-mcgrof@do-not-panic.com
State RFC, archived
Delegated to: David Miller

Commit Message

Luis R. Rodriguez Feb. 15, 2014, 2:59 a.m. UTC
From: "Luis R. Rodriguez" <mcgrof@suse.com>

Some interfaces do not need to have any IPv4 or IPv6
addresses, so enable an option to specify this. One
example where this is observed is virtualization
backend interfaces, which just use the net_device
constructs to help with their respective frontends.

This should reduce boot time and complexity in
virtualization environments for each backend interface
while also avoiding triggering SLAAC and DAD, which are
simply pointless for this type of interface.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 include/uapi/linux/if.h | 1 +
 net/ipv4/devinet.c      | 3 +++
 net/ipv6/addrconf.c     | 6 ++++++
 3 files changed, 10 insertions(+)
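
The guard this patch adds can be illustrated outside the kernel. What follows is only a userspace mock, not kernel code: struct mock_net_device, MOCK_NETDEV_REGISTER and mock_inetdev_event() are hypothetical names invented for the illustration, and only the IFF_SKIP_IP value comes from the patch. It shows that a flagged device never gets per-device IPv4 state on registration, while an unflagged one does.

```c
/* Flag value as proposed in the patch. */
#define IFF_SKIP_IP 0x800000u

/* Hypothetical stand-in for the few struct net_device fields used here. */
struct mock_net_device {
	unsigned int priv_flags;
	int has_in_dev;		/* 1 once per-device IPv4 state exists */
};

/* Hypothetical stand-in for the NETDEV_REGISTER notifier event. */
enum { MOCK_NETDEV_REGISTER = 1 };

/*
 * Mimics the early exit added to inetdev_event(): a device flagged
 * with IFF_SKIP_IP is ignored before any IPv4 state is set up.
 */
int mock_inetdev_event(struct mock_net_device *dev, int event)
{
	if (dev->priv_flags & IFF_SKIP_IP)
		return 0;

	if (!dev->has_in_dev && event == MOCK_NETDEV_REGISTER)
		dev->has_in_dev = 1;

	return 0;
}
```

In the real patch the equivalent checks sit in inetdev_event(), ipv6_add_dev() and addrconf_notify(), so a driver would have to set the flag before calling register_netdev().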

Comments

Dan Williams Feb. 17, 2014, 8:23 p.m. UTC | #1
On Fri, 2014-02-14 at 18:59 -0800, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <mcgrof@suse.com>
> 
> Some interfaces do not need to have any IPv4 or IPv6
> addresses, so enable an option to specify this. One
> example where this is observed are virtualization
> backend interfaces which just use the net_device
> constructs to help with their respective frontends.
> 
> This should optimize boot time and complexity on
> virtualization environments for each backend interface
> while also avoiding triggering SLAAC and DAD, which is
> simply pointless for these type of interfaces.

Would it not be better/cleaner to use disable_ipv6 and then add a
disable_ipv4 sysctl, then use those with that interface?  The
IFF_SKIP_IP seems to duplicate at least part of what disable_ipv6 is
already doing.

Dan

> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
> Cc: James Morris <jmorris@namei.org>
> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
> ---
>  include/uapi/linux/if.h | 1 +
>  net/ipv4/devinet.c      | 3 +++
>  net/ipv6/addrconf.c     | 6 ++++++
>  3 files changed, 10 insertions(+)
> 
> diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h
> index 8d10382..566d856 100644
> --- a/include/uapi/linux/if.h
> +++ b/include/uapi/linux/if.h
> @@ -85,6 +85,7 @@
>  					 * change when it's running */
>  #define IFF_MACVLAN 0x200000		/* Macvlan device */
>  #define IFF_BRIDGE_NON_ROOT 0x400000    /* Don't consider for root bridge */
> +#define IFF_SKIP_IP	0x800000	/* Skip IPv4, IPv6 */
>  
> 
>  #define IF_GET_IFACE	0x0001		/* for querying only */
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index a1b5bcb..8e9ef07 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -1342,6 +1342,9 @@ static int inetdev_event(struct notifier_block *this, unsigned long event,
>  
>  	ASSERT_RTNL();
>  
> +	if (dev->priv_flags & IFF_SKIP_IP)
> +		goto out;
> +
>  	if (!in_dev) {
>  		if (event == NETDEV_REGISTER) {
>  			in_dev = inetdev_init(dev);
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index 4b6b720..57f58e3 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -314,6 +314,9 @@ static struct inet6_dev *ipv6_add_dev(struct net_device *dev)
>  
>  	ASSERT_RTNL();
>  
> +	if (dev->priv_flags & IFF_SKIP_IP)
> +		return NULL;
> +
>  	if (dev->mtu < IPV6_MIN_MTU)
>  		return NULL;
>  
> @@ -2749,6 +2752,9 @@ static int addrconf_notify(struct notifier_block *this, unsigned long event,
>  	int run_pending = 0;
>  	int err;
>  
> +	if (dev->priv_flags & IFF_SKIP_IP)
> +		return NOTIFY_OK;
> +
>  	switch (event) {
>  	case NETDEV_REGISTER:
>  		if (!idev && dev->mtu >= IPV6_MIN_MTU) {


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis R. Rodriguez Feb. 18, 2014, 9:19 p.m. UTC | #2
On Mon, Feb 17, 2014 at 12:23 PM, Dan Williams <dcbw@redhat.com> wrote:
> On Fri, 2014-02-14 at 18:59 -0800, Luis R. Rodriguez wrote:
>> From: "Luis R. Rodriguez" <mcgrof@suse.com>
>>
>> Some interfaces do not need to have any IPv4 or IPv6
>> addresses, so enable an option to specify this. One
>> example where this is observed are virtualization
>> backend interfaces which just use the net_device
>> constructs to help with their respective frontends.
>>
>> This should optimize boot time and complexity on
>> virtualization environments for each backend interface
>> while also avoiding triggering SLAAC and DAD, which is
>> simply pointless for these type of interfaces.
>
> Would it not be better/cleaner to use disable_ipv6 and then add a
> disable_ipv4 sysctl, then use those with that interface?

Sure, but note that both the disable_ipv6 and accept_dad sysctl
parameters are global. ipv4 and ipv6 interfaces are created upon
NETDEV_REGISTER, which will get triggered when a driver calls
register_netdev(). The goal of this patch was to enable an early
optimization for drivers that have no need ever for ipv4 or ipv6
interfaces.

Zoltan has noted some use cases for IPv4 or IPv6 addresses on
backends though, so this is no longer applicable as a
requirement. The ipv4 sysctl however still seems like a reasonable
approach to enable optimizations of the network in topologies where
it's known we won't need them, but we'd need to consider a much more
granular solution, not just global as it is now for disable_ipv6, and
we'd also have to figure out a clean way to do this to not incur the
cost of early address interface addition upon register_netdev().

Given that we have a use case for ipv4 and ipv6 addresses on
xen-netback we no longer have an immediate use case for such early
optimization primitives though, so I'll drop this.

> The IFF_SKIP_IP seems to duplicate at least part of what disable_ipv6 is
> already doing.

disable_ipv6 is global, the goal was to make this granular and skip
the cost upon early boot, but it's been clarified we don't need this.

   Luis
Stephen Hemminger Feb. 18, 2014, 9:42 p.m. UTC | #3
On Tue, 18 Feb 2014 13:19:15 -0800
"Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote:

> Sure, but note that both the disable_ipv6 and accept_dad sysctl
> parameters are global. ipv4 and ipv6 interfaces are created upon
> NETDEV_REGISTER, which will get triggered when a driver calls
> register_netdev(). The goal of this patch was to enable an early
> optimization for drivers that have no need ever for ipv4 or ipv6
> interfaces.

The trick with ipv6 is to register the device, then have userspace
do the ipv6 sysctl before bringing the device up.
Dan Williams Feb. 19, 2014, 4:45 p.m. UTC | #4
On Tue, 2014-02-18 at 13:19 -0800, Luis R. Rodriguez wrote:
> On Mon, Feb 17, 2014 at 12:23 PM, Dan Williams <dcbw@redhat.com> wrote:
> > On Fri, 2014-02-14 at 18:59 -0800, Luis R. Rodriguez wrote:
> >> From: "Luis R. Rodriguez" <mcgrof@suse.com>
> >>
> >> Some interfaces do not need to have any IPv4 or IPv6
> >> addresses, so enable an option to specify this. One
> >> example where this is observed are virtualization
> >> backend interfaces which just use the net_device
> >> constructs to help with their respective frontends.
> >>
> >> This should optimize boot time and complexity on
> >> virtualization environments for each backend interface
> >> while also avoiding triggering SLAAC and DAD, which is
> >> simply pointless for these type of interfaces.
> >
> > Would it not be better/cleaner to use disable_ipv6 and then add a
> > disable_ipv4 sysctl, then use those with that interface?
> 
> Sure, but note that both the disable_ipv6 and accept_dad sysctl
> parameters are global. ipv4 and ipv6 interfaces are created upon
> NETDEV_REGISTER, which will get triggered when a driver calls
> register_netdev(). The goal of this patch was to enable an early
> optimization for drivers that have no need ever for ipv4 or ipv6
> interfaces.

Each interface gets override sysctls too though, eg:

/proc/sys/net/ipv6/conf/enp0s25/disable_ipv6

which is the one I meant; you're obviously right that the global ones
aren't what you want here.  But the specific ones should be suitable?
If you set that on a per-interface basis, then you'll get EPERM or
something whenever you try to add IPv6 addresses or do IPv6 routing.
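
As a rough illustration of the per-interface knob, the value can be read back from userspace like this. It is only a sketch: disable_ipv6_path() and read_disable_ipv6() are hypothetical helper names, and the root directory is passed in as a parameter (on a real system it would be /proc/sys/net/ipv6/conf) so the path handling can be tried without touching /proc.

```c
#include <stdio.h>

/* Build "<root>/<ifname>/disable_ipv6"; returns 0, or -1 if buf is too small. */
int disable_ipv6_path(char *buf, size_t len, const char *root,
		      const char *ifname)
{
	int n = snprintf(buf, len, "%s/%s/disable_ipv6", root, ifname);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Returns 1 if IPv6 is disabled on ifname, 0 if enabled, -1 on error. */
int read_disable_ipv6(const char *root, const char *ifname)
{
	char path[256];
	int value;
	FILE *f;

	if (disable_ipv6_path(path, sizeof(path), root, ifname) != 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;	/* interface (or root directory) not present */

	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);

	return value < 0 ? -1 : (value != 0);
}
```

For example, read_disable_ipv6("/proc/sys/net/ipv6/conf", "enp0s25") would report the state of the interface named above, assuming it exists.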

> Zoltan has noted though some use cases of IPv4 or IPv6 addresses on
> backends though, as such this is no longer applicable as a
> requirement. The ipv4 sysctl however still seems like a reasonable
> approach to enable optimizations of the network in topologies where
> its known we won't need them but -- we'd need to consider a much more
> granular solution, not just global as it is now for disable_ipv6, and
> we'd also have to figure out a clean way to do this to not incur the
> cost of early address interface addition upon register_netdev().
> 
> Given that we have a use case for ipv4 and ipv6 addresses on
> xen-netback we no longer have an immediate use case for such early
> optimization primitives though, so I'll drop this.
> 
> > The IFF_SKIP_IP seems to duplicate at least part of what disable_ipv6 is
> > already doing.
> 
> disable_ipv6 is global, the goal was to make this granular and skip
> the cost upon early boot, but its been clarified we don't need this.

Like Stephen says, you need to make sure you set them before IFF_UP, but
beyond that, wouldn't the interface-specific sysctls work?

Dan

Luis R. Rodriguez Feb. 19, 2014, 5:13 p.m. UTC | #5
On Tue, Feb 18, 2014 at 1:42 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Tue, 18 Feb 2014 13:19:15 -0800
> "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote:
>
>> Sure, but note that both the disable_ipv6 and accept_dad sysctl
>> parameters are global. ipv4 and ipv6 interfaces are created upon
>> NETDEV_REGISTER, which will get triggered when a driver calls
>> register_netdev(). The goal of this patch was to enable an early
>> optimization for drivers that have no need ever for ipv4 or ipv6
>> interfaces.
>
> The trick with ipv6 is to register the device, then have userspace
> do the ipv6 sysctl before bringing the device up.

Nice, thanks!

  Luis
Luis R. Rodriguez Feb. 19, 2014, 5:20 p.m. UTC | #6
On Wed, Feb 19, 2014 at 8:45 AM, Dan Williams <dcbw@redhat.com> wrote:
> On Tue, 2014-02-18 at 13:19 -0800, Luis R. Rodriguez wrote:
>> On Mon, Feb 17, 2014 at 12:23 PM, Dan Williams <dcbw@redhat.com> wrote:
>> > On Fri, 2014-02-14 at 18:59 -0800, Luis R. Rodriguez wrote:
>> >> From: "Luis R. Rodriguez" <mcgrof@suse.com>
>> >>
>> >> Some interfaces do not need to have any IPv4 or IPv6
>> >> addresses, so enable an option to specify this. One
>> >> example where this is observed are virtualization
>> >> backend interfaces which just use the net_device
>> >> constructs to help with their respective frontends.
>> >>
>> >> This should optimize boot time and complexity on
>> >> virtualization environments for each backend interface
>> >> while also avoiding triggering SLAAC and DAD, which is
>> >> simply pointless for these type of interfaces.
>> >
>> > Would it not be better/cleaner to use disable_ipv6 and then add a
>> > disable_ipv4 sysctl, then use those with that interface?
>>
>> Sure, but note that both the disable_ipv6 and accept_dad sysctl
>> parameters are global. ipv4 and ipv6 interfaces are created upon
>> NETDEV_REGISTER, which will get triggered when a driver calls
>> register_netdev(). The goal of this patch was to enable an early
>> optimization for drivers that have no need ever for ipv4 or ipv6
>> interfaces.
>
> Each interface gets override sysctls too though, eg:
>
> /proc/sys/net/ipv6/conf/enp0s25/disable_ipv6

I hadn't seen those, thanks!

> which is the one I meant; you're obviously right that the global ones
> aren't what you want here.  But the specific ones should be suitable?

Under the approach Stephen mentioned by first ensuring the interface
is down yes. There's one use case I can consider to still want the
patch though, more on that below.

> If you set that on a per-interface basis, then you'll get EPERM or
> something whenever you try to add IPv6 addresses or do IPv6 routing.

Neat, thanks.

>> Zoltan has noted though some use cases of IPv4 or IPv6 addresses on
>> backends though, as such this is no longer applicable as a
>> requirement. The ipv4 sysctl however still seems like a reasonable
>> approach to enable optimizations of the network in topologies where
>> its known we won't need them but -- we'd need to consider a much more
>> granular solution, not just global as it is now for disable_ipv6, and
>> we'd also have to figure out a clean way to do this to not incur the
>> cost of early address interface addition upon register_netdev().
>>
>> Given that we have a use case for ipv4 and ipv6 addresses on
>> xen-netback we no longer have an immediate use case for such early
>> optimization primitives though, so I'll drop this.
>>
>> > The IFF_SKIP_IP seems to duplicate at least part of what disable_ipv6 is
>> > already doing.
>>
>> disable_ipv6 is global, the goal was to make this granular and skip
>> the cost upon early boot, but its been clarified we don't need this.
>
> Like Stephen says, you need to make sure you set them before IFF_UP, but
> beyond that, wouldn't the interface-specific sysctls work?

Yeah that'll do it, unless there is a measurable run time benefit
to never even adding these in the first place. Consider a host with tons
of guests, not sure how many is 'a lot' these days. One would have to
measure how much this reduces the amount of time it takes to boot these
up. As discussed in the other threads though there *are* some use cases
for assigning IPv4 or IPv6 addresses to the backend interfaces:
routing them (although it's unclear to me if iptables can be used
instead, Zoltan?). So at least for now there is no clear requirement to
remove these interfaces or not have them at all. The boot time cost
savings should be considered though if this is ultimately desirable. I
saw tons of timers and events that'd get triggered with any IPv4 or
IPv6 interface lying around.

  Luis
Zoltan Kiss Feb. 19, 2014, 7:13 p.m. UTC | #7
On 19/02/14 17:20, Luis R. Rodriguez wrote:
> On Wed, Feb 19, 2014 at 8:45 AM, Dan Williams <dcbw@redhat.com> wrote:
>> On Tue, 2014-02-18 at 13:19 -0800, Luis R. Rodriguez wrote:
>>> On Mon, Feb 17, 2014 at 12:23 PM, Dan Williams <dcbw@redhat.com> wrote:
>>>> On Fri, 2014-02-14 at 18:59 -0800, Luis R. Rodriguez wrote:
>>>>> From: "Luis R. Rodriguez" <mcgrof@suse.com>
>>>>>
>>>>> Some interfaces do not need to have any IPv4 or IPv6
>>>>> addresses, so enable an option to specify this. One
>>>>> example where this is observed are virtualization
>>>>> backend interfaces which just use the net_device
>>>>> constructs to help with their respective frontends.
>>>>>
>>>>> This should optimize boot time and complexity on
>>>>> virtualization environments for each backend interface
>>>>> while also avoiding triggering SLAAC and DAD, which is
>>>>> simply pointless for these type of interfaces.
>>>>
>>>> Would it not be better/cleaner to use disable_ipv6 and then add a
>>>> disable_ipv4 sysctl, then use those with that interface?
>>>
>>> Sure, but note that both the disable_ipv6 and accept_dad sysctl
>>> parameters are global. ipv4 and ipv6 interfaces are created upon
>>> NETDEV_REGISTER, which will get triggered when a driver calls
>>> register_netdev(). The goal of this patch was to enable an early
>>> optimization for drivers that have no need ever for ipv4 or ipv6
>>> interfaces.
>>
>> Each interface gets override sysctls too though, eg:
>>
>> /proc/sys/net/ipv6/conf/enp0s25/disable_ipv6
>
> I hadn't seen those, thanks!
>
>> which is the one I meant; you're obviously right that the global ones
>> aren't what you want here.  But the specific ones should be suitable?
>
> Under the approach Stephen mentioned by first ensuring the interface
> is down yes. There's one use case I can consider to still want the
> patch though, more on that below.
>
>> If you set that on a per-interface basis, then you'll get EPERM or
>> something whenever you try to add IPv6 addresses or do IPv6 routing.
>
> Neat, thanks.
>
>>> Zoltan has noted though some use cases of IPv4 or IPv6 addresses on
>>> backends though, as such this is no longer applicable as a
>>> requirement. The ipv4 sysctl however still seems like a reasonable
>>> approach to enable optimizations of the network in topologies where
>>> its known we won't need them but -- we'd need to consider a much more
>>> granular solution, not just global as it is now for disable_ipv6, and
>>> we'd also have to figure out a clean way to do this to not incur the
>>> cost of early address interface addition upon register_netdev().
>>>
>>> Given that we have a use case for ipv4 and ipv6 addresses on
>>> xen-netback we no longer have an immediate use case for such early
>>> optimization primitives though, so I'll drop this.
>>>
>>>> The IFF_SKIP_IP seems to duplicate at least part of what disable_ipv6 is
>>>> already doing.
>>>
>>> disable_ipv6 is global, the goal was to make this granular and skip
>>> the cost upon early boot, but its been clarified we don't need this.
>>
>> Like Stephen says, you need to make sure you set them before IFF_UP, but
>> beyond that, wouldn't the interface-specific sysctls work?
>
> Yeah that'll do it, unless there is a measurable run time benefit cost
> to never even add these in the first place. Consider a host with tons
> of guests, not sure how many is 'a lot' these days. One would have to
> measure the cost of reducing the amount of time it takes to boot these
> up. As discussed in the other threads though there *is* some use cases
> of assigning IPv4 or IPv6 addresses to the backend interfaces though:
> routing them (although its unclear to me if iptables can be used
> instead, Zoltan?).

Not with OVS, it steals the packet before netfilter hooks.

Zoli
Dan Williams Feb. 20, 2014, 12:56 a.m. UTC | #8
On Wed, 2014-02-19 at 09:20 -0800, Luis R. Rodriguez wrote:
> On Wed, Feb 19, 2014 at 8:45 AM, Dan Williams <dcbw@redhat.com> wrote:
> > On Tue, 2014-02-18 at 13:19 -0800, Luis R. Rodriguez wrote:
> >> On Mon, Feb 17, 2014 at 12:23 PM, Dan Williams <dcbw@redhat.com> wrote:
> >> > On Fri, 2014-02-14 at 18:59 -0800, Luis R. Rodriguez wrote:
> >> >> From: "Luis R. Rodriguez" <mcgrof@suse.com>
> >> >>
> >> >> Some interfaces do not need to have any IPv4 or IPv6
> >> >> addresses, so enable an option to specify this. One
> >> >> example where this is observed are virtualization
> >> >> backend interfaces which just use the net_device
> >> >> constructs to help with their respective frontends.
> >> >>
> >> >> This should optimize boot time and complexity on
> >> >> virtualization environments for each backend interface
> >> >> while also avoiding triggering SLAAC and DAD, which is
> >> >> simply pointless for these type of interfaces.
> >> >
> >> > Would it not be better/cleaner to use disable_ipv6 and then add a
> >> > disable_ipv4 sysctl, then use those with that interface?
> >>
> >> Sure, but note that both the disable_ipv6 and accept_dad sysctl
> >> parameters are global. ipv4 and ipv6 interfaces are created upon
> >> NETDEV_REGISTER, which will get triggered when a driver calls
> >> register_netdev(). The goal of this patch was to enable an early
> >> optimization for drivers that have no need ever for ipv4 or ipv6
> >> interfaces.
> >
> > Each interface gets override sysctls too though, eg:
> >
> > /proc/sys/net/ipv6/conf/enp0s25/disable_ipv6
> 
> I hadn't seen those, thanks!

Note that there isn't yet a disable_ipv4 knob though, I was
perhaps-too-subtly trying to get you to send a patch for it, since I can
use it too :)

Dan

> > which is the one I meant; you're obviously right that the global ones
> > aren't what you want here.  But the specific ones should be suitable?
> 
> Under the approach Stephen mentioned by first ensuring the interface
> is down yes. There's one use case I can consider to still want the
> patch though, more on that below.
> 
> > If you set that on a per-interface basis, then you'll get EPERM or
> > something whenever you try to add IPv6 addresses or do IPv6 routing.
> 
> Neat, thanks.
> 
> >> Zoltan has noted though some use cases of IPv4 or IPv6 addresses on
> >> backends though, as such this is no longer applicable as a
> >> requirement. The ipv4 sysctl however still seems like a reasonable
> >> approach to enable optimizations of the network in topologies where
> >> its known we won't need them but -- we'd need to consider a much more
> >> granular solution, not just global as it is now for disable_ipv6, and
> >> we'd also have to figure out a clean way to do this to not incur the
> >> cost of early address interface addition upon register_netdev().
> >>
> >> Given that we have a use case for ipv4 and ipv6 addresses on
> >> xen-netback we no longer have an immediate use case for such early
> >> optimization primitives though, so I'll drop this.
> >>
> >> > The IFF_SKIP_IP seems to duplicate at least part of what disable_ipv6 is
> >> > already doing.
> >>
> >> disable_ipv6 is global, the goal was to make this granular and skip
> >> the cost upon early boot, but its been clarified we don't need this.
> >
> > Like Stephen says, you need to make sure you set them before IFF_UP, but
> > beyond that, wouldn't the interface-specific sysctls work?
> 
> Yeah that'll do it, unless there is a measurable run time benefit cost
> to never even add these in the first place. Consider a host with tons
> of guests, not sure how many is 'a lot' these days. One would have to
> measure the cost of reducing the amount of time it takes to boot these
> up. As discussed in the other threads though there *is* some use cases
> of assigning IPv4 or IPv6 addresses to the backend interfaces though:
> routing them (although its unclear to me if iptables can be used
> instead, Zoltan?). So at least now there no clear requirement to
> remove these interfaces or not have them at all. The boot time cost
> savings should be considered though if this is ultimately desirable. I
> saw tons of timers and events that'd get triggered with any IPv4 or
> IPv6 interface laying around.
> 
>   Luis
Hannes Frederic Sowa Feb. 20, 2014, 12:58 a.m. UTC | #9
On Wed, Feb 19, 2014 at 06:56:17PM -0600, Dan Williams wrote:
> Note that there isn't yet a disable_ipv4 knob though, I was
> perhaps-too-subtly trying to get you to send a patch for it, since I can
> use it too :)

Do you plan to implement
<http://datatracker.ietf.org/doc/draft-ietf-sunset4-noipv4/>?

;)

Dan Williams Feb. 20, 2014, 1:02 a.m. UTC | #10
On Thu, 2014-02-20 at 01:58 +0100, Hannes Frederic Sowa wrote:
> On Wed, Feb 19, 2014 at 06:56:17PM -0600, Dan Williams wrote:
> > Note that there isn't yet a disable_ipv4 knob though, I was
> > perhaps-too-subtly trying to get you to send a patch for it, since I can
> > use it too :)
> 
> Do you plan to implement
> <http://datatracker.ietf.org/doc/draft-ietf-sunset4-noipv4/>?
> 
> ;)

Well, not specifically, but with NetworkManager we do have a "disable
IPv4" method for IPv4, which now just doesn't do any kind of IPv4, but
obviously doesn't disable IPv4 entirely because that's not possible.  I
was only thinking that it would be nice to actually guarantee that IPv4
was disabled, just like disable_ipv6 does.

But we could certainly implement that draft if a patch shows up or if it
bubbled up the priority stack :)

Dan

Luis R. Rodriguez Feb. 20, 2014, 8:31 p.m. UTC | #11
On Wed, Feb 19, 2014 at 4:56 PM, Dan Williams <dcbw@redhat.com> wrote:
> Note that there isn't yet a disable_ipv4 knob though, I was
> perhaps-too-subtly trying to get you to send a patch for it, since I can
> use it too :)

Sure, can you describe the use case a little better? I could use
that for the commit log. My only current use case was the xen-netback
case but Zoltan has noted a few cases where an IPv4 or IPv6 address
*could* be used on the backend interfaces (which I'll still poke at, as
it's unclear to me why they have 'em).

  Luis
Luis R. Rodriguez Feb. 20, 2014, 8:39 p.m. UTC | #12
On Wed, Feb 19, 2014 at 11:13 AM, Zoltan Kiss <zoltan.kiss@citrix.com> wrote:
> On 19/02/14 17:20, Luis R. Rodriguez wrote:
>>>> On 19/02/14 17:20, Luis R. Rodriguez also wrote:
>>>> Zoltan has noted though some use cases of IPv4 or IPv6 addresses on
>>>> backends though <...>
>>
>> As discussed in the other threads though there *is* some use cases
>> of assigning IPv4 or IPv6 addresses to the backend interfaces though:
>> routing them (although its unclear to me if iptables can be used
>> instead, Zoltan?).
>
> Not with OVS, it steals the packet before netfilter hooks.

Got it, thanks! Can't the route be added using a front-end IP address
instead on the host though? I just tried that on a Xen system and it
seems to work. Perhaps I'm not understanding the exact topology in the
routing case. So in my case I have the backend without any IPv4 or
IPv6 interfaces, the guest has IPv4, IPv6 addresses and even a TUN for
VPN and I can create routes on the host to the front end by not using
the backend device name but instead using the front-end target IP.

  Luis
Zoltan Kiss Feb. 21, 2014, 1:02 p.m. UTC | #13
On 20/02/14 20:39, Luis R. Rodriguez wrote:
> On Wed, Feb 19, 2014 at 11:13 AM, Zoltan Kiss <zoltan.kiss@citrix.com> wrote:
>> On 19/02/14 17:20, Luis R. Rodriguez wrote:
>>>>> On 19/02/14 17:20, Luis R. Rodriguez also wrote:
>>>>> Zoltan has noted though some use cases of IPv4 or IPv6 addresses on
>>>>> backends though <...>
>>>
>>> As discussed in the other threads though there *is* some use cases
>>> of assigning IPv4 or IPv6 addresses to the backend interfaces though:
>>> routing them (although its unclear to me if iptables can be used
>>> instead, Zoltan?).
>>
>> Not with OVS, it steals the packet before netfilter hooks.
>
> Got it, thanks! Can't the route be added using a front-end IP address
> instead on the host though ? I just tried that on a Xen system and it
> seems to work. Perhaps I'm not understand the exact topology on the
> routing case. So in my case I have the backend without any IPv4 or
> IPv6 interfaces, the guest has IPv4, IPv6 addresses and even a TUN for
> VPN and I can create routes on the host to the front end by not using
> the backend device name but instead using the front-end target IP.
Check how the current Xen scripts do routed networking:

http://wiki.xen.org/wiki/Xen_Networking#Associating_routes_with_virtual_devices

Note, there are no bridges involved here! As the above page says, the
backend has to have an IP address, though maybe that's not true anymore. I'm
not too familiar with this setup either, I've used it only once.

Zoli

Luis R. Rodriguez Feb. 22, 2014, 1:40 a.m. UTC | #14
On Fri, Feb 21, 2014 at 5:02 AM, Zoltan Kiss <zoltan.kiss@citrix.com> wrote:
> Check this how current Xen scripts does routed networking:
>
> http://wiki.xen.org/wiki/Xen_Networking#Associating_routes_with_virtual_devices
>
> Note, there are no bridges involved here! As the above page says, the
> backend has to have IP address, maybe it's not true anymore. I'm not too
> familiar with this setup too, I've used it only once.

Thanks, in such a case I do think actually adding a bridge, adding the
backend interface to it, and then adding a route to the front end IP
would suffice to cover that case, but I'm pretty limited with test
devices so would appreciate if someone with a setup like that can test
it as an alternative. Please recall that the possible gains here
should be pretty significant in terms of simplification. And of
course, I still also haven't had time / systems to test the NAT
case...

  Luis
Dan Williams Feb. 24, 2014, 6:22 p.m. UTC | #15
On Thu, 2014-02-20 at 12:31 -0800, Luis R. Rodriguez wrote:
> On Wed, Feb 19, 2014 at 4:56 PM, Dan Williams <dcbw@redhat.com> wrote:
> > Note that there isn't yet a disable_ipv4 knob though, I was
> > perhaps-too-subtly trying to get you to send a patch for it, since I can
> > use it too :)
> 
> Sure, can you describe a little better the use case, as I could use
> that for the commit log. My only current use case was the xen-netback
> case but Zoltan has noted a few cases where an IPv4 or IPv6 address
> *could* be used on the backend interfaces (which I'll still poke as
> its unclear to me why they have 'em).

My use-case would simply be to have an analogue for the disable_ipv6
case.  In the future I expect more people will want to disable IPv4 as
they move to IPv6.  If you don't have something like disable_ipv4, then
there's no way to ensure that some random program or something doesn't
set up IPv4 stuff that you don't want.

Same thing for IPv6; some people really don't want IPv6 enabled on an
interface no matter what; they don't want an IPv6LL address assigned,
they don't want kernel SLAAC, they want to ensure that *nothing*
IPv6-related gets done for that interface.  The same can be true for
IPv4, but we don't have a way of doing that right now.

Dan

Luis R. Rodriguez Feb. 24, 2014, 8:33 p.m. UTC | #16
On Mon, Feb 24, 2014 at 10:22 AM, Dan Williams <dcbw@redhat.com> wrote:
> My use-case would simply be to have an analogue for the disable_ipv6
> case.  In the future I expect more people will want to disable IPv4 as
> they move to IPv6.  If you don't have something like disable_ipv4, then
> there's no way to ensure that some random program or something doesn't
> set up IPv4 stuff that you don't want.
>
> Same thing for IPv6; some people really don't want IPv6 enabled on an
> interface no matter what; they don't want an IPv6LL address assigned,
> they don't want kernel SLAAC, they want to ensure that *nothing*
> IPv6-related gets done for that interface.  The same can be true for
> IPv4, but we don't have a way of doing that right now.

I'll add this to my queue.

  Luis
David Miller Feb. 24, 2014, 11:04 p.m. UTC | #17
From: Dan Williams <dcbw@redhat.com>
Date: Mon, 24 Feb 2014 12:22:00 -0600

> In the future I expect more people will want to disable IPv4 as
> they move to IPv6.

I definitely don't.

I've been lightly following this conversation and I have to say
a few things.

disable_ipv6 was added because people wanted to make sure their
machines didn't generate any ipv6 traffic because "ipv6 is not
mature", "we don't have our firewalls configured to handle that
kind of traffic" etc.

None of these things apply to ipv4.

And if you think people will go to ipv6 only, you are dreaming.

Name a provider of a major web site who will go to strictly only
providing an ipv6 facing site?

Only an idiot who wanted to lose significant numbers of page views
and traffic would do that, so ipv4 based connectivity will be
universally necessary forever.

I think disable_ipv4 is absolutely a non-starter.

Ben Hutchings Feb. 25, 2014, 12:02 a.m. UTC | #18
On Mon, 2014-02-24 at 18:04 -0500, David Miller wrote:
> From: Dan Williams <dcbw@redhat.com>
> Date: Mon, 24 Feb 2014 12:22:00 -0600
> 
> > In the future I expect more people will want to disable IPv4 as
> > they move to IPv6.
> 
> I definitely don't.
> 
> I've been lightly following this conversation and I have to say
> a few things.
> 
> disable_ipv6 was added because people wanted to make sure their
> machines didn't generate any ipv6 traffic because "ipv6 is not
> mature", "we don't have our firewalls configured to handle that
> kind of traffic" etc.
> 
> None of these things apply to ipv4.
> 
> And if you think people will go to ipv6 only, you are dreaming.
>
> Name a provider of a major web site who will go to strictly only
> providing an ipv6 facing site?
>
> Only an idiot who wanted to lose significant numbers of page views
> and traffic would do that,

That's obviously true for public-facing servers, but that doesn't mean
it's not useful to anyone.

> so ipv4 based connectivity will be universally necessary forever.

You can run an internal network, or access network, as v6-only with
NAT64 and DNS64 at the border.  I believe some mobile networks are doing
this; it was also done on the main FOSDEM wireless network this year.

Ben.

> I think disable_ipv4 is absolutely a non-starter.
David Miller Feb. 25, 2014, 12:12 a.m. UTC | #19
From: Ben Hutchings <ben@decadent.org.uk>
Date: Tue, 25 Feb 2014 00:02:00 +0000

> You can run an internal network, or access network, as v6-only with
> NAT64 and DNS64 at the border.  I believe some mobile networks are doing
> this; it was also done on the main FOSDEM wireless network this year.

This seems to be bloating up the networking headers of the internal
network, for what purpose?

For mobile that's doubly inadvisable.

Ben Hutchings Feb. 25, 2014, 2:01 a.m. UTC | #20
On Mon, 2014-02-24 at 19:12 -0500, David Miller wrote:
> From: Ben Hutchings <ben@decadent.org.uk>
> Date: Tue, 25 Feb 2014 00:02:00 +0000
> 
> > You can run an internal network, or access network, as v6-only with
> > NAT64 and DNS64 at the border.  I believe some mobile networks are doing
> > this; it was also done on the main FOSDEM wireless network this year.
> 
> This seems to be bloating up the networking headers of the internal
> network, for what purpose?
> 
> For mobile that's doubly inadvisable.

I don't know what the reasoning is for the mobile network operators.
They're forced to do NAT for v4 somewhere, and maybe v6-only makes the
access network easier to manage.

I doubt the extra header length hurts that much on a 3G or 4G network.
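
To put a rough number on that, here is a small sketch. It assumes a
minimal 20-byte IPv4 header (no options) and the fixed 40-byte IPv6
header (no extension headers), and expresses the extra bytes as a share
of the packet size. The function name is made up for illustration.

```c
/* Header sizes in bytes: IPv4 without options, IPv6 without
 * extension headers. */
#define IPV4_HDR_MIN 20
#define IPV6_HDR     40

/* Extra bytes IPv6 adds per packet, in parts per thousand of the
 * given packet size (integer arithmetic, rounded down).
 * packet_bytes must be non-zero. */
unsigned int v6_extra_permille(unsigned int packet_bytes)
{
	return (IPV6_HDR - IPV4_HDR_MIN) * 1000u / packet_bytes;
}
```

For a 1500-byte packet this comes to 13 per thousand, so roughly 1.3%;
the share grows for small packets.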

Ben.
Hannes Frederic Sowa Feb. 25, 2014, 2:23 a.m. UTC | #21
On Tue, Feb 25, 2014 at 02:01:59AM +0000, Ben Hutchings wrote:
> On Mon, 2014-02-24 at 19:12 -0500, David Miller wrote:
> > From: Ben Hutchings <ben@decadent.org.uk>
> > Date: Tue, 25 Feb 2014 00:02:00 +0000
> > 
> > > You can run an internal network, or access network, as v6-only with
> > > NAT64 and DNS64 at the border.  I believe some mobile networks are doing
> > > this; it was also done on the main FOSDEM wireless network this year.
> > 
> > This seems to be bloating up the networking headers of the internal
> > network, for what purpose?
> > 
> > For mobile that's doubly inadvisable.
> 
> I don't know what the reasoning is for the mobile network operators.
> They're forced to do NAT for v4 somewhere, and maybe v6-only makes the
> access network easier to manage.

Yes, it seems the way to go:
<http://www.dslreports.com/shownews/TMobile-Goes-IPv6-Only-on-Android-44-Devices-126506>

I can't comment on the 464xlat that much because I haven't looked at an
implementation yet, but it can very well be the case it still needs IPv4
on the outgoing interface, I don't know (from the spec pov it doesn't
look like that).

Greetings,

  Hannes

Paul Marks Feb. 25, 2014, 7:50 p.m. UTC | #22
On Mon, Feb 24, 2014 at 4:12 PM, David Miller <davem@davemloft.net> wrote:
> From: Ben Hutchings <ben@decadent.org.uk>
> Date: Tue, 25 Feb 2014 00:02:00 +0000
>
>> You can run an internal network, or access network, as v6-only with
>> NAT64 and DNS64 at the border.  I believe some mobile networks are doing
>> this; it was also done on the main FOSDEM wireless network this year.
>
> This seems to be bloating up the networking headers of the internal
> network, for what purpose?

The primary purpose of IPv6 is to bloat up network headers, because
the IPv4 headers were too small to address all the endpoints.

NAT64 is an intriguing solution to the problem of "I have too many
customers for 10.0.0.0/8".  Here are some slides on the topic from
this week's APNIC conference:
https://conference.apnic.net/data/37/464xlat-apricot-2014_1393236641.pdf

A kernel with disable_ipv4 would be fairly usable on such a network
today, as long as you avoid AF_INET-specific apps.
Dan Williams Feb. 25, 2014, 9:07 p.m. UTC | #23
On Mon, 2014-02-24 at 18:04 -0500, David Miller wrote:
> From: Dan Williams <dcbw@redhat.com>
> Date: Mon, 24 Feb 2014 12:22:00 -0600
> 
> > In the future I expect more people will want to disable IPv4 as
> > they move to IPv6.
> 
> I definitely don't.
> 
> I've been lightly following this conversation and I have to say
> a few things.
> 
> disable_ipv6 was added because people wanted to make sure their
> machines didn't generate any ipv6 traffic because "ipv6 is not
> mature", "we don't have our firewalls configured to handle that
> kind of traffic" etc.
> 
> None of these things apply to ipv4.
> 
> And if you think people will go to ipv6 only, you are dreaming.
> 
> Name a provider of a major web site who will go to strictly only
> providing an ipv6 facing site?
> 
> Only an idiot who wanted to lose significant numbers of page views
> and traffic would do that, so ipv4 based connectivity will be
> universally necessary forever.
> 
> I think disable_ipv4 is absolutely a non-starter.

Also, disable_ipv4 signals *intent*, which is distinct from current
state.

Does an interface without an IPv4 address mean that the user wished it
not to have one?

Or does it mean that DHCP hasn't started yet (but is supposed to), or
failed, or something hasn't gotten around to assigning an address yet?

disable_ipv4 lets you distinguish between these two cases, the same way
disable_ipv6 does.
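
A minimal sketch of how a userspace tool might use the existing
disable_ipv6 knob to tell intent from state. The enum and both function
names are hypothetical, made up for illustration; only the
/proc/sys/net/ipv6/conf/<ifname>/disable_ipv6 path is an existing
interface. The address count is assumed to come from elsewhere (e.g.
netlink).

```c
#include <stdio.h>

/* Hypothetical classification, not a kernel or library API. */
enum v6_status {
	V6_DISABLED_BY_INTENT,	/* user asked for no IPv6 at all */
	V6_NOT_CONFIGURED_YET,	/* allowed, but no address (yet) */
	V6_CONFIGURED		/* allowed and at least one address */
};

/* Combine the knob value with the number of addresses on the link. */
enum v6_status classify_v6(int disable_ipv6, int n_addrs)
{
	if (disable_ipv6)
		return V6_DISABLED_BY_INTENT;
	return n_addrs > 0 ? V6_CONFIGURED : V6_NOT_CONFIGURED_YET;
}

/* Read the per-interface disable_ipv6 sysctl through procfs.
 * Returns the value read, or -1 on any error. */
int read_disable_ipv6(const char *ifname)
{
	char path[128];
	int val = -1;
	int n;
	FILE *f;

	n = snprintf(path, sizeof(path),
		     "/proc/sys/net/ipv6/conf/%s/disable_ipv6", ifname);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}
```

Without such a knob for IPv4, the first two cases look identical from
the outside: an interface that simply has no address.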

Dan


David Miller Feb. 25, 2014, 9:18 p.m. UTC | #24
From: Dan Williams <dcbw@redhat.com>
Date: Tue, 25 Feb 2014 15:07:00 -0600

> Also, disable_ipv4 signals *intent*, which is distinct from current
> state.
> 
> Does an interface without an IPv4 address mean that the user wished it
> not to have one?
> 
> Or does it mean that DHCP hasn't started yet (but is supposed to), or
> failed, or something hasn't gotten around to assigning an address yet?
> 
> disable_ipv4 lets you distinguish between these two cases, the same way
> disable_ipv6 does.

Intent only matters on the kernel side if the kernel automatically
assigns addresses to interfaces which have been brought up like ipv6
does.

Since it does not do this for ipv4, this can be handled entirely in
userspace.

It is not a valid argument to say that a rogue dhcp might run on
the machine and configure an ipv4 address.  That's the admin's
responsibility, and still a user side problem.  A "rogue" program
could just as equally turn the theoretical disable_ipv4 off too.

Hannes Frederic Sowa Feb. 26, 2014, 1:29 a.m. UTC | #25
On Tue, Feb 25, 2014 at 04:18:17PM -0500, David Miller wrote:
> From: Dan Williams <dcbw@redhat.com>
> Date: Tue, 25 Feb 2014 15:07:00 -0600
> 
> > Also, disable_ipv4 signals *intent*, which is distinct from current
> > state.
> > 
> > Does an interface without an IPv4 address mean that the user wished it
> > not to have one?
> > 
> > Or does it mean that DHCP hasn't started yet (but is supposed to), or
> > failed, or something hasn't gotten around to assigning an address yet?
> > 
> > disable_ipv4 lets you distinguish between these two cases, the same way
> > disable_ipv6 does.
> 
> Intent only matters on the kernel side if the kernel automatically
> assigns addresses to interfaces which have been brought up like ipv6
> does.
> 
> Since it does not do this for ipv4, this can be handled entirely in
> userspace.
> 
> It is not a valid argument to say that a rogue dhcp might run on
> the machine and configure an ipv4 address.  That's the admin's
> responsibility, and still a user side problem.  A "rogue" program
> could just as equally turn the theoretical disable_ipv4 off too.

Week end model strikes again. :)

Currently one would need to set arp_filter and arp_ignore and have no
ip address on the interface to isolate it from the ipv4 network.

IFF_NOARP is of no use here as it also disables neighbour discovery.

I am not sure we completely tear down igmp processing on that interface
if no ip address is available. Maybe there are some special cases with
forwarding, too.

Such a "silent" mode could come in handy for intrusion detection systems
where one wants to ensure that no ip processing takes place, but it could
also be realized with nftables/netfilter/arpfilter, I think.
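
For reference, a rough userspace sketch of the arp_filter/arp_ignore
part mentioned above. The helper names are made up, and the values
written (arp_filter=1, arp_ignore=8, the latter meaning "do not reply
for all local addresses") are my assumption of a suitable setting, not
something tested here. It needs root and does nothing about igmp or
forwarding.

```c
#include <stdio.h>

/* Build "/proc/sys/net/ipv4/conf/<ifname>/<knob>" into buf.
 * Returns 0 on success, -1 if the path does not fit. */
int conf_path(char *buf, size_t len, const char *ifname, const char *knob)
{
	int n = snprintf(buf, len, "/proc/sys/net/ipv4/conf/%s/%s",
			 ifname, knob);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write val to one per-interface IPv4 knob. Returns 0 or -1. */
int set_knob(const char *ifname, const char *knob, const char *val)
{
	char path[128];
	FILE *f;
	int ok;

	if (conf_path(path, sizeof(path), ifname, knob) < 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(val, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Assumed values, see the note above the code. */
int quiet_arp(const char *ifname)
{
	if (set_knob(ifname, "arp_filter", "1") < 0)
		return -1;
	return set_knob(ifname, "arp_ignore", "8");
}
```

Calling quiet_arp("eth0") on an interface without an IPv4 address would
approximate the isolation described above, with "eth0" standing in for
the real interface name.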

Bye,

  Hannes


Patch

diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h
index 8d10382..566d856 100644
--- a/include/uapi/linux/if.h
+++ b/include/uapi/linux/if.h
@@ -85,6 +85,7 @@ 
 					 * change when it's running */
 #define IFF_MACVLAN 0x200000		/* Macvlan device */
 #define IFF_BRIDGE_NON_ROOT 0x400000    /* Don't consider for root bridge */
+#define IFF_SKIP_IP	0x800000	/* Skip IPv4, IPv6 */
 
 
 #define IF_GET_IFACE	0x0001		/* for querying only */
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index a1b5bcb..8e9ef07 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -1342,6 +1342,9 @@  static int inetdev_event(struct notifier_block *this, unsigned long event,
 
 	ASSERT_RTNL();
 
+	if (dev->priv_flags & IFF_SKIP_IP)
+		goto out;
+
 	if (!in_dev) {
 		if (event == NETDEV_REGISTER) {
 			in_dev = inetdev_init(dev);
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 4b6b720..57f58e3 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -314,6 +314,9 @@  static struct inet6_dev *ipv6_add_dev(struct net_device *dev)
 
 	ASSERT_RTNL();
 
+	if (dev->priv_flags & IFF_SKIP_IP)
+		return NULL;
+
 	if (dev->mtu < IPV6_MIN_MTU)
 		return NULL;
 
@@ -2749,6 +2752,9 @@  static int addrconf_notify(struct notifier_block *this, unsigned long event,
 	int run_pending = 0;
 	int err;
 
+	if (dev->priv_flags & IFF_SKIP_IP)
+		return NOTIFY_OK;
+
 	switch (event) {
 	case NETDEV_REGISTER:
 		if (!idev && dev->mtu >= IPV6_MIN_MTU) {