[PATCHv2,net] net: restore lro after device detached from bridge

Message ID 54CEDEDC.7060507@gmail.com
State RFC, archived
Delegated to: David Miller

Commit Message

FengYu LeiDian Feb. 2, 2015, 2:20 a.m. UTC
On 2015-01-31 04:48, Alexander Duyck wrote:
> On 01/30/2015 04:33 AM, Fan Du wrote:
>> Either detaching a device from bridge or switching a device
>> out of FORWARDING state, the original lro feature should
>> possibly be enabled for good reason, e.g. hw feature like
>> receive side coalescing could come into play.
>>
>> BEFORE:
>> echo 1 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k ens806f0 | grep large
>> large-receive-offload: off
>>
>> echo 0 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k ens806f0 | grep large
>> large-receive-offload: off
>>
>> AFTER:
>> echo 1 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k ens806f0 | grep large
>> large-receive-offload: off
>>
>> echo 0 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k ens806f0 | grep large
>> large-receive-offload: on
>>
>> Signed-off-by: Fan Du <fan.du@intel.com>
>> Fixes: 0187bdfb0567 ("net: Disable LRO on devices that are forwarding")
>

> First off this isn't a "fix".  This is going to likely break more than
> it fixes.  The main reason why LRO is disabled is because it can cause
> more harm then it helps.  Since GRO is available we should err on the
> side of caution since enabling LRO/RSC can have undesirable side effects
> in a number of cases.

I think you are talking about bad scenarios when the net device is attached to a bridge.
Then what good reason is there for the user to pay extra CPU power for GRO, instead
of using hardware-capable LRO/RSC, when this net device is detached from the bridge and acts as
a standalone NIC?

Note, RSC defaults to *ON* in practice for ALL ixgbe NICs, the same as for other RSC-capable
NICs. Attaching a net device to a bridge _once_ should not change its default configuration;
moreover it's a subtle change, made without any message, that the user won't notice at all.


 From 1e76b2625b3e6aa239b5ef8399fe441a587c6646 Mon Sep 17 00:00:00 2001
From: Fan Du <fan.du@intel.com>
Date: Mon, 2 Feb 2015 05:02:11 -0500
Subject: [PATCH] net: restore lro after device detached from bridge

When a net device is detached from a bridge, the original LRO
feature should possibly be re-enabled, and for good reason: a
hardware feature like receive side coalescing could come into play.

Signed-off-by: Fan Du <fan.du@intel.com>
Fixes: 0187bdfb0567 ("net: Disable LRO on devices that are forwarding")
---
ChangeLog:
v2:
   - Restore lro only when device detached from bridge
---
  include/linux/netdevice.h |  1 +
  net/bridge/br_if.c        |  1 +
  net/core/dev.c            | 23 +++++++++++++++++++++++
  3 files changed, 25 insertions(+)

Comments

Alexander Duyck Feb. 2, 2015, 10:35 a.m. UTC | #1
On 02/01/2015 06:20 PM, Fan Du wrote:
> On 2015-01-31 04:48, Alexander Duyck wrote:
>> On 01/30/2015 04:33 AM, Fan Du wrote:
>>> Either detaching a device from bridge or switching a device
>>> out of FORWARDING state, the original lro feature should
>>> possibly be enabled for good reason, e.g. hw feature like
>>> receive side coalescing could come into play.
>>>
>>> BEFORE:
>>> echo 1 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k 
>>> ens806f0 | grep large
>>> large-receive-offload: off
>>>
>>> echo 0 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k 
>>> ens806f0 | grep large
>>> large-receive-offload: off
>>>
>>> AFTER:
>>> echo 1 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k 
>>> ens806f0 | grep large
>>> large-receive-offload: off
>>>
>>> echo 0 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k 
>>> ens806f0 | grep large
>>> large-receive-offload: on
>>>
>>> Signed-off-by: Fan Du <fan.du@intel.com>
>>> Fixes: 0187bdfb0567 ("net: Disable LRO on devices that are forwarding")
>>
>
>> First off this isn't a "fix".  This is going to likely break more than
>> it fixes.  The main reason why LRO is disabled is because it can cause
>> more harm then it helps.  Since GRO is available we should err on the
>> side of caution since enabling LRO/RSC can have undesirable side effects
>> in a number of cases.
>
> I think you are talking about bad scenarios when net device is 
> attached to a bridge.
> Then what's the good reason user has to pay extra cpu power for using 
> GRO, instead
> of using hw capable LRO/RSC when this net device is detached from 
> bridge acting as
> a standalone NIC?
>
> Note, SRC is defaulted to *ON* in practice for ALL ixgbe NICs, as same 
> other RSC capable
> NICs. Attaching net device to a bridge _once_ should not changed its 
> default configuration,
> moreover it's a subtle change without any message that user won't 
> noticed at all.

No, RSC only has benefits for IPv4/TCP large packets.  However 
historically there have been issues seen w/ small packet performance 
with RSC enabled.  Some have been addressed, however there are still 
other effects such as increasing latency for receive unless the push 
flag is set in the frame.

I still say this patch is not valid, even with your changes.  Your 
performance gain doesn't trump the regressions you would be causing on 
other peoples platforms.

I would suggest figuring out why you are seeing issues with routing or 
bridging being enabled and disabled and possibly cleaning up the issue 
via a script rather than trying to modify the kernel to make it take 
care of it for you.

- Alex
Michal Kubecek Feb. 2, 2015, 11:15 a.m. UTC | #2
On Mon, Feb 02, 2015 at 10:20:12AM +0800, Fan Du wrote:
> 
> I think you are talking about bad scenarios when net device is
> attached to a bridge.  Then what's the good reason user has to pay
> extra cpu power for using GRO, instead of using hw capable LRO/RSC
> when this net device is detached from bridge acting as a standalone
> NIC?

Being bridged is only one of the situations when LRO needs to be
disabled. Does your patch make sure it doesn't enable LRO if there are
other reasons for it to be disabled, e.g. if forwarding is enabled for
it or any of its upper devices?

I'm afraid the only way to make the automatic re-enabling work correctly
would be to keep track of whether LRO was disabled manually (e.g. by ethtool)
or only automatically because the device is bridged, forwarding is enabled
for it, LRO is disabled for an upper device, etc., and then to re-enable LRO
only in the second case, and even then only if none of the possible
reasons still holds. I don't think it's worth the effort.
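
A minimal sketch of that bookkeeping, just to show why it gets hairy: everything
below except netdev_master_upper_dev_get() is hypothetical (the lro_user_disabled
flag and the two pseudo-helpers do not exist in the kernel); it only illustrates
the kind of tracking such automatic re-enabling would need.

	/* Hypothetical sketch only: the flag and the two pseudo-helpers
	 * below are made up for illustration, not existing kernel API.
	 */
	static bool netdev_may_reenable_lro(struct net_device *dev)
	{
		if (dev->lro_user_disabled)              /* ethtool switched LRO off */
			return false;
		if (netdev_master_upper_dev_get(dev))    /* still a bridge/bond/team port */
			return false;
		if (netdev_forwarding_enabled(dev))      /* pseudo-helper: IPv4/IPv6 forwarding on */
			return false;
		if (netdev_upper_lro_disabled(dev))      /* pseudo-helper: some upper device has LRO off */
			return false;
		return true;
	}

Even this sketch glosses over where the flag would be set and cleared, and how the
checks would stay in sync with every dev_disable_lro() caller.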

> Note, SRC is defaulted to *ON* in practice for ALL ixgbe NICs, as same
> other RSC capable NICs.

A very bad idea, IMHO. A lot of bug reports resulted from it.

> Attaching net device to a bridge _once_ should not changed its default
> configuration, moreover it's a subtle change without any message that
> user won't noticed at all.

IMHO the key point here is that LRO enabled when it shouldn't be is a much
more serious problem than LRO disabled when it could be enabled.

                                                       Michal Kubecek

FengYu LeiDian Feb. 3, 2015, 2:29 a.m. UTC | #3
On 2015-02-02 19:15, Michal Kubecek wrote:
> On Mon, Feb 02, 2015 at 10:20:12AM +0800, Fan Du wrote:
>>
>> I think you are talking about bad scenarios when net device is
>> attached to a bridge.  Then what's the good reason user has to pay
>> extra cpu power for using GRO, instead of using hw capable LRO/RSC
>> when this net device is detached from bridge acting as a standalone
>> NIC?
>
> Being bridged is only one of the situations when LRO needs to be
> disabled. Does your patch make sure it doesn't enable LRO if there are
> other reasons for it to be disabled, e.g. if forwarding is enabled for
> it or any of its upper devices?
>
> I'm afraid the only way to make the automatic reenabling work correctly
> would be to keep track if LRO was disabled manually (e.g. by ethtool) or
> only automatically because the device is bridged, forwarding is enabled
> for it, LRO is disabled for any upper device etc. And to reenable LRO
> only in the second case and even then only if none of the possible
> reasons holds. I don't think it's worth the effort.
>
>> Note, SRC is defaulted to *ON* in practice for ALL ixgbe NICs, as same
>> other RSC capable NICs.
>
> A very bad idea, IMHO. A lot of bug reports resulted from it.

Why are you saying this is just an idea? It is a fact for all RSC-capable NIC drivers;
search drivers/net/ethernet/ to find more.

>> Attaching net device to a bridge _once_ should not changed its default
>> configuration, moreover it's a subtle change without any message that
>> user won't noticed at all.
> IMHO the key point here is that LRO enabled when it shouldn't is much
> more serious problem than LRO disabled when it could be enabled.

I agree with you here more than I disagree.
Btw, I noticed this subtle behaviour not because I am "seeing issues with routing or
bridging being enabled", as Alexander assumes. It's because my testbed
82599EB supports LRO, and code review shows the driver enables it after probing,
yet ethtool -k reports LRO is off for no apparent reason; it turns out this interface
had been bound to a bridge by one of the system services.
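
For reference, the probe-time enabling referred to here is roughly the following,
from ixgbe_probe() in drivers/net/ethernet/intel/ixgbe/ixgbe_main.c (quoted from
memory, so treat it as approximate):

	/* advertise LRO by default on RSC-capable hardware */
	if (adapter->flags2 & IXGBE_FLAG2_RSC_CAPABLE)
		netdev->features |= NETIF_F_LRO;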

>
>                                                         Michal Kubecek
>

Michal Kubecek Feb. 3, 2015, 6:54 a.m. UTC | #4
On Tue, Feb 03, 2015 at 10:29:00AM +0800, Fan Du wrote:
> On 2015-02-02 19:15, Michal Kubecek wrote:
> >On Mon, Feb 02, 2015 at 10:20:12AM +0800, Fan Du wrote:
> >
> >>Note, SRC is defaulted to *ON* in practice for ALL ixgbe NICs, as same
> >>other RSC capable NICs.
> >
> >A very bad idea, IMHO. A lot of bug reports resulted from it.
> 
> Why are you saying this an idea?? this a fact for all RSC capable NIC
> drivers.  search drivers/net/ethernet/ to find more.

I didn't say it's not turned on by default, I just said I consider this
a bad idea. Why? Because the feature is known to break network
communication in common and frequently used scenarios (essentially
whenever received packets may leave the host). When this happens, you
observe a strange networking malfunction, and unless you know this may be
the cause (or read the release notes very carefully), it is very difficult to
track down. Personally, I have handled four bug reports of this type
in the last three years.

On the other hand, having it turned off when it could be on is only a
performance problem and the communication works. When you are tuning the
performance, you obviously look at the offloading features and check
which could be turned on.

This imbalance (broken communication on one side and a slight performance
difference on the other) is why I believe a feature like this should not
be turned on by default. Unfortunately it is, and we have to deal with
it; that's why dev_disable_lro() is called in certain situations and why
it is propagated down to lower devices. Turning it back on without
carefully checking that _none_ of the reasons to have it off applies
would do more harm than good.
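
For reference, the dev_disable_lro() in question (as of the tree this patch is
against; reproduced from memory, so details may differ slightly) is roughly:

	void dev_disable_lro(struct net_device *dev)
	{
		struct net_device *lower_dev;
		struct list_head *iter;

		dev->wanted_features &= ~NETIF_F_LRO;
		netdev_update_features(dev);

		if (unlikely(dev->features & NETIF_F_LRO))
			netdev_WARN(dev, "failed to disable LRO!\n");

		netdev_for_each_lower_dev(dev, lower_dev, iter)
			dev_disable_lro(lower_dev);
	}

The dev_enable_lro() proposed in the patch below is its mirror image.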

                                                        Michal Kubecek

FengYu LeiDian Feb. 3, 2015, 7:08 a.m. UTC | #5
On 2015-02-02 18:35, Alexander Duyck wrote:
> On 02/01/2015 06:20 PM, Fan Du wrote:
>> On 2015-01-31 04:48, Alexander Duyck wrote:
>>> On 01/30/2015 04:33 AM, Fan Du wrote:
>>>> Either detaching a device from bridge or switching a device
>>>> out of FORWARDING state, the original lro feature should
>>>> possibly be enabled for good reason, e.g. hw feature like
>>>> receive side coalescing could come into play.
>>>>
>>>> BEFORE:
>>>> echo 1 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k ens806f0 | grep large
>>>> large-receive-offload: off
>>>>
>>>> echo 0 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k ens806f0 | grep large
>>>> large-receive-offload: off
>>>>
>>>> AFTER:
>>>> echo 1 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k ens806f0 | grep large
>>>> large-receive-offload: off
>>>>
>>>> echo 0 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k ens806f0 | grep large
>>>> large-receive-offload: on
>>>>
>>>> Signed-off-by: Fan Du <fan.du@intel.com>
>>>> Fixes: 0187bdfb0567 ("net: Disable LRO on devices that are forwarding")
>>>
>>
>>> First off this isn't a "fix".  This is going to likely break more than
>>> it fixes.  The main reason why LRO is disabled is because it can cause
>>> more harm then it helps.  Since GRO is available we should err on the
>>> side of caution since enabling LRO/RSC can have undesirable side effects
>>> in a number of cases.
>>
>> I think you are talking about bad scenarios when net device is attached to a bridge.
>> Then what's the good reason user has to pay extra cpu power for using GRO, instead
>> of using hw capable LRO/RSC when this net device is detached from bridge acting as
>> a standalone NIC?
>>
>> Note, SRC is defaulted to *ON* in practice for ALL ixgbe NICs, as same other RSC capable
>> NICs. Attaching net device to a bridge _once_ should not changed its default configuration,
>> moreover it's a subtle change without any message that user won't noticed at all.

> No, RSC only has benefits for IPv4/TCP large packets.  However
> historically there have been issues seen w/ small packet performance
> with RSC enabled.

Only when the number of parallel clients exceeds 4 does GRO beat LRO performance on my testbed for small packets.
The difference comes from the fact that RSS hashes the TCP flows from the clients into different NIC queues
handled by multiple CPUs, while the RSC engine inside the NIC has limited resources compared with the CPUs used by GRO.

NICs: 82599EB
server: iperf -s -B 192.168.5.1
client: iperf -c 192.168.5.1 -i 1 -M 100 -P x

-P    Bandwidth (lro on, gro off)    Bandwidth (lro off, gro on)

1     2.31 Gbits/sec           947 Mbits/sec
2     3.09 Gbits/sec          1.97 Gbits/sec
3     3.19 Gbits/sec          2.70 Gbits/sec
4     3.16 Gbits/sec          3.39 Gbits/sec
5     3.23 Gbits/sec          3.33 Gbits/sec
6     3.19 Gbits/sec          3.74 Gbits/sec
7     3.18 Gbits/sec          3.88 Gbits/sec
8     3.17 Gbits/sec          3.24 Gbits/sec
9     3.16 Gbits/sec          3.70 Gbits/sec
10    3.15 Gbits/sec          3.76 Gbits/sec
11    3.10 Gbits/sec          4.03 Gbits/sec
12    3.11 Gbits/sec          3.13 Gbits/sec
13    3.12 Gbits/sec          4.12 Gbits/sec
14    3.07 Gbits/sec          4.04 Gbits/sec
15    3.03 Gbits/sec          3.14 Gbits/sec
16    2.99 Gbits/sec          3.93 Gbits/sec




> Some have been addressed, however there are still
> other effects such as increasing latency for receive unless the push
> flag is set in the frame.
>
> I still say this patch is not valid, even with your changes.  Your
> performance gain doesn't trump the regressions you would be causing on
> other peoples platforms.
>
> I would suggest figuring out why you are seeing issues with routing or
> bridging being enabled and disabled and possibly cleaning up the issue
> via a script rather than trying to modify the kernel to make it take
> care of it for you.
> - Alex

Alexander Duyck Feb. 3, 2015, 8:37 a.m. UTC | #6
On 02/03/2015 08:08 AM, Fan Du wrote:
> On 2015-02-02 18:35, Alexander Duyck wrote:
>> On 02/01/2015 06:20 PM, Fan Du wrote:
>>> On 2015-01-31 04:48, Alexander Duyck wrote:
>>>> On 01/30/2015 04:33 AM, Fan Du wrote:
>>>>> Either detaching a device from bridge or switching a device
>>>>> out of FORWARDING state, the original lro feature should
>>>>> possibly be enabled for good reason, e.g. hw feature like
>>>>> receive side coalescing could come into play.
>>>>>
>>>>> BEFORE:
>>>>> echo 1 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k 
>>>>> ens806f0 | grep large
>>>>> large-receive-offload: off
>>>>>
>>>>> echo 0 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k 
>>>>> ens806f0 | grep large
>>>>> large-receive-offload: off
>>>>>
>>>>> AFTER:
>>>>> echo 1 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k 
>>>>> ens806f0 | grep large
>>>>> large-receive-offload: off
>>>>>
>>>>> echo 0 > /proc/sys/net/ipv4/conf/ens806f0/forwarding && ethtool -k 
>>>>> ens806f0 | grep large
>>>>> large-receive-offload: on
>>>>>
>>>>> Signed-off-by: Fan Du <fan.du@intel.com>
>>>>> Fixes: 0187bdfb0567 ("net: Disable LRO on devices that are 
>>>>> forwarding")
>>>>
>>>
>>>> First off this isn't a "fix".  This is going to likely break more than
>>>> it fixes.  The main reason why LRO is disabled is because it can cause
>>>> more harm then it helps.  Since GRO is available we should err on the
>>>> side of caution since enabling LRO/RSC can have undesirable side 
>>>> effects
>>>> in a number of cases.
>>>
>>> I think you are talking about bad scenarios when net device is 
>>> attached to a bridge.
>>> Then what's the good reason user has to pay extra cpu power for 
>>> using GRO, instead
>>> of using hw capable LRO/RSC when this net device is detached from 
>>> bridge acting as
>>> a standalone NIC?
>>>
>>> Note, SRC is defaulted to *ON* in practice for ALL ixgbe NICs, as 
>>> same other RSC capable
>>> NICs. Attaching net device to a bridge _once_ should not changed its 
>>> default configuration,
>>> moreover it's a subtle change without any message that user won't 
>>> noticed at all.
>
>> No, RSC only has benefits for IPv4/TCP large packets.  However
>> historically there have been issues seen w/ small packet performance
>> with RSC enabled.
>
> Only when parallel client exceeds 4, gro trumps lro performance on my 
> testbed for small packets.
> The difference comes from the fact that TCP RSS hash flows from 
> clients into different NIC queues
> for multiple cpu, while RSC engine inside NIC has limit resource 
> compared with that of cpu used by gro.
>
> NICs: 82599EB
> server:ipserf -s -B 192.168.5.1
> client:iperf  -c 192.168.5.1 -i 1 -M 100 -P x
>
> -P   Bandwidth/lro on        Bandwidth/lro off
>                gro off                 gro on
>
> 1     2.31 Gbits/sec           947 Mbits/sec
> 2     3.09 Gbits/sec          1.97 Gbits/sec
> 3     3.19 Gbits/sec          2.70 Gbits/sec
> 4     3.16 Gbits/sec          3.39 Gbits/sec
> 5     3.23 Gbits/sec          3.33 Gbits/sec
> 6     3.19 Gbits/sec          3.74 Gbits/sec
> 7     3.18 Gbits/sec          3.88 Gbits/sec
> 8     3.17 Gbits/sec          3.24 Gbits/sec
> 9     3.16 Gbits/sec          3.70 Gbits/sec
> 10    3.15 Gbits/sec          3.76 Gbits/sec
> 11    3.10 Gbits/sec          4.03 Gbits/sec
> 12    3.11 Gbits/sec          3.13 Gbits/sec
> 13    3.12 Gbits/sec          4.12 Gbits/sec
> 14    3.07 Gbits/sec          4.04 Gbits/sec
> 15    3.03 Gbits/sec          3.14 Gbits/sec
> 16    2.99 Gbits/sec          3.93 Gbits/sec
>
>
>
>
> Some have been addressed, however there are still

The point I think you are not getting is that bulk throughput 
performance does not justify enabling a feature that may impact 
stability or possibly harm small packet performance.

There are more reasons than routing and bridging to disable LRO. Those 
two reasons though were so bad that we couldn't allow end users to 
possibly encounter them so we disabled the feature for them.

There are a number of other cases where LRO might be disabled as in the 
possible latency case I reported.  As such you should not be enabling 
LRO just because only two of the possible issues have now been addressed.

It is best to leave this up to the end-user to re-enable.  If you are 
seeing the feature disabled as a result of some init script on the 
system you may want to look at re-enabling it as a part of some other 
init script that you use when disabling routing or bridging.
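
A minimal sketch of such a script, using the interface name from the examples
above and an assumed bridge name br0 (both just placeholders), could be:

	#!/bin/sh
	# Re-enable LRO by hand after taking the port out of the bridge,
	# but only if forwarding is also off for this device.
	dev=ens806f0
	brctl delif br0 "$dev"
	if [ "$(cat /proc/sys/net/ipv4/conf/$dev/forwarding)" = "0" ]; then
		ethtool -K "$dev" lro on
	fi

That keeps the policy decision in userspace rather than in the kernel.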

- Alex

Patch

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 642d426..904b1a4 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2153,6 +2153,7 @@  int dev_alloc_name(struct net_device *dev, const char *name);
  int dev_open(struct net_device *dev);
  int dev_close(struct net_device *dev);
  void dev_disable_lro(struct net_device *dev);
+void dev_enable_lro(struct net_device *dev);
  int dev_loopback_xmit(struct sk_buff *newskb);
  int dev_queue_xmit(struct sk_buff *skb);
  int dev_queue_xmit_accel(struct sk_buff *skb, void *accel_priv);
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 81e49fb..4236f3a 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -565,6 +565,7 @@  int br_del_if(struct net_bridge *br, struct net_device *dev)
  		call_netdevice_notifiers(NETDEV_CHANGEADDR, br->dev);

  	netdev_update_features(br->dev);
+	dev_enable_lro(dev);

  	return 0;
  }
diff --git a/net/core/dev.c b/net/core/dev.c
index 1e325ad..76f2ed7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1451,6 +1451,29 @@  void dev_disable_lro(struct net_device *dev)
  }
  EXPORT_SYMBOL(dev_disable_lro);

+/**
+ *	dev_enable_lro - enable Large Receive Offload on a device
+ *	@dev: device
+ *
+ *	Enable Large Receive Offload (LRO) on a net device.
+ *	This is needed if device is not attached to a bridge.
+ */
+void dev_enable_lro(struct net_device *dev)
+{
+	struct net_device *lower_dev;
+	struct list_head *iter;
+
+	dev->wanted_features |= NETIF_F_LRO;
+	netdev_update_features(dev);
+
+	if (unlikely(!(dev->features & NETIF_F_LRO)))
+		netdev_WARN(dev, "failed to enable LRO!\n");
+
+	netdev_for_each_lower_dev(dev, lower_dev, iter)
+		dev_enable_lro(lower_dev);
+}
+EXPORT_SYMBOL(dev_enable_lro);
+
  static int call_netdevice_notifier(struct notifier_block *nb, unsigned long val,
  				   struct net_device *dev)
  {