
[net-next] Fix time-lag of IFF_RUNNING flag consistency between vlan and real devices

Message ID 20110826060257.5304.62723.stgit@ltc219.sdl.hitachi.co.jp
State Rejected, archived
Delegated to: David Miller

Commit Message

Mitsuo Hayasaka Aug. 26, 2011, 6:02 a.m. UTC
There is a time lag before the IFF_RUNNING flag of a vlan device becomes
consistent with that of its real device when the real device has a problem,
such as a lost link or a broken cable. This degrades availability, for
example by delaying failover in HA systems that use vlans, because detection
of the problem on the real device is delayed.

Why this happens:
Network device flags can be read using ioctl with SIOCGIFFLAGS. When a
vlan is used, the ioctl reports the flags of the vlan device, not those of
the real device.

Patch:
This patch adds vlan-device check into dev_get_flags(). So, it can check
flags of the real device even if the vlan is used.

Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: Tom Herbert <therbert@google.com>
Cc: Jesse Gross <jesse@nicira.com>
---

 include/linux/if_vlan.h |    2 +-
 net/core/dev.c          |    7 +++++++
 2 files changed, 8 insertions(+), 1 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

stephen hemminger Aug. 26, 2011, 6:08 a.m. UTC | #1
On Fri, 26 Aug 2011 15:02:57 +0900
Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> wrote:

> There is a time-lag of IFF_RUNNING flag consistency between vlan and real
> devices when the real devices are in problem such as link or cable broken.
> This leads to a degradation of Availability such as a delay of failover in
> HA systems using vlan since the detection of the problem at real device is
> delayed.
> 
> Why this happens:
> Network devices' flags can be checked using ioctl with SIOCGIFFLAGS. When
> vlan technique is used, it checks the flags of vlan device, not real
> device.
> 
> Patch:
> This patch adds vlan-device check into dev_get_flags(). So, it can check
> flags of the real device even if the vlan is used.
> 
> Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
> Cc: Tom Herbert <therbert@google.com>
> Cc: Jesse Gross <jesse@nicira.com>

I don't think this is the right way to solve the problem.

The flags are supposed to propagate back from real device to vlan
via network notifications.

Just doing this for ioctl is not enough; APIs other than the user space ones depend on this.
Also the user may have manually set different flags on vlan than on
the real device.
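
The propagation model referred to above can be illustrated with a toy
userspace sketch in C. It is only a model under stated assumptions: every
name in it (toy_dev, toy_stack, toy_set_carrier, toy_deliver_events) is
hypothetical and none of them is a kernel API. It shows only that a stacked
device holding its own copy of the state reports a stale value until the
notification has been delivered; it says nothing about where any delay in
the kernel comes from.

```c
#include <stddef.h>

#define TOY_MAX_UPPER 8

/* Toy model of a network device; not a kernel structure. */
struct toy_dev {
	int running;                          /* models IFF_RUNNING */
	struct toy_dev *upper[TOY_MAX_UPPER]; /* stacked devices, e.g. vlans */
	size_t n_upper;
	int event_pending;                    /* change not yet delivered */
};

/* Stack `vlan` on top of `real`; the vlan starts with the real state. */
int toy_stack(struct toy_dev *real, struct toy_dev *vlan)
{
	if (real->n_upper >= TOY_MAX_UPPER)
		return -1;
	real->upper[real->n_upper++] = vlan;
	vlan->running = real->running;
	return 0;
}

/* The real device changes state; the notification is only queued here. */
void toy_set_carrier(struct toy_dev *real, int on)
{
	on = !!on;
	if (real->running == on)
		return;
	real->running = on;
	real->event_pending = 1;
}

/* Deliver the queued notification: stacked devices copy the new state. */
void toy_deliver_events(struct toy_dev *real)
{
	size_t i;

	if (!real->event_pending)
		return;
	for (i = 0; i < real->n_upper; i++)
		real->upper[i]->running = real->running;
	real->event_pending = 0;
}
```

Between toy_set_carrier() and toy_deliver_events() the two devices disagree,
whichever interface is used to read the vlan state, which is why overriding
only the ioctl path does not cover the other readers.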
Herbert Xu Aug. 26, 2011, 6:45 a.m. UTC | #2
On Thu, Aug 25, 2011 at 11:08:59PM -0700, Stephen Hemminger wrote:
>
> Just doing this for ioctl is not enough, API's other than user space depend on this.
> Also the user may have manually set different flags on vlan than on
> the real device.

Right, anything that tests netif_carrier_ok directly on the VLAN
device will still be delayed.

Now I remember discussing this issue in Japan.  However, I can't
recall the exact scenario in which the delay occurred.

Is the issue with the link status going down on the real device,
or the real device coming up?

IIRC we already have mechanisms in place to ensure that down events
are not delayed by linkwatch.  Of course it is possible that this
isn't working for some reason, or some other part of the system is
causing the delay.

So please clarify the scenario for us Hayasaka-san.  Also please
let us know how you measured the delay.

Thanks,
Mitsuo Hayasaka Aug. 28, 2011, 1:20 p.m. UTC | #3
Hi Stephen and Herbert

Thank you for your comments.

(2011/08/26 15:08), Stephen Hemminger wrote:
> I don't think this is the right way to solve the problem.
>
> The flags are supposed to propagate back from real device to vlan
> via network notifications.
>
> Just doing this for ioctl is not enough, API's other than user space depend on this.
> Also the user may have manually set different flags on vlan than on
> the real device.

I agree.
I will try another way to solve this problem, as you said.


(2011/08/26 15:45), Herbert Xu wrote:
> On Thu, Aug 25, 2011 at 11:08:59PM -0700, Stephen Hemminger wrote:
>> Just doing this for ioctl is not enough, API's other than user space depend on this.
>> Also the user may have manually set different flags on vlan than on
>> the real device.
> Right, anything that tests netif_carrier_ok directly on the VLAN
> device will still be delayed.
>
> Now I remember discussing this issue in Japan.  However, I can't
> recall the exact scenario in which the delay occurred.
>
> Is the issue with the link status going down on the real device,
> or the real device coming up?
>
> IIRC we already have mechanisms in place to ensure that down events
> are not delayed by linkwatch.  Of course it is possible that this
> isn't working for some reason, or some other part of the system is
> causing the delay.
>
> So please clarify the scenario for us Hayasaka-san.  Also please
> let us know how you measured the delay.
>
> Thanks,

This issue happens when the link status is going down on the real 
device.

For example, a cable is broken or is unplugged from a NIC.

I measured the delay from userspace using ioctl with SIOCGIFFLAGS,
checking whether there is a time lag in the flag between the vlan
and real devices.

You can also check it using the script below.

-------------------------
#!/bin/sh
t=0
while :
do
	echo $t; t=$((t+1))
	echo -n real; ifconfig RealDev | grep UP
	echo -n vlan; ifconfig VlanDev | grep UP
	sleep 0.2
done
-------------------------

The result is shown below.
There is a time lag in the RUNNING status between the real and vlan
devices: the real device loses RUNNING at sample 30, the vlan device
at sample 35.


....

19
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
20
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1  * A cable is unplugged from NIC.
21
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
22
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
23
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
24
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
25
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
26
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
27
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
28
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
29
real          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
30
real          UP BROADCAST MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
31
real          UP BROADCAST MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
32
real          UP BROADCAST MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
33
real          UP BROADCAST MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
34
real          UP BROADCAST MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
35
real          UP BROADCAST MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST MULTICAST  MTU:1500  Metric:1
36
real          UP BROADCAST MULTICAST  MTU:1500  Metric:1
vlan          UP BROADCAST MULTICAST  MTU:1500  Metric:1
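
Reading the lag off the output above: the real device first appears without
RUNNING at sample 30 and the vlan device at sample 35, so the vlan device
lags by 5 samples, or about 5 x 0.2 s = 1 s. The arithmetic can be sketched
in C as below. The helper names are hypothetical, and the result is only an
estimate: it is accurate to about one sampling interval, and each loop
iteration also spends time running ifconfig twice, so the real interval is
somewhat longer than 0.2 s.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the index of the first sample in which
 * RUNNING is clear, or -1 if it is set in every sample. running[i] is
 * nonzero when sample i showed RUNNING.
 */
int first_not_running(const int *running, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!running[i])
			return (int)i;
	return -1;
}

/*
 * Estimate how long the vlan device kept RUNNING after the real device
 * lost it. Returns 0 and stores the estimate in *lag_ms, or -1 if either
 * device never lost RUNNING within the samples.
 */
int running_lag_ms(const int *real, const int *vlan, size_t n,
		   long interval_ms, long *lag_ms)
{
	int r = first_not_running(real, n);
	int v = first_not_running(vlan, n);

	if (r < 0 || v < 0)
		return -1;
	*lag_ms = (long)(v - r) * interval_ms;
	return 0;
}
```

With the 18 samples shown (19 to 36) and an interval of 200 ms, the estimate
is 1000 ms.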


Thanks.
Eric Dumazet Aug. 28, 2011, 2:09 p.m. UTC | #4
On Sunday, 28 August 2011 at 22:20 +0900, HAYASAKA Mitsuo wrote:
> Hi Stephen and Herbert
> 
> Thank you for your comments.
> 
> (2011/08/26 15:08), Stephen Hemminger wrote:
> > I don't think this is the right way to solve the problem.
> >
> > The flags are supposed to propagate back from real device to vlan
> > via network notifications.
> >
> > Just doing this for ioctl is not enough, API's other than user space depend on this.
> > Also the user may have manually set different flags on vlan than on
> > the real device.
> 
> I agreed.
> I will try another way to solve this problem, as you said.
> 
> 
> (2011/08/26 15:45), Herbert Xu wrote:
> > On Thu, Aug 25, 2011 at 11:08:59PM -0700, Stephen Hemminger wrote:
> >> Just doing this for ioctl is not enough, API's other than user space depend on this.
> >> Also the user may have manually set different flags on vlan than on
> >> the real device.
> > Right, anything that tests netif_carrier_ok directly on the VLAN
> > device will still be delayed.
> >
> > Now I remember discussing this issue in Japan.  However, I can't
> > > recall the exact scenario in which the delay occurred.
> >
> > Is the issue with the link status going down on the real device,
> > or the real device coming up?
> >
> > IIRC we already have mechanisms in place to ensure that down events
> > are not delayed by linkwatch.  Of course it is possible that this
> > isn't working for some reason, or some other part of the system is
> > causing the delay.
> >
> > So please clarify the scenario for us Hayasaka-san.  Also please
> > let us know how you measured the delay.
> >
> > Thanks,
> 
> This issue happens when the link status is going down on the real 
> device.
> 
> ex) A cable is broken, or is unplugged from a NIC.
> 
> I measured the delay using ioctl with SIOCGIFFLAGS from userspace 
> in order to check if there is a time-lag of the flag between vlan 
> and real devices.
> 
> Also, you can check it using a script below.
> 
> -------------------------
> #!/bin/sh
> t=0
> while :
> do
> 	echo $t; t=$((t+1))
> 	echo -n real; ifconfig RealDev | grep UP
> 	echo -n vlan; ifconfig VlanDev | grep UP
> 	sleep 0.2
> done
> -------------------------
> 
> The result is shown as follows.
> It is observed that there is a time-lag of RUNNING status between 
> real and vlan devices.
> 
> 

Hi!

This reminds me of some work done in linkwatch.

Please take a look at commit e014debecd3ee3832e647 (linkwatch:
linkwatch_forget_dev() to speedup device dismantle)

And more generally, code in net/core/link_watch.c
Stephen Hemminger Aug. 29, 2011, 6:06 a.m. UTC | #5
----- Original Message -----
> On Sunday, 28 August 2011 at 22:20 +0900, HAYASAKA Mitsuo wrote:
> > Hi Stephen and Herbert
> > 
> > Thank you for your comments.
> > 
> > (2011/08/26 15:08), Stephen Hemminger wrote:
> > > I don't think this is the right way to solve the problem.
> > >
> > > The flags are supposed to propagate back from real device to vlan
> > > via network notifications.
> > >
> > > Just doing this for ioctl is not enough, API's other than user
> > > space depend on this.
> > > Also the user may have manually set different flags on vlan than
> > > on
> > > the real device.
> > 
> > I agreed.
> > I will try another way to solve this problem, as you said.
> > 
> > 
> > (2011/08/26 15:45), Herbert Xu wrote:
> > > On Thu, Aug 25, 2011 at 11:08:59PM -0700, Stephen Hemminger
> > > wrote:
> > >> Just doing this for ioctl is not enough, API's other than user
> > >> space depend on this.
> > >> Also the user may have manually set different flags on vlan than
> > >> on
> > >> the real device.
> > > Right, anything that tests netif_carrier_ok directly on the VLAN
> > > device will still be delayed.
> > >
> > > Now I remember discussing this issue in Japan.  However, I can't
> > > > recall the exact scenario in which the delay occurred.
> > >
> > > Is the issue with the link status going down on the real device,
> > > or the real device coming up?
> > >
> > > IIRC we already have mechanisms in place to ensure that down
> > > events
> > > are not delayed by linkwatch.  Of course it is possible that this
> > > isn't working for some reason, or some other part of the system
> > > is
> > > causing the delay.
> > >
> > > So please clarify the scenario for us Hayasaka-san.  Also please
> > > let us know how you measured the delay.
> > >
> > > Thanks,
> > 
> > This issue happens when the link status is going down on the real
> > device.
> > 
> > ex) A cable is broken, or is unplugged from a NIC.
> > 
> > I measured the delay using ioctl with SIOCGIFFLAGS from userspace
> > in order to check if there is a time-lag of the flag between vlan
> > and real devices.
> > 
> > Also, you can check it using a script below.
> > 
> > -------------------------
> > #!/bin/sh
> > t=0
> > while :
> > do
> > 	echo $t; t=$((t+1))
> > 	echo -n real; ifconfig RealDev | grep UP
> > 	echo -n vlan; ifconfig VlanDev | grep UP
> > 	sleep 0.2
> > done
> > -------------------------
> > 
> > The result is shown as follows.
> > It is observed that there is a time-lag of RUNNING status between
> > real and vlan devices.
> > 
> > 
> 
> Hi !
> 
> This reminds me some work done in linkwatch
> 
> Please take a look at commit e014debecd3ee3832e647 (linkwatch:
> linkwatch_forget_dev() to speedup device dismantle)
> 
> And more generally, code in net/core/link_watch.c

Maybe the problem is specific to an Ethernet driver. Some devices poll
for link changes, and also do a manual check when an ioctl is issued.
This was mostly typical of older hardware that did not have a PHY
interrupt.
Eric Dumazet Aug. 29, 2011, 6:23 a.m. UTC | #6
On Sunday, 28 August 2011 at 23:06 -0700, Stephen Hemminger wrote:
> 
> ----- Original Message -----
> > On Sunday, 28 August 2011 at 22:20 +0900, HAYASAKA Mitsuo wrote:
> > > Hi Stephen and Herbert
> > > 
> > > Thank you for your comments.
> > > 
> > > (2011/08/26 15:08), Stephen Hemminger wrote:
> > > > I don't think this is the right way to solve the problem.
> > > >
> > > > The flags are supposed to propagate back from real device to vlan
> > > > via network notifications.
> > > >
> > > > Just doing this for ioctl is not enough, API's other than user
> > > > space depend on this.
> > > > Also the user may have manually set different flags on vlan than
> > > > on
> > > > the real device.
> > > 
> > > I agreed.
> > > I will try another way to solve this problem, as you said.
> > > 
> > > 
> > > (2011/08/26 15:45), Herbert Xu wrote:
> > > > On Thu, Aug 25, 2011 at 11:08:59PM -0700, Stephen Hemminger
> > > > wrote:
> > > >> Just doing this for ioctl is not enough, API's other than user
> > > >> space depend on this.
> > > >> Also the user may have manually set different flags on vlan than
> > > >> on
> > > >> the real device.
> > > > Right, anything that tests netif_carrier_ok directly on the VLAN
> > > > device will still be delayed.
> > > >
> > > > Now I remember discussing this issue in Japan.  However, I can't
> > > > > recall the exact scenario in which the delay occurred.
> > > >
> > > > Is the issue with the link status going down on the real device,
> > > > or the real device coming up?
> > > >
> > > > IIRC we already have mechanisms in place to ensure that down
> > > > events
> > > > are not delayed by linkwatch.  Of course it is possible that this
> > > > isn't working for some reason, or some other part of the system
> > > > is
> > > > causing the delay.
> > > >
> > > > So please clarify the scenario for us Hayasaka-san.  Also please
> > > > let us know how you measured the delay.
> > > >
> > > > Thanks,
> > > 
> > > This issue happens when the link status is going down on the real
> > > device.
> > > 
> > > ex) A cable is broken, or is unplugged from a NIC.
> > > 
> > > I measured the delay using ioctl with SIOCGIFFLAGS from userspace
> > > in order to check if there is a time-lag of the flag between vlan
> > > and real devices.
> > > 
> > > Also, you can check it using a script below.
> > > 
> > > -------------------------
> > > #!/bin/sh
> > > t=0
> > > while :
> > > do
> > > 	echo $t; t=$((t+1))
> > > 	echo -n real; ifconfig RealDev | grep UP
> > > 	echo -n vlan; ifconfig VlanDev | grep UP
> > > 	sleep 0.2
> > > done
> > > -------------------------
> > > 
> > > The result is shown as follows.
> > > It is observed that there is a time-lag of RUNNING status between
> > > real and vlan devices.
> > > 
> > > 
> > 
> > Hi !
> > 
> > This reminds me some work done in linkwatch
> > 
> > Please take a look at commit e014debecd3ee3832e647 (linkwatch:
> > linkwatch_forget_dev() to speedup device dismantle)
> > 
> > And more generally, code in net/core/link_watch.c
> 
> Maybe the problem is specific to a ethernet driver. Some devices poll
> for link changes, and also do a manual check when ioctl was done.
> This was mostly typical of older hardware that did not have a PHY
> interrupt.

Hmm, I just tried the script on my laptop, and reproduced the problem
with a tg3 driver, considered as a reference one ;)

the 'carrier is on' event is immediately present on both devices, but
the 'carrier is off' is delayed by one second.

09:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5755M
Gigabit Ethernet PCI Express (rev 02)
	Subsystem: Dell Device 01f9
	Flags: bus master, fast devsel, latency 0, IRQ 45
	Memory at f1ef0000 (64-bit, non-prefetchable) [size=64K]
	Expansion ROM at <ignored> [disabled]
	Capabilities: <access denied>
	Kernel driver in use: tg3
	Kernel modules: tg3


David Miller Aug. 29, 2011, 6:34 a.m. UTC | #7
From: Stephen Hemminger <stephen.hemminger@vyatta.com>
Date: Sun, 28 Aug 2011 23:06:28 -0700 (PDT)

> This was mostly typical of older hardware that did not have a PHY
> interrupt.

Many have to poll because the PHY interrupt is simply unreliable.

Patch

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 44da482..4df4e6f 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -91,7 +91,7 @@  struct vlan_group {
 	struct rcu_head		rcu;
 };
 
-static inline int is_vlan_dev(struct net_device *dev)
+static inline int is_vlan_dev(const struct net_device *dev)
 {
         return dev->priv_flags & IFF_802_1Q_VLAN;
 }
diff --git a/net/core/dev.c b/net/core/dev.c
index a4306f7..527e21b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4603,6 +4603,13 @@  unsigned dev_get_flags(const struct net_device *dev)
 		(dev->gflags & (IFF_PROMISC |
 				IFF_ALLMULTI));
 
+	/*
+	 * If we're trying to get flags on a vlan device
+	 * use the underlying physical device instead.
+	 */
+	if (is_vlan_dev(dev))
+		dev = vlan_dev_real_dev(dev);
+
 	if (netif_running(dev)) {
 		if (netif_oper_up(dev))
 			flags |= IFF_RUNNING;