diff mbox

Toggling link state breaks network connectivity

Message ID b3d1fd0c-281a-dd5a-d0e8-75f42475a248@free.fr
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

Mason June 13, 2017, 3:07 p.m. UTC
On 12/06/2017 18:38, Florian Fainelli wrote:

> On 06/12/2017 06:22 AM, Mason wrote:
>
>> I am using the following drivers for Ethernet connectivity.
>> drivers/net/ethernet/aurora/nb8800.c
>> drivers/net/phy/at803x.c
>>
>> Pulling the cable and plugging it back works as expected.
>> (I can ping both before and after.)
>>
>> However, if I toggle the link state in software (using ip link set),
>> the board loses network connectivity.
>>
>> # Statically assign IP address
>> ip addr add 172.27.64.77/18 brd 172.27.127.255 dev eth0
>> # Set link state to "up"
>> ip link set eth0 up
>> # ping -c 3 172.27.64.1 > /tmp/v1
>>
>> PING 172.27.64.1 (172.27.64.1): 56 data bytes
>> 64 bytes from 172.27.64.1: seq=0 ttl=64 time=18.321 ms
> 
> This delay seems abnormally long unless you are purposely introducing
> delay (e.g: with cls_netem) or this is a really remote host, does not
> seem to be based on your traces later on.

I think the delay is due to calling ping before the link
is actually up. For example, if I ping immediately after
setting the link up, the first 4 packets are lost.

PING 172.27.64.1 (172.27.64.1): 56 data bytes
64 bytes from 172.27.64.1: seq=4 ttl=64 time=0.235 ms
64 bytes from 172.27.64.1: seq=5 ttl=64 time=0.142 ms
64 bytes from 172.27.64.1: seq=6 ttl=64 time=0.110 ms
64 bytes from 172.27.64.1: seq=7 ttl=64 time=0.095 ms
64 bytes from 172.27.64.1: seq=8 ttl=64 time=0.139 ms
64 bytes from 172.27.64.1: seq=9 ttl=64 time=0.120 ms

--- 172.27.64.1 ping statistics ---
10 packets transmitted, 6 packets received, 40% packet loss
round-trip min/avg/max = 0.095/0.140/0.235 ms


>> So basically, the board is asking the desktop for its MAC address,
>> and the desktop is answering immediately. But the board doesn't seem
>> to be getting the replies... Any ideas, or words of wisdom, as they say?
> 
> - check the Ethernet MAC counters to see if there is packet loss, or
> error, or both
> 
> - consult with your HW engineers for possible flaws in your
> ndo_open/ndo_close paths and possible interactions with the MAC/PHY
> clocks, or reset etc.
> 
> - see if your PHY needs a complete re-init after an up/down sequence and
> if you are doing this properly

I'm using the following test script:

ip addr add 172.27.64.77/18 brd 172.27.127.255 dev eth0
ip link set eth0 up
sleep 3  ## hopefully autoneg is complete

ethtool -S eth0 > /tmp/s0
ping -c 10 172.27.64.1 > /tmp/v1
ethtool -S eth0 > /tmp/s1


I did note something that seems important.

If I toggle the link state in software, then connectivity breaks.

If I unplug the ethernet cable, and replug, connectivity remains.

The difference is that plugging/unplugging doesn't call the
.ndo_stop callback. But 'ip link set eth0 down' does call it.

Should the .ndo_stop callback be symmetric to the .ndo_open callback?
In other words, should .ndo_open(); .ndo_stop(); be a NOP?

Regards.

Comments

Mason June 13, 2017, 3:32 p.m. UTC | #1
On 13/06/2017 17:07, Mason wrote:

> I did note something that seems important.
> 
> If I toggle the link state in software, then connectivity breaks.
> 
> If I unplug the ethernet cable, and replug, connectivity remains.
> 
> The difference is that plugging/unplugging doesn't call the
> .ndo_stop callback. But 'ip link set eth0 down' does call it.
> 
> Should the .ndo_stop callback be symmetric to the .ndo_open callback?
> In other words, should .ndo_open(); .ndo_stop(); be a NOP?

I changed the ndo_open callback to a wrapper that calls:

	nb8800_open(dev);
	nb8800_stop(dev);
	nb8800_open(dev);

With this change, connectivity is broken from the start.
Valid ARP requests are correctly *sent* but the corresponding
ARP replies are not received.

I'm hoping this limits the scope of what needs to be investigated.

Regards.
Florian Fainelli June 13, 2017, 3:36 p.m. UTC | #2
On 06/13/2017 08:07 AM, Mason wrote:> I did note something that seems
important.
> 
> If I toggle the link state in software, then connectivity breaks.
> 
> If I unplug the ethernet cable, and replug, connectivity remains.

What does that actually mean? If you disconnect the cable a link state
should be notified and another link state should be notified, even if
that happens faster than the PHYLIB polling time (1s).

> 
> The difference is that plugging/unplugging doesn't call the
> .ndo_stop callback. But 'ip link set eth0 down' does call it.
> 
> Should the .ndo_stop callback be symmetric to the .ndo_open callback?

Yes obviously.

> In other words, should .ndo_open(); .ndo_stop(); be a NOP?
Mason June 13, 2017, 3:47 p.m. UTC | #3
On 13/06/2017 17:36, Florian Fainelli wrote:

> On 06/13/2017 08:07 AM, Mason wrote:
>
>> I did note something that seems important.
>> If I toggle the link state in software, then connectivity breaks.
>> If I unplug the ethernet cable, and replug, connectivity remains.
> 
> What does that actually mean? If you disconnect the cable a link state
> should be notified and another link state should be notified, even if
> that happens faster than the PHYLIB polling time (1s).

Sorry for being unclear.

If I unplug/replug the cable, ping still works afterward.
(Packet RX appears to be *not* wedged)

If I toggle the link state in SW (set link down/set link up),
I can no longer ping afterward.
(Packet RX appears to be wedged, so ARP replies are not seen)

Maybe the two experiments are too unrelated to consider
the different results relevant in any way?

Regards.
Florian Fainelli June 13, 2017, 3:50 p.m. UTC | #4
On 06/13/2017 08:47 AM, Mason wrote:
> On 13/06/2017 17:36, Florian Fainelli wrote:
> 
>> On 06/13/2017 08:07 AM, Mason wrote:
>>
>>> I did note something that seems important.
>>> If I toggle the link state in software, then connectivity breaks.
>>> If I unplug the ethernet cable, and replug, connectivity remains.
>>
>> What does that actually mean? If you disconnect the cable a link state
>> should be notified and another link state should be notified, even if
>> that happens faster than the PHYLIB polling time (1s).
> 
> Sorry for being unclear.
> 
> If I unplug/replug the cable, ping still works afterward.
> (Packet RX appears to be *not* wedged)
> 
> If I toggle the link state in SW (set link down/set link up),
> I can no longer ping afterward.
> (Packet RX appears to be wedged, so ARP replies are not seen)
> 
> Maybe the two experiments are too unrelated to consider
> the different results relevant in any way?

No, they are not, this really tells you that whatever your ndo_open()
and ndo_stop() functions do, they are currently broken with your
particular HW. I have seen numerous problems on some of our older
platforms using bcmgenet where the PHY needs to be reset *before* any
MAC activity occurs, and any MAC activity even qualifies as a reset of
the MAC itself, we can perform a PHY reset there because the PHY is
internal, that may not be an option, and your HW is different anyway.
Mason June 14, 2017, 2:49 p.m. UTC | #5
On 13/06/2017 17:50, Florian Fainelli wrote:

> On 06/13/2017 08:47 AM, Mason wrote:
>
>> If I unplug/replug the cable, ping still works afterward.
>> (Packet RX appears to be *not* wedged)
>>
>> If I toggle the link state in SW (set link down/set link up),
>> I can no longer ping afterward.
>> (Packet RX appears to be wedged, so ARP replies are not seen)
>>
>> Maybe the two experiments are too unrelated to consider
>> the different results relevant in any way?
> 
> No, they are not, this really tells you that whatever your ndo_open()
> and ndo_stop() functions do, they are currently broken with your
> particular HW. I have seen numerous problems on some of our older
> platforms using bcmgenet where the PHY needs to be reset *before* any
> MAC activity occurs, and any MAC activity even qualifies as a reset of
> the MAC itself, we can perform a PHY reset there because the PHY is
> internal, that may not be an option, and your HW is different anyway.

I have logged *all* register writes, to compare state after probe,
after open, and after close. I do see a few odd values...
I'm still deep in the maze, but I may find an exit eventually.

What state is probe supposed to leave the HW in?
What state is open  supposed to leave the HW in?

I'm thinking that I could reset the entire block in close.

Regards.


Found ethernet phy : AR8035
[    0.852746] libphy: Fixed MDIO Bus: probed
[    0.856975] ++ETH++ gw8  0xf0026424 0x00		# ETH_SW_RESET
[    0.870942] ++ETH++ gw8  0xf0026424 0x01		# ETH_SW_RESET
[    0.874894] ++ETH++ gw16 0xf0026420 0x0050		# ETH_MDIO_CLK_DIV
[    0.879101] libphy: nb8800-mii: probed
[    0.885005] ++ETH++ gw8  0xf0026000 0x1c		# ETH_MAC_CORE_TRANSMIT_CONTROL
[    0.888967] ++ETH++ gw8  0xf0026001 0x05		# ETH_MAC_CORE_TRANSMIT_CONTROL
[    0.892911] ++ETH++ gw8  0xf0026004 0x22		# ETH_MAC_CORE_RECEIVE_CONTROL
[    0.896855] ++ETH++ gw8  0xf0026008 0x04		# ETH_MAC_CORE_RANDOM_SEED
[    0.900815] ++ETH++ gw8  0xf0026014 0x0c		# ETH_MAC_CORE_TRANSMIT_SINGLE_DEFERRAL
[    0.904758] ++ETH++ gw8  0xf0026051 0x00		# ETH_MAC_CORE_THRESHOLDS
[    0.908700] ++ETH++ gw8  0xf0026052 0xff		# ETH_MAC_CORE_THRESHOLDS
[    0.912643] ++ETH++ gw8  0xf0026054 0x40		# ETH_MAC_CORE_BUFFER_SIZE_TX_AND_ETH_MAC_CORE_FIFO_CONTROL
[    0.916586] ++ETH++ gw32 0xf0026100 0x000002fe	# TRANSMIT_CHANNEL_CONTROL
[    0.921054] ++ETH++ gw32 0xf0026118 0x003cc4a4	# TRANSMIT_INTERRUPT_TIME
[    0.925521] ++ETH++ gw32 0xf0026200 0x000002fe	# RECEIVE_CHANNEL_CONTROL
[    0.929990] ++ETH++ gw32 0xf0026218 0x00004dc8	# RECEIVE_INTERRUPT_TIME
[    0.934456] ++ETH++ gw8  0xf0026060 0x00		# ETH_MAC_CORE_PAUSE_QUANTA
[    0.938398] ++ETH++ gw8  0xf0026061 0xc3		# ETH_MAC_CORE_PAUSE_QUANTA
[    0.942338] ++ETH++ gw8  0xf002602e 0x00		# ETH_MAC_CORE_MULTICAST_ADDRESS1
[    0.946283] ++ETH++ gw8  0xf0026400 0x01		# ETH_GPIO_PAD_MODE
[    0.950226] ++ETH++ gw32 0xf0026200 0x0000028e	# RECEIVE_CHANNEL_CONTROL
[    0.954695] ++ETH++ gw8  0xf002606a 0x00		# ETH_MAC_CORE_SOURCE_ADDRESS0
[    0.958638] ++ETH++ gw8  0xf002606b 0x16		# ETH_MAC_CORE_SOURCE_ADDRESS0
[    0.962581] ++ETH++ gw8  0xf002606c 0xe8		# ETH_MAC_CORE_SOURCE_ADDRESS1
[    0.966523] ++ETH++ gw8  0xf002606d 0x4d		# ETH_MAC_CORE_SOURCE_ADDRESS1
[    0.970465] ++ETH++ gw8  0xf002606e 0x7f		# ETH_MAC_CORE_SOURCE_ADDRESS1
[    0.974408] ++ETH++ gw8  0xf002606f 0xc4		# ETH_MAC_CORE_SOURCE_ADDRESS1
[    0.978349] ++ETH++ gw8  0xf002603c 0x00		# ETH_MAC_CORE_UNICAST_ADDRESS0
[    0.982291] ++ETH++ gw8  0xf002603d 0x16		# ETH_MAC_CORE_UNICAST_ADDRESS0
[    0.986234] ++ETH++ gw8  0xf002603e 0xe8		# ETH_MAC_CORE_UNICAST_ADDRESS0
[    0.990176] ++ETH++ gw8  0xf002603f 0x4d		# ETH_MAC_CORE_UNICAST_ADDRESS0
[    0.994118] ++ETH++ gw8  0xf0026040 0x7f		# ETH_MAC_CORE_UNICAST_ADDRESS1
[    0.998060] ++ETH++ gw8  0xf0026041 0xc4		# ETH_MAC_CORE_UNICAST_ADDRESS1
[    1.002435] nb8800 26000.ethernet eth0: MAC address 00:16:e8:4d:7f:c4

# ip link set eth0 up
[   33.202058] ENTER nb8800_open
[   33.205134] ++ETH++ gw32 0xf0026204 0x0000000f	# RECEIVE_CHANNEL_STATUS
[   33.209610] ++ETH++ gw32 0xf0026104 0x0000000f	# TRANSMIT_CHANNEL_STATUS
[   33.214626] ++ETH++ gw32 0xf002620c 0x9de36000	# RECEIVE_DESCRIPTOR_ADDRESS
[   33.219138] ++ETH++ gw8  0xf0026004 0x23		# ETH_MAC_CORE_RECEIVE_CONTROL
[   33.223097] ++ETH++ gw8  0xf0026000 0x1d		# ETH_MAC_CORE_TRANSMIT_CONTROL
[   33.227054] nb8800_mdio_write: reg=0 val=8000	# by genphy_soft_reset()
[   33.339579] nb8800_mdio_write: reg=0 val=0000	# by at803x_resume()
[   33.370911] nb8800_mdio_write: reg=18 val=ec00	# by at803x_config_intr()
[   33.384357] ++ETH++ gw32 0xf0026200 0x0000028f	# RECEIVE_CHANNEL_CONTROL
[   33.393330] EXIT  nb8800_open

[   33.393332] ENTER nb8800_stop
[   33.421731] nb8800_mdio_write: reg=9 val=0200	# by genphy_config_aneg()
[   33.444062] nb8800_mdio_write: reg=0 val=1200	# by genphy_restart_aneg()
[   33.466411] nb8800_mdio_write: reg=18 val=0000	# by at803x_config_intr() from phy_stop()
[   33.488859] ++ETH++ gw8  0xf0026004 0x03		# ETH_MAC_CORE_RECEIVE_CONTROL
[   33.492805] ++ETH++ gw8  0xf0026044 0x04		# ETH_MAC_CORE_MAC_MODE  *WHY ??*
[   33.496748] ++ETH++ gw32 0xf002610c 0x9dcac000	\
[   33.501217] ++ETH++ gw32 0xf0026100 0x000002ff	|
[   33.505750] ++ETH++ gw32 0xf0026104 0x00000004	| IRQ?
[   33.510213] ++ETH++ gw32 0xf0026204 0x00000004	| 
[   33.514674] ++ETH++ gw32 0xf0026218 0x003cc4a4	|
[   33.519146] ++ETH++ gw32 0xf0026104 0x00000001	/
[   33.605719] ++ETH++ gw32 0xf002610c 0x9dcac000	\
[   33.610187] ++ETH++ gw32 0xf0026100 0x000002ff	|
[   33.614717] ++ETH++ gw32 0xf0026104 0x00000004	| ANOTHER IRQ?
[   33.619178] ++ETH++ gw32 0xf0026204 0x00000004	|
[   33.623637] ++ETH++ gw32 0xf0026218 0x003cc4a4	|
[   33.628106] ++ETH++ gw32 0xf0026104 0x00000001	/
[   33.714689] ++ETH++ gw32 0xf002610c 0x9dcac000	|
[   33.719156] ++ETH++ gw32 0xf0026100 0x000002ff	| ANOTHER IRQ?
[   33.723686] ++ETH++ gw32 0xf0026104 0x00000004	|
[   33.724625] ++ETH++ gw8  0xf0026004 0x23		# ETH_MAC_CORE_RECEIVE_CONTROL
[   33.724627] ++ETH++ gw8  0xf0026044 0x00		# ETH_MAC_CORE_MAC_MODE *AGAIN ??*
[   33.724640] ++ETH++ gw32 0xf002620c 0x9de36000
[   33.724642] ++ETH++ gw8  0xf0026004 0x22		# ETH_MAC_CORE_RECEIVE_CONTROL AGAIN
[   33.724644] ++ETH++ gw8  0xf0026000 0x1c		# ETH_MAC_CORE_TRANSMIT_CONTROL
[   33.724690] nb8800_mdio_write: reg=18 val=0000	# by at803x_config_intr() from phy_disable_interrupts()
[   33.724913] nb8800_mdio_write: reg=0 val=1800	# by at803x_suspend() from phy_suspend()
[   33.819562] ++ETH++ gw32 0xf0026204 0x00000004
[   33.824021] ++ETH++ gw32 0xf0026218 0x003cc4a4
[   33.828518] EXIT  nb8800_stop

[   33.831505] ENTER nb8800_open
[   33.834491] ++ETH++ gw32 0xf0026204 0x0000000f
[   33.838959] ++ETH++ gw32 0xf0026104 0x0000000f
[   33.843970] ++ETH++ gw32 0xf002620c 0x9dc1e000
[   33.848449] ++ETH++ gw8  0xf0026004 0x23
[   33.852396] ++ETH++ gw8  0xf0026000 0x1d
[   33.856343] nb8800_mdio_write: reg=0 val=8000
[   33.969686] nb8800_mdio_write: reg=0 val=0000
[   34.001128] nb8800_mdio_write: reg=18 val=ec00
[   34.014631] ++ETH++ gw32 0xf0026200 0x06100a8f	# RECEIVE_CHANNEL_CONTROL
[   34.023592] EXIT  nb8800_open

[   34.023760] nb8800_mdio_write: reg=9 val=0200
[   34.023890] nb8800_mdio_write: reg=0 val=1200
[   34.023991] nb8800_mdio_write: reg=18 val=0000
[   34.024122] nb8800_mdio_write: reg=18 val=ec00
[   34.200493] ++ETH++ gw8  0xf002602e 0x00
[   34.204473] ++ETH++ gw8  0xf002602e 0x00
[   34.208418] ++ETH++ gw8  0xf0026028 0x01
[   34.212363] ++ETH++ gw8  0xf0026029 0x00
[   34.216305] ++ETH++ gw8  0xf002602a 0x5e
[   34.220248] ++ETH++ gw8  0xf002602b 0x00
[   34.224190] ++ETH++ gw8  0xf002602c 0x00
[   34.228132] ++ETH++ gw8  0xf002602d 0x01
[   34.232073] ++ETH++ gw8  0xf002602e 0xff
[   36.738162] nb8800_mdio_write: reg=18 val=0000
[   36.769615] nb8800_mdio_write: reg=18 val=ec00
[   36.836941] ++ETH++ gw8  0xf0026050 0x01
[   36.840884] ++ETH++ gw8  0xf002601c 0xff
[   36.844829] ++ETH++ gw8  0xf0026044 0x81
[   36.848780] nb8800 26000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[   36.865544] nb8800_mdio_write: reg=18 val=ec00
diff mbox

Patch

diff -U0 /tmp/s0 /tmp/s1

ip link set eth0 down
sleep 1
ip link set eth0 up
sleep 1

ethtool -S eth0 > /tmp/s0
ping -c 10 172.27.64.1 > /tmp/v2
ethtool -S eth0 > /tmp/s1
diff -U0 /tmp/s0 /tmp/s1

Testing with a generic PHY driver (no Atheros 8035 support built).
Apparently, ethtool doesn't report any packet loss or error.

First time:

# diff -U0 /tmp/s0 /tmp/s1
--- /tmp/s0
+++ /tmp/s1
@@ -2,2 +2,2 @@ 
-     rx_bytes_ok: 0
-     rx_frames_ok: 0
+     rx_bytes_ok: 1084
+     rx_frames_ok: 11
@@ -6,2 +6,2 @@ 
-     rx_64_byte_frames: 0
-     rx_127_byte_frames: 0
+     rx_64_byte_frames: 1
+     rx_127_byte_frames: 10
@@ -22,6 +22,6 @@ 
-     rx_bytes: 0
-     rx_frames: 0
-     tx_bytes_ok: 0
-     tx_frames_ok: 0
-     tx_64_byte_frames: 0
-     tx_127_byte_frames: 0
+     rx_bytes: 1084
+     rx_frames: 11
+     tx_bytes_ok: 1084
+     tx_frames_ok: 11
+     tx_64_byte_frames: 1
+     tx_127_byte_frames: 10
@@ -33 +33 @@ 
-     tx_broadcast_frames: 0
+     tx_broadcast_frames: 1
@@ -43,2 +43,2 @@ 
-     tx_bytes: 0
-     tx_frames: 0
+     tx_bytes: 1084
+     tx_frames: 11


Second time:

# diff -U0 /tmp/s0 /tmp/s1
--- /tmp/s0
+++ /tmp/s1
@@ -2,2 +2,2 @@ 
-     rx_bytes_ok: 1276
-     rx_frames_ok: 14
+     rx_bytes_ok: 1779
+     rx_frames_ok: 19
@@ -6 +6 @@ 
-     rx_64_byte_frames: 4
+     rx_64_byte_frames: 8
@@ -8 +8 @@ 
-     rx_255_byte_frames: 0
+     rx_255_byte_frames: 1
@@ -14 +14 @@ 
-     rx_broadcast_frames: 0
+     rx_broadcast_frames: 1
@@ -22,5 +22,5 @@ 
-     rx_bytes: 1276
-     rx_frames: 14
-     tx_bytes_ok: 1276
-     tx_frames_ok: 14
-     tx_64_byte_frames: 4
+     rx_bytes: 1779
+     rx_frames: 19
+     tx_bytes_ok: 1724
+     tx_frames_ok: 21
+     tx_64_byte_frames: 11
@@ -33 +33 @@ 
-     tx_broadcast_frames: 1
+     tx_broadcast_frames: 8
@@ -43,2 +43,2 @@ 
-     tx_bytes: 1276
-     tx_frames: 14
+     tx_bytes: 1724
+     tx_frames: 21