[RFC,net-next,0/3] RTNL: Add link-down reason reporting

Message ID cover.1552672441.git.petrm@mellanox.com

Message

Petr Machata March 15, 2019, 5:56 p.m. UTC
In general, after a port is put administratively up, certain handshake
protocols have to finish successfully, otherwise the port is left in a
NO-CARRIER or DORMANT state. When that happens, it would be useful to
communicate the reasons to the administrator, so that the problem that
prevents the link from being established can be corrected.

This patch set adds two new RTNL attributes, IFLA_LINK_DOWN_REASON_MAJOR
and _MINOR, to carry the information. Major reason codes are drawn from
a well-known enum that is part of the kernel interface. They serve as
broad categories intended to convey a general idea of where the problem
is. Minor codes are arbitrary numbers specific to the driver in
question that add detail to the major reasons.
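To illustrate the major/minor split, here is a minimal userspace sketch that packs the two codes as 32-bit netlink-style TLVs and reads them back. It is not taken from the patch: the attribute numbers, the enum values, and every ex_-prefixed name are made up for the example, and only the NO_CABLE category name is borrowed from the example output further down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute numbers; the real IFLA_* values are set by the patch. */
enum {
	EX_IFLA_LINK_DOWN_REASON_MAJOR = 1,
	EX_IFLA_LINK_DOWN_REASON_MINOR = 2,
};

/* Hypothetical major codes; only the NO_CABLE name appears in the cover letter. */
enum ex_major {
	EX_MAJOR_UNKNOWN = 0,
	EX_MAJOR_NO_CABLE = 1,
};

/* Same shape as a netlink attribute header: len covers header plus payload. */
struct ex_attr {
	uint16_t len;
	uint16_t type;
};

/* Attributes are padded to a 4-byte boundary. */
#define EX_ALIGN(n) (((size_t)(n) + 3u) & ~(size_t)3u)

/* Append one u32 attribute. Returns bytes consumed, or 0 if it does not fit. */
size_t ex_put_u32(uint8_t *buf, size_t room, uint16_t type, uint32_t val)
{
	struct ex_attr hdr;
	size_t need;

	assert(buf != NULL);
	hdr.len = (uint16_t)(sizeof(struct ex_attr) + sizeof(val));
	hdr.type = type;
	need = EX_ALIGN(hdr.len);
	if (need > room)
		return 0;
	memcpy(buf, &hdr, sizeof(hdr));
	memcpy(buf + sizeof(hdr), &val, sizeof(val));
	return need;
}

/* Look up a u32 attribute by type. Returns 1 if found, 0 otherwise. */
int ex_get_u32(const uint8_t *buf, size_t len, uint16_t type, uint32_t *val)
{
	size_t off = 0;

	assert(buf != NULL && val != NULL);
	while (off + sizeof(struct ex_attr) <= len) {
		struct ex_attr hdr;

		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.len < sizeof(hdr) || off + hdr.len > len)
			return 0; /* malformed attribute, stop parsing */
		if (hdr.type == type && hdr.len == sizeof(hdr) + sizeof(*val)) {
			memcpy(val, buf + off + sizeof(hdr), sizeof(*val));
			return 1;
		}
		off += EX_ALIGN(hdr.len);
	}
	return 0;
}
```

A reader that only understands the major code can stop after the first lookup; the minor attribute is there for whoever needs the driver-specific detail.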

The hope is that an average user will not need to dive into the minor
reason codes. It is for example largely immaterial what it is that makes
any given cable unsupported, because the administrator will just take
another cable anyway. The minor code may still be provided though, for
the cases where further information is actually necessary.

The party with visibility into details of this process is the driver.
Therefore add two new RTNL hooks, link_down_reason_get_size and
fill_link_down_reason, to provide the necessary information.
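A rough userspace mock of how the two hooks might fit together is sketched here. Only the hook names come from this cover letter. The signatures, the ex_-prefixed types, and the mock driver are assumptions for illustration, and the actual patch may well look different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ex_dev;

/* Simplified stand-in for a netlink message under construction. */
struct ex_msg {
	size_t room;      /* bytes still available */
	int has_reason;   /* set once a reason has been filled in */
	uint32_t major;
	uint32_t minor;
};

/* Hook names follow the cover letter; the signatures are guesses. */
struct ex_link_ops {
	size_t (*link_down_reason_get_size)(const struct ex_dev *dev);
	int (*fill_link_down_reason)(struct ex_msg *msg, const struct ex_dev *dev);
};

struct ex_dev {
	const struct ex_link_ops *ops;
	uint32_t hw_major;   /* what the (mock) hardware reports */
	uint32_t hw_minor;
};

/* Mock driver: two u32 attributes, each a 4-byte header plus 4-byte payload. */
size_t ex_drv_get_size(const struct ex_dev *dev)
{
	(void)dev;
	return 2 * 8;
}

int ex_drv_fill(struct ex_msg *msg, const struct ex_dev *dev)
{
	msg->major = dev->hw_major;
	msg->minor = dev->hw_minor;
	msg->has_reason = 1;
	return 0;
}

const struct ex_link_ops ex_drv_ops = {
	.link_down_reason_get_size = ex_drv_get_size,
	.fill_link_down_reason = ex_drv_fill,
};

/* Core side: ask for the size first, then let the driver fill the message. */
int ex_core_fill(struct ex_msg *msg, const struct ex_dev *dev)
{
	const struct ex_link_ops *ops;
	size_t need;

	assert(msg != NULL && dev != NULL);
	ops = dev->ops;
	if (!ops || !ops->link_down_reason_get_size ||
	    !ops->fill_link_down_reason)
		return 0; /* driver does not report a reason; not an error */

	need = ops->link_down_reason_get_size(dev);
	if (need > msg->room)
		return -1; /* stand-in for an out-of-space error */

	msg->room -= need;
	return ops->fill_link_down_reason(msg, dev);
}
```

Splitting the size query from the fill step mirrors the usual get-size/fill pairing of rtnetlink callbacks, where the message is sized before it is populated. A driver that leaves the hooks unset simply reports nothing.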

Link-down reason is not included if the port is up or administratively
down, because those two states are easy to discover through existing
interfaces. The new interface is intended for debugging of the
transition between these two states.
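The rule above boils down to a small predicate. The sketch below models the port state with two booleans; the struct and function names are hypothetical, and the real code would presumably derive this from the netdevice flags and operational state instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a port's state. */
struct ex_port_state {
	bool admin_up;   /* administrator has set the port up */
	bool oper_up;    /* link is actually established */
};

/* Report a reason only for ports that should be up but are not. */
bool ex_should_report_reason(const struct ex_port_state *st)
{
	assert(st != NULL);
	return st->admin_up && !st->oper_up;
}
```

Of the four combinations, only "administratively up, link not established" yields a reason, which is the NO-CARRIER or DORMANT situation described at the top.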

This is all in patch #1. Patches #2 and #3 add an implementation of the
new interfaces for mlxsw.

A preliminary iproute2 patch that implements display of the new
attributes is available here:

    https://github.com/pmachata/iproute2/tree/link_down_reason

And this is an example output:

    # ip -d link show dev sw1p1
    393: sw1p1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000
        link/ether 7c:fe:90:f5:a3:7d brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 0 maxmtu 65535
        mlxsw addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 \
        portname p1 switchid 7cfe90f5a340 down_reason NO_CABLE 1024

Petr Machata (3):
  net: rtnetlink: Add link-down reason to RTNL messages
  mlxsw: reg: Add Port Diagnostics Database Register
  mlxsw: spectrum: Add rtnl_link_ops

 drivers/net/ethernet/mellanox/mlxsw/reg.h      |  54 +++++++++++++
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 108 +++++++++++++++++++++++++
 include/net/rtnetlink.h                        |   3 +
 include/uapi/linux/if_link.h                   |  16 ++++
 net/core/rtnetlink.c                           |  22 +++++
 5 files changed, 203 insertions(+)

Comments

Andrew Lunn March 16, 2019, 2:06 a.m. UTC | #1
> The party with visibility into details of this process is the driver.

Hi Petr

In the general case, i would disagree with this. It is the PHY layer
which knows about these things. phylib and phylink. The MAC driver has
no idea, it just sees that the carrier is off.

There are however some drivers which do PHYs without using the Linux
core code. But there are not so many of them. I would hope this code
is designed for the general case, and can also be used by those that
ignore the core code.

Please could you explain how you see this being linked to phylib and
phylink, without having to modify every MAC driver.

	 Andrew

Petr Machata March 18, 2019, 12:11 p.m. UTC | #2
Andrew Lunn <andrew@lunn.ch> writes:

>> The party with visibility into details of this process is the driver.
>
> In the general case, i would disagree with this. It is the PHY layer
> which knows about these things. phylib and phylink. The MAC driver has
> no idea, it just sees that the carrier is off.
>
> There are however some drivers which do PHYs without using the Linux
> core code. But there are not so many of them. I would hope this code
> is designed for the general case, and can also be used by those that
> ignore the core code.
>
> Please could you explain how you see this being linked to phylib and
> phylink, without having to modify every MAC driver.

I'll take a look at how to make this work with the phy drivers. Thanks,
Petr