diff mbox series

[net-next] net: phy: phylink: add fallback from SGMII to 1000BaseX

Message ID 1559330285-30246-4-git-send-email-hancock@sedsystems.ca
State Changes Requested
Delegated to: David Miller
Headers show
Series [net-next] net: phy: phylink: add fallback from SGMII to 1000BaseX | expand

Commit Message

Robert Hancock May 31, 2019, 7:18 p.m. UTC
Some copper SFP modules support both SGMII and 1000BaseX, but some
drivers/devices only support the 1000BaseX mode. Currently SGMII mode is
always being selected as the desired mode for such modules, and this
fails if the controller doesn't support SGMII. Add a fallback for this
case by trying 1000BaseX instead if the controller rejects SGMII mode.

Signed-off-by: Robert Hancock <hancock@sedsystems.ca>
---
 drivers/net/phy/phylink.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Comments

Russell King (Oracle) May 31, 2019, 8:18 p.m. UTC | #1
On Fri, May 31, 2019 at 01:18:04PM -0600, Robert Hancock wrote:
> Some copper SFP modules support both SGMII and 1000BaseX,

The situation is way worse than that.  Some copper SFP modules are
programmed to support SGMII only.  Others are programmed to support
1000baseX only.  There is no way to tell from the EEPROM how they
are configured, and there is no way to auto-probe the format of the
control word (which is the difference between the two.)

> but some
> drivers/devices only support the 1000BaseX mode. Currently SGMII mode is
> always being selected as the desired mode for such modules, and this
> fails if the controller doesn't support SGMII. Add a fallback for this
> case by trying 1000BaseX instead if the controller rejects SGMII mode.

So, what happens when a controller supports both SGMII and 1000base-X
modes (such as the Marvell devices) but the module is setup for
1000base-X mode?

> 
> Signed-off-by: Robert Hancock <hancock@sedsystems.ca>
> ---
>  drivers/net/phy/phylink.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> index 68d0a89..4fd72c2 100644
> --- a/drivers/net/phy/phylink.c
> +++ b/drivers/net/phy/phylink.c
> @@ -1626,6 +1626,7 @@ static int phylink_sfp_module_insert(void *upstream,
>  {
>  	struct phylink *pl = upstream;
>  	__ETHTOOL_DECLARE_LINK_MODE_MASK(support) = { 0, };
> +	__ETHTOOL_DECLARE_LINK_MODE_MASK(orig_support) = { 0, };
>  	struct phylink_link_state config;
>  	phy_interface_t iface;
>  	int ret = 0;
> @@ -1635,6 +1636,7 @@ static int phylink_sfp_module_insert(void *upstream,
>  	ASSERT_RTNL();
>  
>  	sfp_parse_support(pl->sfp_bus, id, support);
> +	linkmode_copy(orig_support, support);
>  	port = sfp_parse_port(pl->sfp_bus, id, support);
>  
>  	memset(&config, 0, sizeof(config));
> @@ -1663,6 +1665,25 @@ static int phylink_sfp_module_insert(void *upstream,
>  
>  	config.interface = iface;
>  	ret = phylink_validate(pl, support, &config);
> +
> +	if (ret && iface == PHY_INTERFACE_MODE_SGMII &&
> +	    phylink_test(orig_support, 1000baseX_Full)) {
> +		/* Copper modules may select SGMII but the interface may not
> +		 * support that mode, try 1000BaseX if supported.
> +		 */

Here, you are talking about what the module itself supports, but this
code is determining what it should do based on what the _network
controller_ supports.

If the SFP module is programmed for SGMII, and the network controller
supports 1000base-X, then it isn't going to work very well - the
sender of the control word will be sending one format, and the
receiver will be interpreting the bits wrongly.

> +
> +		netdev_warn(pl->netdev, "validation of %s/%s with support %*pb "
> +			    "failed: %d, trying 1000BaseX\n",
> +			    phylink_an_mode_str(MLO_AN_INBAND),
> +			    phy_modes(config.interface),
> +			    __ETHTOOL_LINK_MODE_MASK_NBITS, orig_support, ret);
> +		iface = PHY_INTERFACE_MODE_1000BASEX;
> +		config.interface = iface;
> +		linkmode_copy(config.advertising, orig_support);
> +		linkmode_copy(support, orig_support);
> +		ret = phylink_validate(pl, support, &config);
> +	}
> +
>  	if (ret) {
>  		phylink_err(pl, "validation of %s/%s with support %*pb failed: %d\n",
>  			    phylink_an_mode_str(MLO_AN_INBAND),
> -- 
> 1.8.3.1
> 
>
Robert Hancock June 1, 2019, 12:17 a.m. UTC | #2
On 2019-05-31 2:18 p.m., Russell King - ARM Linux admin wrote:
> On Fri, May 31, 2019 at 01:18:04PM -0600, Robert Hancock wrote:
>> Some copper SFP modules support both SGMII and 1000BaseX,
> 
> The situation is way worse than that.  Some copper SFP modules are
> programmed to support SGMII only.  Others are programmed to support
> 1000baseX only.  There is no way to tell from the EEPROM how they
> are configured, and there is no way to auto-probe the format of the
> control word (which is the difference between the two.)
> 
>> but some
>> drivers/devices only support the 1000BaseX mode. Currently SGMII mode is
>> always being selected as the desired mode for such modules, and this
>> fails if the controller doesn't support SGMII. Add a fallback for this
>> case by trying 1000BaseX instead if the controller rejects SGMII mode.
> 
> So, what happens when a controller supports both SGMII and 1000base-X
> modes (such as the Marvell devices) but the module is setup for
> 1000base-X mode?

My description is likely a bit incorrect.. rather than supporting both
1000BaseX and SGMII, a given module can support either 1000BaseX or
SGMII, but likely only one at a time (at least without magic
vendor-specific commands to switch modes).

The logic in sfp_select_interface always selects SGMII mode for copper
modules, which is the preferred mode of operation since 100 and 10 Mbps
modes won't work with 1000BaseX. If the controller and module actually
support SGMII, everything is fine. If the controller doesn't support
SGMII, it should fail validation and the link won't come up. If the
module doesn't support SGMII, it may try to come up but the link likely
won't work properly.

Our device is mainly intended for fiber modules, which is why 1000BaseX
is being used. The variant of fiber modules we are using (for example,
Finisar FCLF8520P2BTL) are set up for 1000BaseX, and seem like they are
kind of a hack to allow using copper on devices which only support
1000BaseX mode (in fact that particular one is extra hacky since you
have to disable 1000BaseX autonegotiation on the host side). This patch
is basically intended to allow that particular case to work.

It's kind of a dumb situation, but in the absence of a way to tell from
the EEPROM content what mode the module is actually in (and you would
likely know better than I if there was), it seems like the best we can do.

> 
>>
>> Signed-off-by: Robert Hancock <hancock@sedsystems.ca>
>> ---
>>  drivers/net/phy/phylink.c | 21 +++++++++++++++++++++
>>  1 file changed, 21 insertions(+)
>>
>> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
>> index 68d0a89..4fd72c2 100644
>> --- a/drivers/net/phy/phylink.c
>> +++ b/drivers/net/phy/phylink.c
>> @@ -1626,6 +1626,7 @@ static int phylink_sfp_module_insert(void *upstream,
>>  {
>>  	struct phylink *pl = upstream;
>>  	__ETHTOOL_DECLARE_LINK_MODE_MASK(support) = { 0, };
>> +	__ETHTOOL_DECLARE_LINK_MODE_MASK(orig_support) = { 0, };
>>  	struct phylink_link_state config;
>>  	phy_interface_t iface;
>>  	int ret = 0;
>> @@ -1635,6 +1636,7 @@ static int phylink_sfp_module_insert(void *upstream,
>>  	ASSERT_RTNL();
>>  
>>  	sfp_parse_support(pl->sfp_bus, id, support);
>> +	linkmode_copy(orig_support, support);
>>  	port = sfp_parse_port(pl->sfp_bus, id, support);
>>  
>>  	memset(&config, 0, sizeof(config));
>> @@ -1663,6 +1665,25 @@ static int phylink_sfp_module_insert(void *upstream,
>>  
>>  	config.interface = iface;
>>  	ret = phylink_validate(pl, support, &config);
>> +
>> +	if (ret && iface == PHY_INTERFACE_MODE_SGMII &&
>> +	    phylink_test(orig_support, 1000baseX_Full)) {
>> +		/* Copper modules may select SGMII but the interface may not
>> +		 * support that mode, try 1000BaseX if supported.
>> +		 */
> 
> Here, you are talking about what the module itself supports, but this
> code is determining what it should do based on what the _network
> controller_ supports.

orig_support is from just after sfp_parse_support is called, so it's
reflecting everything we think the module supports. The "net: sfp: Set
1000BaseX support flag for 1000BaseT modules" patch adds 1000BaseX to
that list for copper modules, which allows this code to detect that
1000BaseX might be a possibility.

> 
> If the SFP module is programmed for SGMII, and the network controller
> supports 1000base-X, then it isn't going to work very well - the
> sender of the control word will be sending one format, and the
> receiver will be interpreting the bits wrongly.

Agreed, but as I mentioned above, it doesn't appear that there's any
sensible way to avoid that. Without this patch, both SGMII and 1000BaseX
copper modules would fail in a 1000BaseX-only controller. With this
patch in place, the 1000BaseX module will work.
Sergei Shtylyov June 1, 2019, 9:46 a.m. UTC | #3
Hello!

On 31.05.2019 22:18, Robert Hancock wrote:

> Some copper SFP modules support both SGMII and 1000BaseX, but some
> drivers/devices only support the 1000BaseX mode. Currently SGMII mode is
> always being selected as the desired mode for such modules, and this
> fails if the controller doesn't support SGMII. Add a fallback for this
> case by trying 1000BaseX instead if the controller rejects SGMII mode.
> 
> Signed-off-by: Robert Hancock <hancock@sedsystems.ca>
> ---
>   drivers/net/phy/phylink.c | 21 +++++++++++++++++++++
>   1 file changed, 21 insertions(+)
> 
> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> index 68d0a89..4fd72c2 100644
> --- a/drivers/net/phy/phylink.c
> +++ b/drivers/net/phy/phylink.c
[...]
> @@ -1663,6 +1665,25 @@ static int phylink_sfp_module_insert(void *upstream,
>   
>   	config.interface = iface;
>   	ret = phylink_validate(pl, support, &config);
> +
> +	if (ret && iface == PHY_INTERFACE_MODE_SGMII &&
> +	    phylink_test(orig_support, 1000baseX_Full)) {
> +		/* Copper modules may select SGMII but the interface may not
> +		 * support that mode, try 1000BaseX if supported.
> +		 */
> +
> +		netdev_warn(pl->netdev, "validation of %s/%s with support %*pb "
> +			    "failed: %d, trying 1000BaseX\n",

    Don't break the messages like this, scripts/checkpatch.pl shouldn't 
complain about too long lines in this case.

[...]

MBR, Sergei
Russell King (Oracle) June 2, 2019, 3:15 p.m. UTC | #4
On Fri, May 31, 2019 at 06:17:51PM -0600, Robert Hancock wrote:
> Our device is mainly intended for fiber modules, which is why 1000BaseX
> is being used. The variant of fiber modules we are using (for example,
> Finisar FCLF8520P2BTL) are set up for 1000BaseX, and seem like they are
> kind of a hack to allow using copper on devices which only support
> 1000BaseX mode (in fact that particular one is extra hacky since you
> have to disable 1000BaseX autonegotiation on the host side). This patch
> is basically intended to allow that particular case to work.

Looking at the data sheet for FCLF8520P2BTL, it explicit states:

PRODUCT SELECTION
Part Number	Link Indicator	1000BASE-X auto-negotiation
		on RX_LOS Pin	enabled by default
FCLF8520P2BTL	Yes		No
FCLF8521P2BTL	No		Yes
FCLF8522P2BTL	Yes		Yes

The idea being, you buy the correct one according to what the host
equipment requires, rather than just picking one and hoping it works.

The data sheet goes on to mention that the module uses a Marvell
88e1111 PHY, which seems to be quite common for copper SFPs from
multiple manufacturers (but not all) and is very flexible in how it
can be configured.

If we detect a PHY on the SFP module, we check detect whether it is
an 88e1111 PHY, and then read out its configured link type.  We don't
have a way to deal with the difference between FCLF8520P2BTL and
FCLF8521P2BTL, but at least we'll be able to tell whether we should
be in 1000Base-X mode for these modules, rather than SGMII.

For a SFP cage meant to support fiber, I would recommend using the
FCLF8521P2BTL or FCLF8522P2BTL since those will behave more like a
802.3z standards-compliant gigabit fiber connection.
Robert Hancock June 7, 2019, 6:10 p.m. UTC | #5
On 2019-06-02 9:15 a.m., Russell King - ARM Linux admin wrote:
> On Fri, May 31, 2019 at 06:17:51PM -0600, Robert Hancock wrote:
>> Our device is mainly intended for fiber modules, which is why 1000BaseX
>> is being used. The variant of fiber modules we are using (for example,
>> Finisar FCLF8520P2BTL) are set up for 1000BaseX, and seem like they are
>> kind of a hack to allow using copper on devices which only support
>> 1000BaseX mode (in fact that particular one is extra hacky since you
>> have to disable 1000BaseX autonegotiation on the host side). This patch
>> is basically intended to allow that particular case to work.
> 
> Looking at the data sheet for FCLF8520P2BTL, it explicit states:
> 
> PRODUCT SELECTION
> Part Number	Link Indicator	1000BASE-X auto-negotiation
> 		on RX_LOS Pin	enabled by default
> FCLF8520P2BTL	Yes		No
> FCLF8521P2BTL	No		Yes
> FCLF8522P2BTL	Yes		Yes
> 
> The idea being, you buy the correct one according to what the host
> equipment requires, rather than just picking one and hoping it works.
> 
> The data sheet goes on to mention that the module uses a Marvell
> 88e1111 PHY, which seems to be quite common for copper SFPs from
> multiple manufacturers (but not all) and is very flexible in how it
> can be configured.
> 
> If we detect a PHY on the SFP module, we check detect whether it is
> an 88e1111 PHY, and then read out its configured link type.  We don't
> have a way to deal with the difference between FCLF8520P2BTL and
> FCLF8521P2BTL, but at least we'll be able to tell whether we should
> be in 1000Base-X mode for these modules, rather than SGMII.

It looks like that might provide a solution for modules using the
Marvell PHY, however some of the modules we are supporting seem to use a
Broadcom PHY, and I have no idea if there is any documentation for those.

It would really be rather silly if there were absolutely no way to tell
what mode the module wants from the EEPROM.. I don't have any copper
modules set up for SGMII, but below is the ethtool -m output for two of
the 1000BaseX modules I have. If you have access to an SGMII module, can
you compare this to what it indicates?

# ethtool -m eth1
	Identifier                                : 0x03 (SFP)
	Extended identifier                       : 0x04 (GBIC/SFP defined by
2-wire interface ID)
	Connector                                 : 0x00 (unknown or unspecified)
	Transceiver codes                         : 0x00 0x00 0x00 0x08 0x00
0x00 0x00 0x00 0x00
	Transceiver type                          : Ethernet: 1000BASE-T
	Encoding                                  : 0x01 (8B/10B)
	BR, Nominal                               : 1200MBd
	Rate identifier                           : 0x00 (unspecified)
	Length (SMF,km)                           : 0km
	Length (SMF)                              : 0m
	Length (50um)                             : 0m
	Length (62.5um)                           : 0m
	Length (Copper)                           : 100m
	Length (OM3)                              : 0m
	Laser wavelength                          : 0nm
	Vendor name                               : FINISAR CORP.
	Vendor OUI                                : 00:90:65
	Vendor PN                                 : FCLF8520P2BTL
	Vendor rev                                : A
	Option values                             : 0x00 0x12
	Option                                    : RX_LOS implemented
	Option                                    : TX_DISABLE implemented
	BR margin, max                            : 0%
	BR margin, min                            : 0%
	Vendor SN                                 : PX90NHX
	Date code                                 : 170303


# ethtool -m eth1
	Identifier                                : 0x03 (SFP)
	Extended identifier                       : 0x04 (GBIC/SFP defined by
2-wire interface ID)
	Connector                                 : 0x00 (unknown or unspecified)
	Transceiver codes                         : 0x00 0x00 0x00 0x08 0x00
0x00 0x00 0x00 0x00
	Transceiver type                          : Ethernet: 1000BASE-T
	Encoding                                  : 0x01 (8B/10B)
	BR, Nominal                               : 1300MBd
	Rate identifier                           : 0x00 (unspecified)
	Length (SMF,km)                           : 0km
	Length (SMF)                              : 0m
	Length (50um)                             : 0m
	Length (62.5um)                           : 0m
	Length (Copper)                           : 100m
	Length (OM3)                              : 0m
	Laser wavelength                          : 0nm
	Vendor name                               : BEL-FUSE
	Vendor OUI                                : 00:00:00
	Vendor PN                                 : 1GBT-SFP06
	Vendor rev                                : PB
	Option values                             : 0x00 0x12
	Option                                    : RX_LOS implemented
	Option                                    : TX_DISABLE implemented
	BR margin, max                            : 0%
	BR margin, min                            : 0%
	Vendor SN                                 :      0000000610
	Date code                                 : 1336


> 
> For a SFP cage meant to support fiber, I would recommend using the
> FCLF8521P2BTL or FCLF8522P2BTL since those will behave more like a
> 802.3z standards-compliant gigabit fiber connection.
>
Russell King (Oracle) June 7, 2019, 6:34 p.m. UTC | #6
On Fri, Jun 07, 2019 at 12:10:57PM -0600, Robert Hancock wrote:
> On 2019-06-02 9:15 a.m., Russell King - ARM Linux admin wrote:
> > On Fri, May 31, 2019 at 06:17:51PM -0600, Robert Hancock wrote:
> >> Our device is mainly intended for fiber modules, which is why 1000BaseX
> >> is being used. The variant of fiber modules we are using (for example,
> >> Finisar FCLF8520P2BTL) are set up for 1000BaseX, and seem like they are
> >> kind of a hack to allow using copper on devices which only support
> >> 1000BaseX mode (in fact that particular one is extra hacky since you
> >> have to disable 1000BaseX autonegotiation on the host side). This patch
> >> is basically intended to allow that particular case to work.
> > 
> > Looking at the data sheet for FCLF8520P2BTL, it explicit states:
> > 
> > PRODUCT SELECTION
> > Part Number	Link Indicator	1000BASE-X auto-negotiation
> > 		on RX_LOS Pin	enabled by default
> > FCLF8520P2BTL	Yes		No
> > FCLF8521P2BTL	No		Yes
> > FCLF8522P2BTL	Yes		Yes
> > 
> > The idea being, you buy the correct one according to what the host
> > equipment requires, rather than just picking one and hoping it works.
> > 
> > The data sheet goes on to mention that the module uses a Marvell
> > 88e1111 PHY, which seems to be quite common for copper SFPs from
> > multiple manufacturers (but not all) and is very flexible in how it
> > can be configured.
> > 
> > If we detect a PHY on the SFP module, we check detect whether it is
> > an 88e1111 PHY, and then read out its configured link type.  We don't
> > have a way to deal with the difference between FCLF8520P2BTL and
> > FCLF8521P2BTL, but at least we'll be able to tell whether we should
> > be in 1000Base-X mode for these modules, rather than SGMII.
> 
> It looks like that might provide a solution for modules using the
> Marvell PHY, however some of the modules we are supporting seem to use a
> Broadcom PHY, and I have no idea if there is any documentation for those.
> 
> It would really be rather silly if there were absolutely no way to tell
> what mode the module wants from the EEPROM..

It is something I've spent weeks looking at from many different angles.
There is no way to tell.

You have to bear in mind that 1000BaseX and SGMII are essentially
identical, except for the interpretation of that 16-bit control word
and how it is handled.  Both are 1250Mbaud, both are 8b/10b encoded.
Both identify as supporting 1000BASE-T.

As I've said, the only way I can come up with is a hard-coded table
of vendor name/part number to identify what each one requires.
diff mbox series

Patch

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 68d0a89..4fd72c2 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -1626,6 +1626,7 @@  static int phylink_sfp_module_insert(void *upstream,
 {
 	struct phylink *pl = upstream;
 	__ETHTOOL_DECLARE_LINK_MODE_MASK(support) = { 0, };
+	__ETHTOOL_DECLARE_LINK_MODE_MASK(orig_support) = { 0, };
 	struct phylink_link_state config;
 	phy_interface_t iface;
 	int ret = 0;
@@ -1635,6 +1636,7 @@  static int phylink_sfp_module_insert(void *upstream,
 	ASSERT_RTNL();
 
 	sfp_parse_support(pl->sfp_bus, id, support);
+	linkmode_copy(orig_support, support);
 	port = sfp_parse_port(pl->sfp_bus, id, support);
 
 	memset(&config, 0, sizeof(config));
@@ -1663,6 +1665,25 @@  static int phylink_sfp_module_insert(void *upstream,
 
 	config.interface = iface;
 	ret = phylink_validate(pl, support, &config);
+
+	if (ret && iface == PHY_INTERFACE_MODE_SGMII &&
+	    phylink_test(orig_support, 1000baseX_Full)) {
+		/* Copper modules may select SGMII but the interface may not
+		 * support that mode, try 1000BaseX if supported.
+		 */
+
+		netdev_warn(pl->netdev, "validation of %s/%s with support %*pb "
+			    "failed: %d, trying 1000BaseX\n",
+			    phylink_an_mode_str(MLO_AN_INBAND),
+			    phy_modes(config.interface),
+			    __ETHTOOL_LINK_MODE_MASK_NBITS, orig_support, ret);
+		iface = PHY_INTERFACE_MODE_1000BASEX;
+		config.interface = iface;
+		linkmode_copy(config.advertising, orig_support);
+		linkmode_copy(support, orig_support);
+		ret = phylink_validate(pl, support, &config);
+	}
+
 	if (ret) {
 		phylink_err(pl, "validation of %s/%s with support %*pb failed: %d\n",
 			    phylink_an_mode_str(MLO_AN_INBAND),