Message ID | 20240418152503.30820-2-asmaa@nvidia.com |
---|---|
State | New |
Series | UBUNTU: SAUCE: mlxbf-gige: autonegotiation fails to complete on BF2 |
Hi Tim Gardner <tim.gardner@canonical.com> and Bartlomiej Zolnierkiewicz <bartlomiej.zolnierkiewicz@canonical.com>, could you please review this patch?

> -----Original Message-----
> From: Asmaa Mnebhi <asmaa@nvidia.com>
> Sent: Thursday, April 18, 2024 11:25 AM
> To: kernel-team@lists.ubuntu.com
> Cc: Asmaa Mnebhi <asmaa@nvidia.com>; David Thompson <davthompson@nvidia.com>
> Subject: [SRU][J:linux-bluefield][PATCH v1 1/1] UBUNTU: SAUCE: mlxbf-gige:
> autonegotiation fails to complete on BF2
>
> BugLink: https://bugs.launchpad.net/bugs/2062384
>
> During their reboot test, QA found an intermittent issue where the OOB link is
> down.
> The link is down because the KSZ9031 PHY fails to complete autonegotiation.
> Even under "normal" circumstances where autonegotiation completes, it takes
> an abnormal time to do so (on average, at least 8 seconds).
>
> Hence, the hardware team and Microchip are involved in this debug but the root
> cause is still unknown.
> In the meantime, we need to provide a software workaround since customers are
> starting to see this issue as well.
>
> Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
> Reviewed-by: David Thompson <davthompson@nvidia.com>
> ---
>  .../mellanox/mlxbf_gige/mlxbf_gige_main.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> index 56235cef5cd6..e377aaa4a2f4 100644
> --- a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> +++ b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
> @@ -132,6 +132,7 @@ static int mlxbf_gige_open(struct net_device *netdev)
>  {
>  	struct mlxbf_gige *priv = netdev_priv(netdev);
>  	struct phy_device *phydev = netdev->phydev;
> +	u8 timeout = 10;
>  	u64 control;
>  	u64 int_en;
>  	int err;
> @@ -154,6 +155,22 @@ static int mlxbf_gige_open(struct net_device *netdev)
>
>  	phy_start(phydev);
>
> +	if (priv->hw_version == MLXBF_GIGE_BLUEFIELD2) {
> +		/* On BlueField-2 systems, the KSZ9031 PHY hardware could fail
> +		 * to complete autonegotiation and so the link remains down.
> +		 * The software workaround is to restart autonegotiation.
> +		 */
> +		while (timeout) {
> +			if (phy_aneg_done(phydev))
> +				break;
> +			msleep(1000);
> +			timeout--;
> +		};
> +
> +		if (timeout == 0)
> +			phy_restart_aneg(phydev);
> +	}
> +
>  	err = mlxbf_gige_tx_init(priv);
>  	if (err)
>  		goto phy_deinit;
> --
> 2.30.1
++ Vladimir Sokolovsky <vlad@nvidia.com>

From: Asmaa Mnebhi <asmaa@nvidia.com>
Sent: Monday, April 29, 2024 9:22 AM
To: kernel-team@lists.ubuntu.com; Tim Gardner <tim.gardner@canonical.com>; Bartlomiej Zolnierkiewicz <bartlomiej.zolnierkiewicz@canonical.com>
Cc: David Thompson <davthompson@nvidia.com>
Subject: RE: [SRU][J:linux-bluefield][PATCH v1 1/1] UBUNTU: SAUCE: mlxbf-gige: autonegotiation fails to complete on BF2

Hi Tim Gardner <tim.gardner@canonical.com> and Bartlomiej Zolnierkiewicz <bartlomiej.zolnierkiewicz@canonical.com>, could you please review this patch?
diff --git a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
index 56235cef5cd6..e377aaa4a2f4 100644
--- a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
+++ b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
@@ -132,6 +132,7 @@ static int mlxbf_gige_open(struct net_device *netdev)
 {
 	struct mlxbf_gige *priv = netdev_priv(netdev);
 	struct phy_device *phydev = netdev->phydev;
+	u8 timeout = 10;
 	u64 control;
 	u64 int_en;
 	int err;
@@ -154,6 +155,22 @@ static int mlxbf_gige_open(struct net_device *netdev)
 
 	phy_start(phydev);
 
+	if (priv->hw_version == MLXBF_GIGE_BLUEFIELD2) {
+		/* On BlueField-2 systems, the KSZ9031 PHY hardware could fail
+		 * to complete autonegotiation and so the link remains down.
+		 * The software workaround is to restart autonegotiation.
+		 */
+		while (timeout) {
+			if (phy_aneg_done(phydev))
+				break;
+			msleep(1000);
+			timeout--;
+		};
+
+		if (timeout == 0)
+			phy_restart_aneg(phydev);
+	}
+
 	err = mlxbf_gige_tx_init(priv);
 	if (err)
 		goto phy_deinit;
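
For reviewers who want to reason about the polling loop outside the kernel, the following is a minimal userspace sketch of the same control flow. It is not driver code: `struct fake_phy`, `fake_aneg_done()`, `fake_sleep_1s()`, `fake_restart_aneg()` and `wait_or_restart_aneg()` are hypothetical stand-ins for `phy_aneg_done()`, `msleep(1000)` and `phy_restart_aneg()`, and the sleep is only counted, not performed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PHY state; not a kernel structure. */
struct fake_phy {
	unsigned int done_after;	/* polls that report "not done" before success */
	unsigned int polls;		/* calls made to fake_aneg_done() */
	unsigned int sleeps;		/* simulated one-second sleeps */
	unsigned int restarts;		/* simulated autonegotiation restarts */
};

/* Stand-in for phy_aneg_done(): true once done_after polls have failed. */
static bool fake_aneg_done(struct fake_phy *phy)
{
	phy->polls++;
	return phy->polls > phy->done_after;
}

/* Stand-in for msleep(1000): only counts the sleep. */
static void fake_sleep_1s(struct fake_phy *phy)
{
	phy->sleeps++;
}

/* Stand-in for phy_restart_aneg(): only counts the restart. */
static void fake_restart_aneg(struct fake_phy *phy)
{
	phy->restarts++;
}

/*
 * Same control flow as the patch: poll up to `timeout` times, sleeping
 * between polls, and restart autonegotiation only if every poll failed.
 * Returns true if a restart was issued.
 */
static bool wait_or_restart_aneg(struct fake_phy *phy, unsigned int timeout)
{
	assert(phy != NULL);

	while (timeout) {
		if (fake_aneg_done(phy))
			break;
		fake_sleep_1s(phy);
		timeout--;
	}

	if (timeout == 0) {
		fake_restart_aneg(phy);
		return true;
	}
	return false;
}
```

With `done_after = 3` and a timeout of 10, the sketch sleeps three times and issues no restart. With a PHY that never reports completion, it sleeps ten times and restarts once. One property visible in the sketch: completion is not rechecked after the final sleep, so a PHY that finishes during the tenth second would still be restarted.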