net: phy: phy_remove_link_mode should not advertise new modes

Message ID 20200714082540.GA31028@laureti-dev
State Rejected
Delegated to: David Miller
Series net: phy: phy_remove_link_mode should not advertise new modes

Commit Message

Helmut Grohne July 14, 2020, 8:25 a.m. UTC
When doing "ip link set dev ... up" for a ksz9477 backed link,
ksz9477_phy_setup is called and it calls phy_remove_link_mode to remove
1000baseT HDX. During phy_remove_link_mode, phy_advertise_supported is
called.

If one wants to advertise fewer modes than the supported ones, one
usually reduces the advertised link modes before upping the link (e.g.
by passing an appropriate .link file to udev).  However, upping the link
overwrites the advertised link modes due to the call to
phy_advertise_supported reverting to the supported link modes.

It seems unintentional to have phy_remove_link_mode enable advertising
bits and it does not match its description in any way. Instead of
calling phy_advertise_supported, we should simply clear the link mode to
be removed from both supported and advertising.

Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de>
Fixes: 41124fa64d4b29 ("net: ethernet: Add helper to remove a supported link mode")
---
 drivers/net/phy/phy_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

David Miller July 14, 2020, 9:07 p.m. UTC | #1
From: Helmut Grohne <helmut.grohne@intenta.de>
Date: Tue, 14 Jul 2020 10:25:42 +0200

> When doing "ip link set dev ... up" for a ksz9477 backed link,
> ksz9477_phy_setup is called and it calls phy_remove_link_mode to remove
> 1000baseT HDX. During phy_remove_link_mode, phy_advertise_supported is
> called.
> 
> If one wants to advertise fewer modes than the supported ones, one
> usually reduces the advertised link modes before upping the link (e.g.
> by passing an appropriate .link file to udev).  However, upping the link
> overwrites the advertised link modes due to the call to
> phy_advertise_supported reverting to the supported link modes.
> 
> It seems unintentional to have phy_remove_link_mode enable advertising
> bits and it does not match its description in any way. Instead of
> calling phy_advertise_supported, we should simply clear the link mode to
> be removed from both supported and advertising.
> 
> Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de>
> Fixes: 41124fa64d4b29 ("net: ethernet: Add helper to remove a supported link mode")

The problem is that we can't allow the advertised setting to exceed
what is in the supported list.

That's why this helper has been coded this way since day one.
Helmut Grohne July 15, 2020, 7:03 a.m. UTC | #2
On Tue, Jul 14, 2020 at 11:07:10PM +0200, David Miller wrote:
> From: Helmut Grohne <helmut.grohne@intenta.de>
> Date: Tue, 14 Jul 2020 10:25:42 +0200
> 
> > When doing "ip link set dev ... up" for a ksz9477 backed link,
> > ksz9477_phy_setup is called and it calls phy_remove_link_mode to remove
> > 1000baseT HDX. During phy_remove_link_mode, phy_advertise_supported is
> > called.
> > 
> > If one wants to advertise fewer modes than the supported ones, one
> > usually reduces the advertised link modes before upping the link (e.g.
> > by passing an appropriate .link file to udev).  However, upping the link
> > overwrites the advertised link modes due to the call to
> > phy_advertise_supported reverting to the supported link modes.
> > 
> > It seems unintentional to have phy_remove_link_mode enable advertising
> > bits and it does not match its description in any way. Instead of
> > calling phy_advertise_supported, we should simply clear the link mode to
> > be removed from both supported and advertising.
> 
> The problem is that we can't allow the advertised setting to exceed
> what is in the supported list.
> 
> That's why this helper has been coded this way since day one.

Would you mind going into a little more detail here?

I think you have essentially two possible cases with respect to that
assertion.

Case A: advertised does not exceed supported before the call to
        phy_remove_link_mode.

    In this case, the relevant link mode is removed from both supported
    and advertised after my patch and therefore the requested invariant
    is still ok.

Case B: advertised exceeds supported prior to the call to
        phy_remove_link_mode.

    You said that we cannot allow this to happen. So it would seem to be
    a bug somewhere else. Do you see phy_remove_link_mode as a tool to
    fix up a violated invariant?

It also is not true that the current code ensures your assertion.
Specifically, phy_advertise_supported copies the pause bits from the old
advertised to the new one regardless of whether they're set in
supported. I believe this is expected, but it means that your invariant
needs to be:

    We cannot allow advertised to exceed the supported list for
    non-pause bits.

In any case, having a helper called "phy_remove_link_mode" enable bits
in the advertised bit field is fairly unexpected. Do you disagree with
this being a bug?
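
The two cases above can be sketched in userspace C. This is only a rough model, not kernel code: link modes are plain bits in a uint32_t instead of the kernel's link mode bitmaps, and every name below (MODE_*, struct phy_model, remove_mode_current, remove_mode_patched) is hypothetical. The "current" variant follows the behaviour described in this thread: advertising is reset to supported, with the old pause bits carried over.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for ETHTOOL_LINK_MODE_* bits, not the kernel values. */
#define MODE_100_FULL   (1u << 0)
#define MODE_1000_HALF  (1u << 1)
#define MODE_1000_FULL  (1u << 2)
#define MODE_PAUSE      (1u << 3)
#define MODE_ASYM_PAUSE (1u << 4)
#define PAUSE_MASK      (MODE_PAUSE | MODE_ASYM_PAUSE)

struct phy_model {
	uint32_t supported;
	uint32_t advertising;
};

/*
 * Model of the current helper: clear the mode in supported, then reset
 * advertising to supported while keeping the old pause bits.
 */
void remove_mode_current(struct phy_model *p, uint32_t mode)
{
	uint32_t old_pause = p->advertising & PAUSE_MASK;

	p->supported &= ~mode;
	p->advertising = (p->supported & ~PAUSE_MASK) | old_pause;
}

/* Model of the patched helper: clear the mode in both bit fields. */
void remove_mode_patched(struct phy_model *p, uint32_t mode)
{
	p->supported &= ~mode;
	p->advertising &= ~mode;
}

/* The invariant under discussion, restricted to non-pause bits. */
bool non_pause_within_supported(const struct phy_model *p)
{
	return ((p->advertising & ~PAUSE_MASK) & ~p->supported) == 0;
}

/* Case A, with a user who narrowed advertising to 100baseT full duplex. */
void check_case_a(void)
{
	struct phy_model cur = {
		.supported = MODE_100_FULL | MODE_1000_HALF | MODE_1000_FULL,
		.advertising = MODE_100_FULL,
	};
	struct phy_model pat = cur;

	remove_mode_current(&cur, MODE_1000_HALF);
	remove_mode_patched(&pat, MODE_1000_HALF);

	/* Both variants keep the non-pause invariant... */
	assert(non_pause_within_supported(&cur));
	assert(non_pause_within_supported(&pat));
	/* ...but only the current one re-enables 1000baseT full duplex. */
	assert(cur.advertising == (MODE_100_FULL | MODE_1000_FULL));
	assert(pat.advertising == MODE_100_FULL);
}
```

In this model both variants preserve the invariant in Case A; they differ only in whether the user's narrowed advertising survives the call.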

Helmut
Jakub Kicinski July 15, 2020, 6:20 p.m. UTC | #3
On Wed, 15 Jul 2020 09:03:45 +0200 Helmut Grohne wrote:
> On Tue, Jul 14, 2020 at 11:07:10PM +0200, David Miller wrote:
> > From: Helmut Grohne <helmut.grohne@intenta.de>
> > Date: Tue, 14 Jul 2020 10:25:42 +0200
> >   
> > > When doing "ip link set dev ... up" for a ksz9477 backed link,
> > > ksz9477_phy_setup is called and it calls phy_remove_link_mode to remove
> > > 1000baseT HDX. During phy_remove_link_mode, phy_advertise_supported is
> > > called.
> > > 
> > > If one wants to advertise fewer modes than the supported ones, one
> > > usually reduces the advertised link modes before upping the link (e.g.
> > > by passing an appropriate .link file to udev).  However, upping the link
> > > overwrites the advertised link modes due to the call to
> > > phy_advertise_supported reverting to the supported link modes.
> > > 
> > > It seems unintentional to have phy_remove_link_mode enable advertising
> > > bits and it does not match its description in any way. Instead of
> > > calling phy_advertise_supported, we should simply clear the link mode to
> > > be removed from both supported and advertising.  
> > 
> > The problem is that we can't allow the advertised setting to exceed
> > what is in the supported list.
> > 
> > That's why this helper has been coded this way since day one.
> 
> Would you mind going into a little more detail here?
> 
> I think you have essentially two possible cases with respect to that
> assertion.
> 
> Case A: advertised does not exceed supported before the call to
>         phy_remove_link_mode.
> 
>     In this case, the relevant link mode is removed from both supported
>     and advertised after my patch and therefore the requested invariant
>     is still ok.
> 
> Case B: advertised exceeds supported prior to the call to
>         phy_remove_link_mode.
> 
>     You said that we cannot allow this to happen. So it would seem to be
>     a bug somewhere else. Do you see phy_remove_link_mode as a tool to
>     fix up a violated invariant?

Is 

Case C: driver does not initialize advertised at all and depends on
        phy_remove_link_mode() to do it

possible?

> It also is not true that the current code ensures your assertion.
> Specifically, phy_advertise_supported copies the pause bits from the old
> advertised to the new one regardless of whether they're set in
> supported. I believe this is expected, but it means that your invariant
> needs to be:
> 
>     We cannot allow advertised to exceed the supported list for
>     non-pause bits.
> 
> In any case, having a helper called "phy_remove_link_mode" enable bits
> in the advertised bit field is fairly unexpected. Do you disagree with
> this being a bug?

Hm. I think it's clear that the change may uncover other bugs, but
perhaps indeed those should be addressed elsewhere.

Andrew, WDYT?
Andrew Lunn July 15, 2020, 6:51 p.m. UTC | #4
> It also is not true that the current code ensures your assertion.
> Specifically, phy_advertise_supported copies the pause bits from the old
> advertised to the new one regardless of whether they're set in
> supported.

This is an oddity of Pause. The PHY should not set Pause in
supported, because the PHY is not the device which implements
Pause. The MAC needs to indicate to PHYLIB it implements Pause, and
then the PHY will advertise Pause.

I will address the other points in a separate email.

  Andrew
Andrew Lunn July 15, 2020, 7:01 p.m. UTC | #5
> Is 
> 
> Case C: driver does not initialize advertised at all and depends on
>         phy_remove_link_mode() to do it
> 
> possible?

No. phylib initializes advertising to supported as part of probing the
PHY. So the PHY by default advertises everything it supports, except
the oddities of Pause.

    Andrew
Andrew Lunn July 15, 2020, 7:27 p.m. UTC | #6
On Tue, Jul 14, 2020 at 10:25:42AM +0200, Helmut Grohne wrote:
> When doing "ip link set dev ... up" for a ksz9477 backed link,
> ksz9477_phy_setup is called and it calls phy_remove_link_mode to remove
> 1000baseT HDX. During phy_remove_link_mode, phy_advertise_supported is
> called.
> 
> If one wants to advertise fewer modes than the supported ones, one
> usually reduces the advertised link modes before upping the link (e.g.
> by passing an appropriate .link file to udev).  However, upping the link
> overwrites the advertised link modes due to the call to
> phy_advertise_supported reverting to the supported link modes.
> 
> It seems unintentional to have phy_remove_link_mode enable advertising
> bits and it does not match its description in any way. Instead of
> calling phy_advertise_supported, we should simply clear the link mode to
> be removed from both supported and advertising.

We have two different reasons for removing link modes.

1) The PHY cannot support a link mode. E.g.

static int bcm84881_get_features(struct phy_device *phydev)
{
        int ret;

        ret = genphy_c45_pma_read_abilities(phydev);
        if (ret)
                return ret;

        /* Although the PHY sets bit 1.11.8, it does not support 10M modes */
        linkmode_clear_bit(ETHTOOL_LINK_MODE_10baseT_Half_BIT,
                           phydev->supported);
        linkmode_clear_bit(ETHTOOL_LINK_MODE_10baseT_Full_BIT,
                           phydev->supported);

        return 0;
}

This is done very early on, as part of probing the PHY. This is done
before supported is copied into advertised towards the end of the PHY's
probe.

2) The MAC does not support a link mode. It uses
phy_remove_link_mode() to remove a link mode. There are two different
times this can be done:

a) As part of open(), the PHY is connected to the MAC. Since the PHY
is not connected to the MAC until you open it, you cannot use ethtool
to change the advertised modes until you have opened it. Hence user
space cannot have removed anything and you don't need to worry about
this copy.

b) As part of the MAC driver's probe, the PHY is connected to the MAC.
In this case, ethtool can be used by userspace to remove link
modes. But the MAC driver should have already removed the modes it does
not support, directly after connecting the PHY to the MAC in its probe
function. So advertising and supported are the same already.

The key point here is ksz9477_phy_setup(), and how it breaks this
model. It is called from ksz_enable_port(). That is called via
dsa_port_enable() in dsa_slave_open(). But the PHY was connected to
the MAC during probe of the MAC. So we have a bad mix of a) and b),
which is leading to your problem. You need to fix the switch driver so
it cleanly does b) and removes the link mode early on, before the user
has a chance to use ethtool.
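
The ordering problem can be illustrated with a small userspace model. This is a sketch under stated assumptions, not driver code: link modes are bits in a uint32_t, pause bits are left out for brevity, and all names (struct port_model, the model_* functions, the MODE_* bits) are hypothetical. It compares removing the mode at open time, after userspace has narrowed advertising, with removing it at probe time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mode bits for illustration only. */
#define MODE_100_FULL   (1u << 0)
#define MODE_1000_HALF  (1u << 1)
#define MODE_1000_FULL  (1u << 2)

struct port_model {
	uint32_t supported;
	uint32_t advertising;
};

/* PHY probe: supported is filled in, then copied into advertising. */
void model_phy_probe(struct port_model *p)
{
	p->supported = MODE_100_FULL | MODE_1000_HALF | MODE_1000_FULL;
	p->advertising = p->supported;
}

/* Removal helper as currently coded: advertising is reset to supported. */
void model_remove_link_mode(struct port_model *p, uint32_t mode)
{
	p->supported &= ~mode;
	p->advertising = p->supported;
}

/* Userspace (e.g. via ethtool) narrows the advertised modes. */
void model_user_set_advertising(struct port_model *p, uint32_t modes)
{
	p->advertising = modes & p->supported;
}

/* Mixed a)/b): PHY connected at probe, mode removed only at open. */
uint32_t removal_at_open(void)
{
	struct port_model p;

	model_phy_probe(&p);
	model_user_set_advertising(&p, MODE_100_FULL);
	model_remove_link_mode(&p, MODE_1000_HALF);	/* during open */
	return p.advertising;
}

/* Clean b): mode removed at probe, before userspace can change anything. */
uint32_t removal_at_probe(void)
{
	struct port_model p;

	model_phy_probe(&p);
	model_remove_link_mode(&p, MODE_1000_HALF);	/* during MAC probe */
	model_user_set_advertising(&p, MODE_100_FULL);
	/* open no longer touches the link modes */
	return p.advertising;
}

void check_orderings(void)
{
	/* Removal at open discards the user's choice. */
	assert(removal_at_open() == (MODE_100_FULL | MODE_1000_FULL));
	/* Removal at probe leaves the user's choice intact. */
	assert(removal_at_probe() == MODE_100_FULL);
}
```

In this model the helper itself is unchanged in both runs; only the point at which it is called decides whether the user's setting survives.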

       Andrew
Patch

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index b4978c5fb2ca..74d06dc8fddb 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -2509,7 +2509,7 @@  EXPORT_SYMBOL(genphy_loopback);
 void phy_remove_link_mode(struct phy_device *phydev, u32 link_mode)
 {
 	linkmode_clear_bit(link_mode, phydev->supported);
-	phy_advertise_supported(phydev);
+	linkmode_clear_bit(link_mode, phydev->advertising);
 }
 EXPORT_SYMBOL(phy_remove_link_mode);