Message ID | 20200616074955.GA9092@laureti-dev |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Headers | show |
Series | net: macb: reject unsupported rgmii delays | expand |
On 16/06/2020 at 09:49, Helmut Grohne wrote: > The macb driver does not support configuring rgmii delays. At least for > the Zynq GEM, delays are not supported by the hardware at all. However, > the driver happily accepts and ignores any such delays. > > When operating in a mac to phy connection, the delay setting applies to > the phy. Since the MAC does not support delays, the phy must provide > them and the only supported mode is rgmii-id. However, in a fixed mac > to mac connection, the delay applies to the mac itself. Therefore the > only supported rgmii mode is rgmii. > > Link: https://lore.kernel.org/netdev/20200610081236.GA31659@laureti-dev/ > Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de> > --- > drivers/net/ethernet/cadence/macb_main.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c > index 5b9d7c60eebc..bee5bf65e8b3 100644 > --- a/drivers/net/ethernet/cadence/macb_main.c > +++ b/drivers/net/ethernet/cadence/macb_main.c > @@ -514,7 +514,7 @@ static void macb_validate(struct phylink_config *config, > state->interface != PHY_INTERFACE_MODE_RMII && > state->interface != PHY_INTERFACE_MODE_GMII && > state->interface != PHY_INTERFACE_MODE_SGMII && > - !phy_interface_mode_is_rgmii(state->interface)) { > + state->interface != PHY_INTERFACE_MODE_RGMII_ID) { Nitpicking: there's a comment just above, might be interesting to make it more precisely matching this change. It mustn't delay addition though. > bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS); > return; > } > @@ -694,6 +694,13 @@ static int macb_phylink_connect(struct macb *bp) > struct phy_device *phydev; > int ret; > > + if (of_phy_is_fixed_link(dn) && > + phy_interface_mode_is_rgmii(bp->phy_interface) && > + bp->phy_interface != PHY_INTERFACE_MODE_RGMII) { > + netdev_err(dev, "RGMII delays are not supported\n"); > + return -EINVAL; > + } > + Otherwise, it looks good to me after reading the associated discussion link in your commit message: thanks for that! Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> > if (dn) > ret = phylink_of_phy_connect(bp->phylink, dn, 0); > > -- > 2.20.1 >
Hi Helmut, On Tue, 16 Jun 2020 at 11:00, Helmut Grohne <helmut.grohne@intenta.de> wrote: > > The macb driver does not support configuring rgmii delays. At least for > the Zynq GEM, delays are not supported by the hardware at all. However, > the driver happily accepts and ignores any such delays. > > When operating in a mac to phy connection, the delay setting applies to > the phy. Since the MAC does not support delays, the phy must provide > them and the only supported mode is rgmii-id. However, in a fixed mac > to mac connection, the delay applies to the mac itself. Therefore the > only supported rgmii mode is rgmii. > > Link: https://lore.kernel.org/netdev/20200610081236.GA31659@laureti-dev/ > Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de> > --- > drivers/net/ethernet/cadence/macb_main.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c > index 5b9d7c60eebc..bee5bf65e8b3 100644 > --- a/drivers/net/ethernet/cadence/macb_main.c > +++ b/drivers/net/ethernet/cadence/macb_main.c > @@ -514,7 +514,7 @@ static void macb_validate(struct phylink_config *config, > state->interface != PHY_INTERFACE_MODE_RMII && > state->interface != PHY_INTERFACE_MODE_GMII && > state->interface != PHY_INTERFACE_MODE_SGMII && > - !phy_interface_mode_is_rgmii(state->interface)) { > + state->interface != PHY_INTERFACE_MODE_RGMII_ID) { I don't think this change is correct though? What if there were PCB traces in place, for whatever reason? Then the driver would need to accept a phy with rgmii-txid, rgmii-rxid or rgmii. > bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS); > return; > } > @@ -694,6 +694,13 @@ static int macb_phylink_connect(struct macb *bp) > struct phy_device *phydev; > int ret; > > + if (of_phy_is_fixed_link(dn) && > + phy_interface_mode_is_rgmii(bp->phy_interface) && > + bp->phy_interface != PHY_INTERFACE_MODE_RGMII) { > + netdev_err(dev, "RGMII delays are not supported\n"); > + return -EINVAL; > + } > + Have you checked that this doesn't break any existing in-tree users? > if (dn) > ret = phylink_of_phy_connect(bp->phylink, dn, 0); > > -- > 2.20.1 > Thanks, -Vladimir
On Tue, Jun 16, 2020 at 09:49:56AM +0200, Helmut Grohne wrote: > The macb driver does not support configuring rgmii delays. At least for > the Zynq GEM, delays are not supported by the hardware at all. However, > the driver happily accepts and ignores any such delays. > > When operating in a mac to phy connection, the delay setting applies to > the phy. Since the MAC does not support delays, the phy must provide > them and the only supported mode is rgmii-id. However, in a fixed mac > to mac connection, the delay applies to the mac itself. Therefore the > only supported rgmii mode is rgmii. This seems incorrect - see the phy documentation in Documentation/networking/phy.rst: * PHY_INTERFACE_MODE_RGMII: the PHY is not responsible for inserting any internal delay by itself, it assumes that either the Ethernet MAC (if capable or the PCB traces) insert the correct 1.5-2ns delay * PHY_INTERFACE_MODE_RGMII_TXID: the PHY should insert an internal delay for the transmit data lines (TXD[3:0]) processed by the PHY device * PHY_INTERFACE_MODE_RGMII_RXID: the PHY should insert an internal delay for the receive data lines (RXD[3:0]) processed by the PHY device * PHY_INTERFACE_MODE_RGMII_ID: the PHY should insert internal delays for both transmit AND receive data lines from/to the PHY device Note that PHY_INTERFACE_MODE_RGMII, the delay can be added by _either_ the MAC or by PCB trace routing. The individual RGMII delay modes are more about what the PHY itself is asked to do with respect to inserting delays, so I don't think your patch makes sense. In any case... > Link: https://lore.kernel.org/netdev/20200610081236.GA31659@laureti-dev/ > Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de> > --- > drivers/net/ethernet/cadence/macb_main.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c > index 5b9d7c60eebc..bee5bf65e8b3 100644 > --- a/drivers/net/ethernet/cadence/macb_main.c > +++ b/drivers/net/ethernet/cadence/macb_main.c > @@ -514,7 +514,7 @@ static void macb_validate(struct phylink_config *config, > state->interface != PHY_INTERFACE_MODE_RMII && > state->interface != PHY_INTERFACE_MODE_GMII && > state->interface != PHY_INTERFACE_MODE_SGMII && > - !phy_interface_mode_is_rgmii(state->interface)) { > + state->interface != PHY_INTERFACE_MODE_RGMII_ID) { Here you reject everything except PHY_INTERFACE_MODE_RGMII_ID. > bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS); > return; > } > @@ -694,6 +694,13 @@ static int macb_phylink_connect(struct macb *bp) > struct phy_device *phydev; > int ret; > > + if (of_phy_is_fixed_link(dn) && > + phy_interface_mode_is_rgmii(bp->phy_interface) && > + bp->phy_interface != PHY_INTERFACE_MODE_RGMII) { but here you reject everything except PHY_INTERFACE_MODE_RGMII. These can't both be right. If you start with PHY_INTERFACE_MODE_RGMII, and have a fixed link, you'll have PHY_INTERFACE_MODE_RGMII passed into the validate function, which will then fail.
Hi Vladimir, On Wed, Jun 17, 2020 at 11:57:23AM +0200, Vladimir Oltean wrote: > On Tue, 16 Jun 2020 at 11:00, Helmut Grohne <helmut.grohne@intenta.de> wrote: > > --- a/drivers/net/ethernet/cadence/macb_main.c > > +++ b/drivers/net/ethernet/cadence/macb_main.c > > @@ -514,7 +514,7 @@ static void macb_validate(struct phylink_config *config, > > state->interface != PHY_INTERFACE_MODE_RMII && > > state->interface != PHY_INTERFACE_MODE_GMII && > > state->interface != PHY_INTERFACE_MODE_SGMII && > > - !phy_interface_mode_is_rgmii(state->interface)) { > > + state->interface != PHY_INTERFACE_MODE_RGMII_ID) { > > I don't think this change is correct though? > What if there were PCB traces in place, for whatever reason? Then the > driver would need to accept a phy with rgmii-txid, rgmii-rxid or > rgmii. I must confess that after rereading the discussion and the other discussions at https://patchwork.ozlabs.org/project/netdev/patch/20190410005700.31582-19-olteanv@gmail.com/ and https://patchwork.ozlabs.org/project/netdev/patch/20190413012822.30931-21-olteanv@gmail.com/, this is less and less clear to me. I think we can agree that the current definition of phy-mode is confusing when it comes to rgmii delays. The documentation doesn't even mention the possibility of using serpentines. Your interpretation seems to be that this value is solely meant for the PHY to operate on and that the MAC should not act upon it (at least the delay part). That interpetation is consistent with previous discussions and mostly consistent with the documentation. The phy-mode property is documented in ethernet-controller.yaml, which suggests that it should apply to the MAC somehow. Following this interpretation, I think it would be good to also document it in ethernet-phy.yaml. Thank you for the review. I agree that the hunk should be dropped. However, in the fixed-link case (below) the interpretation regarding serpentines seems to be well-defined: If serpentines are present, both MACs should specify rgmii. > > bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS); > > return; > > } > > @@ -694,6 +694,13 @@ static int macb_phylink_connect(struct macb *bp) > > struct phy_device *phydev; > > int ret; > > > > + if (of_phy_is_fixed_link(dn) && > > + phy_interface_mode_is_rgmii(bp->phy_interface) && > > + bp->phy_interface != PHY_INTERFACE_MODE_RGMII) { > > + netdev_err(dev, "RGMII delays are not supported\n"); > > + return -EINVAL; > > + } > > + > > Have you checked that this doesn't break any existing in-tree users? It seems like all the in-tree users of this driver that do specify a phy-mode, specify rmii. If possible breakage is to be minimized, I'd be happy with using netdev_warn instead. The major benefit here is getting a clear indication of why things don't work. My emphasis is on getting something to dmesg, not to make things fail. So what should we prefer here? Failure or warning? In the long run, it would be nice to refactor the way to denote delays in a way that is consistently defined for MAC to PHY and MAC to MAC connections while accounting for serpentines. Helmut
On Wed, Jun 17, 2020 at 12:55:18PM +0200, Russell King - ARM Linux admin wrote: > The individual RGMII delay modes are more about what the PHY itself is > asked to do with respect to inserting delays, so I don't think your > patch makes sense. This seems to be the same aspect that Vladimir Oltean remarked. I agree that the relevant hunk should be dropped. > > --- a/drivers/net/ethernet/cadence/macb_main.c > > +++ b/drivers/net/ethernet/cadence/macb_main.c > > @@ -514,7 +514,7 @@ static void macb_validate(struct phylink_config *config, > > state->interface != PHY_INTERFACE_MODE_RMII && > > state->interface != PHY_INTERFACE_MODE_GMII && > > state->interface != PHY_INTERFACE_MODE_SGMII && > > - !phy_interface_mode_is_rgmii(state->interface)) { > > + state->interface != PHY_INTERFACE_MODE_RGMII_ID) { > > Here you reject everything except PHY_INTERFACE_MODE_RGMII_ID. > > > bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS); > > return; > > } > > @@ -694,6 +694,13 @@ static int macb_phylink_connect(struct macb *bp) > > struct phy_device *phydev; > > int ret; > > > > + if (of_phy_is_fixed_link(dn) && > > + phy_interface_mode_is_rgmii(bp->phy_interface) && > > + bp->phy_interface != PHY_INTERFACE_MODE_RGMII) { > > but here you reject everything except PHY_INTERFACE_MODE_RGMII. These > can't both be right. If you start with PHY_INTERFACE_MODE_RGMII, and > have a fixed link, you'll have PHY_INTERFACE_MODE_RGMII passed into > the validate function, which will then fail. For a fixed-link, the validation function is never called. Therefore, it cannot reject PHY_INTERFACE_MODE_RGMII. It works in practice. However, the consensus is to not reject that mode in the validation function. Helmut
On Wed, 17 Jun 2020 at 13:56, Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote: > > On Tue, Jun 16, 2020 at 09:49:56AM +0200, Helmut Grohne wrote: > > The macb driver does not support configuring rgmii delays. At least for > > the Zynq GEM, delays are not supported by the hardware at all. However, > > the driver happily accepts and ignores any such delays. > > > > When operating in a mac to phy connection, the delay setting applies to > > the phy. Since the MAC does not support delays, the phy must provide > > them and the only supported mode is rgmii-id. However, in a fixed mac > > to mac connection, the delay applies to the mac itself. Therefore the > > only supported rgmii mode is rgmii. > > This seems incorrect - see the phy documentation in > Documentation/networking/phy.rst: > > * PHY_INTERFACE_MODE_RGMII: the PHY is not responsible for inserting any > internal delay by itself, it assumes that either the Ethernet MAC (if capable > or the PCB traces) insert the correct 1.5-2ns delay > > * PHY_INTERFACE_MODE_RGMII_TXID: the PHY should insert an internal delay > for the transmit data lines (TXD[3:0]) processed by the PHY device > > * PHY_INTERFACE_MODE_RGMII_RXID: the PHY should insert an internal delay > for the receive data lines (RXD[3:0]) processed by the PHY device > > * PHY_INTERFACE_MODE_RGMII_ID: the PHY should insert internal delays for > both transmit AND receive data lines from/to the PHY device > > Note that PHY_INTERFACE_MODE_RGMII, the delay can be added by _either_ > the MAC or by PCB trace routing. > What does it mean "can" be added? Is it or is it not added? As a MAC driver, what do you do? > The individual RGMII delay modes are more about what the PHY itself is > asked to do with respect to inserting delays, so I don't think your > patch makes sense. > We all read the phy-mode documentation, but we aren't really any smarter. That document completely fails to address the existence of PCB traces. Helmut's link points to some more discussion around this topic. > -- > RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ > FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
On Wed, Jun 17, 2020 at 02:32:09PM +0300, Vladimir Oltean wrote: > On Wed, 17 Jun 2020 at 13:56, Russell King - ARM Linux admin > <linux@armlinux.org.uk> wrote: > > > > On Tue, Jun 16, 2020 at 09:49:56AM +0200, Helmut Grohne wrote: > > > The macb driver does not support configuring rgmii delays. At least for > > > the Zynq GEM, delays are not supported by the hardware at all. However, > > > the driver happily accepts and ignores any such delays. > > > > > > When operating in a mac to phy connection, the delay setting applies to > > > the phy. Since the MAC does not support delays, the phy must provide > > > them and the only supported mode is rgmii-id. However, in a fixed mac > > > to mac connection, the delay applies to the mac itself. Therefore the > > > only supported rgmii mode is rgmii. > > > > This seems incorrect - see the phy documentation in > > Documentation/networking/phy.rst: > > > > * PHY_INTERFACE_MODE_RGMII: the PHY is not responsible for inserting any > > internal delay by itself, it assumes that either the Ethernet MAC (if capable > > or the PCB traces) insert the correct 1.5-2ns delay > > > > * PHY_INTERFACE_MODE_RGMII_TXID: the PHY should insert an internal delay > > for the transmit data lines (TXD[3:0]) processed by the PHY device > > > > * PHY_INTERFACE_MODE_RGMII_RXID: the PHY should insert an internal delay > > for the receive data lines (RXD[3:0]) processed by the PHY device > > > > * PHY_INTERFACE_MODE_RGMII_ID: the PHY should insert internal delays for > > both transmit AND receive data lines from/to the PHY device > > > > Note that PHY_INTERFACE_MODE_RGMII, the delay can be added by _either_ > > the MAC or by PCB trace routing. > > > > What does it mean "can" be added? Is it or is it not added? As a MAC > driver, what do you do? I'm just stating what is documented. > > The individual RGMII delay modes are more about what the PHY itself is > > asked to do with respect to inserting delays, so I don't think your > > patch makes sense. > > > > We all read the phy-mode documentation, but we aren't really any > smarter. That document completely fails to address the existence of > PCB traces. > Helmut's link points to some more discussion around this topic. Why are you so abrasive? Not responding to this until you start behaving more reasonably.
On Wed, 17 Jun 2020 at 14:34, Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote: > > > Why are you so abrasive? > > Not responding to this until you start behaving more reasonably. > > -- > RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ > FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! Sorry. What I meant to say is: the documentation is unclear. It has been interpreted in conflicting ways, where the strict interpretations have proven to have practical limitations, and the practical interpretations have been accused of being incorrect. In my opinion there is no way to fix it without introducing new bindings, which I am not sure is really worth doing. Thanks, -Vladimir
On Wed, Jun 17, 2020 at 01:21:53PM +0200, Helmut Grohne wrote: > On Wed, Jun 17, 2020 at 12:55:18PM +0200, Russell King - ARM Linux admin wrote: > > The individual RGMII delay modes are more about what the PHY itself is > > asked to do with respect to inserting delays, so I don't think your > > patch makes sense. > > This seems to be the same aspect that Vladimir Oltean remarked. I agree > that the relevant hunk should be dropped. > > > > --- a/drivers/net/ethernet/cadence/macb_main.c > > > +++ b/drivers/net/ethernet/cadence/macb_main.c > > > @@ -514,7 +514,7 @@ static void macb_validate(struct phylink_config *config, > > > state->interface != PHY_INTERFACE_MODE_RMII && > > > state->interface != PHY_INTERFACE_MODE_GMII && > > > state->interface != PHY_INTERFACE_MODE_SGMII && > > > - !phy_interface_mode_is_rgmii(state->interface)) { > > > + state->interface != PHY_INTERFACE_MODE_RGMII_ID) { > > > > Here you reject everything except PHY_INTERFACE_MODE_RGMII_ID. > > > > > bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS); > > > return; > > > } > > > @@ -694,6 +694,13 @@ static int macb_phylink_connect(struct macb *bp) > > > struct phy_device *phydev; > > > int ret; > > > > > > + if (of_phy_is_fixed_link(dn) && > > > + phy_interface_mode_is_rgmii(bp->phy_interface) && > > > + bp->phy_interface != PHY_INTERFACE_MODE_RGMII) { > > > > but here you reject everything except PHY_INTERFACE_MODE_RGMII. These > > can't both be right. If you start with PHY_INTERFACE_MODE_RGMII, and > > have a fixed link, you'll have PHY_INTERFACE_MODE_RGMII passed into > > the validate function, which will then fail. > > For a fixed-link, the validation function is never called. Therefore, it > cannot reject PHY_INTERFACE_MODE_RGMII. It works in practice. Hmm, I'm not so sure, but then I don't know exactly what code you're using. Looking at mainline, even for a fixed link, you call phylink_create(). phylink_create() will spot the fixed link, and parse the description, calling the validation function. If that fails, it will generate a warning at that point: "fixed link %s duplex %dMbps not recognised" It doesn't cause an operational failure, but it means that you end up with a zero supported mask, which is likely not expected. This is not an expected situation, so I'll modify your claim to "it works but issues a warning" which still means that it's not correct.
On Wed, Jun 17, 2020 at 01:40:25PM +0200, Russell King - ARM Linux admin wrote: > > For a fixed-link, the validation function is never called. Therefore, it > > cannot reject PHY_INTERFACE_MODE_RGMII. It works in practice. > > Hmm, I'm not so sure, but then I don't know exactly what code you're > using. Looking at mainline, even for a fixed link, you call > phylink_create(). phylink_create() will spot the fixed link, and > parse the description, calling the validation function. If that > fails, it will generate a warning at that point: > > "fixed link %s duplex %dMbps not recognised" > > It doesn't cause an operational failure, but it means that you end up > with a zero supported mask, which is likely not expected. > > This is not an expected situation, so I'll modify your claim to "it > works but issues a warning" which still means that it's not correct. I do see that warning. I agree with your correction of my claim. Thank you for your attention to detail. So we have two good reasons for not rejecting delay configuration in the validation function now. The remaining open question seems to be whether configuring a delay on a MAC to MAC connection should cause a failure or a only warning. Do you have an opinion on that? All in-tree bindings of the driver seem to use rmii when they specify a phy-mode. Helmut
On Wed, Jun 17, 2020 at 02:38:25PM +0300, Vladimir Oltean wrote: > On Wed, 17 Jun 2020 at 14:34, Russell King - ARM Linux admin > <linux@armlinux.org.uk> wrote: > > > > > > > Why are you so abrasive? > > > > Not responding to this until you start behaving more reasonably. > > > > -- > > RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ > > FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! > > Sorry. > What I meant to say is: the documentation is unclear. It has been > interpreted in conflicting ways, where the strict interpretations have > proven to have practical limitations, and the practical > interpretations have been accused of being incorrect. In my opinion > there is no way to fix it without introducing new bindings, which I am > not sure is really worth doing. The documentation was added in 2016, many years after we have had users of this, in an attempt to clear up some of the confusion. It is likely that it had to cater for existing users though - I'm sure if Florian cares, he can comment on that. It would be better if it made a definitive statement about it, but doing so would likely attract pedants to try to fix everything to conform, causing breakage in the process. I've recently had a painful experience of this with the Atheros PHYs, where there are lots of platforms using "rgmii" when they should have been using "rgmii-id". A patch changed this in the Atheros code breaking all these platforms, breakage which persisted over several kernel versions, needing fixes to DT files that then had to be back-ported. That's fine if you know what happened to break it, but if you don't, and you don't know what the fix is, you're mostly stuffed and stuck with non- working ethernet. That really was not nice. This is one of the reasons why I press for any new PHY interface mode to be documented in the phylib documentation right from the start, so that hopefully we can avoid this kind of thing in the future.
On Wed, Jun 17, 2020 at 01:52:01PM +0200, Helmut Grohne wrote: > On Wed, Jun 17, 2020 at 01:40:25PM +0200, Russell King - ARM Linux admin wrote: > > > For a fixed-link, the validation function is never called. Therefore, it > > > cannot reject PHY_INTERFACE_MODE_RGMII. It works in practice. > > > > Hmm, I'm not so sure, but then I don't know exactly what code you're > > using. Looking at mainline, even for a fixed link, you call > > phylink_create(). phylink_create() will spot the fixed link, and > > parse the description, calling the validation function. If that > > fails, it will generate a warning at that point: > > > > "fixed link %s duplex %dMbps not recognised" > > > > It doesn't cause an operational failure, but it means that you end up > > with a zero supported mask, which is likely not expected. > > > > This is not an expected situation, so I'll modify your claim to "it > > works but issues a warning" which still means that it's not correct. > > I do see that warning. I agree with your correction of my claim. Thank > you for your attention to detail. > > So we have two good reasons for not rejecting delay configuration in the > validation function now. > > The remaining open question seems to be whether configuring a delay on a > MAC to MAC connection should cause a failure or a only warning. Do you > have an opinion on that? > > All in-tree bindings of the driver seem to use rmii when they specify a > phy-mode. This brings up a problem in itself - the phy interface mode is currently defined in terms of a MAC-to-PHY setup, not a MAC-to-MAC setup. With a fixed link, we could be in either a MAC-to-PHY or MAC-to-MAC setup; we just don't know. However, we don't have is access to the PHY (if it exists) in the fixed link case to configure it for the delay. In the MAC-to-MAC RGMII setup, where neither MAC can insert the necessary delay, the only way to have a RGMII conformant link is to have the PCB traces induce the necessary delay. That errs towards PHY_INTERFACE_MODE_RGMII for this case. However, considering the MAC-to-PHY RGMII fixed link case, where the PHY may not be accessible, and may be configured with the necessary delay, should that case also use PHY_INTERFACE_MODE_RGMII - clearly that would be as wrong as using PHY_INTERFACE_MODE_RGMII_ID would be for the MAC-to-MAC RGMII with PCB-delays case. So, I think a MAC driver should not care about the specific RGMII mode being asked for in any case, and just accept them all. I also think that some of this ought to be put in the documentation as guidance for new implementations.
On Wed, Jun 17, 2020 at 02:08:09PM +0200, Russell King - ARM Linux admin wrote: > With a fixed link, we could be in either a MAC-to-PHY or MAC-to-MAC > setup; we just don't know. However, we don't have is access to the > PHY (if it exists) in the fixed link case to configure it for the > delay. Let me twist that a little: We may have access to the PHY, but we don't always have access. When we do have access, we have a separate device tree node with another fixed-link and another phy-mode. For fixed-links, we specify the phy-mode for each end. > In the MAC-to-MAC RGMII setup, where neither MAC can insert the > necessary delay, the only way to have a RGMII conformant link is to > have the PCB traces induce the necessary delay. That errs towards > PHY_INTERFACE_MODE_RGMII for this case. Yes. > However, considering the MAC-to-PHY RGMII fixed link case, where the > PHY may not be accessible, and may be configured with the necessary > delay, should that case also use PHY_INTERFACE_MODE_RGMII - clearly > that would be as wrong as using PHY_INTERFACE_MODE_RGMII_ID would > be for the MAC-to-MAC RGMII with PCB-delays case. If you take into account that the PHY has a separate node with phy-mode being rgmii-id, it makes a lot more sense to use rgmii for the phy-mode of the MAC. So I don't think it is that clear that doing so is wrong. In an earlier discussion Florian Fainelli said: https://patchwork.ozlabs.org/project/netdev/patch/20190410005700.31582-19-olteanv@gmail.com/#2149183 | fixed-link really should denote a MAC to MAC connection so if you have | "rgmii-id" on one side, you would expect "rgmii" on the other side | (unless PCB traces account for delays, grrr). For these reasons, I still think that rgmii would be a useful description for the fixed-link to PHY connection where the PHY adds the delay. > So, I think a MAC driver should not care about the specific RGMII > mode being asked for in any case, and just accept them all. I would like to agree to this. However, the implication is that when you get your delays wrong, your driver silently ignores you and you never notice your mistake until you see no traffic passing and wonder why. In this case, I was faced with a PHY that would do rgmii-txid and I configured that on the MAC. Unfortunately, macb_main.c didn't tell me that it did rgmii-id instead. > I also think that some of this ought to be put in the documentation > as guidance for new implementations. That seems to be the part where everyone agrees. Given the state of the discussion, I'm wondering whether this could be fixed at a more fundamental level in the device tree bindings. A number of people (including you) pointed out that retroactively fixing the meaning of phy modes does not work and causes pain instead. That hints that the only way to fix this is adding new properties. How about this? rgmii-delay-type: description: Responsibility for adding the rgmii delay enum: # The remote PHY or MAC to this MAC is responsible for adding the # delay. - remote # The delay is added by neither MAC nor MAC, but using PCB traces # instead. - pcb # The MAC must add the delay. - local rgmii-rx-delay: # Responsibility for RX delay. Delay specification in the phy-mode is # ignored when this is present. $ref: "#/properties/rgmii-delay-type" rgmii-tx-delay: # Responsibility for TX delay. Delay specification in the phy-mode is # ignored when this is present. $ref: "#/properties/rgmii-delay-type" The naming is up to discussion, but I think you get the idea. The core properties of this proposal are: * It does not break existing device trees. * It completely resolves the present ambiguity. The major downside is that you never know whether your driver supports such delay specifications already. Helmut
On Thu, Jun 18, 2020 at 10:14:33AM +0200, Helmut Grohne wrote: > On Wed, Jun 17, 2020 at 02:08:09PM +0200, Russell King - ARM Linux admin wrote: > > With a fixed link, we could be in either a MAC-to-PHY or MAC-to-MAC > > setup; we just don't know. However, we don't have is access to the > > PHY (if it exists) in the fixed link case to configure it for the > > delay. > > Let me twist that a little: We may have access to the PHY, but we don't > always have access. When we do have access, we have a separate device > tree node with another fixed-link and another phy-mode. For fixed-links, > we specify the phy-mode for each end. If you have access to the PHY, you shouldn't be using fixed-link. In any case, there is no support for a fixed-link specification at the PHY end in the kernel. The PHY binding doc does not allow for this either. > > In the MAC-to-MAC RGMII setup, where neither MAC can insert the > > necessary delay, the only way to have a RGMII conformant link is to > > have the PCB traces induce the necessary delay. That errs towards > > PHY_INTERFACE_MODE_RGMII for this case. > > Yes. > > > However, considering the MAC-to-PHY RGMII fixed link case, where the > > PHY may not be accessible, and may be configured with the necessary > > delay, should that case also use PHY_INTERFACE_MODE_RGMII - clearly > > that would be as wrong as using PHY_INTERFACE_MODE_RGMII_ID would > > be for the MAC-to-MAC RGMII with PCB-delays case. > > If you take into account that the PHY has a separate node with phy-mode > being rgmii-id, it makes a lot more sense to use rgmii for the phy-mode > of the MAC. So I don't think it is that clear that doing so is wrong. The PHY binding document does not allow for this, neither does the kernel. > In an earlier discussion Florian Fainelli said: > https://patchwork.ozlabs.org/project/netdev/patch/20190410005700.31582-19-olteanv@gmail.com/#2149183 > | fixed-link really should denote a MAC to MAC connection so if you have > | "rgmii-id" on one side, you would expect "rgmii" on the other side > | (unless PCB traces account for delays, grrr). > > For these reasons, I still think that rgmii would be a useful > description for the fixed-link to PHY connection where the PHY adds the > delay. I think Florian is wrong; consider what it means when you have a fixed-link connection between a MAC and an inaccessible PHY and the link as a whole is operating in what we would call "rgmii-id" mode if the PHY were accessible. Taking Florian's stance, it basically means that DT no longer describes the hardware, but how we have chosen to interpret the properties in _one_ specific case in a completely different way to all the other cases. > > So, I think a MAC driver should not care about the specific RGMII > > mode being asked for in any case, and just accept them all. > > I would like to agree to this. However, the implication is that when you > get your delays wrong, your driver silently ignores you and you never > notice your mistake until you see no traffic passing and wonder why. So explain to me this aspect of your reasoning: - If the link is operating in non-fixed-link mode, the rgmii* PHY modes describe the delay to be applied at the PHY end. - If the link is operating in fixed-link mode, the rgmii* PHY modes describe the delay to be applied at the MAC end. That doesn't make sense, and excludes properly describing a MAC-to- inaccessible-PHY setup. It also means that we're having to conditionalise how we deal with this PHY mode in every single driver, which means that every single driver is going to end up interpreting it differently, and it's going to become a buggy mess. > In this case, I was faced with a PHY that would do rgmii-txid and I > configured that on the MAC. Unfortunately, macb_main.c didn't tell me > that it did rgmii-id instead. The documentation makes it clear that "rgmii-*" (note the hyphen) are to be applied by the PHY *only*, and not by the MAC. > > I also think that some of this ought to be put in the documentation > > as guidance for new implementations. > > That seems to be the part where everyone agrees. > > Given the state of the discussion, I'm wondering whether this could be > fixed at a more fundamental level in the device tree bindings. > > A number of people (including you) pointed out that retroactively fixing > the meaning of phy modes does not work and causes pain instead. That > hints that the only way to fix this is adding new properties. How about > this? > > rgmii-delay-type: > description: > Responsibility for adding the rgmii delay > enum: > # The remote PHY or MAC to this MAC is responsible for adding the > # delay. > - remote > # The delay is added by neither MAC nor MAC, but using PCB traces > # instead. > - pcb > # The MAC must add the delay. > - local Why do we need that complexity? If we decide that we can allow phy-mode = "rgmii" and introduce new properties to control the delay, then we just need: rgmii-tx-delay-ps = <nnn>; rgmii-rx-delay-ps = <nnn>; specified in the MAC node (to be applied only at the MAC end) or specified in the PHY node (to be applied only at the PHY end.) In the normal case, this would be the standard delay value, but in exceptional cases where supported, the delays can be arbitary. We know there are PHYs out there which allow other delays. This means each end is responsible for parsing these properties in its own node and applying them - or raising an error if they can't be supported. With your "rgmii-delay-type" idea, if this is only specified at the MAC end, then the PHY code somehow needs to know what that setting is, which adds a lot of complexity - the PHY code has to go digging for the MAC node, we may even have to introduce a back-reference from the PHY node to the MAC node so the PHY can find it. There are MAC drivers out there where there is one struct device, but multiple net devices, so digging inside struct net_device to get at the parent struct device, and then trying to parse "rgmii-delay-type" from the of-node won't work.
On Thu, Jun 18, 2020 at 10:45:54AM +0200, Russell King - ARM Linux admin wrote: > Why do we need that complexity? If we decide that we can allow > phy-mode = "rgmii" and introduce new properties to control the > delay, then we just need: > > rgmii-tx-delay-ps = <nnn>; > rgmii-rx-delay-ps = <nnn>; > > specified in the MAC node (to be applied only at the MAC end) or > specified in the PHY node (to be applied only at the PHY end.) > In the normal case, this would be the standard delay value, but > in exceptional cases where supported, the delays can be arbitary. > We know there are PHYs out there which allow other delays. > > This means each end is responsible for parsing these properties in > its own node and applying them - or raising an error if they can't > be supported. Thank you. That makes a lot more sense while keeping the (imo) important properties of my proposal: * It is backwards compatible. These properties override delays specified inside phy-mode. Otherwise the vague phy-mode meaning is retained. * The ambiguity is resolved. It is always clear where delays should be configure and the properties properly account for possible PCB traces. It also resolves my original problem. If support for these properties is added to macb_main.c, it would simply check that both delays are 0 as internal delays are not supported by the hardware. When I would have attempted to configure an rx delay, it would have nicely errored out. How can we achieve wider consensus on this and put it into the dt specification? Should there be drivers supporting these first? Helmut
On Thu, Jun 18, 2020 at 11:05:26AM +0200, Helmut Grohne wrote: > On Thu, Jun 18, 2020 at 10:45:54AM +0200, Russell King - ARM Linux admin wrote: > > Why do we need that complexity? If we decide that we can allow > > phy-mode = "rgmii" and introduce new properties to control the > > delay, then we just need: > > > > rgmii-tx-delay-ps = <nnn>; > > rgmii-rx-delay-ps = <nnn>; > > > > specified in the MAC node (to be applied only at the MAC end) or > > specified in the PHY node (to be applied only at the PHY end.) > > In the normal case, this would be the standard delay value, but > > in exceptional cases where supported, the delays can be arbitary. > > We know there are PHYs out there which allow other delays. > > > > This means each end is responsible for parsing these properties in > > its own node and applying them - or raising an error if they can't > > be supported. > > Thank you. That makes a lot more sense while keeping the (imo) important > properties of my proposal: > * It is backwards compatible. These properties override delays > specified inside phy-mode. Otherwise the vague phy-mode meaning is > retained. > * The ambiguity is resolved. It is always clear where delays should be > configure and the properties properly account for possible PCB > traces. > > It also resolves my original problem. If support for these properties is > added to macb_main.c, it would simply check that both delays are 0 as > internal delays are not supported by the hardware. When I would have > attempted to configure an rx delay, it would have nicely errored out. I think we'd want a helper or two to do the parsing and return the delays, something like: #define PHY_RGMII_DELAY_PS_NONE 0 #define PHY_RGMII_DELAY_PS_STD 1500 /* @np here should be the MAC node */ int of_mac_get_delays(struct device_node *np, phy_interface_mode interface, u32 *tx_delay_ps, u32 *rx_delay_ps) { *tx_delay_ps = PHY_RGMII_DELAY_PS_NONE; *rx_delay_ps = PHY_RGMII_DELAY_PS_NONE; if (!np) return 0; if (interface == PHY_INTERFACE_MODE_RGMII) { of_property_read_u32(np, "rgmii-tx-delay-ps", tx_delay_ps); of_property_read_u32(np, "rgmii-rx-delay-ps", rx_delay_ps); } return 0; } /* @np here should be the PHY node */ int of_phy_get_delays(struct device_node *np, phy_interface_mode interface, u32 *tx_delay_ps, u32 *rx_delay_ps) { switch (interface) { case PHY_INTERFACE_MODE_RGMII_ID: *tx_delay_ps = PHY_RGMII_DELAY_PS_STD; *rx_delay_ps = PHY_RGMII_DELAY_PS_STD; return 0; case PHY_INTERFACE_MODE_RGMII_RXID: *tx_delay_ps = PHY_RGMII_DELAY_PS_NONE; *rx_delay_ps = PHY_RGMII_DELAY_PS_STD; return 0; case PHY_INTERFACE_MODE_RGMII_TXID: *tx_delay_ps = PHY_RGMII_DELAY_PS_STD; *rx_delay_ps = PHY_RGMII_DELAY_PS_NONE; return 0; default: return of_mac_get_delays(np, interface, tx_delay_ps, rx_delay_ps); } } as a first cut - validation left up to the user of these. At least we then have an easy interface for PHY drivers to use, for example: static int m88e1121_config_aneg_rgmii_delays(struct phy_device *phydev) { u32 tx_delay_ps, rx_delay_ps; int err; err = of_phy_get_delays(phydev->mdio.dev.of_node, phydev->interface, &tx_delay_ps, &rx_delay_ps); if (err) return err; mscr = 0; if (tx_delay_ps == PHY_RGMII_DELAY_PS_STD) mscr |= MII_88E1121_PHY_MSCR_TX_DELAY; else if (tx_delay_ps != PHY_RGMII_DELAY_PS_NONE) /* ... log an error to kernel log */ return -EINVAL; if (rx_delay_ps == PHY_RGMII_DELAY_PS_STD) mscr |= MII_88E1121_PHY_MSCR_RX_DELAY; else if (rx_delay_ps != PHY_RGMII_DELAY_PS_NONE) /* ... log an error to kernel log */ return -EINVAL; return phy_modify_paged(phydev, MII_MARVELL_MSCR_PAGE, MII_88E1121_PHY_MSCR_REG, MII_88E1121_PHY_MSCR_DELAY_MASK, mscr); } > How can we achieve wider consensus on this and put it into the dt > specification? Should there be drivers supporting these first? Provide an illustration of the idea in code form for consideration? ;)
On 6/18/2020 1:45 AM, Russell King - ARM Linux admin wrote: > On Thu, Jun 18, 2020 at 10:14:33AM +0200, Helmut Grohne wrote: >> On Wed, Jun 17, 2020 at 02:08:09PM +0200, Russell King - ARM Linux admin wrote: >>> With a fixed link, we could be in either a MAC-to-PHY or MAC-to-MAC >>> setup; we just don't know. However, we don't have is access to the >>> PHY (if it exists) in the fixed link case to configure it for the >>> delay. >> >> Let me twist that a little: We may have access to the PHY, but we don't >> always have access. When we do have access, we have a separate device >> tree node with another fixed-link and another phy-mode. For fixed-links, >> we specify the phy-mode for each end. > > If you have access to the PHY, you shouldn't be using fixed-link. In > any case, there is no support for a fixed-link specification at the > PHY end in the kernel. The PHY binding doc does not allow for this > either. > >>> In the MAC-to-MAC RGMII setup, where neither MAC can insert the >>> necessary delay, the only way to have a RGMII conformant link is to >>> have the PCB traces induce the necessary delay. That errs towards >>> PHY_INTERFACE_MODE_RGMII for this case. >> >> Yes. >> >>> However, considering the MAC-to-PHY RGMII fixed link case, where the >>> PHY may not be accessible, and may be configured with the necessary >>> delay, should that case also use PHY_INTERFACE_MODE_RGMII - clearly >>> that would be as wrong as using PHY_INTERFACE_MODE_RGMII_ID would >>> be for the MAC-to-MAC RGMII with PCB-delays case. >> >> If you take into account that the PHY has a separate node with phy-mode >> being rgmii-id, it makes a lot more sense to use rgmii for the phy-mode >> of the MAC. So I don't think it is that clear that doing so is wrong. > > The PHY binding document does not allow for this, neither does the > kernel. > >> In an earlier discussion Florian Fainelli said: >> https://patchwork.ozlabs.org/project/netdev/patch/20190410005700.31582-19-olteanv@gmail.com/#2149183 >> | fixed-link really should denote a MAC to MAC connection so if you have >> | "rgmii-id" on one side, you would expect "rgmii" on the other side >> | (unless PCB traces account for delays, grrr). >> >> For these reasons, I still think that rgmii would be a useful >> description for the fixed-link to PHY connection where the PHY adds the >> delay. > > I think Florian is wrong; consider what it means when you have a > fixed-link connection between a MAC and an inaccessible PHY and > the link as a whole is operating in what we would call "rgmii-id" > mode if the PHY were accessible. > > Taking Florian's stance, it basically means that DT no longer > describes the hardware, but how we have chosen to interpret the > properties in _one_ specific case in a completely different way > to all the other cases. How do you ensure that a MAC to MAC connection using RGMII actually works if you do not provide a 'phy-mode' for both sides right now? Yes this is more of a DT now describes a configuration choice rather than the hardware but given that Ethernet MACs are usually capable of supporting all 4 possible RGMII modes, how do you propose to solve the negotiation of the appropriate RGMII mode here? > >>> So, I think a MAC driver should not care about the specific RGMII >>> mode being asked for in any case, and just accept them all. >> >> I would like to agree to this. However, the implication is that when you >> get your delays wrong, your driver silently ignores you and you never >> notice your mistake until you see no traffic passing and wonder why. > > So explain to me this aspect of your reasoning: > > - If the link is operating in non-fixed-link mode, the rgmii* PHY modes > describe the delay to be applied at the PHY end. > - If the link is operating in fixed-link mode, the rgmii* PHY modes > describe the delay to be applied at the MAC end. > > That doesn't make sense, and excludes properly describing a MAC-to- > inaccessible-PHY setup. The point with a fixed link is that it does not matter what is the other end, it could be a MAC, it could be a PHY, it could go nowhere, it just does not matter, the only thing that you can know is you configure your side of the fixed link. If you have visibility into the other side as well, you can go one step further an put something in the other fixed-link node that makes sense and will result in a working RGMII link. > > It also means that we're having to conditionalise how we deal with > this PHY mode in every single driver, which means that every single > driver is going to end up interpreting it differently, and it's going > to become a buggy mess. Newsflash: it is already a buggy mess. > >> In this case, I was faced with a PHY that would do rgmii-txid and I >> configured that on the MAC. Unfortunately, macb_main.c didn't tell me >> that it did rgmii-id instead. > > The documentation makes it clear that "rgmii-*" (note the hyphen) are > to be applied by the PHY *only*, and not by the MAC. > >>> I also think that some of this ought to be put in the documentation >>> as guidance for new implementations. >> >> That seems to be the part where everyone agrees. >> >> Given the state of the discussion, I'm wondering whether this could be >> fixed at a more fundamental level in the device tree bindings. >> >> A number of people (including you) pointed out that retroactively fixing >> the meaning of phy modes does not work and causes pain instead. That >> hints that the only way to fix this is adding new properties. How about >> this? >> >> rgmii-delay-type: >> description: >> Responsibility for adding the rgmii delay >> enum: >> # The remote PHY or MAC to this MAC is responsible for adding the >> # delay. >> - remote >> # The delay is added by neither MAC nor MAC, but using PCB traces >> # instead. >> - pcb >> # The MAC must add the delay. >> - local > > Why do we need that complexity? If we decide that we can allow > phy-mode = "rgmii" and introduce new properties to control the > delay, then we just need: > > rgmii-tx-delay-ps = <nnn>; > rgmii-rx-delay-ps = <nnn>; > > specified in the MAC node (to be applied only at the MAC end) or > specified in the PHY node (to be applied only at the PHY end.) > In the normal case, this would be the standard delay value, but > in exceptional cases where supported, the delays can be arbitary. > We know there are PHYs out there which allow other delays. > > This means each end is responsible for parsing these properties in > its own node and applying them - or raising an error if they can't > be supported. And as we all know, RGMII is the most plug and play standard out there so this is "just going to work". > > With your "rgmii-delay-type" idea, if this is only specified at the > MAC end, then the PHY code somehow needs to know what that setting is, > which adds a lot of complexity - the PHY code has to go digging for > the MAC node, we may even have to introduce a back-reference from the > PHY node to the MAC node so the PHY can find it. There are MAC drivers > out there where there is one struct device, but multiple net devices, > so digging inside struct net_device to get at the parent struct device, > and then trying to parse "rgmii-delay-type" from the of-node won't work. >-- Florian
On 6/17/2020 4:57 AM, Russell King - ARM Linux admin wrote: > On Wed, Jun 17, 2020 at 02:38:25PM +0300, Vladimir Oltean wrote: >> On Wed, 17 Jun 2020 at 14:34, Russell King - ARM Linux admin >> <linux@armlinux.org.uk> wrote: >>> >> >>> >>> Why are you so abrasive? >>> >>> Not responding to this until you start behaving more reasonably. >>> >>> -- >>> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ >>> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! >> >> Sorry. >> What I meant to say is: the documentation is unclear. It has been >> interpreted in conflicting ways, where the strict interpretations have >> proven to have practical limitations, and the practical >> interpretations have been accused of being incorrect. In my opinion >> there is no way to fix it without introducing new bindings, which I am >> not sure is really worth doing. > > The documentation was added in 2016, many years after we have had users > of this, in an attempt to clear up some of the confusion. It is likely > that it had to cater for existing users though - I'm sure if Florian > cares, he can comment on that. The need to clarify the 'phy-mode' property came in while reviewing the aurora network driver and trying to make sense of what should be done for a new driver in a way that would result in hopefully less confusion moving forward. > > It would be better if it made a definitive statement about it, but doing > so would likely attract pedants to try to fix everything to conform, > causing breakage in the process. Exactly. > > I've recently had a painful experience of this with the Atheros PHYs, > where there are lots of platforms using "rgmii" when they should have > been using "rgmii-id". A patch changed this in the Atheros code breaking > all these platforms, breakage which persisted over several kernel > versions, needing fixes to DT files that then had to be back-ported. > That's fine if you know what happened to break it, but if you don't, and > you don't know what the fix is, you're mostly stuffed and stuck with non- > working ethernet. That really was not nice. > > This is one of the reasons why I press for any new PHY interface mode > to be documented in the phylib documentation right from the start, so > that hopefully we can avoid this kind of thing in the future. > It seems to be that this is a very RGMII specific problem and no other electrical interface appears to have that problem, or it solves it differently without requiring massive baby sitting from software (or even more, just not for that particular problem maybe).
On 6/18/2020 3:01 AM, Russell King - ARM Linux admin wrote: > On Thu, Jun 18, 2020 at 11:05:26AM +0200, Helmut Grohne wrote: >> On Thu, Jun 18, 2020 at 10:45:54AM +0200, Russell King - ARM Linux admin wrote: >>> Why do we need that complexity? If we decide that we can allow >>> phy-mode = "rgmii" and introduce new properties to control the >>> delay, then we just need: >>> >>> rgmii-tx-delay-ps = <nnn>; >>> rgmii-rx-delay-ps = <nnn>; >>> >>> specified in the MAC node (to be applied only at the MAC end) or >>> specified in the PHY node (to be applied only at the PHY end.) >>> In the normal case, this would be the standard delay value, but >>> in exceptional cases where supported, the delays can be arbitary. >>> We know there are PHYs out there which allow other delays. >>> >>> This means each end is responsible for parsing these properties in >>> its own node and applying them - or raising an error if they can't >>> be supported. >> >> Thank you. That makes a lot more sense while keeping the (imo) important >> properties of my proposal: >> * It is backwards compatible. These properties override delays >> specified inside phy-mode. Otherwise the vague phy-mode meaning is >> retained. >> * The ambiguity is resolved. It is always clear where delays should be >> configure and the properties properly account for possible PCB >> traces. >> >> It also resolves my original problem. If support for these properties is >> added to macb_main.c, it would simply check that both delays are 0 as >> internal delays are not supported by the hardware. When I would have >> attempted to configure an rx delay, it would have nicely errored out. > > I think we'd want a helper or two to do the parsing and return the > delays, something like: Can you review Dan's patch series since he is attempting something that may not end up solving Helmut's problem completely: http://patchwork.ozlabs.org/project/netdev/list/?series=184052
On Thu, Jun 18, 2020 at 11:06:43AM -0700, Florian Fainelli wrote: > > > On 6/18/2020 1:45 AM, Russell King - ARM Linux admin wrote: > > On Thu, Jun 18, 2020 at 10:14:33AM +0200, Helmut Grohne wrote: > >> On Wed, Jun 17, 2020 at 02:08:09PM +0200, Russell King - ARM Linux admin wrote: > >>> With a fixed link, we could be in either a MAC-to-PHY or MAC-to-MAC > >>> setup; we just don't know. However, we don't have is access to the > >>> PHY (if it exists) in the fixed link case to configure it for the > >>> delay. > >> > >> Let me twist that a little: We may have access to the PHY, but we don't > >> always have access. When we do have access, we have a separate device > >> tree node with another fixed-link and another phy-mode. For fixed-links, > >> we specify the phy-mode for each end. > > > > If you have access to the PHY, you shouldn't be using fixed-link. In > > any case, there is no support for a fixed-link specification at the > > PHY end in the kernel. The PHY binding doc does not allow for this > > either. > > > >>> In the MAC-to-MAC RGMII setup, where neither MAC can insert the > >>> necessary delay, the only way to have a RGMII conformant link is to > >>> have the PCB traces induce the necessary delay. That errs towards > >>> PHY_INTERFACE_MODE_RGMII for this case. > >> > >> Yes. > >> > >>> However, considering the MAC-to-PHY RGMII fixed link case, where the > >>> PHY may not be accessible, and may be configured with the necessary > >>> delay, should that case also use PHY_INTERFACE_MODE_RGMII - clearly > >>> that would be as wrong as using PHY_INTERFACE_MODE_RGMII_ID would > >>> be for the MAC-to-MAC RGMII with PCB-delays case. > >> > >> If you take into account that the PHY has a separate node with phy-mode > >> being rgmii-id, it makes a lot more sense to use rgmii for the phy-mode > >> of the MAC. So I don't think it is that clear that doing so is wrong. > > > > The PHY binding document does not allow for this, neither does the > > kernel. > > > >> In an earlier discussion Florian Fainelli said: > >> https://patchwork.ozlabs.org/project/netdev/patch/20190410005700.31582-19-olteanv@gmail.com/#2149183 > >> | fixed-link really should denote a MAC to MAC connection so if you have > >> | "rgmii-id" on one side, you would expect "rgmii" on the other side > >> | (unless PCB traces account for delays, grrr). > >> > >> For these reasons, I still think that rgmii would be a useful > >> description for the fixed-link to PHY connection where the PHY adds the > >> delay. > > > > I think Florian is wrong; consider what it means when you have a > > fixed-link connection between a MAC and an inaccessible PHY and > > the link as a whole is operating in what we would call "rgmii-id" > > mode if the PHY were accessible. > > > > Taking Florian's stance, it basically means that DT no longer > > describes the hardware, but how we have chosen to interpret the > > properties in _one_ specific case in a completely different way > > to all the other cases. > > How do you ensure that a MAC to MAC connection using RGMII actually > works if you do not provide a 'phy-mode' for both sides right now? > > Yes this is more of a DT now describes a configuration choice rather > than the hardware but given that Ethernet MACs are usually capable of > supporting all 4 possible RGMII modes, how do you propose to solve the > negotiation of the appropriate RGMII mode here? This is actually answered by yourself below. > >>> So, I think a MAC driver should not care about the specific RGMII > >>> mode being asked for in any case, and just accept them all. > >> > >> I would like to agree to this. However, the implication is that when you > >> get your delays wrong, your driver silently ignores you and you never > >> notice your mistake until you see no traffic passing and wonder why. > > > > So explain to me this aspect of your reasoning: > > > > - If the link is operating in non-fixed-link mode, the rgmii* PHY modes > > describe the delay to be applied at the PHY end. > > - If the link is operating in fixed-link mode, the rgmii* PHY modes > > describe the delay to be applied at the MAC end. > > > > That doesn't make sense, and excludes properly describing a MAC-to- > > inaccessible-PHY setup. > > The point with a fixed link is that it does not matter what is the other > end, it could be a MAC, it could be a PHY, it could go nowhere, it just > does not matter, the only thing that you can know is you configure your > side of the fixed link. That is *exactly* my point. Today, with a PHY on the other end, the four RGMII modes tell you how the PHY is to be configured, not the MAC. The documentation states this. So, why should a fixed-link be any different?
On 6/18/2020 11:49 AM, Russell King - ARM Linux admin wrote: > On Thu, Jun 18, 2020 at 11:06:43AM -0700, Florian Fainelli wrote: >> >> >> On 6/18/2020 1:45 AM, Russell King - ARM Linux admin wrote: >>> On Thu, Jun 18, 2020 at 10:14:33AM +0200, Helmut Grohne wrote: >>>> On Wed, Jun 17, 2020 at 02:08:09PM +0200, Russell King - ARM Linux admin wrote: >>>>> With a fixed link, we could be in either a MAC-to-PHY or MAC-to-MAC >>>>> setup; we just don't know. However, we don't have is access to the >>>>> PHY (if it exists) in the fixed link case to configure it for the >>>>> delay. >>>> >>>> Let me twist that a little: We may have access to the PHY, but we don't >>>> always have access. When we do have access, we have a separate device >>>> tree node with another fixed-link and another phy-mode. For fixed-links, >>>> we specify the phy-mode for each end. >>> >>> If you have access to the PHY, you shouldn't be using fixed-link. In >>> any case, there is no support for a fixed-link specification at the >>> PHY end in the kernel. The PHY binding doc does not allow for this >>> either. >>> >>>>> In the MAC-to-MAC RGMII setup, where neither MAC can insert the >>>>> necessary delay, the only way to have a RGMII conformant link is to >>>>> have the PCB traces induce the necessary delay. That errs towards >>>>> PHY_INTERFACE_MODE_RGMII for this case. >>>> >>>> Yes. >>>> >>>>> However, considering the MAC-to-PHY RGMII fixed link case, where the >>>>> PHY may not be accessible, and may be configured with the necessary >>>>> delay, should that case also use PHY_INTERFACE_MODE_RGMII - clearly >>>>> that would be as wrong as using PHY_INTERFACE_MODE_RGMII_ID would >>>>> be for the MAC-to-MAC RGMII with PCB-delays case. >>>> >>>> If you take into account that the PHY has a separate node with phy-mode >>>> being rgmii-id, it makes a lot more sense to use rgmii for the phy-mode >>>> of the MAC. So I don't think it is that clear that doing so is wrong. >>> >>> The PHY binding document does not allow for this, neither does the >>> kernel. >>> >>>> In an earlier discussion Florian Fainelli said: >>>> https://patchwork.ozlabs.org/project/netdev/patch/20190410005700.31582-19-olteanv@gmail.com/#2149183 >>>> | fixed-link really should denote a MAC to MAC connection so if you have >>>> | "rgmii-id" on one side, you would expect "rgmii" on the other side >>>> | (unless PCB traces account for delays, grrr). >>>> >>>> For these reasons, I still think that rgmii would be a useful >>>> description for the fixed-link to PHY connection where the PHY adds the >>>> delay. >>> >>> I think Florian is wrong; consider what it means when you have a >>> fixed-link connection between a MAC and an inaccessible PHY and >>> the link as a whole is operating in what we would call "rgmii-id" >>> mode if the PHY were accessible. >>> >>> Taking Florian's stance, it basically means that DT no longer >>> describes the hardware, but how we have chosen to interpret the >>> properties in _one_ specific case in a completely different way >>> to all the other cases. >> >> How do you ensure that a MAC to MAC connection using RGMII actually >> works if you do not provide a 'phy-mode' for both sides right now? >> >> Yes this is more of a DT now describes a configuration choice rather >> than the hardware but given that Ethernet MACs are usually capable of >> supporting all 4 possible RGMII modes, how do you propose to solve the >> negotiation of the appropriate RGMII mode here? > > This is actually answered by yourself below. > >>>>> So, I think a MAC driver should not care about the specific RGMII >>>>> mode being asked for in any case, and just accept them all. >>>> >>>> I would like to agree to this. However, the implication is that when you >>>> get your delays wrong, your driver silently ignores you and you never >>>> notice your mistake until you see no traffic passing and wonder why. >>> >>> So explain to me this aspect of your reasoning: >>> >>> - If the link is operating in non-fixed-link mode, the rgmii* PHY modes >>> describe the delay to be applied at the PHY end. >>> - If the link is operating in fixed-link mode, the rgmii* PHY modes >>> describe the delay to be applied at the MAC end. >>> >>> That doesn't make sense, and excludes properly describing a MAC-to- >>> inaccessible-PHY setup. >> >> The point with a fixed link is that it does not matter what is the other >> end, it could be a MAC, it could be a PHY, it could go nowhere, it just >> does not matter, the only thing that you can know is you configure your >> side of the fixed link. > > That is *exactly* my point. > > Today, with a PHY on the other end, the four RGMII modes tell you how > the PHY is to be configured, not the MAC. The documentation states > this. > > So, why should a fixed-link be any different? It should not, but that means that when you describe the fixed-link, the 'phy-mode' within the local fixed-link node is meant to describe how the other side is configured not how you are configured. For some reason I did not consider that to be intuitive when I sent out my response, but I suppose spelling it out would greatly help.
On Thu, Jun 18, 2020 at 12:02:13PM -0700, Florian Fainelli wrote: > It should not, but that means that when you describe the fixed-link, the > 'phy-mode' within the local fixed-link node is meant to describe how the > other side is configured not how you are configured. For some reason I > did not consider that to be intuitive when I sent out my response, but I > suppose spelling it out would greatly help. Right, so in the MAC-to-MAC setup that is being discussed in this thread, the _local_ side MAC where phy-mode is specified should not care which of the RGMII modes is specified, because the delays are a function of "the other side" of the link. So, the proposal that macb should limit itself to only accepting "rgmii" in it's _local_ phy-mode property because the _local_ MAC does not support delays is wrong.
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index 5b9d7c60eebc..bee5bf65e8b3 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -514,7 +514,7 @@ static void macb_validate(struct phylink_config *config, state->interface != PHY_INTERFACE_MODE_RMII && state->interface != PHY_INTERFACE_MODE_GMII && state->interface != PHY_INTERFACE_MODE_SGMII && - !phy_interface_mode_is_rgmii(state->interface)) { + state->interface != PHY_INTERFACE_MODE_RGMII_ID) { bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS); return; } @@ -694,6 +694,13 @@ static int macb_phylink_connect(struct macb *bp) struct phy_device *phydev; int ret; + if (of_phy_is_fixed_link(dn) && + phy_interface_mode_is_rgmii(bp->phy_interface) && + bp->phy_interface != PHY_INTERFACE_MODE_RGMII) { + netdev_err(dev, "RGMII delays are not supported\n"); + return -EINVAL; + } + if (dn) ret = phylink_of_phy_connect(bp->phylink, dn, 0);
The macb driver does not support configuring rgmii delays. At least for the Zynq GEM, delays are not supported by the hardware at all. However, the driver happily accepts and ignores any such delays. When operating in a mac to phy connection, the delay setting applies to the phy. Since the MAC does not support delays, the phy must provide them and the only supported mode is rgmii-id. However, in a fixed mac to mac connection, the delay applies to the mac itself. Therefore the only supported rgmii mode is rgmii. Link: https://lore.kernel.org/netdev/20200610081236.GA31659@laureti-dev/ Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de> --- drivers/net/ethernet/cadence/macb_main.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)