diff mbox series

[net-next,1/4] net: ethernet: validate pause autoneg setting

Message ID 1589243050-18217-2-git-send-email-opendmb@gmail.com
State Changes Requested
Delegated to: David Miller
Headers show
Series Extend phylib implementation of pause support | expand

Commit Message

Doug Berger May 12, 2020, 12:24 a.m. UTC
A comment in uapi/linux/ethtool.h states "Drivers should reject a
non-zero setting of @autoneg when autoneogotiation is disabled (or
not supported) for the link".

That check should be added to phy_validate_pause() to consolidate
the code where possible.

Fixes: 22b7d29926b5 ("net: ethernet: Add helper to determine if pause configuration is supported")
Signed-off-by: Doug Berger <opendmb@gmail.com>
---
 drivers/net/phy/phy_device.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Andrew Lunn May 12, 2020, 12:47 a.m. UTC | #1
On Mon, May 11, 2020 at 05:24:07PM -0700, Doug Berger wrote:
> A comment in uapi/linux/ethtool.h states "Drivers should reject a
> non-zero setting of @autoneg when autoneogotiation is disabled (or
> not supported) for the link".
> 
> That check should be added to phy_validate_pause() to consolidate
> the code where possible.
> 
> Fixes: 22b7d29926b5 ("net: ethernet: Add helper to determine if pause configuration is supported")

Hi Doug

If this is a real fix, please submit this to net, not net-next.

   Andrew
Florian Fainelli May 12, 2020, 3:16 a.m. UTC | #2
On 5/11/2020 5:24 PM, Doug Berger wrote:
> A comment in uapi/linux/ethtool.h states "Drivers should reject a
> non-zero setting of @autoneg when autoneogotiation is disabled (or
> not supported) for the link".
> 
> That check should be added to phy_validate_pause() to consolidate
> the code where possible.
> 
> Fixes: 22b7d29926b5 ("net: ethernet: Add helper to determine if pause configuration is supported")
> Signed-off-by: Doug Berger <opendmb@gmail.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
--
Florian
Doug Berger May 12, 2020, 6:31 p.m. UTC | #3
On 5/11/2020 5:47 PM, Andrew Lunn wrote:
> On Mon, May 11, 2020 at 05:24:07PM -0700, Doug Berger wrote:
>> A comment in uapi/linux/ethtool.h states "Drivers should reject a
>> non-zero setting of @autoneg when autoneogotiation is disabled (or
>> not supported) for the link".
>>
>> That check should be added to phy_validate_pause() to consolidate
>> the code where possible.
>>
>> Fixes: 22b7d29926b5 ("net: ethernet: Add helper to determine if pause configuration is supported")
> 
> Hi Doug
> 
> If this is a real fix, please submit this to net, not net-next.
> 
>    Andrew
> 
This was intended as a fix, but I thought it would be better to keep it
as part of this set for context and since net-next is currently open.

The context is trying to improve the phylib support for offloading
ethtool pause configuration and this is something that could be checked
in a single location rather than by individual drivers.

I included it here to get feedback about its appropriateness as a common
behavior. I should have been more explicit about that.

Personally, I'm actually not that fond of this change since it can
easily be a source of confusion with the ethtool interface because the
link autonegotiation and the pause autonegotiation are controlled by
different commands.

Since the ethtool -A command performs a read/modify/write of pause
parameters, you can get strange results like these:
# ethtool -s eth0 speed 100 duplex full autoneg off
# ethtool -A eth0 tx off
Cannot set device pause parameters: Invalid argument
#
Because, the get read pause autoneg as enabled and only the tx_pause
member of the structure was updated.

The network driver could attempt to change one in response to the other,
but it might be difficult to reach consensus and it adds more
"worthless" code to the network driver in opposition to the intent.

I would like to get more feedback about the approach of the patch set as
a whole before I resubmit, and would be happy to drop this commit at
that time.

But there is still a question of how the comment "Drivers should reject
a non-zero setting of @autoneg when autoneogotiation is disabled (or not
supported) for the link" in ethtool.h should be interpreted.

Should that comment be removed?

Thanks for taking the time to look at this,
    Doug
Andrew Lunn May 12, 2020, 6:37 p.m. UTC | #4
On Tue, May 12, 2020 at 11:31:39AM -0700, Doug Berger wrote:
> On 5/11/2020 5:47 PM, Andrew Lunn wrote:
> > On Mon, May 11, 2020 at 05:24:07PM -0700, Doug Berger wrote:
> >> A comment in uapi/linux/ethtool.h states "Drivers should reject a
> >> non-zero setting of @autoneg when autoneogotiation is disabled (or
> >> not supported) for the link".
> >>
> >> That check should be added to phy_validate_pause() to consolidate
> >> the code where possible.
> >>
> >> Fixes: 22b7d29926b5 ("net: ethernet: Add helper to determine if pause configuration is supported")
> > 
> > Hi Doug
> > 
> > If this is a real fix, please submit this to net, not net-next.
> > 
> >    Andrew
> > 
> This was intended as a fix, but I thought it would be better to keep it
> as part of this set for context and since net-next is currently open.

My real question is, do you think this should be back ported in
stable?  If so, it should be against net. If this is only intended for
new kernels, don't add a Fixes: tag.

> Personally, I'm actually not that fond of this change since it can
> easily be a source of confusion with the ethtool interface because the
> link autonegotiation and the pause autonegotiation are controlled by
> different commands.
> 
> Since the ethtool -A command performs a read/modify/write of pause
> parameters, you can get strange results like these:
> # ethtool -s eth0 speed 100 duplex full autoneg off
> # ethtool -A eth0 tx off
> Cannot set device pause parameters: Invalid argument
> #
> Because, the get read pause autoneg as enabled and only the tx_pause
> member of the structure was updated.

We can at least improve the error message when using netlink
ethtool. Using extack, we can pass back a string, saying why this
configuration is invalid, that link autoneg is off.

	Andrew
Russell King (Oracle) May 12, 2020, 6:55 p.m. UTC | #5
On Tue, May 12, 2020 at 11:31:39AM -0700, Doug Berger wrote:
> This was intended as a fix, but I thought it would be better to keep it
> as part of this set for context and since net-next is currently open.
> 
> The context is trying to improve the phylib support for offloading
> ethtool pause configuration and this is something that could be checked
> in a single location rather than by individual drivers.
> 
> I included it here to get feedback about its appropriateness as a common
> behavior. I should have been more explicit about that.
> 
> Personally, I'm actually not that fond of this change since it can
> easily be a source of confusion with the ethtool interface because the
> link autonegotiation and the pause autonegotiation are controlled by
> different commands.
> 
> Since the ethtool -A command performs a read/modify/write of pause
> parameters, you can get strange results like these:
> # ethtool -s eth0 speed 100 duplex full autoneg off
> # ethtool -A eth0 tx off
> Cannot set device pause parameters: Invalid argument
> #
> Because, the get read pause autoneg as enabled and only the tx_pause
> member of the structure was updated.

This looks like the same argument I've been having with Heiner over
the EEE interface, except there's a difference here.

# ethtool -A eth0 autoneg on
# ethtool -s eth0 autoneg off speed 100 duplex full

After those two commands, what is the state of pause mode?  The answer
is, it's disabled.

# ethtool -A eth0 autoneg off rx on tx on

is perfectly acceptable, as we are forcing pause modes at the local
end of the link.

# ethtool -A eth0 autoneg on

Now, the question is whether that should be allowed or not - but this
is merely restoring the "pause" settings that were in effect prior
to the previous command.  It does not enable pause negotiation,
because autoneg as a whole is disabled, but it _allows_ pause
negotiation to occur when autoneg is enabled at some point in the
future.

Also, allowing "ethtool -A eth0 autoneg on" when "ethtool -s eth0
autoneg off" means you can configure the negotiation parameters
_before_ triggering a negotiation cycle on the link.  In other words,
it would avoid:

# ethtool -s eth0 autoneg on
# # Link renegotiates
# ethtool -A eth0 autoneg on
# # Link renegotiates a second time

and it also means that if stuff has already been scripted to avoid
this, nothing breaks.

If we start rejecting ethtool -A because autoneg is disabled, then
things get difficult to configure - we would need ethtool documentation
to state that autoneg must be enabled before configuration of pause
and EEE can be done.  IMHO, that hurts usability, and adds confusion.
Michal Kubecek May 12, 2020, 7:08 p.m. UTC | #6
On Tue, May 12, 2020 at 11:31:39AM -0700, Doug Berger wrote:
> On 5/11/2020 5:47 PM, Andrew Lunn wrote:
> > On Mon, May 11, 2020 at 05:24:07PM -0700, Doug Berger wrote:
> >> A comment in uapi/linux/ethtool.h states "Drivers should reject a
> >> non-zero setting of @autoneg when autoneogotiation is disabled (or
> >> not supported) for the link".
> >>
> >> That check should be added to phy_validate_pause() to consolidate
> >> the code where possible.
> >>
> >> Fixes: 22b7d29926b5 ("net: ethernet: Add helper to determine if pause configuration is supported")
> > 
> > Hi Doug
> > 
> > If this is a real fix, please submit this to net, not net-next.
> > 
> >    Andrew
> > 
> This was intended as a fix, but I thought it would be better to keep it
> as part of this set for context and since net-next is currently open.
> 
> The context is trying to improve the phylib support for offloading
> ethtool pause configuration and this is something that could be checked
> in a single location rather than by individual drivers.
> 
> I included it here to get feedback about its appropriateness as a common
> behavior. I should have been more explicit about that.
> 
> Personally, I'm actually not that fond of this change since it can
> easily be a source of confusion with the ethtool interface because the
> link autonegotiation and the pause autonegotiation are controlled by
> different commands.
> 
> Since the ethtool -A command performs a read/modify/write of pause
> parameters, you can get strange results like these:
> # ethtool -s eth0 speed 100 duplex full autoneg off
> # ethtool -A eth0 tx off
> Cannot set device pause parameters: Invalid argument
> #
> Because, the get read pause autoneg as enabled and only the tx_pause
> member of the structure was updated.

This would be indeed unfortunate. We could use extack to make the error
message easier to understand but the real problem IMHO is that
ethtool_ops::get_pauseparam() returns value which is rejected by
ethtool_ops::set_pauseparam(). This is something we should avoid.

If we really wanted to reject ethtool_pauseparam::autoneg on when
general autonegotiation is off, we would have to disable pause
autonegotiation whenever general autonegotiation is disabled. I don't
like that idea, however, as that would mean that

  ethtool -s dev autoneg off ...
  ethtool -s dev autoneg on ...

would reset the setting of pause autonegotiation.

Therefore I believe the comment should be rather replaced by a warning
that even if ethtool_pauseparam::autoneg is enabled, pause
autonegotiation is only active if general autonegotiation is also
enabled.

Michal
Doug Berger May 13, 2020, 2:37 a.m. UTC | #7
On 5/12/2020 12:08 PM, Michal Kubecek wrote:
> On Tue, May 12, 2020 at 11:31:39AM -0700, Doug Berger wrote:
>> On 5/11/2020 5:47 PM, Andrew Lunn wrote:
>>> On Mon, May 11, 2020 at 05:24:07PM -0700, Doug Berger wrote:
>>>> A comment in uapi/linux/ethtool.h states "Drivers should reject a
>>>> non-zero setting of @autoneg when autoneogotiation is disabled (or
>>>> not supported) for the link".
>>>>
>>>> That check should be added to phy_validate_pause() to consolidate
>>>> the code where possible.
>>>>
>>>> Fixes: 22b7d29926b5 ("net: ethernet: Add helper to determine if pause configuration is supported")
>>>
>>> Hi Doug
>>>
>>> If this is a real fix, please submit this to net, not net-next.
>>>
>>>    Andrew
>>>
>> This was intended as a fix, but I thought it would be better to keep it
>> as part of this set for context and since net-next is currently open.
>>
>> The context is trying to improve the phylib support for offloading
>> ethtool pause configuration and this is something that could be checked
>> in a single location rather than by individual drivers.
>>
>> I included it here to get feedback about its appropriateness as a common
>> behavior. I should have been more explicit about that.
>>
>> Personally, I'm actually not that fond of this change since it can
>> easily be a source of confusion with the ethtool interface because the
>> link autonegotiation and the pause autonegotiation are controlled by
>> different commands.
>>
>> Since the ethtool -A command performs a read/modify/write of pause
>> parameters, you can get strange results like these:
>> # ethtool -s eth0 speed 100 duplex full autoneg off
>> # ethtool -A eth0 tx off
>> Cannot set device pause parameters: Invalid argument
>> #
>> Because, the get read pause autoneg as enabled and only the tx_pause
>> member of the structure was updated.
> 
> This would be indeed unfortunate. We could use extack to make the error
> message easier to understand but the real problem IMHO is that
> ethtool_ops::get_pauseparam() returns value which is rejected by
> ethtool_ops::set_pauseparam(). This is something we should avoid.
> 
> If we really wanted to reject ethtool_pauseparam::autoneg on when
> general autonegotiation is off, we would have to disable pause
> autonegotiation whenever general autonegotiation is disabled. I don't
> like that idea, however, as that would mean that
> 
>   ethtool -s dev autoneg off ...
>   ethtool -s dev autoneg on ...
> 
> would reset the setting of pause autonegotiation.
> 
> Therefore I believe the comment should be rather replaced by a warning
> that even if ethtool_pauseparam::autoneg is enabled, pause
> autonegotiation is only active if general autonegotiation is also
> enabled.
> 
> Michal
> 
Thanks for your reply.

I agree with your concerns and will remove this commit from the set when
I resubmit. I also favor replacing the comment in ethtool.h.

-Doug
Doug Berger May 13, 2020, 3:48 a.m. UTC | #8
On 5/12/2020 11:55 AM, Russell King - ARM Linux admin wrote:
> On Tue, May 12, 2020 at 11:31:39AM -0700, Doug Berger wrote:
>> This was intended as a fix, but I thought it would be better to keep it
>> as part of this set for context and since net-next is currently open.
>>
>> The context is trying to improve the phylib support for offloading
>> ethtool pause configuration and this is something that could be checked
>> in a single location rather than by individual drivers.
>>
>> I included it here to get feedback about its appropriateness as a common
>> behavior. I should have been more explicit about that.
>>
>> Personally, I'm actually not that fond of this change since it can
>> easily be a source of confusion with the ethtool interface because the
>> link autonegotiation and the pause autonegotiation are controlled by
>> different commands.
>>
>> Since the ethtool -A command performs a read/modify/write of pause
>> parameters, you can get strange results like these:
>> # ethtool -s eth0 speed 100 duplex full autoneg off
>> # ethtool -A eth0 tx off
>> Cannot set device pause parameters: Invalid argument
>> #
>> Because, the get read pause autoneg as enabled and only the tx_pause
>> member of the structure was updated.
> 
> This looks like the same argument I've been having with Heiner over
> the EEE interface, except there's a difference here.
> 
> # ethtool -A eth0 autoneg on
> # ethtool -s eth0 autoneg off speed 100 duplex full
> 
> After those two commands, what is the state of pause mode?  The answer
> is, it's disabled.
> 
> # ethtool -A eth0 autoneg off rx on tx on
> 
> is perfectly acceptable, as we are forcing pause modes at the local
> end of the link.
> 
> # ethtool -A eth0 autoneg on
> 
> Now, the question is whether that should be allowed or not - but this
> is merely restoring the "pause" settings that were in effect prior
> to the previous command.  It does not enable pause negotiation,
> because autoneg as a whole is disabled, but it _allows_ pause
> negotiation to occur when autoneg is enabled at some point in the
> future.
> 
> Also, allowing "ethtool -A eth0 autoneg on" when "ethtool -s eth0
> autoneg off" means you can configure the negotiation parameters
> _before_ triggering a negotiation cycle on the link.  In other words,
> it would avoid:
> 
> # ethtool -s eth0 autoneg on
> # # Link renegotiates
> # ethtool -A eth0 autoneg on
> # # Link renegotiates a second time
> 
> and it also means that if stuff has already been scripted to avoid
> this, nothing breaks.
> 
> If we start rejecting ethtool -A because autoneg is disabled, then
> things get difficult to configure - we would need ethtool documentation
> to state that autoneg must be enabled before configuration of pause
> and EEE can be done.  IMHO, that hurts usability, and adds confusion.
> 
Thanks for your input and I agree with what you have said here. I will
remove this commit from the set when I resubmit and I assume that, like
Michal, you would like to see the comment in ethtool.h revised.

I think the crux of the matter is that the meaning of the autoneg pause
parameter is not well specified, and that is fundamentally what I am
trying to clarify in a common implementation that might help unify a
consistent behavior across network drivers.

My interpretation is that the link autonegotiation and the pause
autonegotiation can be meaningfully set independently from each other
and that the interplay between the two has easily overlooked subtleties.

My opinion (which is at least in part drawn from my interpretation of
your opinion) is as follows with regard to pause behaviors:

The link autonegotiation parameter concerns itself with whether the
Pause capabilities are advertised as part of autonegotiation of link
parameters.

The pause autonegotiation parameter concerns itself with whether the
local node is willing to accept the advertised capabilities of its peer
as input into its pause configuration.

The Tx_Pause and Rx_Pause parameters indicate in which directions pause
frames should be supported.

If the pause autonegotiation is off, the MAC is allowed to act
exclusively according to the Tx_Pause and Rx_Pause parameters. If
Tx_Pause is on the MAC should send pause control frames whenever it
needs to assert back pressure to ease the load on its receiver. If
Tx_Pause is off the MAC should not transmit any pause control frames. If
Rx_Pause is on the MAC should delay its transmissions in response to any
pause control frames it receives. If Rx_Pause is off received pause
control frames should be ignored. If link autonegotiation is on the
Tx_Pause and Rx_Pause values should be advertised in the PHY Pause and
AsymPause bits for informational purposes according to the following
mapping:
    tx rx  Pause AsymPause
    0  0   0     0
    0  1   1     1
    1  0   0     1
    1  1   1     0

If the pause autonegotiation is on, and the link autonegotiation is also
on then the Tx_Pause and Rx_Pause values should be advertised in the PHY
Pause and AsymPause bits according to the IEEE 802.3 spec according to
the following mapping:
    tx rx  Pause AsymPause
    0  0   0     0
    0  1   1     1
    1  0   0     1
    1  1   1     1
If link autonegotiation succeeds the peer's advertised Pause and
AsymPause bits should be used in combination with the local Pause and
Pause Asym bits to determine in which directions pause frames are
supported. However, regardless of the negotiated result, if the Tx_Pause
is off no pause frames should be sent and if the Rx_Pause is off
received pause frames should be ignored. If Tx_Pause is on and the
negotiated result allows pause frames to be sent then pause frames may
be sent by the local node to apply back pressure to reduce the load on
its receive path. If Rx_Pause is on and the negotiated result allows
pause frames to be received then the local node should delay its
transmission in response to received pause frames. In this way the local
settings can only override the negotiated settings to disable the use of
pause frames.

If the pause autonegotiation is on, and the link autonegotiation is off
then the values of the peer's Pause and AsymPause bits are forced to 0
(because they can't be exchanged without link autonegotiation) which
always produces the negotiated result of pause frame use being disabled
in both directions. Since the local Tx_Pause and Rx_Pause parameters can
only override the negotiation when they are off, pause frames should not
be sent or received.

This is the behavior I have attempted to implement by this patch set for
the bcmgenet driver, but I see now that I made an error in this last
case since I made the negotiation also dependent on the link
autonegotiation being enabled. I will correct that in a re-submission.

I would appreciate if you can confirm that you agree that this is a good
general behavior for all network devices before I resubmit, or please
help me understand what could be done better.

Many thanks,
    Doug
Russell King (Oracle) May 13, 2020, 5:34 a.m. UTC | #9
On Tue, May 12, 2020 at 08:48:22PM -0700, Doug Berger wrote:
> On 5/12/2020 11:55 AM, Russell King - ARM Linux admin wrote:
> > On Tue, May 12, 2020 at 11:31:39AM -0700, Doug Berger wrote:
> >> This was intended as a fix, but I thought it would be better to keep it
> >> as part of this set for context and since net-next is currently open.
> >>
> >> The context is trying to improve the phylib support for offloading
> >> ethtool pause configuration and this is something that could be checked
> >> in a single location rather than by individual drivers.
> >>
> >> I included it here to get feedback about its appropriateness as a common
> >> behavior. I should have been more explicit about that.
> >>
> >> Personally, I'm actually not that fond of this change since it can
> >> easily be a source of confusion with the ethtool interface because the
> >> link autonegotiation and the pause autonegotiation are controlled by
> >> different commands.
> >>
> >> Since the ethtool -A command performs a read/modify/write of pause
> >> parameters, you can get strange results like these:
> >> # ethtool -s eth0 speed 100 duplex full autoneg off
> >> # ethtool -A eth0 tx off
> >> Cannot set device pause parameters: Invalid argument
> >> #
> >> Because, the get read pause autoneg as enabled and only the tx_pause
> >> member of the structure was updated.
> > 
> > This looks like the same argument I've been having with Heiner over
> > the EEE interface, except there's a difference here.
> > 
> > # ethtool -A eth0 autoneg on
> > # ethtool -s eth0 autoneg off speed 100 duplex full
> > 
> > After those two commands, what is the state of pause mode?  The answer
> > is, it's disabled.
> > 
> > # ethtool -A eth0 autoneg off rx on tx on
> > 
> > is perfectly acceptable, as we are forcing pause modes at the local
> > end of the link.
> > 
> > # ethtool -A eth0 autoneg on
> > 
> > Now, the question is whether that should be allowed or not - but this
> > is merely restoring the "pause" settings that were in effect prior
> > to the previous command.  It does not enable pause negotiation,
> > because autoneg as a whole is disabled, but it _allows_ pause
> > negotiation to occur when autoneg is enabled at some point in the
> > future.
> > 
> > Also, allowing "ethtool -A eth0 autoneg on" when "ethtool -s eth0
> > autoneg off" means you can configure the negotiation parameters
> > _before_ triggering a negotiation cycle on the link.  In other words,
> > it would avoid:
> > 
> > # ethtool -s eth0 autoneg on
> > # # Link renegotiates
> > # ethtool -A eth0 autoneg on
> > # # Link renegotiates a second time
> > 
> > and it also means that if stuff has already been scripted to avoid
> > this, nothing breaks.
> > 
> > If we start rejecting ethtool -A because autoneg is disabled, then
> > things get difficult to configure - we would need ethtool documentation
> > to state that autoneg must be enabled before configuration of pause
> > and EEE can be done.  IMHO, that hurts usability, and adds confusion.
> > 
> Thanks for your input and I agree with what you have said here. I will
> remove this commit from the set when I resubmit and I assume that, like
> Michal, you would like to see the comment in ethtool.h revised.
> 
> I think the crux of the matter is that the meaning of the autoneg pause
> parameter is not well specified, and that is fundamentally what I am
> trying to clarify in a common implementation that might help unify a
> consistent behavior across network drivers.
> 
> My interpretation is that the link autonegotiation and the pause
> autonegotiation can be meaningfully set independently from each other
> and that the interplay between the two has easily overlooked subtleties.
> 
> My opinion (which is at least in part drawn from my interpretation of
> your opinion) is as follows with regard to pause behaviors:
> 
> The link autonegotiation parameter concerns itself with whether the
> Pause capabilities are advertised as part of autonegotiation of link
> parameters.
> 
> The pause autonegotiation parameter concerns itself with whether the
> local node is willing to accept the advertised capabilities of its peer
> as input into its pause configuration.
> 
> The Tx_Pause and Rx_Pause parameters indicate in which directions pause
> frames should be supported.

This is where the ethtool interface breaks down - they are unable
to sanely define which should be supported, as what you end up with
could be wildly different from what you thought.  See the
documentation against linkmode_set_pause() where I detail the issues
in this API.

For example, if you specify Tx_Pause = 0, Rx_Pause = 1, you can end
up with the pause negotiating transmit and receive pause.

If you specify Tx_Pause = 1, Rx_Pause = 1, and the far end supports
only AsymPause, then you end up with pause disabled, despite the
link actually being able to support receive pause at the local end.
Whereas if you specified Tx_Pause = 0, Rx_Pause=1 in this scenario,
you would get receive pause.  That's very counter intuitive.

> If the pause autonegotiation is off, the MAC is allowed to act
> exclusively according to the Tx_Pause and Rx_Pause parameters. If
> Tx_Pause is on the MAC should send pause control frames whenever it
> needs to assert back pressure to ease the load on its receiver. If
> Tx_Pause is off the MAC should not transmit any pause control frames. If
> Rx_Pause is on the MAC should delay its transmissions in response to any
> pause control frames it receives. If Rx_Pause is off received pause
> control frames should be ignored. If link autonegotiation is on the
> Tx_Pause and Rx_Pause values should be advertised in the PHY Pause and
> AsymPause bits for informational purposes according to the following
> mapping:
>     tx rx  Pause AsymPause
>     0  0   0     0
>     0  1   1     1
>     1  0   0     1
>     1  1   1     0

That is what is presently implemented by the helpers, and leads to
the above counter intuitive behaviour.

> If the pause autonegotiation is on, and the link autonegotiation is also
> on then the Tx_Pause and Rx_Pause values should be advertised in the PHY
> Pause and AsymPause bits according to the IEEE 802.3 spec according to
> the following mapping:
>     tx rx  Pause AsymPause
>     0  0   0     0
>     0  1   1     1
>     1  0   0     1
>     1  1   1     1

That would be an API change - and note that in the case of 'tx=0
rx=1' and the result of negotiation being used, you can still end
up with transmit and receive pause being enabled.

Basically, trying to define the pause advertisment in terms of
desired TX and RX pause enablement is *very* problematical - they
really do not mean anything as we can see if we work through the
various settings and results.

You're much better using the raw advertisment mask to set the
pause and asym pause bits manually.

> If link autonegotiation succeeds the peer's advertised Pause and
> AsymPause bits should be used in combination with the local Pause and
> Pause Asym bits to determine in which directions pause frames are
> supported. However, regardless of the negotiated result, if the Tx_Pause
> is off no pause frames should be sent and if the Rx_Pause is off
> received pause frames should be ignored. If Tx_Pause is on and the
> negotiated result allows pause frames to be sent then pause frames may
> be sent by the local node to apply back pressure to reduce the load on
> its receive path. If Rx_Pause is on and the negotiated result allows
> pause frames to be received then the local node should delay its
> transmission in response to received pause frames. In this way the local
> settings can only override the negotiated settings to disable the use of
> pause frames.
> 
> If the pause autonegotiation is on, and the link autonegotiation is off
> then the values of the peer's Pause and AsymPause bits are forced to 0
> (because they can't be exchanged without link autonegotiation) which
> always produces the negotiated result of pause frame use being disabled
> in both directions. Since the local Tx_Pause and Rx_Pause parameters can
> only override the negotiation when they are off, pause frames should not
> be sent or received.
> 
> This is the behavior I have attempted to implement by this patch set for
> the bcmgenet driver, but I see now that I made an error in this last
> case since I made the negotiation also dependent on the link
> autonegotiation being enabled. I will correct that in a re-submission.
> 
> I would appreciate if you can confirm that you agree that this is a good
> general behavior for all network devices before I resubmit, or please
> help me understand what could be done better.

It's gratifying that someone else has run into the same issue I did a
while back, has put thought into it, and come up with a similar idea
that I did.  You'll find your idea already spelt out in the comments
in phylink_ethtool_set_pauseparam().

However, as I say, it's an API change.

I've long considered the ethtool APIs to be very deficient in its
pause handling in many ways.  Another example is:

        Supported pause frame use: Symmetric Receive-only

which leads to the obvious observation: the link can negotiate that
this end should transmit only, but the terminology used here does
not seem to permit it (there's no "Transmit-only" indicated.) In
reality, one shold read "Asymmetric" for "Receive-only" in this
output, because that is exactly what the bit that controls that
indication is.
Russell King (Oracle) May 13, 2020, 9:20 a.m. UTC | #10
On Wed, May 13, 2020 at 06:34:05AM +0100, Russell King - ARM Linux admin wrote:
> On Tue, May 12, 2020 at 08:48:22PM -0700, Doug Berger wrote:
> > On 5/12/2020 11:55 AM, Russell King - ARM Linux admin wrote:
> > > On Tue, May 12, 2020 at 11:31:39AM -0700, Doug Berger wrote:
> > >> This was intended as a fix, but I thought it would be better to keep it
> > >> as part of this set for context and since net-next is currently open.
> > >>
> > >> The context is trying to improve the phylib support for offloading
> > >> ethtool pause configuration and this is something that could be checked
> > >> in a single location rather than by individual drivers.
> > >>
> > >> I included it here to get feedback about its appropriateness as a common
> > >> behavior. I should have been more explicit about that.
> > >>
> > >> Personally, I'm actually not that fond of this change since it can
> > >> easily be a source of confusion with the ethtool interface because the
> > >> link autonegotiation and the pause autonegotiation are controlled by
> > >> different commands.
> > >>
> > >> Since the ethtool -A command performs a read/modify/write of pause
> > >> parameters, you can get strange results like these:
> > >> # ethtool -s eth0 speed 100 duplex full autoneg off
> > >> # ethtool -A eth0 tx off
> > >> Cannot set device pause parameters: Invalid argument
> > >> #
> > >> Because, the get read pause autoneg as enabled and only the tx_pause
> > >> member of the structure was updated.
> > > 
> > > This looks like the same argument I've been having with Heiner over
> > > the EEE interface, except there's a difference here.
> > > 
> > > # ethtool -A eth0 autoneg on
> > > # ethtool -s eth0 autoneg off speed 100 duplex full
> > > 
> > > After those two commands, what is the state of pause mode?  The answer
> > > is, it's disabled.
> > > 
> > > # ethtool -A eth0 autoneg off rx on tx on
> > > 
> > > is perfectly acceptable, as we are forcing pause modes at the local
> > > end of the link.
> > > 
> > > # ethtool -A eth0 autoneg on
> > > 
> > > Now, the question is whether that should be allowed or not - but this
> > > is merely restoring the "pause" settings that were in effect prior
> > > to the previous command.  It does not enable pause negotiation,
> > > because autoneg as a whole is disabled, but it _allows_ pause
> > > negotiation to occur when autoneg is enabled at some point in the
> > > future.
> > > 
> > > Also, allowing "ethtool -A eth0 autoneg on" when "ethtool -s eth0
> > > autoneg off" means you can configure the negotiation parameters
> > > _before_ triggering a negotiation cycle on the link.  In other words,
> > > it would avoid:
> > > 
> > > # ethtool -s eth0 autoneg on
> > > # # Link renegotiates
> > > # ethtool -A eth0 autoneg on
> > > # # Link renegotiates a second time
> > > 
> > > and it also means that if stuff has already been scripted to avoid
> > > this, nothing breaks.
> > > 
> > > If we start rejecting ethtool -A because autoneg is disabled, then
> > > things get difficult to configure - we would need ethtool documentation
> > > to state that autoneg must be enabled before configuration of pause
> > > and EEE can be done.  IMHO, that hurts usability, and adds confusion.
> > > 
> > Thanks for your input and I agree with what you have said here. I will
> > remove this commit from the set when I resubmit and I assume that, like
> > Michal, you would like to see the comment in ethtool.h revised.
> > 
> > I think the crux of the matter is that the meaning of the autoneg pause
> > parameter is not well specified, and that is fundamentally what I am
> > trying to clarify in a common implementation that might help unify a
> > consistent behavior across network drivers.
> > 
> > My interpretation is that the link autonegotiation and the pause
> > autonegotiation can be meaningfully set independently from each other
> > and that the interplay between the two has easily overlooked subtleties.
> > 
> > My opinion (which is at least in part drawn from my interpretation of
> > your opinion) is as follows with regard to pause behaviors:
> > 
> > The link autonegotiation parameter concerns itself with whether the
> > Pause capabilities are advertised as part of autonegotiation of link
> > parameters.
> > 
> > The pause autonegotiation parameter concerns itself with whether the
> > local node is willing to accept the advertised capabilities of its peer
> > as input into its pause configuration.
> > 
> > The Tx_Pause and Rx_Pause parameters indicate in which directions pause
> > frames should be supported.
> 
> This is where the ethtool interface breaks down - they are unable
> to sanely define which should be supported, as what you end up with
> could be wildly different from what you thought.  See the
> documentation against linkmode_set_pause() where I detail the issues
> in this API.
> 
> For example, if you specify Tx_Pause = 0, Rx_Pause = 1, you can end
> up with the pause negotiating transmit and receive pause.
> 
> If you specify Tx_Pause = 1, Rx_Pause = 1, and the far end supports
> only AsymPause, then you end up with pause disabled, despite the
> link actually being able to support receive pause at the local end.
> Whereas if you specified Tx_Pause = 0, Rx_Pause=1 in this scenario,
> you would get receive pause.  That's very counter intuitive.
> 
> > If the pause autonegotiation is off, the MAC is allowed to act
> > exclusively according to the Tx_Pause and Rx_Pause parameters. If
> > Tx_Pause is on the MAC should send pause control frames whenever it
> > needs to assert back pressure to ease the load on its receiver. If
> > Tx_Pause is off the MAC should not transmit any pause control frames. If
> > Rx_Pause is on the MAC should delay its transmissions in response to any
> > pause control frames it receives. If Rx_Pause is off received pause
> > control frames should be ignored. If link autonegotiation is on the
> > Tx_Pause and Rx_Pause values should be advertised in the PHY Pause and
> > AsymPause bits for informational purposes according to the following
> > mapping:
> >     tx rx  Pause AsymPause
> >     0  0   0     0
> >     0  1   1     1
> >     1  0   0     1
> >     1  1   1     0
> 
> That is what is presently implemented by the helpers, and leads to
> the above counter intuitive behaviour.
> 
> > If the pause autonegotiation is on, and the link autonegotiation is also
> > on then the Tx_Pause and Rx_Pause values should be advertised in the PHY
> > Pause and AsymPause bits according to the IEEE 802.3 spec according to
> > the following mapping:
> >     tx rx  Pause AsymPause
> >     0  0   0     0
> >     0  1   1     1
> >     1  0   0     1
> >     1  1   1     1
> 
> That would be an API change - and note that in the case of 'tx=0
> rx=1' and the result of negotiation being used, you can still end
> up with transmit and receive pause being enabled.
> 
> Basically, trying to define the pause advertisment in terms of
> desired TX and RX pause enablement is *very* problematical - they
> really do not mean anything as we can see if we work through the
> various settings and results.
> 
> You're much better using the raw advertisment mask to set the
> pause and asym pause bits manually.
> 
> > If link autonegotiation succeeds the peer's advertised Pause and
> > AsymPause bits should be used in combination with the local Pause and
> > Pause Asym bits to determine in which directions pause frames are
> > supported. However, regardless of the negotiated result, if the Tx_Pause
> > is off no pause frames should be sent and if the Rx_Pause is off
> > received pause frames should be ignored. If Tx_Pause is on and the
> > negotiated result allows pause frames to be sent then pause frames may
> > be sent by the local node to apply back pressure to reduce the load on
> > its receive path. If Rx_Pause is on and the negotiated result allows
> > pause frames to be received then the local node should delay its
> > transmission in response to received pause frames. In this way the local
> > settings can only override the negotiated settings to disable the use of
> > pause frames.
> > 
> > If the pause autonegotiation is on, and the link autonegotiation is off
> > then the values of the peer's Pause and AsymPause bits are forced to 0
> > (because they can't be exchanged without link autonegotiation) which
> > always produces the negotiated result of pause frame use being disabled
> > in both directions. Since the local Tx_Pause and Rx_Pause parameters can
> > only override the negotiation when they are off, pause frames should not
> > be sent or received.
> > 
> > This is the behavior I have attempted to implement by this patch set for
> > the bcmgenet driver, but I see now that I made an error in this last
> > case since I made the negotiation also dependent on the link
> > autonegotiation being enabled. I will correct that in a re-submission.
> > 
> > I would appreciate if you can confirm that you agree that this is a good
> > general behavior for all network devices before I resubmit, or please
> > help me understand what could be done better.
> 
> It's gratifying that someone else has run into the same issue I did a
> while back, has put thought into it, and come up with a similar idea
> that I did.  You'll find your idea already spelt out in the comments
> in phylink_ethtool_set_pauseparam().
> 
> However, as I say, it's an API change.
> 
> I've long considered the ethtool APIs to be very deficient in its
> pause handling in many ways.  Another example is:
> 
>         Supported pause frame use: Symmetric Receive-only
> 
> which leads to the obvious observation: the link can negotiate that
> this end should transmit only, but the terminology used here does
> not seem to permit it (there's no "Transmit-only" indicated.) In
> reality, one shold read "Asymmetric" for "Receive-only" in this
> output, because that is exactly what the bit that controls that
> indication is.

One thing I didn't mention is - although I identified the same problem
w.r.t interpretation of pause tx/rx to advertisement, I regard that
it's more important for a user interface to behave consistently.
phylib implements generic code that converts the set_pauseparam method
parameters to the PHY advertisement using the "very-odd" behaviour.

Since phylink uses phylib, phylink has to follow phylib, otherwise we
end up with the API having different behaviours depending on which SFP
modules are plugged in, or whether the network driver is connected to
a PHY or SFP.

So, I think consistency of implementation is more important than fixing
this; the current behaviour has been established for many years now.
Andrew Lunn May 13, 2020, 1:49 p.m. UTC | #11
> So, I think consistency of implementation is more important than fixing
> this; the current behaviour has been established for many years now.

Hi Russell, Doug

With netlink ethtool we have the possibility of adding a new API to
control this. And we can leave the IOCTL API alone, and the current
ethtool commands. We can add a new command to ethtool which uses the new API.

Question is, do we want to do this? Would we be introducing yet more
confusion, rather than making the situation better?

	Andrew
Michal Kubecek May 13, 2020, 2:59 p.m. UTC | #12
On Wed, May 13, 2020 at 03:49:25PM +0200, Andrew Lunn wrote:
> > So, I think consistency of implementation is more important than fixing
> > this; the current behaviour has been established for many years now.
> 
> With netlink ethtool we have the possibility of adding a new API to
> control this. And we can leave the IOCTL API alone, and the current
> ethtool commands. We can add a new command to ethtool which uses the new API.
> 
> Question is, do we want to do this? Would we be introducing yet more
> confusion, rather than making the situation better?

For the record, netlink interface for pause parameters which is based on
existing ioctl and ethtool_ops is in mainline but not in v5.6. If there
is a consensus that it should be rethought, it might still be possible
to drop these two request types and come with a better API later (i.e.
in 5.8 or 5.9 cycle).

Michal
Russell King (Oracle) May 13, 2020, 5:14 p.m. UTC | #13
On Wed, May 13, 2020 at 03:49:25PM +0200, Andrew Lunn wrote:
> Hi Russell, Doug
> 
> With netlink ethtool we have the possibility of adding a new API to
> control this. And we can leave the IOCTL API alone, and the current
> ethtool commands. We can add a new command to ethtool which uses the new API.
> 
> Question is, do we want to do this? Would we be introducing yet more
> confusion, rather than making the situation better?

The conclusion I came to was that I would document the deficiencies
and do no more; I think people are used to its current quirky
behaviour.
Doug Berger May 13, 2020, 9:27 p.m. UTC | #14
On 5/12/2020 10:34 PM, Russell King - ARM Linux admin wrote:
> On Tue, May 12, 2020 at 08:48:22PM -0700, Doug Berger wrote:
>> On 5/12/2020 11:55 AM, Russell King - ARM Linux admin wrote:
>>> On Tue, May 12, 2020 at 11:31:39AM -0700, Doug Berger wrote:
>>>> This was intended as a fix, but I thought it would be better to keep it
>>>> as part of this set for context and since net-next is currently open.
>>>>
>>>> The context is trying to improve the phylib support for offloading
>>>> ethtool pause configuration and this is something that could be checked
>>>> in a single location rather than by individual drivers.
>>>>
>>>> I included it here to get feedback about its appropriateness as a common
>>>> behavior. I should have been more explicit about that.
>>>>
>>>> Personally, I'm actually not that fond of this change since it can
>>>> easily be a source of confusion with the ethtool interface because the
>>>> link autonegotiation and the pause autonegotiation are controlled by
>>>> different commands.
>>>>
>>>> Since the ethtool -A command performs a read/modify/write of pause
>>>> parameters, you can get strange results like these:
>>>> # ethtool -s eth0 speed 100 duplex full autoneg off
>>>> # ethtool -A eth0 tx off
>>>> Cannot set device pause parameters: Invalid argument
>>>> #
>>>> Because, the get read pause autoneg as enabled and only the tx_pause
>>>> member of the structure was updated.
>>>
>>> This looks like the same argument I've been having with Heiner over
>>> the EEE interface, except there's a difference here.
>>>
>>> # ethtool -A eth0 autoneg on
>>> # ethtool -s eth0 autoneg off speed 100 duplex full
>>>
>>> After those two commands, what is the state of pause mode?  The answer
>>> is, it's disabled.
>>>
>>> # ethtool -A eth0 autoneg off rx on tx on
>>>
>>> is perfectly acceptable, as we are forcing pause modes at the local
>>> end of the link.
>>>
>>> # ethtool -A eth0 autoneg on
>>>
>>> Now, the question is whether that should be allowed or not - but this
>>> is merely restoring the "pause" settings that were in effect prior
>>> to the previous command.  It does not enable pause negotiation,
>>> because autoneg as a whole is disabled, but it _allows_ pause
>>> negotiation to occur when autoneg is enabled at some point in the
>>> future.
>>>
>>> Also, allowing "ethtool -A eth0 autoneg on" when "ethtool -s eth0
>>> autoneg off" means you can configure the negotiation parameters
>>> _before_ triggering a negotiation cycle on the link.  In other words,
>>> it would avoid:
>>>
>>> # ethtool -s eth0 autoneg on
>>> # # Link renegotiates
>>> # ethtool -A eth0 autoneg on
>>> # # Link renegotiates a second time
>>>
>>> and it also means that if stuff has already been scripted to avoid
>>> this, nothing breaks.
>>>
>>> If we start rejecting ethtool -A because autoneg is disabled, then
>>> things get difficult to configure - we would need ethtool documentation
>>> to state that autoneg must be enabled before configuration of pause
>>> and EEE can be done.  IMHO, that hurts usability, and adds confusion.
>>>
>> Thanks for your input and I agree with what you have said here. I will
>> remove this commit from the set when I resubmit and I assume that, like
>> Michal, you would like to see the comment in ethtool.h revised.
>>
>> I think the crux of the matter is that the meaning of the autoneg pause
>> parameter is not well specified, and that is fundamentally what I am
>> trying to clarify in a common implementation that might help unify a
>> consistent behavior across network drivers.
>>
>> My interpretation is that the link autonegotiation and the pause
>> autonegotiation can be meaningfully set independently from each other
>> and that the interplay between the two has easily overlooked subtleties.
>>
>> My opinion (which is at least in part drawn from my interpretation of
>> your opinion) is as follows with regard to pause behaviors:
>>
>> The link autonegotiation parameter concerns itself with whether the
>> Pause capabilities are advertised as part of autonegotiation of link
>> parameters.
>>
>> The pause autonegotiation parameter concerns itself with whether the
>> local node is willing to accept the advertised capabilities of its peer
>> as input into its pause configuration.
>>
>> The Tx_Pause and Rx_Pause parameters indicate in which directions pause
>> frames should be supported.
> 
> This is where the ethtool interface breaks down - they are unable
> to sanely define which should be supported, as what you end up with
> could be wildly different from what you thought.  See the
> documentation against linkmode_set_pause() where I detail the issues
> in this API.
> 
> For example, if you specify Tx_Pause = 0, Rx_Pause = 1, you can end
> up with the pause negotiating transmit and receive pause.
> 
> If you specify Tx_Pause = 1, Rx_Pause = 1, and the far end supports
> only AsymPause, then you end up with pause disabled, despite the
> link actually being able to support receive pause at the local end.
> Whereas if you specified Tx_Pause = 0, Rx_Pause=1 in this scenario,
> you would get receive pause.  That's very counter intuitive.
Yes, your documentation of these deficiencies in the current
implementation are very helpful.

>> If the pause autonegotiation is off, the MAC is allowed to act
>> exclusively according to the Tx_Pause and Rx_Pause parameters. If
>> Tx_Pause is on the MAC should send pause control frames whenever it
>> needs to assert back pressure to ease the load on its receiver. If
>> Tx_Pause is off the MAC should not transmit any pause control frames. If
>> Rx_Pause is on the MAC should delay its transmissions in response to any
>> pause control frames it receives. If Rx_Pause is off received pause
>> control frames should be ignored. If link autonegotiation is on the
>> Tx_Pause and Rx_Pause values should be advertised in the PHY Pause and
>> AsymPause bits for informational purposes according to the following
>> mapping:
>>     tx rx  Pause AsymPause
>>     0  0   0     0
>>     0  1   1     1
>>     1  0   0     1
>>     1  1   1     0
> 
> That is what is presently implemented by the helpers, and leads to
> the above counter intuitive behaviour.
Exactly, and that is by intent. I have retained this to allow backward
compatibility for existing drivers that wish to retain the current
behavior. I believe this mapping is not compliant with the IEEE 802.3
standard because of the deficiencies you have documented. However, I
have tried to make an argument in patch 2 of this set for why it might
make sense to advertise this way for a local node that is unwilling to
negotiate its pause capabilities and doesn't care about standard compliance.

>> If the pause autonegotiation is on, and the link autonegotiation is also
>> on then the Tx_Pause and Rx_Pause values should be advertised in the PHY
>> Pause and AsymPause bits according to the IEEE 802.3 spec according to
>> the following mapping:
>>     tx rx  Pause AsymPause
>>     0  0   0     0
>>     0  1   1     1
>>     1  0   0     1
>>     1  1   1     1
> 
> That would be an API change - and note that in the case of 'tx=0
> rx=1' and the result of negotiation being used, you can still end
> up with transmit and receive pause being enabled.
Yes, that is the API change in patch 2 of this set.

> Basically, trying to define the pause advertisment in terms of
> desired TX and RX pause enablement is *very* problematical - they
> really do not mean anything as we can see if we work through the
> various settings and results.
Yes, there is conflation of meaning between these parameters.

> You're much better using the raw advertisment mask to set the
> pause and asym pause bits manually.
Perhaps, but because advertisement is handled by a different ethtool
command and it is very easy to get wrong it would be nice to have a
better default behavior.

>> If link autonegotiation succeeds the peer's advertised Pause and
>> AsymPause bits should be used in combination with the local Pause and
>> Pause Asym bits to determine in which directions pause frames are
>> supported. However, regardless of the negotiated result, if the Tx_Pause
>> is off no pause frames should be sent and if the Rx_Pause is off
>> received pause frames should be ignored. If Tx_Pause is on and the
>> negotiated result allows pause frames to be sent then pause frames may
>> be sent by the local node to apply back pressure to reduce the load on
>> its receive path. If Rx_Pause is on and the negotiated result allows
>> pause frames to be received then the local node should delay its
>> transmission in response to received pause frames. In this way the local
>> settings can only override the negotiated settings to disable the use of
>> pause frames.
>>
>> If the pause autonegotiation is on, and the link autonegotiation is off
>> then the values of the peer's Pause and AsymPause bits are forced to 0
>> (because they can't be exchanged without link autonegotiation) which
>> always produces the negotiated result of pause frame use being disabled
>> in both directions. Since the local Tx_Pause and Rx_Pause parameters can
>> only override the negotiation when they are off, pause frames should not
>> be sent or received.
>>
>> This is the behavior I have attempted to implement by this patch set for
>> the bcmgenet driver, but I see now that I made an error in this last
>> case since I made the negotiation also dependent on the link
>> autonegotiation being enabled. I will correct that in a re-submission.
>>
>> I would appreciate if you can confirm that you agree that this is a good
>> general behavior for all network devices before I resubmit, or please
>> help me understand what could be done better.
> 
> It's gratifying that someone else has run into the same issue I did a
> while back, has put thought into it, and come up with a similar idea
> that I did.  You'll find your idea already spelt out in the comments
> in phylink_ethtool_set_pauseparam().
Yes, I did find it there and actually included it verbatim in the cover
letter to this set.

> However, as I say, it's an API change.
> 
> I've long considered the ethtool APIs to be very deficient in its
> pause handling in many ways.  Another example is:
> 
>         Supported pause frame use: Symmetric Receive-only
> 
> which leads to the obvious observation: the link can negotiate that
> this end should transmit only, but the terminology used here does
> not seem to permit it (there's no "Transmit-only" indicated.) In
> reality, one shold read "Asymmetric" for "Receive-only" in this
> output, because that is exactly what the bit that controls that
> indication is.
Absolutely.

The Pause and AsymPause bits as defined by the IEEE 802.3 standard are
for the purpose of advertising a capability. While the Tx_Pause and
Rx_Pause parameters of ethtool allow a user to indicate whether the
feature should be used on a link that is capable of the feature.

When pause autonegotiation is enabled the local and peer Pause and
AsymPause bits should be used to negotiate the CAPABILITY of using the
pause feature for each direction. This is not the same as enabling pause
in those directions.

So for the problematic cases:

If you specify Tx_Pause = 0, Rx_Pause = 1 you advertise that the link is
capable of both Symmetric PAUSE and Asymmetric PAUSE toward local device
according to Table 37-2 in IEEE Std 802.3-2018. If the result of link
autonegotiation indicates that both directions are capable of supporting
pause control frames you choose not to send pause control frames because
the user asked you not to by setting Tx_Pause = 0.

If you specify Tx_Pause = 1, Rx_Pause = 1 you advertise that the link is
capable of both Symmetric PAUSE and Asymmetric PAUSE toward local device
according to Table 37-2 in IEEE Std 802.3-2018. If the far end supports
only AsymPause, then the link autonegotiation will indicate that only
the receive direction is capable of supporting the pause feature and you
should not send pause control frames to the peer even though the user
has set Tx_Pause = 1.

If link autonegotiation is disabled, then the capability of the link to
support pause frames cannot be negotiated and therefore pause control
frames should not be used.

When pause autonegotiation is disabled the local peer does not care what
its peer is capable of and it can choose to send and process pause
control frames based entirely, on the users requested Tx_Pause and
Rx_Pause parameters. However, if link autonegotiation is enabled it
might as well not be rude and should advertise its intended usage.
Admittedly, Tx_Pause = 1 and Rx_Pause = 1 setting Pause and clearing
AsymPause is still "weird" behavior, but I'm really only retaining it
for backward compatibility. I don't really know why someone thought that
would be a good mapping and why its use has propagated, but I am not
willing to implement that behavior in the bcmgenet driver.

I was considering including a patch in this set to modify the phylink
behavior to use this more rational interpretation as suggested by your
comment there, but it wouldn't allow users to opt-in to the new behavior
so I stopped short of that.

My secret hope was that you would be inspired by this implementation to
champion such an effort in the phylink. :)

I still would like to see support for this standard compliant mapping in
the phylib, but I can localize the changes to a custom implementation in
the bcmgenet driver if necessary.
Doug Berger May 13, 2020, 9:31 p.m. UTC | #15
On 5/13/2020 2:20 AM, Russell King - ARM Linux admin wrote:
> On Wed, May 13, 2020 at 06:34:05AM +0100, Russell King - ARM Linux admin wrote:
>> On Tue, May 12, 2020 at 08:48:22PM -0700, Doug Berger wrote:
>>> On 5/12/2020 11:55 AM, Russell King - ARM Linux admin wrote:
>>>> On Tue, May 12, 2020 at 11:31:39AM -0700, Doug Berger wrote:
>>>>> This was intended as a fix, but I thought it would be better to keep it
>>>>> as part of this set for context and since net-next is currently open.
>>>>>
>>>>> The context is trying to improve the phylib support for offloading
>>>>> ethtool pause configuration and this is something that could be checked
>>>>> in a single location rather than by individual drivers.
>>>>>
>>>>> I included it here to get feedback about its appropriateness as a common
>>>>> behavior. I should have been more explicit about that.
>>>>>
>>>>> Personally, I'm actually not that fond of this change since it can
>>>>> easily be a source of confusion with the ethtool interface because the
>>>>> link autonegotiation and the pause autonegotiation are controlled by
>>>>> different commands.
>>>>>
>>>>> Since the ethtool -A command performs a read/modify/write of pause
>>>>> parameters, you can get strange results like these:
>>>>> # ethtool -s eth0 speed 100 duplex full autoneg off
>>>>> # ethtool -A eth0 tx off
>>>>> Cannot set device pause parameters: Invalid argument
>>>>> #
>>>>> Because, the get read pause autoneg as enabled and only the tx_pause
>>>>> member of the structure was updated.
>>>>
>>>> This looks like the same argument I've been having with Heiner over
>>>> the EEE interface, except there's a difference here.
>>>>
>>>> # ethtool -A eth0 autoneg on
>>>> # ethtool -s eth0 autoneg off speed 100 duplex full
>>>>
>>>> After those two commands, what is the state of pause mode?  The answer
>>>> is, it's disabled.
>>>>
>>>> # ethtool -A eth0 autoneg off rx on tx on
>>>>
>>>> is perfectly acceptable, as we are forcing pause modes at the local
>>>> end of the link.
>>>>
>>>> # ethtool -A eth0 autoneg on
>>>>
>>>> Now, the question is whether that should be allowed or not - but this
>>>> is merely restoring the "pause" settings that were in effect prior
>>>> to the previous command.  It does not enable pause negotiation,
>>>> because autoneg as a whole is disabled, but it _allows_ pause
>>>> negotiation to occur when autoneg is enabled at some point in the
>>>> future.
>>>>
>>>> Also, allowing "ethtool -A eth0 autoneg on" when "ethtool -s eth0
>>>> autoneg off" means you can configure the negotiation parameters
>>>> _before_ triggering a negotiation cycle on the link.  In other words,
>>>> it would avoid:
>>>>
>>>> # ethtool -s eth0 autoneg on
>>>> # # Link renegotiates
>>>> # ethtool -A eth0 autoneg on
>>>> # # Link renegotiates a second time
>>>>
>>>> and it also means that if stuff has already been scripted to avoid
>>>> this, nothing breaks.
>>>>
>>>> If we start rejecting ethtool -A because autoneg is disabled, then
>>>> things get difficult to configure - we would need ethtool documentation
>>>> to state that autoneg must be enabled before configuration of pause
>>>> and EEE can be done.  IMHO, that hurts usability, and adds confusion.
>>>>
>>> Thanks for your input and I agree with what you have said here. I will
>>> remove this commit from the set when I resubmit and I assume that, like
>>> Michal, you would like to see the comment in ethtool.h revised.
>>>
>>> I think the crux of the matter is that the meaning of the autoneg pause
>>> parameter is not well specified, and that is fundamentally what I am
>>> trying to clarify in a common implementation that might help unify a
>>> consistent behavior across network drivers.
>>>
>>> My interpretation is that the link autonegotiation and the pause
>>> autonegotiation can be meaningfully set independently from each other
>>> and that the interplay between the two has easily overlooked subtleties.
>>>
>>> My opinion (which is at least in part drawn from my interpretation of
>>> your opinion) is as follows with regard to pause behaviors:
>>>
>>> The link autonegotiation parameter concerns itself with whether the
>>> Pause capabilities are advertised as part of autonegotiation of link
>>> parameters.
>>>
>>> The pause autonegotiation parameter concerns itself with whether the
>>> local node is willing to accept the advertised capabilities of its peer
>>> as input into its pause configuration.
>>>
>>> The Tx_Pause and Rx_Pause parameters indicate in which directions pause
>>> frames should be supported.
>>
>> This is where the ethtool interface breaks down - they are unable
>> to sanely define which should be supported, as what you end up with
>> could be wildly different from what you thought.  See the
>> documentation against linkmode_set_pause() where I detail the issues
>> in this API.
>>
>> For example, if you specify Tx_Pause = 0, Rx_Pause = 1, you can end
>> up with the pause negotiating transmit and receive pause.
>>
>> If you specify Tx_Pause = 1, Rx_Pause = 1, and the far end supports
>> only AsymPause, then you end up with pause disabled, despite the
>> link actually being able to support receive pause at the local end.
>> Whereas if you specified Tx_Pause = 0, Rx_Pause=1 in this scenario,
>> you would get receive pause.  That's very counter intuitive.
>>
>>> If the pause autonegotiation is off, the MAC is allowed to act
>>> exclusively according to the Tx_Pause and Rx_Pause parameters. If
>>> Tx_Pause is on the MAC should send pause control frames whenever it
>>> needs to assert back pressure to ease the load on its receiver. If
>>> Tx_Pause is off the MAC should not transmit any pause control frames. If
>>> Rx_Pause is on the MAC should delay its transmissions in response to any
>>> pause control frames it receives. If Rx_Pause is off received pause
>>> control frames should be ignored. If link autonegotiation is on the
>>> Tx_Pause and Rx_Pause values should be advertised in the PHY Pause and
>>> AsymPause bits for informational purposes according to the following
>>> mapping:
>>>     tx rx  Pause AsymPause
>>>     0  0   0     0
>>>     0  1   1     1
>>>     1  0   0     1
>>>     1  1   1     0
>>
>> That is what is presently implemented by the helpers, and leads to
>> the above counter intuitive behaviour.
>>
>>> If the pause autonegotiation is on, and the link autonegotiation is also
>>> on then the Tx_Pause and Rx_Pause values should be advertised in the PHY
>>> Pause and AsymPause bits according to the IEEE 802.3 spec according to
>>> the following mapping:
>>>     tx rx  Pause AsymPause
>>>     0  0   0     0
>>>     0  1   1     1
>>>     1  0   0     1
>>>     1  1   1     1
>>
>> That would be an API change - and note that in the case of 'tx=0
>> rx=1' and the result of negotiation being used, you can still end
>> up with transmit and receive pause being enabled.
>>
>> Basically, trying to define the pause advertisment in terms of
>> desired TX and RX pause enablement is *very* problematical - they
>> really do not mean anything as we can see if we work through the
>> various settings and results.
>>
>> You're much better using the raw advertisment mask to set the
>> pause and asym pause bits manually.
>>
>>> If link autonegotiation succeeds the peer's advertised Pause and
>>> AsymPause bits should be used in combination with the local Pause and
>>> Pause Asym bits to determine in which directions pause frames are
>>> supported. However, regardless of the negotiated result, if the Tx_Pause
>>> is off no pause frames should be sent and if the Rx_Pause is off
>>> received pause frames should be ignored. If Tx_Pause is on and the
>>> negotiated result allows pause frames to be sent then pause frames may
>>> be sent by the local node to apply back pressure to reduce the load on
>>> its receive path. If Rx_Pause is on and the negotiated result allows
>>> pause frames to be received then the local node should delay its
>>> transmission in response to received pause frames. In this way the local
>>> settings can only override the negotiated settings to disable the use of
>>> pause frames.
>>>
>>> If the pause autonegotiation is on, and the link autonegotiation is off
>>> then the values of the peer's Pause and AsymPause bits are forced to 0
>>> (because they can't be exchanged without link autonegotiation) which
>>> always produces the negotiated result of pause frame use being disabled
>>> in both directions. Since the local Tx_Pause and Rx_Pause parameters can
>>> only override the negotiation when they are off, pause frames should not
>>> be sent or received.
>>>
>>> This is the behavior I have attempted to implement by this patch set for
>>> the bcmgenet driver, but I see now that I made an error in this last
>>> case since I made the negotiation also dependent on the link
>>> autonegotiation being enabled. I will correct that in a re-submission.
>>>
>>> I would appreciate if you can confirm that you agree that this is a good
>>> general behavior for all network devices before I resubmit, or please
>>> help me understand what could be done better.
>>
>> It's gratifying that someone else has run into the same issue I did a
>> while back, has put thought into it, and come up with a similar idea
>> that I did.  You'll find your idea already spelt out in the comments
>> in phylink_ethtool_set_pauseparam().
>>
>> However, as I say, it's an API change.
>>
>> I've long considered the ethtool APIs to be very deficient in its
>> pause handling in many ways.  Another example is:
>>
>>         Supported pause frame use: Symmetric Receive-only
>>
>> which leads to the obvious observation: the link can negotiate that
>> this end should transmit only, but the terminology used here does
>> not seem to permit it (there's no "Transmit-only" indicated.) In
>> reality, one shold read "Asymmetric" for "Receive-only" in this
>> output, because that is exactly what the bit that controls that
>> indication is.
> 
> One thing I didn't mention is - although I identified the same problem
> w.r.t interpretation of pause tx/rx to advertisement, I regard that
> it's more important for a user interface to behave consistently.
> phylib implements generic code that converts the set_pauseparam method
> parameters to the PHY advertisement using the "very-odd" behaviour.
> 
> Since phylink uses phylib, phylink has to follow phylib, otherwise we
> end up with the API having different behaviours depending on which SFP
> modules are plugged in, or whether the network driver is connected to
> a PHY or SFP.
> 
> So, I think consistency of implementation is more important than fixing
> this; the current behaviour has been established for many years now.
> 
I agree that consistency is important. I had just hoped that since there
really is surprisingly little support for this feature among network
drivers (I assume largely because of this lack of basic support in the
phylib) that there might still be a chance to get it changed.
Doug Berger May 13, 2020, 10:09 p.m. UTC | #16
On 5/13/2020 6:49 AM, Andrew Lunn wrote:
>> So, I think consistency of implementation is more important than fixing
>> this; the current behaviour has been established for many years now.
> 
> Hi Russell, Doug
> 
> With netlink ethtool we have the possibility of adding a new API to
> control this. And we can leave the IOCTL API alone, and the current
> ethtool commands. We can add a new command to ethtool which uses the new API.
> 
> Question is, do we want to do this? Would we be introducing yet more
> confusion, rather than making the situation better?
> 
> 	Andrew
> 
I think it is likely to introduce more confusion.
-Doug
diff mbox series

Patch

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index c3a107cf578e..5a9618ba232e 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -2624,6 +2624,9 @@  EXPORT_SYMBOL(phy_set_asym_pause);
 bool phy_validate_pause(struct phy_device *phydev,
 			struct ethtool_pauseparam *pp)
 {
+	if (pp->autoneg && !phydev->autoneg)
+		return false;
+
 	if (!linkmode_test_bit(ETHTOOL_LINK_MODE_Pause_BIT,
 			       phydev->supported) && pp->rx_pause)
 		return false;