[iwl,v1,0/5] igc: TX timestamping fixes

Message ID 20230504235233.1850428-1-vinicius.gomes@intel.com

Message

Vinicius Costa Gomes May 4, 2023, 11:52 p.m. UTC
Hi,

Changes from the "for-next-queue" version:
 - As this is intended for the iwl/net-queue tree, dropped the patches
   adding support for the "extra" tstamp registers;
 - Added "Fixes:" tags to the appropriate patches (Vladimir Oltean);
 - Improved the check to catch the case where the skb has the
   SKBTX_HW_TSTAMP flag, but TX timestamping is not enabled (Vladimir
   Oltean);
 - Only check for timestamping timeouts if TX timestamping is enabled
   (Vladimir Oltean);
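
The last two items can be sketched in plain C roughly as follows. This
is a user-space model under stated assumptions, not the driver code:
"struct adapter_model", its fields, and both helper names are
hypothetical, and SKBTX_HW_TSTAMP is redefined locally only so the
snippet builds on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in so the sketch builds outside the kernel tree. */
#define SKBTX_HW_TSTAMP (1u << 0)

/* Hypothetical, simplified view of the adapter state. */
struct adapter_model {
	bool tx_tstamp_enabled;   /* HW TX timestamping was turned on */
	bool tstamp_pending;      /* a timestamp request is in flight */
	unsigned long tx_start;   /* tick at which the request was queued */
};

/*
 * Ask the hardware for a timestamp only when the skb requests one AND
 * TX timestamping is enabled; the flag alone is not enough.
 */
static bool should_request_tx_tstamp(const struct adapter_model *a,
				     unsigned int tx_flags)
{
	assert(a != NULL);
	if (!(tx_flags & SKBTX_HW_TSTAMP))
		return false;
	return a->tx_tstamp_enabled;
}

/* Look for timeouts only while TX timestamping is enabled. */
static bool tx_tstamp_timed_out(const struct adapter_model *a,
				unsigned long now, unsigned long timeout)
{
	assert(a != NULL);
	if (!a->tx_tstamp_enabled || !a->tstamp_pending)
		return false;
	return now - a->tx_start > timeout;
}
```

With timestamping disabled, both helpers return false even if the skb
carries the flag or a stale request is still marked as pending.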

for-next-queue version link:
https://lore.kernel.org/intel-wired-lan/20230228054534.1093483-1-vinicius.gomes@intel.com/

This is the fixes part of the series intended to add support for using
the 4 timestamp registers present in i225/i226.

Moving the timestamp handling inline with the interrupt handling
improves the TX timestamp retrieval latency. Here are some numbers
from ntpperf:

Before:

$ sudo ./ntpperf -i enp3s0 -m 10:22:22:22:22:21 -d 192.168.1.3 -s 172.18.0.0/16 -I -H -o -37
               |          responses            |     TX timestamp offset (ns)
rate   clients |  lost invalid   basic  xleave |    min    mean     max stddev
1000       100   0.00%   0.00%   0.00% 100.00%      -56      +9     +52     19
1500       150   0.00%   0.00%   0.00% 100.00%      -40     +30     +75     22
2250       225   0.00%   0.00%   0.00% 100.00%      -11     +29     +72     15
3375       337   0.00%   0.00%   0.00% 100.00%      -18     +40     +88     22
5062       506   0.00%   0.00%   0.00% 100.00%      -19     +23     +77     15
7593       759   0.00%   0.00%   0.00% 100.00%       +7     +47   +5168     43
11389     1138   0.00%   0.00%   0.00% 100.00%      -11     +41   +5240     39
17083     1708   0.00%   0.00%   0.00% 100.00%      +19     +60   +5288     50
25624     2562   0.00%   0.00%   0.00% 100.00%       +1     +56   +5368     58
38436     3843   0.00%   0.00%   0.00% 100.00%      -84     +12   +8847     66
57654     5765   0.00%   0.00% 100.00%   0.00%
86481     8648   0.00%   0.00% 100.00%   0.00%
129721   12972   0.00%   0.00% 100.00%   0.00%
194581   16384   0.00%   0.00% 100.00%   0.00%
291871   16384  27.35%   0.00%  72.65%   0.00%
437806   16384  50.05%   0.00%  49.95%   0.00%

After:

$ sudo ./ntpperf -i enp3s0 -m 10:22:22:22:22:21 -d 192.168.1.3 -s 172.18.0.0/16 -I -H -o -37
               |          responses            |     TX timestamp offset (ns)
rate   clients |  lost invalid   basic  xleave |    min    mean     max stddev
1000       100   0.00%   0.00%   0.00% 100.00%      -44      +0     +61     19
1500       150   0.00%   0.00%   0.00% 100.00%       -6     +39     +81     16
2250       225   0.00%   0.00%   0.00% 100.00%      -22     +25     +69     15
3375       337   0.00%   0.00%   0.00% 100.00%      -28     +15     +56     14
5062       506   0.00%   0.00%   0.00% 100.00%       +7     +78    +143     27
7593       759   0.00%   0.00%   0.00% 100.00%      -54     +24    +144     47
11389     1138   0.00%   0.00%   0.00% 100.00%      -90     -33     +28     21
17083     1708   0.00%   0.00%   0.00% 100.00%      -50      -2     +35     14
25624     2562   0.00%   0.00%   0.00% 100.00%      -62      +7     +66     23
38436     3843   0.00%   0.00%   0.00% 100.00%      -33     +30   +5395     36
57654     5765   0.00%   0.00% 100.00%   0.00%
86481     8648   0.00%   0.00% 100.00%   0.00%
129721   12972   0.00%   0.00% 100.00%   0.00%
194581   16384  19.50%   0.00%  80.50%   0.00%
291871   16384  35.81%   0.00%  64.19%   0.00%
437806   16384  55.40%   0.00%  44.60%   0.00%
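
As a rough illustration of where the latency difference comes from,
here is a toy model in C. It assumes the timestamp read was previously
deferred to a work item, while the new path reads it in the interrupt
handler. All names and numbers are hypothetical; this is not driver
code and it does not reproduce the ntpperf measurements above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model; times are in microseconds and purely illustrative. */
struct tstamp_event {
	uint64_t irq_time_us;    /* "timestamp ready" interrupt fires */
	uint64_t work_delay_us;  /* wait before a deferred work item runs */
};

/* When the timestamp reaches the stack for each retrieval strategy. */
static uint64_t delivery_time_us(const struct tstamp_event *ev,
				 bool read_in_irq)
{
	assert(ev != NULL);
	if (read_in_irq)
		return ev->irq_time_us;              /* no scheduling delay */
	return ev->irq_time_us + ev->work_delay_us;  /* waits for the worker */
}

/* True if the timestamp arrives before the consumer gives up on it. */
static bool delivered_in_time(const struct tstamp_event *ev,
			      bool read_in_irq, uint64_t deadline_us)
{
	return delivery_time_us(ev, read_in_irq) <= deadline_us;
}
```

In this model, a long work_delay_us (a busy system) only hurts the
deferred path; the inline path delivers at irq_time_us regardless.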

While working on this series, and as a reminder that things are never
as easy as they should be, a hardware issue was found, and it took
some time to find the workaround(s). The bug and the workaround are
explained in more detail in patch 5/5.

Note: the workaround has a simpler alternative, but it would involve
adding support for the other timestamp registers, and only using
TXSTMP{H/L}_0 as a way to clear the interrupt. But I feel bad about
throwing this kind of resource away. I didn't test this extensively,
but it should work.

Also, as Marc Kleine-Budde suggested, after some consensus is reached
on this series, most parts of it will be proposed for igb.

BTW: I hope this is the correct usage of the "iwl" subject prefix.

Cheers,

Vinicius Costa Gomes (5):
  igc: Fix marking some timestamps as skipped wrongly
  igc: Fix race condition in PTP tx code
  igc: Fix checking for tstamp timeouts TX tstamp is off
  igc: Retrieve TX timestamp during interrupt handling
  igc: Add workaround for missing timestamps

 drivers/net/ethernet/intel/igc/igc.h      |   7 +-
 drivers/net/ethernet/intel/igc/igc_main.c |  14 ++-
 drivers/net/ethernet/intel/igc/igc_ptp.c  | 119 +++++++++++++++-------
 3 files changed, 95 insertions(+), 45 deletions(-)

Comments

Tony Nguyen May 8, 2023, 8:55 p.m. UTC | #1
On 5/4/2023 4:52 PM, Vinicius Costa Gomes wrote:
> Hi,
> 
> Changes from the "for-next-queue" version:
>   - As this is intended for the iwl/net-queue tree, removed adding
>     support for adding the "extra" tstamp registers;
>   - Added "Fixes:" tags to the appropriate patches (Vladimir Oltean);

In most cases, net patches should carry Fixes: tags. Patches 3 and 5
don't have them, and it seems like they would be applicable there.

Patch 4 seems more like an improvement than a bug fix? If so, -next 
would seem a better path for that patch. Based on the 'for-next-queue 
version' link, there are still some patches remaining that will go 
through -next? Perhaps this can go with them.

>   - Improved the check to catch the case that the skb has the
>     SKBTX_HW_TSTAMP flag, but TX timestamping is not enabled (Vladimir
>     Oltean);
>   - Only check for timestamping timeouts if TX timestamping is enabled
>     (Vladimir Oltean);
> 
> for-next-queue version link:
> https://lore.kernel.org/intel-wired-lan/20230228054534.1093483-1-vinicius.gomes@intel.com/

...

> BTW: I hope this is the correct usage of the "iwl" subject prefix.

Could you also add -net or -next for the (eventual) target tree, i.e.
     net : iwl-net
     net-next : iwl-next

In this case, that would be 'iwl-net'.

Thanks,
Tony

Vinicius Costa Gomes May 8, 2023, 10:18 p.m. UTC | #2
Tony Nguyen <anthony.l.nguyen@intel.com> writes:

> On 5/4/2023 4:52 PM, Vinicius Costa Gomes wrote:
>> Hi,
>> 
>> Changes from the "for-next-queue" version:
>>   - As this is intended for the iwl/net-queue tree, removed adding
>>     support for adding the "extra" tstamp registers;
>>   - Added "Fixes:" tags to the appropriate patches (Vladimir Oltean);
>
> In most cases, net patches should have Fixes: tags to them. Patches 3 
> and 5 don't have them and it seems like it would be applicable to them.
>

Patch 3 is directly related to patch 1, but I think it deserved a
separate commit, as it has a bit of refactoring. If you think it's
better, I can squash it into patch 1, no worries. I was only afraid of
making the patch harder to follow.

As patch 5 is a workaround for a hardware issue, I didn't know if
adding a 'Fixes:' tag made sense. But directing the patch to the right
stable trees is a good point in favor of adding one, even if it's not
fixing a bug in the code. Is this what you had in mind? If so, I can do
that.

> Patch 4 seems more like an improvement than a bug fix? If so, -next 
> would seem a better path for that patch. Based on the 'for-next-queue 
> version' link, there are still some patches remaining that will go 
> through -next? Perhaps this can go with them.
>

On a very loaded system, for example, time synchronization can fail if
something blocks the system workqueue from running, so in a sense, that
patch fixes/helps with some user-visible issues. But I can see it both
ways, including that this is an improvement. What's your preference?

>>   - Improved the check to catch the case that the skb has the
>>     SKBTX_HW_TSTAMP flag, but TX timestamping is not enabled (Vladimir
>>     Oltean);
>>   - Only check for timestamping timeouts if TX timestamping is enabled
>>     (Vladimir Oltean);
>> 
>> for-next-queue version link:
>> https://lore.kernel.org/intel-wired-lan/20230228054534.1093483-1-vinicius.gomes@intel.com/
>
> ...
>
>> BTW: I hope this is the correct usage of the "iwl" subject prefix.
>
> If you could also add -net|-next for the (eventual) target tree
> i.e.
>      net : iwl-net
>      net-next : iwl-next
>
> in this case 'iwl-net'

Yeah, I sent this patch a couple minutes before seeing the email about
the subject prefix conventions. Will use the correct one next time.

>
> Thanks,
> Tony

Tony Nguyen May 9, 2023, 5:23 p.m. UTC | #3
On 5/8/2023 3:18 PM, Vinicius Costa Gomes wrote:
> Tony Nguyen <anthony.l.nguyen@intel.com> writes:
> 
>> On 5/4/2023 4:52 PM, Vinicius Costa Gomes wrote:
>>> Hi,
>>>
>>> Changes from the "for-next-queue" version:
>>>    - As this is intended for the iwl/net-queue tree, removed adding
>>>      support for adding the "extra" tstamp registers;
>>>    - Added "Fixes:" tags to the appropriate patches (Vladimir Oltean);
>>
>> In most cases, net patches should have Fixes: tags to them. Patches 3
>> and 5 don't have them and it seems like it would be applicable to them.
>>
> 
> Patch 3 is directly related to patch 1, but I think it deserved a
> separate commit, as it has a bit of refactor. I can squash it into patch
> 1, if you think it's better I can do that, no worries, I was only afraid
> to make the patch harder to follow.

I understand the reasoning and it makes sense. However, I believe I
recently read a comment on netdev in favor of keeping it in one patch
for ease of backporting.

> Patch 5, as a hardware issue workaround, I didn't know if adding a
> 'Fixes:' tag made sense, but as a way to direct patches to the right
> stable trees, that would be a good point in favor, even if it's not
> fixing a bug in the code. Is this what you had in mind? If so, I can do
> that.

Yeah, I think a hint on how far back to backport would be valuable. I
believe that, even though it's a workaround, from the user's
perspective it would appear as a bug(?)

>> Patch 4 seems more like an improvement than a bug fix? If so, -next
>> would seem a better path for that patch. Based on the 'for-next-queue
>> version' link, there are still some patches remaining that will go
>> through -next? Perhaps this can go with them.
>>
> 
> On a very loaded system, for example, time synchronization can fail if
> something blocks the system workqueue from running, so in a sense, that
> patch fixes/helps some user visible issues. But I can see it both
> ways, that this is an improvement. What's your preference?

I think I'd rather err on the side of fixing and it's already here :)

Thanks,
Tony

Vinicius Costa Gomes May 9, 2023, 8:51 p.m. UTC | #4
Hi Tony,

Tony Nguyen <anthony.l.nguyen@intel.com> writes:

> On 5/8/2023 3:18 PM, Vinicius Costa Gomes wrote:
>> Tony Nguyen <anthony.l.nguyen@intel.com> writes:
>> 
>>> On 5/4/2023 4:52 PM, Vinicius Costa Gomes wrote:
>>>> Hi,
>>>>
>>>> Changes from the "for-next-queue" version:
>>>>    - As this is intended for the iwl/net-queue tree, removed adding
>>>>      support for adding the "extra" tstamp registers;
>>>>    - Added "Fixes:" tags to the appropriate patches (Vladimir Oltean);
>>>
>>> In most cases, net patches should have Fixes: tags to them. Patches 3
>>> and 5 don't have them and it seems like it would be applicable to them.
>>>
>> 
>> Patch 3 is directly related to patch 1, but I think it deserved a
>> separate commit, as it has a bit of refactor. I can squash it into patch
>> 1, if you think it's better I can do that, no worries, I was only afraid
>> to make the patch harder to follow.
>
> I understand the reasoning and makes sense, however, I want to say I 
> recently read on netdev a comment for keeping it in one patch for ease 
> of backport.
>

Makes sense. Will squash it.

>> Patch 5, as a hardware issue workaround, I didn't know if adding a
>> 'Fixes:' tag made sense, but as a way to direct patches to the right
>> stable trees, that would be a good point in favor, even if it's not
>> fixing a bug in the code. Is this what you had in mind? If so, I can do
>> that.
>
> Yea, I think a hint on how far back to backport would be valuable. I 
> believe even though it's a workaround, from user perspective, it would 
> appear as a bug(?)
>

Will add the 'Fixes:' tag.

>>> Patch 4 seems more like an improvement than a bug fix? If so, -next
>>> would seem a better path for that patch. Based on the 'for-next-queue
>>> version' link, there are still some patches remaining that will go
>>> through -next? Perhaps this can go with them.
>>>
>> 
>> On a very loaded system, for example, time synchronization can fail if
>> something blocks the system workqueue from running, so in a sense, that
>> patch fixes/helps some user visible issues. But I can see it both
>> ways, that this is an improvement. What's your preference?
>
> I think I'd rather err on the side of fixing and it's already here :)
>

Understood. Will keep proposing it here for 'iwl-net'.

Will send the v2 soon.


Thank you,