diff mbox series

[RFC,net-next] e1000e: Fix real-time violations on link up

Message ID 20241011195412.51804-1-gerhard@engleder-embedded.com
State RFC
Series [RFC,net-next] e1000e: Fix real-time violations on link up

Commit Message

Gerhard Engleder Oct. 11, 2024, 7:54 p.m. UTC
From: Gerhard Engleder <eg@keba.com>

Link down and up triggers an update of the MTA table. This update executes many
PCIe writes and a final flush. Thus, PCIe will be blocked until all writes
are flushed. As a result, DMA transfers of other targets suffer from delays
in the range of 50us. The result is timing violations on real-time
systems during link down and up of e1000e.

Execute a flush after every single write. This prevents overloading the
interconnect with posted writes. As this also increases the time spent on
the MTA table update considerably, this change is limited to PREEMPT_RT.

Signed-off-by: Gerhard Engleder <eg@keba.com>
---
 drivers/net/ethernet/intel/e1000e/mac.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Andrew Lunn Oct. 12, 2024, 6:42 p.m. UTC | #1
On Fri, Oct 11, 2024 at 09:54:12PM +0200, Gerhard Engleder wrote:
> From: Gerhard Engleder <eg@keba.com>
> 
> Link down and up triggers update of MTA table. This update executes many
> PCIe writes and a final flush. Thus, PCIe will be blocked until all writes
> are flushed. As a result, DMA transfers of other targets suffer from delay
> in the range of 50us. The result are timing violations on real-time
> systems during link down and up of e1000e.
> 
> Execute a flush after every single write. This prevents overloading the
> interconnect with posted writes. As this also increases the time spent for
> MTA table update considerable this change is limited to PREEMPT_RT.
> 
> Signed-off-by: Gerhard Engleder <eg@keba.com>
> ---
>  drivers/net/ethernet/intel/e1000e/mac.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/e1000e/mac.c b/drivers/net/ethernet/intel/e1000e/mac.c
> index d7df2a0ed629..f4693d355886 100644
> --- a/drivers/net/ethernet/intel/e1000e/mac.c
> +++ b/drivers/net/ethernet/intel/e1000e/mac.c
> @@ -331,9 +331,15 @@ void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw,
>  	}
>  
>  	/* replace the entire MTA table */
> -	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
> +	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
>  		E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]);
> +#ifdef CONFIG_PREEMPT_RT
> +		e1e_flush();
> +#endif
> +	}
> +#ifndef CONFIG_PREEMPT_RT
>  	e1e_flush();
> +#endif

#ifdef FOO is generally not liked because it reduces the effectiveness
of build testing.

Two suggestions:

	if (IS_ENABLED(CONFIG_PREEMPT_RT))
		e1e_flush();

This will then end up as an if (0) or if (1), with the statement
following it always being compiled, and then optimised out if not
needed.
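
To illustrate the point about build testing, here is a minimal userspace
sketch of the idea. It is not the kernel's implementation: the DEMO_* macros
are hypothetical stand-ins that imitate the preprocessor trick used by
IS_ENABLED() in include/linux/kconfig.h (without the =m handling), and
demo_flush() merely counts calls instead of reading a device register.

```c
#include <assert.h>

/* Hypothetical, simplified imitation of the kernel's IS_ENABLED().
 * An option defined as 1 expands to 1; an undefined option expands to 0. */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND(ignored, val, ...) val
#define DEMO_IS_DEFINED3(arg1_or_junk) DEMO_TAKE_SECOND(arg1_or_junk 1, 0, 0)
#define DEMO_IS_DEFINED2(val) DEMO_IS_DEFINED3(DEMO_ARG_PLACEHOLDER_##val)
#define DEMO_IS_ENABLED(option) DEMO_IS_DEFINED2(option)

#define DEMO_CONFIG_PREEMPT_RT 1  /* pretend the option is set */
/* DEMO_CONFIG_OTHER is deliberately left undefined. */

static int flush_count;           /* number of simulated flushes */

static void demo_flush(void)
{
	flush_count++;
}

/* Both branches are always parsed and type-checked by the compiler;
 * the one behind a constant 0 is simply discarded by the optimiser. */
int demo_update_table(int entries)
{
	assert(entries >= 0);
	flush_count = 0;
	for (int i = entries - 1; i >= 0; i--) {
		/* the register write would go here */
		if (DEMO_IS_ENABLED(DEMO_CONFIG_PREEMPT_RT))
			demo_flush();
	}
	if (!DEMO_IS_ENABLED(DEMO_CONFIG_PREEMPT_RT))
		demo_flush();
	return flush_count;
}
```

Calling demo_update_table(n) from a small main() returns the number of
flushes issued. Unlike the #ifdef variant, a typo inside either branch would
break the build in every configuration, not only in the one that enables it.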

Alternatively, consider something like:

	if (i % 8)
		e1e_flush();

if there is a reasonable compromise between RT and non-RT
performance. Given that RT is now fully merged, we might see some
distros enable it, so a compromise would probably be better.
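
The trade-off behind this suggestion can be sketched with a small userspace
model. Everything here is an assumption for illustration: demo_update() and
struct demo_stats are hypothetical names, the model only counts writes and
flushes rather than touching hardware, and it takes the intent to be one
flush per 'interval' writes, so it tests i % interval == 0 (the condition
i % 8 as written above would instead flush on seven of every eight writes).

```c
#include <assert.h>

struct demo_stats {
	int flushes;      /* number of read-back flushes issued */
	int max_pending;  /* largest burst of writes not yet flushed */
};

/* Hypothetical model: write 'entries' registers from the highest index
 * down and flush whenever the index is a multiple of 'interval'. */
struct demo_stats demo_update(int entries, int interval)
{
	struct demo_stats s = { 0, 0 };
	int pending = 0;

	assert(entries > 0 && interval > 0);
	for (int i = entries - 1; i >= 0; i--) {
		pending++;                    /* one more posted write queued */
		if (pending > s.max_pending)
			s.max_pending = pending;
		if (i % interval == 0) {      /* i == 0 always flushes */
			s.flushes++;
			pending = 0;
		}
	}
	return s;
}
```

With 128 entries (an example value only), an interval of 1 models the
PREEMPT_RT behaviour of the patch (128 flushes, bursts of one write), an
interval of 128 models the existing code (one final flush after a burst of
128 writes), and an interval of 8 sits in between (16 flushes, bursts of at
most 8 writes). Which interval keeps the DMA latency acceptable is something
only measurement on real hardware can answer.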

	Andrew

Gerhard Engleder Oct. 14, 2024, 5:59 p.m. UTC | #2
On 12.10.24 20:42, Andrew Lunn wrote:
> On Fri, Oct 11, 2024 at 09:54:12PM +0200, Gerhard Engleder wrote:
>> From: Gerhard Engleder <eg@keba.com>
>>
>> Link down and up triggers update of MTA table. This update executes many
>> PCIe writes and a final flush. Thus, PCIe will be blocked until all writes
>> are flushed. As a result, DMA transfers of other targets suffer from delay
>> in the range of 50us. The result are timing violations on real-time
>> systems during link down and up of e1000e.
>>
>> Execute a flush after every single write. This prevents overloading the
>> interconnect with posted writes. As this also increases the time spent for
>> MTA table update considerable this change is limited to PREEMPT_RT.
>>
>> Signed-off-by: Gerhard Engleder <eg@keba.com>
>> ---
>>   drivers/net/ethernet/intel/e1000e/mac.c | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/intel/e1000e/mac.c b/drivers/net/ethernet/intel/e1000e/mac.c
>> index d7df2a0ed629..f4693d355886 100644
>> --- a/drivers/net/ethernet/intel/e1000e/mac.c
>> +++ b/drivers/net/ethernet/intel/e1000e/mac.c
>> @@ -331,9 +331,15 @@ void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw,
>>   	}
>>   
>>   	/* replace the entire MTA table */
>> -	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
>> +	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
>>   		E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]);
>> +#ifdef CONFIG_PREEMPT_RT
>> +		e1e_flush();
>> +#endif
>> +	}
>> +#ifndef CONFIG_PREEMPT_RT
>>   	e1e_flush();
>> +#endif
> 
> #ifdef FOO is generally not liked because it reduces the effectiveness
> of build testing.
> 
> Two suggestions:
> 
> 	if (IS_ENABLED(CONFIG_PREEMPT_RT))
> 		e1e_flush();

I will do that.

> This will then end up as and if (0) or if (1), with the statement
> following it always being compiled, and then optimised out if not
> needed.
> 
> Alternatively, consider something like:
> 
> 	if (i % 8)
> 		e1e_flush()
> 
> if there is a reasonable compromise between RT and none RT
> performance. Given that RT is now fully merged, we might see some
> distros enable it, so a compromise would probably be better.

Yes, a read/flush after every posted write is likely too much. I will
test how often a flush is required.

Thank you for your feedback Andrew!

Any comments from Intel driver maintainers?

Gerhard

Lifshits, Vitaly Oct. 15, 2024, 1:41 p.m. UTC | #3
On 10/14/2024 8:59 PM, Gerhard Engleder wrote:
> On 12.10.24 20:42, Andrew Lunn wrote:
>> On Fri, Oct 11, 2024 at 09:54:12PM +0200, Gerhard Engleder wrote:
>>> From: Gerhard Engleder <eg@keba.com>
>>>
>>> Link down and up triggers update of MTA table. This update executes many
>>> PCIe writes and a final flush. Thus, PCIe will be blocked until all writes
>>> are flushed. As a result, DMA transfers of other targets suffer from delay
>>> in the range of 50us. The result are timing violations on real-time
>>> systems during link down and up of e1000e.
>>>
>>> Execute a flush after every single write. This prevents overloading the
>>> interconnect with posted writes. As this also increases the time spent for
>>> MTA table update considerable this change is limited to PREEMPT_RT.
>>>
>>> Signed-off-by: Gerhard Engleder <eg@keba.com>
>>> ---
>>>   drivers/net/ethernet/intel/e1000e/mac.c | 8 +++++++-
>>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/e1000e/mac.c b/drivers/net/ethernet/intel/e1000e/mac.c
>>> index d7df2a0ed629..f4693d355886 100644
>>> --- a/drivers/net/ethernet/intel/e1000e/mac.c
>>> +++ b/drivers/net/ethernet/intel/e1000e/mac.c
>>> @@ -331,9 +331,15 @@ void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw,
>>>       }
>>>         /* replace the entire MTA table */
>>> -    for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
>>> +    for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
>>>           E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]);
>>> +#ifdef CONFIG_PREEMPT_RT
>>> +        e1e_flush();
>>> +#endif
>>> +    }
>>> +#ifndef CONFIG_PREEMPT_RT
>>>       e1e_flush();
>>> +#endif
>>
>> #ifdef FOO is generally not liked because it reduces the effectiveness
>> of build testing.
>>
>> Two suggestions:
>>
>>     if (IS_ENABLED(CONFIG_PREEMPT_RT))
>>         e1e_flush();
>
> I will do that.
>
>> This will then end up as and if (0) or if (1), with the statement
>> following it always being compiled, and then optimised out if not
>> needed.

I agree with Andrew, this approach is more elegant and won't degrade
performance.


>>
>> Alternatively, consider something like:
>>
>>     if (i % 8)
>>         e1e_flush()
>>
>> if there is a reasonable compromise between RT and none RT
>> performance. Given that RT is now fully merged, we might see some
>> distros enable it, so a compromise would probably be better.
>
> Yes, read/flush after every posted write is likely too much. I will
> do some testing how often flush is required.


I like this approach less, since it might be system-dependent: on some
systems it will work well and on others it will fail.

>
> Thank you for your feedback Andrew!
>
> Any comments from Intel driver maintainers?
>
> Gerhard

Patch

diff --git a/drivers/net/ethernet/intel/e1000e/mac.c b/drivers/net/ethernet/intel/e1000e/mac.c
index d7df2a0ed629..f4693d355886 100644
--- a/drivers/net/ethernet/intel/e1000e/mac.c
+++ b/drivers/net/ethernet/intel/e1000e/mac.c
@@ -331,9 +331,15 @@  void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw,
 	}
 
 	/* replace the entire MTA table */
-	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
+	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
 		E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]);
+#ifdef CONFIG_PREEMPT_RT
+		e1e_flush();
+#endif
+	}
+#ifndef CONFIG_PREEMPT_RT
 	e1e_flush();
+#endif
 }
 
 /**