| Message ID | 20241219192743.4499-1-gerhard@engleder-embedded.com |
|---|---|
| State | Under Review |
| Delegated to | Anthony Nguyen |
| Series | [iwl-next,v4] e1000e: Fix real-time violations on link up |
On 12/19/2024 9:27 PM, Gerhard Engleder wrote:
> From: Gerhard Engleder <eg@keba.com>
>
> Link down and up triggers an update of the MTA table. This update
> executes many PCIe writes and a final flush. Thus, PCIe will be blocked
> until all writes are flushed. As a result, DMA transfers of other
> targets suffer from delays in the range of 50us. This results in timing
> violations on real-time systems during link down and up of e1000e in
> combination with an Intel i3-2310E Sandy Bridge CPU.
>
> The i3-2310E is quite old. Launched in 2011 by Intel, but still in use
> as a robot controller. The exact root cause of the problem is unclear,
> and this situation won't change as Intel support for this CPU ended
> years ago. Our experience is that the number of posted PCIe writes
> needs to be limited, at least for real-time systems. With posted PCIe
> writes, a much higher throughput can be generated than with PCIe reads,
> which cannot be posted. Thus, the load on the interconnect is much
> higher. Additionally, a PCIe read waits until all posted PCIe writes
> are done. Therefore, the PCIe read can block the CPU for much more than
> 10us if a lot of PCIe writes were posted before. Both issues are the
> reason why we limit the number of posted PCIe writes in a row in
> general for our real-time systems, not only for this driver.
>
> A flush after a low enough number of posted PCIe writes eliminates the
> delay, but also increases the time needed for the MTA table update. The
> following measurements were done on an i3-2310E with e1000e for 128 MTA
> table entries:
>
> Single flush after all writes:  106us
> Flush after every write:        429us
> Flush after every 2nd write:    266us
> Flush after every 4th write:    180us
> Flush after every 8th write:    141us
> Flush after every 16th write:   121us
>
> A flush after every 8th write delays link up by 35us, and the negative
> impact on DMA transfers of other targets is still tolerable.
>
> Execute a flush after every 8th write. This prevents overloading the
> interconnect with posted writes.
>
> Signed-off-by: Gerhard Engleder <eg@keba.com>
> Link: https://lore.kernel.org/netdev/f8fe665a-5e6c-4f95-b47a-2f3281aa0e6c@lunn.ch/T/
> CC: Vitaly Lifshits <vitaly.lifshits@intel.com>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Tested-by: Avigail Dahan <avigailx.dahan@intel.com>

Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com>

> ---
> v4:
> - add PREEMPT_RT dependency again (Vitaly Lifshits)
> - fix comment style (Alexander Lobakin)
> - add to comment each 8th and explain why (Alexander Lobakin)
> - simplify check for every 8th write (Alexander Lobakin)
>
> v3:
> - mention problematic platform explicitly (Bjorn Helgaas)
> - improve comment (Paul Menzel)
>
> v2:
> - remove PREEMPT_RT dependency (Andrew Lunn, Przemek Kitszel)
> ---
>  drivers/net/ethernet/intel/e1000e/mac.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000e/mac.c b/drivers/net/ethernet/intel/e1000e/mac.c
> index d7df2a0ed629..44249dd91bd6 100644
> --- a/drivers/net/ethernet/intel/e1000e/mac.c
> +++ b/drivers/net/ethernet/intel/e1000e/mac.c
> @@ -331,8 +331,21 @@ void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw,
>  	}
>
>  	/* replace the entire MTA table */
> -	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
> +	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
>  		E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]);
> +
> +		if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> +			/*
> +			 * Do not queue up too many posted writes to prevent
> +			 * increased latency for other devices on the
> +			 * interconnect. Flush after each 8th posted write,
> +			 * to keep additional execution time low while still
> +			 * preventing increased latency.
> +			 */
> +			if (!(i % 8) && i)
> +				e1e_flush();
> +		}
> +	}
>  	e1e_flush();
>  }
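A brief aside on the accepted check, since it was reworked in v4: the loop counts `i` down from `hw->mac.mta_reg_count - 1` to 0, so `!(i % 8) && i` fires on every nonzero multiple of 8 and deliberately skips `i == 0`, whose batch is covered by the trailing `e1e_flush()`. The cadence can be verified outside the kernel with a minimal user-space sketch (plain C; `MTA_REG_COUNT` and the counters are illustrative stand-ins, not the driver's API):

```c
#include <stdio.h>

#define MTA_REG_COUNT 128 /* hypothetical stand-in for hw->mac.mta_reg_count */

int main(void)
{
	int writes_since_flush = 0;
	int i;

	for (i = MTA_REG_COUNT - 1; i >= 0; i--) {
		writes_since_flush++; /* the posted MMIO write would happen here */

		/* flush on every nonzero multiple of 8, as in the patch */
		if (!(i % 8) && i) {
			printf("i=%3d: flush after %d writes\n", i, writes_since_flush);
			writes_since_flush = 0;
		}
	}
	/* in the driver, the final e1e_flush() covers this last batch */
	printf("final: flush after %d writes\n", writes_since_flush);
	return 0;
}
```

For 128 entries this prints 15 intermediate flushes plus the final one, each after exactly 8 writes, matching the "flush after every 8th write" row of the measurements above.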
On Thu, Dec 19, 2024 at 08:27:43PM +0100, Gerhard Engleder wrote:
> From: Gerhard Engleder <eg@keba.com>
>
> Link down and up triggers an update of the MTA table. This update
> executes many PCIe writes and a final flush. Thus, PCIe will be blocked
> until all writes are flushed. As a result, DMA transfers of other
> targets suffer from delays in the range of 50us. This results in timing
> violations on real-time systems during link down and up of e1000e in
> combination with an Intel i3-2310E Sandy Bridge CPU.
>
> The i3-2310E is quite old. Launched in 2011 by Intel, but still in use
> as a robot controller. The exact root cause of the problem is unclear,
> and this situation won't change as Intel support for this CPU ended
> years ago. Our experience is that the number of posted PCIe writes
> needs to be limited, at least for real-time systems. With posted PCIe
> writes, a much higher throughput can be generated than with PCIe reads,
> which cannot be posted. Thus, the load on the interconnect is much
> higher. Additionally, a PCIe read waits until all posted PCIe writes
> are done. Therefore, the PCIe read can block the CPU for much more than
> 10us if a lot of PCIe writes were posted before. Both issues are the
> reason why we limit the number of posted PCIe writes in a row in
> general for our real-time systems, not only for this driver.
>
> A flush after a low enough number of posted PCIe writes eliminates the
> delay, but also increases the time needed for the MTA table update. The
> following measurements were done on an i3-2310E with e1000e for 128 MTA
> table entries:
>
> Single flush after all writes:  106us
> Flush after every write:        429us
> Flush after every 2nd write:    266us
> Flush after every 4th write:    180us
> Flush after every 8th write:    141us
> Flush after every 16th write:   121us
>
> A flush after every 8th write delays link up by 35us, and the negative
> impact on DMA transfers of other targets is still tolerable.
>
> Execute a flush after every 8th write. This prevents overloading the
> interconnect with posted writes.
>
> Signed-off-by: Gerhard Engleder <eg@keba.com>
> Link: https://lore.kernel.org/netdev/f8fe665a-5e6c-4f95-b47a-2f3281aa0e6c@lunn.ch/T/
> CC: Vitaly Lifshits <vitaly.lifshits@intel.com>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
> ---
> v4:
> - add PREEMPT_RT dependency again (Vitaly Lifshits)
> - fix comment style (Alexander Lobakin)
> - add to comment each 8th and explain why (Alexander Lobakin)
> - simplify check for every 8th write (Alexander Lobakin)
>
> v3:
> - mention problematic platform explicitly (Bjorn Helgaas)
> - improve comment (Paul Menzel)
>
> v2:
> - remove PREEMPT_RT dependency (Andrew Lunn, Przemek Kitszel)

Reviewed-by: Simon Horman <horms@kernel.org>
On 06.01.25 12:17, Simon Horman wrote:
> On Thu, Dec 19, 2024 at 08:27:43PM +0100, Gerhard Engleder wrote:
>> From: Gerhard Engleder <eg@keba.com>
>>
>> Link down and up triggers an update of the MTA table. This update
>> executes many PCIe writes and a final flush. Thus, PCIe will be blocked
>> until all writes are flushed. As a result, DMA transfers of other
>> targets suffer from delays in the range of 50us. This results in timing
>> violations on real-time systems during link down and up of e1000e in
>> combination with an Intel i3-2310E Sandy Bridge CPU.
>>
>> The i3-2310E is quite old. Launched in 2011 by Intel, but still in use
>> as a robot controller. The exact root cause of the problem is unclear,
>> and this situation won't change as Intel support for this CPU ended
>> years ago. Our experience is that the number of posted PCIe writes
>> needs to be limited, at least for real-time systems. With posted PCIe
>> writes, a much higher throughput can be generated than with PCIe reads,
>> which cannot be posted. Thus, the load on the interconnect is much
>> higher. Additionally, a PCIe read waits until all posted PCIe writes
>> are done. Therefore, the PCIe read can block the CPU for much more than
>> 10us if a lot of PCIe writes were posted before. Both issues are the
>> reason why we limit the number of posted PCIe writes in a row in
>> general for our real-time systems, not only for this driver.
>>
>> A flush after a low enough number of posted PCIe writes eliminates the
>> delay, but also increases the time needed for the MTA table update. The
>> following measurements were done on an i3-2310E with e1000e for 128 MTA
>> table entries:
>>
>> Single flush after all writes:  106us
>> Flush after every write:        429us
>> Flush after every 2nd write:    266us
>> Flush after every 4th write:    180us
>> Flush after every 8th write:    141us
>> Flush after every 16th write:   121us
>>
>> A flush after every 8th write delays link up by 35us, and the negative
>> impact on DMA transfers of other targets is still tolerable.
>>
>> Execute a flush after every 8th write. This prevents overloading the
>> interconnect with posted writes.
>>
>> Signed-off-by: Gerhard Engleder <eg@keba.com>
>> Link: https://lore.kernel.org/netdev/f8fe665a-5e6c-4f95-b47a-2f3281aa0e6c@lunn.ch/T/
>> CC: Vitaly Lifshits <vitaly.lifshits@intel.com>
>> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>> Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
>> ---
>> v4:
>> - add PREEMPT_RT dependency again (Vitaly Lifshits)
>> - fix comment style (Alexander Lobakin)
>> - add to comment each 8th and explain why (Alexander Lobakin)
>> - simplify check for every 8th write (Alexander Lobakin)
>>
>> v3:
>> - mention problematic platform explicitly (Bjorn Helgaas)
>> - improve comment (Paul Menzel)
>>
>> v2:
>> - remove PREEMPT_RT dependency (Andrew Lunn, Przemek Kitszel)
>
> Reviewed-by: Simon Horman <horms@kernel.org>

Is there anything left from my side to get this change over iwl-next
into net-next?

Gerhard
On 2/4/2025 12:18 PM, Gerhard Engleder wrote:
> Is there anything left from my side to get this change over iwl-next
> into net-next?

Hi Gerhard,

Nothing at the moment. I have it on my list to send but I'm working
through the backlogged patches.

Thanks,
Tony

> Gerhard
diff --git a/drivers/net/ethernet/intel/e1000e/mac.c b/drivers/net/ethernet/intel/e1000e/mac.c
index d7df2a0ed629..44249dd91bd6 100644
--- a/drivers/net/ethernet/intel/e1000e/mac.c
+++ b/drivers/net/ethernet/intel/e1000e/mac.c
@@ -331,8 +331,21 @@ void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw,
 	}
 
 	/* replace the entire MTA table */
-	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
+	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
 		E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]);
+
+		if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
+			/*
+			 * Do not queue up too many posted writes to prevent
+			 * increased latency for other devices on the
+			 * interconnect. Flush after each 8th posted write,
+			 * to keep additional execution time low while still
+			 * preventing increased latency.
+			 */
+			if (!(i % 8) && i)
+				e1e_flush();
+		}
+	}
 	e1e_flush();
 }
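One design note on the merged hunk: `IS_ENABLED(CONFIG_PREEMPT_RT)` evaluates to a compile-time constant, so on non-RT kernels the compiler eliminates the whole branch and the loop compiles to the original single-flush form. The batching itself generalizes to any "replay a register table over posted MMIO writes" loop. Below is a hedged sketch in kernel-style C; it is a hypothetical helper, not part of this patch or the e1000e driver, with `writel()`/`readl()` as the generic MMIO accessors (the read forces all prior posted writes to complete, like `e1e_flush()` does):

```c
#include <linux/io.h>
#include <linux/types.h>

/*
 * Hypothetical helper, not from e1000e: replay a table of 32-bit
 * registers, flushing posted writes every `stride` entries so the
 * interconnect never accumulates a long queue of unflushed writes.
 */
static void write_table_batched(void __iomem *base, const u32 *vals,
				int count, int stride)
{
	int i;

	for (i = count - 1; i >= 0; i--) {
		writel(vals[i], base + i * sizeof(u32));	/* posted write */

		/*
		 * Flush on every stride'th entry except the last batch;
		 * the final read below covers it.
		 */
		if (!(i % stride) && i)
			readl(base);
	}
	readl(base);	/* final flush */
}
```

With `stride = 8` this reproduces the accepted behavior; the measurement table in the commit message suggests how to trade added DMA latency (larger stride) against total update time (smaller stride) on other platforms.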