Message ID:   20200629191601.5169-1-tobias@waldekranz.com
State:        Changes Requested
Delegated to: David Miller
Series:       [net] net: ethernet: fec: prevent tx starvation under high rx load
From: Tobias Waldekranz <tobias@waldekranz.com>
Date: Mon, 29 Jun 2020 21:16:01 +0200

> In the ISR, we poll the event register for the queues in need of
> service and then enter polled mode. After this point, the event
> register will never be read again until we exit polled mode.
>
> In a scenario where a UDP flow is routed back out through the same
> interface, i.e. "router-on-a-stick", we'll typically only see an rx
> queue event initially. Once we start to process the incoming flow
> we'll be locked in polled mode, but we'll never clean the tx rings
> since that event is never caught.
>
> Eventually the netdev watchdog will trip, causing all buffers to be
> dropped and then the process starts over again.
>
> By adding a poll of the active events at each NAPI call, we avoid the
> starvation.
>
> Fixes: 4d494cdc92b3 ("net: fec: change data structure to support multiqueue")
> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>

I don't see how this can happen since you process the TX queue
unconditionally every NAPI pass, regardless of what bits you see
set in the IEVENT register.

Or don't you? Oh, I see, you don't:

	for_each_set_bit(queue_id, &fep->work_tx, FEC_ENET_MAX_TX_QS) {

That's the problem. Just unconditionally process the TX work regardless
of what is in IEVENT. That whole ->work_tx member and the code that
uses it can just be deleted. fec_enet_collect_events() can just return
a boolean saying whether there is any RX or TX work at all.

Then your performance and latency will be even better in this
situation.
From: Tobias Waldekranz <tobias@waldekranz.com>
Sent: Tuesday, June 30, 2020 3:16 AM

> In the ISR, we poll the event register for the queues in need of
> service and then enter polled mode. After this point, the event
> register will never be read again until we exit polled mode.
>
> In a scenario where a UDP flow is routed back out through the same
> interface, i.e. "router-on-a-stick", we'll typically only see an rx
> queue event initially. Once we start to process the incoming flow
> we'll be locked in polled mode, but we'll never clean the tx rings
> since that event is never caught.
>
> Eventually the netdev watchdog will trip, causing all buffers to be
> dropped and then the process starts over again.
>
> By adding a poll of the active events at each NAPI call, we avoid the
> starvation.
>
> Fixes: 4d494cdc92b3 ("net: fec: change data structure to support multiqueue")
> Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>

Acked-by: Fugang Duan <fugang.duan@nxp.com>

> ---
>  drivers/net/ethernet/freescale/fec_main.c | 22 +++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
> index 2d0d313ee7c5..f6e25c2d2c8c 100644
> --- a/drivers/net/ethernet/freescale/fec_main.c
> +++ b/drivers/net/ethernet/freescale/fec_main.c
> @@ -1617,8 +1617,17 @@ fec_enet_rx(struct net_device *ndev, int budget)
>  }
>
>  static bool
> -fec_enet_collect_events(struct fec_enet_private *fep, uint int_events)
> +fec_enet_collect_events(struct fec_enet_private *fep)
>  {
> +	uint int_events;
> +
> +	int_events = readl(fep->hwp + FEC_IEVENT);
> +
> +	/* Don't clear MDIO events, we poll for those */
> +	int_events &= ~FEC_ENET_MII;
> +
> +	writel(int_events, fep->hwp + FEC_IEVENT);
> +
>  	if (int_events == 0)
>  		return false;
>
> @@ -1644,16 +1653,9 @@ fec_enet_interrupt(int irq, void *dev_id)
>  {
>  	struct net_device *ndev = dev_id;
>  	struct fec_enet_private *fep = netdev_priv(ndev);
> -	uint int_events;
>  	irqreturn_t ret = IRQ_NONE;
>
> -	int_events = readl(fep->hwp + FEC_IEVENT);
> -
> -	/* Don't clear MDIO events, we poll for those */
> -	int_events &= ~FEC_ENET_MII;
> -
> -	writel(int_events, fep->hwp + FEC_IEVENT);
> -	fec_enet_collect_events(fep, int_events);
> +	fec_enet_collect_events(fep);
>
> 	if ((fep->work_tx || fep->work_rx) && fep->link) {
> 		ret = IRQ_HANDLED;
> @@ -1674,6 +1676,8 @@ static int fec_enet_rx_napi(struct napi_struct *napi, int budget)
>  	struct fec_enet_private *fep = netdev_priv(ndev);
>  	int pkts;
>
> +	fec_enet_collect_events(fep);
> +
>  	pkts = fec_enet_rx(ndev, budget);
>
>  	fec_enet_tx(ndev);
> --
> 2.17.1
On Mon Jun 29, 2020 at 3:07 PM CEST, David Miller wrote:
> From: Tobias Waldekranz <tobias@waldekranz.com>
> Date: Mon, 29 Jun 2020 21:16:01 +0200
>
> > In the ISR, we poll the event register for the queues in need of
> > service and then enter polled mode. After this point, the event
> > register will never be read again until we exit polled mode.
> >
> > In a scenario where a UDP flow is routed back out through the same
> > interface, i.e. "router-on-a-stick", we'll typically only see an rx
> > queue event initially. Once we start to process the incoming flow
> > we'll be locked in polled mode, but we'll never clean the tx rings
> > since that event is never caught.
> >
> > Eventually the netdev watchdog will trip, causing all buffers to be
> > dropped and then the process starts over again.
> >
> > By adding a poll of the active events at each NAPI call, we avoid the
> > starvation.
> >
> > Fixes: 4d494cdc92b3 ("net: fec: change data structure to support multiqueue")
> > Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
>
> I don't see how this can happen since you process the TX queue
> unconditionally every NAPI pass, regardless of what bits you see
> set in the IEVENT register.
>
> Or don't you? Oh, I see, you don't:
>
> 	for_each_set_bit(queue_id, &fep->work_tx, FEC_ENET_MAX_TX_QS) {
>
> That's the problem. Just unconditionally process the TX work regardless
> of what is in IEVENT. That whole ->work_tx member and the code that
> uses it can just be deleted. fec_enet_collect_events() can just return
> a boolean saying whether there is any RX or TX work at all.

Maybe Andy could chime in here, but I think the ->work_tx construction
is load bearing. It seems to me like that is the only thing stopping
us from trying to process non-existent queues on older versions of the
silicon, which only have a single queue.
From: "Tobias Waldekranz" <tobias@waldekranz.com>
Date: Tue, 30 Jun 2020 08:39:58 +0200

> On Mon Jun 29, 2020 at 3:07 PM CEST, David Miller wrote:
>> I don't see how this can happen since you process the TX queue
>> unconditionally every NAPI pass, regardless of what bits you see
>> set in the IEVENT register.
>>
>> Or don't you? Oh, I see, you don't:
>>
>> 	for_each_set_bit(queue_id, &fep->work_tx, FEC_ENET_MAX_TX_QS) {
>>
>> That's the problem. Just unconditionally process the TX work regardless
>> of what is in IEVENT. That whole ->work_tx member and the code that
>> uses it can just be deleted. fec_enet_collect_events() can just return
>> a boolean saying whether there is any RX or TX work at all.
>
> Maybe Andy could chime in here, but I think the ->work_tx construction
> is load bearing. It seems to me like that is the only thing stopping
> us from trying to process non-existent queues on older versions of the
> silicon, which only have a single queue.

Then iterate over "actually existing" queues.

My primary point still stands.
From: David Miller <davem@davemloft.net>
Sent: Wednesday, July 1, 2020 3:58 AM

> From: "Tobias Waldekranz" <tobias@waldekranz.com>
> Date: Tue, 30 Jun 2020 08:39:58 +0200
>
> > On Mon Jun 29, 2020 at 3:07 PM CEST, David Miller wrote:
> >> I don't see how this can happen since you process the TX queue
> >> unconditionally every NAPI pass, regardless of what bits you see
> >> set in the IEVENT register.
> >>
> >> Or don't you? Oh, I see, you don't:
> >>
> >> 	for_each_set_bit(queue_id, &fep->work_tx, FEC_ENET_MAX_TX_QS) {
> >>
> >> That's the problem. Just unconditionally process the TX work
> >> regardless of what is in IEVENT. That whole ->work_tx member and the
> >> code that uses it can just be deleted. fec_enet_collect_events() can
> >> just return a boolean saying whether there is any RX or TX work at all.
> >
> > Maybe Andy could chime in here, but I think the ->work_tx construction
> > is load bearing. It seems to me like that is the only thing stopping
> > us from trying to process non-existent queues on older versions of the
> > silicon, which only have a single queue.
>
> Then iterate over "actually existing" queues.

Yes, the iteration is over real queues, but only bit 2 has a chance to
be set there, so it is compatible with single-queue devices.
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 2d0d313ee7c5..f6e25c2d2c8c 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1617,8 +1617,17 @@ fec_enet_rx(struct net_device *ndev, int budget)
 }
 
 static bool
-fec_enet_collect_events(struct fec_enet_private *fep, uint int_events)
+fec_enet_collect_events(struct fec_enet_private *fep)
 {
+	uint int_events;
+
+	int_events = readl(fep->hwp + FEC_IEVENT);
+
+	/* Don't clear MDIO events, we poll for those */
+	int_events &= ~FEC_ENET_MII;
+
+	writel(int_events, fep->hwp + FEC_IEVENT);
+
 	if (int_events == 0)
 		return false;
 
@@ -1644,16 +1653,9 @@ fec_enet_interrupt(int irq, void *dev_id)
 {
 	struct net_device *ndev = dev_id;
 	struct fec_enet_private *fep = netdev_priv(ndev);
-	uint int_events;
 	irqreturn_t ret = IRQ_NONE;
 
-	int_events = readl(fep->hwp + FEC_IEVENT);
-
-	/* Don't clear MDIO events, we poll for those */
-	int_events &= ~FEC_ENET_MII;
-
-	writel(int_events, fep->hwp + FEC_IEVENT);
-	fec_enet_collect_events(fep, int_events);
+	fec_enet_collect_events(fep);
 
 	if ((fep->work_tx || fep->work_rx) && fep->link) {
 		ret = IRQ_HANDLED;
@@ -1674,6 +1676,8 @@ static int fec_enet_rx_napi(struct napi_struct *napi, int budget)
 	struct fec_enet_private *fep = netdev_priv(ndev);
 	int pkts;
 
+	fec_enet_collect_events(fep);
+
 	pkts = fec_enet_rx(ndev, budget);
 
 	fec_enet_tx(ndev);
In the ISR, we poll the event register for the queues in need of
service and then enter polled mode. After this point, the event
register will never be read again until we exit polled mode.

In a scenario where a UDP flow is routed back out through the same
interface, i.e. "router-on-a-stick", we'll typically only see an rx
queue event initially. Once we start to process the incoming flow
we'll be locked in polled mode, but we'll never clean the tx rings
since that event is never caught.

Eventually the netdev watchdog will trip, causing all buffers to be
dropped and then the process starts over again.

By adding a poll of the active events at each NAPI call, we avoid the
starvation.

Fixes: 4d494cdc92b3 ("net: fec: change data structure to support multiqueue")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
---
 drivers/net/ethernet/freescale/fec_main.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)