Message ID | 1340270358-19504-5-git-send-email-yevgenyp@mellanox.co.il |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
From: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Date: Thu, 21 Jun 2012 12:19:17 +0300

> The transmit and transmit completion flows execute from different contexts,
> which are not synchronized. Hence naively reading the consumer index might
> give a wrong value by the time it is used. That could lead to a transmit
> timeout.
> Fix that by using an atomic variable to maintain that index.
>
> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>

I'm not convinced. There is only one place that actually changes
the counter.

So it seems more like you have a missing memory barrier somewhere.

Other drivers do not need to use something as expensive as an atomic
variable for this and neither should you.

I'm not applying this patch series, you'll need to resubmit it in its
entirety once you fix this patch.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
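To illustrate the "missing memory barrier" remark, here is a minimal user-space sketch of a stop-then-recheck scheme. It is not the mlx4 code and not a proposed fix: `struct txq`, `txq_avail()`, `txq_maybe_stop()` and `txq_complete()` are hypothetical names, and C11 fences stand in for the kernel's barrier primitives. The assumption is that the transmit path is the only writer of `prod` and the completion path is the only writer of `cons`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical TX queue state; names are illustrative only. */
struct txq {
    _Atomic uint32_t prod;   /* written only by the transmit path */
    _Atomic uint32_t cons;   /* written only by the completion path */
    atomic_bool stopped;     /* stands in for the queue "blocked" state */
    uint32_t size;           /* total ring entries */
};

/* Free entries, computed from the two free-running indices. */
static uint32_t txq_avail(struct txq *q)
{
    uint32_t prod = atomic_load_explicit(&q->prod, memory_order_relaxed);
    uint32_t cons = atomic_load_explicit(&q->cons, memory_order_relaxed);

    return q->size - (prod - cons);
}

/* Transmit path. Returns true if the queue is left stopped. */
static bool txq_maybe_stop(struct txq *q, uint32_t need)
{
    assert(need <= q->size);
    if (txq_avail(q) >= need)
        return false;

    atomic_store_explicit(&q->stopped, true, memory_order_relaxed);
    /* Full barrier: the stop must be visible before cons is re-read. */
    atomic_thread_fence(memory_order_seq_cst);

    /* Recheck: a completion may have freed space in the meantime. */
    if (txq_avail(q) >= need) {
        atomic_store_explicit(&q->stopped, false, memory_order_relaxed);
        return false;
    }
    return true;
}

/* Completion path. Returns true if it woke the queue. */
static bool txq_complete(struct txq *q, uint32_t done, uint32_t need)
{
    uint32_t cons = atomic_load_explicit(&q->cons, memory_order_relaxed);

    atomic_store_explicit(&q->cons, cons + done, memory_order_relaxed);
    /* Full barrier: publish cons before testing the stopped flag. */
    atomic_thread_fence(memory_order_seq_cst);

    if (atomic_load_explicit(&q->stopped, memory_order_relaxed) &&
        txq_avail(q) >= need) {
        atomic_store_explicit(&q->stopped, false, memory_order_relaxed);
        return true;
    }
    return false;
}
```

With a full barrier on both sides, at least one path observes the other's store, so a stopped queue is not left waiting for a completion that has already run. If both paths observe each other, the queue is simply restarted twice, which is harmless in this sketch.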
> > The transmit and transmit completion flows execute from different contexts,
> > which are not synchronized. Hence naively reading the consumer index might
> > give a wrong value by the time it is used. That could lead to a transmit
> > timeout.
> > Fix that by using an atomic variable to maintain that index.
> >
> > Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
>
> I'm not convinced. There is only one place that actually changes
> the counter.
>
> So it seems more like you have a missing memory barrier somewhere.

Or just keep the two ring indexes, instead of keeping the
number of 'active' entries as well.
Then you don't have a variable which the tx setup and
tx completion routines both update.

	David
On Mon, 2012-06-25 at 10:00 +0100, David Laight wrote:
> > > The transmit and transmit completion flows execute from different
> > > contexts, which are not synchronized. Hence naively reading the
> > > consumer index might give a wrong value by the time it is used.
> > > That could lead to a transmit timeout.
> > > Fix that by using an atomic variable to maintain that index.
> > >
> > > Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
> >
> > I'm not convinced. There is only one place that actually changes
> > the counter.
> >
> > So it seems more like you have a missing memory barrier somewhere.
>
> Or just keep the two ring indexes, instead of keeping the
> number of 'active' entries as well.
> Then you don't have a variable which the tx setup and
> tx completion routines both update.

This is what was implied by David.

Using a producer/consumer index and appropriate memory barriers,
start_xmit() and tx completion can be truly lockless and atomicless in
their fast path.

There are many drivers doing that correctly. The tg3 driver is a good
example.
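A minimal user-space sketch of the producer/consumer index scheme described above, for readers who want to see its shape. This is not tg3 or mlx4 code: `struct ring`, `ring_used()`, `ring_push()` and `ring_pop()` are hypothetical names, `RING_SIZE` is an arbitrary value, and C11 acquire/release loads and stores stand in for the kernel's barriers. No read-modify-write operation is used; each index has exactly one writer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u  /* hypothetical size; must be a power of two */
static_assert((RING_SIZE & (RING_SIZE - 1)) == 0, "power of two required");

struct ring {
    _Atomic uint32_t prod;  /* written only by the transmit path */
    _Atomic uint32_t cons;  /* written only by the completion path */
    int slots[RING_SIZE];
};

/* Entries in flight. Unsigned subtraction stays correct across wraparound. */
static uint32_t ring_used(uint32_t prod, uint32_t cons)
{
    return prod - cons;
}

/* Transmit side: the only writer of prod. */
static bool ring_push(struct ring *r, int v)
{
    uint32_t prod = atomic_load_explicit(&r->prod, memory_order_relaxed);
    /* Acquire pairs with the release store in ring_pop(). */
    uint32_t cons = atomic_load_explicit(&r->cons, memory_order_acquire);

    if (ring_used(prod, cons) == RING_SIZE)
        return false;  /* full: a driver would stop the queue here */

    r->slots[prod & (RING_SIZE - 1)] = v;
    /* Release: the slot contents become visible before the new index. */
    atomic_store_explicit(&r->prod, prod + 1, memory_order_release);
    return true;
}

/* Completion side: the only writer of cons. */
static bool ring_pop(struct ring *r, int *out)
{
    uint32_t cons = atomic_load_explicit(&r->cons, memory_order_relaxed);
    /* Acquire pairs with the release store in ring_push(). */
    uint32_t prod = atomic_load_explicit(&r->prod, memory_order_acquire);

    if (ring_used(prod, cons) == 0)
        return false;  /* empty */

    *out = r->slots[cons & (RING_SIZE - 1)];
    /* Release: the slot read completes before the slot is handed back. */
    atomic_store_explicit(&r->cons, cons + 1, memory_order_release);
    return true;
}
```

Because occupancy is derived from the two indices on demand, there is no shared counter that both paths have to update, which is the point made in the replies above.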
> > The transmit and transmit completion flows execute from different
> > contexts, which are not synchronized. Hence naively reading the
> > consumer index might give a wrong value by the time it is used.
> > That could lead to a transmit timeout.
> > Fix that by using an atomic variable to maintain that index.
> >
> > Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
>
> I'm not convinced. There is only one place that actually changes the counter.
>
> So it seems more like you have a missing memory barrier somewhere.
>
> Other drivers do not need to use something as expensive as an atomic
> variable for this and neither should you.
>
> I'm not applying this patch series, you'll need to resubmit it in its
> entirety once you fix this patch.

Thanks,
I'll resubmit the other 3 and continue to work on this one.

Thanks,
Yevgeny
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 019d856..f4b4703 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -165,6 +165,7 @@ int mlx4_en_activate_tx_ring(struct mlx4_en_priv *priv,
 	ring->last_nr_txbb = 1;
 	ring->poll_cnt = 0;
 	ring->blocked = 0;
+	atomic_set(&ring->inflight, 0);
 	memset(ring->tx_info, 0, ring->size * sizeof(struct mlx4_en_tx_info));
 	memset(ring->buf, 0, ring->buf_size);
 
@@ -364,15 +365,13 @@ static void mlx4_en_process_tx_cq(struct net_device *dev, struct mlx4_en_cq *cq)
 	wmb();
 	ring->cons += txbbs_skipped;
 	netdev_tx_completed_queue(ring->tx_queue, packets, bytes);
+	atomic_sub(txbbs_skipped, &ring->inflight);
 
 	/* Wakeup Tx queue if this ring stopped it */
-	if (unlikely(ring->blocked)) {
-		if ((u32) (ring->prod - ring->cons) <=
-		     ring->size - HEADROOM - MAX_DESC_TXBBS) {
-			ring->blocked = 0;
-			netif_tx_wake_queue(ring->tx_queue);
-			priv->port_stats.wake_queue++;
-		}
+	if (unlikely(ring->blocked && txbbs_skipped > 0)) {
+		ring->blocked = 0;
+		netif_tx_wake_queue(ring->tx_queue);
+		priv->port_stats.wake_queue++;
 	}
 }
 
@@ -588,7 +587,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 	vlan_tag = vlan_tx_tag_get(skb);
 
 	/* Check available TXBBs And 2K spare for prefetch */
-	if (unlikely(((int)(ring->prod - ring->cons)) >
+	if (unlikely(atomic_read(&ring->inflight) >
 		     ring->size - HEADROOM - MAX_DESC_TXBBS)) {
 		/* every full Tx ring stops queue */
 		netif_tx_stop_queue(ring->tx_queue);
@@ -710,6 +709,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 	}
 
 	ring->prod += nr_txbb;
+	atomic_add(nr_txbb, &ring->inflight);
 
 	/* If we used a bounce buffer then copy descriptor back into place */
 	if (bounce)
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 225c20d..6a8a69d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -257,6 +257,7 @@ struct mlx4_en_tx_ring {
 	struct mlx4_bf bf;
 	bool bf_enabled;
 	struct netdev_queue *tx_queue;
+	atomic_t inflight;
 };
 
 struct mlx4_en_rx_desc {
The transmit and transmit completion flows execute from different contexts,
which are not synchronized. Hence naively reading the consumer index might
give a wrong value by the time it is used. That could lead to a transmit
timeout.
Fix that by using an atomic variable to maintain that index.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c   |   16 ++++++++--------
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |    1 +
 2 files changed, 9 insertions(+), 8 deletions(-)