Message ID | 4A11391D.8060503@cosmosbay.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
From: Eric Dumazet <dada1@cosmosbay.com> Date: Mon, 18 May 2009 12:31:57 +0200 > [PATCH] sch_teql: should not dereference skb after ndo_start_xmit() > > It is illegal to dereference a skb after a successful ndo_start_xmit() > call. We must store skb length in a local variable instead. > > Bug was introduced in 2.6.27 by commit 0abf77e55a2459aa9905be4b226e4729d5b4f0cb > (net_sched: Add accessor function for packet length for qdiscs) > > Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Applied and queued up for -stable, thanks! -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller a écrit : > From: Eric Dumazet <dada1@cosmosbay.com> > Date: Mon, 18 May 2009 12:31:57 +0200 > >> [PATCH] sch_teql: should not dereference skb after ndo_start_xmit() >> >> It is illegal to dereference a skb after a successful ndo_start_xmit() >> call. We must store skb length in a local variable instead. >> >> Bug was introduced in 2.6.27 by commit 0abf77e55a2459aa9905be4b226e4729d5b4f0cb >> (net_sched: Add accessor function for packet length for qdiscs) >> >> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> > > Applied and queued up for -stable, thanks! Looking again at teql_master_xmit(), I wonder if there is another problem in it. int subq = skb_get_queue_mapping(skb); But as a matter of fact, following code assumes subq is 0 struct netdev_queue *slave_txq = netdev_get_tx_queue(slave, 0); ... if (__netif_subqueue_stopped(slave, subq) || ... Either we should set subq to 0, or call dev_pick_tx() to better take into account multi queue slaves ? (As teqlN has one queue, I assume original dev_queue_xmit() will set skb queue_mapping to 0 before entering teql_master_xmit(), but maybe other paths could call teql_master_xmit() not through dev_queue_xmit() ? Oh well, time for sleeping a bit... -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Eric Dumazet <dada1@cosmosbay.com> Date: Tue, 19 May 2009 03:34:03 +0200 > Looking again at teql_master_xmit(), I wonder if there is another > problem in it. > > int subq = skb_get_queue_mapping(skb); > > But as a matter of fact, following code assumes subq is 0 > > struct netdev_queue *slave_txq = netdev_get_tx_queue(slave, 0); > ... > if (__netif_subqueue_stopped(slave, subq) || > ... > > Either we should set subq to 0, or call dev_pick_tx() to better > take into account multi queue slaves ? > > (As teqlN has one queue, I assume original dev_queue_xmit() will > set skb queue_mapping to 0 before entering teql_master_xmit(), but > maybe other paths could call teql_master_xmit() not through dev_queue_xmit() ? > > Oh well, time for sleeping a bit... I was admittedly very hasty with the multiqueue conversion here. I'll take a closer look at this. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Eric Dumazet <dada1@cosmosbay.com> Date: Tue, 19 May 2009 03:34:03 +0200 > Looking again at teql_master_xmit(), I wonder if there is another > problem in it. > > int subq = skb_get_queue_mapping(skb); > > But as a matter of fact, following code assumes subq is 0 I looked into this again, and damn this is tough to deal with. The code works as-is, since teql devices have only 1 queue we can assume queue 0 and we only end up using one of the slave devices queues too. I tried to export dev_pick_tx() (renaming it to netdev_pick_tx() to avoid global namespace pollution) but then I realized that we can't just whack the subq here. If this gets punted back to the caller and we don't actually send out the packet, we can't leave the new skb->queue_index in there as teql's multiqueue parameters are what will be checked and used against this SKB again. I suppose we could restore the old queue index value when we exhaust the slaves and can't transmit, but is getting messy for sure. Since the code works properly, and this is merely a performance issue, I'm deferring this again. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/sched/sch_teql.c b/net/sched/sch_teql.c index ec697ce..3b64182 100644 --- a/net/sched/sch_teql.c +++ b/net/sched/sch_teql.c @@ -303,6 +303,8 @@ restart: switch (teql_resolve(skb, skb_res, slave)) { case 0: if (__netif_tx_trylock(slave_txq)) { + unsigned int length = qdisc_pkt_len(skb); + if (!netif_tx_queue_stopped(slave_txq) && !netif_tx_queue_frozen(slave_txq) && slave_ops->ndo_start_xmit(skb, slave) == 0) { @@ -310,8 +312,7 @@ restart: master->slaves = NEXT_SLAVE(q); netif_wake_queue(dev); master->stats.tx_packets++; - master->stats.tx_bytes += - qdisc_pkt_len(skb); + master->stats.tx_bytes += length; return 0; } __netif_tx_unlock(slave_txq);
David I found following by code inspection, not a crash analysis, but I believe it is a 2.6.30 candidate, and stable (2.6.27 and up) as well. Thank you [PATCH] sch_teql: should not dereference skb after ndo_start_xmit() It is illegal to dereference a skb after a successful ndo_start_xmit() call. We must store skb length in a local variable instead. Bug was introduced in 2.6.27 by commit 0abf77e55a2459aa9905be4b226e4729d5b4f0cb (net_sched: Add accessor function for packet length for qdiscs) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> --- -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html