Message ID | 20130712180116.21176.qmail@science.horizon.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On Fri, Jul 12, 2013 at 11:01 AM, George Spelvin <linux@horizon.com> wrote: >> Hi George, >> While you are right that functionally it doesn't matter, my preference >> would be to have nothing between the wmb() and iowrite() that kicks >> off the TX. This marginally helps kick off the TX process consistently >> slightly sooner. On modern HW, probably irrelevant, but not on the HW >> these chips are used on. > > I'll revise it. It just made sense to me to put it next to the other > bookkeeping line of tp->cur_tx++. Should I move them both below the > iowrite()? I would prefer that. I agree it's better to keep those two lines of code together. Just keep in mind this is a nit and it's not critical to accepting your change. So whatever you submit, I'll probably ACK. > As in: > > --- a/drivers/net/ethernet/dec/tulip/tulip_core.c > +++ b/drivers/net/ethernet/dec/tulip/tulip_core.c > @@ -702,11 +702,11 @@ tulip_start_xmit(struct sk_buff *skb, struct net_device *dev) > tp->tx_ring[entry].status = cpu_to_le32(DescOwned); > wmb(); > > - tp->cur_tx++; > - > /* Trigger an immediate transmit demand. */ > iowrite32(0, tp->base_addr + CSR1); > > + tp->cur_tx++; > + netdev_sent_queue(dev, skb->len); > spin_unlock_irqrestore(&tp->lock, flags); > > return NETDEV_TX_OK; Yup - looks good. >> Lastly, given I haven't powered up a system in two years which has >> tulip, any one want to take over maintainer for tulip driver? >> It's basically obsolete with a few rare patches like this one coming in. > > I'm not up to it myself, sorry. No worries. Just wanted to advertise to anyone who bothers to read about tulip patches. :) cheers, grant -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2013-07-12 at 14:01 -0400, George Spelvin wrote: > > Hi George, > > While you are right that functionally it doesn't matter, my preference > > would be to have nothing between the wmb() and iowrite() that kicks > > off the TX. This marginally helps kick off the TX process consistently > > slightly sooner. On modern HW, probably irrelevant, but not on the HW > > these chips are used on. > > I'll revise it. It just made sense to me to put it next to the other > bookkeeping line of tp->cur_tx++. Should I move them both below the > iowrite()? As in: > > --- a/drivers/net/ethernet/dec/tulip/tulip_core.c > +++ b/drivers/net/ethernet/dec/tulip/tulip_core.c > @@ -702,11 +702,11 @@ tulip_start_xmit(struct sk_buff *skb, struct net_device *dev) > tp->tx_ring[entry].status = cpu_to_le32(DescOwned); > wmb(); > > - tp->cur_tx++; > - > /* Trigger an immediate transmit demand. */ > iowrite32(0, tp->base_addr + CSR1); > > + tp->cur_tx++; > + netdev_sent_queue(dev, skb->len); This is not good practice, because once you start DMA you have effectively passed ownership of the skb to the TX completion handler. > spin_unlock_irqrestore(&tp->lock, flags); Presumably the TX completion handler will hold this spinlock and therefore cannot free the skb before you use skb->len above. So this will be safe now. But one day someone may want to get rid of this lock, so this is a trap waiting to spring. Ben. > return NETDEV_TX_OK; > > > Lastly, given I haven't powered up a system in two years which has > > tulip, any one want to take over maintainer for tulip driver? > > It's basically obsolete with a few rare patches like this one coming in. > > I'm not up to it myself, sorry.
On Mon, Jul 15, 2013 at 4:58 PM, Ben Hutchings <bhutchings@solarflare.com> wrote: > On Fri, 2013-07-12 at 14:01 -0400, George Spelvin wrote: >> > Hi George, >> > While you are right that functionally it doesn't matter, my preference >> > would be to have nothing between the wmb() and iowrite() that kicks >> > off the TX. This marginally helps kick off the TX process consistently >> > slightly sooner. On modern HW, probably irrelevant, but not on the HW >> > these chips are used on. >> >> I'll revise it. It just made sense to me to put it next to the other >> bookkeeping line of tp->cur_tx++. Should I move them both below the >> iowrite()? As in: >> >> --- a/drivers/net/ethernet/dec/tulip/tulip_core.c >> +++ b/drivers/net/ethernet/dec/tulip/tulip_core.c >> @@ -702,11 +702,11 @@ tulip_start_xmit(struct sk_buff *skb, struct net_device *dev) >> tp->tx_ring[entry].status = cpu_to_le32(DescOwned); >> wmb(); >> >> - tp->cur_tx++; >> - >> /* Trigger an immediate transmit demand. */ >> iowrite32(0, tp->base_addr + CSR1); >> >> + tp->cur_tx++; >> + netdev_sent_queue(dev, skb->len); > > This is not good practice, because once you start DMA you have > effectively passed ownership of the skb to the TX completion handler. Is the problem the reference to skb->len? By passing ownership, are you suggesting the device can change this value? AFAIK, tulip device only knows about the contents of tx_ring[] and not skb's. Would you like to see a comment added to that effect? >> spin_unlock_irqrestore(&tp->lock, flags); > > Presumably the TX completion handler will hold this spinlock and > therefore cannot free the skb before you use skb->len above. So this > will be safe now. Correct. > But one day someone may want to get rid of this lock, > so this is a trap waiting to spring. Even for tulip driver? Sorry, I just can't imagine anyone taking enough interest in tulip driver to implement that. I'm not even sure it would be possible. thanks! grant -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2013-07-15 at 18:17 -0700, Grant Grundler wrote: > Even for tulip driver? Sorry, I just can't imagine anyone taking > enough interest in tulip driver to implement that. I'm not even sure > it would be possible. > I would not care. When I reviewed this patch, I was aware of this problem (as this is the most common error done in BQL patches), and decided patch was fine, because lock was held. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2013-07-15 at 18:17 -0700, Grant Grundler wrote: > On Mon, Jul 15, 2013 at 4:58 PM, Ben Hutchings > <bhutchings@solarflare.com> wrote: > > On Fri, 2013-07-12 at 14:01 -0400, George Spelvin wrote: > >> > Hi George, > >> > While you are right that functionally it doesn't matter, my preference > >> > would be to have nothing between the wmb() and iowrite() that kicks > >> > off the TX. This marginally helps kick off the TX process consistently > >> > slightly sooner. On modern HW, probably irrelevant, but not on the HW > >> > these chips are used on. > >> > >> I'll revise it. It just made sense to me to put it next to the other > >> bookkeeping line of tp->cur_tx++. Should I move them both below the > >> iowrite()? As in: > >> > >> --- a/drivers/net/ethernet/dec/tulip/tulip_core.c > >> +++ b/drivers/net/ethernet/dec/tulip/tulip_core.c > >> @@ -702,11 +702,11 @@ tulip_start_xmit(struct sk_buff *skb, struct net_device *dev) > >> tp->tx_ring[entry].status = cpu_to_le32(DescOwned); > >> wmb(); > >> > >> - tp->cur_tx++; > >> - > >> /* Trigger an immediate transmit demand. */ > >> iowrite32(0, tp->base_addr + CSR1); > >> > >> + tp->cur_tx++; > >> + netdev_sent_queue(dev, skb->len); > > > > This is not good practice, because once you start DMA you have > > effectively passed ownership of the skb to the TX completion handler. > > Is the problem the reference to skb->len? > By passing ownership, are you suggesting the device can change this value? No, the device can complete the descriptor and then the TX completion handler will free the skb. [...] > > But one day someone may want to get rid of this lock, > > so this is a trap waiting to spring. > > Even for tulip driver? Sorry, I just can't imagine anyone taking > enough interest in tulip driver to implement that. I'm not even sure > it would be possible. You're taking interest in it, aren't you? But I accept this is a minor issue. Ben.
On Tue, Jul 16, 2013 at 7:37 AM, Ben Hutchings <bhutchings@solarflare.com> wrote: > On Mon, 2013-07-15 at 18:17 -0700, Grant Grundler wrote: ... >> Is the problem the reference to skb->len? >> By passing ownership, are you suggesting the device can change this value? > > No, the device can complete the descriptor and then the TX completion > handler will free the skb. Ah ok. That's what the spinlock_irqsave() is protecting against. > [...] >> > But one day someone may want to get rid of this lock, >> > so this is a trap waiting to spring. >> >> Even for tulip driver? Sorry, I just can't imagine anyone taking >> enough interest in tulip driver to implement that. I'm not even sure >> it would be possible. > > You're taking interest in it, aren't you? Let me clarify my interest: I will reject patches to change the locking for this driver. Patch reviews for this driver are a "community service project" since I know reasonably well how this chip behaves. But I'm not doing anything else (e.g. bug fixes or testing). If someone else wants to change the tulip driver locking, they need to submit a patch to become the new maintainer first (I'll ACK that :) > But I accept this is a minor issue. Ok. cheers, grant -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>> wmb(); >> >> - tp->cur_tx++; >> - >> /* Trigger an immediate transmit demand. */ >> iowrite32(0, tp->base_addr + CSR1); >> >> + tp->cur_tx++; >> + netdev_sent_queue(dev, skb->len); >> spin_unlock_irqrestore(&tp->lock, flags); > This is not good practice, because once you start DMA you have > effectively passed ownership of the skb to the TX completion handler. Thank you for this advice. Just to be clear, is the only issue reading skb->len from a potentially deallocated skb? Or is it also going go give the byte queue system fits if the transmit complete handler calls netdev_completed_queue before the transmitter calls netdev_sent_queue? I'd hope it can underflow safely, and only look at the net value after the transmit handler returns. > Presumably the TX completion handler will hold this spinlock and > therefore cannot free the skb before you use skb->len above. So this > will be safe now. But one day someone may want to get rid of this lock, > so this is a trap waiting to spring. Sounds like a fun project. But I have to dig into the BQL code; if it's getting and dropping locks inside netdev_*_queue, the win is limited. (A lock-free version would have separate "sent" and "completed" counters, and compute the difference when a snapshot is required.) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2013-07-17 at 00:09 -0400, George Spelvin wrote: > >> wmb(); > >> > >> - tp->cur_tx++; > >> - > >> /* Trigger an immediate transmit demand. */ > >> iowrite32(0, tp->base_addr + CSR1); > >> > >> + tp->cur_tx++; > >> + netdev_sent_queue(dev, skb->len); > >> spin_unlock_irqrestore(&tp->lock, flags); > > > This is not good practice, because once you start DMA you have > > effectively passed ownership of the skb to the TX completion handler. > > Thank you for this advice. Just to be clear, is the only issue reading > skb->len from a potentially deallocated skb? That's what I was thinking of. > Or is it also going go > give the byte queue system fits if the transmit complete handler calls > netdev_completed_queue before the transmitter calls netdev_sent_queue? I don't know. > I'd hope it can underflow safely, and only look at the net value after > the transmit handler returns. > > > Presumably the TX completion handler will hold this spinlock and > > therefore cannot free the skb before you use skb->len above. So this > > will be safe now. But one day someone may want to get rid of this lock, > > so this is a trap waiting to spring. > > Sounds like a fun project. But I have to dig into the BQL code; if it's > getting and dropping locks inside netdev_*_queue, the win is limited. > > (A lock-free version would have separate "sent" and "completed" counters, > and compute the difference when a snapshot is required.) Yes, it is lock-free with separate counters. The implementation is in lib/dynamic_queue_limits.c. I think that the use of POSDIFF() protects against races that could leave completed > sent. Ben.
Grumble. After a week of uptime with my tulip patches (the ones I posted here plus some cleanup patches) my kernel gave me this: ------------[ cut here ]------------ WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xd9/0x137() NETDEV WATCHDOG: cable (tulip): transmit queue 0 timed out CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.2-00033-gca137af #47 Hardware name: MICRO-STAR INTERNATIONAL CO.,LTD MS-7376/MS-7376, BIOS V1.7 01/13/2009 0000000000000000 ffffffff81028aba ffff88021fc03e68 ffff880216b38000 ffff88021fc03eb8 ffffffff8133643a ffffffff8133643a ffffffff81028b33 ffffffff8159fe93 ffff880200000030 ffff88021fc03ec8 ffff88021fc03e88 Call Trace: <IRQ> [<ffffffff81028aba>] ? warn_slowpath_common+0x56/0x6b [<ffffffff8133643a>] ? qdisc_rcu_free+0x19/0x19 [<ffffffff8133643a>] ? qdisc_rcu_free+0x19/0x19 [<ffffffff81028b33>] ? warn_slowpath_fmt+0x47/0x49 [<ffffffff8133634e>] ? netif_tx_lock+0x47/0x72 [<ffffffff81336513>] ? dev_watchdog+0xd9/0x137 [<ffffffff81031f40>] ? call_timer_fn.isra.29+0x1c/0x6f [<ffffffff810321a2>] ? run_timer_softirq+0x19a/0x1c2 [<ffffffff8102de09>] ? __do_softirq+0xb9/0x171 [<ffffffff8102df7f>] ? irq_exit+0x3a/0x7a [<ffffffff8101967a>] ? smp_apic_timer_interrupt+0x72/0x7e [<ffffffff8142a60a>] ? apic_timer_interrupt+0x6a/0x70 <EOI> [<ffffffff81007e34>] ? amd_e400_idle+0xbf/0xc1 [<ffffffff81007e2c>] ? amd_e400_idle+0xb7/0xc1 [<ffffffff8104f74d>] ? cpu_startup_entry+0x9c/0xec [<ffffffff8164eb91>] ? start_kernel+0x2bd/0x2c8 [<ffffffff8164e6f7>] ? repair_env_string+0x54/0x54 ---[ end trace fa3269ab5c1a15ad ]--- Since it's on the tulip device, it's probably my fault, and since it took a week to manifest itself, it's going to be a complete PITA to reproduce. Anyway, I haven't forgotten. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
--- a/drivers/net/ethernet/dec/tulip/tulip_core.c +++ b/drivers/net/ethernet/dec/tulip/tulip_core.c @@ -702,11 +702,11 @@ tulip_start_xmit(struct sk_buff *skb, struct net_device *dev) tp->tx_ring[entry].status = cpu_to_le32(DescOwned); wmb(); - tp->cur_tx++; - /* Trigger an immediate transmit demand. */ iowrite32(0, tp->base_addr + CSR1); + tp->cur_tx++; + netdev_sent_queue(dev, skb->len); spin_unlock_irqrestore(&tp->lock, flags); return NETDEV_TX_OK;