Message ID | 1274456515.9022.14.camel@mulgrave.site |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
Headers | show |
On Fri, 21 May 2010, James Bottomley wrote: > > The patch in question is this one (upstream for a while): > > commit d7d05548a62c87ee55b0c81933669177f885aa8d > Author: Mike Christie <michaelc@cs.wisc.edu> > Date: Wed Mar 31 14:41:35 2010 -0500 > > [SCSI] iscsi_tcp: fix relogin/shutdown hang > > It's a simple one line change in iscsi_tcp.c (diff clipped): > > --- a/drivers/scsi/iscsi_tcp.c > +++ b/drivers/scsi/iscsi_tcp.c > @@ -599,7 +599,7 @@ static void iscsi_sw_tcp_conn_stop(struct iscsi_cls_conn *cls_conn, int flag) > - if (sock->sk->sk_sleep && waitqueue_active(sock->sk->sk_sleep)) { > + if (sock->sk->sk_sleep) { > > It was killed by this merge commit in the net-next tree: > > commit 278554bd6579206921f5d8a523649a7a57f8850d > Merge: 5a147e8 cea0d76 > Author: David S. Miller <davem@davemloft.net> > Date: Wed May 12 00:05:35 2010 -0700 > > However, the curious thing is that git seems to have lost trace of the > missing patch entirely No, it's there, and the bug is that David doesn't do merges well. One of the reasons I ask people to let me merge is that it results in cleaner history to not have criss-cross merges. And another is that I'm pretty good at it, and letting me make merges also means that I am more aware of problem spots. > if I try to find it in linus' tree with a git log -- > drivers/scsi/iscsi_tcp.c, it doesn't show up. That is because "git log" will see the merge, see that _all_ history came from the other side, and ignore the side that was ignored. Because clearly, that other side didn't actually contribute anything. Now, the _reason_ that other side didn't contribute anything is that David messed up the merge, and took just his own sides changes. > The merge commit which killed it does list iscsi_tcp.c as a file > conflict Yes. I re-did the merge, and the result looks like this (cut-and-paste whitespace damage, I'm just illustrating he point): diff --cc drivers/scsi/iscsi_tcp.c index 9eae04a,02143af..0000000 --- a/drivers/scsi/iscsi_tcp.c +++ b/drivers/scsi/iscsi_tcp.c @@@ -599,9 -599,9 +599,13 @@@ static void iscsi_sw_tcp_conn_stop(stru set_bit(ISCSI_SUSPEND_BIT, &conn->suspend_rx); write_unlock_bh(&tcp_sw_conn->sock->sk->sk_callback_lock); ++<<<<<<< HEAD + if (sk_sleep(sock->sk) && waitqueue_active(sk_sleep(sock->sk))) { ++======= + if (sock->sk->sk_sleep) { ++>>>>>>> cea0d76 sock->sk->sk_err = EIO; - wake_up_interruptible(sock->sk->sk_sleep); + wake_up_interruptible(sk_sleep(sock->sk)); } iscsi_conn_stop(cls_conn, flag); and David picked his side of things, not your side. The _correct_ merge would have been to take both changes, as is quite obvious if you do gitk 5a147e8...cea0d76 drivers/scsi/iscsi_tcp.c and see that the conflict comes because: - one side (David's) changed sock->sk->sk_sleep into sk_sleep(sock->sk) in commit aa395145165cb06a0d0885221bbe0ce4a564391d - the other side (your) removed the 'waitqueue_active()' part in commit d7d05548a62c87ee55b0c81933669177f885aa8d. So the end result _should_ have been this merge resolution: diff --cc drivers/scsi/iscsi_tcp.c index 9eae04a,02143af..0000000 --- a/drivers/scsi/iscsi_tcp.c +++ b/drivers/scsi/iscsi_tcp.c @@@ -599,9 -599,9 +599,9 @@@ static void iscsi_sw_tcp_conn_stop(stru set_bit(ISCSI_SUSPEND_BIT, &conn->suspend_rx); write_unlock_bh(&tcp_sw_conn->sock->sk->sk_callback_lock); - if (sk_sleep(sock->sk) && waitqueue_active(sk_sleep(sock->sk))) { - if (sock->sk->sk_sleep) { ++ if (sk_sleep(sock->sk)) { sock->sk->sk_err = EIO; - wake_up_interruptible(sock->sk->sk_sleep); + wake_up_interruptible(sk_sleep(sock->sk)); } iscsi_conn_stop(cls_conn, flag); but David just picked his side entirely. And that is also the reason for: > but git show on that commit doesn't list that file in the resolution > diff ... even though this is where it actually got killed. A merge diff ("combined diff") only shows conflicts as defined by "you resolved it to something that didn't match either side". That's a _real_ conflict. If you just end up picking one side, there is no diff. > Is this a git problem ... or is it just a mismerge in the net tree? So it's a mis-merge. You can see the commit with gitk v2.6.34.. --full-history drivers/scsi/iscsi_tcp.c which doesn't do the "ignore the side of a merge that didn't bring any new data in". Or, with any recent git, you can use "--simplify-merges" instead of full-history, which only simplifies trivial merges where neither side really touched things at all. If you do that, you'll also see why git doesn't show the uninteresting side of a merge by default. Just for fun, compare the graphs of gitk v2.6.34.. drivers/scsi/iscsi_tcp.c gitk v2.6.34.. --simplify-merges drivers/scsi/iscsi_tcp.c gitk v2.6.34.. --full-history drivers/scsi/iscsi_tcp.c and ask yourself: do you normally want to see _all_ the history, even stuff that didn't end up affecting the end result? > Either way, of course, we need the patch back ... I'll fix it up. Linus -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 21 May 2010, Linus Torvalds wrote: > > > Either way, of course, we need the patch back ... > > I'll fix it up. Hmm. Pushed that out as appended, since that is the correct resolve. HOWEVER - the code still doesn't actually make any sense. It does if (sk_sleep(sock->sk)) { and that sk_sleep() today is an inline function that just does return &sk->sk_wq->wait; and testing the result of an address-of operation for NULL is almost certainly totally non-sensical. Sure, it _might_ work (maybe 'wait' is the first element in the 'sk_wq' data structure, and sk_wq is NULL), but that kind of code is always total and utterl crap regardless. So I pushed it out because I had done the resolve already, and it's no worse than it was before, but it's still a steaming buggy pile of shit. It being iscsi, I can't bring myself to care. But somebody who does, should really look at it. The most likely resolution is to remove the test entirely, since I don't think it's ever valid to have a socket that doesn't have a sk_wq (there's a _lot_ of unconditional use of sk_sleep()). Linus -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2010-05-21 at 10:04 -0700, Linus Torvalds wrote: > > On Fri, 21 May 2010, Linus Torvalds wrote: > > > > > Either way, of course, we need the patch back ... > > > > I'll fix it up. > > Hmm. Pushed that out as appended, since that is the correct resolve. Thanks! > HOWEVER - the code still doesn't actually make any sense. It does > > if (sk_sleep(sock->sk)) { > > and that sk_sleep() today is an inline function that just does > > return &sk->sk_wq->wait; > > and testing the result of an address-of operation for NULL is almost > certainly totally non-sensical. Sure, it _might_ work (maybe 'wait' is the > first element in the 'sk_wq' data structure, and sk_wq is NULL), but that > kind of code is always total and utterl crap regardless. > > So I pushed it out because I had done the resolve already, and it's no > worse than it was before, but it's still a steaming buggy pile of shit. Yes, the problem was caused by this patch commit 43815482370c510c569fd18edb57afcb0fa8cab6 Author: Eric Dumazet <eric.dumazet@gmail.com> Date: Thu Apr 29 11:01:49 2010 +0000 net: sock_def_readable() and friends RCU conversion Which moved sk_sleep() from returning the pointer to the waitqueue, which may or may not be assigned to returning a pointer to an internal waitqueue in the socket, which, obviously, can never be null. I suspect what iscsi should be doing is always sending the wakeup ... in which case with your resolution, the code is operating correctly even if the form is suboptimal. > It being iscsi, I can't bring myself to care. But somebody who does, > should really look at it. The most likely resolution is to remove the test > entirely, since I don't think it's ever valid to have a socket that > doesn't have a sk_wq (there's a _lot_ of unconditional use of sk_sleep()). I'll have Mike look at it, but I think just removing the if() will be the correct resolution. James -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: James Bottomley <James.Bottomley@suse.de> Date: Fri, 21 May 2010 10:41:55 -0500 > Is this a git problem ... or is it just a mismerge in the net tree? Mismerge, because sk->sk_sleep() doesn't exist any more I mistakenly updated the original line to do the sk_sleep() stuff. Sorry about that. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Linus Torvalds <torvalds@linux-foundation.org> Date: Fri, 21 May 2010 09:45:49 -0700 (PDT) > One of the reasons I ask people to let me merge is that it results in > cleaner history to not have criss-cross merges. And another is that I'm > pretty good at it, and letting me make merges also means that I am more > aware of problem spots. That wasn't possible in this case. This happened more than a week ago, as I needed to merge your tree into net-2.6 to resolve a conflict there. That's what took in the iscsi bug fix, and this is way before the merge window. Then I needed to pull net-2.6 into net-next-2.6 to resolve conflicts existing between those two trees. And this is why I ended up having to do the merge :-) >> Either way, of course, we need the patch back ... > > I'll fix it up. Thanks Linus. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
--- a/drivers/scsi/iscsi_tcp.c +++ b/drivers/scsi/iscsi_tcp.c @@ -599,7 +599,7 @@ static void iscsi_sw_tcp_conn_stop(struct iscsi_cls_conn *cls_conn, int flag) - if (sock->sk->sk_sleep && waitqueue_active(sock->sk->sk_sleep)) { + if (sock->sk->sk_sleep) { It was killed by this merge commit in the net-next tree: commit 278554bd6579206921f5d8a523649a7a57f8850d Merge: 5a147e8 cea0d76 Author: David S. Miller <davem@davemloft.net>