Message ID | 1242311620.6560.14.camel@heimdal.trondhjem.org |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
Trond Myklebust <trond.myklebust@fys.uio.no> writes:

> On Thu, 2009-05-14 at 09:34 -0400, Jeff Moyer wrote:
>> Trond Myklebust <trond.myklebust@fys.uio.no> writes:
>>
>> > On Wed, 2009-05-13 at 15:29 -0400, Jeff Moyer wrote:
>> >> Hi, netdev folks.  The summary here is:
>> >>
>> >> A patch added in the 2.6.30 development cycle caused a performance
>> >> regression in my NFS iozone testing.  The patch in question is the
>> >> following:
>> >>
>> >> commit 47a14ef1af48c696b214ac168f056ddc79793d0e
>> >> Author: Olga Kornievskaia <aglo@citi.umich.edu>
>> >> Date:   Tue Oct 21 14:13:47 2008 -0400
>> >>
>> >>     svcrpc: take advantage of tcp autotuning
>> >>
>> >> which is also quoted below.  Using 8 nfsd threads, a single client doing
>> >> 2GB of streaming read I/O goes from 107590 KB/s under 2.6.29 to 65558
>> >> KB/s under 2.6.30-rc4.  I also see more run to run variation under
>> >> 2.6.30-rc4 using the deadline I/O scheduler on the server.  That
>> >> variation disappears (as does the performance regression) when reverting
>> >> the above commit.
>> >
>> > It looks to me as if we've got a bug in the svc_tcp_has_wspace() helper
>> > function. I can see no reason why we should stop processing new incoming
>> > RPC requests just because the send buffer happens to be 2/3 full. If we
>> > see that we have space for another reply, then we should just go for it.
>> > OTOH, we do want to ensure that the SOCK_NOSPACE flag remains set, so
>> > that the TCP layer knows that we're congested, and that we'd like it to
>> > increase the send window size, please.
>> >
>> > Could you therefore please see if the following (untested) patch helps?
>>
>> I'm seeing slightly better results with the patch:
>>
>> 71548
>> 75987
>> 71557
>> 87432
>> 83538
>>
>> But that's still not up to the speeds we saw under 2.6.29.  The packet
>> capture for one run can be found here:
>>   http://people.redhat.com/jmoyer/trond.pcap.bz2
>>
>> Cheers,
>> Jeff
>
> Yes. Something is very wrong there...
>
> See for instance frame 1195, where the client finishes sending a whole
> series of READ requests, and we go into a flurry of ACKs passing
> backwards and forwards, but no data. It looks as if the NFS server isn't
> processing anything, probably because the window size falls afoul of the
> svc_tcp_has_wspace()...
>
> Does something like this help?

Is this in addition to the previous patch or instead of it?

Cheers,
Jeff
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Trond Myklebust <trond.myklebust@fys.uio.no> writes:

> Yes. Something is very wrong there...
>
> See for instance frame 1195, where the client finishes sending a whole
> series of READ requests, and we go into a flurry of ACKs passing
> backwards and forwards, but no data. It looks as if the NFS server isn't
> processing anything, probably because the window size falls afoul of the
> svc_tcp_has_wspace()...
>
> Does something like this help?

Sorry for the previous, stupid question.  I applied the patch in
addition to the last one and here are the results:

70327
71561
68760
69199
65324

A packet capture for this run is available here:
  http://people.redhat.com/jmoyer/trond2.pcap.bz2

Any more ideas?  ;)

-Jeff
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 8962355..4837442 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -972,9 +972,16 @@ static int svc_tcp_has_wspace(struct svc_xprt *xprt)
 {
 	struct svc_sock *svsk =	container_of(xprt, struct svc_sock, sk_xprt);
 	struct svc_serv	*serv = svsk->sk_xprt.xpt_server;
+	int reserved;
 	int required;
 
-	required = (atomic_read(&xprt->xpt_reserved) + serv->sv_max_mesg) * 2;
+	reserved = atomic_read(&xprt->xpt_reserved);
+	/* Always allow the server to process at least one request, whether
+	 * or not the TCP window is large enough
+	 */
+	if (reserved == 0)
+		return 1;
+	required = (reserved + serv->sv_max_mesg) << 1;
 	if (sk_stream_wspace(svsk->sk_sk) < required)
 		goto out_nospace;
 	return 1;
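
For readers who want to see what the hunk above changes without building a kernel, here is a rough userspace sketch of the two versions of the check. It is not kernel code: struct wspace_model and the names has_wspace_old() and has_wspace_new() are hypothetical, and the three fields merely stand in for xpt_reserved, sv_max_mesg and the value returned by sk_stream_wspace(). The out_nospace path (which the thread says should leave SOCK_NOSPACE set) is reduced to returning false.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state svc_tcp_has_wspace() reads. */
struct wspace_model {
	int reserved;	/* bytes reserved for in-flight replies (xpt_reserved) */
	int max_mesg;	/* largest possible reply (sv_max_mesg) */
	int wspace;	/* free send-buffer space (sk_stream_wspace()) */
};

/* Before the patch: always demand twice (reserved + max_mesg). */
static bool has_wspace_old(const struct wspace_model *m)
{
	int required = (m->reserved + m->max_mesg) * 2;

	return m->wspace >= required;
}

/* After the patch: with nothing reserved, always admit one request. */
static bool has_wspace_new(const struct wspace_model *m)
{
	int required;

	if (m->reserved == 0)
		return true;
	required = (m->reserved + m->max_mesg) << 1;
	return m->wspace >= required;
}
```

With made-up numbers such as reserved = 0, max_mesg = 1048576 and wspace = 65536, has_wspace_old() refuses the request while has_wspace_new() admits it. Once any reply space is reserved, the two functions agree, which matches the patch comment about always allowing at least one request.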