From patchwork Fri Apr 26 18:35:10 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Poirier X-Patchwork-Id: 239985 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 86DCC2C0114 for ; Sat, 27 Apr 2013 04:36:40 +1000 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757441Ab3DZSfo (ORCPT ); Fri, 26 Apr 2013 14:35:44 -0400 Received: from mail-qe0-f45.google.com ([209.85.128.45]:34931 "EHLO mail-qe0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756090Ab3DZSfn (ORCPT ); Fri, 26 Apr 2013 14:35:43 -0400 Received: by mail-qe0-f45.google.com with SMTP id a11so1427478qen.18 for ; Fri, 26 Apr 2013 11:35:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:sender:from:to:cc:subject:date:message-id:x-mailer :in-reply-to:references; bh=Tsp7WvnY4jkjcdcETHHXEKaTA1Kp7OulN6ZO/FCnr7c=; b=lcj7xZsU5i2pv4ZFVHdE1WsnUX5x6HShXVRoO/34k6RlGJ7V0voF9aMskv+DI157s5 9C31NGKE0OwTN17GH6WbudxJKW9P6ZqA8x5sNt3DoEZ3nkPDU0woTaF1QULiEtMQxh0w AxN0d6BqJue/6TSXUnmLYXKQmFsLFZ2Oh7ly8Mkpafmfc/lOfGNsUghxVrCP4EHQC9BX WShhnUk/yOEcWfBC4jPoBV1+14P500qNa93QmafDlUFibtSmi9tM6ho6CjIuEO1rP+8w Sb3OmKcbbbYs/nOiUUELwaNUUNztenkuyuG4IagSHyrtE+3NI6ApxQ3Phtqy04pxxaC0 Ro3g== X-Received: by 10.224.136.132 with SMTP id r4mr42158103qat.93.1367001342355; Fri, 26 Apr 2013 11:35:42 -0700 (PDT) Received: from d2.synalogic.ca (modemcable062.27-82-70.mc.videotron.ca. [70.82.27.62]) by mx.google.com with ESMTPSA id n5sm12022130qav.2.2013.04.26.11.35.41 for (version=TLSv1.2 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 26 Apr 2013 11:35:41 -0700 (PDT) From: Benjamin Poirier To: "David S. Miller" , Eric Dumazet , Pavel Emelyanov Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH net 1/3] unix/dgram: peek beyond 0-sized skbs Date: Fri, 26 Apr 2013 14:35:10 -0400 Message-Id: <1367001312-6719-1-git-send-email-bpoirier@suse.de> X-Mailer: git-send-email 1.7.10.4 In-Reply-To: <1366915736.8964.171.camel@edumazet-glaptop> References: <1366915736.8964.171.camel@edumazet-glaptop> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On 2013/04/25 11:48, Eric Dumazet wrote: > On Thu, 2013-04-25 at 09:47 -0400, Benjamin Poirier wrote: > > "77c1090 net: fix infinite loop in __skb_recv_datagram()" (v3.8) introduced a > > regression: > > After that commit, recv can no longer peek beyond a 0-sized skb in the queue. > > __skb_recv_datagram() instead stops at the first skb with len == 0 and results > > in the system call failing with -EFAULT via skb_copy_datagram_iovec(). > > > if MSG_PEEK is not used, what happens here ? I'm not sure what you're question is aiming at, but if MSG_PEEK isn't used, there's no difference with regards to this patch. It's all in the "if (flags & MSG_PEEK)" block. More generally, without MSG_PEEK, a sequence of send(..., len=10, ...); send(len=0); send(len=20) results in recv()=10; recv()=0; recv()=20; recv()= /* blocks */ With flags=MSG_PEEK, a sequence of send(len=10); send(len=0); send(len=20) resulted (without any patch) in setsockopt(..., SO_PEEK_OFF -> 0); recv()=10; recv()=0; recv()=0; recv()=0; ... and with v2 of the patch, results in setsockopt(..., SO_PEEK_OFF -> 0); recv()=10; recv()=0; recv()=20; recv()= /* blocks */ We could also have the following sequence setsockopt(..., SO_PEEK_OFF -> 10); recv()=0; recv()=20; recv()= /* blocks */ or setsockopt(..., SO_PEEK_OFF -> 5); recv()=5; recv()=0; recv()=20; recv()= /* blocks */ or the unfortunate setsockopt(..., SO_PEEK_OFF -> 0); recv()=10; recv()=0; recv()=20; setsockopt(..., SO_PEEK_OFF -> 0); recv()=10; ; recv()=20; recv()= /* blocks */ That last one could be changed by resetting the skb->peeked flag for all buffers the queue during sock_setsockopt SO_PEEK_OFF. If you think it's better that way. > > It doesn't look right to me that we return -EFAULT if skb->len is 0, > EFAULT is reserved to faulting (ie reading/writing at least one byte) That's what happens when skb_copy_datagram_iovec() is asked to copy > 0 bytes out of a skb with len == 0. Perhaps skb_copy_datagram_iovec() should be changed to use EINVAL in that case but we can avoid that kind of call altogether by fixing the problem with MSG_PEEK. > > How are we telling the user message had 0 byte, but its not EOF ? > We aren't, but what's EOF on a datagram socket? Thank you for the review. Subject: [PATCH net v2 1/3] unix/dgram: peek beyond 0-sized skbs "77c1090 net: fix infinite loop in __skb_recv_datagram()" (v3.8) introduced a regression: After that commit, recv can no longer peek beyond a 0-sized skb in the queue. __skb_recv_datagram() instead stops at the first skb with len == 0 and results in the system call failing with -EFAULT via skb_copy_datagram_iovec(). When peeking at an offset with 0-sized skb(s), each one of those is received only once, in sequence. The offset starts moving forward again after receiving datagrams with len > 0. Signed-off-by: Benjamin Poirier --- * v2 also fix the situation when sk_peek_off must advance to and beyond a 0-sized skb * v1 fix the case when SO_PEEK_OFF is used to set sk_peek_off beyond a 0-sized skb net/core/datagram.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/core/datagram.c b/net/core/datagram.c index 368f9c3..99c4f52 100644 --- a/net/core/datagram.c +++ b/net/core/datagram.c @@ -187,7 +187,8 @@ struct sk_buff *__skb_recv_datagram(struct sock *sk, unsigned int flags, skb_queue_walk(queue, skb) { *peeked = skb->peeked; if (flags & MSG_PEEK) { - if (*off >= skb->len && skb->len) { + if (*off >= skb->len && (skb->len || *off || + skb->peeked)) { *off -= skb->len; continue; }