From patchwork Thu Nov 13 19:15:02 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Luis Henriques X-Patchwork-Id: 410553 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 9B0611400D2 for ; Fri, 14 Nov 2014 06:15:15 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933903AbaKMTPI (ORCPT ); Thu, 13 Nov 2014 14:15:08 -0500 Received: from youngberry.canonical.com ([91.189.89.112]:44313 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933587AbaKMTPH (ORCPT ); Thu, 13 Nov 2014 14:15:07 -0500 Received: from [188.251.61.86] (helo=localhost) by youngberry.canonical.com with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1XozrD-00019F-Id; Thu, 13 Nov 2014 19:15:03 +0000 Date: Thu, 13 Nov 2014 19:15:02 +0000 From: Luis Henriques To: Eric Dumazet Cc: David Miller , netdev , Neal Cardwell , Joseph Salisbury Subject: Re: [PATCH] net: skb_fclone_busy() needs to detect orphaned skb Message-ID: <20141113191502.GC7095@hercules> References: <1414690354.9028.9.camel@edumazet-glaptop2.roam.corp.google.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <1414690354.9028.9.camel@edumazet-glaptop2.roam.corp.google.com> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Hi Eric, On Thu, Oct 30, 2014 at 10:32:34AM -0700, Eric Dumazet wrote: > From: Eric Dumazet > > Some drivers are unable to perform TX completions in a bound time. > They instead call skb_orphan() > > Problem is skb_fclone_busy() has to detect this case, otherwise > we block TCP retransmits and can freeze unlucky tcp sessions on > mostly idle hosts. > > Signed-off-by: Eric Dumazet > Fixes: 1f3279ae0c13 ("tcp: avoid retransmits of TCP packets hanging in host queues") > --- > This is a stable candidate. > This problem is known to hurt users of linux-3.16 kernels used by guests kernels. > David, I can provide backports if you want. > Thanks ! > We got a bug report[0] where a backport for 3.16 was provided. Since I couldn't find the original backport post, I'm not sure who's the actual author. Could you please confirm if this backport is correct? (I'm copying the patch below). [0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1390604 Cheers, Signed-off-by: Eric Dumazet --- Luís > include/linux/skbuff.h | 8 ++++++-- > net/ipv4/tcp_output.c | 2 +- > net/xfrm/xfrm_policy.c | 2 +- > 3 files changed, 8 insertions(+), 4 deletions(-) > > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h > index 5884f95ff0e9..6c8b6f604e76 100644 > --- a/include/linux/skbuff.h > +++ b/include/linux/skbuff.h > @@ -799,15 +799,19 @@ struct sk_buff_fclones { > * @skb: buffer > * > * Returns true is skb is a fast clone, and its clone is not freed. > + * Some drivers call skb_orphan() in their ndo_start_xmit(), > + * so we also check that this didnt happen. > */ > -static inline bool skb_fclone_busy(const struct sk_buff *skb) > +static inline bool skb_fclone_busy(const struct sock *sk, > + const struct sk_buff *skb) > { > const struct sk_buff_fclones *fclones; > > fclones = container_of(skb, struct sk_buff_fclones, skb1); > > return skb->fclone == SKB_FCLONE_ORIG && > - fclones->skb2.fclone == SKB_FCLONE_CLONE; > + fclones->skb2.fclone == SKB_FCLONE_CLONE && > + fclones->skb2.sk == sk; > } > > static inline struct sk_buff *alloc_skb_fclone(unsigned int size, > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c > index 3af21296d967..a3d453b94747 100644 > --- a/net/ipv4/tcp_output.c > +++ b/net/ipv4/tcp_output.c > @@ -2126,7 +2126,7 @@ bool tcp_schedule_loss_probe(struct sock *sk) > static bool skb_still_in_host_queue(const struct sock *sk, > const struct sk_buff *skb) > { > - if (unlikely(skb_fclone_busy(skb))) { > + if (unlikely(skb_fclone_busy(sk, skb))) { > NET_INC_STATS_BH(sock_net(sk), > LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES); > return true; > diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c > index 4c4e457e7888..88bf289abdc9 100644 > --- a/net/xfrm/xfrm_policy.c > +++ b/net/xfrm/xfrm_policy.c > @@ -1962,7 +1962,7 @@ static int xdst_queue_output(struct sock *sk, struct sk_buff *skb) > struct xfrm_policy *pol = xdst->pols[0]; > struct xfrm_policy_queue *pq = &pol->polq; > > - if (unlikely(skb_fclone_busy(skb))) { > + if (unlikely(skb_fclone_busy(sk, skb))) { > kfree_skb(skb); > return 0; > } > > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 4e4932b5079b..a8794367cd20 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2082,7 +2082,8 @@ static bool skb_still_in_host_queue(const struct sock *sk, const struct sk_buff *fclone = skb + 1; if (unlikely(skb->fclone == SKB_FCLONE_ORIG && - fclone->fclone == SKB_FCLONE_CLONE)) { + fclone->fclone == SKB_FCLONE_CLONE && + fclone->sk == sk)) { NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES); return true;