From patchwork Thu Apr 21 21:07:25 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joe Stringer X-Patchwork-Id: 613298 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from archives.nicira.com (archives.nicira.com [96.126.127.54]) by ozlabs.org (Postfix) with ESMTP id 3qrWYZ32Xsz9t4Z for ; Fri, 22 Apr 2016 07:08:02 +1000 (AEST) Received: from archives.nicira.com (localhost [127.0.0.1]) by archives.nicira.com (Postfix) with ESMTP id EB8B31076B; Thu, 21 Apr 2016 14:07:52 -0700 (PDT) X-Original-To: dev@openvswitch.org Delivered-To: dev@openvswitch.org Received: from mx1e4.cudamail.com (mx1.cudamail.com [69.90.118.67]) by archives.nicira.com (Postfix) with ESMTPS id DB30010761 for ; Thu, 21 Apr 2016 14:07:50 -0700 (PDT) Received: from bar5.cudamail.com (unknown [192.168.21.12]) by mx1e4.cudamail.com (Postfix) with ESMTPS id 55C871E0346 for ; Thu, 21 Apr 2016 15:07:50 -0600 (MDT) X-ASG-Debug-ID: 1461272869-09eadd7fbb261350001-byXFYA Received: from mx1-pf2.cudamail.com ([192.168.24.2]) by bar5.cudamail.com with ESMTP id X1uNxJ3DvSxvCdCA (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 21 Apr 2016 15:07:49 -0600 (MDT) X-Barracuda-Envelope-From: joe@ovn.org X-Barracuda-RBL-Trusted-Forwarder: 192.168.24.2 Received: from unknown (HELO relay3-d.mail.gandi.net) (217.70.183.195) by mx1-pf2.cudamail.com with ESMTPS (DHE-RSA-AES256-SHA encrypted); 21 Apr 2016 21:07:49 -0000 Received-SPF: pass (mx1-pf2.cudamail.com: SPF record at ovn.org designates 217.70.183.195 as permitted sender) X-Barracuda-Apparent-Source-IP: 217.70.183.195 X-Barracuda-RBL-IP: 217.70.183.195 Received: from mfilter36-d.gandi.net (mfilter36-d.gandi.net [217.70.178.167]) by relay3-d.mail.gandi.net (Postfix) with ESMTP id 112B6A80CF; Thu, 21 Apr 2016 23:07:48 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at mfilter36-d.gandi.net Received: from relay3-d.mail.gandi.net ([IPv6:::ffff:217.70.183.195]) by mfilter36-d.gandi.net (mfilter36-d.gandi.net [::ffff:10.0.15.180]) (amavisd-new, port 10024) with ESMTP id L7zKuf8ybigd; Thu, 21 Apr 2016 23:07:46 +0200 (CEST) X-Originating-IP: 208.91.1.34 Received: from localhost.localdomain (unknown [208.91.1.34]) (Authenticated sender: joe@ovn.org) by relay3-d.mail.gandi.net (Postfix) with ESMTPSA id 3D64BA80C0; Thu, 21 Apr 2016 23:07:44 +0200 (CEST) X-CudaMail-Envelope-Sender: joe@ovn.org From: Joe Stringer To: dev@openvswitch.org X-CudaMail-Whitelist-To: dev@openvswitch.org X-CudaMail-MID: CM-E2-420078548 X-CudaMail-DTE: 042116 X-CudaMail-Originating-IP: 217.70.183.195 Date: Thu, 21 Apr 2016 14:07:25 -0700 X-ASG-Orig-Subj: [##CM-E2-420078548##][PATCH 2/5] compat: nf_defrag_ipv6: avoid/free clone operations. Message-Id: <1461272848-62263-3-git-send-email-joe@ovn.org> X-Mailer: git-send-email 2.1.4 In-Reply-To: <1461272848-62263-1-git-send-email-joe@ovn.org> References: <1461272848-62263-1-git-send-email-joe@ovn.org> X-Barracuda-Connect: UNKNOWN[192.168.24.2] X-Barracuda-Start-Time: 1461272869 X-Barracuda-Encrypted: DHE-RSA-AES256-SHA X-Barracuda-URL: https://web.cudamail.com:443/cgi-mod/mark.cgi X-ASG-Whitelist: Header =?UTF-8?B?eFwtY3VkYW1haWxcLXdoaXRlbGlzdFwtdG8=?= X-Virus-Scanned: by bsmtpd at cudamail.com X-Barracuda-BRTS-Status: 1 Subject: [ovs-dev] [PATCH 2/5] compat: nf_defrag_ipv6: avoid/free clone operations. X-BeenThere: dev@openvswitch.org X-Mailman-Version: 2.1.16 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: dev-bounces@openvswitch.org Sender: "dev" Upstream commit: netfilter: ipv6: nf_defrag: avoid/free clone operations commit 6aafeef03b9d9ecf ("netfilter: push reasm skb through instead of original frag skbs") changed ipv6 defrag to not use the original skbs anymore. So rather than keeping the original skbs around just to discard them afterwards just use the original skbs directly for the fraglist of the newly assembled skb and remove the extra clone/free operations. The skb that completes the fragment queue is morphed into a the reassembled one instead, just like ipv4 defrag. openvswitch doesn't need any additional skb_morph magic anymore to deal with this situation so just remove that. A followup patch can then also remove the NF_HOOK (re)invocation in the ipv6 netfilter defrag hook. Cc: Joe Stringer Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Upstream: 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free clone operations") Signed-off-by: Joe Stringer --- datapath/conntrack.c | 14 --- .../include/net/netfilter/ipv6/nf_defrag_ipv6.h | 12 +-- datapath/linux/compat/nf_conntrack_reasm.c | 104 ++++++++------------- 3 files changed, 46 insertions(+), 84 deletions(-) diff --git a/datapath/conntrack.c b/datapath/conntrack.c index 82db51567313..f721a8920678 100644 --- a/datapath/conntrack.c +++ b/datapath/conntrack.c @@ -337,21 +337,7 @@ static int handle_fragments(struct net *net, struct sw_flow_key *key, if (!reasm) return -EINPROGRESS; - if (skb == reasm) { - kfree_skb(skb); - return -EINVAL; - } - - /* Don't free 'skb' even though it is one of the original - * fragments, as we're going to morph it into the head. - */ - skb_get(skb); - nf_ct_frag6_consume_orig(reasm); - key->ip.proto = ipv6_hdr(reasm)->nexthdr; - skb_morph(skb, reasm); - skb->next = reasm->next; - consume_skb(reasm); ovs_cb.dp_cb.mru = IP6CB(skb)->frag_max_size; #endif /* IP frag support */ } else { diff --git a/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h b/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h index fe99ced37227..a3b86dab2c9c 100644 --- a/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h +++ b/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h @@ -16,17 +16,17 @@ #define OVS_NF_DEFRAG6_BACKPORT 1 struct sk_buff *rpl_nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user); +#define nf_ct_frag6_gather rpl_nf_ct_frag6_gather +#endif /* HAVE_NF_CT_FRAG6_CONSUME_ORIG */ + +#ifdef OVS_NF_DEFRAG6_BACKPORT int __init rpl_nf_ct_frag6_init(void); void rpl_nf_ct_frag6_cleanup(void); -void rpl_nf_ct_frag6_consume_orig(struct sk_buff *skb); -#define nf_ct_frag6_gather rpl_nf_ct_frag6_gather -#else /* HAVE_NF_CT_FRAG6_CONSUME_ORIG */ +#else /* !OVS_NF_DEFRAG6_BACKPORT */ static inline int __init rpl_nf_ct_frag6_init(void) { return 0; } static inline void rpl_nf_ct_frag6_cleanup(void) { } -static inline void rpl_nf_ct_frag6_consume_orig(struct sk_buff *skb) { } -#endif /* HAVE_NF_CT_FRAG6_CONSUME_ORIG */ +#endif /* OVS_NF_DEFRAG6_BACKPORT */ #define nf_ct_frag6_init rpl_nf_ct_frag6_init #define nf_ct_frag6_cleanup rpl_nf_ct_frag6_cleanup -#define nf_ct_frag6_consume_orig rpl_nf_ct_frag6_consume_orig #endif /* __NF_DEFRAG_IPV6_WRAPPER_H */ diff --git a/datapath/linux/compat/nf_conntrack_reasm.c b/datapath/linux/compat/nf_conntrack_reasm.c index 701bd15d8efd..c6dc7ebec5b5 100644 --- a/datapath/linux/compat/nf_conntrack_reasm.c +++ b/datapath/linux/compat/nf_conntrack_reasm.c @@ -62,7 +62,6 @@ struct nf_ct_frag6_skb_cb { struct inet6_skb_parm h; int offset; - struct sk_buff *orig; }; #define NFCT_FRAG6_CB(skb) ((struct nf_ct_frag6_skb_cb*)((skb)->cb)) @@ -94,12 +93,6 @@ static unsigned int nf_hashfn(struct inet_frag_queue *q) return nf_hash_frag(nq->id, &nq->saddr, &nq->daddr); } -static void nf_skb_free(struct sk_buff *skb) -{ - if (NFCT_FRAG6_CB(skb)->orig) - kfree_skb(NFCT_FRAG6_CB(skb)->orig); -} - static void nf_ct_frag6_expire(unsigned long data) { struct frag_queue *fq; @@ -300,9 +293,9 @@ err: * the last and the first frames arrived and all the bits are here. */ static struct sk_buff * -nf_ct_frag6_reasm(struct frag_queue *fq, struct net_device *dev) +nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *prev, struct net_device *dev) { - struct sk_buff *fp, *op, *head = fq->q.fragments; + struct sk_buff *fp, *head = fq->q.fragments; int payload_len; u8 ecn; @@ -353,10 +346,38 @@ nf_ct_frag6_reasm(struct frag_queue *fq, struct net_device *dev) clone->csum = 0; clone->ip_summed = head->ip_summed; - NFCT_FRAG6_CB(clone)->orig = NULL; add_frag_mem_limit(fq->q.net, clone->truesize); } + /* morph head into last received skb: prev. + * + * This allows callers of ipv6 conntrack defrag to continue + * to use the last skb(frag) passed into the reasm engine. + * The last skb frag 'silently' turns into the full reassembled skb. + * + * Since prev is also part of q->fragments we have to clone it first. + */ + if (head != prev) { + struct sk_buff *iter; + + fp = skb_clone(prev, GFP_ATOMIC); + if (!fp) + goto out_oom; + + fp->next = prev->next; + skb_queue_walk(head, iter) { + if (iter->next != prev) + continue; + iter->next = fp; + break; + } + + skb_morph(prev, head); + prev->next = head->next; + consume_skb(head); + head = prev; + } + /* We have to remove fragment header from datagram and to relocate * header in order to calculate ICV correctly. */ skb_network_header(head)[fq->nhoffset] = skb_transport_header(head)[0]; @@ -397,21 +418,6 @@ nf_ct_frag6_reasm(struct frag_queue *fq, struct net_device *dev) fq->q.fragments = NULL; fq->q.fragments_tail = NULL; - /* all original skbs are linked into the NFCT_FRAG6_CB(head).orig */ - fp = skb_shinfo(head)->frag_list; - if (fp && NFCT_FRAG6_CB(fp)->orig == NULL) - /* at above code, head skb is divided into two skbs. */ - fp = fp->next; - - op = NFCT_FRAG6_CB(head)->orig; - for (; fp; fp = fp->next) { - struct sk_buff *orig = NFCT_FRAG6_CB(fp)->orig; - - op->next = orig; - op = orig; - NFCT_FRAG6_CB(fp)->orig = NULL; - } - return head; out_oversize: @@ -490,7 +496,6 @@ find_prev_fhdr(struct sk_buff *skb, u8 *prevhdrp, int *prevhoff, int *fhoff) struct sk_buff *rpl_nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user) { - struct sk_buff *clone; struct net_device *dev = skb->dev; struct frag_hdr *fhdr; struct frag_queue *fq; @@ -508,42 +513,30 @@ struct sk_buff *rpl_nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, if (find_prev_fhdr(skb, &prevhdr, &nhoff, &fhoff) < 0) return skb; - clone = skb_clone(skb, GFP_ATOMIC); - if (clone == NULL) { - pr_debug("Can't clone skb\n"); + if (!pskb_may_pull(skb, fhoff + sizeof(*fhdr))) return skb; - } - NFCT_FRAG6_CB(clone)->orig = skb; - - if (!pskb_may_pull(clone, fhoff + sizeof(*fhdr))) { - pr_debug("message is too short.\n"); - goto ret_orig; - } - - skb_set_transport_header(clone, fhoff); - hdr = ipv6_hdr(clone); - fhdr = (struct frag_hdr *)skb_transport_header(clone); + skb_set_transport_header(skb, fhoff); + hdr = ipv6_hdr(skb); + fhdr = (struct frag_hdr *)skb_transport_header(skb); fq = fq_find(net, fhdr->identification, user, &hdr->saddr, &hdr->daddr, ip6_frag_ecn(hdr)); - if (fq == NULL) { - pr_debug("Can't find and can't create new queue\n"); - goto ret_orig; - } + if (fq == NULL) + return skb; spin_lock_bh(&fq->q.lock); - if (nf_ct_frag6_queue(fq, clone, fhdr, nhoff) < 0) { + if (nf_ct_frag6_queue(fq, skb, fhdr, nhoff) < 0) { spin_unlock_bh(&fq->q.lock); pr_debug("Can't insert skb to queue\n"); inet_frag_put(&fq->q, &nf_frags); - goto ret_orig; + return skb; } if (qp_flags(fq) == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) && fq->q.meat == fq->q.len) { - ret_skb = nf_ct_frag6_reasm(fq, dev); + ret_skb = nf_ct_frag6_reasm(fq, skb, dev); if (ret_skb == NULL) pr_debug("Can't reassemble fragmented packets\n"); } @@ -551,10 +544,6 @@ struct sk_buff *rpl_nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, inet_frag_put(&fq->q, &nf_frags); return ret_skb; - -ret_orig: - kfree_skb(clone); - return skb; } EXPORT_SYMBOL_GPL(rpl_nf_ct_frag6_gather); @@ -590,18 +579,6 @@ static bool rpl_ip6_frag_match(struct inet_frag_queue *q, void *a) ipv6_addr_equal(&fq->daddr, arg->dst); } -void nf_ct_frag6_consume_orig(struct sk_buff *skb) -{ - struct sk_buff *s, *s2; - - for (s = NFCT_FRAG6_CB(skb)->orig; s;) { - s2 = s->next; - s->next = NULL; - consume_skb(s); - s = s2; - } -} - static int nf_ct_net_init(struct net *net) { nf_defrag_ipv6_enable(); @@ -626,7 +603,6 @@ int rpl_nf_ct_frag6_init(void) nf_frags.hashfn = nf_hashfn; nf_frags.constructor = rpl_ip6_frag_init; nf_frags.destructor = NULL; - nf_frags.skb_free = nf_skb_free; nf_frags.qsize = sizeof(struct frag_queue); nf_frags.match = rpl_ip6_frag_match; nf_frags.frag_expire = nf_ct_frag6_expire;