From patchwork Sat Nov  7 19:59:49 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joe Stringer <joestringer@nicira.com>
X-Patchwork-Id: 541369
Return-Path: <dev-bounces@openvswitch.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from archives.nicira.com (unknown
	[IPv6:2600:3c00::f03c:91ff:fe6e:bdf7])
	by ozlabs.org (Postfix) with ESMTP id 907C51402C6
	for <incoming@patchwork.ozlabs.org>;
	Sun,  8 Nov 2015 07:01:21 +1100 (AEDT)
Authentication-Results: ozlabs.org;
	dkim=fail reason="signature verification failed" (2048-bit key;
	unprotected) header.d=nicira_com.20150623.gappssmtp.com
	header.i=@nicira_com.20150623.gappssmtp.com header.b=xA0XaCWI;
	dkim-atps=neutral
Received: from archives.nicira.com (localhost [127.0.0.1])
	by archives.nicira.com (Postfix) with ESMTP id 301CD10A65;
	Sat,  7 Nov 2015 12:00:30 -0800 (PST)
X-Original-To: dev@openvswitch.org
Delivered-To: dev@openvswitch.org
Received: from mx1e4.cudamail.com (mx1.cudamail.com [69.90.118.67])
	by archives.nicira.com (Postfix) with ESMTPS id 5A82E10A5C
	for <dev@openvswitch.org>; Sat,  7 Nov 2015 12:00:29 -0800 (PST)
Received: from bar5.cudamail.com (unknown [192.168.21.12])
	by mx1e4.cudamail.com (Postfix) with ESMTPS id CDBF01E00CF
	for <dev@openvswitch.org>; Sat,  7 Nov 2015 13:00:28 -0700 (MST)
X-ASG-Debug-ID: 1446926428-09eadd03648fb60001-byXFYA
Received: from mx1-pf2.cudamail.com ([192.168.24.2]) by bar5.cudamail.com
	with
	ESMTP id hp9V0aQfHg0hkUl8 (version=TLSv1 cipher=DHE-RSA-AES256-SHA
	bits=256 verify=NO) for <dev@openvswitch.org>;
	Sat, 07 Nov 2015 13:00:28 -0700 (MST)
X-Barracuda-Envelope-From: joestringer@nicira.com
X-Barracuda-RBL-Trusted-Forwarder: 192.168.24.2
Received: from unknown (HELO mail-pa0-f46.google.com) (209.85.220.46)
	by mx1-pf2.cudamail.com with ESMTPS (RC4-SHA encrypted);
	7 Nov 2015 20:00:28 -0000
Received-SPF: unknown (mx1-pf2.cudamail.com: Multiple SPF records returned)
X-Barracuda-Apparent-Source-IP: 209.85.220.46
X-Barracuda-RBL-IP: 209.85.220.46
Received: by pasz6 with SMTP id z6so160772004pas.2
	for <dev@openvswitch.org>; Sat, 07 Nov 2015 12:00:27 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=nicira_com.20150623.gappssmtp.com; s=20150623;
	h=from:to:subject:date:message-id:in-reply-to:references;
	bh=qGrVurMdhoP7xi6SwutXDOIGQ11QjePkpmzYWlWQlEk=;
	b=xA0XaCWIJ/UOz49lyzyBxTJDtBkkgGg5X3GpXqMmpXLqshrGno5Bey+PNB0WbTW2pi
	4MvbbfZzjbGTfQEK3IpGce2eSCVPu1ftkfPk+5+jHRLNoJuQ1/WzM3NU2AhHSj43UvAh
	iKNXzjxocUGi86Zl1WuJb7xzGy2aSbARlUBY9FxF3WjxT1KLqn7odUsYNbITi1vEp2Yg
	EaYXJe4HkAKCKeSPZfCp27nQ1Rujr1qx4yV6XAkt6IcTBT/yehf0EmfuFQTSkyrJwCnP
	L1ePuGmRGaUxALPqBKmIHt9rJB9C/5+kNFJDlJY1gg+VubclAcIUvtrXK2HUy+lKFB58
	Crgw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20130820;
	h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to
	:references;
	bh=qGrVurMdhoP7xi6SwutXDOIGQ11QjePkpmzYWlWQlEk=;
	b=RElnzVngBc8/MzJnVoAZd0LI9haOTZW1d8A5SfEaWkv1NHxZMHPQhp55lktT1HvwmM
	piUfNA/Ndb0x+VtqpVvlPYaQ3wzBfk+dODVM0E7c88bZ3IqTYgtEad5vXrqi+4aGZFbP
	lfPD3+ZUQv+9hjR9tOW7QrnALumMtnYv7RtsPq+XmwCnNOYmsiwh4GY1Qke3tFPSFjDR
	XBxw7jF+UwucYuNbTeYxjObScN5fOY+FiVMKyaXQsVoINWHQ80TyL5raDNWkSD2m3Wrp
	s52qqYtdW7E+XtqYAAr0WoFcmbnOchOlrC1KZ675u8tCEPfHetqHPPmJfLwSN9hjD7t2
	Im+g==
X-Gm-Message-State: 
 ALoCoQmql7/oK/q3kRtr/eV2nWtCKjLAdnoxHF47azls2lOJXyF2tHrkjsF9BZX1U6RPiD1gN4F7
X-Received: by 10.66.221.105 with SMTP id qd9mr28055969pac.46.1446926427625;
	Sat, 07 Nov 2015 12:00:27 -0800 (PST)
Received: from localhost.localdomain ([208.91.2.4])
	by smtp.gmail.com with ESMTPSA id
	nu5sm7312219pbb.65.2015.11.07.12.00.26 for <dev@openvswitch.org>
	(version=TLSv1.2 cipher=ECDHE-RSA-AES128-SHA bits=128/128);
	Sat, 07 Nov 2015 12:00:26 -0800 (PST)
X-CudaMail-Envelope-Sender: joestringer@nicira.com
From: Joe Stringer <joestringer@nicira.com>
To: dev@openvswitch.org
X-CudaMail-Whitelist-To: dev@openvswitch.org
X-CudaMail-MID: CM-E2-1106023132
X-CudaMail-DTE: 110715
X-CudaMail-Originating-IP: 209.85.220.46
Date: Sat,  7 Nov 2015 11:59:49 -0800
X-ASG-Orig-Subj: [##CM-E2-1106023132##][PATCH 11/23] compat: Backport IPv6
	reassembly
Message-Id: <1446926401-55723-12-git-send-email-joestringer@nicira.com>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1446926401-55723-1-git-send-email-joestringer@nicira.com>
References: <1446926401-55723-1-git-send-email-joestringer@nicira.com>
X-Barracuda-Connect: UNKNOWN[192.168.24.2]
X-Barracuda-Start-Time: 1446926428
X-Barracuda-Encrypted: DHE-RSA-AES256-SHA
X-Barracuda-URL: https://web.cudamail.com:443/cgi-mod/mark.cgi
X-ASG-Whitelist: Header =?UTF-8?B?eFwtY3VkYW1haWxcLXdoaXRlbGlzdFwtdG8=?=
X-Virus-Scanned: by bsmtpd at cudamail.com
X-Barracuda-BRTS-Status: 1
Subject: [ovs-dev] [PATCH 11/23] compat: Backport IPv6 reassembly
X-BeenThere: dev@openvswitch.org
X-Mailman-Version: 2.1.16
Precedence: list
List-Id: <dev.openvswitch.org>
List-Unsubscribe: <http://openvswitch.org/mailman/options/dev>,
	<mailto:dev-request@openvswitch.org?subject=unsubscribe>
List-Archive: <http://openvswitch.org/pipermail/dev>
List-Post: <mailto:dev@openvswitch.org>
List-Help: <mailto:dev-request@openvswitch.org?subject=help>
List-Subscribe: <http://openvswitch.org/mailman/listinfo/dev>,
	<mailto:dev-request@openvswitch.org?subject=subscribe>
Errors-To: dev-bounces@openvswitch.org
Sender: "dev" <dev-bounces@openvswitch.org>

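Backport IPv6 fragment reassembly from the Linux 4.3 development tree,
based on upstream commit 5b490047240f ("ipv6: Export
nf_ct_frag6_gather()"), for kernels older than 4.3 built with
OVS_FRAGMENT_BACKPORT. compat_init() now also calls nf_ct_frag6_init()
and ip6_output_init(), and unwinds them in reverse order on failure.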

Signed-off-by: Joe Stringer <joestringer@nicira.com>
---
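Note on the compat_init() change: the new error path tears down only the
stages that were already set up, in reverse order. The user-space sketch
below illustrates that pattern under stated assumptions. stage_init(),
stage_exit(), fail_mask, active_mask and the sketch_* names are
hypothetical stand-ins for illustration only; they are not part of this
patch or of any kernel API.

```c
/* Hypothetical stand-ins for ipfrag_init(), nf_ct_frag6_init() and
 * ip6_output_init(): a stage fails when its bit is set in fail_mask. */
enum {
	STAGE_IPFRAG = 1 << 0,
	STAGE_FRAG6  = 1 << 1,
	STAGE_OUTPUT = 1 << 2,
};

static unsigned int fail_mask;   /* stages that should report failure */
static unsigned int active_mask; /* stages currently initialised */

static int stage_init(unsigned int stage)
{
	if (fail_mask & stage)
		return -1;
	active_mask |= stage;
	return 0;
}

static void stage_exit(unsigned int stage)
{
	active_mask &= ~stage;
}

/* Same control flow as compat_init() in this patch: each failure jumps
 * to a label that undoes only the stages set up before it. */
static int sketch_compat_init(void)
{
	int err;

	err = stage_init(STAGE_IPFRAG);
	if (err)
		return err;

	err = stage_init(STAGE_FRAG6);
	if (err)
		goto error_ipfrag_exit;

	err = stage_init(STAGE_OUTPUT);
	if (err)
		goto error_frag6_exit;

	return 0;

error_frag6_exit:
	stage_exit(STAGE_FRAG6);
error_ipfrag_exit:
	stage_exit(STAGE_IPFRAG);
	return err;
}

/* Teardown mirrors the init order in reverse, as compat_exit() does. */
static void sketch_compat_exit(void)
{
	stage_exit(STAGE_OUTPUT);
	stage_exit(STAGE_FRAG6);
	stage_exit(STAGE_IPFRAG);
}
```

Setting fail_mask to STAGE_OUTPUT before calling sketch_compat_init()
should leave active_mask at zero, which is the property the goto labels
are meant to guarantee.
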
 datapath/compat.h                                  |  28 +-
 datapath/linux/Modules.mk                          |   2 +
 .../include/net/netfilter/ipv6/nf_defrag_ipv6.h    |  32 +
 datapath/linux/compat/nf_conntrack_reasm.c         | 643 +++++++++++++++++++++
 4 files changed, 704 insertions(+), 1 deletion(-)
 create mode 100644 datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
 create mode 100644 datapath/linux/compat/nf_conntrack_reasm.c
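
Note on nf_ct_frag6_queue(): the function keeps fragments sorted by
offset and, following RFC 5722, discards the whole queue as soon as a
new fragment overlaps its predecessor or successor. The sketch below
models only that ordering and overlap logic in user space. It is a
simplification under assumptions: the fixed-size array, struct
sketch_frag, struct sketch_queue and sketch_queue_insert() are
hypothetical and replace the skb list and inet_frag machinery that the
real code uses.

```c
#include <stddef.h>

#define SKETCH_MAX_FRAGS 16 /* arbitrary bound for the sketch only */

struct sketch_frag {
	int offset; /* first byte covered within the fragmentable part */
	int len;    /* number of payload bytes carried */
};

struct sketch_queue {
	struct sketch_frag frags[SKETCH_MAX_FRAGS]; /* sorted by offset */
	size_t count;
	int discarded; /* set once an overlap has killed the queue */
};

/* Returns 0 if the fragment was queued, -1 if it was rejected. */
static int sketch_queue_insert(struct sketch_queue *q, int offset, int len)
{
	int end = offset + len;
	size_t pos, i;

	/* Empty fragments are rejected, like the end == offset check. */
	if (q->discarded || len <= 0 || q->count == SKETCH_MAX_FRAGS)
		return -1;

	/* Find the first queued fragment starting at or after 'offset'. */
	for (pos = 0; pos < q->count; pos++)
		if (q->frags[pos].offset >= offset)
			break;

	/* Check for overlap with the preceding fragment. */
	if (pos > 0 &&
	    q->frags[pos - 1].offset + q->frags[pos - 1].len > offset)
		goto discard;

	/* Check for overlap with the succeeding fragment. */
	if (pos < q->count && q->frags[pos].offset < end)
		goto discard;

	/* Shift the tail up one slot and insert in sorted position. */
	for (i = q->count; i > pos; i--)
		q->frags[i] = q->frags[i - 1];
	q->frags[pos].offset = offset;
	q->frags[pos].len = len;
	q->count++;
	return 0;

discard:
	/* Any overlap invalidates everything received so far. */
	q->discarded = 1;
	q->count = 0;
	return -1;
}
```

Inserting fragments at offsets 0, 16 and 8 (8 bytes each) should be
accepted in any order, while a later fragment at offset 20 overlaps the
one at 16 and should discard the queue.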

diff --git a/datapath/compat.h b/datapath/compat.h
index a6404c601f7a..3cbd121f29cd 100644
--- a/datapath/compat.h
+++ b/datapath/compat.h
@@ -25,6 +25,7 @@
 #include <net/ip.h>
 #include <net/route.h>
 #include <net/xfrm.h>
+#include <net/netfilter/ipv6/nf_defrag_ipv6.h>
 
 #ifdef HAVE_GENL_MULTICAST_GROUP_WITH_ID
 #define GROUP_ID(grp)	((grp)->id)
@@ -54,12 +55,37 @@ static inline bool skb_encapsulation(struct sk_buff *skb)
 #endif
 
 #ifdef OVS_FRAGMENT_BACKPORT
+int __init ip6_output_init(void);
+void ip6_output_exit(void);
+
 static inline int __init compat_init(void)
 {
-	return ipfrag_init();
+	int err;
+
+	err = ipfrag_init();
+	if (err)
+		return err;
+
+	err = nf_ct_frag6_init();
+	if (err)
+		goto error_ipfrag_exit;
+
+	err = ip6_output_init();
+	if (err)
+		goto error_frag6_exit;
+
+	return 0;
+
+error_frag6_exit:
+	nf_ct_frag6_cleanup();
+error_ipfrag_exit:
+	rpl_ipfrag_fini();
+	return err;
 }
 static inline void compat_exit(void)
 {
+	ip6_output_exit();
+	nf_ct_frag6_cleanup();
 	rpl_ipfrag_fini();
 }
 #else
diff --git a/datapath/linux/Modules.mk b/datapath/linux/Modules.mk
index 2b12ec5b89e6..bff549c3a60a 100644
--- a/datapath/linux/Modules.mk
+++ b/datapath/linux/Modules.mk
@@ -17,6 +17,7 @@ openvswitch_sources += \
 	linux/compat/netdevice.c \
 	linux/compat/net_namespace.c \
 	linux/compat/nf_conntrack_core.c \
+	linux/compat/nf_conntrack_reasm.c \
 	linux/compat/reciprocal_div.c \
 	linux/compat/skbuff-openvswitch.c \
 	linux/compat/socket.c \
@@ -102,5 +103,6 @@ openvswitch_headers += \
 	linux/compat/include/net/netfilter/nf_conntrack_expect.h \
 	linux/compat/include/net/netfilter/nf_conntrack_labels.h \
 	linux/compat/include/net/netfilter/nf_conntrack_zones.h \
+	linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h \
 	linux/compat/include/net/sctp/checksum.h
 EXTRA_DIST += linux/compat/build-aux/export-check-whitelist
diff --git a/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h b/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
new file mode 100644
index 000000000000..7d51491a9c1b
--- /dev/null
+++ b/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
@@ -0,0 +1,32 @@
+#ifndef _NF_DEFRAG_IPV6_WRAPPER_H
+#define _NF_DEFRAG_IPV6_WRAPPER_H
+
+#include <linux/kconfig.h>
+
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,37)
+#include_next <net/netfilter/ipv6/nf_defrag_ipv6.h>
+#endif
+
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4,3,0)
+#if defined(OVS_FRAGMENT_BACKPORT)
+struct sk_buff *rpl_nf_ct_frag6_gather(struct sk_buff *skb, u32 user);
+int __init rpl_nf_ct_frag6_init(void);
+void rpl_nf_ct_frag6_cleanup(void);
+void rpl_nf_ct_frag6_consume_orig(struct sk_buff *skb);
+#else /* !OVS_FRAGMENT_BACKPORT */
+static inline struct sk_buff *rpl_nf_ct_frag6_gather(struct sk_buff *skb,
+						     u32 user)
+{
+	return skb;
+}
+static inline int __init rpl_nf_ct_frag6_init(void) { return 0; }
+static inline void rpl_nf_ct_frag6_cleanup(void) { }
+static inline void rpl_nf_ct_frag6_consume_orig(struct sk_buff *skb) { }
+#endif /* OVS_FRAGMENT_BACKPORT */
+#define nf_ct_frag6_gather rpl_nf_ct_frag6_gather
+#define nf_ct_frag6_init rpl_nf_ct_frag6_init
+#define nf_ct_frag6_cleanup rpl_nf_ct_frag6_cleanup
+#define nf_ct_frag6_consume_orig rpl_nf_ct_frag6_consume_orig
+#endif /* < 4.3 */
+
+#endif /* _NF_DEFRAG_IPV6_WRAPPER_H */
diff --git a/datapath/linux/compat/nf_conntrack_reasm.c b/datapath/linux/compat/nf_conntrack_reasm.c
new file mode 100644
index 000000000000..1f7deba01d8f
--- /dev/null
+++ b/datapath/linux/compat/nf_conntrack_reasm.c
@@ -0,0 +1,643 @@
+/*
+ * Backported from upstream commit 5b490047240f
+ * ("ipv6: Export nf_ct_frag6_gather()")
+ *
+ * IPv6 fragment reassembly for connection tracking
+ *
+ * Copyright (C)2004 USAGI/WIDE Project
+ *
+ * Author:
+ *	Yasuyuki Kozakai @USAGI <yasuyuki.kozakai@toshiba.co.jp>
+ *
+ * Based on: net/ipv6/reassembly.c
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#define pr_fmt(fmt) "IPv6-nf: " fmt
+
+#include <linux/version.h>
+
+#ifdef OVS_FRAGMENT_BACKPORT
+
+#include <linux/errno.h>
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/socket.h>
+#include <linux/sockios.h>
+#include <linux/jiffies.h>
+#include <linux/net.h>
+#include <linux/list.h>
+#include <linux/netdevice.h>
+#include <linux/in6.h>
+#include <linux/ipv6.h>
+#include <linux/icmpv6.h>
+#include <linux/random.h>
+#include <linux/slab.h>
+
+#include <net/sock.h>
+#include <net/snmp.h>
+#include <net/inet_frag.h>
+
+#include <net/ipv6.h>
+#include <net/protocol.h>
+#include <net/transp_v6.h>
+#include <net/rawv6.h>
+#include <net/ndisc.h>
+#include <net/addrconf.h>
+#include <net/inet_ecn.h>
+#include <net/netfilter/ipv6/nf_conntrack_ipv6.h>
+#include <linux/netfilter.h>
+#include <linux/netfilter_ipv6.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <net/netfilter/ipv6/nf_defrag_ipv6.h>
+
+static const char nf_frags_cache_name[] = "nf-frags";
+
+struct nf_ct_frag6_skb_cb
+{
+	struct inet6_skb_parm	h;
+	int			offset;
+	struct sk_buff		*orig;
+};
+
+#define NFCT_FRAG6_CB(skb)	((struct nf_ct_frag6_skb_cb*)((skb)->cb))
+
+static struct inet_frags nf_frags;
+
+static inline u8 ip6_frag_ecn(const struct ipv6hdr *ipv6h)
+{
+	return 1 << (ipv6_get_dsfield(ipv6h) & INET_ECN_MASK);
+}
+
+static unsigned int nf_hash_frag(__be32 id, const struct in6_addr *saddr,
+				 const struct in6_addr *daddr)
+{
+	net_get_random_once(&nf_frags.rnd, sizeof(nf_frags.rnd));
+	return jhash_3words(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr),
+			    (__force u32)id, nf_frags.rnd);
+}
+
+
+static unsigned int nf_hashfn(const struct inet_frag_queue *q)
+{
+	const struct frag_queue *nq;
+
+	nq = container_of(q, struct frag_queue, q);
+	return nf_hash_frag(nq->id, &nq->saddr, &nq->daddr);
+}
+
+static void nf_skb_free(struct sk_buff *skb)
+{
+	if (NFCT_FRAG6_CB(skb)->orig)
+		kfree_skb(NFCT_FRAG6_CB(skb)->orig);
+}
+
+static void nf_ct_frag6_expire(unsigned long data)
+{
+	struct frag_queue *fq;
+	struct net *net;
+
+	fq = container_of((struct inet_frag_queue *)data, struct frag_queue, q);
+	net = container_of(fq->q.net, struct net, nf_frag.frags);
+
+	ip6_expire_frag_queue(net, fq, &nf_frags);
+}
+
+/* Creation primitives. */
+static inline struct frag_queue *fq_find(struct net *net, __be32 id,
+					 u32 user, struct in6_addr *src,
+					 struct in6_addr *dst, u8 ecn)
+{
+	struct inet_frag_queue *q;
+	struct ip6_create_arg arg;
+	unsigned int hash;
+
+	arg.id = id;
+	arg.user = user;
+	arg.src = src;
+	arg.dst = dst;
+	arg.ecn = ecn;
+
+	local_bh_disable();
+	hash = nf_hash_frag(id, src, dst);
+
+	q = inet_frag_find(&net->nf_frag.frags, &nf_frags, &arg, hash);
+	local_bh_enable();
+	if (IS_ERR_OR_NULL(q)) {
+		inet_frag_maybe_warn_overflow(q, pr_fmt());
+		return NULL;
+	}
+	return container_of(q, struct frag_queue, q);
+}
+
+
+static int nf_ct_frag6_queue(struct frag_queue *fq, struct sk_buff *skb,
+			     const struct frag_hdr *fhdr, int nhoff)
+{
+	struct sk_buff *prev, *next;
+	unsigned int payload_len;
+	int offset, end;
+	u8 ecn;
+
+	if (qp_flags(fq) & INET_FRAG_COMPLETE) {
+		pr_debug("Already completed\n");
+		goto err;
+	}
+
+	payload_len = ntohs(ipv6_hdr(skb)->payload_len);
+
+	offset = ntohs(fhdr->frag_off) & ~0x7;
+	end = offset + (payload_len -
+			((u8 *)(fhdr + 1) - (u8 *)(ipv6_hdr(skb) + 1)));
+
+	if ((unsigned int)end > IPV6_MAXPLEN) {
+		pr_debug("offset is too large.\n");
+		return -1;
+	}
+
+	ecn = ip6_frag_ecn(ipv6_hdr(skb));
+
+	if (skb->ip_summed == CHECKSUM_COMPLETE) {
+		const unsigned char *nh = skb_network_header(skb);
+		skb->csum = csum_sub(skb->csum,
+				     csum_partial(nh, (u8 *)(fhdr + 1) - nh,
+						  0));
+	}
+
+	/* Is this the final fragment? */
+	if (!(fhdr->frag_off & htons(IP6_MF))) {
+		/* If we already have some bits beyond end
+		 * or have different end, the segment is corrupted.
+		 */
+		if (end < fq->q.len ||
+		    ((qp_flags(fq) & INET_FRAG_LAST_IN) && end != fq->q.len)) {
+			pr_debug("already received last fragment\n");
+			goto err;
+		}
+		qp_flags(fq) |= INET_FRAG_LAST_IN;
+		fq->q.len = end;
+	} else {
+		/* Check if the fragment is rounded to 8 bytes.
+		 * Required by the RFC.
+		 */
+		if (end & 0x7) {
+			/* RFC2460 says always send parameter problem in
+			 * this case. -DaveM
+			 */
+			pr_debug("end of fragment not rounded to 8 bytes.\n");
+			return -1;
+		}
+		if (end > fq->q.len) {
+			/* Some bits beyond end -> corruption. */
+			if (qp_flags(fq) & INET_FRAG_LAST_IN) {
+				pr_debug("last packet already reached.\n");
+				goto err;
+			}
+			fq->q.len = end;
+		}
+	}
+
+	if (end == offset)
+		goto err;
+
+	/* Point into the IP datagram 'data' part. */
+	if (!pskb_pull(skb, (u8 *) (fhdr + 1) - skb->data)) {
+		pr_debug("queue: message is too short.\n");
+		goto err;
+	}
+	if (pskb_trim_rcsum(skb, end - offset)) {
+		pr_debug("Can't trim\n");
+		goto err;
+	}
+
+	/* Find out which fragments are in front and at the back of us
+	 * in the chain of fragments so far.  We must know where to put
+	 * this fragment, right?
+	 */
+	prev = fq->q.fragments_tail;
+	if (!prev || NFCT_FRAG6_CB(prev)->offset < offset) {
+		next = NULL;
+		goto found;
+	}
+	prev = NULL;
+	for (next = fq->q.fragments; next != NULL; next = next->next) {
+		if (NFCT_FRAG6_CB(next)->offset >= offset)
+			break;	/* bingo! */
+		prev = next;
+	}
+
+found:
+	/* RFC5722, Section 4:
+	 *                                  When reassembling an IPv6 datagram, if
+	 *   one or more its constituent fragments is determined to be an
+	 *   overlapping fragment, the entire datagram (and any constituent
+	 *   fragments, including those not yet received) MUST be silently
+	 *   discarded.
+	 */
+
+	/* Check for overlap with preceding fragment. */
+	if (prev &&
+	    (NFCT_FRAG6_CB(prev)->offset + prev->len) > offset)
+		goto discard_fq;
+
+	/* Look for overlap with succeeding segment. */
+	if (next && NFCT_FRAG6_CB(next)->offset < end)
+		goto discard_fq;
+
+	NFCT_FRAG6_CB(skb)->offset = offset;
+
+	/* Insert this fragment in the chain of fragments. */
+	skb->next = next;
+	if (!next)
+		fq->q.fragments_tail = skb;
+	if (prev)
+		prev->next = skb;
+	else
+		fq->q.fragments = skb;
+
+	if (skb->dev) {
+		fq->iif = skb->dev->ifindex;
+		skb->dev = NULL;
+	}
+	fq->q.stamp = skb->tstamp;
+	fq->q.meat += skb->len;
+	fq->ecn |= ecn;
+	if (payload_len > fq->q.max_size)
+		fq->q.max_size = payload_len;
+	add_frag_mem_limit(fq->q.net, skb->truesize);
+
+	/* The first fragment.
+	 * nhoffset is obtained from the first fragment, of course.
+	 */
+	if (offset == 0) {
+		fq->nhoffset = nhoff;
+		qp_flags(fq) |= INET_FRAG_FIRST_IN;
+	}
+
+	return 0;
+
+discard_fq:
+	inet_frag_kill(&fq->q, &nf_frags);
+err:
+	return -1;
+}
+
+/*
+ *	Reassemble the queued fragments into a single packet.
+ *	Returns NULL on failure for any reason, otherwise a pointer
+ *	to the head skb of the reassembled packet.
+ *
+ *	It is called with fq locked, and the caller must check that the
+ *	queue is eligible for reassembly, i.e. it is not COMPLETE, the
+ *	first and last fragments have arrived and all the bits are here.
+ */
+static struct sk_buff *
+nf_ct_frag6_reasm(struct frag_queue *fq, struct net_device *dev)
+{
+	struct sk_buff *fp, *op, *head = fq->q.fragments;
+	int payload_len;
+	u8 ecn;
+
+	inet_frag_kill(&fq->q, &nf_frags);
+
+	WARN_ON(head == NULL);
+	WARN_ON(NFCT_FRAG6_CB(head)->offset != 0);
+
+	ecn = ip_frag_ecn_table[fq->ecn];
+	if (unlikely(ecn == 0xff))
+		goto out_fail;
+
+	/* Unfragmented part is taken from the first segment. */
+	payload_len = ((head->data - skb_network_header(head)) -
+		       sizeof(struct ipv6hdr) + fq->q.len -
+		       sizeof(struct frag_hdr));
+	if (payload_len > IPV6_MAXPLEN) {
+		pr_debug("payload len is too large.\n");
+		goto out_oversize;
+	}
+
+	/* Head of list must not be cloned. */
+	if (skb_unclone(head, GFP_ATOMIC)) {
+		pr_debug("skb is cloned but can't expand head\n");
+		goto out_oom;
+	}
+
+	/* If the first fragment is fragmented itself, we split
+	 * it to two chunks: the first with data and paged part
+	 * and the second, holding only fragments. */
+	if (skb_has_frag_list(head)) {
+		struct sk_buff *clone;
+		int i, plen = 0;
+
+		clone = alloc_skb(0, GFP_ATOMIC);
+		if (clone == NULL)
+			goto out_oom;
+
+		clone->next = head->next;
+		head->next = clone;
+		skb_shinfo(clone)->frag_list = skb_shinfo(head)->frag_list;
+		skb_frag_list_init(head);
+		for (i = 0; i < skb_shinfo(head)->nr_frags; i++)
+			plen += skb_frag_size(&skb_shinfo(head)->frags[i]);
+		clone->len = clone->data_len = head->data_len - plen;
+		head->data_len -= clone->len;
+		head->len -= clone->len;
+		clone->csum = 0;
+		clone->ip_summed = head->ip_summed;
+
+		NFCT_FRAG6_CB(clone)->orig = NULL;
+		add_frag_mem_limit(fq->q.net, clone->truesize);
+	}
+
+	/* We have to remove fragment header from datagram and to relocate
+	 * header in order to calculate ICV correctly. */
+	skb_network_header(head)[fq->nhoffset] = skb_transport_header(head)[0];
+	memmove(head->head + sizeof(struct frag_hdr), head->head,
+		(head->data - head->head) - sizeof(struct frag_hdr));
+	head->mac_header += sizeof(struct frag_hdr);
+	head->network_header += sizeof(struct frag_hdr);
+
+	skb_shinfo(head)->frag_list = head->next;
+	skb_reset_transport_header(head);
+	skb_push(head, head->data - skb_network_header(head));
+
+	for (fp = head->next; fp; fp = fp->next) {
+		head->data_len += fp->len;
+		head->len += fp->len;
+		if (head->ip_summed != fp->ip_summed)
+			head->ip_summed = CHECKSUM_NONE;
+		else if (head->ip_summed == CHECKSUM_COMPLETE)
+			head->csum = csum_add(head->csum, fp->csum);
+		head->truesize += fp->truesize;
+	}
+	sub_frag_mem_limit(fq->q.net, head->truesize);
+
+	head->ignore_df = 1;
+	head->next = NULL;
+	head->dev = dev;
+	head->tstamp = fq->q.stamp;
+	ipv6_hdr(head)->payload_len = htons(payload_len);
+	ipv6_change_dsfield(ipv6_hdr(head), 0xff, ecn);
+	IP6CB(head)->frag_max_size = sizeof(struct ipv6hdr) + fq->q.max_size;
+
+	/* Yes, and fold redundant checksum back. 8) */
+	if (head->ip_summed == CHECKSUM_COMPLETE)
+		head->csum = csum_partial(skb_network_header(head),
+					  skb_network_header_len(head),
+					  head->csum);
+
+	fq->q.fragments = NULL;
+	fq->q.fragments_tail = NULL;
+
+	/* all original skbs are linked into the NFCT_FRAG6_CB(head).orig */
+	fp = skb_shinfo(head)->frag_list;
+	if (fp && NFCT_FRAG6_CB(fp)->orig == NULL)
+		/* The head skb was split in two above; skip the extra skb. */
+		fp = fp->next;
+
+	op = NFCT_FRAG6_CB(head)->orig;
+	for (; fp; fp = fp->next) {
+		struct sk_buff *orig = NFCT_FRAG6_CB(fp)->orig;
+
+		op->next = orig;
+		op = orig;
+		NFCT_FRAG6_CB(fp)->orig = NULL;
+	}
+
+	return head;
+
+out_oversize:
+	net_dbg_ratelimited("nf_ct_frag6_reasm: payload len = %d\n",
+			    payload_len);
+	goto out_fail;
+out_oom:
+	net_dbg_ratelimited("nf_ct_frag6_reasm: no memory for reassembly\n");
+out_fail:
+	return NULL;
+}
+
+/*
+ * Find the header just before the Fragment Header.
+ *
+ * On success, return 0 and set:
+ * (*prevhdrp): the value of the "Next Header" field in the header
+ *		just before the Fragment Header.
+ * (*prevhoff): the offset of the "Next Header" field in the header
+ *		just before the Fragment Header.
+ * (*fhoff)   : the offset of the Fragment Header.
+ * On failure, return -1 and leave the outputs untouched.
+ *
+ * Based on ipv6_skip_exthdr() in net/ipv6/exthdrs_core.c
+ */
+static int
+find_prev_fhdr(struct sk_buff *skb, u8 *prevhdrp, int *prevhoff, int *fhoff)
+{
+	u8 nexthdr = ipv6_hdr(skb)->nexthdr;
+	const int netoff = skb_network_offset(skb);
+	u8 prev_nhoff = netoff + offsetof(struct ipv6hdr, nexthdr);
+	int start = netoff + sizeof(struct ipv6hdr);
+	int len = skb->len - start;
+	u8 prevhdr = NEXTHDR_IPV6;
+
+	while (nexthdr != NEXTHDR_FRAGMENT) {
+		struct ipv6_opt_hdr hdr;
+		int hdrlen;
+
+		if (!ipv6_ext_hdr(nexthdr)) {
+			return -1;
+		}
+		if (nexthdr == NEXTHDR_NONE) {
+			pr_debug("next header is none\n");
+			return -1;
+		}
+		if (len < (int)sizeof(struct ipv6_opt_hdr)) {
+			pr_debug("too short\n");
+			return -1;
+		}
+		if (skb_copy_bits(skb, start, &hdr, sizeof(hdr)))
+			BUG();
+		if (nexthdr == NEXTHDR_AUTH)
+			hdrlen = (hdr.hdrlen+2)<<2;
+		else
+			hdrlen = ipv6_optlen(&hdr);
+
+		prevhdr = nexthdr;
+		prev_nhoff = start;
+
+		nexthdr = hdr.nexthdr;
+		len -= hdrlen;
+		start += hdrlen;
+	}
+
+	if (len < 0)
+		return -1;
+
+	*prevhdrp = prevhdr;
+	*prevhoff = prev_nhoff;
+	*fhoff = start;
+
+	return 0;
+}
+
+struct sk_buff *rpl_nf_ct_frag6_gather(struct sk_buff *skb, u32 user)
+{
+	struct sk_buff *clone;
+	struct net_device *dev = skb->dev;
+	struct net *net = skb_dst(skb) ? dev_net(skb_dst(skb)->dev)
+				       : dev_net(skb->dev);
+	struct frag_hdr *fhdr;
+	struct frag_queue *fq;
+	struct ipv6hdr *hdr;
+	int fhoff, nhoff;
+	u8 prevhdr;
+	struct sk_buff *ret_skb = NULL;
+
+	/* Jumbo payload inhibits frag. header */
+	if (ipv6_hdr(skb)->payload_len == 0) {
+		pr_debug("payload len = 0\n");
+		return skb;
+	}
+
+	if (find_prev_fhdr(skb, &prevhdr, &nhoff, &fhoff) < 0)
+		return skb;
+
+	clone = skb_clone(skb, GFP_ATOMIC);
+	if (clone == NULL) {
+		pr_debug("Can't clone skb\n");
+		return skb;
+	}
+
+	NFCT_FRAG6_CB(clone)->orig = skb;
+
+	if (!pskb_may_pull(clone, fhoff + sizeof(*fhdr))) {
+		pr_debug("message is too short.\n");
+		goto ret_orig;
+	}
+
+	skb_set_transport_header(clone, fhoff);
+	hdr = ipv6_hdr(clone);
+	fhdr = (struct frag_hdr *)skb_transport_header(clone);
+
+	fq = fq_find(net, fhdr->identification, user, &hdr->saddr, &hdr->daddr,
+		     ip6_frag_ecn(hdr));
+	if (fq == NULL) {
+		pr_debug("Can't find and can't create new queue\n");
+		goto ret_orig;
+	}
+
+	spin_lock_bh(&fq->q.lock);
+
+	if (nf_ct_frag6_queue(fq, clone, fhdr, nhoff) < 0) {
+		spin_unlock_bh(&fq->q.lock);
+		pr_debug("Can't insert skb to queue\n");
+		inet_frag_put(&fq->q, &nf_frags);
+		goto ret_orig;
+	}
+
+	if (qp_flags(fq) == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) &&
+	    fq->q.meat == fq->q.len) {
+		ret_skb = nf_ct_frag6_reasm(fq, dev);
+		if (ret_skb == NULL)
+			pr_debug("Can't reassemble fragmented packets\n");
+	}
+	spin_unlock_bh(&fq->q.lock);
+
+	inet_frag_put(&fq->q, &nf_frags);
+	return ret_skb;
+
+ret_orig:
+	kfree_skb(clone);
+	return skb;
+}
+EXPORT_SYMBOL_GPL(rpl_nf_ct_frag6_gather);
+
+static void rpl_ip6_frag_init(struct inet_frag_queue *q, const void *a)
+{
+	struct frag_queue *fq = container_of(q, struct frag_queue, q);
+	const struct ip6_create_arg *arg = a;
+
+	fq->id = arg->id;
+	fq->user = arg->user;
+	fq->saddr = *arg->src;
+	fq->daddr = *arg->dst;
+	fq->ecn = arg->ecn;
+}
+
+static bool rpl_ip6_frag_match(const struct inet_frag_queue *q, const void *a)
+{
+	const struct frag_queue *fq;
+	const struct ip6_create_arg *arg = a;
+
+	fq = container_of(q, struct frag_queue, q);
+	return	fq->id == arg->id &&
+		fq->user == arg->user &&
+		ipv6_addr_equal(&fq->saddr, arg->src) &&
+		ipv6_addr_equal(&fq->daddr, arg->dst);
+}
+
+void nf_ct_frag6_consume_orig(struct sk_buff *skb)
+{
+	struct sk_buff *s, *s2;
+
+	for (s = NFCT_FRAG6_CB(skb)->orig; s;) {
+		s2 = s->next;
+		s->next = NULL;
+		consume_skb(s);
+		s = s2;
+	}
+}
+
+static int nf_ct_net_init(struct net *net)
+{
+	nf_defrag_ipv6_enable();
+
+	return 0;
+}
+
+static void nf_ct_net_exit(struct net *net)
+{
+	inet_frags_exit_net(&net->ipv6.frags, &nf_frags);
+}
+
+static struct pernet_operations nf_ct_net_ops = {
+	.init = nf_ct_net_init,
+	.exit = nf_ct_net_exit,
+};
+
+int rpl_nf_ct_frag6_init(void)
+{
+	int ret = 0;
+
+	nf_frags.hashfn = nf_hashfn;
+	nf_frags.constructor = rpl_ip6_frag_init;
+	nf_frags.destructor = NULL;
+	nf_frags.skb_free = nf_skb_free;
+	nf_frags.qsize = sizeof(struct frag_queue);
+	nf_frags.match = rpl_ip6_frag_match;
+	nf_frags.frag_expire = nf_ct_frag6_expire;
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(3,17,0)
+	nf_frags.frags_cache_name = nf_frags_cache_name;
+#endif
+	ret = inet_frags_init(&nf_frags);
+	if (ret)
+		goto out;
+	ret = register_pernet_subsys(&nf_ct_net_ops);
+	if (ret)
+		inet_frags_fini(&nf_frags);
+
+out:
+	return ret;
+}
+
+void rpl_nf_ct_frag6_cleanup(void)
+{
+	unregister_pernet_subsys(&nf_ct_net_ops);
+	inet_frags_fini(&nf_frags);
+}
+
+#endif /* OVS_FRAGMENT_BACKPORT */