From patchwork Thu Dec 3 07:53:54 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joe Stringer X-Patchwork-Id: 552074 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from archives.nicira.com (unknown [IPv6:2600:3c00::f03c:91ff:fe6e:bdf7]) by ozlabs.org (Postfix) with ESMTP id 45BC31402EC for ; Thu, 3 Dec 2015 18:56:16 +1100 (AEDT) Received: from archives.nicira.com (localhost [127.0.0.1]) by archives.nicira.com (Postfix) with ESMTP id E10B910BF2; Wed, 2 Dec 2015 23:54:55 -0800 (PST) X-Original-To: dev@openvswitch.org Delivered-To: dev@openvswitch.org Received: from mx1e4.cudamail.com (mx1.cudamail.com [69.90.118.67]) by archives.nicira.com (Postfix) with ESMTPS id D705310BD4 for ; Wed, 2 Dec 2015 23:54:54 -0800 (PST) Received: from bar5.cudamail.com (unknown [192.168.21.12]) by mx1e4.cudamail.com (Postfix) with ESMTPS id 6328F1E00E4 for ; Thu, 3 Dec 2015 00:54:54 -0700 (MST) X-ASG-Debug-ID: 1449129293-09eadd59fdac860001-byXFYA Received: from mx1-pf1.cudamail.com ([192.168.24.1]) by bar5.cudamail.com with ESMTP id XCwFcpDIUVHYJzvs (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 03 Dec 2015 00:54:54 -0700 (MST) X-Barracuda-Envelope-From: joe@ovn.org X-Barracuda-RBL-Trusted-Forwarder: 192.168.24.1 Received: from unknown (HELO relay4-d.mail.gandi.net) (217.70.183.196) by mx1-pf1.cudamail.com with ESMTPS (DHE-RSA-AES256-SHA encrypted); 3 Dec 2015 07:54:53 -0000 Received-SPF: pass (mx1-pf1.cudamail.com: SPF record at ovn.org designates 217.70.183.196 as permitted sender) X-Barracuda-Apparent-Source-IP: 217.70.183.196 X-Barracuda-RBL-IP: 217.70.183.196 Received: from mfilter44-d.gandi.net (mfilter44-d.gandi.net [217.70.178.175]) by relay4-d.mail.gandi.net (Postfix) with ESMTP id F1EBF1720AE; Thu, 3 Dec 2015 08:54:50 +0100 (CET) X-Virus-Scanned: Debian amavisd-new at mfilter44-d.gandi.net Received: from relay4-d.mail.gandi.net ([IPv6:::ffff:217.70.183.196]) by mfilter44-d.gandi.net (mfilter44-d.gandi.net [::ffff:10.0.15.180]) (amavisd-new, port 10024) with ESMTP id vstOgjuW_8md; Thu, 3 Dec 2015 08:54:48 +0100 (CET) X-Originating-IP: 208.91.1.34 Received: from localhost.localdomain (unknown [208.91.1.34]) (Authenticated sender: joe@ovn.org) by relay4-d.mail.gandi.net (Postfix) with ESMTPSA id 7171E17209D; Thu, 3 Dec 2015 08:54:47 +0100 (CET) X-CudaMail-Envelope-Sender: joe@ovn.org From: Joe Stringer To: dev@openvswitch.org X-CudaMail-Whitelist-To: dev@openvswitch.org X-CudaMail-MID: CM-E1-1202001409 X-CudaMail-DTE: 120315 X-CudaMail-Originating-IP: 217.70.183.196 Date: Wed, 2 Dec 2015 23:53:54 -0800 X-ASG-Orig-Subj: [##CM-E1-1202001409##][PATCHv2 18/20] datapath: Backport conntrack fixes. Message-Id: <1449129236-5038-19-git-send-email-joe@ovn.org> X-Mailer: git-send-email 2.1.4 In-Reply-To: <1449129236-5038-1-git-send-email-joe@ovn.org> References: <1449129236-5038-1-git-send-email-joe@ovn.org> X-Barracuda-Connect: UNKNOWN[192.168.24.1] X-Barracuda-Start-Time: 1449129294 X-Barracuda-Encrypted: DHE-RSA-AES256-SHA X-Barracuda-URL: https://web.cudamail.com:443/cgi-mod/mark.cgi X-ASG-Whitelist: Header =?UTF-8?B?eFwtY3VkYW1haWxcLXdoaXRlbGlzdFwtdG8=?= X-Virus-Scanned: by bsmtpd at cudamail.com X-Barracuda-BRTS-Status: 1 Subject: [ovs-dev] [PATCHv2 18/20] datapath: Backport conntrack fixes. X-BeenThere: dev@openvswitch.org X-Mailman-Version: 2.1.16 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: dev-bounces@openvswitch.org Sender: "dev" From: Joe Stringer Backport the following fixes for conntrack from upstream. 9723e6abc70a openswitch: fix typo CONFIG_NF_CONNTRACK_LABEL 0d5cdef8d5dd openvswitch: Fix conntrack compilation without mark. 982b52700482 openvswitch: Fix mask generation for nested attributes. cc5706056baa openvswitch: Fix IPv6 exthdr handling with ct helpers. 33db4125ec74 openvswitch: Rename LABEL->LABELS b8f2257069f1 openvswitch: Fix skb leak in ovs_fragment() ec0d043d05e6 openvswitch: Ensure flow is valid before executing ct 6f225952461b openvswitch: Reject ct_state unsupported bits fbccce5965a5 openvswitch: Extend ct_state match field to 32 bits ab38a7b5a449 openvswitch: Change CT_ATTR_FLAGS to CT_ATTR_COMMIT 9e384715e9e7 openvswitch: Reject ct_state masks for unknown bits 4f0909ee3d8e openvswitch: Mark connections new when not confirmed. e754ec69ab69 openvswitch: Serialize nested ct actions if provided 74c16618137f openvswitch: Fix double-free on ip_defrag() errors 6f5cadee44d8 openvswitch: Fix skb leak using IPv6 defrag Signed-off-by: Joe Stringer --- datapath/actions.c | 23 ++- datapath/conntrack.c | 164 +++++++++++++--------- datapath/conntrack.h | 9 +- datapath/flow.h | 2 +- datapath/flow_netlink.c | 33 +++-- datapath/linux/compat/include/linux/openvswitch.h | 3 +- 6 files changed, 147 insertions(+), 87 deletions(-) diff --git a/datapath/actions.c b/datapath/actions.c index c6b3ca9cbf0d..0d6375f0efd3 100644 --- a/datapath/actions.c +++ b/datapath/actions.c @@ -688,7 +688,7 @@ static void ovs_fragment(struct vport *vport, struct sk_buff *skb, u16 mru, { if (skb_network_offset(skb) > MAX_L2_LEN) { OVS_NLERR(1, "L2 header too long to fragment"); - return; + goto err; } if (ethertype == htons(ETH_P_IP)) { @@ -712,8 +712,7 @@ static void ovs_fragment(struct vport *vport, struct sk_buff *skb, u16 mru, struct rt6_info ovs_rt; if (!v6ops) { - kfree_skb(skb); - return; + goto err; } prepare_frag(vport, skb); @@ -732,8 +731,12 @@ static void ovs_fragment(struct vport *vport, struct sk_buff *skb, u16 mru, WARN_ONCE(1, "Failed fragment ->%s: eth=%04x, MRU=%d, MTU=%d.", ovs_vport_name(vport), ntohs(ethertype), mru, vport->dev->mtu); - kfree_skb(skb); + goto err; } + + return; +err: + kfree_skb(skb); } #else /* <= 3.9 */ static void ovs_fragment(struct vport *vport, struct sk_buff *skb, u16 mru, @@ -982,7 +985,7 @@ static int execute_masked_set_action(struct sk_buff *skb, case OVS_KEY_ATTR_CT_STATE: case OVS_KEY_ATTR_CT_ZONE: case OVS_KEY_ATTR_CT_MARK: - case OVS_KEY_ATTR_CT_LABEL: + case OVS_KEY_ATTR_CT_LABELS: err = -EINVAL; break; } @@ -1113,12 +1116,18 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb, break; case OVS_ACTION_ATTR_CT: + if (!is_flow_key_valid(key)) { + err = ovs_flow_key_update(skb, key); + if (err) + return err; + } + err = ovs_ct_execute(ovs_dp_get_net(dp), skb, key, nla_data(a)); /* Hide stolen IP fragments from user space. */ - if (err == -EINPROGRESS) - return 0; + if (err) + return err == -EINPROGRESS ? 0 : err; break; } diff --git a/datapath/conntrack.c b/datapath/conntrack.c index 9d3ee4766f6e..d1bd45ffe007 100644 --- a/datapath/conntrack.c +++ b/datapath/conntrack.c @@ -43,9 +43,9 @@ struct md_mark { }; /* Metadata label for masked write to conntrack label. */ -struct md_label { - struct ovs_key_ct_label value; - struct ovs_key_ct_label mask; +struct md_labels { + struct ovs_key_ct_labels value; + struct ovs_key_ct_labels mask; }; /* Conntrack action context for execution. */ @@ -53,10 +53,10 @@ struct ovs_conntrack_info { struct nf_conntrack_helper *helper; struct nf_conntrack_zone zone; struct nf_conn *ct; - u32 flags; + u8 commit : 1; u16 family; struct md_mark mark; - struct md_label label; + struct md_labels labels; }; static void __ovs_ct_free_action(struct ovs_conntrack_info *ct_info); @@ -108,21 +108,30 @@ static u8 ovs_ct_get_state(enum ip_conntrack_info ctinfo) return ct_state; } -static void ovs_ct_get_label(const struct nf_conn *ct, - struct ovs_key_ct_label *label) +static u32 ovs_ct_get_mark(const struct nf_conn *ct) +{ +#if IS_ENABLED(CONFIG_NF_CONNTRACK_MARK) + return ct ? ct->mark : 0; +#else + return 0; +#endif +} + +static void ovs_ct_get_labels(const struct nf_conn *ct, + struct ovs_key_ct_labels *labels) { struct nf_conn_labels *cl = ct ? nf_ct_labels_find(ct) : NULL; if (cl) { size_t len = cl->words * sizeof(long); - if (len > OVS_CT_LABEL_LEN) - len = OVS_CT_LABEL_LEN; - else if (len < OVS_CT_LABEL_LEN) - memset(label, 0, OVS_CT_LABEL_LEN); - memcpy(label, cl->bits, len); + if (len > OVS_CT_LABELS_LEN) + len = OVS_CT_LABELS_LEN; + else if (len < OVS_CT_LABELS_LEN) + memset(labels, 0, OVS_CT_LABELS_LEN); + memcpy(labels, cl->bits, len); } else { - memset(label, 0, OVS_CT_LABEL_LEN); + memset(labels, 0, OVS_CT_LABELS_LEN); } } @@ -132,8 +141,8 @@ static void __ovs_ct_update_key(struct sw_flow_key *key, u8 state, { key->ct.state = state; key->ct.zone = zone->id; - key->ct.mark = ct ? ct->mark : 0; - ovs_ct_get_label(ct, &key->ct.label); + key->ct.mark = ovs_ct_get_mark(ct); + ovs_ct_get_labels(ct, &key->ct.labels); } /* Update 'key' based on skb->nfct. If 'post_ct' is true, then OVS has @@ -150,6 +159,8 @@ static void ovs_ct_update_key(const struct sk_buff *skb, ct = nf_ct_get(skb, &ctinfo); if (ct) { state = ovs_ct_get_state(ctinfo); + if (!nf_ct_is_confirmed(ct)) + state |= OVS_CS_F_NEW; if (ct->master) state |= OVS_CS_F_RELATED; zone = nf_ct_zone(ct); @@ -166,7 +177,7 @@ void ovs_ct_fill_key(const struct sk_buff *skb, struct sw_flow_key *key) int ovs_ct_put_key(const struct sw_flow_key *key, struct sk_buff *skb) { - if (nla_put_u8(skb, OVS_KEY_ATTR_CT_STATE, key->ct.state)) + if (nla_put_u32(skb, OVS_KEY_ATTR_CT_STATE, key->ct.state)) return -EMSGSIZE; if (IS_ENABLED(CONFIG_NF_CONNTRACK_ZONES) && @@ -177,9 +188,9 @@ int ovs_ct_put_key(const struct sw_flow_key *key, struct sk_buff *skb) nla_put_u32(skb, OVS_KEY_ATTR_CT_MARK, key->ct.mark)) return -EMSGSIZE; - if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABEL) && - nla_put(skb, OVS_KEY_ATTR_CT_LABEL, sizeof(key->ct.label), - &key->ct.label)) + if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) && + nla_put(skb, OVS_KEY_ATTR_CT_LABELS, sizeof(key->ct.labels), + &key->ct.labels)) return -EMSGSIZE; return 0; @@ -188,12 +199,11 @@ int ovs_ct_put_key(const struct sw_flow_key *key, struct sk_buff *skb) static int ovs_ct_set_mark(struct sk_buff *skb, struct sw_flow_key *key, u32 ct_mark, u32 mask) { +#if IS_ENABLED(CONFIG_NF_CONNTRACK_MARK) enum ip_conntrack_info ctinfo; struct nf_conn *ct; u32 new_mark; - if (!IS_ENABLED(CONFIG_NF_CONNTRACK_MARK)) - return -ENOTSUPP; /* The connection could be invalid, in which case set_mark is no-op. */ ct = nf_ct_get(skb, &ctinfo); @@ -208,20 +218,20 @@ static int ovs_ct_set_mark(struct sk_buff *skb, struct sw_flow_key *key, } return 0; +#else + return -ENOTSUPP; +#endif } -static int ovs_ct_set_label(struct sk_buff *skb, struct sw_flow_key *key, - const struct ovs_key_ct_label *label, - const struct ovs_key_ct_label *mask) +static int ovs_ct_set_labels(struct sk_buff *skb, struct sw_flow_key *key, + const struct ovs_key_ct_labels *labels, + const struct ovs_key_ct_labels *mask) { enum ip_conntrack_info ctinfo; struct nf_conn_labels *cl; struct nf_conn *ct; int err; - if (!IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS)) - return -ENOTSUPP; - /* The connection could be invalid, in which case set_label is no-op.*/ ct = nf_ct_get(skb, &ctinfo); if (!ct) @@ -232,15 +242,15 @@ static int ovs_ct_set_label(struct sk_buff *skb, struct sw_flow_key *key, nf_ct_labels_ext_add(ct); cl = nf_ct_labels_find(ct); } - if (!cl || cl->words * sizeof(long) < OVS_CT_LABEL_LEN) + if (!cl || cl->words * sizeof(long) < OVS_CT_LABELS_LEN) return -ENOSPC; - err = nf_connlabels_replace(ct, (u32 *)label, (u32 *)mask, - OVS_CT_LABEL_LEN / sizeof(u32)); + err = nf_connlabels_replace(ct, (u32 *)labels, (u32 *)mask, + OVS_CT_LABELS_LEN / sizeof(u32)); if (err) return err; - ovs_ct_get_label(ct, &key->ct.label); + ovs_ct_get_labels(ct, &key->ct.labels); return 0; } @@ -272,13 +282,15 @@ static int ovs_ct_helper(struct sk_buff *skb, u16 proto) case NFPROTO_IPV6: { u8 nexthdr = ipv6_hdr(skb)->nexthdr; __be16 frag_off; + int ofs; - protoff = ipv6_skip_exthdr(skb, sizeof(struct ipv6hdr), - &nexthdr, &frag_off); - if (protoff < 0 || (frag_off & htons(~0x7)) != 0) { + ofs = ipv6_skip_exthdr(skb, sizeof(struct ipv6hdr), &nexthdr, + &frag_off); + if (ofs < 0 || (frag_off & htons(~0x7)) != 0) { pr_debug("proto header not found\n"); return NF_ACCEPT; } + protoff = ofs; break; } default: @@ -289,6 +301,9 @@ static int ovs_ct_helper(struct sk_buff *skb, u16 proto) return helper->help(skb, protoff, ct, ctinfo); } +/* Returns 0 on success, -EINPROGRESS if 'skb' is stolen, or other nonzero + * value if 'skb' is freed. + */ static int handle_fragments(struct net *net, struct sw_flow_key *key, u16 zone, struct sk_buff *skb) { @@ -309,8 +324,8 @@ static int handle_fragments(struct net *net, struct sw_flow_key *key, return err; ovs_cb.mru = IPCB(skb)->frag_max_size; - } else if (key->eth.type == htons(ETH_P_IPV6)) { #if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6) + } else if (key->eth.type == htons(ETH_P_IPV6)) { enum ip6_defrag_users user = IP6_DEFRAG_CONNTRACK_IN + zone; struct sk_buff *reasm; @@ -319,17 +334,25 @@ static int handle_fragments(struct net *net, struct sw_flow_key *key, if (!reasm) return -EINPROGRESS; - if (skb == reasm) + if (skb == reasm) { + kfree_skb(skb); return -EINVAL; + } + + /* Don't free 'skb' even though it is one of the original + * fragments, as we're going to morph it into the head. + */ + skb_get(skb); + nf_ct_frag6_consume_orig(reasm); key->ip.proto = ipv6_hdr(reasm)->nexthdr; skb_morph(skb, reasm); + skb->next = reasm->next; consume_skb(reasm); ovs_cb.mru = IP6CB(skb)->frag_max_size; -#else - return -EPFNOSUPPORT; #endif /* IP frag support */ } else { + kfree_skb(skb); return -EPFNOSUPPORT; } @@ -377,7 +400,7 @@ static bool skb_nfct_cached(const struct net *net, const struct sk_buff *skb, return true; } -static int __ovs_ct_lookup(struct net *net, const struct sw_flow_key *key, +static int __ovs_ct_lookup(struct net *net, struct sw_flow_key *key, const struct ovs_conntrack_info *info, struct sk_buff *skb) { @@ -408,6 +431,8 @@ static int __ovs_ct_lookup(struct net *net, const struct sw_flow_key *key, } } + ovs_ct_update_key(skb, key, true); + return 0; } @@ -430,8 +455,6 @@ static int ovs_ct_lookup(struct net *net, struct sw_flow_key *key, err = __ovs_ct_lookup(net, key, info, skb); if (err) return err; - - ovs_ct_update_key(skb, key, true); } return 0; @@ -460,22 +483,23 @@ static int ovs_ct_commit(struct net *net, struct sw_flow_key *key, if (nf_conntrack_confirm(skb) != NF_ACCEPT) return -EINVAL; - ovs_ct_update_key(skb, key, true); - return 0; } -static bool label_nonzero(const struct ovs_key_ct_label *label) +static bool labels_nonzero(const struct ovs_key_ct_labels *labels) { size_t i; - for (i = 0; i < sizeof(*label); i++) - if (label->ct_label[i]) + for (i = 0; i < sizeof(*labels); i++) + if (labels->ct_labels[i]) return true; return false; } +/* Returns 0 on success, -EINPROGRESS if 'skb' is stolen, or other nonzero + * value if 'skb' is freed. + */ int ovs_ct_execute(struct net *net, struct sk_buff *skb, struct sw_flow_key *key, const struct ovs_conntrack_info *info) @@ -493,7 +517,7 @@ int ovs_ct_execute(struct net *net, struct sk_buff *skb, return err; } - if (info->flags & OVS_CT_F_COMMIT) + if (info->commit) err = ovs_ct_commit(net, key, info, skb); else err = ovs_ct_lookup(net, key, info, skb); @@ -506,11 +530,13 @@ int ovs_ct_execute(struct net *net, struct sk_buff *skb, if (err) goto err; } - if (label_nonzero(&info->label.mask)) - err = ovs_ct_set_label(skb, key, &info->label.value, - &info->label.mask); + if (labels_nonzero(&info->labels.mask)) + err = ovs_ct_set_labels(skb, key, &info->labels.value, + &info->labels.mask); err: skb_push(skb, nh_ofs); + if (err) + kfree_skb(skb); return err; } @@ -539,14 +565,13 @@ static int ovs_ct_add_helper(struct ovs_conntrack_info *info, const char *name, } static const struct ovs_ct_len_tbl ovs_ct_attr_lens[OVS_CT_ATTR_MAX + 1] = { - [OVS_CT_ATTR_FLAGS] = { .minlen = sizeof(u32), - .maxlen = sizeof(u32) }, + [OVS_CT_ATTR_COMMIT] = { .minlen = 0, .maxlen = 0 }, [OVS_CT_ATTR_ZONE] = { .minlen = sizeof(u16), .maxlen = sizeof(u16) }, [OVS_CT_ATTR_MARK] = { .minlen = sizeof(struct md_mark), .maxlen = sizeof(struct md_mark) }, - [OVS_CT_ATTR_LABEL] = { .minlen = sizeof(struct md_label), - .maxlen = sizeof(struct md_label) }, + [OVS_CT_ATTR_LABELS] = { .minlen = sizeof(struct md_labels), + .maxlen = sizeof(struct md_labels) }, [OVS_CT_ATTR_HELPER] = { .minlen = 1, .maxlen = NF_CT_HELPER_NAME_LEN } }; @@ -576,8 +601,8 @@ static int parse_ct(const struct nlattr *attr, struct ovs_conntrack_info *info, } switch (type) { - case OVS_CT_ATTR_FLAGS: - info->flags = nla_get_u32(a); + case OVS_CT_ATTR_COMMIT: + info->commit = true; break; #ifdef CONFIG_NF_CONNTRACK_ZONES case OVS_CT_ATTR_ZONE: @@ -588,15 +613,23 @@ static int parse_ct(const struct nlattr *attr, struct ovs_conntrack_info *info, case OVS_CT_ATTR_MARK: { struct md_mark *mark = nla_data(a); + if (!mark->mask) { + OVS_NLERR(log, "ct_mark mask cannot be 0"); + return -EINVAL; + } info->mark = *mark; break; } #endif #ifdef CONFIG_NF_CONNTRACK_LABELS - case OVS_CT_ATTR_LABEL: { - struct md_label *label = nla_data(a); + case OVS_CT_ATTR_LABELS: { + struct md_labels *labels = nla_data(a); - info->label = *label; + if (!labels_nonzero(&labels->mask)) { + OVS_NLERR(log, "ct_labels mask cannot be 0"); + return -EINVAL; + } + info->labels = *labels; break; } #endif @@ -633,7 +666,7 @@ bool ovs_ct_verify(struct net *net, enum ovs_key_attr attr) attr == OVS_KEY_ATTR_CT_MARK) return true; if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) && - attr == OVS_KEY_ATTR_CT_LABEL) { + attr == OVS_KEY_ATTR_CT_LABELS) { struct ovs_net *ovs_net = net_generic(net, ovs_net_id); return ovs_net->xt_label; @@ -701,18 +734,19 @@ int ovs_ct_action_to_attr(const struct ovs_conntrack_info *ct_info, if (!start) return -EMSGSIZE; - if (nla_put_u32(skb, OVS_CT_ATTR_FLAGS, ct_info->flags)) + if (ct_info->commit && nla_put_flag(skb, OVS_CT_ATTR_COMMIT)) return -EMSGSIZE; if (IS_ENABLED(CONFIG_NF_CONNTRACK_ZONES) && nla_put_u16(skb, OVS_CT_ATTR_ZONE, ct_info->zone.id)) return -EMSGSIZE; - if (IS_ENABLED(CONFIG_NF_CONNTRACK_MARK) && + if (IS_ENABLED(CONFIG_NF_CONNTRACK_MARK) && ct_info->mark.mask && nla_put(skb, OVS_CT_ATTR_MARK, sizeof(ct_info->mark), &ct_info->mark)) return -EMSGSIZE; if (IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS) && - nla_put(skb, OVS_CT_ATTR_LABEL, sizeof(ct_info->label), - &ct_info->label)) + labels_nonzero(&ct_info->labels.mask) && + nla_put(skb, OVS_CT_ATTR_LABELS, sizeof(ct_info->labels), + &ct_info->labels)) return -EMSGSIZE; if (ct_info->helper) { if (nla_put_string(skb, OVS_CT_ATTR_HELPER, @@ -742,7 +776,7 @@ static void __ovs_ct_free_action(struct ovs_conntrack_info *ct_info) void ovs_ct_init(struct net *net) { - unsigned int n_bits = sizeof(struct ovs_key_ct_label) * BITS_PER_BYTE; + unsigned int n_bits = sizeof(struct ovs_key_ct_labels) * BITS_PER_BYTE; struct ovs_net *ovs_net = net_generic(net, ovs_net_id); if (nf_connlabels_get(net, n_bits)) { diff --git a/datapath/conntrack.h b/datapath/conntrack.h index 082c92ca4651..340eec621f70 100644 --- a/datapath/conntrack.h +++ b/datapath/conntrack.h @@ -35,6 +35,10 @@ int ovs_ct_execute(struct net *, struct sk_buff *, struct sw_flow_key *, void ovs_ct_fill_key(const struct sk_buff *skb, struct sw_flow_key *key); int ovs_ct_put_key(const struct sw_flow_key *key, struct sk_buff *skb); void ovs_ct_free_action(const struct nlattr *a); + +#define CT_SUPPORTED_MASK (OVS_CS_F_NEW | OVS_CS_F_ESTABLISHED | \ + OVS_CS_F_RELATED | OVS_CS_F_REPLY_DIR | \ + OVS_CS_F_INVALID | OVS_CS_F_TRACKED) #else #include @@ -64,6 +68,7 @@ static inline int ovs_ct_execute(struct net *net, struct sk_buff *skb, struct sw_flow_key *key, const struct ovs_conntrack_info *info) { + kfree_skb(skb); return -ENOTSUPP; } @@ -73,7 +78,7 @@ static inline void ovs_ct_fill_key(const struct sk_buff *skb, key->ct.state = 0; key->ct.zone = 0; key->ct.mark = 0; - memset(&key->ct.label, 0, sizeof(key->ct.label)); + memset(&key->ct.labels, 0, sizeof(key->ct.labels)); } static inline int ovs_ct_put_key(const struct sw_flow_key *key, @@ -83,5 +88,7 @@ static inline int ovs_ct_put_key(const struct sw_flow_key *key, } static inline void ovs_ct_free_action(const struct nlattr *a) { } + +#define CT_SUPPORTED_MASK 0 #endif #endif /* ovs_conntrack.h */ diff --git a/datapath/flow.h b/datapath/flow.h index f724ce6c317d..c0b628a2948f 100644 --- a/datapath/flow.h +++ b/datapath/flow.h @@ -116,7 +116,7 @@ struct sw_flow_key { u16 zone; u32 mark; u8 state; - struct ovs_key_ct_label label; + struct ovs_key_ct_labels labels; } ct; } __aligned(BITS_PER_LONG/8); /* Ensure that we can do comparisons as longs. */ diff --git a/datapath/flow_netlink.c b/datapath/flow_netlink.c index d87b5b009d73..0a9eb4514f21 100644 --- a/datapath/flow_netlink.c +++ b/datapath/flow_netlink.c @@ -292,10 +292,10 @@ size_t ovs_key_attr_size(void) + nla_total_size(4) /* OVS_KEY_ATTR_SKB_MARK */ + nla_total_size(4) /* OVS_KEY_ATTR_DP_HASH */ + nla_total_size(4) /* OVS_KEY_ATTR_RECIRC_ID */ - + nla_total_size(1) /* OVS_KEY_ATTR_CT_STATE */ + + nla_total_size(4) /* OVS_KEY_ATTR_CT_STATE */ + nla_total_size(2) /* OVS_KEY_ATTR_CT_ZONE */ + nla_total_size(4) /* OVS_KEY_ATTR_CT_MARK */ - + nla_total_size(16) /* OVS_KEY_ATTR_CT_LABEL */ + + nla_total_size(16) /* OVS_KEY_ATTR_CT_LABELS */ + nla_total_size(12) /* OVS_KEY_ATTR_ETHERNET */ + nla_total_size(2) /* OVS_KEY_ATTR_ETHERTYPE */ + nla_total_size(4) /* OVS_KEY_ATTR_VLAN */ @@ -350,10 +350,10 @@ static const struct ovs_len_tbl ovs_key_lens[OVS_KEY_ATTR_MAX + 1] = { [OVS_KEY_ATTR_TUNNEL] = { .len = OVS_ATTR_NESTED, .next = ovs_tunnel_key_lens, }, [OVS_KEY_ATTR_MPLS] = { .len = sizeof(struct ovs_key_mpls) }, - [OVS_KEY_ATTR_CT_STATE] = { .len = sizeof(u8) }, + [OVS_KEY_ATTR_CT_STATE] = { .len = sizeof(u32) }, [OVS_KEY_ATTR_CT_ZONE] = { .len = sizeof(u16) }, [OVS_KEY_ATTR_CT_MARK] = { .len = sizeof(u32) }, - [OVS_KEY_ATTR_CT_LABEL] = { .len = sizeof(struct ovs_key_ct_label) }, + [OVS_KEY_ATTR_CT_LABELS] = { .len = sizeof(struct ovs_key_ct_labels) }, }; static bool check_attr_len(unsigned int attr_len, unsigned int expected_len) @@ -815,7 +815,13 @@ static int metadata_from_nlattrs(struct net *net, struct sw_flow_match *match, if (*attrs & (1 << OVS_KEY_ATTR_CT_STATE) && ovs_ct_verify(net, OVS_KEY_ATTR_CT_STATE)) { - u8 ct_state = nla_get_u8(a[OVS_KEY_ATTR_CT_STATE]); + u32 ct_state = nla_get_u32(a[OVS_KEY_ATTR_CT_STATE]); + + if (ct_state & ~CT_SUPPORTED_MASK) { + OVS_NLERR(log, "ct_state flags %08x unsupported", + ct_state); + return -EINVAL; + } SW_FLOW_KEY_PUT(match, ct.state, ct_state, is_mask); *attrs &= ~(1ULL << OVS_KEY_ATTR_CT_STATE); @@ -834,14 +840,14 @@ static int metadata_from_nlattrs(struct net *net, struct sw_flow_match *match, SW_FLOW_KEY_PUT(match, ct.mark, mark, is_mask); *attrs &= ~(1ULL << OVS_KEY_ATTR_CT_MARK); } - if (*attrs & (1 << OVS_KEY_ATTR_CT_LABEL) && - ovs_ct_verify(net, OVS_KEY_ATTR_CT_LABEL)) { - const struct ovs_key_ct_label *cl; + if (*attrs & (1 << OVS_KEY_ATTR_CT_LABELS) && + ovs_ct_verify(net, OVS_KEY_ATTR_CT_LABELS)) { + const struct ovs_key_ct_labels *cl; - cl = nla_data(a[OVS_KEY_ATTR_CT_LABEL]); - SW_FLOW_KEY_MEMCPY(match, ct.label, cl->ct_label, + cl = nla_data(a[OVS_KEY_ATTR_CT_LABELS]); + SW_FLOW_KEY_MEMCPY(match, ct.labels, cl->ct_labels, sizeof(*cl), is_mask); - *attrs &= ~(1ULL << OVS_KEY_ATTR_CT_LABEL); + *attrs &= ~(1ULL << OVS_KEY_ATTR_CT_LABELS); } return 0; } @@ -1095,6 +1101,9 @@ static void nlattr_set(struct nlattr *attr, u8 val, } else { memset(nla_data(nla), val, nla_len(nla)); } + + if (nla_type(nla) == OVS_KEY_ATTR_CT_STATE) + *(u32 *)nla_data(nla) &= CT_SUPPORTED_MASK; } } @@ -1979,7 +1988,7 @@ static int validate_set(const struct nlattr *a, case OVS_KEY_ATTR_PRIORITY: case OVS_KEY_ATTR_SKB_MARK: case OVS_KEY_ATTR_CT_MARK: - case OVS_KEY_ATTR_CT_LABEL: + case OVS_KEY_ATTR_CT_LABELS: case OVS_KEY_ATTR_ETHERNET: break; diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h index 3e2c9e53bb18..3b39ebbc3d6a 100644 --- a/datapath/linux/compat/include/linux/openvswitch.h +++ b/datapath/linux/compat/include/linux/openvswitch.h @@ -676,7 +676,8 @@ struct ovs_action_push_tnl { * enum ovs_ct_attr - Attributes for %OVS_ACTION_ATTR_CT action. * @OVS_CT_ATTR_COMMIT: If present, commits the connection to the conntrack * table. This allows future packets for the same connection to be identified - * as 'established' or 'related'. + * as 'established' or 'related'. The flow key for the current packet will + * retain the pre-commit connection state. * @OVS_CT_ATTR_ZONE: u16 connection tracking zone. * @OVS_CT_ATTR_MARK: u32 value followed by u32 mask. For each bit set in the * mask, the corresponding bit in the value is copied to the connection