From patchwork Mon Aug 22 14:38:34 2016
X-Patchwork-Submitter: Amir Vadai
X-Patchwork-Id: 661495
X-Patchwork-Delegate: davem@davemloft.net
From: Amir Vadai
To: "David S. Miller"
Cc: netdev@vger.kernel.org, John Fastabend, Jiri Pirko, Cong Wang,
 Jamal Hadi Salim, Or Gerlitz, Hadar Har-Zion, Amir Vadai
Subject: [PATCH net-next 3/3] net/sched: Introduce act_iptunnel
Date: Mon, 22 Aug 2016 17:38:34 +0300
Message-Id: <20160822143834.32422-4-amir@vadai.me>
In-Reply-To: <20160822143834.32422-1-amir@vadai.me>
References: <20160822143834.32422-1-amir@vadai.me>
X-Mailing-List: netdev@vger.kernel.org

This action could be used before redirecting packets to a shared tunnel
device, or when redirecting packets arriving from such a device.

The action will release the metadata created by the tunnel device
(decap), or set the metadata with the specified values for encap
operation.

For example, the following flower filter will forward all ICMP packets
destined to 11.11.11.2 through the shared vxlan device 'vxlan0'.
Before redirecting, metadata for the vxlan tunnel is created using the
iptunnel action and its arguments:

$ tc filter add dev net0 protocol ip parent ffff: \
    flower \
      ip_proto 1 \
      dst_ip 11.11.11.2 \
    action iptunnel encap \
      src_ip 11.11.0.1 \
      dst_ip 11.11.0.2 \
      id 11 \
    action mirred egress redirect dev vxlan0

Signed-off-by: Amir Vadai
---
 include/net/tc_act/tc_iptunnel.h        |  24 +++
 include/uapi/linux/tc_act/tc_iptunnel.h |  40 +++++
 net/sched/Kconfig                       |  11 ++
 net/sched/Makefile                      |   1 +
 net/sched/act_iptunnel.c                | 292 ++++++++++++++++++++++++++++++++
 5 files changed, 368 insertions(+)
 create mode 100644 include/net/tc_act/tc_iptunnel.h
 create mode 100644 include/uapi/linux/tc_act/tc_iptunnel.h
 create mode 100644 net/sched/act_iptunnel.c

diff --git a/include/net/tc_act/tc_iptunnel.h b/include/net/tc_act/tc_iptunnel.h
new file mode 100644
index 000000000000..a325081478e7
--- /dev/null
+++ b/include/net/tc_act/tc_iptunnel.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright (c) 2016, Amir Vadai
+ * Copyright (c) 2016, Mellanox Technologies. All rights reserved.
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef __NET_TC_IPTUNNEL_H
+#define __NET_TC_IPTUNNEL_H
+
+#include
+
+struct tcf_iptunnel {
+	struct tc_action common;
+	int tcft_action;
+	struct metadata_dst *tcft_enc_metadata;
+};
+
+#define to_iptunnel(a) ((struct tcf_iptunnel *)a)
+
+#endif /* __NET_TC_IPTUNNEL_H */
+
diff --git a/include/uapi/linux/tc_act/tc_iptunnel.h b/include/uapi/linux/tc_act/tc_iptunnel.h
new file mode 100644
index 000000000000..a9b688c1f28b
--- /dev/null
+++ b/include/uapi/linux/tc_act/tc_iptunnel.h
@@ -0,0 +1,40 @@
+/*
+ * Copyright (c) 2016, Amir Vadai
+ * Copyright (c) 2016, Mellanox Technologies. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef __LINUX_TC_IPTUNNEL_H
+#define __LINUX_TC_IPTUNNEL_H
+
+#include
+
+#define TCA_ACT_IPTUNNEL 17
+
+#define TCA_IPTUNNEL_ACT_ENCAP 1
+#define TCA_IPTUNNEL_ACT_DECAP 2
+
+struct tc_iptunnel {
+	tc_gen;
+	int t_action;
+};
+
+enum {
+	TCA_IPTUNNEL_UNSPEC,
+	TCA_IPTUNNEL_TM,
+	TCA_IPTUNNEL_PARMS,
+	TCA_IPTUNNEL_ENC_IPV4_SRC,	/* be32 */
+	TCA_IPTUNNEL_ENC_IPV4_DST,	/* be32 */
+	TCA_IPTUNNEL_ENC_KEY_ID,	/* be32 */
+	TCA_IPTUNNEL_PAD,
+	__TCA_IPTUNNEL_MAX,
+};
+
+#define TCA_IPTUNNEL_MAX (__TCA_IPTUNNEL_MAX - 1)
+
+#endif
+
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index ccf931b3b94c..a8a5ac4edb2e 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -761,6 +761,17 @@ config NET_ACT_IFE
 	  To compile this code as a module, choose M here: the
 	  module will be called act_ife.
 
+config NET_ACT_IPTUNNEL
+	tristate "IP tunnel manipulation"
+	depends on NET_CLS_ACT
+	---help---
+	  Say Y here to set/release ip tunnel metadata.
+
+	  If unsure, say N.
+
+	  To compile this code as a module, choose M here: the
+	  module will be called act_iptunnel.
+
 config NET_IFE_SKBMARK
 	tristate "Support to encoding decoding skb mark on IFE action"
 	depends on NET_ACT_IFE
diff --git a/net/sched/Makefile b/net/sched/Makefile
index ae088a5a9d95..c1287b95b574 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -22,6 +22,7 @@ obj-$(CONFIG_NET_ACT_CONNMARK)	+= act_connmark.o
 obj-$(CONFIG_NET_ACT_IFE)	+= act_ife.o
 obj-$(CONFIG_NET_IFE_SKBMARK)	+= act_meta_mark.o
 obj-$(CONFIG_NET_IFE_SKBPRIO)	+= act_meta_skbprio.o
+obj-$(CONFIG_NET_ACT_IPTUNNEL)	+= act_iptunnel.o
 obj-$(CONFIG_NET_SCH_FIFO)	+= sch_fifo.o
 obj-$(CONFIG_NET_SCH_CBQ)	+= sch_cbq.o
 obj-$(CONFIG_NET_SCH_HTB)	+= sch_htb.o
diff --git a/net/sched/act_iptunnel.c b/net/sched/act_iptunnel.c
new file mode 100644
index 000000000000..37640bd11b62
--- /dev/null
+++ b/net/sched/act_iptunnel.c
@@ -0,0 +1,292 @@
+/*
+ * Copyright (c) 2016, Amir Vadai
+ * Copyright (c) 2016, Mellanox Technologies. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+
+#define IPTUNNEL_TAB_MASK 15
+
+static int iptunnel_net_id;
+static struct tc_action_ops act_iptunnel_ops;
+
+static int tcf_iptunnel(struct sk_buff *skb, const struct tc_action *a,
+			struct tcf_result *res)
+{
+	struct tcf_iptunnel *t = to_iptunnel(a);
+	int action;
+
+	spin_lock(&t->tcf_lock);
+	tcf_lastuse_update(&t->tcf_tm);
+	bstats_update(&t->tcf_bstats, skb);
+	action = t->tcf_action;
+
+	switch (t->tcft_action) {
+	case TCA_IPTUNNEL_ACT_DECAP:
+		skb_dst_drop(skb);
+		break;
+	case TCA_IPTUNNEL_ACT_ENCAP:
+		skb_dst_drop(skb);
+		skb_dst_set_noref(skb, &t->tcft_enc_metadata->dst);
+		break;
+	default:
+		BUG();
+	}
+
+	spin_unlock(&t->tcf_lock);
+	return action;
+}
+
+static const struct nla_policy iptunnel_policy[TCA_IPTUNNEL_MAX + 1] = {
+	[TCA_IPTUNNEL_PARMS]	    = { .len = sizeof(struct tc_iptunnel) },
+	[TCA_IPTUNNEL_ENC_IPV4_SRC] = { .type = NLA_U32 },
+	[TCA_IPTUNNEL_ENC_IPV4_DST] = { .type = NLA_U32 },
+	[TCA_IPTUNNEL_ENC_KEY_ID]   = { .type = NLA_U32 },
+};
+
+static struct metadata_dst *iptunnel_alloc(__be32 saddr,
+					   __be32 daddr,
+					   __be64 key_id)
+{
+	struct ip_tunnel_info *tun_info;
+	struct metadata_dst *metadata;
+
+	metadata = metadata_dst_alloc(0, GFP_KERNEL);
+	if (!metadata)
+		return ERR_PTR(-ENOMEM);
+
+	tun_info = &metadata->u.tun_info;
+	tun_info->mode = IP_TUNNEL_INFO_TX;
+
+	ip_tunnel_key_init(&tun_info->key, saddr, daddr, 0, 0, 0, 0, 0,
+			   key_id, 0);
+
+	return metadata;
+}
+
+static int tcf_iptunnel_init(struct net *net, struct nlattr *nla,
+			     struct nlattr *est, struct tc_action **a,
+			     int ovr, int bind)
+{
+	struct tc_action_net *tn = net_generic(net, iptunnel_net_id);
+	struct nlattr *tb[TCA_IPTUNNEL_MAX + 1];
+	struct metadata_dst *metadata = NULL;
+	struct tc_iptunnel *parm;
+	struct tcf_iptunnel *t;
+	__be32 saddr = 0;
+	__be32 daddr = 0;
+	__be64 key_id = 0;
+	int encapdecap;
+	bool exists = false;
+	int ret = -EINVAL;
+	int err;
+
+	if (!nla)
+		return -EINVAL;
+
+	err = nla_parse_nested(tb, TCA_IPTUNNEL_MAX, nla, iptunnel_policy);
+	if (err < 0)
+		return err;
+
+	if (!tb[TCA_IPTUNNEL_PARMS])
+		return -EINVAL;
+	parm = nla_data(tb[TCA_IPTUNNEL_PARMS]);
+	exists = tcf_hash_check(tn, parm->index, a, bind);
+	if (exists && bind)
+		return 0;
+
+	encapdecap = parm->t_action;
+
+	switch (encapdecap) {
+	case TCA_IPTUNNEL_ACT_DECAP:
+		break;
+	case TCA_IPTUNNEL_ACT_ENCAP:
+		if (tb[TCA_IPTUNNEL_ENC_IPV4_SRC])
+			saddr = nla_get_be32(tb[TCA_IPTUNNEL_ENC_IPV4_SRC]);
+		if (tb[TCA_IPTUNNEL_ENC_IPV4_DST])
+			daddr = nla_get_be32(tb[TCA_IPTUNNEL_ENC_IPV4_DST]);
+		if (tb[TCA_IPTUNNEL_ENC_KEY_ID])
+			key_id = key32_to_tunnel_id(nla_get_be32(tb[TCA_IPTUNNEL_ENC_KEY_ID]));
+
+		if (!saddr || !daddr || !key_id) {
+			ret = -EINVAL;
+			goto err_out;
+		}
+
+		metadata = iptunnel_alloc(saddr, daddr, key_id);
+		if (IS_ERR(metadata)) {
+			ret = PTR_ERR(metadata);
+			goto err_out;
+		}
+
+		break;
+	default:
+		goto err_out;
+	}
+
+	if (!exists) {
+		ret = tcf_hash_create(tn, parm->index, est, a,
+				      &act_iptunnel_ops, bind, false);
+		if (ret)
+			return ret;
+
+		ret = ACT_P_CREATED;
+	} else {
+		tcf_hash_release(*a, bind);
+		if (!ovr)
+			return -EEXIST;
+	}
+
+	t = to_iptunnel(*a);
+
+	spin_lock_bh(&t->tcf_lock);
+
+	t->tcf_action = parm->action;
+
+	t->tcft_action = encapdecap;
+	t->tcft_enc_metadata = metadata;
+
+	spin_unlock_bh(&t->tcf_lock);
+
+	if (ret == ACT_P_CREATED)
+		tcf_hash_insert(tn, *a);
+
+	return ret;
+
+err_out:
+	if (exists)
+		tcf_hash_release(*a, bind);
+	return ret;
+}
+
+static void tcf_iptunnel_release(struct tc_action *a, int bind)
+{
+	struct tcf_iptunnel *t = to_iptunnel(a);
+
+	if (t->tcft_action == TCA_IPTUNNEL_ACT_ENCAP)
+		dst_release(&t->tcft_enc_metadata->dst);
+}
+
+static int tcf_iptunnel_dump(struct sk_buff *skb, struct tc_action *a,
+			     int bind, int ref)
+{
+	unsigned char *b = skb_tail_pointer(skb);
+	struct tcf_iptunnel *t = to_iptunnel(a);
+	struct tc_iptunnel opt = {
+		.index    = t->tcf_index,
+		.refcnt   = t->tcf_refcnt - ref,
+		.bindcnt  = t->tcf_bindcnt - bind,
+		.action   = t->tcf_action,
+		.t_action = t->tcft_action,
+	};
+	struct tcf_t tm;
+
+	if (nla_put(skb, TCA_IPTUNNEL_PARMS, sizeof(opt), &opt))
+		goto nla_put_failure;
+
+	if (t->tcft_action == TCA_IPTUNNEL_ACT_ENCAP) {
+		struct ip_tunnel_key *key =
+			&t->tcft_enc_metadata->u.tun_info.key;
+		__be32 saddr = key->u.ipv4.src;
+		__be32 daddr = key->u.ipv4.dst;
+		__be32 key_id = tunnel_id_to_key32(key->tun_id);
+
+		if (nla_put_be32(skb, TCA_IPTUNNEL_ENC_IPV4_SRC, saddr) ||
+		    nla_put_be32(skb, TCA_IPTUNNEL_ENC_IPV4_DST, daddr) ||
+		    nla_put_be32(skb, TCA_IPTUNNEL_ENC_KEY_ID, key_id))
+			goto nla_put_failure;
+	}
+
+	tcf_tm_dump(&tm, &t->tcf_tm);
+	if (nla_put_64bit(skb, TCA_IPTUNNEL_TM, sizeof(tm), &tm, TCA_IPTUNNEL_PAD))
+		goto nla_put_failure;
+
+	return skb->len;
+
+nla_put_failure:
+	nlmsg_trim(skb, b);
+	return -1;
+}
+
+static int tcf_iptunnel_walker(struct net *net, struct sk_buff *skb,
+			       struct netlink_callback *cb, int type,
+			       const struct tc_action_ops *ops)
+{
+	struct tc_action_net *tn = net_generic(net, iptunnel_net_id);
+
+	return tcf_generic_walker(tn, skb, cb, type, ops);
+}
+
+static int tcf_iptunnel_search(struct net *net, struct tc_action **a, u32 index)
+{
+	struct tc_action_net *tn = net_generic(net, iptunnel_net_id);
+
+	return tcf_hash_search(tn, a, index);
+}
+
+static struct tc_action_ops act_iptunnel_ops = {
+	.kind		=	"iptunnel",
+	.type		=	TCA_ACT_IPTUNNEL,
+	.owner		=	THIS_MODULE,
+	.act		=	tcf_iptunnel,
+	.dump		=	tcf_iptunnel_dump,
+	.init		=	tcf_iptunnel_init,
+	.cleanup	=	tcf_iptunnel_release,
+	.walk		=	tcf_iptunnel_walker,
+	.lookup		=	tcf_iptunnel_search,
+	.size		=	sizeof(struct tcf_iptunnel),
+};
+
+static __net_init int iptunnel_init_net(struct net *net)
+{
+	struct tc_action_net *tn = net_generic(net, iptunnel_net_id);
+
+	return tc_action_net_init(tn, &act_iptunnel_ops, IPTUNNEL_TAB_MASK);
+}
+
+static void __net_exit iptunnel_exit_net(struct net *net)
+{
+	struct tc_action_net *tn = net_generic(net, iptunnel_net_id);
+
+	tc_action_net_exit(tn);
+}
+
+static struct pernet_operations iptunnel_net_ops = {
+	.init = iptunnel_init_net,
+	.exit = iptunnel_exit_net,
+	.id   = &iptunnel_net_id,
+	.size = sizeof(struct tc_action_net),
+};
+
+static int __init iptunnel_init_module(void)
+{
+	return tcf_register_action(&act_iptunnel_ops, &iptunnel_net_ops);
+}
+
+static void __exit iptunnel_cleanup_module(void)
+{
+	tcf_unregister_action(&act_iptunnel_ops, &iptunnel_net_ops);
+}
+
+module_init(iptunnel_init_module);
+module_exit(iptunnel_cleanup_module);
+
+MODULE_AUTHOR("Amir Vadai ");
+MODULE_DESCRIPTION("ip tunnel manipulation actions");
+MODULE_LICENSE("GPL v2");