
[1/2] mpls: add handlers

Message ID 1502396917-14848-2-git-send-email-amine.kherbouche@6wind.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Amine Kherbouche Aug. 10, 2017, 8:28 p.m. UTC
The MPLS handler allows creation/deletion of MPLS routes without using
rtnetlink. When an incoming MPLS packet matches such a route, the saved
handler function is called.

Signed-off-by: Amine Kherbouche <amine.kherbouche@6wind.com>
Signed-off-by: David Lamparter <equinox@diac24.net>
---
 include/net/mpls.h  | 10 +++++++
 net/mpls/af_mpls.c  | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/mpls/internal.h |  3 +++
 3 files changed, 86 insertions(+)
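
For reference, a minimal sketch of how a module might consume these
entry points; vpls_rcv(), the label value 1234, and the vpls netdevice
are illustrative assumptions, not part of this patch:

	/* Hypothetical caller: deliver packets arriving with label 1234
	 * to vpls_rcv() instead of having them forwarded.  The patch
	 * expects handler_arg to be the net_device (it reads ->ifindex).
	 */
	static int vpls_rcv(void *arg, struct sk_buff *skb,
			    struct net_device *in_dev, u32 label, u8 bos)
	{
		struct net_device *vpls_dev = arg;	/* handler_arg from _add() */

		/* decapsulate and hand the inner frame to vpls_dev ... */
		consume_skb(skb);
		return 0;
	}

	static int vpls_register(struct net *net, struct net_device *vpls_dev,
				 u8 via[], struct netlink_ext_ack *extack)
	{
		return mpls_handler_add(net, 1234, NEIGH_ARP_TABLE, via,
					vpls_rcv, vpls_dev, extack);
	}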

Comments

David Lamparter Aug. 11, 2017, 12:34 p.m. UTC | #1
On Thu, Aug 10, 2017 at 10:28:36PM +0200, Amine Kherbouche wrote:
> The MPLS handler allows creation/deletion of MPLS routes without using
> rtnetlink. When an incoming MPLS packet matches such a route, the saved
> handler function is called.

Since I originally authored this patch, I have come to believe that it
might be unnecessarily complicated.  It is unlikely that a lot of
different "handlers" will exist here;  the only things I can think of
are VPLS support and BIER-MPLS multicast replication.  I'm not saying
it's a bad idea, but, well, this was in the README that I gave to 6WIND
with this code:

...
MPLS layer:
- the "MPT_HANDLER" thing is probably overkill, it likely makes more sense to
  tie in the VPLS code more directly.
...


-David
Roopa Prabhu Aug. 11, 2017, 2:37 p.m. UTC | #2
On Fri, Aug 11, 2017 at 5:34 AM, David Lamparter <equinox@diac24.net> wrote:
> On Thu, Aug 10, 2017 at 10:28:36PM +0200, Amine Kherbouche wrote:
>> The MPLS handler allows creation/deletion of MPLS routes without using
>> rtnetlink. When an incoming MPLS packet matches such a route, the saved
>> handler function is called.
>
> Since I originally authored this patch, I have come to believe that it
> might be unnecessarily complicated.  It is unlikely that a lot of
> different "handlers" will exist here;  the only things I can think of
> are VPLS support and BIER-MPLS multicast replication.  I'm not saying
> it's a bad idea, but, well, this was in the README that I gave to 6WIND
> with this code:
>
> ...

Yes, I would also prefer just exporting the functions and calling
them directly instead of adding a handler layer. We can move to that
later if it becomes necessary.
Amine Kherbouche Aug. 12, 2017, 1:35 p.m. UTC | #3
On 11/08/2017 16:37, Roopa Prabhu wrote:
> On Fri, Aug 11, 2017 at 5:34 AM, David Lamparter <equinox@diac24.net> wrote:
>> On Thu, Aug 10, 2017 at 10:28:36PM +0200, Amine Kherbouche wrote:
>>> The MPLS handler allows creation/deletion of MPLS routes without using
>>> rtnetlink. When an incoming MPLS packet matches such a route, the saved
>>> handler function is called.
>> Since I originally authored this patch, I have come to believe that it
>> might be unnecessarily complicated.  It is unlikely that a lot of
>> different "handlers" will exist here;  the only things I can think of
>> are VPLS support and BIER-MPLS multicast replication.  I'm not saying
>> it's a bad idea, but, well, this was in the README that I gave to 6WIND
>> with this code:
>>
>> ...
> Yes, I would also prefer just exporting the functions and calling
> them directly instead of adding a handler layer. We can move to that
> later if it becomes necessary.
I understand that the handler layer adds overhead (as David's note
says), and I agree with the solution of exporting the MPLS functions
that allow route creation/deletion, but how about forwarding the right
MPLS packet to the right vpls device via the device ptr? I don't see
another way.
Roopa Prabhu Aug. 13, 2017, 3:29 a.m. UTC | #4
On Sat, Aug 12, 2017 at 6:35 AM, Amine Kherbouche
<amine.kherbouche@6wind.com> wrote:
>
>
> On 11/08/2017 16:37, Roopa Prabhu wrote:
>>
>> On Fri, Aug 11, 2017 at 5:34 AM, David Lamparter <equinox@diac24.net>
>> wrote:
>>>
>>> On Thu, Aug 10, 2017 at 10:28:36PM +0200, Amine Kherbouche wrote:
>>>>
>>>> The MPLS handler allows creation/deletion of MPLS routes without using
>>>> rtnetlink. When an incoming MPLS packet matches such a route, the saved
>>>> handler function is called.
>>>
>>> Since I originally authored this patch, I have come to believe that it
>>> might be unnecessarily complicated.  It is unlikely that a lot of
>>> different "handlers" will exist here;  the only things I can think of
>>> are VPLS support and BIER-MPLS multicast replication.  I'm not saying
>>> it's a bad idea, but, well, this was in the README that I gave to 6WIND
>>> with this code:
>>>
>>> ...
>>
>> Yes, I would also prefer just exporting the functions and calling
>> them directly instead of adding a handler layer. We can move to that
>> later if it becomes necessary.
>
> I understand that the handler layer adds overhead (as David's note
> says), and I agree with the solution of exporting the MPLS functions
> that allow route creation/deletion, but how about forwarding the right
> MPLS packet to the right vpls device via the device ptr? I don't see
> another way.


Hmm... ok, so you are adding an MPLS route to get into vpls_rcv, and
you want this route to carry the vpls_rcv information. Ideally, if you
knew the route pointed to a vpls device kind, you could call vpls_rcv
directly.
But I'm not sure a route is necessary here either.

It just seems like the vpls device information is duplicated in the
MPLS route per vpls dev. I wonder if we can skip the route part and
always do a lookup on the vpls-id/label in mpls_rcv, dispatching to
vpls_rcv if there is a match.  This would be the L2 handler for MPLS.
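
A minimal sketch of that alternative, assuming a hypothetical
vpls_find_by_label() helper; neither it nor this demux exists in the
patch as posted:

	/* Hypothetical early demux in mpls_forward(), before the route
	 * lookup: hand the packet to the vpls device that owns this
	 * incoming label, falling through to normal forwarding otherwise.
	 */
	struct net_device *vpls_dev;

	vpls_dev = vpls_find_by_label(net, dec.label);	/* assumed helper */
	if (vpls_dev)
		return vpls_rcv(vpls_dev, skb, dev, dec.label, dec.bos);
	/* otherwise continue with the normal MPLS route lookup */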
David Lamparter Aug. 15, 2017, 9:37 a.m. UTC | #5
On Sat, Aug 12, 2017 at 08:29:18PM -0700, Roopa Prabhu wrote:
> On Sat, Aug 12, 2017 at 6:35 AM, Amine Kherbouche
> <amine.kherbouche@6wind.com> wrote:
> >
> >
> > On 11/08/2017 16:37, Roopa Prabhu wrote:
> >>
> >> On Fri, Aug 11, 2017 at 5:34 AM, David Lamparter <equinox@diac24.net>
> >> wrote:
> >>>
> >>> On Thu, Aug 10, 2017 at 10:28:36PM +0200, Amine Kherbouche wrote:
> >>>>
> >>>> The MPLS handler allows creation/deletion of MPLS routes without using
> >>>> rtnetlink. When an incoming MPLS packet matches such a route, the saved
> >>>> handler function is called.
> >>>
> >>> Since I originally authored this patch, I have come to believe that it
> >>> might be unnecessarily complicated.  It is unlikely that a lot of
> >>> different "handlers" will exist here;  the only things I can think of
> >>> are VPLS support and BIER-MPLS multicast replication.  I'm not saying
> >>> it's a bad idea, but, well, this was in the README that I gave to 6WIND
> >>> with this code:
> >>>
> >>> ...
> >>
> >> Yes, I would also prefer just exporting the functions and calling
> >> them directly instead of adding a handler layer. We can move to that
> >> later if it becomes necessary.
> >
> > I understand that the handler layer adds overhead (as David's note
> > says), and I agree with the solution of exporting the MPLS functions
> > that allow route creation/deletion, but how about forwarding the right
> > MPLS packet to the right vpls device via the device ptr? I don't see
> > another way.
> 
> 
> Hmm... ok, so you are adding an MPLS route to get into vpls_rcv, and
> you want this route to carry the vpls_rcv information. Ideally, if you
> knew the route pointed to a vpls device kind, you could call vpls_rcv
> directly.
> But I'm not sure a route is necessary here either.
>
> It just seems like the vpls device information is duplicated in the
> MPLS route per vpls dev. I wonder if we can skip the route part and
> always do a lookup on the vpls-id/label in mpls_rcv, dispatching to
> vpls_rcv if there is a match.  This would be the L2 handler for MPLS.

I think the reverse is the better option, removing the vpls device
information and just going with the route table.  My approach to this
would be to add a new netlink route attribute "RTA_VPLS" which
identifies the vpls device, is stored in the route table, and provides
the device ptr needed here.
(The control word config should also be on the route.)

My reason for thinking this is that the VPLS code needs exactly the same
information as does a normal MPLS route:  it attaches to an incoming
label (decapsulating packets instead of forwarding them), and for TX it
does the same operation of looking up a nexthop (possibly with ECMP
support) and adding a label stack.  The code should, in fact, probably
reuse the TX path.

This also fits both a 1:1 and a 1:n model pretty well.  Creating a VPLS
head-end netdevice doesn't even need any config.  It'd just work like:
- ip link add name vpls123 type vpls
- ip -f mpls route add \
        1234 \                              # incoming label for decap
        vpls vpls123 \                      # new attr: VPLS device
        as 2345 via inet 10.0.0.1 dev eth0  # outgoing label for encap

For a 1:n model, one would simply add multiple routes on the same vpls
device.


-David
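
A rough sketch of the rtnetlink side of that proposal; RTA_VPLS, the
netif_is_vpls() check, and the rc_vpls_ifindex field are all
assumptions, none of which exist upstream:

	/* Hypothetical handling in rtm_to_route_config(): resolve the
	 * vpls device named by the new attribute and remember it in the
	 * route config for use at decap time.
	 */
	case RTA_VPLS: {
		struct net_device *vdev;	/* assumed new attribute */

		vdev = __dev_get_by_index(cfg->rc_nlinfo.nl_net,
					  nla_get_u32(nla));
		if (!vdev || !netif_is_vpls(vdev))	/* assumed helper */
			goto errout;
		cfg->rc_vpls_ifindex = vdev->ifindex;	/* assumed field */
		break;
	}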
Roopa Prabhu Aug. 16, 2017, 5:30 a.m. UTC | #6
On Tue, Aug 15, 2017 at 2:37 AM, David Lamparter <equinox@diac24.net> wrote:

[snip]

> I think the reverse is the better option, removing the vpls device
> information and just going with the route table.  My approach to this
> would be to add a new netlink route attribute "RTA_VPLS" which
> identifies the vpls device, is stored in the route table, and provides
> the device ptr needed here.
> (The control word config should also be on the route.)
>
> My reason for thinking this is that the VPLS code needs exactly the same
> information as does a normal MPLS route:  it attaches to an incoming
> label (decapsulating packets instead of forwarding them), and for TX it
> does the same operation of looking up a nexthop (possibly with ECMP
> support) and adding a label stack.  The code should, in fact, probably
> reuse the TX path.
>
> This also fits both a 1:1 and a 1:n model pretty well.  Creating a VPLS
> head-end netdevice doesn't even need any config.  It'd just work like:
> - ip link add name vpls123 type vpls
> - ip -f mpls route add \
>         1234 \                              # incoming label for decap
>         vpls vpls123 \                      # new attr: VPLS device
>         as 2345 via inet 10.0.0.1 dev eth0  # outgoing label for encap
>
> For a 1:n model, one would simply add multiple routes on the same vpls
> device.
>

This is a nice model too. But I don't see how VLANs and MAC-based
learning will fit in here.

Modeling it the same way vxlan L2 overlay tunnels are done seems like
a great fit.
The vpls driver can learn MACs per PW tunnel label. And this L2 FDB
table per vpls device can also carry dst information, similar to what
the vxlan driver does today.
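
As a rough illustration of that model, a learned entry in such a
per-device FDB might look like this; every name here is hypothetical,
loosely mirroring struct vxlan_fdb:

	/* Hypothetical vxlan-style FDB entry: a MAC learned on a
	 * pseudowire, carrying the dst info needed to reach it.
	 */
	struct vpls_fdb {
		struct hlist_node	hlist;		/* per-device hash chain */
		struct rcu_head		rcu;
		unsigned long		updated;	/* last learn/refresh time */
		u8			eth_addr[ETH_ALEN];
		u32			pw_label;	/* outgoing pseudowire label */
		u16			flags;		/* e.g. static vs. learned */
	};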
Amine Kherbouche Aug. 16, 2017, 10:24 a.m. UTC | #7
On 08/16/2017 07:30 AM, Roopa Prabhu wrote:
> On Tue, Aug 15, 2017 at 2:37 AM, David Lamparter <equinox@diac24.net> wrote:
>
> [snip]
>
>> I think the reverse is the better option, removing the vpls device
>> information and just going with the route table.  My approach to this
>> would be to add a new netlink route attribute "RTA_VPLS" which
>> identifies the vpls device, is stored in the route table, and provides
>> the device ptr needed here.
>> (The control word config should also be on the route.)
>>
>> My reason for thinking this is that the VPLS code needs exactly the same
>> information as does a normal MPLS route:  it attaches to an incoming
>> label (decapsulating packets instead of forwarding them), and for TX it
>> does the same operation of looking up a nexthop (possibly with ECMP
>> support) and adding a label stack.  The code should, in fact, probably
>> reuse the TX path.
>>
>> This also fits both a 1:1 and a 1:n model pretty well.  Creating a VPLS
>> head-end netdevice doesn't even need any config.  It'd just work like:
>> - ip link add name vpls123 type vpls
>> - ip -f mpls route add \
>>         1234 \                              # incoming label for decap
>>         vpls vpls123 \                      # new attr: VPLS device
>>         as 2345 via inet 10.0.0.1 dev eth0  # outgoing label for encap
>>
>> For a 1:n model, one would simply add multiple routes on the same vpls
>> device.
>>
>
> This is a nice model too. But I don't see how VLANs and MAC-based
> learning will fit in here.
>
> Modeling it the same way vxlan L2 overlay tunnels are done seems like a
> great fit.
> The vpls driver can learn MACs per PW tunnel label. And this L2 FDB
> table per vpls device can also carry dst information, similar to what
> the vxlan driver does today.
>

I think this is a good idea too. I'll implement this concept in MPLS
and have a look at how vxlan is done so I can support the L2 part in
the vpls driver.

Thanks

Patch

diff --git a/include/net/mpls.h b/include/net/mpls.h
index 1dbc669..0ff51b6 100644
--- a/include/net/mpls.h
+++ b/include/net/mpls.h
@@ -33,4 +33,14 @@  static inline struct mpls_shim_hdr *mpls_hdr(const struct sk_buff *skb)
 {
 	return (struct mpls_shim_hdr *)skb_network_header(skb);
 }
+
+typedef int (*mpls_handler)(void *arg, struct sk_buff *skb,
+			    struct net_device *dev, u32 index, u8 bos);
+
+extern int mpls_handler_add(struct net *net, u32 index, u8 via_table, u8 via[],
+			    mpls_handler handler, void *handler_arg,
+			    struct netlink_ext_ack *extack);
+extern int mpls_handler_del(struct net *net, u32 index,
+			    struct netlink_ext_ack *extack);
+
 #endif
diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index c5b9ce4..82d2126 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -10,6 +10,7 @@ 
 #include <linux/netconf.h>
 #include <linux/vmalloc.h>
 #include <linux/percpu.h>
+#include <net/mpls.h>
 #include <net/ip.h>
 #include <net/dst.h>
 #include <net/sock.h>
@@ -299,6 +300,7 @@  static bool mpls_egress(struct net *net, struct mpls_route *rt,
 		success = true;
 		break;
 	}
+	case MPT_HANDLER:
 	case MPT_UNSPEC:
 		/* Should have decided which protocol it is by now */
 		break;
@@ -356,6 +358,10 @@  static int mpls_forward(struct sk_buff *skb, struct net_device *dev,
 		goto drop;
 	}
 
+	if (rt->rt_payload_type == MPT_HANDLER)
+		return rt->rt_handler(rt->rt_harg, skb, dev,
+				      dec.label, dec.bos);
+
 	nh = mpls_select_multipath(rt, skb);
 	if (!nh)
 		goto err;
@@ -457,6 +463,8 @@  static const struct nla_policy rtm_mpls_policy[RTA_MAX+1] = {
 struct mpls_route_config {
 	u32			rc_protocol;
 	u32			rc_ifindex;
+	mpls_handler		rc_handler;
+	void			*rc_harg;
 	u8			rc_via_table;
 	u8			rc_via_alen;
 	u8			rc_via[MAX_VIA_ALEN];
@@ -995,6 +1003,11 @@  static int mpls_route_add(struct mpls_route_config *cfg,
 	rt->rt_payload_type = cfg->rc_payload_type;
 	rt->rt_ttl_propagate = cfg->rc_ttl_propagate;
 
+	if (cfg->rc_handler) {
+		rt->rt_handler = cfg->rc_handler;
+		rt->rt_harg = cfg->rc_harg;
+	}
+
 	if (cfg->rc_mp)
 		err = mpls_nh_build_multi(cfg, rt, max_labels, extack);
 	else
@@ -1271,6 +1284,66 @@  static int mpls_netconf_dump_devconf(struct sk_buff *skb,
 	return skb->len;
 }
 
+int mpls_handler_add(struct net *net, u32 label, u8 via_table, u8 via[],
+		     mpls_handler handler, void *handler_arg,
+		     struct netlink_ext_ack *extack)
+{
+	struct net_device *dev = handler_arg;
+	struct mpls_route_config *cfg;
+	u8 alen = 0;
+	int err;
+
+	cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
+	if (!cfg)
+		return -ENOMEM;
+
+	if (via_table == NEIGH_ARP_TABLE)
+		alen = sizeof(struct in_addr);
+	else if (via_table == NEIGH_ND_TABLE)
+		alen = sizeof(struct in6_addr);
+
+	cfg->rc_ttl_propagate	= MPLS_TTL_PROP_DEFAULT;
+	cfg->rc_protocol	= RTPROT_KERNEL;
+	cfg->rc_nlflags		|= NLM_F_CREATE;
+	cfg->rc_payload_type	= MPT_HANDLER;
+	cfg->rc_via_table	= via_table;
+	cfg->rc_label		= label;
+	cfg->rc_via_alen	= alen;
+	memcpy(&cfg->rc_via, via, alen);
+	cfg->rc_ifindex		= dev->ifindex;
+	cfg->rc_nlinfo.nl_net	= net;
+	cfg->rc_harg		= handler_arg;
+	cfg->rc_handler		= handler;
+	cfg->rc_output_labels	= 0;
+
+	err = mpls_route_add(cfg, extack);
+	kfree(cfg);
+
+	return err;
+}
+EXPORT_SYMBOL(mpls_handler_add);
+
+int mpls_handler_del(struct net *net, u32 index,
+		     struct netlink_ext_ack *extack)
+{
+	struct mpls_route_config *cfg;
+	int err = 0;
+
+	cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
+	if (!cfg)
+		return -ENOMEM;
+
+	cfg->rc_protocol	= RTPROT_KERNEL;
+	cfg->rc_label		= index;
+	cfg->rc_nlinfo.nl_net	= net;
+
+	err = mpls_route_del(cfg, extack);
+	kfree(cfg);
+
+	return err;
+}
+EXPORT_SYMBOL(mpls_handler_del);
+
 #define MPLS_PERDEV_SYSCTL_OFFSET(field)	\
 	(&((struct mpls_dev *)0)->field)
 
diff --git a/net/mpls/internal.h b/net/mpls/internal.h
index cf65aec..2cd73eb 100644
--- a/net/mpls/internal.h
+++ b/net/mpls/internal.h
@@ -78,6 +78,7 @@  enum mpls_payload_type {
 	MPT_UNSPEC, /* IPv4 or IPv6 */
 	MPT_IPV4 = 4,
 	MPT_IPV6 = 6,
+	MPT_HANDLER = 255,
 
 	/* Other types not implemented:
 	 *  - Pseudo-wire with or without control word (RFC4385)
@@ -141,6 +142,8 @@  enum mpls_ttl_propagation {
  */
 struct mpls_route { /* next hop label forwarding entry */
 	struct rcu_head		rt_rcu;
+	mpls_handler		rt_handler;
+	void			*rt_harg;
 	u8			rt_protocol;
 	u8			rt_payload_type;
 	u8			rt_max_alen;