Message ID | 20200825155715.24006-1-nicolas.dichtel@6wind.com
---|---
State | Changes Requested
Delegated to | David Miller
Series | [net-next,v2] gtp: add notification mechanism
Hi Nicolas,

thanks a lot for your patch.

On Tue, Aug 25, 2020 at 05:57:15PM +0200, Nicolas Dichtel wrote:
> Like all other network functions, let's notify gtp context on creation and
> deletion.

While this may be in line with typical kernel tunnel device practices, I am
not convinced it is the right way to go for GTP.

Contrary to other tunneling mechanisms, GTP doesn't have a 1:1 relationship
between tunnels and netdevs. You can easily have tens of thousands - or even
many more - PDP contexts (at least one per subscriber) within one "gtp0"
netdev. Also, the state is highly volatile: every time a subscriber
registers/deregisters, goes in or out of coverage, in or out of airplane
mode, etc., those PDP contexts go up and down.

Sending (unsolicited) notifications about all of those seems quite
heavyweight to me.
On 25/08/2020 at 19:01, Harald Welte wrote:
> Hi Nicolas,
>
> thanks a lot for your patch.
>
> On Tue, Aug 25, 2020 at 05:57:15PM +0200, Nicolas Dichtel wrote:
>> Like all other network functions, let's notify gtp context on creation and
>> deletion.
>
> While this may be in line with typical kernel tunnel device practices, I am
> not convinced it is the right way to go for GTP.
>
> Contrary to other tunneling mechanisms, GTP doesn't have a 1:1 relationship
> between tunnels and netdevs. You can easily have tens of thousands - or even
> many more - PDP contexts (at least one per subscriber) within one "gtp0"
> netdev. Also, the state is highly volatile: every time a subscriber
> registers/deregisters, goes in or out of coverage, in or out of airplane
> mode, etc., those PDP contexts go up and down.
>
> Sending (unsolicited) notifications about all of those seems quite
> heavyweight to me.

There are no 'unsolicited' notifications with this patch. Notifications are
sent only if a userspace application has subscribed to the gtp mcast group.
IP routes or conntrack entries are notified in the same way, and there can be
a lot of them as well (more than 100k conntrack entries, for example).
Hi Nicolas,

On Wed, Aug 26, 2020 at 09:47:54AM +0200, Nicolas Dichtel wrote:
>> Sending (unsolicited) notifications about all of those seems quite
>> heavyweight to me.
>
> There are no 'unsolicited' notifications with this patch. Notifications are
> sent only if a userspace application has subscribed to the gtp mcast group.
> IP routes or conntrack entries are notified in the same way, and there can
> be a lot of them as well (more than 100k conntrack entries, for example).

Ok, thanks for reminding me of that. However, even if those events are not
sent/multicast, it still looks like the proposed patch unconditionally
allocates a netlink message and fills it with information about the PDP.
That alone looks like significant added overhead for every user - even the
majority of current use cases where nobody is listening/subscribing to that
multicast group.

Wouldn't it make sense to only allocate + fill those messages if we actually
knew a subscriber existed?
On 26/08/2020 at 20:52, Harald Welte wrote:
> Hi Nicolas,
>
> On Wed, Aug 26, 2020 at 09:47:54AM +0200, Nicolas Dichtel wrote:
>>> Sending (unsolicited) notifications about all of those seems quite
>>> heavyweight to me.
>>
>> There are no 'unsolicited' notifications with this patch. Notifications
>> are sent only if a userspace application has subscribed to the gtp mcast
>> group. IP routes or conntrack entries are notified in the same way, and
>> there can be a lot of them as well (more than 100k conntrack entries, for
>> example).
>
> Ok, thanks for reminding me of that. However, even if those events are not
> sent/multicast, it still looks like the proposed patch unconditionally
> allocates a netlink message and fills it with information about the PDP.
> That alone looks like significant added overhead for every user - even the
> majority of current use cases where nobody is listening/subscribing to that
> multicast group.

I don't think that this is a significant overhead. It is added in the control
path: when a PDP context is added, the rtnl lock is taken, which is overhead
of another order of magnitude compared to a kmalloc().

> Wouldn't it make sense to only allocate + fill those messages if we
> actually knew a subscriber existed?

In fact, this is already how the netlink framework works.
Hi Nicolas,

On Thu, Aug 27, 2020 at 12:36:24AM +0200, Nicolas Dichtel wrote:
> On 26/08/2020 at 20:52, Harald Welte wrote:
>> Wouldn't it make sense to only allocate + fill those messages if we
>> actually knew a subscriber existed?
>
> In fact, this is already how the netlink framework works.

Well, as you can tell from my responses, I've not been doing kernel work for
a decade now, so I'm looking at things from a more distant and ignorant
perspective. To me it seems odd to allocate memory and copy data to it (cache
misses, ...) if nobody ever requested that data, and nobody will ever use it.
But if this is how it is supposed to work, then I will of course defer to
that. All netlink would have to expose is a function that returns whether or
not there are any subscribers to the given multicast group. Then all of the
allocation + initialization would disappear in a branch that is not executed
most of the time, at least for current, existing gtpnl systems. Yes, that
means one more branch, of course. But that branch will happen later on
anyway, even today: only after the allocation + initialization.

So having said the above, if this is how it is supposed to work with netlink:

Acked-by: Harald Welte <laforge@gnumonks.org>
Hi Harald,

On 27/08/2020 at 11:00, Harald Welte wrote:
> Hi Nicolas,
>
> On Thu, Aug 27, 2020 at 12:36:24AM +0200, Nicolas Dichtel wrote:
>> On 26/08/2020 at 20:52, Harald Welte wrote:
>>> Wouldn't it make sense to only allocate + fill those messages if we
>>> actually knew a subscriber existed?
>>
>> In fact, this is already how the netlink framework works.
>
> Well, as you can tell from my responses, I've not been doing kernel work
> for a decade now, so I'm looking at things from a more distant and ignorant
> perspective. To me it seems odd to allocate memory and copy data to it
> (cache misses, ...) if nobody ever requested that data, and nobody will
> ever use it. But if this is how it is supposed to work, then I will of
> course defer to that. All netlink would have to expose is a function that
> returns whether or not there are any subscribers to the given multicast
> group. Then all of the allocation + initialization would disappear in a
> branch that is not executed most of the time, at least for current,
> existing gtpnl systems. Yes, that means one more branch, of course. But
> that branch will happen later on anyway, even today: only after the
> allocation + initialization.

I agree, but I haven't found a good solution for this right now. The lookup
is not straightforward.

> So having said the above, if this is how it is supposed to work with
> netlink:
>
> Acked-by: Harald Welte <laforge@gnumonks.org>

Thank you.
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 8e47d0112e5d..76fd87a44fdf 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -928,8 +928,8 @@ static void ipv4_pdp_fill(struct pdp_ctx *pctx, struct genl_info *info)
 	}
 }
 
-static int gtp_pdp_add(struct gtp_dev *gtp, struct sock *sk,
-		       struct genl_info *info)
+static struct pdp_ctx *gtp_pdp_add(struct gtp_dev *gtp, struct sock *sk,
+				   struct genl_info *info)
 {
 	struct pdp_ctx *pctx, *pctx_tid = NULL;
 	struct net_device *dev = gtp->dev;
@@ -956,12 +956,12 @@ static int gtp_pdp_add(struct gtp_dev *gtp, struct sock *sk,
 
 	if (found) {
 		if (info->nlhdr->nlmsg_flags & NLM_F_EXCL)
-			return -EEXIST;
+			return ERR_PTR(-EEXIST);
 		if (info->nlhdr->nlmsg_flags & NLM_F_REPLACE)
-			return -EOPNOTSUPP;
+			return ERR_PTR(-EOPNOTSUPP);
 		if (pctx && pctx_tid)
-			return -EEXIST;
+			return ERR_PTR(-EEXIST);
 		if (!pctx)
 			pctx = pctx_tid;
@@ -974,13 +974,13 @@ static int gtp_pdp_add(struct gtp_dev *gtp, struct sock *sk,
 			netdev_dbg(dev, "GTPv1-U: update tunnel id = %x/%x (pdp %p)\n",
 				   pctx->u.v1.i_tei, pctx->u.v1.o_tei, pctx);
 
-		return 0;
+		return pctx;
 	}
 
 	pctx = kmalloc(sizeof(*pctx), GFP_ATOMIC);
 	if (pctx == NULL)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 
 	sock_hold(sk);
 	pctx->sk = sk;
@@ -1018,7 +1018,7 @@ static int gtp_pdp_add(struct gtp_dev *gtp, struct sock *sk,
 		break;
 	}
 
-	return 0;
+	return pctx;
 }
 
 static void pdp_context_free(struct rcu_head *head)
@@ -1036,9 +1036,12 @@ static void pdp_context_delete(struct pdp_ctx *pctx)
 	call_rcu(&pctx->rcu_head, pdp_context_free);
 }
 
+static int gtp_tunnel_notify(struct pdp_ctx *pctx, u8 cmd);
+
 static int gtp_genl_new_pdp(struct sk_buff *skb, struct genl_info *info)
 {
 	unsigned int version;
+	struct pdp_ctx *pctx;
 	struct gtp_dev *gtp;
 	struct sock *sk;
 	int err;
@@ -1088,7 +1091,13 @@ static int gtp_genl_new_pdp(struct sk_buff *skb, struct genl_info *info)
 		goto out_unlock;
 	}
 
-	err = gtp_pdp_add(gtp, sk, info);
+	pctx = gtp_pdp_add(gtp, sk, info);
+	if (IS_ERR(pctx)) {
+		err = PTR_ERR(pctx);
+	} else {
+		gtp_tunnel_notify(pctx, GTP_CMD_NEWPDP);
+		err = 0;
+	}
 
 out_unlock:
 	rcu_read_unlock();
@@ -1159,6 +1168,7 @@ static int gtp_genl_del_pdp(struct sk_buff *skb, struct genl_info *info)
 	netdev_dbg(pctx->dev, "GTPv1-U: deleting tunnel id = %x/%x (pdp %p)\n",
 		   pctx->u.v1.i_tei, pctx->u.v1.o_tei, pctx);
 
+	gtp_tunnel_notify(pctx, GTP_CMD_DELPDP);
 	pdp_context_delete(pctx);
 
 out_unlock:
@@ -1168,6 +1178,14 @@ static int gtp_genl_del_pdp(struct sk_buff *skb, struct genl_info *info)
 
 static struct genl_family gtp_genl_family;
 
+enum gtp_multicast_groups {
+	GTP_GENL_MCGRP,
+};
+
+static const struct genl_multicast_group gtp_genl_mcgrps[] = {
+	[GTP_GENL_MCGRP] = { .name = GTP_GENL_MCGRP_NAME },
+};
+
 static int gtp_genl_fill_info(struct sk_buff *skb, u32 snd_portid, u32 snd_seq,
 			      int flags, u32 type, struct pdp_ctx *pctx)
 {
@@ -1205,6 +1223,26 @@ static int gtp_genl_fill_info(struct sk_buff *skb, u32 snd_portid, u32 snd_seq,
 	return -EMSGSIZE;
 }
 
+static int gtp_tunnel_notify(struct pdp_ctx *pctx, u8 cmd)
+{
+	struct sk_buff *msg;
+	int ret;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC);
+	if (!msg)
+		return -ENOMEM;
+
+	ret = gtp_genl_fill_info(msg, 0, 0, 0, cmd, pctx);
+	if (ret < 0) {
+		nlmsg_free(msg);
+		return ret;
+	}
+
+	ret = genlmsg_multicast_netns(&gtp_genl_family, dev_net(pctx->dev), msg,
+				      0, GTP_GENL_MCGRP, GFP_ATOMIC);
+	return ret;
+}
+
 static int gtp_genl_get_pdp(struct sk_buff *skb, struct genl_info *info)
 {
 	struct pdp_ctx *pctx = NULL;
@@ -1335,6 +1373,8 @@ static struct genl_family gtp_genl_family __ro_after_init = {
 	.module = THIS_MODULE,
 	.ops = gtp_genl_ops,
 	.n_ops = ARRAY_SIZE(gtp_genl_ops),
+	.mcgrps = gtp_genl_mcgrps,
+	.n_mcgrps = ARRAY_SIZE(gtp_genl_mcgrps),
 };
 
 static int __net_init gtp_net_init(struct net *net)
diff --git a/include/uapi/linux/gtp.h b/include/uapi/linux/gtp.h
index c7d66755d212..79f9191bbb24 100644
--- a/include/uapi/linux/gtp.h
+++ b/include/uapi/linux/gtp.h
@@ -2,6 +2,8 @@
 #ifndef _UAPI_LINUX_GTP_H_
 #define _UAPI_LINUX_GTP_H_
 
+#define GTP_GENL_MCGRP_NAME	"gtp"
+
 enum gtp_genl_cmds {
 	GTP_CMD_NEWPDP,
 	GTP_CMD_DELPDP,