Message ID | 20171221221832.89616-1-kraigatgoog@gmail.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
Series | [net] rtnetlink: fix struct net reference leak | expand |
Le 21/12/2017 à 23:18, Craig Gallek a écrit : > From: Craig Gallek <kraig@google.com> > > The below referenced commit extended the RTM_GETLINK interface to > allow querying by netns id. The netnsid property was previously > defined as a signed integer, but this patch assumes that the user > always passes a positive integer. syzkaller discovered this problem > by setting a negative netnsid and then calling the get-link path > in a tight loop. This surprisingly quickly overflows the reference > count on the associated struct net, potentially destroying it. When the > default namespace is used, the machine crashes in strange and interesting > ways. > > Unfortunately, this is not easy to reproduce with just the ip tool > as it enforces unsigned integer parsing despite the interface interpeting > the NETNSID attribute as signed. > > I'm not sure why this attribute is signed in the first place, but > the first commit that introduced it (6621dd29eb9b) is in v4.15-rc4, > so I assume it's too late to change. A valid (assigned) nsid is always >= 0. > > This patch removes the positive netns id assumption, but adds another > assumption that the netns id 0 is always the 'self' identifying id (for > which an additional struct net reference is not necessary). We cannot make this assumption, this is wrong. nsids may be automatically allocated by the kernel, and it starts by 0. The current netns can be identify by NETNSA_NSID_NOT_ASSIGNED, ie -1. Regards, Nicolas
On Fri, Dec 22, 2017 at 3:11 AM, Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote: > Le 21/12/2017 à 23:18, Craig Gallek a écrit : >> From: Craig Gallek <kraig@google.com> >> >> The below referenced commit extended the RTM_GETLINK interface to >> allow querying by netns id. The netnsid property was previously >> defined as a signed integer, but this patch assumes that the user >> always passes a positive integer. syzkaller discovered this problem >> by setting a negative netnsid and then calling the get-link path >> in a tight loop. This surprisingly quickly overflows the reference >> count on the associated struct net, potentially destroying it. When the >> default namespace is used, the machine crashes in strange and interesting >> ways. >> >> Unfortunately, this is not easy to reproduce with just the ip tool >> as it enforces unsigned integer parsing despite the interface interpeting >> the NETNSID attribute as signed. >> >> I'm not sure why this attribute is signed in the first place, but >> the first commit that introduced it (6621dd29eb9b) is in v4.15-rc4, >> so I assume it's too late to change. > A valid (assigned) nsid is always >= 0. > >> >> This patch removes the positive netns id assumption, but adds another >> assumption that the netns id 0 is always the 'self' identifying id (for >> which an additional struct net reference is not necessary). > We cannot make this assumption, this is wrong. nsids may be automatically > allocated by the kernel, and it starts by 0. > The current netns can be identify by NETNSA_NSID_NOT_ASSIGNED, ie -1. Thank you, I'll respin this with NETNSA_NSID_NOT_ASSIGNED as the sentinel value. Craig
On Fri, Dec 22, 2017 at 8:59 AM, Craig Gallek <kraigatgoog@gmail.com> wrote: > On Fri, Dec 22, 2017 at 3:11 AM, Nicolas Dichtel > <nicolas.dichtel@6wind.com> wrote: >> Le 21/12/2017 à 23:18, Craig Gallek a écrit : >>> From: Craig Gallek <kraig@google.com> >>> >>> The below referenced commit extended the RTM_GETLINK interface to >>> allow querying by netns id. The netnsid property was previously >>> defined as a signed integer, but this patch assumes that the user >>> always passes a positive integer. syzkaller discovered this problem >>> by setting a negative netnsid and then calling the get-link path >>> in a tight loop. This surprisingly quickly overflows the reference >>> count on the associated struct net, potentially destroying it. When the >>> default namespace is used, the machine crashes in strange and interesting >>> ways. >>> >>> Unfortunately, this is not easy to reproduce with just the ip tool >>> as it enforces unsigned integer parsing despite the interface interpeting >>> the NETNSID attribute as signed. >>> >>> I'm not sure why this attribute is signed in the first place, but >>> the first commit that introduced it (6621dd29eb9b) is in v4.15-rc4, >>> so I assume it's too late to change. >> A valid (assigned) nsid is always >= 0. >> >>> >>> This patch removes the positive netns id assumption, but adds another >>> assumption that the netns id 0 is always the 'self' identifying id (for >>> which an additional struct net reference is not necessary). >> We cannot make this assumption, this is wrong. nsids may be automatically >> allocated by the kernel, and it starts by 0. >> The current netns can be identify by NETNSA_NSID_NOT_ASSIGNED, ie -1. > Thank you, I'll respin this with NETNSA_NSID_NOT_ASSIGNED as the sentinel value. Looking at the netns id code more closely, there are several places that assume ids will never be zero (short of the sentinel). I think the only simple fix here is to update the netlink interfaces to not accept negative values as input. I'm going to send that patch instead...
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index dabba2a91fc8..3de033b7e4b9 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -1451,7 +1451,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, ifm->ifi_flags = dev_get_flags(dev); ifm->ifi_change = change; - if (tgt_netnsid >= 0 && nla_put_s32(skb, IFLA_IF_NETNSID, tgt_netnsid)) + if (tgt_netnsid && nla_put_s32(skb, IFLA_IF_NETNSID, tgt_netnsid)) goto nla_put_failure; if (nla_put_string(skb, IFLA_IFNAME, dev->name) || @@ -1712,7 +1712,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb) const struct rtnl_link_ops *kind_ops = NULL; unsigned int flags = NLM_F_MULTI; int master_idx = 0; - int netnsid = -1; + int netnsid = 0; int err; int hdrlen; @@ -1733,10 +1733,12 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb) ifla_policy, NULL) >= 0) { if (tb[IFLA_IF_NETNSID]) { netnsid = nla_get_s32(tb[IFLA_IF_NETNSID]); - tgt_net = get_target_net(skb, netnsid); - if (IS_ERR(tgt_net)) { - tgt_net = net; - netnsid = -1; + if (netnsid) { + tgt_net = get_target_net(skb, netnsid); + if (IS_ERR(tgt_net)) { + tgt_net = net; + netnsid = 0; + } } } @@ -1786,7 +1788,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb) cb->args[0] = h; cb->seq = net->dev_base_seq; nl_dump_check_consistent(cb, nlmsg_hdr(skb)); - if (netnsid >= 0) + if (netnsid) put_net(tgt_net); return err; @@ -2873,7 +2875,7 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh, struct nlattr *tb[IFLA_MAX+1]; struct net_device *dev = NULL; struct sk_buff *nskb; - int netnsid = -1; + int netnsid = 0; int err; u32 ext_filter_mask = 0; @@ -2883,9 +2885,11 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh, if (tb[IFLA_IF_NETNSID]) { netnsid = nla_get_s32(tb[IFLA_IF_NETNSID]); - tgt_net = get_target_net(skb, netnsid); - if (IS_ERR(tgt_net)) - return PTR_ERR(tgt_net); + if (netnsid) { + tgt_net = get_target_net(skb, netnsid); + if (IS_ERR(tgt_net)) + return PTR_ERR(tgt_net); + } } if (tb[IFLA_IFNAME]) @@ -2923,7 +2927,7 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh, } else err = rtnl_unicast(nskb, net, NETLINK_CB(skb).portid); out: - if (netnsid >= 0) + if (netnsid) put_net(tgt_net); return err;