Message ID | 20190606102407.881-2-stefan.bader@canonical.com |
---|---|
State | New |
Headers | show |
Series | [SRU,V2,X/C] UBUNTU: SAUCE: ipv6: frags: fix skb extraction in ip6_expire_frag_queue() | expand |
On 06/06/2019 11:24, Stefan Bader wrote: > BugLink: https://bugs.launchpad.net/bugs/1824687 > > The backport of > > 05c0b86b96 ("ipv6: frags: rewrite ip6_expire_frag_queue()") > > to linux-4.4.y stable changed ip6_expire_frag_queue() to be similar > to ip_expire(). However, using skb_get() leads to a crash while > sending the ICMP message due to a check for shared SKBs. > > kernel BUG at linux-4.4.0/net/core/skbuff.c:1207! > RIP: 0010:[<ffffffff81740953>] > [<ffffffff81740953>] pskb_expand_head+0x243/0x250 > [<ffffffff81740e50>] __pskb_pull_tail+0x50/0x350 > [<ffffffff8183939a>] _decode_session6+0x26a/0x400 > [<ffffffff817ec719>] __xfrm_decode_session+0x39/0x50 > [<ffffffff818239d0>] icmpv6_route_lookup+0xf0/0x1c0 > [<ffffffff81824421>] icmp6_send+0x5e1/0x940 > [<ffffffff8183d431>] icmpv6_send+0x21/0x30 > [<ffffffff8182b500>] ip6_expire_frag_queue+0xe0/0x120 > > For IPv4 the ip_expire() function however did change considerably > since then. In > > fa0f527358 ("ip: use rb trees for IP frag queue.") > > the SKB might be taken from a rbtree (use of rbtrees for IPv4 was > backported to 4.4.y upstream). > Along with those obvious changes, the code also is modified to > actually de-queue the SKB from whichever source it was taken. > This also got rid of the skb_get() which causes problems in > icmpv6_send(). And latest upstream code uses inet_frag_pull_head() > which does the same. > > To fix the crash in IPv6, we use the same modifications added > to ip_expire() by fa0f527358. This might be too much change for > now because IPv6 only starts using rbtrees for frags with > > 997dd96471 ("net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c") > > which has not been backported to 4.4.y. Testing by a reporter was > showing good results. Likely the else part never gets used until > 997dd96471 is backported, too. And that needs more changes. > Some upstream (stable) discussion was started but has not yet > resulted in any usable results. So adding this as SAUCE for now > to get the kernel stable (based on testing). > > Fixes: bf8187348f ("ipv6: frags: rewrite ip6_expire_frag_queue()") > in the linux-4.4.y stable tree. > (based-on: f78a3f45e7 ("ip: use rb trees for IP frag queue." 4.4.y)) > Signed-off-by: Stefan Bader <stefan.bader@canonical.com> > --- > net/ipv6/reassembly.c | 26 +++++++++++++++++++++----- > 1 file changed, 21 insertions(+), 5 deletions(-) > > diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c > index ec917f58d105..5b09ce54c476 100644 > --- a/net/ipv6/reassembly.c > +++ b/net/ipv6/reassembly.c > @@ -92,7 +92,7 @@ EXPORT_SYMBOL(ip6_frag_init); > void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq) > { > struct net_device *dev = NULL; > - struct sk_buff *head; > + struct sk_buff *head = NULL; > > rcu_read_lock(); > spin_lock(&fq->q.lock); > @@ -110,26 +110,42 @@ void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq) > IP6_INC_STATS_BH(net, __in6_dev_get(dev), IPSTATS_MIB_REASMTIMEOUT); > > /* Don't send error if the first segment did not arrive. */ > - head = fq->q.fragments; > - if (!(fq->q.flags & INET_FRAG_FIRST_IN) || !head) > + if (!(fq->q.flags & INET_FRAG_FIRST_IN)) > goto out; > > + if (fq->q.fragments) { > + head = fq->q.fragments; > + fq->q.fragments = head->next; > + } else { > + head = skb_rb_first(&fq->q.rb_fragments); > + if (!head) > + goto out; > + rb_erase(&head->rbnode, &fq->q.rb_fragments); > + memset(&head->rbnode, 0, sizeof(head->rbnode)); > + barrier(); > + } > + > + if (head == fq->q.fragments_tail) > + fq->q.fragments_tail = NULL; > + > + sub_frag_mem_limit(fq->q.net, head->truesize); > + > /* But use as source device on which LAST ARRIVED > * segment was received. And do not use fq->dev > * pointer directly, device might already disappeared. > */ > head->dev = dev; > - skb_get(head); > spin_unlock(&fq->q.lock); > > icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0); > - kfree_skb(head); > goto out_rcu_unlock; > > out: > spin_unlock(&fq->q.lock); > out_rcu_unlock: > rcu_read_unlock(); > + if (head) > + kfree_skb(head); > inet_frag_put(&fq->q); > } > EXPORT_SYMBOL(ip6_expire_frag_queue); > this now closer matches the intent from the commit fa0f527358bd900ef9 that it is based on. The original had positive test results, so I'm hoping this will be OK too (the delta is only minor) and addresses the concerns in Andrea's comments. So, based on this, Acked-by: Colin Ian King <colin.king@canonical.com>
On Thu, Jun 06, 2019 at 12:24:07PM +0200, Stefan Bader wrote: > BugLink: https://bugs.launchpad.net/bugs/1824687 > > The backport of > > 05c0b86b96 ("ipv6: frags: rewrite ip6_expire_frag_queue()") > > to linux-4.4.y stable changed ip6_expire_frag_queue() to be similar > to ip_expire(). However, using skb_get() leads to a crash while > sending the ICMP message due to a check for shared SKBs. > > kernel BUG at linux-4.4.0/net/core/skbuff.c:1207! > RIP: 0010:[<ffffffff81740953>] > [<ffffffff81740953>] pskb_expand_head+0x243/0x250 > [<ffffffff81740e50>] __pskb_pull_tail+0x50/0x350 > [<ffffffff8183939a>] _decode_session6+0x26a/0x400 > [<ffffffff817ec719>] __xfrm_decode_session+0x39/0x50 > [<ffffffff818239d0>] icmpv6_route_lookup+0xf0/0x1c0 > [<ffffffff81824421>] icmp6_send+0x5e1/0x940 > [<ffffffff8183d431>] icmpv6_send+0x21/0x30 > [<ffffffff8182b500>] ip6_expire_frag_queue+0xe0/0x120 > > For IPv4 the ip_expire() function however did change considerably > since then. In > > fa0f527358 ("ip: use rb trees for IP frag queue.") > > the SKB might be taken from a rbtree (use of rbtrees for IPv4 was > backported to 4.4.y upstream). > Along with those obvious changes, the code also is modified to > actually de-queue the SKB from whichever source it was taken. > This also got rid of the skb_get() which causes problems in > icmpv6_send(). And latest upstream code uses inet_frag_pull_head() > which does the same. > > To fix the crash in IPv6, we use the same modifications added > to ip_expire() by fa0f527358. This might be too much change for > now because IPv6 only starts using rbtrees for frags with > > 997dd96471 ("net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c") > > which has not been backported to 4.4.y. Testing by a reporter was > showing good results. Likely the else part never gets used until > 997dd96471 is backported, too. And that needs more changes. > Some upstream (stable) discussion was started but has not yet > resulted in any usable results. So adding this as SAUCE for now > to get the kernel stable (based on testing). > > Fixes: bf8187348f ("ipv6: frags: rewrite ip6_expire_frag_queue()") > in the linux-4.4.y stable tree. > (based-on: f78a3f45e7 ("ip: use rb trees for IP frag queue." 4.4.y)) > Signed-off-by: Stefan Bader <stefan.bader@canonical.com> > --- > net/ipv6/reassembly.c | 26 +++++++++++++++++++++----- > 1 file changed, 21 insertions(+), 5 deletions(-) > > diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c > index ec917f58d105..5b09ce54c476 100644 > --- a/net/ipv6/reassembly.c > +++ b/net/ipv6/reassembly.c > @@ -92,7 +92,7 @@ EXPORT_SYMBOL(ip6_frag_init); > void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq) > { > struct net_device *dev = NULL; > - struct sk_buff *head; > + struct sk_buff *head = NULL; > > rcu_read_lock(); > spin_lock(&fq->q.lock); > @@ -110,26 +110,42 @@ void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq) > IP6_INC_STATS_BH(net, __in6_dev_get(dev), IPSTATS_MIB_REASMTIMEOUT); > > /* Don't send error if the first segment did not arrive. */ > - head = fq->q.fragments; > - if (!(fq->q.flags & INET_FRAG_FIRST_IN) || !head) > + if (!(fq->q.flags & INET_FRAG_FIRST_IN)) > goto out; > > + if (fq->q.fragments) { > + head = fq->q.fragments; > + fq->q.fragments = head->next; > + } else { > + head = skb_rb_first(&fq->q.rb_fragments); > + if (!head) > + goto out; > + rb_erase(&head->rbnode, &fq->q.rb_fragments); > + memset(&head->rbnode, 0, sizeof(head->rbnode)); > + barrier(); > + } > + > + if (head == fq->q.fragments_tail) > + fq->q.fragments_tail = NULL; > + > + sub_frag_mem_limit(fq->q.net, head->truesize); > + > /* But use as source device on which LAST ARRIVED > * segment was received. And do not use fq->dev > * pointer directly, device might already disappeared. > */ > head->dev = dev; > - skb_get(head); > spin_unlock(&fq->q.lock); > > icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0); > - kfree_skb(head); > goto out_rcu_unlock; > > out: > spin_unlock(&fq->q.lock); > out_rcu_unlock: > rcu_read_unlock(); > + if (head) > + kfree_skb(head); > inet_frag_put(&fq->q); > } > EXPORT_SYMBOL(ip6_expire_frag_queue); Looks good to me! Acked-by: Andrea Righi <andrea.righi@canonical.com>
On 06.06.19 12:24, Stefan Bader wrote: > BugLink: https://bugs.launchpad.net/bugs/1824687 > > The backport of > > 05c0b86b96 ("ipv6: frags: rewrite ip6_expire_frag_queue()") > > to linux-4.4.y stable changed ip6_expire_frag_queue() to be similar > to ip_expire(). However, using skb_get() leads to a crash while > sending the ICMP message due to a check for shared SKBs. > > kernel BUG at linux-4.4.0/net/core/skbuff.c:1207! > RIP: 0010:[<ffffffff81740953>] > [<ffffffff81740953>] pskb_expand_head+0x243/0x250 > [<ffffffff81740e50>] __pskb_pull_tail+0x50/0x350 > [<ffffffff8183939a>] _decode_session6+0x26a/0x400 > [<ffffffff817ec719>] __xfrm_decode_session+0x39/0x50 > [<ffffffff818239d0>] icmpv6_route_lookup+0xf0/0x1c0 > [<ffffffff81824421>] icmp6_send+0x5e1/0x940 > [<ffffffff8183d431>] icmpv6_send+0x21/0x30 > [<ffffffff8182b500>] ip6_expire_frag_queue+0xe0/0x120 > > For IPv4 the ip_expire() function however did change considerably > since then. In > > fa0f527358 ("ip: use rb trees for IP frag queue.") > > the SKB might be taken from a rbtree (use of rbtrees for IPv4 was > backported to 4.4.y upstream). > Along with those obvious changes, the code also is modified to > actually de-queue the SKB from whichever source it was taken. > This also got rid of the skb_get() which causes problems in > icmpv6_send(). And latest upstream code uses inet_frag_pull_head() > which does the same. > > To fix the crash in IPv6, we use the same modifications added > to ip_expire() by fa0f527358. This might be too much change for > now because IPv6 only starts using rbtrees for frags with > > 997dd96471 ("net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c") > > which has not been backported to 4.4.y. Testing by a reporter was > showing good results. Likely the else part never gets used until > 997dd96471 is backported, too. And that needs more changes. > Some upstream (stable) discussion was started but has not yet > resulted in any usable results. So adding this as SAUCE for now > to get the kernel stable (based on testing). > > Fixes: bf8187348f ("ipv6: frags: rewrite ip6_expire_frag_queue()") > in the linux-4.4.y stable tree. > (based-on: f78a3f45e7 ("ip: use rb trees for IP frag queue." 4.4.y)) > Signed-off-by: Stefan Bader <stefan.bader@canonical.com> > --- Applied to xenial/master-next and cosmic/master-next. Thanks. -Stefan > net/ipv6/reassembly.c | 26 +++++++++++++++++++++----- > 1 file changed, 21 insertions(+), 5 deletions(-) > > diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c > index ec917f58d105..5b09ce54c476 100644 > --- a/net/ipv6/reassembly.c > +++ b/net/ipv6/reassembly.c > @@ -92,7 +92,7 @@ EXPORT_SYMBOL(ip6_frag_init); > void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq) > { > struct net_device *dev = NULL; > - struct sk_buff *head; > + struct sk_buff *head = NULL; > > rcu_read_lock(); > spin_lock(&fq->q.lock); > @@ -110,26 +110,42 @@ void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq) > IP6_INC_STATS_BH(net, __in6_dev_get(dev), IPSTATS_MIB_REASMTIMEOUT); > > /* Don't send error if the first segment did not arrive. */ > - head = fq->q.fragments; > - if (!(fq->q.flags & INET_FRAG_FIRST_IN) || !head) > + if (!(fq->q.flags & INET_FRAG_FIRST_IN)) > goto out; > > + if (fq->q.fragments) { > + head = fq->q.fragments; > + fq->q.fragments = head->next; > + } else { > + head = skb_rb_first(&fq->q.rb_fragments); > + if (!head) > + goto out; > + rb_erase(&head->rbnode, &fq->q.rb_fragments); > + memset(&head->rbnode, 0, sizeof(head->rbnode)); > + barrier(); > + } > + > + if (head == fq->q.fragments_tail) > + fq->q.fragments_tail = NULL; > + > + sub_frag_mem_limit(fq->q.net, head->truesize); > + > /* But use as source device on which LAST ARRIVED > * segment was received. And do not use fq->dev > * pointer directly, device might already disappeared. > */ > head->dev = dev; > - skb_get(head); > spin_unlock(&fq->q.lock); > > icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0); > - kfree_skb(head); > goto out_rcu_unlock; > > out: > spin_unlock(&fq->q.lock); > out_rcu_unlock: > rcu_read_unlock(); > + if (head) > + kfree_skb(head); > inet_frag_put(&fq->q); > } > EXPORT_SYMBOL(ip6_expire_frag_queue); >
On Thu, Jun 6, 2019 at 5:22 AM Stefan Bader <stefan.bader@canonical.com> wrote: > > On 06.06.19 12:24, Stefan Bader wrote: > > BugLink: https://bugs.launchpad.net/bugs/1824687 > > > > The backport of > > > > 05c0b86b96 ("ipv6: frags: rewrite ip6_expire_frag_queue()") > > > > to linux-4.4.y stable changed ip6_expire_frag_queue() to be similar > > to ip_expire(). However, using skb_get() leads to a crash while > > sending the ICMP message due to a check for shared SKBs. > > > > kernel BUG at linux-4.4.0/net/core/skbuff.c:1207! > > RIP: 0010:[<ffffffff81740953>] > > [<ffffffff81740953>] pskb_expand_head+0x243/0x250 > > [<ffffffff81740e50>] __pskb_pull_tail+0x50/0x350 > > [<ffffffff8183939a>] _decode_session6+0x26a/0x400 > > [<ffffffff817ec719>] __xfrm_decode_session+0x39/0x50 > > [<ffffffff818239d0>] icmpv6_route_lookup+0xf0/0x1c0 > > [<ffffffff81824421>] icmp6_send+0x5e1/0x940 > > [<ffffffff8183d431>] icmpv6_send+0x21/0x30 > > [<ffffffff8182b500>] ip6_expire_frag_queue+0xe0/0x120 > > > > For IPv4 the ip_expire() function however did change considerably > > since then. In > > > > fa0f527358 ("ip: use rb trees for IP frag queue.") > > > > the SKB might be taken from a rbtree (use of rbtrees for IPv4 was > > backported to 4.4.y upstream). > > Along with those obvious changes, the code also is modified to > > actually de-queue the SKB from whichever source it was taken. > > This also got rid of the skb_get() which causes problems in > > icmpv6_send(). And latest upstream code uses inet_frag_pull_head() > > which does the same. > > > > To fix the crash in IPv6, we use the same modifications added > > to ip_expire() by fa0f527358. This might be too much change for > > now because IPv6 only starts using rbtrees for frags with > > > > 997dd96471 ("net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c") > > > > which has not been backported to 4.4.y. Testing by a reporter was > > showing good results. Likely the else part never gets used until > > 997dd96471 is backported, too. And that needs more changes. > > Some upstream (stable) discussion was started but has not yet > > resulted in any usable results. So adding this as SAUCE for now > > to get the kernel stable (based on testing). > > > > Fixes: bf8187348f ("ipv6: frags: rewrite ip6_expire_frag_queue()") > > in the linux-4.4.y stable tree. > > (based-on: f78a3f45e7 ("ip: use rb trees for IP frag queue." 4.4.y)) > > Signed-off-by: Stefan Bader <stefan.bader@canonical.com> > > --- > > Applied to xenial/master-next and cosmic/master-next. Thanks. fyi, cosmic/master-next is failing to build for me after this patch: https://paste.ubuntu.com/p/kjKJBk6BRN/ -dann > > net/ipv6/reassembly.c | 26 +++++++++++++++++++++----- > > 1 file changed, 21 insertions(+), 5 deletions(-) > > > > diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c > > index ec917f58d105..5b09ce54c476 100644 > > --- a/net/ipv6/reassembly.c > > +++ b/net/ipv6/reassembly.c > > @@ -92,7 +92,7 @@ EXPORT_SYMBOL(ip6_frag_init); > > void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq) > > { > > struct net_device *dev = NULL; > > - struct sk_buff *head; > > + struct sk_buff *head = NULL; > > > > rcu_read_lock(); > > spin_lock(&fq->q.lock); > > @@ -110,26 +110,42 @@ void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq) > > IP6_INC_STATS_BH(net, __in6_dev_get(dev), IPSTATS_MIB_REASMTIMEOUT); > > > > /* Don't send error if the first segment did not arrive. */ > > - head = fq->q.fragments; > > - if (!(fq->q.flags & INET_FRAG_FIRST_IN) || !head) > > + if (!(fq->q.flags & INET_FRAG_FIRST_IN)) > > goto out; > > > > + if (fq->q.fragments) { > > + head = fq->q.fragments; > > + fq->q.fragments = head->next; > > + } else { > > + head = skb_rb_first(&fq->q.rb_fragments); > > + if (!head) > > + goto out; > > + rb_erase(&head->rbnode, &fq->q.rb_fragments); > > + memset(&head->rbnode, 0, sizeof(head->rbnode)); > > + barrier(); > > + } > > + > > + if (head == fq->q.fragments_tail) > > + fq->q.fragments_tail = NULL; > > + > > + sub_frag_mem_limit(fq->q.net, head->truesize); > > + > > /* But use as source device on which LAST ARRIVED > > * segment was received. And do not use fq->dev > > * pointer directly, device might already disappeared. > > */ > > head->dev = dev; > > - skb_get(head); > > spin_unlock(&fq->q.lock); > > > > icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0); > > - kfree_skb(head); > > goto out_rcu_unlock; > > > > out: > > spin_unlock(&fq->q.lock); > > out_rcu_unlock: > > rcu_read_unlock(); > > + if (head) > > + kfree_skb(head); > > inet_frag_put(&fq->q); > > } > > EXPORT_SYMBOL(ip6_expire_frag_queue); > > > > > -- > kernel-team mailing list > kernel-team@lists.ubuntu.com > https://lists.ubuntu.com/mailman/listinfo/kernel-team
On 10.06.19 17:49, dann frazier wrote: > On Thu, Jun 6, 2019 at 5:22 AM Stefan Bader <stefan.bader@canonical.com> wrote: >> >> On 06.06.19 12:24, Stefan Bader wrote: >>> BugLink: https://bugs.launchpad.net/bugs/1824687 >>> >>> The backport of >>> >>> 05c0b86b96 ("ipv6: frags: rewrite ip6_expire_frag_queue()") >>> >>> to linux-4.4.y stable changed ip6_expire_frag_queue() to be similar >>> to ip_expire(). However, using skb_get() leads to a crash while >>> sending the ICMP message due to a check for shared SKBs. >>> >>> kernel BUG at linux-4.4.0/net/core/skbuff.c:1207! >>> RIP: 0010:[<ffffffff81740953>] >>> [<ffffffff81740953>] pskb_expand_head+0x243/0x250 >>> [<ffffffff81740e50>] __pskb_pull_tail+0x50/0x350 >>> [<ffffffff8183939a>] _decode_session6+0x26a/0x400 >>> [<ffffffff817ec719>] __xfrm_decode_session+0x39/0x50 >>> [<ffffffff818239d0>] icmpv6_route_lookup+0xf0/0x1c0 >>> [<ffffffff81824421>] icmp6_send+0x5e1/0x940 >>> [<ffffffff8183d431>] icmpv6_send+0x21/0x30 >>> [<ffffffff8182b500>] ip6_expire_frag_queue+0xe0/0x120 >>> >>> For IPv4 the ip_expire() function however did change considerably >>> since then. In >>> >>> fa0f527358 ("ip: use rb trees for IP frag queue.") >>> >>> the SKB might be taken from a rbtree (use of rbtrees for IPv4 was >>> backported to 4.4.y upstream). >>> Along with those obvious changes, the code also is modified to >>> actually de-queue the SKB from whichever source it was taken. >>> This also got rid of the skb_get() which causes problems in >>> icmpv6_send(). And latest upstream code uses inet_frag_pull_head() >>> which does the same. >>> >>> To fix the crash in IPv6, we use the same modifications added >>> to ip_expire() by fa0f527358. This might be too much change for >>> now because IPv6 only starts using rbtrees for frags with >>> >>> 997dd96471 ("net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c") >>> >>> which has not been backported to 4.4.y. Testing by a reporter was >>> showing good results. Likely the else part never gets used until >>> 997dd96471 is backported, too. And that needs more changes. >>> Some upstream (stable) discussion was started but has not yet >>> resulted in any usable results. So adding this as SAUCE for now >>> to get the kernel stable (based on testing). >>> >>> Fixes: bf8187348f ("ipv6: frags: rewrite ip6_expire_frag_queue()") >>> in the linux-4.4.y stable tree. >>> (based-on: f78a3f45e7 ("ip: use rb trees for IP frag queue." 4.4.y)) >>> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> >>> --- >> >> Applied to xenial/master-next and cosmic/master-next. Thanks. > > fyi, cosmic/master-next is failing to build for me after this patch: > https://paste.ubuntu.com/p/kjKJBk6BRN/ Hrm, I should have checked whether rbtree usage was added to ipv4 in 4.18. Was fooled by the ipv6 function looking just like in the 4.4.y backports. And the ip_expire() function back then seems to just do the skb_get(). So either everything is even more horribly broken or the bug statement was not added. Since we have no explicit reports for Cosmic, we should drop it there for now. -Stefan > > -dann > >>> net/ipv6/reassembly.c | 26 +++++++++++++++++++++----- >>> 1 file changed, 21 insertions(+), 5 deletions(-) >>> >>> diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c >>> index ec917f58d105..5b09ce54c476 100644 >>> --- a/net/ipv6/reassembly.c >>> +++ b/net/ipv6/reassembly.c >>> @@ -92,7 +92,7 @@ EXPORT_SYMBOL(ip6_frag_init); >>> void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq) >>> { >>> struct net_device *dev = NULL; >>> - struct sk_buff *head; >>> + struct sk_buff *head = NULL; >>> >>> rcu_read_lock(); >>> spin_lock(&fq->q.lock); >>> @@ -110,26 +110,42 @@ void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq) >>> IP6_INC_STATS_BH(net, __in6_dev_get(dev), IPSTATS_MIB_REASMTIMEOUT); >>> >>> /* Don't send error if the first segment did not arrive. */ >>> - head = fq->q.fragments; >>> - if (!(fq->q.flags & INET_FRAG_FIRST_IN) || !head) >>> + if (!(fq->q.flags & INET_FRAG_FIRST_IN)) >>> goto out; >>> >>> + if (fq->q.fragments) { >>> + head = fq->q.fragments; >>> + fq->q.fragments = head->next; >>> + } else { >>> + head = skb_rb_first(&fq->q.rb_fragments); >>> + if (!head) >>> + goto out; >>> + rb_erase(&head->rbnode, &fq->q.rb_fragments); >>> + memset(&head->rbnode, 0, sizeof(head->rbnode)); >>> + barrier(); >>> + } >>> + >>> + if (head == fq->q.fragments_tail) >>> + fq->q.fragments_tail = NULL; >>> + >>> + sub_frag_mem_limit(fq->q.net, head->truesize); >>> + >>> /* But use as source device on which LAST ARRIVED >>> * segment was received. And do not use fq->dev >>> * pointer directly, device might already disappeared. >>> */ >>> head->dev = dev; >>> - skb_get(head); >>> spin_unlock(&fq->q.lock); >>> >>> icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0); >>> - kfree_skb(head); >>> goto out_rcu_unlock; >>> >>> out: >>> spin_unlock(&fq->q.lock); >>> out_rcu_unlock: >>> rcu_read_unlock(); >>> + if (head) >>> + kfree_skb(head); >>> inet_frag_put(&fq->q); >>> } >>> EXPORT_SYMBOL(ip6_expire_frag_queue); >>> >> >> >> -- >> kernel-team mailing list >> kernel-team@lists.ubuntu.com >> https://lists.ubuntu.com/mailman/listinfo/kernel-team
For now dropped from Cosmic.
diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c index ec917f58d105..5b09ce54c476 100644 --- a/net/ipv6/reassembly.c +++ b/net/ipv6/reassembly.c @@ -92,7 +92,7 @@ EXPORT_SYMBOL(ip6_frag_init); void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq) { struct net_device *dev = NULL; - struct sk_buff *head; + struct sk_buff *head = NULL; rcu_read_lock(); spin_lock(&fq->q.lock); @@ -110,26 +110,42 @@ void ip6_expire_frag_queue(struct net *net, struct frag_queue *fq) IP6_INC_STATS_BH(net, __in6_dev_get(dev), IPSTATS_MIB_REASMTIMEOUT); /* Don't send error if the first segment did not arrive. */ - head = fq->q.fragments; - if (!(fq->q.flags & INET_FRAG_FIRST_IN) || !head) + if (!(fq->q.flags & INET_FRAG_FIRST_IN)) goto out; + if (fq->q.fragments) { + head = fq->q.fragments; + fq->q.fragments = head->next; + } else { + head = skb_rb_first(&fq->q.rb_fragments); + if (!head) + goto out; + rb_erase(&head->rbnode, &fq->q.rb_fragments); + memset(&head->rbnode, 0, sizeof(head->rbnode)); + barrier(); + } + + if (head == fq->q.fragments_tail) + fq->q.fragments_tail = NULL; + + sub_frag_mem_limit(fq->q.net, head->truesize); + /* But use as source device on which LAST ARRIVED * segment was received. And do not use fq->dev * pointer directly, device might already disappeared. */ head->dev = dev; - skb_get(head); spin_unlock(&fq->q.lock); icmpv6_send(head, ICMPV6_TIME_EXCEED, ICMPV6_EXC_FRAGTIME, 0); - kfree_skb(head); goto out_rcu_unlock; out: spin_unlock(&fq->q.lock); out_rcu_unlock: rcu_read_unlock(); + if (head) + kfree_skb(head); inet_frag_put(&fq->q); } EXPORT_SYMBOL(ip6_expire_frag_queue);
BugLink: https://bugs.launchpad.net/bugs/1824687 The backport of 05c0b86b96 ("ipv6: frags: rewrite ip6_expire_frag_queue()") to linux-4.4.y stable changed ip6_expire_frag_queue() to be similar to ip_expire(). However, using skb_get() leads to a crash while sending the ICMP message due to a check for shared SKBs. kernel BUG at linux-4.4.0/net/core/skbuff.c:1207! RIP: 0010:[<ffffffff81740953>] [<ffffffff81740953>] pskb_expand_head+0x243/0x250 [<ffffffff81740e50>] __pskb_pull_tail+0x50/0x350 [<ffffffff8183939a>] _decode_session6+0x26a/0x400 [<ffffffff817ec719>] __xfrm_decode_session+0x39/0x50 [<ffffffff818239d0>] icmpv6_route_lookup+0xf0/0x1c0 [<ffffffff81824421>] icmp6_send+0x5e1/0x940 [<ffffffff8183d431>] icmpv6_send+0x21/0x30 [<ffffffff8182b500>] ip6_expire_frag_queue+0xe0/0x120 For IPv4 the ip_expire() function however did change considerably since then. In fa0f527358 ("ip: use rb trees for IP frag queue.") the SKB might be taken from a rbtree (use of rbtrees for IPv4 was backported to 4.4.y upstream). Along with those obvious changes, the code also is modified to actually de-queue the SKB from whichever source it was taken. This also got rid of the skb_get() which causes problems in icmpv6_send(). And latest upstream code uses inet_frag_pull_head() which does the same. To fix the crash in IPv6, we use the same modifications added to ip_expire() by fa0f527358. This might be too much change for now because IPv6 only starts using rbtrees for frags with 997dd96471 ("net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c") which has not been backported to 4.4.y. Testing by a reporter was showing good results. Likely the else part never gets used until 997dd96471 is backported, too. And that needs more changes. Some upstream (stable) discussion was started but has not yet resulted in any usable results. So adding this as SAUCE for now to get the kernel stable (based on testing). Fixes: bf8187348f ("ipv6: frags: rewrite ip6_expire_frag_queue()") in the linux-4.4.y stable tree. (based-on: f78a3f45e7 ("ip: use rb trees for IP frag queue." 4.4.y)) Signed-off-by: Stefan Bader <stefan.bader@canonical.com> --- net/ipv6/reassembly.c | 26 +++++++++++++++++++++----- 1 file changed, 21 insertions(+), 5 deletions(-)