Message ID | 20190904174049.27082-1-cascardo@canonical.com |
---|---|
State | New |
Headers | show |
Series | [SRU,Bionic] ip: frags: fix crash in ip_do_fragment() | expand |
On Wed, Sep 4, 2019 at 2:41 PM Thadeu Lima de Souza Cascardo <cascardo@canonical.com> wrote: > > From: Taehee Yoo <ap420073@gmail.com> > > BugLink: https://bugs.launchpad.net/bugs/1842447 > > commit 5d407b071dc369c26a38398326ee2be53651cfe4 upstream > > A kernel crash occurrs when defragmented packet is fragmented > in ip_do_fragment(). > In defragment routine, skb_orphan() is called and > skb->ip_defrag_offset is set. but skb->sk and > skb->ip_defrag_offset are same union member. so that > frag->sk is not NULL. > Hence crash occurrs in skb->sk check routine in ip_do_fragment() when > defragmented packet is fragmented. > > test commands: > %iptables -t nat -I POSTROUTING -j MASQUERADE > %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000 > > splat looks like: > [ 261.069429] kernel BUG at net/ipv4/ip_output.c:636! Thanks for sending this one out! This is probably the issue responsible for a late NAK to a bnx2x patch recently, per the line/function, although with a different stack trace in the forwarding path. I'll ask that reporter to check this patch too, but response may not happen in a timely manner. cheers, Mauricio > [ 261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI > [ 261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3 > [ 261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600 > [ 261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c > [ 261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202 > [ 261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004 > [ 261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8 > [ 261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395 > [ 261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4 > [ 261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000 > [ 261.174169] FS: 00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000 > [ 261.183012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0 > [ 261.198158] Call Trace: > [ 261.199018] ? dst_output+0x180/0x180 > [ 261.205011] ? save_trace+0x300/0x300 > [ 261.209018] ? ip_copy_metadata+0xb00/0xb00 > [ 261.213034] ? sched_clock_local+0xd4/0x140 > [ 261.218158] ? kill_l4proto+0x120/0x120 [nf_conntrack] > [ 261.223014] ? rt_cpu_seq_stop+0x10/0x10 > [ 261.227014] ? find_held_lock+0x39/0x1c0 > [ 261.233008] ip_finish_output+0x51d/0xb50 > [ 261.237006] ? ip_fragment.constprop.56+0x220/0x220 > [ 261.243011] ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack] > [ 261.250152] ? rcu_is_watching+0x77/0x120 > [ 261.255010] ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4] > [ 261.261033] ? nf_hook_slow+0xb1/0x160 > [ 261.265007] ip_output+0x1c7/0x710 > [ 261.269005] ? ip_mc_output+0x13f0/0x13f0 > [ 261.273002] ? __local_bh_enable_ip+0xe9/0x1b0 > [ 261.278152] ? ip_fragment.constprop.56+0x220/0x220 > [ 261.282996] ? nf_hook_slow+0xb1/0x160 > [ 261.287007] raw_sendmsg+0x21f9/0x4420 > [ 261.291008] ? dst_output+0x180/0x180 > [ 261.297003] ? sched_clock_cpu+0x126/0x170 > [ 261.301003] ? find_held_lock+0x39/0x1c0 > [ 261.306155] ? stop_critical_timings+0x420/0x420 > [ 261.311004] ? check_flags.part.36+0x450/0x450 > [ 261.315005] ? _raw_spin_unlock_irq+0x29/0x40 > [ 261.320995] ? _raw_spin_unlock_irq+0x29/0x40 > [ 261.326142] ? cyc2ns_read_end+0x10/0x10 > [ 261.330139] ? raw_bind+0x280/0x280 > [ 261.334138] ? sched_clock_cpu+0x126/0x170 > [ 261.338995] ? check_flags.part.36+0x450/0x450 > [ 261.342991] ? __lock_acquire+0x4500/0x4500 > [ 261.348994] ? inet_sendmsg+0x11c/0x500 > [ 261.352989] ? dst_output+0x180/0x180 > [ 261.357012] inet_sendmsg+0x11c/0x500 > [ ... ] > > v2: > - clear skb->sk at reassembly routine.(Eric Dumarzet) > > Fixes: fa0f527358bd ("ip: use rb trees for IP frag queue.") > Suggested-by: Eric Dumazet <edumazet@google.com> > Signed-off-by: Taehee Yoo <ap420073@gmail.com> > Reviewed-by: Eric Dumazet <edumazet@google.com> > Signed-off-by: David S. Miller <davem@davemloft.net> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > (backported from commit 08fb833b40e361ce927c64d40e348af96996d9eb) > > [cascardo: > The backport here misses the hunk at net/ipv6/netfilter/nf_conntrack_reasm.c. > This one has been changed by our commit (net: IP6 defrag: use rbtrees in > nf_conntrack_reasm.c), and calls inet_frag_reasm_finish, which resets sk to > NULL. > ] > > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> > --- > net/ipv4/ip_fragment.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c > index 73ca84d880b9..098c6dcff305 100644 > --- a/net/ipv4/ip_fragment.c > +++ b/net/ipv4/ip_fragment.c > @@ -606,6 +606,7 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb, > nextp = &fp->next; > fp->prev = NULL; > memset(&fp->rbnode, 0, sizeof(fp->rbnode)); > + fp->sk = NULL; > head->data_len += fp->len; > head->len += fp->len; > if (head->ip_summed != fp->ip_summed) > -- > 2.20.1 > > > -- > kernel-team mailing list > kernel-team@lists.ubuntu.com > https://lists.ubuntu.com/mailman/listinfo/kernel-team
On 2019-09-04 14:40:49 , Thadeu Lima de Souza Cascardo wrote: > From: Taehee Yoo <ap420073@gmail.com> > > BugLink: https://bugs.launchpad.net/bugs/1842447 > > commit 5d407b071dc369c26a38398326ee2be53651cfe4 upstream > > A kernel crash occurrs when defragmented packet is fragmented > in ip_do_fragment(). > In defragment routine, skb_orphan() is called and > skb->ip_defrag_offset is set. but skb->sk and > skb->ip_defrag_offset are same union member. so that > frag->sk is not NULL. > Hence crash occurrs in skb->sk check routine in ip_do_fragment() when > defragmented packet is fragmented. > > test commands: > %iptables -t nat -I POSTROUTING -j MASQUERADE > %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000 > > splat looks like: > [ 261.069429] kernel BUG at net/ipv4/ip_output.c:636! > [ 261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI > [ 261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3 > [ 261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600 > [ 261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c > [ 261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202 > [ 261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004 > [ 261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8 > [ 261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395 > [ 261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4 > [ 261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000 > [ 261.174169] FS: 00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000 > [ 261.183012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0 > [ 261.198158] Call Trace: > [ 261.199018] ? dst_output+0x180/0x180 > [ 261.205011] ? save_trace+0x300/0x300 > [ 261.209018] ? ip_copy_metadata+0xb00/0xb00 > [ 261.213034] ? sched_clock_local+0xd4/0x140 > [ 261.218158] ? kill_l4proto+0x120/0x120 [nf_conntrack] > [ 261.223014] ? rt_cpu_seq_stop+0x10/0x10 > [ 261.227014] ? find_held_lock+0x39/0x1c0 > [ 261.233008] ip_finish_output+0x51d/0xb50 > [ 261.237006] ? ip_fragment.constprop.56+0x220/0x220 > [ 261.243011] ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack] > [ 261.250152] ? rcu_is_watching+0x77/0x120 > [ 261.255010] ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4] > [ 261.261033] ? nf_hook_slow+0xb1/0x160 > [ 261.265007] ip_output+0x1c7/0x710 > [ 261.269005] ? ip_mc_output+0x13f0/0x13f0 > [ 261.273002] ? __local_bh_enable_ip+0xe9/0x1b0 > [ 261.278152] ? ip_fragment.constprop.56+0x220/0x220 > [ 261.282996] ? nf_hook_slow+0xb1/0x160 > [ 261.287007] raw_sendmsg+0x21f9/0x4420 > [ 261.291008] ? dst_output+0x180/0x180 > [ 261.297003] ? sched_clock_cpu+0x126/0x170 > [ 261.301003] ? find_held_lock+0x39/0x1c0 > [ 261.306155] ? stop_critical_timings+0x420/0x420 > [ 261.311004] ? check_flags.part.36+0x450/0x450 > [ 261.315005] ? _raw_spin_unlock_irq+0x29/0x40 > [ 261.320995] ? _raw_spin_unlock_irq+0x29/0x40 > [ 261.326142] ? cyc2ns_read_end+0x10/0x10 > [ 261.330139] ? raw_bind+0x280/0x280 > [ 261.334138] ? sched_clock_cpu+0x126/0x170 > [ 261.338995] ? check_flags.part.36+0x450/0x450 > [ 261.342991] ? __lock_acquire+0x4500/0x4500 > [ 261.348994] ? inet_sendmsg+0x11c/0x500 > [ 261.352989] ? dst_output+0x180/0x180 > [ 261.357012] inet_sendmsg+0x11c/0x500 > [ ... ] > > v2: > - clear skb->sk at reassembly routine.(Eric Dumarzet) > > Fixes: fa0f527358bd ("ip: use rb trees for IP frag queue.") > Suggested-by: Eric Dumazet <edumazet@google.com> > Signed-off-by: Taehee Yoo <ap420073@gmail.com> > Reviewed-by: Eric Dumazet <edumazet@google.com> > Signed-off-by: David S. Miller <davem@davemloft.net> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > (backported from commit 08fb833b40e361ce927c64d40e348af96996d9eb) > > [cascardo: > The backport here misses the hunk at net/ipv6/netfilter/nf_conntrack_reasm.c. > This one has been changed by our commit (net: IP6 defrag: use rbtrees in > nf_conntrack_reasm.c), and calls inet_frag_reasm_finish, which resets sk to > NULL. > ] > > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> > --- > net/ipv4/ip_fragment.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c > index 73ca84d880b9..098c6dcff305 100644 > --- a/net/ipv4/ip_fragment.c > +++ b/net/ipv4/ip_fragment.c > @@ -606,6 +606,7 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb, > nextp = &fp->next; > fp->prev = NULL; > memset(&fp->rbnode, 0, sizeof(fp->rbnode)); > + fp->sk = NULL; > head->data_len += fp->len; > head->len += fp->len; > if (head->ip_summed != fp->ip_summed) Acked-by: Khalid Elmously <khalid.elmously@canonical.com>
On 9/4/19 10:40 AM, Thadeu Lima de Souza Cascardo wrote: > From: Taehee Yoo <ap420073@gmail.com> > > BugLink: https://bugs.launchpad.net/bugs/1842447 > > commit 5d407b071dc369c26a38398326ee2be53651cfe4 upstream > > A kernel crash occurrs when defragmented packet is fragmented > in ip_do_fragment(). > In defragment routine, skb_orphan() is called and > skb->ip_defrag_offset is set. but skb->sk and > skb->ip_defrag_offset are same union member. so that > frag->sk is not NULL. > Hence crash occurrs in skb->sk check routine in ip_do_fragment() when > defragmented packet is fragmented. > > test commands: > %iptables -t nat -I POSTROUTING -j MASQUERADE > %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000 > > splat looks like: > [ 261.069429] kernel BUG at net/ipv4/ip_output.c:636! > [ 261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI > [ 261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3 > [ 261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600 > [ 261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c > [ 261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202 > [ 261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004 > [ 261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8 > [ 261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395 > [ 261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4 > [ 261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000 > [ 261.174169] FS: 00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000 > [ 261.183012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0 > [ 261.198158] Call Trace: > [ 261.199018] ? dst_output+0x180/0x180 > [ 261.205011] ? save_trace+0x300/0x300 > [ 261.209018] ? ip_copy_metadata+0xb00/0xb00 > [ 261.213034] ? sched_clock_local+0xd4/0x140 > [ 261.218158] ? kill_l4proto+0x120/0x120 [nf_conntrack] > [ 261.223014] ? rt_cpu_seq_stop+0x10/0x10 > [ 261.227014] ? find_held_lock+0x39/0x1c0 > [ 261.233008] ip_finish_output+0x51d/0xb50 > [ 261.237006] ? ip_fragment.constprop.56+0x220/0x220 > [ 261.243011] ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack] > [ 261.250152] ? rcu_is_watching+0x77/0x120 > [ 261.255010] ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4] > [ 261.261033] ? nf_hook_slow+0xb1/0x160 > [ 261.265007] ip_output+0x1c7/0x710 > [ 261.269005] ? ip_mc_output+0x13f0/0x13f0 > [ 261.273002] ? __local_bh_enable_ip+0xe9/0x1b0 > [ 261.278152] ? ip_fragment.constprop.56+0x220/0x220 > [ 261.282996] ? nf_hook_slow+0xb1/0x160 > [ 261.287007] raw_sendmsg+0x21f9/0x4420 > [ 261.291008] ? dst_output+0x180/0x180 > [ 261.297003] ? sched_clock_cpu+0x126/0x170 > [ 261.301003] ? find_held_lock+0x39/0x1c0 > [ 261.306155] ? stop_critical_timings+0x420/0x420 > [ 261.311004] ? check_flags.part.36+0x450/0x450 > [ 261.315005] ? _raw_spin_unlock_irq+0x29/0x40 > [ 261.320995] ? _raw_spin_unlock_irq+0x29/0x40 > [ 261.326142] ? cyc2ns_read_end+0x10/0x10 > [ 261.330139] ? raw_bind+0x280/0x280 > [ 261.334138] ? sched_clock_cpu+0x126/0x170 > [ 261.338995] ? check_flags.part.36+0x450/0x450 > [ 261.342991] ? __lock_acquire+0x4500/0x4500 > [ 261.348994] ? inet_sendmsg+0x11c/0x500 > [ 261.352989] ? dst_output+0x180/0x180 > [ 261.357012] inet_sendmsg+0x11c/0x500 > [ ... ] > > v2: > - clear skb->sk at reassembly routine.(Eric Dumarzet) > > Fixes: fa0f527358bd ("ip: use rb trees for IP frag queue.") > Suggested-by: Eric Dumazet <edumazet@google.com> > Signed-off-by: Taehee Yoo <ap420073@gmail.com> > Reviewed-by: Eric Dumazet <edumazet@google.com> > Signed-off-by: David S. Miller <davem@davemloft.net> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > (backported from commit 08fb833b40e361ce927c64d40e348af96996d9eb) > > [cascardo: > The backport here misses the hunk at net/ipv6/netfilter/nf_conntrack_reasm.c. > This one has been changed by our commit (net: IP6 defrag: use rbtrees in > nf_conntrack_reasm.c), and calls inet_frag_reasm_finish, which resets sk to > NULL. > ] > > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Connor Kuehl <connor.kuehl@canonical.com> > --- > net/ipv4/ip_fragment.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c > index 73ca84d880b9..098c6dcff305 100644 > --- a/net/ipv4/ip_fragment.c > +++ b/net/ipv4/ip_fragment.c > @@ -606,6 +606,7 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb, > nextp = &fp->next; > fp->prev = NULL; > memset(&fp->rbnode, 0, sizeof(fp->rbnode)); > + fp->sk = NULL; > head->data_len += fp->len; > head->len += fp->len; > if (head->ip_summed != fp->ip_summed) >
On 2019-09-04 14:40:49, Thadeu Lima de Souza Cascardo wrote: > From: Taehee Yoo <ap420073@gmail.com> > > BugLink: https://bugs.launchpad.net/bugs/1842447 > > commit 5d407b071dc369c26a38398326ee2be53651cfe4 upstream > > A kernel crash occurrs when defragmented packet is fragmented > in ip_do_fragment(). > In defragment routine, skb_orphan() is called and > skb->ip_defrag_offset is set. but skb->sk and > skb->ip_defrag_offset are same union member. so that > frag->sk is not NULL. > Hence crash occurrs in skb->sk check routine in ip_do_fragment() when > defragmented packet is fragmented. > > test commands: > %iptables -t nat -I POSTROUTING -j MASQUERADE > %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000 > > splat looks like: > [ 261.069429] kernel BUG at net/ipv4/ip_output.c:636! > [ 261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI > [ 261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3 > [ 261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600 > [ 261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c > [ 261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202 > [ 261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004 > [ 261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8 > [ 261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395 > [ 261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4 > [ 261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000 > [ 261.174169] FS: 00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000 > [ 261.183012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0 > [ 261.198158] Call Trace: > [ 261.199018] ? dst_output+0x180/0x180 > [ 261.205011] ? save_trace+0x300/0x300 > [ 261.209018] ? ip_copy_metadata+0xb00/0xb00 > [ 261.213034] ? sched_clock_local+0xd4/0x140 > [ 261.218158] ? kill_l4proto+0x120/0x120 [nf_conntrack] > [ 261.223014] ? rt_cpu_seq_stop+0x10/0x10 > [ 261.227014] ? find_held_lock+0x39/0x1c0 > [ 261.233008] ip_finish_output+0x51d/0xb50 > [ 261.237006] ? ip_fragment.constprop.56+0x220/0x220 > [ 261.243011] ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack] > [ 261.250152] ? rcu_is_watching+0x77/0x120 > [ 261.255010] ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4] > [ 261.261033] ? nf_hook_slow+0xb1/0x160 > [ 261.265007] ip_output+0x1c7/0x710 > [ 261.269005] ? ip_mc_output+0x13f0/0x13f0 > [ 261.273002] ? __local_bh_enable_ip+0xe9/0x1b0 > [ 261.278152] ? ip_fragment.constprop.56+0x220/0x220 > [ 261.282996] ? nf_hook_slow+0xb1/0x160 > [ 261.287007] raw_sendmsg+0x21f9/0x4420 > [ 261.291008] ? dst_output+0x180/0x180 > [ 261.297003] ? sched_clock_cpu+0x126/0x170 > [ 261.301003] ? find_held_lock+0x39/0x1c0 > [ 261.306155] ? stop_critical_timings+0x420/0x420 > [ 261.311004] ? check_flags.part.36+0x450/0x450 > [ 261.315005] ? _raw_spin_unlock_irq+0x29/0x40 > [ 261.320995] ? _raw_spin_unlock_irq+0x29/0x40 > [ 261.326142] ? cyc2ns_read_end+0x10/0x10 > [ 261.330139] ? raw_bind+0x280/0x280 > [ 261.334138] ? sched_clock_cpu+0x126/0x170 > [ 261.338995] ? check_flags.part.36+0x450/0x450 > [ 261.342991] ? __lock_acquire+0x4500/0x4500 > [ 261.348994] ? inet_sendmsg+0x11c/0x500 > [ 261.352989] ? dst_output+0x180/0x180 > [ 261.357012] inet_sendmsg+0x11c/0x500 > [ ... ] > > v2: > - clear skb->sk at reassembly routine.(Eric Dumarzet) > > Fixes: fa0f527358bd ("ip: use rb trees for IP frag queue.") > Suggested-by: Eric Dumazet <edumazet@google.com> > Signed-off-by: Taehee Yoo <ap420073@gmail.com> > Reviewed-by: Eric Dumazet <edumazet@google.com> > Signed-off-by: David S. Miller <davem@davemloft.net> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > (backported from commit 08fb833b40e361ce927c64d40e348af96996d9eb) > > [cascardo: > The backport here misses the hunk at net/ipv6/netfilter/nf_conntrack_reasm.c. > This one has been changed by our commit (net: IP6 defrag: use rbtrees in > nf_conntrack_reasm.c), and calls inet_frag_reasm_finish, which resets sk to > NULL. > ] > > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> The Fixes line above confused me because I know that we don't have upstream commit fa0f527358bd ("ip: use rb trees for IP frag queue.") in Bionic. However, Bionic commit 4faadabc060a ("net: IP defrag: encapsulate rbtree defrag code into callable functions") brings back the "harmful" parts of upstream commit fa0f527358bd which is why this fix is needed. I agree that the call to inet_frag_reasm_finish() handles the reassignment for the IPv6 case. Acked-by: Tyler Hicks <tyhicks@canonical.com> Tyler > --- > net/ipv4/ip_fragment.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c > index 73ca84d880b9..098c6dcff305 100644 > --- a/net/ipv4/ip_fragment.c > +++ b/net/ipv4/ip_fragment.c > @@ -606,6 +606,7 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb, > nextp = &fp->next; > fp->prev = NULL; > memset(&fp->rbnode, 0, sizeof(fp->rbnode)); > + fp->sk = NULL; > head->data_len += fp->len; > head->len += fp->len; > if (head->ip_summed != fp->ip_summed) > -- > 2.20.1 > > > -- > kernel-team mailing list > kernel-team@lists.ubuntu.com > https://lists.ubuntu.com/mailman/listinfo/kernel-team
On Wed, Sep 04, 2019 at 02:40:49PM -0300, Thadeu Lima de Souza Cascardo wrote: > From: Taehee Yoo <ap420073@gmail.com> > > BugLink: https://bugs.launchpad.net/bugs/1842447 > > commit 5d407b071dc369c26a38398326ee2be53651cfe4 upstream > > A kernel crash occurrs when defragmented packet is fragmented > in ip_do_fragment(). > In defragment routine, skb_orphan() is called and > skb->ip_defrag_offset is set. but skb->sk and > skb->ip_defrag_offset are same union member. so that > frag->sk is not NULL. > Hence crash occurrs in skb->sk check routine in ip_do_fragment() when > defragmented packet is fragmented. > > test commands: > %iptables -t nat -I POSTROUTING -j MASQUERADE > %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000 > > splat looks like: > [ 261.069429] kernel BUG at net/ipv4/ip_output.c:636! > [ 261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI > [ 261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3 > [ 261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600 > [ 261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c > [ 261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202 > [ 261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004 > [ 261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8 > [ 261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395 > [ 261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4 > [ 261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000 > [ 261.174169] FS: 00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000 > [ 261.183012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0 > [ 261.198158] Call Trace: > [ 261.199018] ? dst_output+0x180/0x180 > [ 261.205011] ? save_trace+0x300/0x300 > [ 261.209018] ? ip_copy_metadata+0xb00/0xb00 > [ 261.213034] ? sched_clock_local+0xd4/0x140 > [ 261.218158] ? kill_l4proto+0x120/0x120 [nf_conntrack] > [ 261.223014] ? rt_cpu_seq_stop+0x10/0x10 > [ 261.227014] ? find_held_lock+0x39/0x1c0 > [ 261.233008] ip_finish_output+0x51d/0xb50 > [ 261.237006] ? ip_fragment.constprop.56+0x220/0x220 > [ 261.243011] ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack] > [ 261.250152] ? rcu_is_watching+0x77/0x120 > [ 261.255010] ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4] > [ 261.261033] ? nf_hook_slow+0xb1/0x160 > [ 261.265007] ip_output+0x1c7/0x710 > [ 261.269005] ? ip_mc_output+0x13f0/0x13f0 > [ 261.273002] ? __local_bh_enable_ip+0xe9/0x1b0 > [ 261.278152] ? ip_fragment.constprop.56+0x220/0x220 > [ 261.282996] ? nf_hook_slow+0xb1/0x160 > [ 261.287007] raw_sendmsg+0x21f9/0x4420 > [ 261.291008] ? dst_output+0x180/0x180 > [ 261.297003] ? sched_clock_cpu+0x126/0x170 > [ 261.301003] ? find_held_lock+0x39/0x1c0 > [ 261.306155] ? stop_critical_timings+0x420/0x420 > [ 261.311004] ? check_flags.part.36+0x450/0x450 > [ 261.315005] ? _raw_spin_unlock_irq+0x29/0x40 > [ 261.320995] ? _raw_spin_unlock_irq+0x29/0x40 > [ 261.326142] ? cyc2ns_read_end+0x10/0x10 > [ 261.330139] ? raw_bind+0x280/0x280 > [ 261.334138] ? sched_clock_cpu+0x126/0x170 > [ 261.338995] ? check_flags.part.36+0x450/0x450 > [ 261.342991] ? __lock_acquire+0x4500/0x4500 > [ 261.348994] ? inet_sendmsg+0x11c/0x500 > [ 261.352989] ? dst_output+0x180/0x180 > [ 261.357012] inet_sendmsg+0x11c/0x500 > [ ... ] > > v2: > - clear skb->sk at reassembly routine.(Eric Dumarzet) > > Fixes: fa0f527358bd ("ip: use rb trees for IP frag queue.") > Suggested-by: Eric Dumazet <edumazet@google.com> > Signed-off-by: Taehee Yoo <ap420073@gmail.com> > Reviewed-by: Eric Dumazet <edumazet@google.com> > Signed-off-by: David S. Miller <davem@davemloft.net> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > (backported from commit 08fb833b40e361ce927c64d40e348af96996d9eb) > > [cascardo: > The backport here misses the hunk at net/ipv6/netfilter/nf_conntrack_reasm.c. > This one has been changed by our commit (net: IP6 defrag: use rbtrees in > nf_conntrack_reasm.c), and calls inet_frag_reasm_finish, which resets sk to > NULL. > ] > > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Paolo Pisati <paolo.pisati@canonical.com>
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index 73ca84d880b9..098c6dcff305 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -606,6 +606,7 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb, nextp = &fp->next; fp->prev = NULL; memset(&fp->rbnode, 0, sizeof(fp->rbnode)); + fp->sk = NULL; head->data_len += fp->len; head->len += fp->len; if (head->ip_summed != fp->ip_summed)