Message ID | 1443223510-1810-1-git-send-email-pshelar@nicira.com |
---|---|
State | Accepted |

On Fri, Sep 25, 2015 at 4:25 PM, Pravin B Shelar <pshelar@nicira.com> wrote:
> Upstream commit:
>
> VXLAN device can receive skb with checksum partial. But the checksum
> offset could be in outer header which is pulled on receive. This results
> in negative checksum offset for the skb. Such skb can cause the assert
> failure in skb_checksum_help(). Following patch fixes the bug by setting
> checksum-none while pulling outer header.
>
> Following is the kernel panic msg from old kernel hitting the bug.
>
> ------------[ cut here ]------------
> kernel BUG at net/core/dev.c:1906!
> RIP: 0010:[<ffffffff81518034>] skb_checksum_help+0x144/0x150
> Call Trace:
> <IRQ>
> [<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch]
> [<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch]
> [<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch]
> [<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch]
> [<ffffffffa016c51a>] ovs_vport_receive+0x2a/0x30 [openvswitch]
> [<ffffffffa0171383>] vxlan_rcv+0x53/0x60 [openvswitch]
> [<ffffffffa01734cb>] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch]
> [<ffffffff8157addc>] udp_queue_rcv_skb+0x2dc/0x3b0
> [<ffffffff8157b56f>] __udp4_lib_rcv+0x1cf/0x6c0
> [<ffffffff8157ba7a>] udp_rcv+0x1a/0x20
> [<ffffffff8154fdbd>] ip_local_deliver_finish+0xdd/0x280
> [<ffffffff81550128>] ip_local_deliver+0x88/0x90
> [<ffffffff8154fa7d>] ip_rcv_finish+0x10d/0x370
> [<ffffffff81550365>] ip_rcv+0x235/0x300
> [<ffffffff8151ba1d>] __netif_receive_skb+0x55d/0x620
> [<ffffffff8151c360>] netif_receive_skb+0x80/0x90
> [<ffffffff81459935>] virtnet_poll+0x555/0x6f0
> [<ffffffff8151cd04>] net_rx_action+0x134/0x290
> [<ffffffff810683d8>] __do_softirq+0xa8/0x210
> [<ffffffff8162fe6c>] call_softirq+0x1c/0x30
> [<ffffffff810161a5>] do_softirq+0x65/0xa0
> [<ffffffff810687be>] irq_exit+0x8e/0xb0
> [<ffffffff81630733>] do_IRQ+0x63/0xe0
> [<ffffffff81625f2e>] common_interrupt+0x6e/0x6e
>
> Reported-by: Anupam Chanda <achanda@vmware.com>
> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
> Acked-by: Tom Herbert <tom@herbertland.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
>
> Upstream: 6ae459bdaae ("skbuff: Fix skb checksum flag on skb pull")
> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>

Acked-by: Jesse Gross <jesse@nicira.com>

That being said, I believe that we will still run into this bug on
kernels where we use upstream tunnel implementations (but before 4.3,
of course).

On Fri, Sep 25, 2015 at 6:05 PM, Jesse Gross <jesse@nicira.com> wrote:
> On Fri, Sep 25, 2015 at 4:25 PM, Pravin B Shelar <pshelar@nicira.com> wrote:
>> Upstream commit:
>>
>> VXLAN device can receive skb with checksum partial. But the checksum
>> offset could be in outer header which is pulled on receive. This results
>> in negative checksum offset for the skb. Such skb can cause the assert
>> failure in skb_checksum_help(). Following patch fixes the bug by setting
>> checksum-none while pulling outer header.
>>
>> Following is the kernel panic msg from old kernel hitting the bug.
>>
>> ------------[ cut here ]------------
>> kernel BUG at net/core/dev.c:1906!
>> RIP: 0010:[<ffffffff81518034>] skb_checksum_help+0x144/0x150
>> Call Trace:
>> <IRQ>
>> [<ffffffffa0164c28>] queue_userspace_packet+0x408/0x470 [openvswitch]
>> [<ffffffffa016614d>] ovs_dp_upcall+0x5d/0x60 [openvswitch]
>> [<ffffffffa0166236>] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch]
>> [<ffffffffa016629b>] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch]
>> [<ffffffffa016c51a>] ovs_vport_receive+0x2a/0x30 [openvswitch]
>> [<ffffffffa0171383>] vxlan_rcv+0x53/0x60 [openvswitch]
>> [<ffffffffa01734cb>] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch]
>> [<ffffffff8157addc>] udp_queue_rcv_skb+0x2dc/0x3b0
>> [<ffffffff8157b56f>] __udp4_lib_rcv+0x1cf/0x6c0
>> [<ffffffff8157ba7a>] udp_rcv+0x1a/0x20
>> [<ffffffff8154fdbd>] ip_local_deliver_finish+0xdd/0x280
>> [<ffffffff81550128>] ip_local_deliver+0x88/0x90
>> [<ffffffff8154fa7d>] ip_rcv_finish+0x10d/0x370
>> [<ffffffff81550365>] ip_rcv+0x235/0x300
>> [<ffffffff8151ba1d>] __netif_receive_skb+0x55d/0x620
>> [<ffffffff8151c360>] netif_receive_skb+0x80/0x90
>> [<ffffffff81459935>] virtnet_poll+0x555/0x6f0
>> [<ffffffff8151cd04>] net_rx_action+0x134/0x290
>> [<ffffffff810683d8>] __do_softirq+0xa8/0x210
>> [<ffffffff8162fe6c>] call_softirq+0x1c/0x30
>> [<ffffffff810161a5>] do_softirq+0x65/0xa0
>> [<ffffffff810687be>] irq_exit+0x8e/0xb0
>> [<ffffffff81630733>] do_IRQ+0x63/0xe0
>> [<ffffffff81625f2e>] common_interrupt+0x6e/0x6e
>>
>> Reported-by: Anupam Chanda <achanda@vmware.com>
>> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
>> Acked-by: Tom Herbert <tom@herbertland.com>
>> Signed-off-by: David S. Miller <davem@davemloft.net>
>>
>> Upstream: 6ae459bdaae ("skbuff: Fix skb checksum flag on skb pull")
>> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
>
> Acked-by: Jesse Gross <jesse@nicira.com>
>

Thanks. I pushed it to master and branch-2.3 and branch-2.4.

> That being said, I believe that we will still run into this bug on
> kernels where we use upstream tunnel implementations (but before 4.3,
> of course).

ok, I have asked for upstream patch to be queued for stable.

diff --git a/datapath/linux/compat/include/linux/skbuff.h b/datapath/linux/compat/include/linux/skbuff.h
index 1a576a0..23b13b8 100644
--- a/datapath/linux/compat/include/linux/skbuff.h
+++ b/datapath/linux/compat/include/linux/skbuff.h
@@ -372,4 +372,28 @@ int rpl_skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 vlan_tci);
 void rpl_kfree_skb_list(struct sk_buff *segs);
 #define kfree_skb_list rpl_kfree_skb_list
 #endif
+
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4,3,0)
+#define skb_postpull_rcsum rpl_skb_postpull_rcsum
+static inline void skb_postpull_rcsum(struct sk_buff *skb,
+				      const void *start, unsigned int len)
+{
+	if (skb->ip_summed == CHECKSUM_COMPLETE)
+		skb->csum = csum_sub(skb->csum, csum_partial(start, len, 0));
+	else if (skb->ip_summed == CHECKSUM_PARTIAL &&
+		 skb_checksum_start_offset(skb) <= len)
+		skb->ip_summed = CHECKSUM_NONE;
+}
+
+#define skb_pull_rcsum rpl_skb_pull_rcsum
+static inline unsigned char *skb_pull_rcsum(struct sk_buff *skb, unsigned int len)
+{
+	BUG_ON(len > skb->len);
+	skb->len -= len;
+	BUG_ON(skb->len < skb->data_len);
+	skb_postpull_rcsum(skb, skb->data, len);
+	return skb->data += len;
+}
+
+#endif
 #endif
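
For readers who want to see the offset arithmetic behind the commit message, the following is a minimal userspace sketch, not kernel code. It assumes a toy structure (`toy_skb`) that tracks only the data offset, the checksum start, the length, and the checksum state; all `toy_*` names and the byte offsets used below are hypothetical and chosen for illustration. The check in `toy_pull_rcsum()` is intended to model the `CHECKSUM_PARTIAL` branch added by the patch.

```c
#include <assert.h>

/* Hypothetical stand-ins for CHECKSUM_NONE / CHECKSUM_PARTIAL. */
enum toy_csum { TOY_CSUM_NONE, TOY_CSUM_PARTIAL };

/* Toy model of the sk_buff fields involved; not the kernel structure. */
struct toy_skb {
    int data_off;            /* offset of data from the buffer head */
    int csum_start;          /* offset of checksum start from the buffer head */
    unsigned int len;        /* bytes remaining from data onwards */
    enum toy_csum ip_summed; /* checksum state */
};

/* Models skb_checksum_start_offset(): checksum start relative to data. */
static int toy_csum_start_offset(const struct toy_skb *skb)
{
    return skb->csum_start - skb->data_off;
}

/*
 * Pull len bytes of header. The checksum test runs before data is
 * advanced, as in the patch: if the checksum start lies within the
 * bytes being pulled, the partial-checksum state is dropped instead
 * of being left with a negative offset.
 */
static void toy_pull_rcsum(struct toy_skb *skb, unsigned int len)
{
    int off = toy_csum_start_offset(skb);

    assert(len <= skb->len); /* stands in for BUG_ON(len > skb->len) */
    skb->len -= len;
    if (skb->ip_summed == TOY_CSUM_PARTIAL &&
        off >= 0 && (unsigned int)off <= len)
        skb->ip_summed = TOY_CSUM_NONE;
    skb->data_off += (int)len;
}
```

With `data_off = 34` and `csum_start = 34` (checksum start in the outer header), pulling 16 bytes would leave an offset of -16 if the state stayed partial; the sketch switches the state to `TOY_CSUM_NONE` instead. With `csum_start = 84` (checksum start in the inner packet), the same pull keeps `TOY_CSUM_PARTIAL` and the offset becomes 34.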