From patchwork Thu Jun 17 09:09:08 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shan Wei
X-Patchwork-Id: 55993
X-Patchwork-Delegate: davem@davemloft.net
Message-ID: <4C19E634.3030703@cn.fujitsu.com>
Date: Thu, 17 Jun 2010 17:09:08 +0800
From: Shan Wei
User-Agent: Thunderbird 2.0.0.23 (X11/20090817)
To: David Miller, "Ronciak, John", "netdev@vger.kernel.org"
CC: Eric Dumazet, Shan Wei
Subject: [RFC][BUG-FIX] the problem of checksum checking in UDP protocol
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

*Description of Problem*
When a UDP packet is received whose length field in the UDP header is
smaller than the actual length of the UDP segment (the 8-byte UDP header
plus payload), and whose checksum field was calculated over the whole
segment, some NICs that support hardware checksum offload verify the
checksum successfully and set ip_summed to CHECKSUM_UNNECESSARY.
But if we turn off rx-checksumming offload, the UDP protocol's own
checksum check fails for the same packet.

*Steps to Reproduce*
Download the netwib and netwox tools and install them on the M1 node only.
Then execute the steps below on the M1 node.

            M1                                     M2
+---------------------------+          +---------------------------+
|                      eth1 |<-------->| eth0                      |
| fe80::225:86ff:fe9d:3efa  |          | fe80::215:17ff:fe71:51f4  |
+---------------------------+          +---------------------------+

1. netwox 149 -i fe80::215:17ff:fe71:51f4 -d eth1 -E 0:0:0:0:1:0 -e 0:15:17:71:51:f4 -I fe80::200:ff:fe00:100 -c 1

   This step creates a neighbor cache entry for the spoofed source
   address fe80::200:ff:fe00:100.

2. netwox 141 -d eth1 -a 0:0:0:0:1:0 -b 0:15:17:71:51:f4 -f 17 -g 64 -h fe80::200:ff:fe00:100 -i fe80::215:17ff:fe71:51f4 \
   -o 3333 -p 7 -q 000000000000000000000000000000000000000000000000 -r 34525 -e 32 -s 16 -t 35126

   This step constructs a UDPv6 packet whose length field (16 bytes) is
   smaller than the total UDP segment length (32 bytes).
   netwox displays the packet in readable form as follows:

Ethernet________________________________________________________.
| 00:00:00:00:01:00->00:15:17:71:51:F4 type:0x86DD              |
|_______________________________________________________________|
IP______________________________________________________________.
|version| traffic class |              flow label               |
|___6___|_______0_______|___________________0___________________|
|        payload length         |  next header  |   hop limit   |
|___________0x0020=32___________|____0x11=17____|______64_______|
|                            source                             |
|_____________________fe80::200:ff:fe00:100_____________________|
|                          destination                          |
|___________________fe80::215:17ff:fe71:51f4____________________|
UDP_____________________________________________________________.
|          source port          |       destination port        |
|__________0x0D05=3333__________|___________0x0007=7____________|
|            length             |           checksum            |
|___________0x0010=16___________|_________0x8936=35126__________|
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 # ................
00 00 00 00 00 00 00 00 # ........

*Actual Results*
On the M2 node, use ethtool to check the rx_csum_offload counters.
# ethtool -S eth0 | grep csum
rx_csum_offload_good: 1
rx_csum_offload_errors: 0

# cat /proc/net/snmp6 | grep Udp6
Udp6InDatagrams 1
Udp6InErrors 0

*Expected Results*
# ethtool -S eth0 | grep csum
rx_csum_offload_good: 0
rx_csum_offload_errors: 1

# cat /proc/net/snmp6 | grep Udp6
Udp6InDatagrams 0
Udp6InErrors 1

*The Reason*
UDPv6 handles a received packet like this:
1. Check the length of the data.
   If the length field in the UDPv6 header is greater than skb->len (the
   actual data length including the UDP header), the packet is dropped.
   If the length field is smaller than skb->len, the data is trimmed to
   the length given in the header.

2. UDPv6 then calculates the checksum over the 40-byte IPv6 pseudo-header,
   the 8-byte UDPv6 header and 8 bytes of payload data.

Note that the checksum (35126) in the UDPv6 header covers 16 bytes of
redundant data. The NIC verifies the checksum over all of the data,
including the redundant bytes, so the checksum calculated by the hardware
differs from the one calculated by UDP.

*The Solution*
We reported the problem to the Intel e1000e developers. John Ronciak
replied that the e1000e driver code is OK.
For the discussion, see http://comments.gmane.org/gmane.linux.drivers.e1000.devel/7077

In this case, the UDP protocol should not trust the CHECKSUM_UNNECESSARY
flag set by the driver. When UDP receives this kind of packet and the NIC
hardware has reported the checksum as good, we reset ip_summed to
CHECKSUM_NONE so that the UDP protocol verifies the checksum again.

(This patch is not complete; it only illustrates the idea.)
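
As a cross-check of the numbers in the reproduction steps, here is a
minimal userspace sketch (not kernel code) of the UDPv6 checksum over the
test packet. It assumes the hardware covers all 32 bytes and uses the IPv6
payload length (32) in the pseudo-header, while the UDP stack covers only
the 16 bytes named in the UDP length field and uses 16 in the
pseudo-header. The function names are made up for this illustration, and
what a particular NIC really does may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add len bytes to a running sum as big-endian 16-bit words. */
static uint32_t sum_words(const uint8_t *p, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (len & 1)
		sum += (uint32_t)p[len - 1] << 8;	/* pad odd byte with zero */
	return sum;
}

/* Fold the carries back in (one's-complement addition). */
static uint16_t fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Addresses from the netwox command line in step 2. */
static const uint8_t src[16] = {	/* fe80::200:ff:fe00:100 */
	0xfe, 0x80, 0, 0, 0, 0, 0, 0, 0x02, 0x00, 0x00, 0xff, 0xfe, 0x00, 0x01, 0x00
};
static const uint8_t dst[16] = {	/* fe80::215:17ff:fe71:51f4 */
	0xfe, 0x80, 0, 0, 0, 0, 0, 0, 0x02, 0x15, 0x17, 0xff, 0xfe, 0x71, 0x51, 0xf4
};

/*
 * Checksum that a verifier covering cover_len bytes of the 32-byte UDP
 * segment expects to find in the header. Assumption: the pseudo-header
 * length equals cover_len.
 */
uint16_t udp6_expected_csum(size_t cover_len)
{
	/* sport 3333, dport 7, length 16, checksum 0, then 24 zero bytes */
	uint8_t seg[32] = { 0x0d, 0x05, 0x00, 0x07, 0x00, 0x10, 0x00, 0x00 };
	uint32_t sum = 0;

	assert(cover_len >= 8 && cover_len <= sizeof(seg));
	sum = sum_words(src, sizeof(src), sum);
	sum = sum_words(dst, sizeof(dst), sum);
	sum += (uint32_t)cover_len;	/* pseudo-header upper-layer length */
	sum += 17;			/* pseudo-header next header = UDP */
	sum = sum_words(seg, cover_len, sum);
	return (uint16_t)~fold(sum);
}

/* Nonzero if the checksum found on the wire passes for this coverage. */
int udp6_csum_ok(size_t cover_len, uint16_t wire_csum)
{
	return udp6_expected_csum(cover_len) == wire_csum;
}
```

With these inputs, udp6_expected_csum(32) gives 0x8936 (35126), the value
carried in the packet, and udp6_expected_csum(16) gives 0x8946. So the
packet passes a check over the full 32 bytes and fails a check over the
trimmed 16 bytes, which matches the behaviour described above.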
---
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 1dd1aff..47f7e86 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -723,6 +723,10 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
 		if (ulen < skb->len) {
 			if (pskb_trim_rcsum(skb, ulen))
 				goto short_packet;
+
+			if (skb_csum_unnecessary(skb))
+				skb->ip_summed = CHECKSUM_NONE;
+
 			saddr = &ipv6_hdr(skb)->saddr;
 			daddr = &ipv6_hdr(skb)->daddr;
 			uh = udp_hdr(skb);