Message ID | KL1P15301MB028018F5C84C618BF7628045BF740@KL1P15301MB0280.APCP153.PROD.OUTLOOK.COM |
---|---|
State | Not Applicable |
Delegated to: | David Miller |
Headers | show |
Series | UDP data corruption in v4.4 | expand |
On Sat, Jul 25, 2020 at 02:21:06AM +0000, Dexuan Cui wrote: > Hi, > The v4.4 stable kernel (currently it's v4.4.231) lacks this bugfix: > 327868212381 ("make skb_copy_datagram_msg() et.al. preserve ->msg_iter on error") > , as a result, the v4.4 kernel can deliver corrupt data to the application > when a corrupt UDP packet is closely followed by a valid UDP packet: > when the same invocation of the recvmsg() syscall delivers the corrupt > packet's UDP payload to the application's receive buffer, it provides the > UDP payload length and the "from IP/Port" of the valid packet to the > application -- this mismatch makes the issue worse. > > Details: > > For a UDP packet longer than 76 bytes (see the v5.8-rc6 kernel's > include/linux/skbuff.h:3951), Linux delays the UDP checksum verification > until the application invokes the syscall recvmsg(). > > In the recvmsg() syscall handler, while Linux is copying the UDP payload > to the application's memory, it calculates the UDP checksum. If the > calculated checksum doesn't match the received checksum, Linux drops the > corrupt UDP packet, and then starts to process the next packet (if any), > and if the next packet is valid (i.e. the checksum is correct), Linux will > copy the valid UDP packet's payload to the application's receiver buffer. > > The bug is: before Linux starts to copy the valid UDP packet, the data > structure used to track how many more bytes should be copied to the > application memory is not reset to what it was when the application just > entered the kernel by the syscall! Consequently, only a small portion or > none of the valid packet's payload is copied to the application's receive > buffer, and later when the application exits from the kernel, actually > most of the application's receive buffer contains the payload of the > corrupt packet while recvmsg() returns the length of the UDP payload of > the valid packet. > > For the mainline kernel, the bug was fixed by Al Viro in the commit > 327868212381, but unluckily the bugfix is only backported to the > upstream v4.9+ kernels. I hope the bugfix can be backported to the > v4.4 stable kernel, since it's a "longterm" kernel and is still used by > some Linux distros. > > It turns out backporting 327868212381 to v4.4 means that some > Supporting patches must be backported first, so the overall changes > are pretty big... > > I made the below one-line workaround patch to force the recvmsg() syscall > handler to return to the userspace when Linux detects a corrupt UDP packet, > so the application will invoke the syscall again to receive the following valid > UDP packet (note: the patch may not work well with blocking sockets, for > which typically the application doesn't expect an error of -EAGAIN. I > guess it would be safer to return -EINTR instead?): > > --- a/net/ipv4/udp.c > +++ b/net/ipv4/udp.c > @@ -1367,6 +1367,7 @@ csum_copy_err: > /* starting over for a new packet, but check if we need to yield */ > cond_resched(); > msg->msg_flags &= ~MSG_TRUNC; > + return -EAGAIN; > goto try_again; > } > > > Eric Dumazet made an alternative that performs the csum validation earlier: > > --- a/net/ipv4/udp.c > +++ b/net/ipv4/udp.c > @@ -1589,8 +1589,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct > sk_buff *skb) > } > } > > - if (rcu_access_pointer(sk->sk_filter) && > - udp_lib_checksum_complete(skb)) > + if (udp_lib_checksum_complete(skb)) > goto csum_error; > > if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) { > > I personally like Eric's fix and IMHO we'd better have it in v4.4 rather than > trying to backport 327868212381. Does Eric's fix work with your testing? If so, great, can you turn it into something I can apply to the 4.4.y stable tree and send it to stable@vger.kernel.org? thanks, greg k-h
> From: Greg KH <greg@kroah.com> > Sent: Friday, July 24, 2020 10:59 PM > > [...] > > Eric Dumazet made an alternative that performs the csum validation earlier: > > > > --- a/net/ipv4/udp.c > > +++ b/net/ipv4/udp.c > > @@ -1589,8 +1589,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct > > sk_buff *skb) > > } > > } > > > > - if (rcu_access_pointer(sk->sk_filter) && > > - udp_lib_checksum_complete(skb)) > > + if (udp_lib_checksum_complete(skb)) > > goto csum_error; > > > > if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) { > > > > I personally like Eric's fix and IMHO we'd better have it in v4.4 rather than > > trying to backport 327868212381. > > Does Eric's fix work with your testing? Yes, it worked in my testing overnight. > If so, great, can you turn it > into something I can apply to the 4.4.y stable tree and send it to > stable@vger.kernel.org? > > greg k-h Will do shortly. Thanks, Dexuan
On Mon, Jul 27, 2020 at 11:38 AM Dexuan Cui <decui@microsoft.com> wrote: > > > From: Greg KH <greg@kroah.com> > > Sent: Friday, July 24, 2020 10:59 PM > > > [...] > > > Eric Dumazet made an alternative that performs the csum validation earlier: > > > > > > --- a/net/ipv4/udp.c > > > +++ b/net/ipv4/udp.c > > > @@ -1589,8 +1589,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct > > > sk_buff *skb) > > > } > > > } > > > > > > - if (rcu_access_pointer(sk->sk_filter) && > > > - udp_lib_checksum_complete(skb)) > > > + if (udp_lib_checksum_complete(skb)) > > > goto csum_error; > > > > > > if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) { > > > > > > I personally like Eric's fix and IMHO we'd better have it in v4.4 rather than > > > trying to backport 327868212381. > > > > Does Eric's fix work with your testing? > > Yes, it worked in my testing overnight. > > > If so, great, can you turn it > > into something I can apply to the 4.4.y stable tree and send it to > > stable@vger.kernel.org? > > > > greg k-h > > Will do shortly. > Just as a reminder, please also add the IPv6 part. diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index a8d74f44056a681ef9057c4c4abb34016120b44f..13713e0e5779b75de975faaeb4511bef40e097dc 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -661,8 +661,7 @@ static int udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb) } prefetch(&sk->sk_rmem_alloc); - if (rcu_access_pointer(sk->sk_filter) && - udp_lib_checksum_complete(skb)) + if (udp_lib_checksum_complete(skb)) goto csum_error; if (sk_filter_trim_cap(sk, skb, sizeof(struct udphdr)))
> From: Eric Dumazet <edumazet@google.com> > Sent: Monday, July 27, 2020 11:40 AM > To: Dexuan Cui <decui@microsoft.com> > > On Mon, Jul 27, 2020 at 11:38 AM Dexuan Cui <decui@microsoft.com> wrote: > > > > > From: Greg KH <greg@kroah.com> > > > Sent: Friday, July 24, 2020 10:59 PM > > > > [...] > > > > Eric Dumazet made an alternative that performs the csum validation > earlier: > > > > > > > > --- a/net/ipv4/udp.c > > > > +++ b/net/ipv4/udp.c > > > > @@ -1589,8 +1589,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct > > > > sk_buff *skb) > > > > } > > > > } > > > > > > > > - if (rcu_access_pointer(sk->sk_filter) && > > > > - udp_lib_checksum_complete(skb)) > > > > + if (udp_lib_checksum_complete(skb)) > > > > goto csum_error; > > > > > > > > if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) { > > > > > > > > I personally like Eric's fix and IMHO we'd better have it in v4.4 rather than > > > > trying to backport 327868212381. > > > > > > Does Eric's fix work with your testing? > > > > Yes, it worked in my testing overnight. > > > > > If so, great, can you turn it > > > into something I can apply to the 4.4.y stable tree and send it to > > > stable@vger.kernel.org? > > > > > > greg k-h > > > > Will do shortly. > > > > Just as a reminder, please also add the IPv6 part. > > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c > index > a8d74f44056a681ef9057c4c4abb34016120b44f..13713e0e5779b75de975faae > b4511bef40e097dc > 100644 > --- a/net/ipv6/udp.c > +++ b/net/ipv6/udp.c > @@ -661,8 +661,7 @@ static int udpv6_queue_rcv_one_skb(struct sock > *sk, struct sk_buff *skb) > } > > prefetch(&sk->sk_rmem_alloc); > - if (rcu_access_pointer(sk->sk_filter) && > - udp_lib_checksum_complete(skb)) > + if (udp_lib_checksum_complete(skb)) > goto csum_error; > > if (sk_filter_trim_cap(sk, skb, sizeof(struct udphdr))) Oh, yes! :-) Thank you! Eric, I'll add your Signed-off-by and mine. Please let me know in case this is not ok. I'll do a little more testing with the patch and I plan to post the patch to stable@vger.kernel.org and netdev@vger.kernel.org this afternoon, i.e. in 3~4 hours or so. Thanks, -- Dexuan
On 7/27/20 11:57 AM, Dexuan Cui wrote: >> From: Eric Dumazet <edumazet@google.com> > Oh, yes! :-) Thank you! > > Eric, I'll add your Signed-off-by and mine. Please let me know in case > this is not ok. Sure, do not worry about this. > > I'll do a little more testing with the patch and I plan to post the patch > to stable@vger.kernel.org and netdev@vger.kernel.org this afternoon, > i.e. in 3~4 hours or so. > > Thanks, > -- Dexuan >
--- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1367,6 +1367,7 @@ csum_copy_err: /* starting over for a new packet, but check if we need to yield */ cond_resched(); msg->msg_flags &= ~MSG_TRUNC; + return -EAGAIN; goto try_again; } Eric Dumazet made an alternative that performs the csum validation earlier: --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1589,8 +1589,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb) } } - if (rcu_access_pointer(sk->sk_filter) && - udp_lib_checksum_complete(skb)) + if (udp_lib_checksum_complete(skb)) goto csum_error;