diff mbox series

[bisected] UDP / xfrm: NAT-T packets with bad UDP checksum get dropped

Message ID 20190912151353.2w7jakdrqljkfbsq@intra2net.com
State RFC
Delegated to: David Miller
Headers show
Series [bisected] UDP / xfrm: NAT-T packets with bad UDP checksum get dropped | expand

Commit Message

Thomas Jarosch Sept. 12, 2019, 3:13 p.m. UTC
Hello together,

after updating many machines already from 3.14 to 4.19.67,
one site showed a non-working IPSec VPN connection with 4.19.x.

This IPSec connection is using UDP NAT traversal on port 4500.
The tunnel gets established fine, but no data flows.
Output of "ip xfrm state" looked sane.

The TRACE iptables firewall target showed the UDP packets
on port 4500 get accepted as expected. Still they didn't appear
in "ip xfrm monitor" and vanish somewhere in the kernel,
no error counter in /proc/net/xfrm_state increased at all.

After a few hours of bisecting with a test VM,
this commit was identified to cause the packet drop:

*******************
commit 0a80966b1043c3e2dc684140f155a3fded308660
Author: Tom Herbert <therbert@google.com>
Date:   Wed May 7 16:52:39 2014 -0700

    net: Verify UDP checksum before handoff to encap

    Moving validation of UDP checksum to be done in UDP not encap layer.

..
*******************

This commit is part of kernel 3.16. Reverting the commit
brings back the VPN connection using kernel 4.19.67.

I didn't spot the checksum error initially since wireshark and tcpdump
need to be explicitly told to verify checksums and there's a warning
about checksum offloading messing with it.

Further tracing showed the UDP packets leave the source
site with a zero UDP checksum and arrive, after passing
an unknown home router and a few other routers,
with a bogus UDP checksum on the destination side.

I've disabled rx and tx checksum offloading on the target test VM
and also on a router in between, but the packet dumps just
contain a fixed value as UDP checksum (f.e. 0x91c4).

RFC 3948 specifies how ESP packets should be encapsulated
using UDP for NAT traveral:
https://tools.ietf.org/html/rfc3948

*******************
2.1. UDP-Encapsulated ESP Header Format

the IPv4 UDP Checksum SHOULD be transmitted as a zero value, and
receivers MUST NOT depend on the UDP checksum being a zero value.


3.5.  Tunnel Mode ESP Decapsulation

1.  The UDP header is removed from the packet.
*******************

I'm wondering how the RFC should be interpreted here
regarding the UDP checksumming?

As far as I can tell it doesn't mention the UDP checksum
should be verified before decapsulation, the ESP packets
will provide proper data authentication anyway.

Cheers,
Thomas

Comments

Thomas Jarosch Sept. 16, 2019, 2:45 p.m. UTC | #1
> After a few hours of bisecting with a test VM,
> this commit was identified to cause the packet drop:
> 
> *******************
> commit 0a80966b1043c3e2dc684140f155a3fded308660
> Author: Tom Herbert <therbert@google.com>
> Date:   Wed May 7 16:52:39 2014 -0700
> 
>     net: Verify UDP checksum before handoff to encap
> 
>     Moving validation of UDP checksum to be done in UDP not encap layer.
> 
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index f2d05d7be743..54ea0a3a48f1 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1495,6 +1495,10 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
>                 if (skb->len > sizeof(struct udphdr) && encap_rcv != NULL) {
>                         int ret;
>  
> +                       /* Verify checksum before giving to encap */
> +                       if (udp_lib_checksum_complete(skb))
> +                               goto csum_error;
> +
>                         ret = encap_rcv(sk, skb);
>                         if (ret <= 0) {
>                                 UDP_INC_STATS_BH(sock_net(sk),
> ..
> *******************
> 
> This commit is part of kernel 3.16. Reverting the commit
> brings back the VPN connection using kernel 4.19.67.

..brings back the VPN connection when there's no iptables firewall involved ^^

As soon as netfilter is active, the packets get eaten again.
I've bisected this down to another commit, basically it's "broken" in 4.19.1
and works with with 4.19 vanilla. The commit in question:

*************************************
# git bisect good
4fb0dc97de1cd79399ab0c556f096d8db2bac278 is the first bad commit
commit 4fb0dc97de1cd79399ab0c556f096d8db2bac278
Author: Sean Tranchetti <stranche@codeaurora.org>
Date:   Tue Oct 23 16:04:31 2018 -0600

    net: udp: fix handling of CHECKSUM_COMPLETE packets

    [ Upstream commit db4f1be3ca9b0ef7330763d07bf4ace83ad6f913 ]

    Current handling of CHECKSUM_COMPLETE packets by the UDP stack is
    incorrect for any packet that has an incorrect checksum value.

    udp4/6_csum_init() will both make a call to
    __skb_checksum_validate_complete() to initialize/validate the csum
    field when receiving a CHECKSUM_COMPLETE packet. When this packet
    fails validation, skb->csum will be overwritten with the pseudoheader
    checksum so the packet can be fully validated by software, but the
    skb->ip_summed value will be left as CHECKSUM_COMPLETE so that way
    the stack can later warn the user about their hardware spewing bad
    checksums. Unfortunately, leaving the SKB in this state can cause
    problems later on in the checksum calculation.

    Since the the packet is still marked as CHECKSUM_COMPLETE,
    udp_csum_pull_header() will SUBTRACT the checksum of the UDP header
    from skb->csum instead of adding it, leaving us with a garbage value
    in that field. Once we try to copy the packet to userspace in the
    udp4/6_recvmsg(), we'll make a call to skb_copy_and_csum_datagram_msg()
    to checksum the packet data and add it in the garbage skb->csum value
    to perform our final validation check.

    Since the value we're validating is not the proper checksum, it's possible
    that the folded value could come out to 0, causing us not to drop the
    packet. Instead, we believe that the packet was checksummed incorrectly
    by hardware since skb->ip_summed is still CHECKSUM_COMPLETE, and we attempt
    to warn the user with netdev_rx_csum_fault(skb->dev);

    Unfortunately, since this is the UDP path, skb->dev has been overwritten
    by skb->dev_scratch and is no longer a valid pointer, so we end up
    reading invalid memory.

    This patch addresses this problem in two ways:
            1) Do not use the dev pointer when calling netdev_rx_csum_fault()
               from skb_copy_and_csum_datagram_msg(). Since this gets called
               from the UDP path where skb->dev has been overwritten, we have
               no way of knowing if the pointer is still valid. Also for the
               sake of consistency with the other uses of
               netdev_rx_csum_fault(), don't attempt to call it if the
               packet was checksummed by software.

            2) Add better CHECKSUM_COMPLETE handling to udp4/6_csum_init().
               If we receive a packet that's CHECKSUM_COMPLETE that fails
               verification (i.e. skb->csum_valid == 0), check who performed
               the calculation. It's possible that the checksum was done in
               software by the network stack earlier (such as Netfilter's
               CONNTRACK module), and if that says the checksum is bad,
               we can drop the packet immediately instead of waiting until
               we try and copy it to userspace. Otherwise, we need to
               mark the SKB as CHECKSUM_NONE, since the skb->csum field
               no longer contains the full packet checksum after the
               call to __skb_checksum_validate_complete().

    Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
    Fixes: c84d949057ca ("udp: copy skb->truesize in the first cache line")
    Cc: Sam Kumar <samanthakumar@google.com>
    Cc: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*************************************

Reverting this commit and the previously mentioned one
0a80966b1043c3e2dc684140f155a3fded308660
("net: Verify UDP checksum before handoff to encap")
brings back the VPN tunnel using 4.19.57 including the full iptables firewall.
(I had the firewall disabled during my earlier tests)

Given the patch description above, this commit is not something I want to revert.
I tried playing around with the CHECKSUM netfilter target, but it just seems
to fill in missing checksums, not overwrite invalid ones.

My next step is to figure out what brand and model exactly the
"unknown home router" is and if there's a firmware update available.

If it can be fixed by replacing / upgrading the home router, I will go this 
route and cross my fingers this will by a one time VPN tunnel incident
after the major kernel 3.14 -> 4.19 upgrade.

Cheers,
Thomas
diff mbox series

Patch

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index f2d05d7be743..54ea0a3a48f1 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1495,6 +1495,10 @@  int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
                if (skb->len > sizeof(struct udphdr) && encap_rcv != NULL) {
                        int ret;
 
+                       /* Verify checksum before giving to encap */
+                       if (udp_lib_checksum_complete(skb))
+                               goto csum_error;
+
                        ret = encap_rcv(sk, skb);
                        if (ret <= 0) {
                                UDP_INC_STATS_BH(sock_net(sk),