From patchwork Wed Aug 23 07:58:29 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 804872 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3xcfv84mj3z9s9Y for ; Wed, 23 Aug 2017 17:58:44 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753542AbdHWH6l (ORCPT ); Wed, 23 Aug 2017 03:58:41 -0400 Received: from mail-wm0-f53.google.com ([74.125.82.53]:37071 "EHLO mail-wm0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753525AbdHWH6i (ORCPT ); Wed, 23 Aug 2017 03:58:38 -0400 Received: by mail-wm0-f53.google.com with SMTP id b189so9767770wmd.0 for ; Wed, 23 Aug 2017 00:58:38 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=SXt1ogDb91uFJ082i0W27OmjKk6A5jddfSMwmqKbwsw=; b=SWF78Sho+x23vkCPoRQVUiysiNjuwBUcqJlxIq+jMxiDC2pf+jjs2TWnLq6Oksmryi QVe9IhjvCiCiGWdHcyvZiFiSt4/MtGcXEmhR2LOO7JOB6VIZS554Xs75nPl6lMTq2wg5 cUvyZwCGYGMllOS7Aji7HGxFK/IipXPrXYrOUFcEBPqIuVhu2BwzrIKa0pPmhuziIdf2 dJel4Yza+0lmNx+IgOoIsXBIadsV4owybk2OKfQ1V7rDhQWYOs95bowhcT9wKAoUGdol 7cQylNQpkewIHuOk5W2sS4Btq0+KwGbIVEdSvF+QmaRg1ByfqsQE9anmpHYsqDBJ7/+R dZGA== X-Gm-Message-State: AHYfb5idDjwLBYYPDViSqiBQAVo6by9YbySrqw8N8bGWFbQzmm4vPpTh 9DdLyrjmK/H73j9jS93JXg== X-Received: by 10.28.191.68 with SMTP id p65mr1146215wmf.58.1503475117280; Wed, 23 Aug 2017 00:58:37 -0700 (PDT) Received: from redhat.com (red-hat-inc.vlan404.asr1.mad1.gblx.net. [64.215.113.190]) by smtp.gmail.com with ESMTPSA id 80sm1002561wml.23.2017.08.23.00.58.36 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 23 Aug 2017 00:58:36 -0700 (PDT) From: Jakub Sitnicki To: netdev@vger.kernel.org Cc: "David S. Miller" , Hannes Frederic Sowa , Nikolay Aleksandrov , Tom Herbert Subject: [PATCH net-next v3 2/4] ipv6: Compute multipath hash for ICMP errors from offending packet Date: Wed, 23 Aug 2017 09:58:29 +0200 Message-Id: <20170823075831.27031-3-jkbs@redhat.com> X-Mailer: git-send-email 2.9.4 In-Reply-To: <20170823075831.27031-1-jkbs@redhat.com> References: <20170823075831.27031-1-jkbs@redhat.com> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org When forwarding or sending out an ICMPv6 error, look at the embedded packet that triggered the error and compute a flow hash over its headers. This let's us route the ICMP error together with the flow it belongs to when multipath (ECMP) routing is in use, which in turn makes Path MTU Discovery work in ECMP load-balanced or anycast setups (RFC 7690). Granted, end-hosts behind the ECMP router (aka servers) need to reflect the IPv6 Flow Label for PMTUD to work. The code is organized to be in parallel with ipv4 stack: ip_multipath_l3_keys -> ip6_multipath_l3_keys fib_multipath_hash -> rt6_multipath_hash Signed-off-by: Jakub Sitnicki --- include/net/ip6_route.h | 1 + net/ipv6/icmp.c | 1 + net/ipv6/route.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 52 insertions(+) diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index 907d39a..882bc3c 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -115,6 +115,7 @@ static inline int ip6_route_get_saddr(struct net *net, struct rt6_info *rt, struct rt6_info *rt6_lookup(struct net *net, const struct in6_addr *daddr, const struct in6_addr *saddr, int oif, int flags); +u32 rt6_multipath_hash(const struct flowi6 *fl6, const struct sk_buff *skb); struct dst_entry *icmp6_dst_alloc(struct net_device *dev, struct flowi6 *fl6); diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c index 4f82830..dd7608c 100644 --- a/net/ipv6/icmp.c +++ b/net/ipv6/icmp.c @@ -519,6 +519,7 @@ static void icmp6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info, fl6.fl6_icmp_type = type; fl6.fl6_icmp_code = code; fl6.flowi6_uid = sock_net_uid(net, NULL); + fl6.mp_hash = rt6_multipath_hash(&fl6, skb); security_skb_classify_flow(skb, flowi6_to_flowi(&fl6)); sk = icmpv6_xmit_lock(net); diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 9b02064..6c4dd57 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1214,6 +1214,54 @@ struct dst_entry *ip6_route_input_lookup(struct net *net, } EXPORT_SYMBOL_GPL(ip6_route_input_lookup); +static void ip6_multipath_l3_keys(const struct sk_buff *skb, + struct flow_keys *keys) +{ + const struct ipv6hdr *outer_iph = ipv6_hdr(skb); + const struct ipv6hdr *key_iph = outer_iph; + const struct ipv6hdr *inner_iph; + const struct icmp6hdr *icmph; + struct ipv6hdr _inner_iph; + + if (likely(outer_iph->nexthdr != IPPROTO_ICMPV6)) + goto out; + + icmph = icmp6_hdr(skb); + if (icmph->icmp6_type != ICMPV6_DEST_UNREACH && + icmph->icmp6_type != ICMPV6_PKT_TOOBIG && + icmph->icmp6_type != ICMPV6_TIME_EXCEED && + icmph->icmp6_type != ICMPV6_PARAMPROB) + goto out; + + inner_iph = skb_header_pointer(skb, + skb_transport_offset(skb) + sizeof(*icmph), + sizeof(_inner_iph), &_inner_iph); + if (!inner_iph) + goto out; + + key_iph = inner_iph; +out: + memset(keys, 0, sizeof(*keys)); + keys->control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS; + keys->addrs.v6addrs.src = key_iph->saddr; + keys->addrs.v6addrs.dst = key_iph->daddr; + keys->tags.flow_label = ip6_flowinfo(key_iph); + keys->basic.ip_proto = key_iph->nexthdr; +} + +/* if skb is set it will be used and fl6 can be NULL */ +u32 rt6_multipath_hash(const struct flowi6 *fl6, const struct sk_buff *skb) +{ + struct flow_keys hash_keys; + + if (skb) { + ip6_multipath_l3_keys(skb, &hash_keys); + return flow_hash_from_keys(&hash_keys); + } + + return get_hash_from_flowi6(fl6); +} + void ip6_route_input(struct sk_buff *skb) { const struct ipv6hdr *iph = ipv6_hdr(skb); @@ -1232,6 +1280,8 @@ void ip6_route_input(struct sk_buff *skb) tun_info = skb_tunnel_info(skb); if (tun_info && !(tun_info->mode & IP_TUNNEL_INFO_TX)) fl6.flowi6_tun_key.tun_id = tun_info->key.tun_id; + if (unlikely(fl6.flowi6_proto == IPPROTO_ICMPV6)) + fl6.mp_hash = rt6_multipath_hash(&fl6, skb); skb_dst_drop(skb); skb_dst_set(skb, ip6_route_input_lookup(net, skb->dev, &fl6, flags)); }