From patchwork Wed May 16 20:29:06 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Toke_H=C3=B8iland-J=C3=B8rgensen?= X-Patchwork-Id: 914913 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=toke.dk Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; secure) header.d=toke.dk header.i=@toke.dk header.b="iWnbf0we"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 40mQyS0HmYz9s1B for ; Thu, 17 May 2018 06:29:20 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752030AbeEPU3R (ORCPT ); Wed, 16 May 2018 16:29:17 -0400 Received: from mail.toke.dk ([52.28.52.200]:37471 "EHLO mail.toke.dk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751892AbeEPU3P (ORCPT ); Wed, 16 May 2018 16:29:15 -0400 Subject: [PATCH net-next v12 4/7] sch_cake: Add NAT awareness to packet classifier DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=toke.dk; s=20161023; t=1526502546; bh=rVELI/RQ2O7YCWbeWygv140utM6SfmJ5R0SGht49PHY=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=iWnbf0weH65/BHzM72JOJFBR4YAJcnPzDmcKeSglQytmroS/rxUVNMkZW0Fy1kfkx LNFgO5G3O7Y0PAshUZRl0XXhlajWEC/jUCOqSVMO2g04UJIdXMMkU5HHV+10DYYjPI 6lk2oZ/i29hCVOg7zWQdtL4Dd0LlLuUS1zjgu12hJ5Hu4oC5ENKIzkfBu54OXFLVf8 bT1aV5l5S7lU4dczPmOef0pwtrAa48EpyKnPU1WSdPxB9puT/XusAcgnT/y+QFuO6X IxJycL96/hUErsz/n6RL/+zywAhzKxtF8xcYdYfldNAYHcRnT+g2hmKxtRUK5YmFqz CPjhkAR0rtctA== From: Toke =?utf-8?q?H=C3=B8iland-J=C3=B8rgensen?= To: netdev@vger.kernel.org Cc: cake@lists.bufferbloat.net Date: Wed, 16 May 2018 22:29:06 +0200 X-Clacks-Overhead: GNU Terry Pratchett Message-ID: <152650254618.25701.1794377356779114652.stgit@alrua-kau> In-Reply-To: <152650253056.25701.10138252969621361651.stgit@alrua-kau> References: <152650253056.25701.10138252969621361651.stgit@alrua-kau> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org When CAKE is deployed on a gateway that also performs NAT (which is a common deployment mode), the host fairness mechanism cannot distinguish internal hosts from each other, and so fails to work correctly. To fix this, we add an optional NAT awareness mode, which will query the kernel conntrack mechanism to obtain the pre-NAT addresses for each packet and use that in the flow and host hashing. When the shaper is enabled and the host is already performing NAT, the cost of this lookup is negligible. However, in unlimited mode with no NAT being performed, there is a significant CPU cost at higher bandwidths. For this reason, the feature is turned off by default. Signed-off-by: Toke Høiland-Jørgensen --- net/sched/sch_cake.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 73 insertions(+) diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c index 65439b643c92..e1038a7b6686 100644 --- a/net/sched/sch_cake.c +++ b/net/sched/sch_cake.c @@ -71,6 +71,12 @@ #include #include +#if IS_REACHABLE(CONFIG_NF_CONNTRACK) +#include +#include +#include +#endif + #define CAKE_SET_WAYS (8) #define CAKE_MAX_TINS (8) #define CAKE_QUEUES (1024) @@ -514,6 +520,60 @@ static bool cobalt_should_drop(struct cobalt_vars *vars, return drop; } +#if IS_REACHABLE(CONFIG_NF_CONNTRACK) + +static void cake_update_flowkeys(struct flow_keys *keys, + const struct sk_buff *skb) +{ + const struct nf_conntrack_tuple *tuple; + enum ip_conntrack_info ctinfo; + struct nf_conn *ct; + bool rev = false; + + if (tc_skb_protocol(skb) != htons(ETH_P_IP)) + return; + + ct = nf_ct_get(skb, &ctinfo); + if (ct) { + tuple = nf_ct_tuple(ct, CTINFO2DIR(ctinfo)); + } else { + const struct nf_conntrack_tuple_hash *hash; + struct nf_conntrack_tuple srctuple; + + if (!nf_ct_get_tuplepr(skb, skb_network_offset(skb), + NFPROTO_IPV4, dev_net(skb->dev), + &srctuple)) + return; + + hash = nf_conntrack_find_get(dev_net(skb->dev), + &nf_ct_zone_dflt, + &srctuple); + if (!hash) + return; + + rev = true; + ct = nf_ct_tuplehash_to_ctrack(hash); + tuple = nf_ct_tuple(ct, !hash->tuple.dst.dir); + } + + keys->addrs.v4addrs.src = rev ? tuple->dst.u3.ip : tuple->src.u3.ip; + keys->addrs.v4addrs.dst = rev ? tuple->src.u3.ip : tuple->dst.u3.ip; + + if (keys->ports.ports) { + keys->ports.src = rev ? tuple->dst.u.all : tuple->src.u.all; + keys->ports.dst = rev ? tuple->src.u.all : tuple->dst.u.all; + } + if (rev) + nf_ct_put(ct); +} +#else +static void cake_update_flowkeys(struct flow_keys *keys, + const struct sk_buff *skb) +{ + /* There is nothing we can do here without CONNTRACK */ +} +#endif + /* Cake has several subtle multiple bit settings. In these cases you * would be matching triple isolate mode as well. */ @@ -541,6 +601,9 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb, skb_flow_dissect_flow_keys(skb, &keys, FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL); + if (flow_mode & CAKE_FLOW_NAT_FLAG) + cake_update_flowkeys(&keys, skb); + /* flow_hash_from_keys() sorts the addresses by value, so we have * to preserve their order in a separate data structure to treat * src and dst host addresses as independently selectable. @@ -1727,6 +1790,12 @@ static int cake_change(struct Qdisc *sch, struct nlattr *opt, q->flow_mode = (nla_get_u32(tb[TCA_CAKE_FLOW_MODE]) & CAKE_FLOW_MASK); + if (tb[TCA_CAKE_NAT]) { + q->flow_mode &= ~CAKE_FLOW_NAT_FLAG; + q->flow_mode |= CAKE_FLOW_NAT_FLAG * + !!nla_get_u32(tb[TCA_CAKE_NAT]); + } + if (tb[TCA_CAKE_RTT]) { q->interval = nla_get_u32(tb[TCA_CAKE_RTT]); @@ -1892,6 +1961,10 @@ static int cake_dump(struct Qdisc *sch, struct sk_buff *skb) if (nla_put_u32(skb, TCA_CAKE_ACK_FILTER, q->ack_filter)) goto nla_put_failure; + if (nla_put_u32(skb, TCA_CAKE_NAT, + !!(q->flow_mode & CAKE_FLOW_NAT_FLAG))) + goto nla_put_failure; + return nla_nest_end(skb, opts); nla_put_failure: