From patchwork Mon May 3 19:39:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Jurgens X-Patchwork-Id: 1473333 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4FYtcJ0Lhzz9s1l; Tue, 4 May 2021 05:40:31 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1ldeQd-0002CW-31; Mon, 03 May 2021 19:40:27 +0000 Received: from mail-il-dmz.mellanox.com ([193.47.165.129] helo=mellanox.co.il) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1ldePy-0001er-BH for kernel-team@lists.ubuntu.com; Mon, 03 May 2021 19:39:46 +0000 Received: from Internal Mail-Server by MTLPINE1 (envelope-from danielj@nvidia.com) with SMTP; 3 May 2021 22:39:43 +0300 Received: from sw-mtx-hparm-003.mtx.labs.mlnx. (sw-mtx-hparm-003.mtx.labs.mlnx [10.9.151.78]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 143JdHlu026551; Mon, 3 May 2021 22:39:42 +0300 From: Daniel Jurgens To: kernel-team@lists.ubuntu.com Subject: [SRU][F:linux-bluefield][PATCH 27/32] net/bonding: Implement ndo_sk_get_lower_dev Date: Mon, 3 May 2021 22:39:12 +0300 Message-Id: <1620070757-51528-28-git-send-email-danielj@nvidia.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1620070757-51528-1-git-send-email-danielj@nvidia.com> References: <1620070757-51528-1-git-send-email-danielj@nvidia.com> X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: vlad@nvidia.com MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Tariq Toukan BugLink: https://bugs.launchpad.net/bugs/1926994 Add ndo_sk_get_lower_dev() implementation for bond interfaces. Support only for the cases where the socket's and SKBs' hash yields identical value for the whole connection lifetime. Here we restrict it to L3+4 sockets only, with xmit_hash_policy==LAYER34 and bond modes xor/802.3ad. Signed-off-by: Tariq Toukan Reviewed-by: Boris Pismenny Signed-off-by: Jakub Kicinski (cherry picked from commit 007feb87fb15933b5de7135e6bdf57c219b3fbec) Signed-off-by: Daniel Jurgens --- drivers/net/bonding/bond_main.c | 93 +++++++++++++++++++++++++++++++++++++++++ include/net/bonding.h | 2 + 2 files changed, 95 insertions(+) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 02b4fe7..0442b67 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -254,6 +254,19 @@ netdev_tx_t bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, return dev_queue_xmit(skb); } +bool bond_sk_check(struct bonding *bond) +{ + switch (BOND_MODE(bond)) { + case BOND_MODE_8023AD: + case BOND_MODE_XOR: + if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER34) + return true; + fallthrough; + default: + return false; + } +} + /*---------------------------------- VLAN -----------------------------------*/ /* In the following 2 functions, bond_vlan_rx_add_vid and bond_vlan_rx_kill_vid, @@ -4459,6 +4472,85 @@ static struct net_device *bond_xmit_get_slave(struct net_device *master_dev, return NULL; } +static void bond_sk_to_flow(struct sock *sk, struct flow_keys *flow) +{ + switch (sk->sk_family) { +#if IS_ENABLED(CONFIG_IPV6) + case AF_INET6: + if (sk->sk_ipv6only || + ipv6_addr_type(&sk->sk_v6_daddr) != IPV6_ADDR_MAPPED) { + flow->control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS; + flow->addrs.v6addrs.src = inet6_sk(sk)->saddr; + flow->addrs.v6addrs.dst = sk->sk_v6_daddr; + break; + } + fallthrough; +#endif + default: /* AF_INET */ + flow->control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS; + flow->addrs.v4addrs.src = inet_sk(sk)->inet_rcv_saddr; + flow->addrs.v4addrs.dst = inet_sk(sk)->inet_daddr; + break; + } + + flow->ports.src = inet_sk(sk)->inet_sport; + flow->ports.dst = inet_sk(sk)->inet_dport; +} + +/** + * bond_sk_hash_l34 - generate a hash value based on the socket's L3 and L4 fields + * @sk: socket to use for headers + * + * This function will extract the necessary field from the socket and use + * them to generate a hash based on the LAYER34 xmit_policy. + * Assumes that sk is a TCP or UDP socket. + */ +static u32 bond_sk_hash_l34(struct sock *sk) +{ + struct flow_keys flow; + u32 hash; + + bond_sk_to_flow(sk, &flow); + + /* L4 */ + memcpy(&hash, &flow.ports.ports, sizeof(hash)); + /* L3 */ + return bond_ip_hash(hash, &flow); +} + +static struct net_device *__bond_sk_get_lower_dev(struct bonding *bond, + struct sock *sk) +{ + struct bond_up_slave *slaves; + struct slave *slave; + unsigned int count; + u32 hash; + + slaves = rcu_dereference(bond->usable_slaves); + count = slaves ? READ_ONCE(slaves->count) : 0; + if (unlikely(!count)) + return NULL; + + hash = bond_sk_hash_l34(sk); + slave = slaves->arr[hash % count]; + + return slave->dev; +} + +static struct net_device *bond_sk_get_lower_dev(struct net_device *dev, + struct sock *sk) +{ + struct bonding *bond = netdev_priv(dev); + struct net_device *lower = NULL; + + rcu_read_lock(); + if (bond_sk_check(bond)) + lower = __bond_sk_get_lower_dev(bond, sk); + rcu_read_unlock(); + + return lower; +} + static netdev_tx_t __bond_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct bonding *bond = netdev_priv(dev); @@ -4596,6 +4688,7 @@ static void bond_ethtool_get_drvinfo(struct net_device *bond_dev, .ndo_fix_features = bond_fix_features, .ndo_features_check = passthru_features_check, .ndo_get_xmit_slave = bond_xmit_get_slave, + .ndo_sk_get_lower_dev = bond_sk_get_lower_dev, }; static const struct device_type bond_type = { diff --git a/include/net/bonding.h b/include/net/bonding.h index 8603f62..4975064 100644 --- a/include/net/bonding.h +++ b/include/net/bonding.h @@ -266,6 +266,8 @@ struct bond_vlan_tag { unsigned short vlan_id; }; +bool bond_sk_check(struct bonding *bond); + /** * Returns NULL if the net_device does not belong to any of the bond's slaves *