[net-next,v3] ipv4: Support multipath hashing on inner IP pkts for GRE tunnel

Message ID 20190613183858.9892-1-ssuryaextr@gmail.com
State Accepted
Delegated to: David Miller

Commit Message

Stephen Suryaputra June 13, 2019, 6:38 p.m. UTC
Multipath hash policy value of 0 isn't distributing since the outer IP
dest and src aren't varied even though the inner ones are. Since the flow
is on the inner ones in the case of tunneled traffic, hashing on them is
desired.

This is done mainly for IP over GRE, hence only tested for that. But
anything else supported by flow dissection should work.

v2: Use skb_flow_dissect_flow_keys() directly so that other tunneling
    can be supported through flow dissection (per Nikolay Aleksandrov).
v3: Remove accidental inclusion of ports in the hash keys and clarify
    the documentation (per Nikolay Aleksandrov).
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
---
 Documentation/networking/ip-sysctl.txt |  1 +
 net/ipv4/route.c                       | 17 +++++++++++++++++
 net/ipv4/sysctl_net_ipv4.c             |  2 +-
 3 files changed, 19 insertions(+), 1 deletion(-)
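
For illustration only, the key selection described above can be sketched in userspace C. This is not the kernel code: every demo_* name is hypothetical, the packet structure is a simplified stand-in for what flow dissection would yield, and demo_hash() is a toy mixer, not flow_hash_from_keys(). Policy 1 (Layer 4) is left out of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a packet; not a kernel structure. */
struct demo_pkt {
	uint32_t outer_src, outer_dst;	/* outer IPv4 addresses */
	int has_inner;			/* nonzero if an inner IP header exists */
	uint32_t inner_src, inner_dst;	/* inner IPv4 addresses, if any */
};

/* The address pair that is fed to the hash. */
struct demo_keys {
	uint32_t src, dst;
};

/*
 * Policy 0 always uses the outer addresses. Policy 2 uses the inner
 * addresses when the packet carries an inner header, otherwise it
 * falls back to the outer ones.
 */
static struct demo_keys demo_select_keys(const struct demo_pkt *pkt, int policy)
{
	struct demo_keys k;

	assert(pkt != NULL);
	if (policy == 2 && pkt->has_inner) {
		k.src = pkt->inner_src;
		k.dst = pkt->inner_dst;
	} else {
		k.src = pkt->outer_src;
		k.dst = pkt->outer_dst;
	}
	return k;
}

/* Toy mixing function; an assumption standing in for the real hash. */
static uint32_t demo_hash(struct demo_keys k)
{
	uint32_t h = k.src * 0x9e3779b1u;

	h ^= h >> 15;
	h += k.dst * 0x85ebca6bu;
	h ^= h >> 13;
	return h;
}

/* Map a packet to one of n next hops (n must be nonzero). */
static unsigned int demo_pick_nexthop(const struct demo_pkt *pkt, int policy,
				      unsigned int n)
{
	assert(n > 0);
	return demo_hash(demo_select_keys(pkt, policy)) % n;
}
```

With two GRE flows that share the same tunnel endpoints, policy 0 feeds identical keys to the hash, so both flows always land on the same next hop. Policy 2 feeds the differing inner addresses, so the flows can be spread across next hops.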

Comments

Nikolay Aleksandrov June 14, 2019, 10:55 a.m. UTC | #1
On 13/06/2019 21:38, Stephen Suryaputra wrote:
> Multipath hash policy value of 0 isn't distributing since the outer IP
> dest and src aren't varied even though the inner ones are. Since the flow
> is on the inner ones in the case of tunneled traffic, hashing on them is
> desired.
> 
> This is done mainly for IP over GRE, hence only tested for that. But
> anything else supported by flow dissection should work.
> 
> v2: Use skb_flow_dissect_flow_keys() directly so that other tunneling
>     can be supported through flow dissection (per Nikolay Aleksandrov).
> v3: Remove accidental inclusion of ports in the hash keys and clarify
>     the documentation (per Nikolay Aleksandrov).
> Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
> ---
>  Documentation/networking/ip-sysctl.txt |  1 +
>  net/ipv4/route.c                       | 17 +++++++++++++++++
>  net/ipv4/sysctl_net_ipv4.c             |  2 +-
>  3 files changed, 19 insertions(+), 1 deletion(-)
> 

Looks good to me,
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>

David Miller June 15, 2019, 2:43 a.m. UTC | #2
From: Stephen Suryaputra <ssuryaextr@gmail.com>
Date: Thu, 13 Jun 2019 14:38:58 -0400

> Multipath hash policy value of 0 isn't distributing since the outer IP
> dest and src aren't varied even though the inner ones are. Since the flow
> is on the inner ones in the case of tunneled traffic, hashing on them is
> desired.
> 
> This is done mainly for IP over GRE, hence only tested for that. But
> anything else supported by flow dissection should work.
> 
> v2: Use skb_flow_dissect_flow_keys() directly so that other tunneling
>     can be supported through flow dissection (per Nikolay Aleksandrov).
> v3: Remove accidental inclusion of ports in the hash keys and clarify
>     the documentation (per Nikolay Aleksandrov).
> Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>

Applied.

Ido Schimmel June 17, 2019, 2:39 p.m. UTC | #3
On Thu, Jun 13, 2019 at 02:38:58PM -0400, Stephen Suryaputra wrote:
> Multipath hash policy value of 0 isn't distributing since the outer IP
> dest and src aren't varied even though the inner ones are. Since the flow
> is on the inner ones in the case of tunneled traffic, hashing on them is
> desired.
> 
> This is done mainly for IP over GRE, hence only tested for that. But
> anything else supported by flow dissection should work.
> 
> v2: Use skb_flow_dissect_flow_keys() directly so that other tunneling
>     can be supported through flow dissection (per Nikolay Aleksandrov).
> v3: Remove accidental inclusion of ports in the hash keys and clarify
>     the documentation (per Nikolay Aleksandrov).
> Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>

Hi,

Do you plan to add IPv6 support? Would be good to have the same features
in both stacks.

Also, we have tests for these sysctls under
tools/testing/selftests/net/forwarding/router_multipath.sh

Can you add a test for this change as well? You'll probably need to
create a new file given the topology created by router_multipath.sh does
not include tunnels.

Thanks

David Ahern June 17, 2019, 3:53 p.m. UTC | #4
On 6/17/19 8:39 AM, Ido Schimmel wrote:
> 
> Do you plan to add IPv6 support? Would be good to have the same features
> in both stacks.

we really should be mandating equal support for all new changes like this.

> 
> Also, we have tests for these sysctls under
> tools/testing/selftests/net/forwarding/router_multipath.sh
> 
> Can you add a test for this change as well? You'll probably need to
> create a new file given the topology created by router_multipath.sh does
> not include tunnels.
> 
> Thanks
>

Stephen Suryaputra June 17, 2019, 5:08 p.m. UTC | #5
On Mon, Jun 17, 2019 at 09:53:06AM -0600, David Ahern wrote:
> On 6/17/19 8:39 AM, Ido Schimmel wrote:
> > 
> > Do you plan to add IPv6 support? Would be good to have the same features
> > in both stacks.
> 
> we really should be mandating equal support for all new changes like this.
> 
I will add that.
> > 
> > Also, we have tests for these sysctls under
> > tools/testing/selftests/net/forwarding/router_multipath.sh
> > 
> > Can you add a test for this change as well? You'll probably need to
> > create a new file given the topology created by router_multipath.sh does
> > not include tunnels.

I never looked at the selftests scripts, but will attempt.

Stephen.

Patch

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 5eedc6941ce5..f2ee426f13ad 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -80,6 +80,7 @@  fib_multipath_hash_policy - INTEGER
 	Possible values:
 	0 - Layer 3
 	1 - Layer 4
+	2 - Layer 3 or inner Layer 3 if present
 
 fib_sync_mem - UNSIGNED INTEGER
 	Amount of dirty memory from fib entries that can be backlogged before
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 14c7fdacaa72..7ad96121ed8e 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1872,6 +1872,23 @@  int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4,
 			hash_keys.basic.ip_proto = fl4->flowi4_proto;
 		}
 		break;
+	case 2:
+		memset(&hash_keys, 0, sizeof(hash_keys));
+		hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
+		/* skb is currently provided only when forwarding */
+		if (skb) {
+			struct flow_keys keys;
+
+			skb_flow_dissect_flow_keys(skb, &keys, 0);
+
+			hash_keys.addrs.v4addrs.src = keys.addrs.v4addrs.src;
+			hash_keys.addrs.v4addrs.dst = keys.addrs.v4addrs.dst;
+		} else {
+			/* Same as case 0 */
+			hash_keys.addrs.v4addrs.src = fl4->saddr;
+			hash_keys.addrs.v4addrs.dst = fl4->daddr;
+		}
+		break;
 	}
 	mhash = flow_hash_from_keys(&hash_keys);
 
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 2316c08e9591..e1efc2e62d21 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -960,7 +960,7 @@  static struct ctl_table ipv4_net_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_fib_multipath_hash_policy,
 		.extra1		= &zero,
-		.extra2		= &one,
+		.extra2		= &two,
 	},
 #endif
 	{