[V3,2/2] bonding support for IPv6 transmit hashing

Message ID 4FEE99EE.2000001@8192.net
State Changes Requested, archived
Delegated to: David Miller

Commit Message

John June 30, 2012, 6:17 a.m. UTC
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Hannes Frederic Sowa June 30, 2012, 11:59 a.m. UTC | #1
On Sat, Jun 30, 2012 at 8:17 AM, John <linux@8192.net> wrote:
> diff --git a/Documentation/networking/bonding.txt
> b/Documentation/networking/bonding.txt
> index bfea8a3..5db14fe 100644
> --- a/Documentation/networking/bonding.txt
> +++ b/Documentation/networking/bonding.txt
> @@ -752,12 +752,22 @@ xmit_hash_policy
>                 protocol information to generate the hash.
>
>                 Uses XOR of hardware MAC addresses and IP addresses to
> -               generate the hash.  The formula is
> +               generate the hash.  The IPv4 formula is
>
>                 (((source IP XOR dest IP) AND 0xffff) XOR
>                         ( source MAC XOR destination MAC ))
>                                 modulo slave count
>
> +               The IPv6 formula is
> +
> +               iphash =
> +                       (source ip quad 2 XOR dest IP quad 2) XOR
> +                       (source ip quad 3 XOR dest IP quad 3) XOR
> +                       (source ip quad 4 XOR dest IP quad 4)
> +
> +               ((iphash >> 16) XOR (iphash >> 8) XOR iphash)
> +                       modulo slave count
> +

Wouldn't it be beneficial to include the ipv6 flow label in the hash
calculation?

Greetings,

  Hannes
John June 30, 2012, 7:38 p.m. UTC | #2
On 6/30/2012 4:59 AM, Hannes Frederic Sowa wrote:
> On Sat, Jun 30, 2012 at 8:17 AM, John <linux@8192.net> wrote:
>> diff --git a/Documentation/networking/bonding.txt
>> b/Documentation/networking/bonding.txt
>> index bfea8a3..5db14fe 100644
>> --- a/Documentation/networking/bonding.txt
>> +++ b/Documentation/networking/bonding.txt
>> @@ -752,12 +752,22 @@ xmit_hash_policy
>>                  protocol information to generate the hash.
>>
>>                  Uses XOR of hardware MAC addresses and IP addresses to
>> -               generate the hash.  The formula is
>> +               generate the hash.  The IPv4 formula is
>>
>>                  (((source IP XOR dest IP) AND 0xffff) XOR
>>                          ( source MAC XOR destination MAC ))
>>                                  modulo slave count
>>
>> +               The IPv6 formula is
>> +
>> +               iphash =
>> +                       (source ip quad 2 XOR dest IP quad 2) XOR
>> +                       (source ip quad 3 XOR dest IP quad 3) XOR
>> +                       (source ip quad 4 XOR dest IP quad 4)
>> +
>> +               ((iphash >> 16) XOR (iphash >> 8) XOR iphash)
>> +                       modulo slave count
>> +
>
> Wouldn't it be beneficial to include the ipv6 flow label in the hash
> calculation?
>
> Greetings,
>
>    Hannes

Hannes,

In all of the traffic I inspected, I don't believe I saw a single flow
label set. Even if Linux set it 100% of the time, packets routed or
bridged from another operating system wouldn't see any benefit. The
current algorithm already distributes the traffic very well, so I don't
believe adding the flow label would help even if it were set more
frequently.

If you feel strongly about its inclusion, though, I am willing to 
reconsider.

John

Hannes Frederic Sowa July 1, 2012, 3:57 a.m. UTC | #3
On Sat, Jun 30, 2012 at 9:38 PM, John <linux@8192.net> wrote:
> On 6/30/2012 4:59 AM, Hannes Frederic Sowa wrote:
>> On Sat, Jun 30, 2012 at 8:17 AM, John <linux@8192.net> wrote:
>>>
>>> diff --git a/Documentation/networking/bonding.txt
>>> b/Documentation/networking/bonding.txt
>>> index bfea8a3..5db14fe 100644
>>> --- a/Documentation/networking/bonding.txt
>>> +++ b/Documentation/networking/bonding.txt
>>> @@ -752,12 +752,22 @@ xmit_hash_policy
>>>                  protocol information to generate the hash.
>>>
>>>                  Uses XOR of hardware MAC addresses and IP addresses to
>>> -               generate the hash.  The formula is
>>> +               generate the hash.  The IPv4 formula is
>>>
>>>                  (((source IP XOR dest IP) AND 0xffff) XOR
>>>                          ( source MAC XOR destination MAC ))
>>>                                  modulo slave count
>>>
>>> +               The IPv6 formula is
>>> +
>>> +               iphash =
>>> +                       (source ip quad 2 XOR dest IP quad 2) XOR
>>> +                       (source ip quad 3 XOR dest IP quad 3) XOR
>>> +                       (source ip quad 4 XOR dest IP quad 4)
>>> +
>>> +               ((iphash >> 16) XOR (iphash >> 8) XOR iphash)
>>> +                       modulo slave count
>>> +
>>
>>
>> Wouldn't it be beneficial to include the ipv6 flow label in the hash
>> calculation?
>
> Hannes,
>
> In all of the traffic I inspected I don't believe I saw a single flow label
> set. Even if it were set 100% of the time by Linux, any packets routed or
> bridged from another operating system wouldn't see any benefit. The current
> algorithm distributes the traffic very well, I don't believe adding the flow
> label would be beneficial even if it were set more frequently.
>
> If you feel strongly about its inclusion, though, I am willing to
> reconsider.

It would definitely help to load balance tunnelled traffic over a
bonded interface. But as I currently don't use such a setup, I don't
have a strong opinion on that.

Greetings,

  Hannes
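
For illustration only, the flow label suggestion could be sketched as
below. This is not part of the posted patch. It assumes the first 32-bit
word of the IPv6 header is already available in host byte order, and the
function names v6_flow_label and v6_iphash_with_flow are hypothetical.
The only fact relied on is the IPv6 header layout: 4 bits of version,
8 bits of traffic class, then the 20-bit flow label.

```c
#include <stdint.h>

/*
 * Extract the 20-bit flow label from the first 32-bit word of an IPv6
 * header. The word is assumed to be in host byte order already:
 * version (4 bits) | traffic class (8 bits) | flow label (20 bits).
 */
uint32_t v6_flow_label(uint32_t first_word_host_order)
{
	return first_word_host_order & 0x000FFFFFu;
}

/*
 * Hypothetical variant of the address hash: XOR the flow label into
 * iphash before it is folded and reduced modulo the slave count.
 * A flow label of zero leaves iphash unchanged.
 */
uint32_t v6_iphash_with_flow(uint32_t iphash, uint32_t first_word_host_order)
{
	return iphash ^ v6_flow_label(first_word_host_order);
}
```

With this shape, packets that carry no flow label hash exactly as they
do without the change, so only flows that set the label (for example
tunnel endpoints that choose to) would be spread differently.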
Patch

diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index bfea8a3..5db14fe 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -752,12 +752,22 @@  xmit_hash_policy
  		protocol information to generate the hash.

  		Uses XOR of hardware MAC addresses and IP addresses to
-		generate the hash.  The formula is
+		generate the hash.  The IPv4 formula is

  		(((source IP XOR dest IP) AND 0xffff) XOR
  			( source MAC XOR destination MAC ))
  				modulo slave count

+		The IPv6 formula is
+
+		iphash =
+			(source IP quad 2 XOR dest IP quad 2) XOR
+			(source IP quad 3 XOR dest IP quad 3) XOR
+			(source IP quad 4 XOR dest IP quad 4)
+
+		((iphash >> 16) XOR (iphash >> 8) XOR iphash)
+			modulo slave count
+
  		This algorithm will place all traffic to a particular
  		network peer on the same slave.  For non-IP traffic,
  		the formula is the same as for the layer2 transmit
@@ -778,19 +788,30 @@  xmit_hash_policy
  		slaves, although a single connection will not span
  		multiple slaves.

-		The formula for unfragmented TCP and UDP packets is
+		The formula for unfragmented IPv4 TCP and UDP packets is

  		((source port XOR dest port) XOR
  			 ((source IP XOR dest IP) AND 0xffff)
  				modulo slave count

-		For fragmented TCP or UDP packets and all other IP
-		protocol traffic, the source and destination port
+		The formula for unfragmented IPv6 TCP and UDP packets is
+
+		iphash =
+			(source IP quad 2 XOR dest IP quad 2) XOR
+			(source IP quad 3 XOR dest IP quad 3) XOR
+			(source IP quad 4 XOR dest IP quad 4)
+
+		((source port XOR dest port) XOR
+			(iphash >> 16) XOR (iphash >> 8) XOR iphash)
+				modulo slave count
+
+		For fragmented TCP or UDP packets and all other IPv4 and
+		IPv6 protocol traffic, the source and destination port
  		information is omitted.  For non-IP traffic, the
  		formula is the same as for the layer2 transmit hash
  		policy.

-		This policy is intended to mimic the behavior of
+		The IPv4 policy is intended to mimic the behavior of
  		certain switches, notably Cisco switches with PFC2 as
  		well as some Foundry and IBM products.
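
The IPv6 formulas documented in this patch can be sketched in plain C as
follows. This is a reading of the documentation text only, not the
driver code from patch 1/2. It assumes that "quad" means one of the four
32-bit words of the 128-bit address, that all values are in a consistent
byte order, and that slave_count is greater than zero. The struct and
function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container: an IPv6 address as four 32-bit words,
 * where quad[0] is "quad 1" and quad[3] is "quad 4". */
struct v6addr {
	uint32_t quad[4];
};

/* iphash: XOR of source and destination quads 2, 3 and 4.
 * Quad 1 (the leading 32 bits of the prefix) is not used. */
uint32_t v6_iphash(const struct v6addr *src, const struct v6addr *dst)
{
	return (src->quad[1] ^ dst->quad[1]) ^
	       (src->quad[2] ^ dst->quad[2]) ^
	       (src->quad[3] ^ dst->quad[3]);
}

/* Fold the upper bits of iphash into the low bits:
 * (iphash >> 16) XOR (iphash >> 8) XOR iphash. */
uint32_t v6_fold(uint32_t iphash)
{
	return (iphash >> 16) ^ (iphash >> 8) ^ iphash;
}

/* Documented IPv6 formula for the address-only policy. */
unsigned int v6_hash_addr(const struct v6addr *src,
			  const struct v6addr *dst,
			  unsigned int slave_count)
{
	return v6_fold(v6_iphash(src, dst)) % slave_count;
}

/* Documented IPv6 formula for unfragmented TCP and UDP packets:
 * the port XOR is mixed with the folded address hash. */
unsigned int v6_hash_ports(const struct v6addr *src,
			   const struct v6addr *dst,
			   uint16_t sport, uint16_t dport,
			   unsigned int slave_count)
{
	uint32_t ports = (uint32_t)(sport ^ dport);

	return (ports ^ v6_fold(v6_iphash(src, dst))) % slave_count;
}
```

As a usage example, with source 2001:db8::1 and destination 2001:db8::2
the quads 2 to 4 XOR to 3, folding leaves 3 unchanged, and with four
slaves the address-only formula selects slave index 3. Adding ports 80
and 81 (XOR of 1) gives 1 XOR 3 = 2, so the port-aware formula selects
slave index 2.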