diff mbox

[ovs-dev,PATCH/RFC] bond: add enable-recirc configuration for bond.

Message ID 1502813613-27353-1-git-send-email-simon.horman@netronome.com
State RFC
Headers show

Commit Message

Simon Horman Aug. 15, 2017, 4:13 p.m. UTC
From: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>

Adds a config parameter that determines if a bond will use recirculation
in the kernel datapath when implementing a bond in balance-tcp mode.

The default for enable-recirc is "true", resulting in the traditional
implementation of a bond in balance-tcp mode. Setting enable-recirc to
false results in datapath rules that do not rely on the recirculation
action.

example usage:
ovs-vsctl set port bond0 other_config:enable-recirc=false

Advantages:
- Allows TC offloading of OVS bonds on hardware which
  does not support recirculation
- Appears to result in lower latency (in systems using few flows).

Quick ping test results in:

other_config:enable-recirc=false
rtt min/avg/max/mdev = 0.039/0.193/7.612/1.059 ms

other_config:enable-recirc=true
rtt min/avg/max/mdev = 0.038/0.321/14.091/1.967 ms

More comprehensive testing is in progress.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
---
 ofproto/bond.c       | 11 ++++++++++-
 ofproto/bond.h       |  1 +
 vswitchd/bridge.c    |  3 +++
 vswitchd/vswitch.xml |  8 ++++++++
 4 files changed, 22 insertions(+), 1 deletion(-)

Comments

Andy Zhou Aug. 15, 2017, 8:31 p.m. UTC | #1
On Tue, Aug 15, 2017 at 9:13 AM, Simon Horman
<simon.horman@netronome.com> wrote:
> From: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
>
> Adds a config parameter that determines if a bond will use recirculation
> in the kernel datapath when implementing a bond in balance-tcp mode.

Using recirc for bonding is a lower level decision. I am not sure
OVSDB is right level
for exposing this detail.
>
> The default for enable-recirc is "true", resulting in the traditional
> implementation of a bond in balance-tcp mode. Setting enable-recirc to
> false results in datapath rules that do not rely on the recirculation
> action.
>
> example usage:
> ovs-vsctl set port bond0 other_config:enable-recirc=false
>
> Advantages:
> - Allows TC offloading of OVS bonds on hardware which
>   does not support recirculation

Not sure it is an advantage. Bonding with recirc allows the first flow
to be a mega flow.
without megaflow, all bond traffic has to be micro flows, which was
how OVS implements
bonding. The recirc was introduced to solve the flow explosion problem

Without supporting recirc, I'd image TC offloading would suffer the
same issue. May be it
is correct not to offload the bond flows, at least not before recirc
is supported in TC.

Another way to look at this is that when TC ran out of flow space, and
bonding is supported
by the kernel module. This configuration would prevent kernel module
from using recirc.

> - Appears to result in lower latency (in systems using few flows).

Not sure this is true in general. When using recirc with proper
megaflow, the latency should
improve for new microflows since upcalls can be omitted.


>
> Quick ping test results in:
>
> other_config:enable-recirc=false
> rtt min/avg/max/mdev = 0.039/0.193/7.612/1.059 ms
>
> other_config:enable-recirc=true
> rtt min/avg/max/mdev = 0.038/0.321/14.091/1.967 ms
>

When setting up the flow, 'recirc" requires 2 miss  upcalls, thus set
up latency is larger.
However, once flows are set up, the latency difference should be minimal.

I'd expect the max latency to be different with or without recirc.
Notice the minimal latency difference is small.
I'd expect the average latency to converge to minimal for sufficiently
large number of ping packets.

> More comprehensive testing is in progress.
>
> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
> Signed-off-by: Simon Horman <simon.horman@netronome.com>
Jan Scheurich Aug. 16, 2017, 2:32 p.m. UTC | #2
Hi Simon, 

This patch only covers how to prevent recirculation but does not describe how bond selection works in the absence of recirculation. Can you explain?

Thanks, Jan

> -----Original Message-----
> From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-
> bounces@openvswitch.org] On Behalf Of Simon Horman
> Sent: Tuesday, 15 August, 2017 18:14
> To: dev@openvswitch.org
> Cc: oss-drivers@netronome.com; Simon Horman
> <simon.horman@netronome.com>; Pieter Jansen van Vuuren
> <pieter.jansenvanvuuren@netronome.com>
> Subject: [ovs-dev] [PATCH/RFC] bond: add enable-recirc configuration for
> bond.
> 
> From: Pieter Jansen van Vuuren
> <pieter.jansenvanvuuren@netronome.com>
> 
> Adds a config parameter that determines if a bond will use recirculation
> in the kernel datapath when implementing a bond in balance-tcp mode.
> 
> The default for enable-recirc is "true", resulting in the traditional
> implementation of a bond in balance-tcp mode. Setting enable-recirc to
> false results in datapath rules that do not rely on the recirculation
> action.
> 
> example usage:
> ovs-vsctl set port bond0 other_config:enable-recirc=false
> 
> Advantages:
> - Allows TC offloading of OVS bonds on hardware which
>   does not support recirculation
> - Appears to result in lower latency (in systems using few flows).
> 
> Quick ping test results in:
> 
> other_config:enable-recirc=false
> rtt min/avg/max/mdev = 0.039/0.193/7.612/1.059 ms
> 
> other_config:enable-recirc=true
> rtt min/avg/max/mdev = 0.038/0.321/14.091/1.967 ms
> 
> More comprehensive testing is in progress.
> 
> Signed-off-by: Pieter Jansen van Vuuren
> <pieter.jansenvanvuuren@netronome.com>
> Signed-off-by: Simon Horman <simon.horman@netronome.com>
> ---
>  ofproto/bond.c       | 11 ++++++++++-
>  ofproto/bond.h       |  1 +
>  vswitchd/bridge.c    |  3 +++
>  vswitchd/vswitch.xml |  8 ++++++++
>  4 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/ofproto/bond.c b/ofproto/bond.c
> index 365a3ca7ffad..46f8a9afcb3b 100644
> --- a/ofproto/bond.c
> +++ b/ofproto/bond.c
> @@ -147,6 +147,7 @@ struct bond {
>                                 /* The MAC address of the active interface. */
>      /* Legacy compatibility. */
>      bool lacp_fallback_ab; /* Fallback to active-backup on LACP failure. */
> +    bool recirc_enabled;
> 
>      struct ovs_refcount ref_cnt;
>  };
> @@ -437,6 +438,11 @@ bond_reconfigure(struct bond *bond, const struct
> bond_settings *s)
>          revalidate = true;
>      }
> 
> +    if (bond->recirc_enabled != s->recirc_enabled) {
> +        bond->recirc_enabled = s->recirc_enabled;
> +        revalidate = true;
> +    }
> +
>      if (bond->rebalance_interval != s->rebalance_interval) {
>          bond->rebalance_interval = s->rebalance_interval;
>          revalidate = true;
> @@ -458,7 +464,10 @@ bond_reconfigure(struct bond *bond, const struct
> bond_settings *s)
>      }
> 
>      if (bond->balance != BM_AB) {
> -        if (!bond->recirc_id) {
> +	if (!bond->recirc_enabled) {
> +            recirc_free_id(bond->recirc_id);
> +            bond->recirc_id = 0;
> +        } else if (!bond->recirc_id) {
>              bond->recirc_id = recirc_alloc_id(bond->ofproto);
>          }
>      } else if (bond->recirc_id) {
> diff --git a/ofproto/bond.h b/ofproto/bond.h
> index e7c3d9bc35dd..beb937b9910e 100644
> --- a/ofproto/bond.h
> +++ b/ofproto/bond.h
> @@ -53,6 +53,7 @@ struct bond_settings {
>      int down_delay;             /* ms before disabling a down slave. */
> 
>      bool lacp_fallback_ab_cfg;  /* Fallback to active-backup on LACP
> failure. */
> +    bool recirc_enabled;
> 
>      struct eth_addr active_slave_mac;
>                                  /* The MAC address of the interface
> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> index a8cbae78cb23..b4b5c89ca6a0 100644
> --- a/vswitchd/bridge.c
> +++ b/vswitchd/bridge.c
> @@ -4259,6 +4259,9 @@ port_configure_bond(struct port *port, struct
> bond_settings *s)
>      s->lacp_fallback_ab_cfg = smap_get_bool(&port->cfg->other_config,
>                                         "lacp-fallback-ab", false);
> 
> +    s->recirc_enabled = smap_get_bool(&port->cfg->other_config,
> +                                      "enable-recirc", true);
> +
>      LIST_FOR_EACH (iface, port_elem, &port->ifaces) {
>          netdev_set_miimon_interval(iface->netdev, miimon_interval);
>      }
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 074535b588ef..7b97d720d276 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -1693,6 +1693,14 @@
>              is configured to LACP mode, the bond will use LACP.
>            </p>
>          </column>
> +
> +        <column name="other_config" key="enable-recirc"
> +                type='{"type": "boolean"}'>
> +          <p>
> +            Determines if a bond will use recirculation in the kernel datapath
> +            when implementing a bond in balance-tcp mode.
> +          </p>
> +        </column>
>        </group>
> 
>        <group title="Rebalancing Configuration">
> --
> 2.1.4
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Simon Horman Aug. 16, 2017, 2:59 p.m. UTC | #3
Hi Andy,

On Tue, Aug 15, 2017 at 01:31:36PM -0700, Andy Zhou wrote:
> On Tue, Aug 15, 2017 at 9:13 AM, Simon Horman
> <simon.horman@netronome.com> wrote:
> > From: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
> >
> > Adds a config parameter that determines if a bond will use recirculation
> > in the kernel datapath when implementing a bond in balance-tcp mode.
> 
> Using recirc for bonding is a lower level decision. I am not sure
> OVSDB is right level
> for exposing this detail.

I do see that this configuration is arguably more detailed than existing
bond parameters set via OVSDB, but it is still a bond parameter so perhaps
it isn't entirely out of place here.

> > The default for enable-recirc is "true", resulting in the traditional
> > implementation of a bond in balance-tcp mode. Setting enable-recirc to
> > false results in datapath rules that do not rely on the recirculation
> > action.
> >
> > example usage:
> > ovs-vsctl set port bond0 other_config:enable-recirc=false
> >
> > Advantages:
> > - Allows TC offloading of OVS bonds on hardware which
> >   does not support recirculation
> 
> Not sure it is an advantage. Bonding with recirc allows the first flow to
> be a mega flow.  without megaflow, all bond traffic has to be micro
> flows, which was how OVS implements bonding. The recirc was introduced to
> solve the flow explosion problem
> 
> Without supporting recirc, I'd image TC offloading would suffer the same
> issue. May be it is correct not to offload the bond flows, at least not
> before recirc is supported in TC.
> 
> Another way to look at this is that when TC ran out of flow space, and
> bonding is supported by the kernel module. This configuration would
> prevent kernel module from using recirc.

I believe that your points regarding flow explosion and fallback to
the kernel datapath are well made. However, I also believe there is
hardware where using recirculation is not practical. For that
reason I think it is worth exploring this proposal further.

> > - Appears to result in lower latency (in systems using few flows).
> 
> Not sure this is true in general. When using recirc with proper
> megaflow, the latency should
> improve for new microflows since upcalls can be omitted.

That may be true in terms of flow setup cost but I believe that
the per-packet cost is going to be higher when recirculation is used.

> > Quick ping test results in:
> >
> > other_config:enable-recirc=false
> > rtt min/avg/max/mdev = 0.039/0.193/7.612/1.059 ms
> >
> > other_config:enable-recirc=true
> > rtt min/avg/max/mdev = 0.038/0.321/14.091/1.967 ms
> >
> 
> When setting up the flow, 'recirc" requires 2 miss  upcalls, thus set
> up latency is larger.
> However, once flows are set up, the latency difference should be minimal.
> 
> I'd expect the max latency to be different with or without recirc.
> Notice the minimal latency difference is small.
> I'd expect the average latency to converge to minimal for sufficiently
> large number of ping packets.

Latency may not be the best way to measure this but having
to classify each packet twice, as I believe is the case when
recirculation is used, must cost something.

> > More comprehensive testing is in progress.
> >
> > Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
> > Signed-off-by: Simon Horman <simon.horman@netronome.com>
Andy Zhou Aug. 16, 2017, 6:22 p.m. UTC | #4
On Wed, Aug 16, 2017 at 7:59 AM, Simon Horman
<simon.horman@netronome.com> wrote:
> Hi Andy,
>
> On Tue, Aug 15, 2017 at 01:31:36PM -0700, Andy Zhou wrote:
>> On Tue, Aug 15, 2017 at 9:13 AM, Simon Horman
>> <simon.horman@netronome.com> wrote:
>> > From: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
>> >
>> > Adds a config parameter that determines if a bond will use recirculation
>> > in the kernel datapath when implementing a bond in balance-tcp mode.
>>
>> Using recirc for bonding is a lower level decision. I am not sure
>> OVSDB is right level
>> for exposing this detail.
>
> I do see that this configuration is arguably more detailed than existing
> bond parameters set via OVSDB, but it is still a bond parameter so perhaps
> it isn't entirely out of place here.
>
>> > The default for enable-recirc is "true", resulting in the traditional
>> > implementation of a bond in balance-tcp mode. Setting enable-recirc to
>> > false results in datapath rules that do not rely on the recirculation
>> > action.
>> >
>> > example usage:
>> > ovs-vsctl set port bond0 other_config:enable-recirc=false
>> >
>> > Advantages:
>> > - Allows TC offloading of OVS bonds on hardware which
>> >   does not support recirculation
>>
>> Not sure it is an advantage. Bonding with recirc allows the first flow to
>> be a mega flow.  without megaflow, all bond traffic has to be micro
>> flows, which was how OVS implements bonding. The recirc was introduced to
>> solve the flow explosion problem
>>
>> Without supporting recirc, I'd image TC offloading would suffer the same
>> issue. May be it is correct not to offload the bond flows, at least not
>> before recirc is supported in TC.
>>
>> Another way to look at this is that when TC ran out of flow space, and
>> bonding is supported by the kernel module. This configuration would
>> prevent kernel module from using recirc.
>
> I believe that your points regarding flow explosion and fallback to
> the kernel datapath are well made. However, I also believe there is
> hardware where using recirculation is not practical. For that
> reason I think it is worth exploring this proposal further.

I am not sure offloading microflows to hardware is a good ideal because of
flow explosion.
>
>> > - Appears to result in lower latency (in systems using few flows).
>>
>> Not sure this is true in general. When using recirc with proper
>> megaflow, the latency should
>> improve for new microflows since upcalls can be omitted.
>
> That may be true in terms of flow setup cost but I believe that
> the per-packet cost is going to be higher when recirculation is used.
>
>> > Quick ping test results in:
>> >
>> > other_config:enable-recirc=false
>> > rtt min/avg/max/mdev = 0.039/0.193/7.612/1.059 ms
>> >
>> > other_config:enable-recirc=true
>> > rtt min/avg/max/mdev = 0.038/0.321/14.091/1.967 ms
>> >
>>
>> When setting up the flow, 'recirc" requires 2 miss  upcalls, thus set
>> up latency is larger.
>> However, once flows are set up, the latency difference should be minimal.
>>
>> I'd expect the max latency to be different with or without recirc.
>> Notice the minimal latency difference is small.
>> I'd expect the average latency to converge to minimal for sufficiently
>> large number of ping packets.
>
> Latency may not be the best way to measure this but having
> to classify each packet twice, as I believe is the case when
> recirculation is used, must cost something.

I agree that recirculation is not free. The trade off is that it
avoids flow explosion.
>
>> > More comprehensive testing is in progress.
>> >
>> > Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
>> > Signed-off-by: Simon Horman <simon.horman@netronome.com>
diff mbox

Patch

diff --git a/ofproto/bond.c b/ofproto/bond.c
index 365a3ca7ffad..46f8a9afcb3b 100644
--- a/ofproto/bond.c
+++ b/ofproto/bond.c
@@ -147,6 +147,7 @@  struct bond {
                                /* The MAC address of the active interface. */
     /* Legacy compatibility. */
     bool lacp_fallback_ab; /* Fallback to active-backup on LACP failure. */
+    bool recirc_enabled;
 
     struct ovs_refcount ref_cnt;
 };
@@ -437,6 +438,11 @@  bond_reconfigure(struct bond *bond, const struct bond_settings *s)
         revalidate = true;
     }
 
+    if (bond->recirc_enabled != s->recirc_enabled) {
+        bond->recirc_enabled = s->recirc_enabled;
+        revalidate = true;
+    }
+
     if (bond->rebalance_interval != s->rebalance_interval) {
         bond->rebalance_interval = s->rebalance_interval;
         revalidate = true;
@@ -458,7 +464,10 @@  bond_reconfigure(struct bond *bond, const struct bond_settings *s)
     }
 
     if (bond->balance != BM_AB) {
-        if (!bond->recirc_id) {
+	if (!bond->recirc_enabled) {
+            recirc_free_id(bond->recirc_id);
+            bond->recirc_id = 0;
+        } else if (!bond->recirc_id) {
             bond->recirc_id = recirc_alloc_id(bond->ofproto);
         }
     } else if (bond->recirc_id) {
diff --git a/ofproto/bond.h b/ofproto/bond.h
index e7c3d9bc35dd..beb937b9910e 100644
--- a/ofproto/bond.h
+++ b/ofproto/bond.h
@@ -53,6 +53,7 @@  struct bond_settings {
     int down_delay;             /* ms before disabling a down slave. */
 
     bool lacp_fallback_ab_cfg;  /* Fallback to active-backup on LACP failure. */
+    bool recirc_enabled;
 
     struct eth_addr active_slave_mac;
                                 /* The MAC address of the interface
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index a8cbae78cb23..b4b5c89ca6a0 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -4259,6 +4259,9 @@  port_configure_bond(struct port *port, struct bond_settings *s)
     s->lacp_fallback_ab_cfg = smap_get_bool(&port->cfg->other_config,
                                        "lacp-fallback-ab", false);
 
+    s->recirc_enabled = smap_get_bool(&port->cfg->other_config,
+                                      "enable-recirc", true);
+
     LIST_FOR_EACH (iface, port_elem, &port->ifaces) {
         netdev_set_miimon_interval(iface->netdev, miimon_interval);
     }
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 074535b588ef..7b97d720d276 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -1693,6 +1693,14 @@ 
             is configured to LACP mode, the bond will use LACP.
           </p>
         </column>
+
+        <column name="other_config" key="enable-recirc"
+                type='{"type": "boolean"}'>
+          <p>
+            Determines if a bond will use recirculation in the kernel datapath
+            when implementing a bond in balance-tcp mode.
+          </p>
+        </column>
       </group>
 
       <group title="Rebalancing Configuration">