Message ID | 1502813613-27353-1-git-send-email-simon.horman@netronome.com |
---|---|
State | RFC |
Headers | show |
On Tue, Aug 15, 2017 at 9:13 AM, Simon Horman <simon.horman@netronome.com> wrote: > From: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> > > Adds a config parameter that determines if a bond will use recirculation > in the kernel datapath when implementing a bond in balance-tcp mode. Using recirc for bonding is a lower level decision. I am not sure OVSDB is right level for exposing this detail. > > The default for enable-recirc is "true", resulting in the traditional > implementation of a bond in balance-tcp mode. Setting enable-recirc to > false results in datapath rules that do not rely on the recirculation > action. > > example usage: > ovs-vsctl set port bond0 other_config:enable-recirc=false > > Advantages: > - Allows TC offloading of OVS bonds on hardware which > does not support recirculation Not sure it is an advantage. Bonding with recirc allows the first flow to be a mega flow. without megaflow, all bond traffic has to be micro flows, which was how OVS implements bonding. The recirc was introduced to solve the flow explosion problem Without supporting recirc, I'd image TC offloading would suffer the same issue. May be it is correct not to offload the bond flows, at least not before recirc is supported in TC. Another way to look at this is that when TC ran out of flow space, and bonding is supported by the kernel module. This configuration would prevent kernel module from using recirc. > - Appears to result in lower latency (in systems using few flows). Not sure this is true in general. When using recirc with proper megaflow, the latency should improve for new microflows since upcalls can be omitted. > > Quick ping test results in: > > other_config:enable-recirc=false > rtt min/avg/max/mdev = 0.039/0.193/7.612/1.059 ms > > other_config:enable-recirc=true > rtt min/avg/max/mdev = 0.038/0.321/14.091/1.967 ms > When setting up the flow, 'recirc" requires 2 miss upcalls, thus set up latency is larger. However, once flows are set up, the latency difference should be minimal. I'd expect the max latency to be different with or without recirc. Notice the minimal latency difference is small. I'd expect the average latency to converge to minimal for sufficiently large number of ping packets. > More comprehensive testing is in progress. > > Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> > Signed-off-by: Simon Horman <simon.horman@netronome.com>
Hi Simon, This patch only covers how to prevent recirculation but does not describe how bond selection works in the absence of recirculation. Can you explain? Thanks, Jan > -----Original Message----- > From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev- > bounces@openvswitch.org] On Behalf Of Simon Horman > Sent: Tuesday, 15 August, 2017 18:14 > To: dev@openvswitch.org > Cc: oss-drivers@netronome.com; Simon Horman > <simon.horman@netronome.com>; Pieter Jansen van Vuuren > <pieter.jansenvanvuuren@netronome.com> > Subject: [ovs-dev] [PATCH/RFC] bond: add enable-recirc configuration for > bond. > > From: Pieter Jansen van Vuuren > <pieter.jansenvanvuuren@netronome.com> > > Adds a config parameter that determines if a bond will use recirculation > in the kernel datapath when implementing a bond in balance-tcp mode. > > The default for enable-recirc is "true", resulting in the traditional > implementation of a bond in balance-tcp mode. Setting enable-recirc to > false results in datapath rules that do not rely on the recirculation > action. > > example usage: > ovs-vsctl set port bond0 other_config:enable-recirc=false > > Advantages: > - Allows TC offloading of OVS bonds on hardware which > does not support recirculation > - Appears to result in lower latency (in systems using few flows). > > Quick ping test results in: > > other_config:enable-recirc=false > rtt min/avg/max/mdev = 0.039/0.193/7.612/1.059 ms > > other_config:enable-recirc=true > rtt min/avg/max/mdev = 0.038/0.321/14.091/1.967 ms > > More comprehensive testing is in progress. > > Signed-off-by: Pieter Jansen van Vuuren > <pieter.jansenvanvuuren@netronome.com> > Signed-off-by: Simon Horman <simon.horman@netronome.com> > --- > ofproto/bond.c | 11 ++++++++++- > ofproto/bond.h | 1 + > vswitchd/bridge.c | 3 +++ > vswitchd/vswitch.xml | 8 ++++++++ > 4 files changed, 22 insertions(+), 1 deletion(-) > > diff --git a/ofproto/bond.c b/ofproto/bond.c > index 365a3ca7ffad..46f8a9afcb3b 100644 > --- a/ofproto/bond.c > +++ b/ofproto/bond.c > @@ -147,6 +147,7 @@ struct bond { > /* The MAC address of the active interface. */ > /* Legacy compatibility. */ > bool lacp_fallback_ab; /* Fallback to active-backup on LACP failure. */ > + bool recirc_enabled; > > struct ovs_refcount ref_cnt; > }; > @@ -437,6 +438,11 @@ bond_reconfigure(struct bond *bond, const struct > bond_settings *s) > revalidate = true; > } > > + if (bond->recirc_enabled != s->recirc_enabled) { > + bond->recirc_enabled = s->recirc_enabled; > + revalidate = true; > + } > + > if (bond->rebalance_interval != s->rebalance_interval) { > bond->rebalance_interval = s->rebalance_interval; > revalidate = true; > @@ -458,7 +464,10 @@ bond_reconfigure(struct bond *bond, const struct > bond_settings *s) > } > > if (bond->balance != BM_AB) { > - if (!bond->recirc_id) { > + if (!bond->recirc_enabled) { > + recirc_free_id(bond->recirc_id); > + bond->recirc_id = 0; > + } else if (!bond->recirc_id) { > bond->recirc_id = recirc_alloc_id(bond->ofproto); > } > } else if (bond->recirc_id) { > diff --git a/ofproto/bond.h b/ofproto/bond.h > index e7c3d9bc35dd..beb937b9910e 100644 > --- a/ofproto/bond.h > +++ b/ofproto/bond.h > @@ -53,6 +53,7 @@ struct bond_settings { > int down_delay; /* ms before disabling a down slave. */ > > bool lacp_fallback_ab_cfg; /* Fallback to active-backup on LACP > failure. */ > + bool recirc_enabled; > > struct eth_addr active_slave_mac; > /* The MAC address of the interface > diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c > index a8cbae78cb23..b4b5c89ca6a0 100644 > --- a/vswitchd/bridge.c > +++ b/vswitchd/bridge.c > @@ -4259,6 +4259,9 @@ port_configure_bond(struct port *port, struct > bond_settings *s) > s->lacp_fallback_ab_cfg = smap_get_bool(&port->cfg->other_config, > "lacp-fallback-ab", false); > > + s->recirc_enabled = smap_get_bool(&port->cfg->other_config, > + "enable-recirc", true); > + > LIST_FOR_EACH (iface, port_elem, &port->ifaces) { > netdev_set_miimon_interval(iface->netdev, miimon_interval); > } > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml > index 074535b588ef..7b97d720d276 100644 > --- a/vswitchd/vswitch.xml > +++ b/vswitchd/vswitch.xml > @@ -1693,6 +1693,14 @@ > is configured to LACP mode, the bond will use LACP. > </p> > </column> > + > + <column name="other_config" key="enable-recirc" > + type='{"type": "boolean"}'> > + <p> > + Determines if a bond will use recirculation in the kernel datapath > + when implementing a bond in balance-tcp mode. > + </p> > + </column> > </group> > > <group title="Rebalancing Configuration"> > -- > 2.1.4 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Hi Andy, On Tue, Aug 15, 2017 at 01:31:36PM -0700, Andy Zhou wrote: > On Tue, Aug 15, 2017 at 9:13 AM, Simon Horman > <simon.horman@netronome.com> wrote: > > From: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> > > > > Adds a config parameter that determines if a bond will use recirculation > > in the kernel datapath when implementing a bond in balance-tcp mode. > > Using recirc for bonding is a lower level decision. I am not sure > OVSDB is right level > for exposing this detail. I do see that this configuration is arguably more detailed than existing bond parameters set via OVSDB, but it is still a bond parameter so perhaps it isn't entirely out of place here. > > The default for enable-recirc is "true", resulting in the traditional > > implementation of a bond in balance-tcp mode. Setting enable-recirc to > > false results in datapath rules that do not rely on the recirculation > > action. > > > > example usage: > > ovs-vsctl set port bond0 other_config:enable-recirc=false > > > > Advantages: > > - Allows TC offloading of OVS bonds on hardware which > > does not support recirculation > > Not sure it is an advantage. Bonding with recirc allows the first flow to > be a mega flow. without megaflow, all bond traffic has to be micro > flows, which was how OVS implements bonding. The recirc was introduced to > solve the flow explosion problem > > Without supporting recirc, I'd image TC offloading would suffer the same > issue. May be it is correct not to offload the bond flows, at least not > before recirc is supported in TC. > > Another way to look at this is that when TC ran out of flow space, and > bonding is supported by the kernel module. This configuration would > prevent kernel module from using recirc. I believe that your points regarding flow explosion and fallback to the kernel datapath are well made. However, I also believe there is hardware where using recirculation is not practical. For that reason I think it is worth exploring this proposal further. > > - Appears to result in lower latency (in systems using few flows). > > Not sure this is true in general. When using recirc with proper > megaflow, the latency should > improve for new microflows since upcalls can be omitted. That may be true in terms of flow setup cost but I believe that the per-packet cost is going to be higher when recirculation is used. > > Quick ping test results in: > > > > other_config:enable-recirc=false > > rtt min/avg/max/mdev = 0.039/0.193/7.612/1.059 ms > > > > other_config:enable-recirc=true > > rtt min/avg/max/mdev = 0.038/0.321/14.091/1.967 ms > > > > When setting up the flow, 'recirc" requires 2 miss upcalls, thus set > up latency is larger. > However, once flows are set up, the latency difference should be minimal. > > I'd expect the max latency to be different with or without recirc. > Notice the minimal latency difference is small. > I'd expect the average latency to converge to minimal for sufficiently > large number of ping packets. Latency may not be the best way to measure this but having to classify each packet twice, as I believe is the case when recirculation is used, must cost something. > > More comprehensive testing is in progress. > > > > Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> > > Signed-off-by: Simon Horman <simon.horman@netronome.com>
On Wed, Aug 16, 2017 at 7:59 AM, Simon Horman <simon.horman@netronome.com> wrote: > Hi Andy, > > On Tue, Aug 15, 2017 at 01:31:36PM -0700, Andy Zhou wrote: >> On Tue, Aug 15, 2017 at 9:13 AM, Simon Horman >> <simon.horman@netronome.com> wrote: >> > From: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> >> > >> > Adds a config parameter that determines if a bond will use recirculation >> > in the kernel datapath when implementing a bond in balance-tcp mode. >> >> Using recirc for bonding is a lower level decision. I am not sure >> OVSDB is right level >> for exposing this detail. > > I do see that this configuration is arguably more detailed than existing > bond parameters set via OVSDB, but it is still a bond parameter so perhaps > it isn't entirely out of place here. > >> > The default for enable-recirc is "true", resulting in the traditional >> > implementation of a bond in balance-tcp mode. Setting enable-recirc to >> > false results in datapath rules that do not rely on the recirculation >> > action. >> > >> > example usage: >> > ovs-vsctl set port bond0 other_config:enable-recirc=false >> > >> > Advantages: >> > - Allows TC offloading of OVS bonds on hardware which >> > does not support recirculation >> >> Not sure it is an advantage. Bonding with recirc allows the first flow to >> be a mega flow. without megaflow, all bond traffic has to be micro >> flows, which was how OVS implements bonding. The recirc was introduced to >> solve the flow explosion problem >> >> Without supporting recirc, I'd image TC offloading would suffer the same >> issue. May be it is correct not to offload the bond flows, at least not >> before recirc is supported in TC. >> >> Another way to look at this is that when TC ran out of flow space, and >> bonding is supported by the kernel module. This configuration would >> prevent kernel module from using recirc. > > I believe that your points regarding flow explosion and fallback to > the kernel datapath are well made. However, I also believe there is > hardware where using recirculation is not practical. For that > reason I think it is worth exploring this proposal further. I am not sure offloading microflows to hardware is a good ideal because of flow explosion. > >> > - Appears to result in lower latency (in systems using few flows). >> >> Not sure this is true in general. When using recirc with proper >> megaflow, the latency should >> improve for new microflows since upcalls can be omitted. > > That may be true in terms of flow setup cost but I believe that > the per-packet cost is going to be higher when recirculation is used. > >> > Quick ping test results in: >> > >> > other_config:enable-recirc=false >> > rtt min/avg/max/mdev = 0.039/0.193/7.612/1.059 ms >> > >> > other_config:enable-recirc=true >> > rtt min/avg/max/mdev = 0.038/0.321/14.091/1.967 ms >> > >> >> When setting up the flow, 'recirc" requires 2 miss upcalls, thus set >> up latency is larger. >> However, once flows are set up, the latency difference should be minimal. >> >> I'd expect the max latency to be different with or without recirc. >> Notice the minimal latency difference is small. >> I'd expect the average latency to converge to minimal for sufficiently >> large number of ping packets. > > Latency may not be the best way to measure this but having > to classify each packet twice, as I believe is the case when > recirculation is used, must cost something. I agree that recirculation is not free. The trade off is that it avoids flow explosion. > >> > More comprehensive testing is in progress. >> > >> > Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> >> > Signed-off-by: Simon Horman <simon.horman@netronome.com>
diff --git a/ofproto/bond.c b/ofproto/bond.c index 365a3ca7ffad..46f8a9afcb3b 100644 --- a/ofproto/bond.c +++ b/ofproto/bond.c @@ -147,6 +147,7 @@ struct bond { /* The MAC address of the active interface. */ /* Legacy compatibility. */ bool lacp_fallback_ab; /* Fallback to active-backup on LACP failure. */ + bool recirc_enabled; struct ovs_refcount ref_cnt; }; @@ -437,6 +438,11 @@ bond_reconfigure(struct bond *bond, const struct bond_settings *s) revalidate = true; } + if (bond->recirc_enabled != s->recirc_enabled) { + bond->recirc_enabled = s->recirc_enabled; + revalidate = true; + } + if (bond->rebalance_interval != s->rebalance_interval) { bond->rebalance_interval = s->rebalance_interval; revalidate = true; @@ -458,7 +464,10 @@ bond_reconfigure(struct bond *bond, const struct bond_settings *s) } if (bond->balance != BM_AB) { - if (!bond->recirc_id) { + if (!bond->recirc_enabled) { + recirc_free_id(bond->recirc_id); + bond->recirc_id = 0; + } else if (!bond->recirc_id) { bond->recirc_id = recirc_alloc_id(bond->ofproto); } } else if (bond->recirc_id) { diff --git a/ofproto/bond.h b/ofproto/bond.h index e7c3d9bc35dd..beb937b9910e 100644 --- a/ofproto/bond.h +++ b/ofproto/bond.h @@ -53,6 +53,7 @@ struct bond_settings { int down_delay; /* ms before disabling a down slave. */ bool lacp_fallback_ab_cfg; /* Fallback to active-backup on LACP failure. */ + bool recirc_enabled; struct eth_addr active_slave_mac; /* The MAC address of the interface diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c index a8cbae78cb23..b4b5c89ca6a0 100644 --- a/vswitchd/bridge.c +++ b/vswitchd/bridge.c @@ -4259,6 +4259,9 @@ port_configure_bond(struct port *port, struct bond_settings *s) s->lacp_fallback_ab_cfg = smap_get_bool(&port->cfg->other_config, "lacp-fallback-ab", false); + s->recirc_enabled = smap_get_bool(&port->cfg->other_config, + "enable-recirc", true); + LIST_FOR_EACH (iface, port_elem, &port->ifaces) { netdev_set_miimon_interval(iface->netdev, miimon_interval); } diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index 074535b588ef..7b97d720d276 100644 --- a/vswitchd/vswitch.xml +++ b/vswitchd/vswitch.xml @@ -1693,6 +1693,14 @@ is configured to LACP mode, the bond will use LACP. </p> </column> + + <column name="other_config" key="enable-recirc" + type='{"type": "boolean"}'> + <p> + Determines if a bond will use recirculation in the kernel datapath + when implementing a bond in balance-tcp mode. + </p> + </column> </group> <group title="Rebalancing Configuration">