mbox series

[ovs-dev,v8,0/4] Introduce infrastructure for VIF plug providers

Message ID 20211102112916.3368168-1-frode.nordahl@canonical.com
Headers show
Series Introduce infrastructure for VIF plug providers | expand

Message

Frode Nordahl Nov. 2, 2021, 11:29 a.m. UTC
Introduce infrastructure for VIF plug providers and add feature to
ovn-controller to add and remove ports on the integration bridge as
directed by CMS through Logical_Switch_Port options.

Traditionally it has been the CMSs responsibility to create Virtual
Interfaces (VIFs) as part of instance (Container, Pod, Virtual
Machine etc.) life cycle, and subsequently manage plug/unplug
operations on the Open vSwitch integration bridge.

With the advent of NICs connected to multiple distinct CPUs we can
have a topology where the instance runs on one host and Open
vSwitch and OVN runs on a different host, the smartnic control plane
CPU.  The host facing interfaces will be visible to Open vSwitch
and OVN as representor ports.

The actions necessary for plugging and unplugging the representor
port in Open vSwitch running on the smartnic control plane CPU would
be the same for every CMS.

Hardware or platform specific details for initialization and lookup
of representor ports is provided by an plugging provider hosted
inside or outside the core OVN repository, and linked at OVN build
time.

RFC1 -> RFC2:
 - Introduce the plug-provider interface, remove hardware/platform
   dependent code.
 - Add ovsport module.
 - Integrate with binding module.
 - Split into multiple patches.

RFC2 -> v1:
 - Extend build system, `--with-plug-provider`.
 - Check for required data in b_ctx_in and lbinding data structures.
 - Split consider_plug_lport into update and create processing
   functions.
 - Consider unplug on release where relevant.
 - Add ovn-controller `--enable-dummy-plug` option for testing.
 - Consistent function naming and move ovsport module to controller/.
 - Rename plug-test.c -> plug-dummy.c.
 - Add functional- and unit- tests.

v1 -> v2:
 - Move update to controller/binding.h from patch 6 -> patch 5.
 - Fix lint problems reported by 0-day Robot.

v2 -> v3:
 - Fix build system extension for plug provider.
 - Implement DDlog version of northd change.
 - Rebase on tip.

v3 -> v4:
 - sb:Port_Binding:plugged_by -> sb:Port_Binding:requested_chassis.
 - Move documentation of plugin specific options to plugin
   implementation.
 - ovn-northd-ddlog: squash changes into same patch as C version, rework
   the DDlog implementation. 
 - controller: Use requested_chassis column instead of parsing options.
 - plug-provider: Remove the extra class instantiation layer and
   refcounting.  Add documentation and various tweaks.
 - controller: Add engine node for plug provider so that a plugin run
   function can run at an appropriate time and also allow feeding back
   information about changes that can not be handled incrementally.
 - binding: Fix return values for plug functions so they adhere to the
   incremental processing engine contract.
 - Add support for building in-tree plug providers.

v4 -> v5:
 - binding: Make some data structures and functions public to allow
   other modules to make use of its local binding tracking data
   structures.
 - Add separate incremental processing engine nodes for the plug
   provider.
 - Do change handling in the controller/plug module rather than piggy
   backing on the binding module.
 - Deal with asynchronous notification to plug provider class and
   subsequent freeing of data structures when the transaction commits.
 - Drop the representor plugin as in-tree plugin to address concerns about
   it being Linux-only and having netlink/kernel dependencies.  Will
   upstream to ovn-org/ovn-vif instead.

v5 -> v6:
 - Fix 0-day Robot documentation lexer warning treated as error:
    https://mail.openvswitch.org/pipermail/ovs-build/2021-September/017538.html

v6 -> v7:
 - lport: Fall back to strcmp when requested-chassis option is set
          but Port_Binding->requested_chassis is not filled yet.
 - physical: Make use of common lport_can_bind_on_this_chassis helper
             instead of duplicating the inverse check in the code.
 - tests: Add test cases to cover currently uncovered requested-chassis
          functionality.
 - plug: Make smaps struct members instead of pointers.
 - plug: Don't use is_deleted function when not iterating over tracked
         records.  Avoid adding multiple delete or update iface requests
         to the same transaction.
 - plug: For full recompute, don't process until northd has populated
         Port_Binding->requested_chassis to avoid thrashing on startup.
         Also fix order of processing.
 - plug: Handle failed transactions appropriately.
 - tests: Clean up and extend plug test case.

v7 -> v8:
 - Patches 1 through 9 from v7 was merged.
 - Use Port_Binding:requested_chassis index when iterating over PBs.
 - Drop separate I-P engine nodes for the VIF plug providers at this
   time, we can revisit the need for it in the event of VIF plug
   providers gaining support for Scalable Functions and the increased
   density that entails.
 - Rename module, objects and folders from 'plug-' to 'vif-plug-*' to
   avoid using too generic names.
 - Don't call vif_plug_port_ctx_destroy on vif_plug_port_prepare
   failure, make the convention that plug provider should clean up after
   itself in that case.
 - Simplify the unplug/plug thrashing avoidance on startup with a rather
   naive counter to ensure IDL has data prior to attempting any
   processing.  Once we identify the root of the issue this can be removed.

Previous discussion:
 - RFC1: https://patchwork.ozlabs.org/project/ovn/patch/20210509140305.1910796-1-frode.nordahl@canonical.com/
 - RFC2: https://patchwork.ozlabs.org/project/ovn/cover/20210805145013.3033919-1-frode.nordahl@gmail.com/

Frode Nordahl (4):
  lib: Add infrastructure for VIF plug providers.
  ovn-controller: Prepare VIF plug provider infrastructure.
  controller: Consider plugging VIF on CMS request.
  NEWS: Add note on infrastructure for VIF plug providers.

 Documentation/automake.mk                     |   2 +
 Documentation/topics/index.rst                |   1 +
 .../topics/vif-plug-providers/index.rst       |  32 +
 .../vif-plug-providers/vif-plug-providers.rst | 209 ++++++
 NEWS                                          |   7 +
 acinclude.m4                                  |  49 ++
 configure.ac                                  |   2 +
 controller/automake.mk                        |   4 +-
 controller/ovn-controller.c                   |  84 ++-
 controller/test-vif-plug.c                    |  72 ++
 controller/vif-plug.c                         | 632 ++++++++++++++++++
 controller/vif-plug.h                         |  79 +++
 lib/automake.mk                               |  10 +-
 lib/vif-plug-provider.c                       | 204 ++++++
 lib/vif-plug-provider.h                       | 163 +++++
 lib/vif-plug-providers/dummy/vif-plug-dummy.c | 120 ++++
 ovn-architecture.7.xml                        |  35 +-
 ovn-nb.xml                                    |  21 +
 tests/automake.mk                             |  13 +-
 tests/ovn-macros.at                           |   2 +-
 tests/ovn-vif-plug.at                         |   8 +
 tests/ovn.at                                  | 121 ++++
 22 files changed, 1852 insertions(+), 18 deletions(-)
 create mode 100644 Documentation/topics/vif-plug-providers/index.rst
 create mode 100644 Documentation/topics/vif-plug-providers/vif-plug-providers.rst
 create mode 100644 controller/test-vif-plug.c
 create mode 100644 controller/vif-plug.c
 create mode 100644 controller/vif-plug.h
 create mode 100644 lib/vif-plug-provider.c
 create mode 100644 lib/vif-plug-provider.h
 create mode 100644 lib/vif-plug-providers/dummy/vif-plug-dummy.c
 create mode 100644 tests/ovn-vif-plug.at

Comments

Han Zhou Nov. 5, 2021, 7:50 a.m. UTC | #1
On Tue, Nov 2, 2021 at 4:29 AM Frode Nordahl <frode.nordahl@canonical.com>
wrote:
>
> Introduce infrastructure for VIF plug providers and add feature to
> ovn-controller to add and remove ports on the integration bridge as
> directed by CMS through Logical_Switch_Port options.
>
> Traditionally it has been the CMSs responsibility to create Virtual
> Interfaces (VIFs) as part of instance (Container, Pod, Virtual
> Machine etc.) life cycle, and subsequently manage plug/unplug
> operations on the Open vSwitch integration bridge.
>
> With the advent of NICs connected to multiple distinct CPUs we can
> have a topology where the instance runs on one host and Open
> vSwitch and OVN runs on a different host, the smartnic control plane
> CPU.  The host facing interfaces will be visible to Open vSwitch
> and OVN as representor ports.
>
> The actions necessary for plugging and unplugging the representor
> port in Open vSwitch running on the smartnic control plane CPU would
> be the same for every CMS.
>
> Hardware or platform specific details for initialization and lookup
> of representor ports is provided by an plugging provider hosted
> inside or outside the core OVN repository, and linked at OVN build
> time.
>
> RFC1 -> RFC2:
>  - Introduce the plug-provider interface, remove hardware/platform
>    dependent code.
>  - Add ovsport module.
>  - Integrate with binding module.
>  - Split into multiple patches.
>
> RFC2 -> v1:
>  - Extend build system, `--with-plug-provider`.
>  - Check for required data in b_ctx_in and lbinding data structures.
>  - Split consider_plug_lport into update and create processing
>    functions.
>  - Consider unplug on release where relevant.
>  - Add ovn-controller `--enable-dummy-plug` option for testing.
>  - Consistent function naming and move ovsport module to controller/.
>  - Rename plug-test.c -> plug-dummy.c.
>  - Add functional- and unit- tests.
>
> v1 -> v2:
>  - Move update to controller/binding.h from patch 6 -> patch 5.
>  - Fix lint problems reported by 0-day Robot.
>
> v2 -> v3:
>  - Fix build system extension for plug provider.
>  - Implement DDlog version of northd change.
>  - Rebase on tip.
>
> v3 -> v4:
>  - sb:Port_Binding:plugged_by -> sb:Port_Binding:requested_chassis.
>  - Move documentation of plugin specific options to plugin
>    implementation.
>  - ovn-northd-ddlog: squash changes into same patch as C version, rework
>    the DDlog implementation.
>  - controller: Use requested_chassis column instead of parsing options.
>  - plug-provider: Remove the extra class instantiation layer and
>    refcounting.  Add documentation and various tweaks.
>  - controller: Add engine node for plug provider so that a plugin run
>    function can run at an appropriate time and also allow feeding back
>    information about changes that can not be handled incrementally.
>  - binding: Fix return values for plug functions so they adhere to the
>    incremental processing engine contract.
>  - Add support for building in-tree plug providers.
>
> v4 -> v5:
>  - binding: Make some data structures and functions public to allow
>    other modules to make use of its local binding tracking data
>    structures.
>  - Add separate incremental processing engine nodes for the plug
>    provider.
>  - Do change handling in the controller/plug module rather than piggy
>    backing on the binding module.
>  - Deal with asynchronous notification to plug provider class and
>    subsequent freeing of data structures when the transaction commits.
>  - Drop the representor plugin as in-tree plugin to address concerns about
>    it being Linux-only and having netlink/kernel dependencies.  Will
>    upstream to ovn-org/ovn-vif instead.
>
> v5 -> v6:
>  - Fix 0-day Robot documentation lexer warning treated as error:
>
https://mail.openvswitch.org/pipermail/ovs-build/2021-September/017538.html
>
> v6 -> v7:
>  - lport: Fall back to strcmp when requested-chassis option is set
>           but Port_Binding->requested_chassis is not filled yet.
>  - physical: Make use of common lport_can_bind_on_this_chassis helper
>              instead of duplicating the inverse check in the code.
>  - tests: Add test cases to cover currently uncovered requested-chassis
>           functionality.
>  - plug: Make smaps struct members instead of pointers.
>  - plug: Don't use is_deleted function when not iterating over tracked
>          records.  Avoid adding multiple delete or update iface requests
>          to the same transaction.
>  - plug: For full recompute, don't process until northd has populated
>          Port_Binding->requested_chassis to avoid thrashing on startup.
>          Also fix order of processing.
>  - plug: Handle failed transactions appropriately.
>  - tests: Clean up and extend plug test case.
>
> v7 -> v8:
>  - Patches 1 through 9 from v7 was merged.
>  - Use Port_Binding:requested_chassis index when iterating over PBs.
>  - Drop separate I-P engine nodes for the VIF plug providers at this
>    time, we can revisit the need for it in the event of VIF plug
>    providers gaining support for Scalable Functions and the increased
>    density that entails.
>  - Rename module, objects and folders from 'plug-' to 'vif-plug-*' to
>    avoid using too generic names.
>  - Don't call vif_plug_port_ctx_destroy on vif_plug_port_prepare
>    failure, make the convention that plug provider should clean up after
>    itself in that case.
>  - Simplify the unplug/plug thrashing avoidance on startup with a rather
>    naive counter to ensure IDL has data prior to attempting any
>    processing.  Once we identify the root of the issue this can be
removed.

Thanks Frode for the revision. It does look cleaner with the I-P related
code removed!
Mostly looks great, except that I still have two comments for the first
patch, and very minor ones for the 2nd and 4th. Hope we can resolve those
soon.

Thanks,
Han

>
> Previous discussion:
>  - RFC1:
https://patchwork.ozlabs.org/project/ovn/patch/20210509140305.1910796-1-frode.nordahl@canonical.com/
>  - RFC2:
https://patchwork.ozlabs.org/project/ovn/cover/20210805145013.3033919-1-frode.nordahl@gmail.com/
>
> Frode Nordahl (4):
>   lib: Add infrastructure for VIF plug providers.
>   ovn-controller: Prepare VIF plug provider infrastructure.
>   controller: Consider plugging VIF on CMS request.
>   NEWS: Add note on infrastructure for VIF plug providers.
>
>  Documentation/automake.mk                     |   2 +
>  Documentation/topics/index.rst                |   1 +
>  .../topics/vif-plug-providers/index.rst       |  32 +
>  .../vif-plug-providers/vif-plug-providers.rst | 209 ++++++
>  NEWS                                          |   7 +
>  acinclude.m4                                  |  49 ++
>  configure.ac                                  |   2 +
>  controller/automake.mk                        |   4 +-
>  controller/ovn-controller.c                   |  84 ++-
>  controller/test-vif-plug.c                    |  72 ++
>  controller/vif-plug.c                         | 632 ++++++++++++++++++
>  controller/vif-plug.h                         |  79 +++
>  lib/automake.mk                               |  10 +-
>  lib/vif-plug-provider.c                       | 204 ++++++
>  lib/vif-plug-provider.h                       | 163 +++++
>  lib/vif-plug-providers/dummy/vif-plug-dummy.c | 120 ++++
>  ovn-architecture.7.xml                        |  35 +-
>  ovn-nb.xml                                    |  21 +
>  tests/automake.mk                             |  13 +-
>  tests/ovn-macros.at                           |   2 +-
>  tests/ovn-vif-plug.at                         |   8 +
>  tests/ovn.at                                  | 121 ++++
>  22 files changed, 1852 insertions(+), 18 deletions(-)
>  create mode 100644 Documentation/topics/vif-plug-providers/index.rst
>  create mode 100644
Documentation/topics/vif-plug-providers/vif-plug-providers.rst
>  create mode 100644 controller/test-vif-plug.c
>  create mode 100644 controller/vif-plug.c
>  create mode 100644 controller/vif-plug.h
>  create mode 100644 lib/vif-plug-provider.c
>  create mode 100644 lib/vif-plug-provider.h
>  create mode 100644 lib/vif-plug-providers/dummy/vif-plug-dummy.c
>  create mode 100644 tests/ovn-vif-plug.at
>
> --
> 2.32.0
>
Frode Nordahl Nov. 5, 2021, 11:52 a.m. UTC | #2
On Fri, Nov 5, 2021 at 8:50 AM Han Zhou <hzhou@ovn.org> wrote:
>
>
>
> On Tue, Nov 2, 2021 at 4:29 AM Frode Nordahl <frode.nordahl@canonical.com> wrote:
> >
> > Introduce infrastructure for VIF plug providers and add feature to
> > ovn-controller to add and remove ports on the integration bridge as
> > directed by CMS through Logical_Switch_Port options.
> >
> > Traditionally it has been the CMSs responsibility to create Virtual
> > Interfaces (VIFs) as part of instance (Container, Pod, Virtual
> > Machine etc.) life cycle, and subsequently manage plug/unplug
> > operations on the Open vSwitch integration bridge.
> >
> > With the advent of NICs connected to multiple distinct CPUs we can
> > have a topology where the instance runs on one host and Open
> > vSwitch and OVN runs on a different host, the smartnic control plane
> > CPU.  The host facing interfaces will be visible to Open vSwitch
> > and OVN as representor ports.
> >
> > The actions necessary for plugging and unplugging the representor
> > port in Open vSwitch running on the smartnic control plane CPU would
> > be the same for every CMS.
> >
> > Hardware or platform specific details for initialization and lookup
> > of representor ports is provided by an plugging provider hosted
> > inside or outside the core OVN repository, and linked at OVN build
> > time.
> >
> > RFC1 -> RFC2:
> >  - Introduce the plug-provider interface, remove hardware/platform
> >    dependent code.
> >  - Add ovsport module.
> >  - Integrate with binding module.
> >  - Split into multiple patches.
> >
> > RFC2 -> v1:
> >  - Extend build system, `--with-plug-provider`.
> >  - Check for required data in b_ctx_in and lbinding data structures.
> >  - Split consider_plug_lport into update and create processing
> >    functions.
> >  - Consider unplug on release where relevant.
> >  - Add ovn-controller `--enable-dummy-plug` option for testing.
> >  - Consistent function naming and move ovsport module to controller/.
> >  - Rename plug-test.c -> plug-dummy.c.
> >  - Add functional- and unit- tests.
> >
> > v1 -> v2:
> >  - Move update to controller/binding.h from patch 6 -> patch 5.
> >  - Fix lint problems reported by 0-day Robot.
> >
> > v2 -> v3:
> >  - Fix build system extension for plug provider.
> >  - Implement DDlog version of northd change.
> >  - Rebase on tip.
> >
> > v3 -> v4:
> >  - sb:Port_Binding:plugged_by -> sb:Port_Binding:requested_chassis.
> >  - Move documentation of plugin specific options to plugin
> >    implementation.
> >  - ovn-northd-ddlog: squash changes into same patch as C version, rework
> >    the DDlog implementation.
> >  - controller: Use requested_chassis column instead of parsing options.
> >  - plug-provider: Remove the extra class instantiation layer and
> >    refcounting.  Add documentation and various tweaks.
> >  - controller: Add engine node for plug provider so that a plugin run
> >    function can run at an appropriate time and also allow feeding back
> >    information about changes that can not be handled incrementally.
> >  - binding: Fix return values for plug functions so they adhere to the
> >    incremental processing engine contract.
> >  - Add support for building in-tree plug providers.
> >
> > v4 -> v5:
> >  - binding: Make some data structures and functions public to allow
> >    other modules to make use of its local binding tracking data
> >    structures.
> >  - Add separate incremental processing engine nodes for the plug
> >    provider.
> >  - Do change handling in the controller/plug module rather than piggy
> >    backing on the binding module.
> >  - Deal with asynchronous notification to plug provider class and
> >    subsequent freeing of data structures when the transaction commits.
> >  - Drop the representor plugin as in-tree plugin to address concerns about
> >    it being Linux-only and having netlink/kernel dependencies.  Will
> >    upstream to ovn-org/ovn-vif instead.
> >
> > v5 -> v6:
> >  - Fix 0-day Robot documentation lexer warning treated as error:
> >     https://mail.openvswitch.org/pipermail/ovs-build/2021-September/017538.html
> >
> > v6 -> v7:
> >  - lport: Fall back to strcmp when requested-chassis option is set
> >           but Port_Binding->requested_chassis is not filled yet.
> >  - physical: Make use of common lport_can_bind_on_this_chassis helper
> >              instead of duplicating the inverse check in the code.
> >  - tests: Add test cases to cover currently uncovered requested-chassis
> >           functionality.
> >  - plug: Make smaps struct members instead of pointers.
> >  - plug: Don't use is_deleted function when not iterating over tracked
> >          records.  Avoid adding multiple delete or update iface requests
> >          to the same transaction.
> >  - plug: For full recompute, don't process until northd has populated
> >          Port_Binding->requested_chassis to avoid thrashing on startup.
> >          Also fix order of processing.
> >  - plug: Handle failed transactions appropriately.
> >  - tests: Clean up and extend plug test case.
> >
> > v7 -> v8:
> >  - Patches 1 through 9 from v7 was merged.
> >  - Use Port_Binding:requested_chassis index when iterating over PBs.
> >  - Drop separate I-P engine nodes for the VIF plug providers at this
> >    time, we can revisit the need for it in the event of VIF plug
> >    providers gaining support for Scalable Functions and the increased
> >    density that entails.
> >  - Rename module, objects and folders from 'plug-' to 'vif-plug-*' to
> >    avoid using too generic names.
> >  - Don't call vif_plug_port_ctx_destroy on vif_plug_port_prepare
> >    failure, make the convention that plug provider should clean up after
> >    itself in that case.
> >  - Simplify the unplug/plug thrashing avoidance on startup with a rather
> >    naive counter to ensure IDL has data prior to attempting any
> >    processing.  Once we identify the root of the issue this can be removed.
>
> Thanks Frode for the revision. It does look cleaner with the I-P related code removed!
> Mostly looks great, except that I still have two comments for the first patch, and very minor ones for the 2nd and 4th. Hope we can resolve those soon.

Thanks Han,

I believe the remaining issues should be resolved now, and I'll put up
a new iteration shortly.