mbox series

[ovs-dev,0/4] Introduce ECMP_nexthop monitor in ovn-controller

Message ID cover.1731495611.git.lorenzo.bianconi@redhat.com
Headers show
Series Introduce ECMP_nexthop monitor in ovn-controller | expand

Message

Lorenzo Bianconi Nov. 13, 2024, 11:04 a.m. UTC
With respect to the previous implementation, OVN is now tracking and
flushing ECMP next-hop ct entries with the next-hop mac address instead
of using an identification number committed by ovn-northd since the previous
approach had some limitations when the traffic is initialized from outside
the ovn cluster. This series is introducing a way to periodically resolve L2
address of the configured next-hops.
Please note IPv6 is not currently supported and it will be added before posting
a formal series. The goal of this series is just to collect feedbacks about the
proposed approach.

Changes in v1:
- Flush just ecmp ct entries in the router ct zone

Changes in RFC v3:
- move ecmp-next-hop-monitor code in controller/ecmp-next-hop-monitor.{c,h}
- use a hamp for send_arp_nd_data since the same nexthop can be used by
  multiple ports/routes
- add new system-ovn test to check when a given next-hop is used by multiple
  ovn-routers/ecmp routes
- use a Port_Binding reference in ECMP_Nexthop
- use nexthop and port as ECMP_Nexthop table index
- move ECMP_Nexthop mac binding update in mac_binding_add_to_sb()
- cosmetics

Changes in RFC v2:
- add IPv6 support

Lorenzo Bianconi (4):
  northd: Introduce ECMP_Nexthop table in SB db.
  pinctrl: Send periodic arp/nd to ecmp next-hops.
  pinctrl: Update ecmp-nexthop mac resolving L2 address.
  ofctrl: Introduce ecmp_nexthop_monitor.

 NEWS                               |   7 +
 controller/automake.mk             |   4 +-
 controller/ecmp-next-hop-monitor.c | 184 ++++++++++
 controller/ecmp-next-hop-monitor.h |  25 ++
 controller/ofctrl.c                |   7 +
 controller/ofctrl.h                |   3 +
 controller/ovn-controller.8.xml    |  10 +
 controller/ovn-controller.c        |  10 +
 controller/pinctrl.c               | 368 +++++++++++++++++++-
 controller/pinctrl.h               |   3 +
 include/ovn/logical-fields.h       |   3 +
 northd/en-northd.c                 |  29 ++
 northd/en-northd.h                 |   4 +
 northd/inc-proc-northd.c           |   9 +-
 northd/northd.c                    | 104 +++++-
 northd/northd.h                    |   6 +
 ovn-sb.ovsschema                   |  17 +-
 ovn-sb.xml                         |  31 ++
 tests/ovn-northd.at                |  33 +-
 tests/system-ovn.at                | 526 +++++++++++++++++++++++++++++
 20 files changed, 1355 insertions(+), 28 deletions(-)
 create mode 100644 controller/ecmp-next-hop-monitor.c
 create mode 100644 controller/ecmp-next-hop-monitor.h

Comments

Dumitru Ceara Dec. 18, 2024, 1:30 p.m. UTC | #1
On 11/13/24 12:04 PM, Lorenzo Bianconi wrote:
> With respect to the previous implementation, OVN is now tracking and
> flushing ECMP next-hop ct entries with the next-hop mac address instead
> of using an identification number committed by ovn-northd since the previous
> approach had some limitations when the traffic is initialized from outside
> the ovn cluster. This series is introducing a way to periodically resolve L2
> address of the configured next-hops.
> Please note IPv6 is not currently supported and it will be added before posting
> a formal series. The goal of this series is just to collect feedbacks about the
> proposed approach.
> 

Hi Lorenzo,

I reviewed the patches and shared some feedback but I also wanted to
share that with this patch applied the ovn-kubernetes e2e tests that
exercise the ecmp-symmetric-reply functionality fail:

https://github.com/dceara/ovn-kubernetes/actions/runs/12392477221/job/34592485272

Summarizing 2 Failures:
  [FAIL] External Gateway With annotations e2e multiple external gateway
stale conntrack entry deletion validation Namespace annotation: Should
validate conntrack entry deletion for TCP/UDP traffic via multiple
external gateways a.k.a ECMP routes [It] IPV4 udp

/home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/external_gateways.go:785
  [FAIL] External Gateway With annotations e2e multiple external gateway
stale conntrack entry deletion validation Namespace annotation: Should
validate conntrack entry deletion for TCP/UDP traffic via multiple
external gateways a.k.a ECMP routes [It] IPV4 tcp

/home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/external_gateways.go:785

This needs to be investigated.  Maybe the tests do something wrong or
maybe the patch series itself has some issues.

Regards,
Dumitru

> Changes in v1:
> - Flush just ecmp ct entries in the router ct zone
> 
> Changes in RFC v3:
> - move ecmp-next-hop-monitor code in controller/ecmp-next-hop-monitor.{c,h}
> - use a hamp for send_arp_nd_data since the same nexthop can be used by
>   multiple ports/routes
> - add new system-ovn test to check when a given next-hop is used by multiple
>   ovn-routers/ecmp routes
> - use a Port_Binding reference in ECMP_Nexthop
> - use nexthop and port as ECMP_Nexthop table index
> - move ECMP_Nexthop mac binding update in mac_binding_add_to_sb()
> - cosmetics
> 
> Changes in RFC v2:
> - add IPv6 support
> 
> Lorenzo Bianconi (4):
>   northd: Introduce ECMP_Nexthop table in SB db.
>   pinctrl: Send periodic arp/nd to ecmp next-hops.
>   pinctrl: Update ecmp-nexthop mac resolving L2 address.
>   ofctrl: Introduce ecmp_nexthop_monitor.
> 
>  NEWS                               |   7 +
>  controller/automake.mk             |   4 +-
>  controller/ecmp-next-hop-monitor.c | 184 ++++++++++
>  controller/ecmp-next-hop-monitor.h |  25 ++
>  controller/ofctrl.c                |   7 +
>  controller/ofctrl.h                |   3 +
>  controller/ovn-controller.8.xml    |  10 +
>  controller/ovn-controller.c        |  10 +
>  controller/pinctrl.c               | 368 +++++++++++++++++++-
>  controller/pinctrl.h               |   3 +
>  include/ovn/logical-fields.h       |   3 +
>  northd/en-northd.c                 |  29 ++
>  northd/en-northd.h                 |   4 +
>  northd/inc-proc-northd.c           |   9 +-
>  northd/northd.c                    | 104 +++++-
>  northd/northd.h                    |   6 +
>  ovn-sb.ovsschema                   |  17 +-
>  ovn-sb.xml                         |  31 ++
>  tests/ovn-northd.at                |  33 +-
>  tests/system-ovn.at                | 526 +++++++++++++++++++++++++++++
>  20 files changed, 1355 insertions(+), 28 deletions(-)
>  create mode 100644 controller/ecmp-next-hop-monitor.c
>  create mode 100644 controller/ecmp-next-hop-monitor.h
>
Dumitru Ceara Dec. 20, 2024, 11:52 a.m. UTC | #2
On 12/18/24 2:30 PM, Dumitru Ceara wrote:
> On 11/13/24 12:04 PM, Lorenzo Bianconi wrote:
>> With respect to the previous implementation, OVN is now tracking and
>> flushing ECMP next-hop ct entries with the next-hop mac address instead
>> of using an identification number committed by ovn-northd since the previous
>> approach had some limitations when the traffic is initialized from outside
>> the ovn cluster. This series is introducing a way to periodically resolve L2
>> address of the configured next-hops.
>> Please note IPv6 is not currently supported and it will be added before posting
>> a formal series. The goal of this series is just to collect feedbacks about the
>> proposed approach.
>>
> 
> Hi Lorenzo,
> 
> I reviewed the patches and shared some feedback but I also wanted to
> share that with this patch applied the ovn-kubernetes e2e tests that
> exercise the ecmp-symmetric-reply functionality fail:
> 
> https://github.com/dceara/ovn-kubernetes/actions/runs/12392477221/job/34592485272
> 
> Summarizing 2 Failures:
>   [FAIL] External Gateway With annotations e2e multiple external gateway
> stale conntrack entry deletion validation Namespace annotation: Should
> validate conntrack entry deletion for TCP/UDP traffic via multiple
> external gateways a.k.a ECMP routes [It] IPV4 udp
> 
> /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/external_gateways.go:785
>   [FAIL] External Gateway With annotations e2e multiple external gateway
> stale conntrack entry deletion validation Namespace annotation: Should
> validate conntrack entry deletion for TCP/UDP traffic via multiple
> external gateways a.k.a ECMP routes [It] IPV4 tcp
> 
> /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/external_gateways.go:785
> 
> This needs to be investigated.  Maybe the tests do something wrong or
> maybe the patch series itself has some issues.

After some debugging we found that the problem is that ovn-kubernetes
has a bug.  Assuming it had configured an ECMP symmetric reply route
with multiple paths, all updates to that route (path removal) actually
translate to _independent_ transactions that:
1. remove ALL paths from the NB
2. re-add remaining paths to the NB

Obviously, if ovn-northd runs between transaction "1" and "2", the NB
doesn't contain any ECMP paths anymore so ovn-controller will flush
conntrack.

IMO this has to be fixed in ovn-kubernetes but until that happen we
can't ship the feature in OVN (or at least we can't ship it enabled by
default).

CCing some ovn-kubernetes maintainers.

Regards,
Dumitru