diff mbox series

[ovs-dev,1/4] ovn-ic: Do not try to advertise lla next_hops.

Message ID 20240827125554.3137819-2-xsimonar@redhat.com
State New
Headers show
Series ovn-ic patches. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes success github build: passed

Commit Message

Xavier Simonart Aug. 27, 2024, 12:55 p.m. UTC
Since [1] we do not learn routes with link-local next-hops.
However, routes with lla next hops were still created and deleted
from ovn-ic-sb, causing high cpu usage on ovn-ic and ovn-sc-sb,
as well as high traffic on physical interface.

[1] cb0e2b3f44da ("ovn-ic: do not learn routes with link-local next-hops")

Reported-at: https://issues.redhat.com/browse/FDP-619

Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
---
 ic/ovn-ic.c     |   5 +++
 tests/ovn-ic.at | 114 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 119 insertions(+)

Comments

Roberto Bartzen Acosta Aug. 29, 2024, 2:09 p.m. UTC | #1
Hi Xavier,

Thanks for your patch.
I have a couple of comments inline below.

Em ter., 27 de ago. de 2024 às 09:56, Xavier Simonart <xsimonar@redhat.com>
escreveu:

> Since [1] we do not learn routes with link-local next-hops.
> However, routes with lla next hops were still created and deleted
> from ovn-ic-sb, causing high cpu usage on ovn-ic and ovn-sc-sb,
> as well as high traffic on physical interface.
>
> [1] cb0e2b3f44da ("ovn-ic: do not learn routes with link-local next-hops")
>
> Reported-at: https://issues.redhat.com/browse/FDP-619
>
> Signed-off-by: Xavier Simonart <xsimonar@redhat.com>
> ---
>  ic/ovn-ic.c     |   5 +++
>  tests/ovn-ic.at | 114 ++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 119 insertions(+)
>
> diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
> index 69bac4ab2..e91010892 100644
> --- a/ic/ovn-ic.c
> +++ b/ic/ovn-ic.c
> @@ -1227,6 +1227,11 @@ add_network_to_routes_ad(struct hmap *routes_ad,
> const char *network,
>          return;
>      }
>
> +    if (in6_is_lla(&nexthop)) {
> +        VLOG_DBG("Route ad: skip lla nexthop of lrp %s.", nb_lrp->name);
> +        return;
> +    }
> +
>

I understand the main idea, and that makes sense to skip this type of route
advertisement. However, the way this was implemented and the filter in
learning as a whole is very confusing to me.
Imagine multi-tenant scenarios in which we will create LRPs connected to
TS, we don't know the internal IPv6 address scope of the tenant subnets.
So, the only address range that makes sense to use for the peer-link
between TS is the LLA. This way we guarantee that there will be no overlap
with internal router subnets.

According to RFC [1] the address space for LLA is a /10. However, in the
OVS lib implementation of the in6_is_lla function a /64 is expected (EUI-64
based -> with 54 additional zeros to the LLA start address "fe80").

In our use case, we use the base LLA with a network modifier (inside the
additional /54), ensuring that the address does not conflict with anyone
else, but can be used for learning and advertising.
e.g.

IPv6 Routes
Route Table <main>:
     2001:db8:1235:3::/64                fe80:1::21 dst-ip (learned) ecmp
     2001:db8:1235:3::/64                fe80:2::22 dst-ip (learned) ecmp
     2001:db8:1235:3::/64                fe80:3::23 dst-ip (learned) ecmp
     2001:db8:1235:3::/64                fe80:4::24 dst-ip (learned) ecmp
     2001:db8:1236:3::/64                fe80:1::31 dst-ip (learned) ecmp
     2001:db8:1236:3::/64                fe80:2::32 dst-ip (learned) ecmp
     2001:db8:1236:3::/64                fe80:3::33 dst-ip (learned) ecmp
     2001:db8:1236:3::/64                fe80:4::34 dst-ip (learned) ecmp
                                  ::/0              2001:db8:1:: dst-ip

To create this scenario I manually configure an IPv6 address based on fe80
in the TS/LRP's networks, without using auto-addressing.

So, could you or anyone else involved with this LLA nexthop filter use case
write test cases or filters in code to ensure that only auto-generated
EUI-64 LLA nexthop addresses are processed in the filters?

Best regards,
Roberto

[1] https://datatracker.ietf.org/doc/html/rfc4291#section-2.4



>      if (VLOG_IS_DBG_ENABLED()) {
>          struct ds msg = DS_EMPTY_INITIALIZER;
>
> diff --git a/tests/ovn-ic.at b/tests/ovn-ic.at
> index 8497cb194..4c0415d63 100644
> --- a/tests/ovn-ic.at
> +++ b/tests/ovn-ic.at
> @@ -2342,3 +2342,117 @@ OVN_CLEANUP_SBOX([hv2], ["/IGMP Querier enabled
> without a valid IPv4/d
>  OVN_CLEANUP_IC([az1],[az2])
>  AT_CLEANUP
>  ])
> +
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([interconnection - LLA])
> +AT_KEYWORDS([IP])
> +
> +# Logical network:
> +#
> +#       AZ1                     |                     AZ2
> +#   ---------------------------------------------------------------------
> +#    LS1P1  --- LS1 --- LR1 --- TS --- LR2 --- LS2 --- LS2P1
> +#   ---------------------------------------------------------------------
> +#
> +
> +ovn_init_ic_db
> +ovn_start az1
> +ovn_start az2
> +
> +net_add n1
> +
> +sim_add hv1
> +as hv1
> +check ovs-vsctl add-br br-phys
> +ovn_az_attach az1 n1 br-phys 192.168.1.1 16
> +check ovs-vsctl -- add-port br-int hv1-vif1 \
> +    -- set interface hv1-vif1 external-ids:iface-id=ls1p1 \
> +       options:tx_pcap=hv1/vif1-tx.pcap \
> +       options:rxq_pcap=hv1/vif1-rx.pcap
> +check ovs-vsctl set open . external-ids:ovn-is-interconn=true
> +
> +sim_add hv2
> +as hv2
> +check ovs-vsctl add-br br-phys
> +ovn_az_attach az2 n1 br-phys 192.168.2.1 16
> +check ovs-vsctl -- add-port br-int hv2-vif1 \
> +    -- set interface hv2-vif1 external-ids:iface-id=ls2p1 \
> +       options:tx_pcap=hv2/vif1-tx.pcap \
> +       options:rxq_pcap=hv2/vif1-rx.pcap
> +check ovs-vsctl set open . external-ids:ovn-is-interconn=true
> +
> +AT_CHECK([ovn-ic-nbctl --wait=sb create Transit_Switch name=ts1], [0],
> [ignore])
> +check ovn_as az1 ovn-nbctl wait-until logical_switch ts1
> +check ovn_as az2 ovn-nbctl wait-until logical_switch ts1
> +
> +ovn_as az1
> +check ovn-nbctl lr-add lr1 \
> +    -- lrp-add lr1 lr1-ts1 00:aa:aa:aa:aa:01 169.254.100.1/24
> 169:254:100::1/64 \
> +    -- lrp-add lr1 lr1-ls1 00:00:00:00:01:fe 172.16.1.254/24
> 172:16:1::254/64\
> +    -- lrp-set-gateway-chassis lr1-ts1 hv1
> +check ovn-nbctl ls-add ls1 \
> +    -- lsp-add ls1 ls1-lr1 \
> +    -- lsp-set-addresses ls1-lr1 router \
> +    -- lsp-set-type ls1-lr1 router \
> +    -- lsp-set-options ls1-lr1 router-port=lr1-ls1 \
> +    -- lsp-add ls1 ls1p1
> +check ovn-nbctl lsp-add ts1 ts1-lr1 \
> +    -- lsp-set-addresses ts1-lr1 router \
> +    -- lsp-set-type ts1-lr1 router \
> +    -- lsp-set-options ts1-lr1 router-port=lr1-ts1
> +wait_for_ports_up
> +
> +ovn_as az2
> +check ovn-nbctl lr-add lr2 \
> +    -- lrp-add lr2 lr2-ts1 00:aa:aa:aa:aa:02 169.254.100.2/24
> 169:254:100::2/64 \
> +    -- lrp-add lr2 lr2-ls2 00:00:00:00:02:fe 172.16.2.254/24
> 172:16:2::254/64 \
> +    -- lrp-set-gateway-chassis lr2-ts1 hv2
> +check ovn-nbctl ls-add ls2 \
> +    -- lsp-add ls2 ls2-lr2 \
> +    -- lsp-set-addresses ls2-lr2 router \
> +    -- lsp-set-type ls2-lr2 router \
> +    -- lsp-set-options ls2-lr2 router-port=lr2-ls2 \
> +    -- lsp-add ls2 ls2p1
> +check ovn-nbctl lsp-add ts1 ts1-lr2 \
> +    -- lsp-set-addresses ts1-lr2 router \
> +    -- lsp-set-type ts1-lr2 router \
> +    -- lsp-set-options ts1-lr2 router-port=lr2-ts1
> +
> +wait_for_ports_up
> +check ovn-ic-nbctl --wait=sb sync
> +
> +ovn_as az1
> +check ovn-nbctl lrp-set-gateway-chassis lr1-ts1 hv1
> +check ovn-nbctl set NB_Global . options:ic-route-adv=true
> options:ic-route-learn=true
> +
> +ovn_as az2
> +check ovn-nbctl lrp-set-gateway-chassis lr2-ts1 hv2
> +check ovn-nbctl set NB_Global . options:ic-route-adv=true
> options:ic-route-learn=true
> +
> +check ovn_as az1 ovn-nbctl --wait=hv sync
> +check ovn_as az2 ovn-nbctl --wait=hv sync
> +
> +# Pre-populate the hypervisors' ARP tables so that we don't lose any
> +# packets for ARP resolution (native tunneling doesn't queue packets
> +# for ARP resolution).
> +OVN_POPULATE_ARP
> +
> +ovn_as az1 check ovn-nbctl remove logical_router_port lr1-ts1 networks
> "169\:254\:100\:\:1/64"
> +
> +# ic-sb route should not contain lla next hops.
> +# Such routes used to be created and deleted in ic-sb.
> +# Do not use OVS_WAIT_WHILE or WAIT_UNTIL as the wrong (lla) route
> appears and disappears.
> +for i in $(seq 1 50); do
> +    AT_CHECK([ovn-ic-sbctl list route | grep "fe80:"], [1])
> +done
> +
> +OVN_CLEANUP_SBOX([hv1])
> +
> +OVN_CLEANUP_SBOX([hv2])
> +
> +OVN_CLEANUP_IC([az1],[az2])
> +AT_CLEANUP
> +])
> +
> +
> +
> --
> 2.31.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
diff mbox series

Patch

diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 69bac4ab2..e91010892 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -1227,6 +1227,11 @@  add_network_to_routes_ad(struct hmap *routes_ad, const char *network,
         return;
     }
 
+    if (in6_is_lla(&nexthop)) {
+        VLOG_DBG("Route ad: skip lla nexthop of lrp %s.", nb_lrp->name);
+        return;
+    }
+
     if (VLOG_IS_DBG_ENABLED()) {
         struct ds msg = DS_EMPTY_INITIALIZER;
 
diff --git a/tests/ovn-ic.at b/tests/ovn-ic.at
index 8497cb194..4c0415d63 100644
--- a/tests/ovn-ic.at
+++ b/tests/ovn-ic.at
@@ -2342,3 +2342,117 @@  OVN_CLEANUP_SBOX([hv2], ["/IGMP Querier enabled without a valid IPv4/d
 OVN_CLEANUP_IC([az1],[az2])
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([interconnection - LLA])
+AT_KEYWORDS([IP])
+
+# Logical network:
+#
+#       AZ1                     |                     AZ2
+#   ---------------------------------------------------------------------
+#    LS1P1  --- LS1 --- LR1 --- TS --- LR2 --- LS2 --- LS2P1
+#   ---------------------------------------------------------------------
+#
+
+ovn_init_ic_db
+ovn_start az1
+ovn_start az2
+
+net_add n1
+
+sim_add hv1
+as hv1
+check ovs-vsctl add-br br-phys
+ovn_az_attach az1 n1 br-phys 192.168.1.1 16
+check ovs-vsctl -- add-port br-int hv1-vif1 \
+    -- set interface hv1-vif1 external-ids:iface-id=ls1p1 \
+       options:tx_pcap=hv1/vif1-tx.pcap \
+       options:rxq_pcap=hv1/vif1-rx.pcap
+check ovs-vsctl set open . external-ids:ovn-is-interconn=true
+
+sim_add hv2
+as hv2
+check ovs-vsctl add-br br-phys
+ovn_az_attach az2 n1 br-phys 192.168.2.1 16
+check ovs-vsctl -- add-port br-int hv2-vif1 \
+    -- set interface hv2-vif1 external-ids:iface-id=ls2p1 \
+       options:tx_pcap=hv2/vif1-tx.pcap \
+       options:rxq_pcap=hv2/vif1-rx.pcap
+check ovs-vsctl set open . external-ids:ovn-is-interconn=true
+
+AT_CHECK([ovn-ic-nbctl --wait=sb create Transit_Switch name=ts1], [0], [ignore])
+check ovn_as az1 ovn-nbctl wait-until logical_switch ts1
+check ovn_as az2 ovn-nbctl wait-until logical_switch ts1
+
+ovn_as az1
+check ovn-nbctl lr-add lr1 \
+    -- lrp-add lr1 lr1-ts1 00:aa:aa:aa:aa:01 169.254.100.1/24 169:254:100::1/64 \
+    -- lrp-add lr1 lr1-ls1 00:00:00:00:01:fe 172.16.1.254/24 172:16:1::254/64\
+    -- lrp-set-gateway-chassis lr1-ts1 hv1
+check ovn-nbctl ls-add ls1 \
+    -- lsp-add ls1 ls1-lr1 \
+    -- lsp-set-addresses ls1-lr1 router \
+    -- lsp-set-type ls1-lr1 router \
+    -- lsp-set-options ls1-lr1 router-port=lr1-ls1 \
+    -- lsp-add ls1 ls1p1
+check ovn-nbctl lsp-add ts1 ts1-lr1 \
+    -- lsp-set-addresses ts1-lr1 router \
+    -- lsp-set-type ts1-lr1 router \
+    -- lsp-set-options ts1-lr1 router-port=lr1-ts1
+wait_for_ports_up
+
+ovn_as az2
+check ovn-nbctl lr-add lr2 \
+    -- lrp-add lr2 lr2-ts1 00:aa:aa:aa:aa:02 169.254.100.2/24 169:254:100::2/64 \
+    -- lrp-add lr2 lr2-ls2 00:00:00:00:02:fe 172.16.2.254/24 172:16:2::254/64 \
+    -- lrp-set-gateway-chassis lr2-ts1 hv2
+check ovn-nbctl ls-add ls2 \
+    -- lsp-add ls2 ls2-lr2 \
+    -- lsp-set-addresses ls2-lr2 router \
+    -- lsp-set-type ls2-lr2 router \
+    -- lsp-set-options ls2-lr2 router-port=lr2-ls2 \
+    -- lsp-add ls2 ls2p1
+check ovn-nbctl lsp-add ts1 ts1-lr2 \
+    -- lsp-set-addresses ts1-lr2 router \
+    -- lsp-set-type ts1-lr2 router \
+    -- lsp-set-options ts1-lr2 router-port=lr2-ts1
+
+wait_for_ports_up
+check ovn-ic-nbctl --wait=sb sync
+
+ovn_as az1
+check ovn-nbctl lrp-set-gateway-chassis lr1-ts1 hv1
+check ovn-nbctl set NB_Global . options:ic-route-adv=true options:ic-route-learn=true
+
+ovn_as az2
+check ovn-nbctl lrp-set-gateway-chassis lr2-ts1 hv2
+check ovn-nbctl set NB_Global . options:ic-route-adv=true options:ic-route-learn=true
+
+check ovn_as az1 ovn-nbctl --wait=hv sync
+check ovn_as az2 ovn-nbctl --wait=hv sync
+
+# Pre-populate the hypervisors' ARP tables so that we don't lose any
+# packets for ARP resolution (native tunneling doesn't queue packets
+# for ARP resolution).
+OVN_POPULATE_ARP
+
+ovn_as az1 check ovn-nbctl remove logical_router_port lr1-ts1 networks "169\:254\:100\:\:1/64"
+
+# ic-sb route should not contain lla next hops.
+# Such routes used to be created and deleted in ic-sb.
+# Do not use OVS_WAIT_WHILE or WAIT_UNTIL as the wrong (lla) route appears and disappears.
+for i in $(seq 1 50); do
+    AT_CHECK([ovn-ic-sbctl list route | grep "fe80:"], [1])
+done
+
+OVN_CLEANUP_SBOX([hv1])
+
+OVN_CLEANUP_SBOX([hv2])
+
+OVN_CLEANUP_IC([az1],[az2])
+AT_CLEANUP
+])
+
+
+