diff mbox series

[ovs-dev,v2] northd: Respect --ecmp-symmetric-reply for single routes.

Message ID 20240920153635.901447-1-rosemarie@redhat.com
State Accepted
Headers show
Series [ovs-dev,v2] northd: Respect --ecmp-symmetric-reply for single routes. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes fail github build: failed

Commit Message

Rosemarie O'Riorden Sept. 20, 2024, 3:36 p.m. UTC
When preparing to build ECMP and static route flows, routes are sorted
into unique routes (meaning they are not part of a group) or they are
added to EMCP groups. Then, ECMP route flows are built out of the
groups, and static route flows are built out of the unique routes.
However, 'unique routes' include ones that use the
--ecmp-symmetric-reply flag, meaning that they may not be added to an
ECMP group, and thus ECMP symmetric reply would not be used for those
flows.

For example, if one route is added and traffic is started, and then
later another route is added, the already flowing traffic might be
rerouted since it wasn't conntracked initially. This could break
symmetric reply with traffic using a different next-hop than before.

This change makes it so that when the --ecmp-symmetric-reply flag is
used, even for unique routes, an ECMP group is created which they are
added to. Thus they are added to the ECMP route flow, rather than
static. This allows ECMP groups to persist even when there is only one
route.

Edited documentation to support this change.
Also updated incorrect actions in documentation.

Fixes: 4fdca656857d ("Add ECMP symmetric replies.")
Reported-at: https://issues.redhat.com/browse/FDP-786
Signed-off-by: Rosemarie O'Riorden <rosemarie@redhat.com>
---
v2:                                                                             
 - Update ECMP-symmetric-reply system tests to verify correct behavior.         
---
 northd/northd.c         | 33 ++++++++++++++++++++++-----------
 northd/ovn-northd.8.xml | 13 ++++++++++++-
 tests/ovn-northd.at     |  5 ++++-
 tests/system-ovn.at     | 32 ++++++++++++++++++++++++++++++++
 4 files changed, 70 insertions(+), 13 deletions(-)

Comments

Dumitru Ceara Sept. 20, 2024, 7 p.m. UTC | #1
On 9/20/24 17:36, Rosemarie O'Riorden wrote:
> When preparing to build ECMP and static route flows, routes are sorted
> into unique routes (meaning they are not part of a group) or they are
> added to EMCP groups. Then, ECMP route flows are built out of the
> groups, and static route flows are built out of the unique routes.
> However, 'unique routes' include ones that use the
> --ecmp-symmetric-reply flag, meaning that they may not be added to an
> ECMP group, and thus ECMP symmetric reply would not be used for those
> flows.
> 
> For example, if one route is added and traffic is started, and then
> later another route is added, the already flowing traffic might be
> rerouted since it wasn't conntracked initially. This could break
> symmetric reply with traffic using a different next-hop than before.
> 
> This change makes it so that when the --ecmp-symmetric-reply flag is
> used, even for unique routes, an ECMP group is created which they are
> added to. Thus they are added to the ECMP route flow, rather than
> static. This allows ECMP groups to persist even when there is only one
> route.
> 
> Edited documentation to support this change.
> Also updated incorrect actions in documentation.
> 
> Fixes: 4fdca656857d ("Add ECMP symmetric replies.")
> Reported-at: https://issues.redhat.com/browse/FDP-786
> Signed-off-by: Rosemarie O'Riorden <rosemarie@redhat.com>
> ---

Looks good to me, thanks!

Acked-by: Dumitru Ceara <dceara@redhat.com>

Regards,
Dumitru
Dumitru Ceara Sept. 26, 2024, 9:38 a.m. UTC | #2
On 9/20/24 21:00, Dumitru Ceara wrote:
> On 9/20/24 17:36, Rosemarie O'Riorden wrote:
>> When preparing to build ECMP and static route flows, routes are sorted
>> into unique routes (meaning they are not part of a group) or they are
>> added to EMCP groups. Then, ECMP route flows are built out of the
>> groups, and static route flows are built out of the unique routes.
>> However, 'unique routes' include ones that use the
>> --ecmp-symmetric-reply flag, meaning that they may not be added to an
>> ECMP group, and thus ECMP symmetric reply would not be used for those
>> flows.
>>
>> For example, if one route is added and traffic is started, and then
>> later another route is added, the already flowing traffic might be
>> rerouted since it wasn't conntracked initially. This could break
>> symmetric reply with traffic using a different next-hop than before.
>>
>> This change makes it so that when the --ecmp-symmetric-reply flag is
>> used, even for unique routes, an ECMP group is created which they are
>> added to. Thus they are added to the ECMP route flow, rather than
>> static. This allows ECMP groups to persist even when there is only one
>> route.
>>
>> Edited documentation to support this change.
>> Also updated incorrect actions in documentation.
>>
>> Fixes: 4fdca656857d ("Add ECMP symmetric replies.")
>> Reported-at: https://issues.redhat.com/browse/FDP-786
>> Signed-off-by: Rosemarie O'Riorden <rosemarie@redhat.com>
>> ---
> 
> Looks good to me, thanks!
> 
> Acked-by: Dumitru Ceara <dceara@redhat.com>
> 

I actually went ahead and changed my ack to a signed-off-by and applied
the patch to main, 24.09, 24.03 and 23.09.

I did have to slightly adapt the test code on 24.03 and 23.09 because we
don't have 9c3ae6f27475 ("northd: Add ECMP symmetric replies for
egress.") there.

Regards,
Dumitru
Dumitru Ceara Sept. 27, 2024, 7:25 a.m. UTC | #3
On 9/26/24 11:38, Dumitru Ceara wrote:
> On 9/20/24 21:00, Dumitru Ceara wrote:
>> On 9/20/24 17:36, Rosemarie O'Riorden wrote:
>>> When preparing to build ECMP and static route flows, routes are sorted
>>> into unique routes (meaning they are not part of a group) or they are
>>> added to EMCP groups. Then, ECMP route flows are built out of the
>>> groups, and static route flows are built out of the unique routes.
>>> However, 'unique routes' include ones that use the
>>> --ecmp-symmetric-reply flag, meaning that they may not be added to an
>>> ECMP group, and thus ECMP symmetric reply would not be used for those
>>> flows.
>>>
>>> For example, if one route is added and traffic is started, and then
>>> later another route is added, the already flowing traffic might be
>>> rerouted since it wasn't conntracked initially. This could break
>>> symmetric reply with traffic using a different next-hop than before.
>>>
>>> This change makes it so that when the --ecmp-symmetric-reply flag is
>>> used, even for unique routes, an ECMP group is created which they are
>>> added to. Thus they are added to the ECMP route flow, rather than
>>> static. This allows ECMP groups to persist even when there is only one
>>> route.
>>>
>>> Edited documentation to support this change.
>>> Also updated incorrect actions in documentation.
>>>
>>> Fixes: 4fdca656857d ("Add ECMP symmetric replies.")
>>> Reported-at: https://issues.redhat.com/browse/FDP-786
>>> Signed-off-by: Rosemarie O'Riorden <rosemarie@redhat.com>
>>> ---
>>
>> Looks good to me, thanks!
>>
>> Acked-by: Dumitru Ceara <dceara@redhat.com>
>>
> 
> I actually went ahead and changed my ack to a signed-off-by and applied
> the patch to main, 24.09, 24.03 and 23.09.

I also applied this to 23.06.

Regards,
Dumitru
diff mbox series

Patch

diff --git a/northd/northd.c b/northd/northd.c
index 61f4f27ec..19f79da3f 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -11565,21 +11565,27 @@  build_ecmp_route_flow(struct lflow_table *lflows, struct ovn_datapath *od,
 
     struct ds actions = DS_EMPTY_INITIALIZER;
     ds_put_format(&actions, "ip.ttl--; flags.loopback = 1; %s = %"PRIu16
-                  "; %s = select(", REG_ECMP_GROUP_ID, eg->id,
-                  REG_ECMP_MEMBER_ID);
+                  "; %s = ", REG_ECMP_GROUP_ID, eg->id, REG_ECMP_MEMBER_ID);
 
-    bool is_first = true;
-    LIST_FOR_EACH (er, list_node, &eg->route_list) {
-        if (is_first) {
-            is_first = false;
-        } else {
-            ds_put_cstr(&actions, ", ");
+    if (!ovs_list_is_singleton(&eg->route_list)) {
+        bool is_first = true;
+
+        ds_put_cstr(&actions, "select(");
+        LIST_FOR_EACH (er, list_node, &eg->route_list) {
+            if (is_first) {
+                is_first = false;
+            } else {
+                ds_put_cstr(&actions, ", ");
+            }
+            ds_put_format(&actions, "%"PRIu16, er->id);
         }
-        ds_put_format(&actions, "%"PRIu16, er->id);
+        ds_put_cstr(&actions, ");");
+    } else {
+        er = CONTAINER_OF(ovs_list_front(&eg->route_list),
+                          struct ecmp_route_list_node, list_node);
+        ds_put_format(&actions, "%"PRIu16"; next;", er->id);
     }
 
-    ds_put_cstr(&actions, ");");
-
     ovn_lflow_add(lflows, od, S_ROUTER_IN_IP_ROUTING, priority,
                   ds_cstr(&route_match), ds_cstr(&actions),
                   lflow_ref);
@@ -13541,6 +13547,11 @@  build_static_route_flows_for_lrouter(
                 if (group) {
                     ecmp_groups_add_route(group, route);
                 }
+            } else if (route->ecmp_symmetric_reply) {
+                /* Traffic for symmetric reply routes has to be conntracked
+                 * even if there is only one next-hop, in case another next-hop
+                 * is added later. */
+                ecmp_groups_add(&ecmp_groups, route);
             } else {
                 unique_routes_add(&unique_routes, route);
             }
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index ede38882a..ef5cd0c8c 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -4328,7 +4328,18 @@  next;
 ip.ttl--;
 flags.loopback = 1;
 reg8[0..15] = <var>GID</var>;
-select(reg8[16..31], <var>MID1</var>, <var>MID2</var>, ...);
+reg8[16..31] = select(<var>MID1</var>, <var>MID2</var>, ...);
+        </pre>
+        <p>
+          However, when there is only one route in an ECMP group, group actions
+          will be:
+        </p>
+
+        <pre>
+ip.ttl--;
+flags.loopback = 1;
+reg8[0..15] = <var>GID</var>;
+reg8[16..31] = <var>MID1</var>);
         </pre>
       </li>
 
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index dcc3dbbc3..d459c23c0 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -6799,20 +6799,23 @@  check ovn-nbctl lsp-set-type public-lr0 router
 check ovn-nbctl lsp-set-addresses public-lr0 router
 check ovn-nbctl lsp-set-options public-lr0 router-port=lr0-public
 
+# ECMP flows will be added even if there is only one next-hop.
 check ovn-nbctl --wait=sb --ecmp-symmetric-reply lr-route-add lr0 1.0.0.1 192.168.0.10
 
 ovn-sbctl dump-flows lr0 > lr0flows
 
 AT_CHECK([grep -w "lr_in_ip_routing" lr0flows | ovn_strip_lflows], [0], [dnl
   table=??(lr_in_ip_routing   ), priority=0    , match=(1), action=(drop;)
+  table=??(lr_in_ip_routing   ), priority=10300, match=(ct_mark.ecmp_reply_port == 1 && reg7 == 0 && ip4.dst == 1.0.0.1/32), action=(ip.ttl--; flags.loopback = 1; eth.src = 00:00:20:20:12:13; reg1 = 192.168.0.1; outport = "lr0-public"; next;)
   table=??(lr_in_ip_routing   ), priority=10550, match=(nd_rs || nd_ra), action=(drop;)
   table=??(lr_in_ip_routing   ), priority=194  , match=(inport == "lr0-public" && ip6.dst == fe80::/64), action=(ip.ttl--; reg8[[0..15]] = 0; xxreg0 = ip6.dst; xxreg1 = fe80::200:20ff:fe20:1213; eth.src = 00:00:20:20:12:13; outport = "lr0-public"; flags.loopback = 1; next;)
   table=??(lr_in_ip_routing   ), priority=74   , match=(ip4.dst == 192.168.0.0/24), action=(ip.ttl--; reg8[[0..15]] = 0; reg0 = ip4.dst; reg1 = 192.168.0.1; eth.src = 00:00:20:20:12:13; outport = "lr0-public"; flags.loopback = 1; next;)
-  table=??(lr_in_ip_routing   ), priority=97   , match=(reg7 == 0 && ip4.dst == 1.0.0.1/32), action=(ip.ttl--; reg8[[0..15]] = 0; reg0 = 192.168.0.10; reg1 = 192.168.0.1; eth.src = 00:00:20:20:12:13; outport = "lr0-public"; flags.loopback = 1; next;)
+  table=??(lr_in_ip_routing   ), priority=97   , match=(reg7 == 0 && ip4.dst == 1.0.0.1/32), action=(ip.ttl--; flags.loopback = 1; reg8[[0..15]] = 1; reg8[[16..31]] = 1; next;)
 ])
 
 AT_CHECK([grep -e "lr_in_ip_routing_ecmp" lr0flows | ovn_strip_lflows], [0], [dnl
   table=??(lr_in_ip_routing_ecmp), priority=0    , match=(1), action=(drop;)
+  table=??(lr_in_ip_routing_ecmp), priority=100  , match=(reg8[[0..15]] == 1 && reg8[[16..31]] == 1), action=(reg0 = 192.168.0.10; reg1 = 192.168.0.1; eth.src = 00:00:20:20:12:13; outport = "lr0-public"; next;)
   table=??(lr_in_ip_routing_ecmp), priority=150  , match=(reg8[[0..15]] == 0), action=(next;)
 ])
 
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index e065b1079..cb72bcdd2 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -6241,6 +6241,22 @@  sed -e 's/mark=[[0-9]]*/mark=<cleared>/' | sort], [0], [dnl
 tcp,orig=(src=172.16.0.1,dst=10.0.0.2,sport=<cleared>,dport=<cleared>),reply=(src=10.0.0.2,dst=172.16.0.1,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=<cleared>,labels=0x401020500000000,protoinfo=(state=<cleared>)
 ])
 
+# Now remove one ECMP route and check that traffic is still being conntracked.
+check ovn-nbctl --policy="src-ip" lr-route-del R1 10.0.0.0/24 20.0.0.3
+check ovn-nbctl --wait=hv sync
+AT_CHECK([ovs-appctl dpctl/flush-conntrack])
+NETNS_DAEMONIZE([bob1], [nc -l -k 8081], [bob2.pid])
+NS_CHECK_EXEC([alice1], [nc -z 172.16.0.1 8081], [0])
+NS_CHECK_EXEC([alice1], [ping -q -c 3 -i 0.3 -w 2 172.16.0.1 | FORMAT_PING], \
+[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep 0x401020500000000 | FORMAT_CT(172.16.0.1) | \
+sed -e 's/zone=[[0-9]]*/zone=<cleared>/' |
+sed -e 's/mark=[[0-9]]*/mark=<cleared>/' | sort], [0], [dnl
+tcp,orig=(src=172.16.0.1,dst=10.0.0.2,sport=<cleared>,dport=<cleared>),reply=(src=10.0.0.2,dst=172.16.0.1,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=<cleared>,labels=0x401020500000000,protoinfo=(state=<cleared>)
+])
+
 OVS_APP_EXIT_AND_WAIT([ovn-controller])
 
 as ovn-sb
@@ -6457,6 +6473,22 @@  sed -e 's/mark=[[0-9]]*/mark=<cleared>/' | sort], [0], [dnl
 tcp,orig=(src=fd07::1,dst=fd01::2,sport=<cleared>,dport=<cleared>),reply=(src=fd01::2,dst=fd07::1,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=<cleared>,labels=0x1001020400000000,protoinfo=(state=<cleared>)
 ])
 
+# Now remove one ECMP route and check that traffic is still being conntracked.
+check ovn-nbctl --policy="src-ip" lr-route-del R1 fd01::/126 fd02::3
+check ovn-nbctl --wait=hv sync
+AT_CHECK([ovs-appctl dpctl/flush-conntrack])
+NETNS_DAEMONIZE([bob1], [nc -6 -l -k 8081], [bob2.pid])
+NS_CHECK_EXEC([alice1], [nc -6 -z fd07::1 8081], [0])
+NS_CHECK_EXEC([alice1], [ping -q -c 3 -i 0.3 -w 2 fd07::1 | FORMAT_PING], \
+[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep 0x1001020400000000 | FORMAT_CT(fd07::1) | \
+sed -e 's/zone=[[0-9]]*/zone=<cleared>/' |
+sed -e 's/mark=[[0-9]]*/mark=<cleared>/' | sort], [0], [dnl
+tcp,orig=(src=fd07::1,dst=fd01::2,sport=<cleared>,dport=<cleared>),reply=(src=fd01::2,dst=fd07::1,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=<cleared>,labels=0x1001020400000000,protoinfo=(state=<cleared>)
+])
+
 OVS_APP_EXIT_AND_WAIT([ovn-controller])
 
 as ovn-sb