diff mbox series

[ovs-dev] northd: Use next-hop network for snat when lb_force_snat_ip=routerip.

Message ID 20250211194939.3877019-1-rosemarie@redhat.com
State Changes Requested
Headers show
Series [ovs-dev] northd: Use next-hop network for snat when lb_force_snat_ip=routerip. | expand

Checks

Context Check Description
ovsrobot/apply-robot warning apply and check: warning
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes success github build: passed

Commit Message

Rosemarie O'Riorden Feb. 11, 2025, 7:49 p.m. UTC
When a gateway router has a load balancer configured, the option
lb_force_snat_ip=routerip can be set so that OVN snats load balanced
packets to the source logical router port's IP address, that is, the
port chosen as "outport" in the lr_in_ip_routing stage.

However, this was only designed to work when a single network was
configured on the source logical router outport. When multiple networks
were configured, OVN simply chose the lexicographically first IP address
for snat, which often resulted in an incorrect address being used.
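
The intended behavior is to pick the router-port address whose configured
network contains the next hop, rather than always taking the first address.
A minimal standalone C sketch of that selection (illustrative only; the
struct and helper names here are not OVN's):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only -- not OVN code.  A router port can have several
 * configured networks; the right SNAT source is the address whose
 * network contains the next hop, found by a per-network containment
 * check rather than by taking index 0. */
struct net {
    uint32_t addr;      /* router port IP, host byte order */
    unsigned int plen;  /* prefix length */
};

static size_t
pick_network(const struct net *nets, size_t n, uint32_t next_hop)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t mask = nets[i].plen ? ~0u << (32 - nets[i].plen) : 0;
        if ((next_hop & mask) == (nets[i].addr & mask)) {
            return i;   /* this index becomes flags.network_id */
        }
    }
    return 0;           /* fall back to the first network */
}
```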

To fix this, two main components have been added:
 1. A new flag, flags.network_id, a 4-bit field that stores a network index.
 2. A new stage in the router ingress pipeline, lr_in_network_id.

In the lr_in_network_id stage, OVN now generates flows that set
flags.network_id to the index of the router port network that contains
the next hop. This index is matched on later, and the address from the
network at that index is chosen for snat.
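
As a standalone illustration (not part of the patch; identifiers renamed),
a 4-bit index in bits 28..31 of a 32-bit flags register encodes and decodes
like this:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only -- not OVN code.  flags.network_id is a 4-bit
 * index stored in bits 28..31 of the 32-bit logical flags register.
 * The mask for a field spanning bits START..END is
 * ((1 << (END - START + 1)) - 1) << START, i.e. 0xF0000000 here. */
enum { NET_ID_START_BIT = 28, NET_ID_END_BIT = 31 };

#define NET_ID_MASK \
    ((uint32_t) ((1u << (NET_ID_END_BIT - NET_ID_START_BIT + 1)) - 1) \
     << NET_ID_START_BIT)

static uint32_t
set_network_id(uint32_t flags, uint32_t idx)
{
    /* Clear the field, then store idx in bits 28..31. */
    return (flags & ~NET_ID_MASK) | ((idx << NET_ID_START_BIT) & NET_ID_MASK);
}

static uint32_t
get_network_id(uint32_t flags)
{
    return (flags & NET_ID_MASK) >> NET_ID_START_BIT;
}
```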

Two tests have been added to verify that:
 1. The correct network is chosen for snat.
 2. The new and updated flows with flags.network_id are correct.

Tests that were broken by this new behavior have also been updated.

Signed-off-by: Rosemarie O'Riorden <rosemarie@redhat.com>
---
 include/ovn/logical-fields.h |   5 ++
 lib/logical-fields.c         |   3 +
 northd/northd.c              | 141 +++++++++++++++++++----------
 northd/northd.h              |   3 +-
 northd/ovn-northd.8.xml      |  17 ++--
 tests/ovn-northd.at          | 169 +++++++++++++++++++++++++++++++----
 6 files changed, 267 insertions(+), 71 deletions(-)

Comments

Rosemarie O'Riorden Feb. 12, 2025, 7:04 p.m. UTC | #1
Reported-at: https://issues.redhat.com/browse/FDP-871

On 2/11/25 2:49 PM, Rosemarie O'Riorden wrote:
> When a gateway router has a load balancer configured, the option
> lb_force_snat_ip=routerip can be set so that OVN snats load balanced
> packets to the source logical router port's IP address, that is, the
> port chosen as "outport" in the lr_in_ip_routing stage.
> 
> However, this was only designed to work when a single network was
> configured on the source logical router outport. When multiple networks
> were configured, OVN simply chose the lexicographically first IP address
> for snat, which often resulted in an incorrect address being used.
> 
> To fix this, two main components have been added:
>  1. A new flag, flags.network_id, a 4-bit field that stores a network index.
>  2. A new stage in the router ingress pipeline, lr_in_network_id.
> 
> In the lr_in_network_id stage, OVN now generates flows that set
> flags.network_id to the index of the router port network that contains
> the next hop. This index is matched on later, and the address from the
> network at that index is chosen for snat.
> 
> Two tests have been added to verify that:
>  1. The correct network is chosen for snat.
>  2. The new and updated flows with flags.network_id are correct.
> 
> Tests that were broken by this new behavior have also been updated.
> 
> Signed-off-by: Rosemarie O'Riorden <rosemarie@redhat.com>
> ---
>  include/ovn/logical-fields.h |   5 ++
>  lib/logical-fields.c         |   3 +
>  northd/northd.c              | 141 +++++++++++++++++++----------
>  northd/northd.h              |   3 +-
>  northd/ovn-northd.8.xml      |  17 ++--
>  tests/ovn-northd.at          | 169 +++++++++++++++++++++++++++++++----
>  6 files changed, 267 insertions(+), 71 deletions(-)
> 
> diff --git a/include/ovn/logical-fields.h b/include/ovn/logical-fields.h
> index fbe73763e..73feb4b1b 100644
> --- a/include/ovn/logical-fields.h
> +++ b/include/ovn/logical-fields.h
> @@ -95,6 +95,8 @@ enum mff_log_flags_bits {
>      MLF_ICMP_SNAT_BIT = 17,
>      MLF_OVERRIDE_LOCAL_ONLY_BIT = 18,
>      MLF_FROM_CTRL_BIT = 19,
> +    MLF_NETWORK_ID_START_BIT = 28,
> +    MLF_NETWORK_ID_END_BIT = 31,
>  };
>  
>  /* MFF_LOG_FLAGS_REG flag assignments */
> @@ -152,6 +154,9 @@ enum mff_log_flags {
>      MLF_ICMP_SNAT = (1 << MLF_ICMP_SNAT_BIT),
>  
>      MLF_OVERRIDE_LOCAL_ONLY = (1 << MLF_OVERRIDE_LOCAL_ONLY_BIT),
> +
> +    MLF_NETWORK_ID = ((1 << (MLF_NETWORK_ID_END_BIT - MLF_NETWORK_ID_START_BIT
> +                             + 1)) - 1) << MLF_NETWORK_ID_START_BIT,
>  };
>  
>  /* OVN logical fields
> diff --git a/lib/logical-fields.c b/lib/logical-fields.c
> index df1b4243c..84a5dec99 100644
> --- a/lib/logical-fields.c
> +++ b/lib/logical-fields.c
> @@ -139,6 +139,9 @@ ovn_init_symtab(struct shash *symtab)
>                               flags_str);
>      snprintf(flags_str, sizeof flags_str, "flags[%d]", MLF_RX_FROM_TUNNEL_BIT);
>      expr_symtab_add_subfield(symtab, "flags.tunnel_rx", NULL, flags_str);
> +    snprintf(flags_str, sizeof flags_str, "flags[%d..%d]",
> +             MLF_NETWORK_ID_START_BIT, MLF_NETWORK_ID_END_BIT);
> +    expr_symtab_add_subfield(symtab, "flags.network_id", NULL, flags_str);
>  
>      snprintf(flags_str, sizeof flags_str, "flags[%d]", MLF_FROM_CTRL_BIT);
>      expr_symtab_add_subfield(symtab, "flags.from_ctrl", NULL, flags_str);
> diff --git a/northd/northd.c b/northd/northd.c
> index 1097bb159..ea784766b 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -13091,12 +13091,12 @@ build_lrouter_force_snat_flows_op(struct ovn_port *op,
>          return;
>      }
>  
> -    if (op->lrp_networks.n_ipv4_addrs) {
> +    for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
>          ds_clear(match);
>          ds_clear(actions);
>  
>          ds_put_format(match, "inport == %s && ip4.dst == %s",
> -                      op->json_key, op->lrp_networks.ipv4_addrs[0].addr_s);
> +                      op->json_key, op->lrp_networks.ipv4_addrs[i].addr_s);
>          ovn_lflow_add(lflows, op->od, S_ROUTER_IN_UNSNAT, 110,
>                        ds_cstr(match), "ct_snat;", lflow_ref);
>  
> @@ -13105,53 +13105,46 @@ build_lrouter_force_snat_flows_op(struct ovn_port *op,
>          /* Higher priority rules to force SNAT with the router port ip.
>           * This only takes effect when the packet has already been
>           * load balanced once. */
> -        ds_put_format(match, "flags.force_snat_for_lb == 1 && ip4 && "
> -                      "outport == %s", op->json_key);
> +        ds_put_format(match, "flags.force_snat_for_lb == 1 && "
> +                      "flags.network_id == %"PRIuSIZE" && ip4 && "
> +                      "outport == %s", i, op->json_key);
>          ds_put_format(actions, "ct_snat(%s);",
> -                      op->lrp_networks.ipv4_addrs[0].addr_s);
> +                      op->lrp_networks.ipv4_addrs[i].addr_s);
>          ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_SNAT, 110,
>                        ds_cstr(match), ds_cstr(actions),
>                        lflow_ref);
> -        if (op->lrp_networks.n_ipv4_addrs > 1) {
> -            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> -            VLOG_WARN_RL(&rl, "Logical router port %s is configured with "
> -                              "multiple IPv4 addresses.  Only the first "
> -                              "IP [%s] is considered as SNAT for load "
> -                              "balancer", op->json_key,
> -                              op->lrp_networks.ipv4_addrs[0].addr_s);
> -        }
>      }
>  
>      /* op->lrp_networks.ipv6_addrs will always have LLA and that will be
> -     * last in the list. So add the flows only if n_ipv6_addrs > 1. */
> +     * last in the list. So add the flows only if n_ipv6_addrs > 1, and loop
> +     * n_ipv6_addrs - 1 times. */
>      if (op->lrp_networks.n_ipv6_addrs > 1) {
> -        ds_clear(match);
> -        ds_clear(actions);
> -
> -        ds_put_format(match, "inport == %s && ip6.dst == %s",
> -                      op->json_key, op->lrp_networks.ipv6_addrs[0].addr_s);
> -        ovn_lflow_add(lflows, op->od, S_ROUTER_IN_UNSNAT, 110,
> -                      ds_cstr(match), "ct_snat;", lflow_ref);
> +        for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs - 1; i++) {
> +            ds_clear(match);
> +            ds_clear(actions);
>  
> -        ds_clear(match);
> +            ds_put_format(match, "inport == %s && ip6.dst == %s",
> +                          op->json_key, op->lrp_networks.ipv6_addrs[i].addr_s);
> +            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_UNSNAT, 110,
> +                          ds_cstr(match), "ct_snat;", lflow_ref);
> +            ds_clear(match);
>  
> -        /* Higher priority rules to force SNAT with the router port ip.
> -         * This only takes effect when the packet has already been
> -         * load balanced once. */
> -        ds_put_format(match, "flags.force_snat_for_lb == 1 && ip6 && "
> -                      "outport == %s", op->json_key);
> -        ds_put_format(actions, "ct_snat(%s);",
> -                      op->lrp_networks.ipv6_addrs[0].addr_s);
> -        ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_SNAT, 110,
> -                      ds_cstr(match), ds_cstr(actions),
> -                      lflow_ref);
> -        if (op->lrp_networks.n_ipv6_addrs > 2) {
> -            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> -            VLOG_WARN_RL(&rl, "Logical router port %s is configured with "
> -                              "multiple IPv6 addresses.  Only the first "
> -                              "IP [%s] is considered as SNAT for load "
> -                              "balancer", op->json_key,
> -                              op->lrp_networks.ipv6_addrs[0].addr_s);
> +            /* Higher priority rules to force SNAT with the router port ip.
> +             * This only takes effect when the packet has already been
> +             * load balanced once. */
> +            ds_put_format(match, "flags.force_snat_for_lb == 1 && "
> +                          "flags.network_id == %"PRIuSIZE" && ip6 && "
> +                          "outport == %s", i, op->json_key);
> +            ds_put_format(actions, "ct_snat(%s);",
> +                          op->lrp_networks.ipv6_addrs[i].addr_s);
> +            ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_SNAT, 110,
> +                          ds_cstr(match), ds_cstr(actions),
> +                          lflow_ref);
>          }
>      }
>  }
> @@ -15038,6 +15031,56 @@ build_arp_request_flows_for_lrouter(
>                    lflow_ref);
>  }
>  
> +static void
> +build_lr_force_snat_network_id_flows(
> +            struct ovn_datapath *od, struct lflow_table *lflows,
> +            struct ds *match, struct ds *actions, struct lflow_ref *lflow_ref)
> +{
> +    const struct ovn_port *op;
> +    HMAP_FOR_EACH (op, dp_node, &od->ports) {
> +        for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
> +            ds_clear(match);
> +            ds_clear(actions);
> +
> +            ds_put_format(match, "flags.force_snat_for_lb == 1 && "
> +                          "outport == %s && " REG_NEXT_HOP_IPV4 " == %s/%d",
> +                          op->json_key, op->lrp_networks.ipv4_addrs[i].addr_s,
> +                          op->lrp_networks.ipv4_addrs[i].plen);
> +
> +            ds_put_format(actions, "flags.network_id = %"PRIuSIZE"; ", i);
> +            ds_put_format(actions, "next;");
> +
> +            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_NETWORK_ID, 110,
> +                          ds_cstr(match), ds_cstr(actions), lflow_ref);
> +        }
> +
> +        /* op->lrp_networks.ipv6_addrs will always have LLA and that will be
> +         * last in the list. So add the flows only if n_ipv6_addrs > 1, and
> +         * loop n_ipv6_addrs - 1 times. */
> +        if (op->lrp_networks.n_ipv6_addrs > 1) {
> +            for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs - 1; i++) {
> +                ds_clear(match);
> +                ds_clear(actions);
> +
> +                ds_put_format(match, "flags.force_snat_for_lb == 1 && "
> +                              "outport == %s && " REG_NEXT_HOP_IPV6 " == %s/%d",
> +                              op->json_key, op->lrp_networks.ipv6_addrs[i].addr_s,
> +                              op->lrp_networks.ipv6_addrs[i].plen);
> +
> +                ds_put_format(actions, "flags.network_id = %"PRIuSIZE"; ", i);
> +                ds_put_format(actions, "next;");
> +
> +                ovn_lflow_add(lflows, op->od, S_ROUTER_IN_NETWORK_ID, 110,
> +                              ds_cstr(match), ds_cstr(actions), lflow_ref);
> +            }
> +        }
> +    }
> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_NETWORK_ID, 0,
> +                  "1", "next;", lflow_ref);
> +}
> +
>  /* Logical router egress table DELIVERY: Delivery (priority 100-110).
>   *
>   * Priority 100 rules deliver packets to enabled logical ports.
> @@ -17035,26 +17078,28 @@ build_lrouter_nat_defrag_and_lb(
>      /* Handle force SNAT options set in the gateway router. */
>      if (od->is_gw_router) {
>          if (dnat_force_snat_ip) {
> -            if (lrnat_rec->dnat_force_snat_addrs.n_ipv4_addrs) {
> +            struct lport_addresses dn_addrs = lrnat_rec->dnat_force_snat_addrs;
> +            for (size_t i = 0; i < dn_addrs.n_ipv4_addrs; i++) {
>                  build_lrouter_force_snat_flows(lflows, od, "4",
> -                    lrnat_rec->dnat_force_snat_addrs.ipv4_addrs[0].addr_s,
> +                    lrnat_rec->dnat_force_snat_addrs.ipv4_addrs[i].addr_s,
>                      "dnat", lflow_ref);
>              }
> -            if (lrnat_rec->dnat_force_snat_addrs.n_ipv6_addrs) {
> +            for (size_t i = 0; i < dn_addrs.n_ipv6_addrs; i++) {
>                  build_lrouter_force_snat_flows(lflows, od, "6",
> -                    lrnat_rec->dnat_force_snat_addrs.ipv6_addrs[0].addr_s,
> +                    lrnat_rec->dnat_force_snat_addrs.ipv6_addrs[i].addr_s,
>                      "dnat", lflow_ref);
>              }
>          }
>          if (lb_force_snat_ip) {
> -            if (lrnat_rec->lb_force_snat_addrs.n_ipv4_addrs) {
> +            struct lport_addresses lb_addrs = lrnat_rec->lb_force_snat_addrs;
> +            for (size_t i = 0; i < lb_addrs.n_ipv4_addrs; i++) {
>                  build_lrouter_force_snat_flows(lflows, od, "4",
> -                    lrnat_rec->lb_force_snat_addrs.ipv4_addrs[0].addr_s, "lb",
> +                    lrnat_rec->lb_force_snat_addrs.ipv4_addrs[i].addr_s, "lb",
>                      lflow_ref);
>              }
> -            if (lrnat_rec->lb_force_snat_addrs.n_ipv6_addrs) {
> +            for (size_t i = 0; i < lb_addrs.n_ipv6_addrs; i++) {
>                  build_lrouter_force_snat_flows(lflows, od, "6",
> -                    lrnat_rec->lb_force_snat_addrs.ipv6_addrs[0].addr_s, "lb",
> +                    lrnat_rec->lb_force_snat_addrs.ipv6_addrs[i].addr_s, "lb",
>                      lflow_ref);
>              }
>          }
> @@ -17415,6 +17460,8 @@ build_lswitch_and_lrouter_iterate_by_lr(struct ovn_datapath *od,
>                                          &lsi->actions,
>                                          lsi->meter_groups,
>                                          NULL);
> +    build_lr_force_snat_network_id_flows(od, lsi->lflows, &lsi->match,
> +                                         &lsi->actions, NULL);
>      build_misc_local_traffic_drop_flows_for_lrouter(od, lsi->lflows, NULL);
>  
>      build_lr_nat_defrag_and_lb_default_flows(od, lsi->lflows, NULL);
> diff --git a/northd/northd.h b/northd/northd.h
> index 1f29645c7..e89ab3c74 100644
> --- a/northd/northd.h
> +++ b/northd/northd.h
> @@ -537,7 +537,8 @@ enum ovn_stage {
>      PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     21, "lr_in_chk_pkt_len")     \
>      PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     22, "lr_in_larger_pkts")     \
>      PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     23, "lr_in_gw_redirect")     \
> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     24, "lr_in_arp_request")     \
> +    PIPELINE_STAGE(ROUTER, IN,  NETWORK_ID,      24, "lr_in_network_id")      \
> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     25, "lr_in_arp_request")     \
>                                                                        \
>      /* Logical router egress stages. */                               \
>      PIPELINE_STAGE(ROUTER, OUT, CHECK_DNAT_LOCAL,   0,                       \
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index 93b1a9135..e15b91f1f 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -5256,18 +5256,19 @@ nd_ns {
>            table="Logical_Router"/>:lb_force_snat_ip=router_ip), then for
>            each logical router port <var>P</var> attached to the Gateway
>            router, a priority-110 flow matches
> -          <code>flags.force_snat_for_lb == 1 &amp;&amp; outport == <var>P</var>
> -          </code> with an action <code>ct_snat(<var>R</var>);</code>
> -          where <var>R</var> is the IP configured on the router port.
> -          If <code>R</code> is an IPv4 address then the match will also
> -          include <code>ip4</code> and if it is an IPv6 address, then the
> -          match will also include <code>ip6</code>.
> +          <code>flags.force_snat_for_lb == 1 &amp;&amp; flags.network_id ==
> +          <var>I</var> &amp;&amp; outport == <var>P</var></code>, where
> +          <var>I</var> is the network index, with an action
> +          <code>ct_snat(<var>R</var>);</code> where <var>R</var> is the IP
> +          configured on the router port. If <var>R</var> is an IPv4 address
> +          then the match will also include <code>ip4</code> and if it is an
> +          IPv6 address, then the match will also include <code>ip6</code>.
>          </p>
>  
>          <p>
>            If the logical router port <var>P</var> is configured with multiple
> -          IPv4 and multiple IPv6 addresses, only the first IPv4 and first IPv6
> -          address is considered.
> +          IPv4 and multiple IPv6 addresses, the IPv4 or IPv6 address on the
> +          same network as the next hop is chosen.
>          </p>
>        </li>
>  
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 64991ff75..777f3b95b 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -4460,9 +4460,9 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | ovn_strip_lflows], [0], [dnl
>  
>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>  ])
>  
> @@ -4525,10 +4525,10 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | ovn_strip_lflows], [0], [dnl
>  
>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip6 && outport == "lr0-sw1"), action=(ct_snat(bef0::1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lr0-sw1"), action=(ct_snat(bef0::1);)
>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>  ])
>  
> @@ -6235,8 +6235,8 @@ AT_CHECK([grep "lr_out_post_undnat" lr0flows | ovn_strip_lflows], [0], [dnl
>  
>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>    table=??(lr_out_snat        ), priority=25   , match=(ip && ip4.src == 10.0.0.0/24 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.10);)
>    table=??(lr_out_snat        ), priority=33   , match=(ip && ip4.src == 10.0.0.10 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.30);)
> @@ -6301,8 +6301,8 @@ AT_CHECK([grep "lr_out_post_undnat" lr0flows | ovn_strip_lflows], [0], [dnl
>  
>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>    table=??(lr_out_snat        ), priority=25   , match=(ip && ip4.src == 10.0.0.0/24 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.10);)
>    table=??(lr_out_snat        ), priority=33   , match=(ip && ip4.src == 10.0.0.10 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.30);)
> @@ -6379,10 +6379,10 @@ AT_CHECK([grep "lr_out_post_undnat" lr0flows | ovn_strip_lflows], [0], [dnl
>  
>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip6 && outport == "lr0-public"), action=(ct_snat(def0::10);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip6 && outport == "lr0-sw0"), action=(ct_snat(aef0::1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lr0-public"), action=(ct_snat(def0::10);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lr0-sw0"), action=(ct_snat(aef0::1);)
>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>    table=??(lr_out_snat        ), priority=25   , match=(ip && ip4.src == 10.0.0.0/24 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.10);)
>    table=??(lr_out_snat        ), priority=33   , match=(ip && ip4.src == 10.0.0.10 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.30);)
> @@ -15542,3 +15542,142 @@ CHECK_NO_CHANGE_AFTER_RECOMPUTE
>  
>  AT_CLEANUP
>  ])
> +
> +AT_SETUP([lb_force_snat_ip=routerip select correct network for snat])
> +AT_SKIP_IF([test $HAVE_SCAPY = no])
> +ovn_start
> +
> +check ovn-nbctl lr-add lr
> +check ovn-nbctl set logical_router lr options:chassis=hv1
> +check ovn-nbctl set logical_router lr options:lb_force_snat_ip=router_ip
> +check ovn-nbctl lrp-add lr lrp-client 02:00:00:00:00:02 1.1.1.1/24
> +check ovn-nbctl lrp-add lr lrp-server 02:00:00:00:00:03 1.1.2.1/24 1.2.1.1/24 \
> +                    1.1.3.1/24 2.2.2.1/24 7.7.7.1/24 8.8.8.1/24
> +check ovn-nbctl ls-add ls-client
> +check ovn-nbctl ls-add ls-server
> +check ovn-nbctl lsp-add ls-client lsp-client-router
> +check ovn-nbctl lsp-set-type lsp-client-router router
> +check ovn-nbctl lsp-add ls-server lsp-server-router
> +check ovn-nbctl lsp-set-type lsp-server-router router
> +check ovn-nbctl set logical_switch_port lsp-client-router options:router-port=lrp-client
> +check ovn-nbctl set logical_switch_port lsp-server-router options:router-port=lrp-server
> +check ovn-nbctl lsp-add ls-client client
> +check ovn-nbctl lsp-add ls-server server
> +check ovn-nbctl lsp-set-addresses client "02:00:00:00:00:01 1.1.1.10"
> +check ovn-nbctl lsp-set-addresses server "02:00:00:00:00:04 2.2.2.10"
> +check ovn-nbctl lsp-set-addresses lsp-client-router router
> +check ovn-nbctl lsp-set-addresses lsp-server-router router
> +check ovn-nbctl lb-add lb 42.42.42.42:80 2.2.2.10:80 udp
> +check ovn-nbctl lr-lb-add lr lb
> +
> +# Create a hypervisor and create OVS ports corresponding to logical ports.
> +net_add n1
> +sim_add hv1
> +as hv1
> +ovs-vsctl add-br br-phys
> +ovn_attach n1 br-phys 192.168.0.1
> +
> +ovs-vsctl -- add-port br-int hv1-vif1 -- \
> +    set interface hv1-vif1 external-ids:iface-id=client \
> +    options:tx_pcap=hv1/client-tx.pcap \
> +    options:rxq_pcap=hv1/client-rx.pcap
> +
> +ovs-vsctl -- add-port br-int hv1-vif2 -- \
> +    set interface hv1-vif2 external-ids:iface-id=server \
> +    options:tx_pcap=hv1/server-tx.pcap \
> +    options:rxq_pcap=hv1/server-rx.pcap
> +
> +wait_for_ports_up
> +check ovn-nbctl --wait=hv sync
> +
> +tx_src_mac="02:00:00:00:00:01"
> +tx_dst_mac="02:00:00:00:00:02"
> +tx_src_ip=1.1.1.10
> +tx_dst_ip=42.42.42.42
> +request=$(fmt_pkt "Ether(dst='${tx_dst_mac}', src='${tx_src_mac}')/ \
> +                  IP(src='${tx_src_ip}', dst='${tx_dst_ip}')/ \
> +                  UDP(sport=20001, dport=80)")
> +
> +as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $request
> +
> +rx_src_mac="02:00:00:00:00:03"
> +rx_dst_mac="02:00:00:00:00:04"
> +rx_src_ip=2.2.2.1
> +rx_dst_ip=2.2.2.10
> +expected=$(fmt_pkt "Ether(dst='${rx_dst_mac}', src='${rx_src_mac}')/ \
> +                  IP(src='${rx_src_ip}', dst='${rx_dst_ip}', ttl=0x3F)/ \
> +                  UDP(sport=20001, dport=80)")
> +
> +echo $expected > expected
> +OVN_CHECK_PACKETS([hv1/server-tx.pcap], [expected])
> +AT_CLEANUP
> +
> +AT_SETUP([lb_force_snat_ip=routerip generate flags.network_id flows])
> +ovn_start
> +
> +check ovn-nbctl lr-add lr
> +check ovn-nbctl set logical_router lr options:chassis=hv1
> +check ovn-nbctl set logical_router lr options:lb_force_snat_ip=router_ip
> +check ovn-nbctl lrp-add lr lrp-client 02:00:00:00:00:02 1.1.1.1/24 ff01::01
> +check ovn-nbctl lrp-add lr lrp-server 02:00:00:00:00:03 1.1.2.1/24 7.7.7.1/24 8.8.8.1/24 \
> +                      ff01::02 ff01::03 ff01::06
> +check ovn-nbctl ls-add ls-client
> +check ovn-nbctl ls-add ls-server
> +check ovn-nbctl lsp-add ls-client lsp-client-router
> +check ovn-nbctl lsp-set-type lsp-client-router router
> +check ovn-nbctl lsp-add ls-server lsp-server-router
> +check ovn-nbctl lsp-set-type lsp-server-router router
> +check ovn-nbctl set logical_switch_port lsp-client-router options:router-port=lrp-client
> +check ovn-nbctl set logical_switch_port lsp-server-router options:router-port=lrp-server
> +check ovn-nbctl lsp-add ls-client client
> +check ovn-nbctl lsp-add ls-server server
> +check ovn-nbctl lsp-set-addresses client "02:00:00:00:00:01 1.1.1.10"
> +check ovn-nbctl lsp-set-addresses server "02:00:00:00:00:04 2.2.2.10"
> +check ovn-nbctl lsp-set-addresses lsp-client-router router
> +check ovn-nbctl lsp-set-addresses lsp-server-router router
> +check ovn-nbctl lb-add lb 42.42.42.42:80 2.2.2.10:80 udp
> +check ovn-nbctl lr-lb-add lr lb
> +
> +# Create a hypervisor and create OVS ports corresponding to logical ports.
> +net_add n1
> +sim_add hv1
> +as hv1
> +ovs-vsctl add-br br-phys
> +ovn_attach n1 br-phys 192.168.0.1
> +
> +ovs-vsctl add-port br-int hv1-vif1 -- \
> +    set interface hv1-vif1 external-ids:iface-id=client \
> +    options:tx_pcap=hv1/client-tx.pcap \
> +    options:rxq_pcap=hv1/client-rx.pcap
> +
> +ovs-vsctl add-port br-int hv1-vif2 -- \
> +    set interface hv1-vif2 external-ids:iface-id=server \
> +    options:tx_pcap=hv1/server-tx.pcap \
> +    options:rxq_pcap=hv1/server-rx.pcap
> +
> +wait_for_ports_up
> +check ovn-nbctl --wait=hv sync
> +
> +ovn-sbctl dump-flows lr > lrflows
> +AT_CAPTURE_FILE([lrflows])
> +
> +AT_CHECK([grep -E flags.network_id lrflows | ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-client" && reg0 == 1.1.1.1/24), action=(flags.network_id = 0; next;)
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-client" && xxreg0 == ff01::1/128), action=(flags.network_id = 0; next;)
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && reg0 == 1.1.2.1/24), action=(flags.network_id = 0; next;)
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && reg0 == 7.7.7.1/24), action=(flags.network_id = 1; next;)
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && reg0 == 8.8.8.1/24), action=(flags.network_id = 2; next;)
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && xxreg0 == ff01::2/128), action=(flags.network_id = 0; next;)
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && xxreg0 == ff01::3/128), action=(flags.network_id = 1; next;)
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && xxreg0 == ff01::6/128), action=(flags.network_id = 2; next;)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lrp-client"), action=(ct_snat(1.1.1.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lrp-server"), action=(ct_snat(1.1.2.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lrp-client"), action=(ct_snat(ff01::1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lrp-server"), action=(ct_snat(ff01::2);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 1 && ip4 && outport == "lrp-server"), action=(ct_snat(7.7.7.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 1 && ip6 && outport == "lrp-server"), action=(ct_snat(ff01::3);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 2 && ip4 && outport == "lrp-server"), action=(ct_snat(8.8.8.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 2 && ip6 && outport == "lrp-server"), action=(ct_snat(ff01::6);)
> +])
> +AT_CLEANUP
Ilya Maximets Feb. 12, 2025, 10:48 p.m. UTC | #2
Hi, Rosemarie.  Thanks for the fix!  See some comments below.

Best regards, Ilya Maximets.

On 2/11/25 20:49, Rosemarie O'Riorden wrote:
> When a gateway router has a load balancer configured, the option
> lb_force_snat_ip=routerip can be set so that OVN snats load balanced
> packets to the source logical router port's IP address, that is, the

Nit: The word 'source' is a little confusing here; it makes me think
of a router port from which the packet entered the router.  Maybe
"logical router's egress interface's IP address"?

> port chosen as "outport" in the lr_in_ip_routing stage.
> 
> However, this was only designed to work when one network was configured
> on the source logical router outport. When multiple networks are configured,

Nit: And drop the 'source' here, I guess.

> OVN's behavior was to simply choose the lexicographically first IP address for
> snat. This lead to an incorrect address often being used for snat.
> 
> To fix this, two main components have been added:
>  1. A new flag, flags.network_id. It is 4 bits and stores an index.
>  2. A new stage in the router ingress pipeline, lr_in_network_id.
> 
> Now in the stage lr_in_network_id, OVN generates flows that assign
> flags.network_id with an index. This index is then matched on later and the
> network at that index will be chosen for snat.

Nit: This description is missing a key thing - how the network id is
assigned, i.e. what the criterion is for choosing a network id for a packet.
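For readers following along: the criterion the patch encodes in the new
lr_in_network_id stage is that the id is the index of the configured network
whose prefix contains the next hop held in REG_NEXT_HOP_IPV4/IPV6.  A minimal
standalone sketch of that containment test (the addresses below are made-up
examples, not values from northd):

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch of the selection criterion lr_in_network_id encodes: the
 * network id is the index of the configured network whose prefix
 * contains the next-hop address. */
static bool
prefix_contains(uint32_t network, unsigned plen, uint32_t addr)
{
    /* Build a contiguous netmask of 'plen' leading ones. */
    uint32_t mask = plen ? ~((1u << (32 - plen)) - 1) : 0;
    return (network & mask) == (addr & mask);
}
```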

> 
> Two tests have been added to verify that:
>  1. The correct network is chosen for snat.
>  2. The new and updated flows with flags.network_id are correct.
> 
> And tests that were broken by this new behavior have been updated.
> 

We may want to link the original mailing list discussion as well:

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417717.html

> Signed-off-by: Rosemarie O'Riorden <rosemarie@redhat.com>
> ---
>  include/ovn/logical-fields.h |   5 ++
>  lib/logical-fields.c         |   3 +
>  northd/northd.c              | 141 +++++++++++++++++++----------
>  northd/northd.h              |   3 +-
>  northd/ovn-northd.8.xml      |  17 ++--
>  tests/ovn-northd.at          | 169 +++++++++++++++++++++++++++++++----
>  6 files changed, 267 insertions(+), 71 deletions(-)
> 
> diff --git a/include/ovn/logical-fields.h b/include/ovn/logical-fields.h
> index fbe73763e..73feb4b1b 100644
> --- a/include/ovn/logical-fields.h
> +++ b/include/ovn/logical-fields.h
> @@ -95,6 +95,8 @@ enum mff_log_flags_bits {
>      MLF_ICMP_SNAT_BIT = 17,
>      MLF_OVERRIDE_LOCAL_ONLY_BIT = 18,
>      MLF_FROM_CTRL_BIT = 19,
> +    MLF_NETWORK_ID_START_BIT = 28,
> +    MLF_NETWORK_ID_END_BIT = 31,
>  };
>  
>  /* MFF_LOG_FLAGS_REG flag assignments */
> @@ -152,6 +154,9 @@ enum mff_log_flags {
>      MLF_ICMP_SNAT = (1 << MLF_ICMP_SNAT_BIT),
>  
>      MLF_OVERRIDE_LOCAL_ONLY = (1 << MLF_OVERRIDE_LOCAL_ONLY_BIT),
> +
> +    MLF_NETWORK_ID = ((MLF_NETWORK_ID_END_BIT - MLF_NETWORK_ID_START_BIT + 1)
> +                       << MLF_NETWORK_ID_START_BIT) - 1,

Hmm, this doesn't seem right:

>>> hex(((31 - 28 + 1) << 28) - 1)
'0x3fffffff'

And what we actually want is 0xf0000000.

This constant is not used anywhere in the code, but it should still be correct.
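A quick way to double-check: the mask for an N-bit field starting at bit
START is a run of N ones shifted into place, not (N << START) - 1.  As a
standalone sketch (not the actual header):

```c
/* Same bit positions as in include/ovn/logical-fields.h. */
enum { MLF_NETWORK_ID_START_BIT = 28, MLF_NETWORK_ID_END_BIT = 31 };

/* Correct mask: a run of (END - START + 1) ones, shifted to START. */
#define MLF_NETWORK_ID \
    (((1u << (MLF_NETWORK_ID_END_BIT - MLF_NETWORK_ID_START_BIT + 1)) - 1) \
     << MLF_NETWORK_ID_START_BIT)
```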

>  };
>  
>  /* OVN logical fields
> diff --git a/lib/logical-fields.c b/lib/logical-fields.c
> index df1b4243c..84a5dec99 100644
> --- a/lib/logical-fields.c
> +++ b/lib/logical-fields.c
> @@ -139,6 +139,9 @@ ovn_init_symtab(struct shash *symtab)
>                               flags_str);
>      snprintf(flags_str, sizeof flags_str, "flags[%d]", MLF_RX_FROM_TUNNEL_BIT);
>      expr_symtab_add_subfield(symtab, "flags.tunnel_rx", NULL, flags_str);
> +    snprintf(flags_str, sizeof flags_str, "flags[%d..%d]",
> +             MLF_NETWORK_ID_START_BIT, MLF_NETWORK_ID_END_BIT);
> +    expr_symtab_add_subfield(symtab, "flags.network_id", NULL, flags_str);
>  
>      snprintf(flags_str, sizeof flags_str, "flags[%d]", MLF_FROM_CTRL_BIT);
>      expr_symtab_add_subfield(symtab, "flags.from_ctrl", NULL, flags_str);
> diff --git a/northd/northd.c b/northd/northd.c
> index 1097bb159..ea784766b 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -13091,12 +13091,12 @@ build_lrouter_force_snat_flows_op(struct ovn_port *op,
>          return;
>      }
>  
> -    if (op->lrp_networks.n_ipv4_addrs) {
> +    for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
>          ds_clear(match);
>          ds_clear(actions);
>  
>          ds_put_format(match, "inport == %s && ip4.dst == %s",
> -                      op->json_key, op->lrp_networks.ipv4_addrs[0].addr_s);
> +                      op->json_key, op->lrp_networks.ipv4_addrs[i].addr_s);
>          ovn_lflow_add(lflows, op->od, S_ROUTER_IN_UNSNAT, 110,
>                        ds_cstr(match), "ct_snat;", lflow_ref);
>  
> @@ -13105,53 +13105,46 @@ build_lrouter_force_snat_flows_op(struct ovn_port *op,
>          /* Higher priority rules to force SNAT with the router port ip.
>           * This only takes effect when the packet has already been
>           * load balanced once. */
> -        ds_put_format(match, "flags.force_snat_for_lb == 1 && ip4 && "
> -                      "outport == %s", op->json_key);
> +        ds_put_format(match, "flags.force_snat_for_lb == 1 && "
> +                      "flags.network_id == %"PRIuSIZE" && ip4 && "
> +                      "outport == %s", i, op->json_key);
>          ds_put_format(actions, "ct_snat(%s);",
> -                      op->lrp_networks.ipv4_addrs[0].addr_s);
> +                      op->lrp_networks.ipv4_addrs[i].addr_s);
>          ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_SNAT, 110,
>                        ds_cstr(match), ds_cstr(actions),
>                        lflow_ref);
> -        if (op->lrp_networks.n_ipv4_addrs > 1) {
> -            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> -            VLOG_WARN_RL(&rl, "Logical router port %s is configured with "
> -                              "multiple IPv4 addresses.  Only the first "
> -                              "IP [%s] is considered as SNAT for load "
> -                              "balancer", op->json_key,
> -                              op->lrp_networks.ipv4_addrs[0].addr_s);
> -        }

What if we have more than 16 networks?  This loop will iterate over them
and try to match on values larger than the register can hold.

We can't match on more than 16 networks, so we should not create broken
flows for them.  We'll need to handle that case in the function that
assigns the ids - build_lr_force_snat_network_id_flows.
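One possible shape for that check, as a standalone sketch (the helper name
is made up; northd would likely inline this where the ids are assigned):

```c
#include <stddef.h>

/* flags.network_id occupies bits 28..31, i.e. a 4-bit field. */
#define MLF_NETWORK_ID_BITS 4
#define MLF_MAX_NETWORK_ID ((1u << MLF_NETWORK_ID_BITS) - 1)

/* Hypothetical helper: clamp a network index to what the 4-bit
 * flags.network_id field can hold.  Indexes beyond the field's range
 * fall back to 0, i.e. the first configured network (with a warning
 * in the real code). */
static size_t
network_id_for_index(size_t i)
{
    return i <= MLF_MAX_NETWORK_ID ? i : 0;
}
```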


One more thing is that we should add an extra lower priority flow that
will not match on the network id, but will directly use the first network
for SNAT:

  stage:S_ROUTER_OUT_SNAT
  priority:100
  match: flags.force_snat_for_lb == 1 && ip4 && outport == X
  actions: ct_snat(lrp_networks.ipv4_addrs[0]);

This is useful for upgrade scenarios, or in case this change is going to
be backported.  If northd is upgraded first, older ovn-controller that
doesn't recognize network_id flag will not install flows that use it, but
it will install this one and will have old behavior of SNATting to the
first IP working, until upgraded.
It is not a correct upgrade order, but if we want to backport this change,
then it becomes reasonable as we do not enforce upgrade order for minor
release updates.
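As a rough illustration, the fallback flow's match and action built as plain
strings (a standalone sketch; "lr0-public" and the address are made-up
examples, and real northd code would go through ds_put_format and
ovn_lflow_add instead):

```c
#include <stdio.h>

/* Build the lower-priority (100) fallback flow that does not match on
 * flags.network_id and SNATs to the port's first configured address,
 * keeping pre-upgrade ovn-controllers working. */
static void
fallback_snat_flow(const char *port, const char *first_ip,
                   char *match, size_t msz, char *action, size_t asz)
{
    snprintf(match, msz,
             "flags.force_snat_for_lb == 1 && ip4 && outport == \"%s\"",
             port);
    snprintf(action, asz, "ct_snat(%s);", first_ip);
}
```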

>      }
>  
>      /* op->lrp_networks.ipv6_addrs will always have LLA and that will be
> -     * last in the list. So add the flows only if n_ipv6_addrs > 1. */
> +     * last in the list. So add the flows only if n_ipv6_addrs > 1, and loop
> +     * n_ipv6_addrs - 1 times. */
>      if (op->lrp_networks.n_ipv6_addrs > 1) {

The loop below will not run if n_ipv6_addrs <= 1, so this if condition
is not needed.

> -        ds_clear(match);
> -        ds_clear(actions);
> -
> -        ds_put_format(match, "inport == %s && ip6.dst == %s",
> -                      op->json_key, op->lrp_networks.ipv6_addrs[0].addr_s);
> -        ovn_lflow_add(lflows, op->od, S_ROUTER_IN_UNSNAT, 110,
> -                      ds_cstr(match), "ct_snat;", lflow_ref);
> +        for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs - 1; i++) {
> +            ds_clear(match);
> +            ds_clear(actions);
>  
> -        ds_clear(match);
> +            ds_put_format(match, "inport == %s && ip6.dst == %s",
> +                          op->json_key, op->lrp_networks.ipv6_addrs[i].addr_s);
> +            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_UNSNAT, 110,
> +                          ds_cstr(match), "ct_snat;", lflow_ref);
> +            ds_clear(match);
>  
> -        /* Higher priority rules to force SNAT with the router port ip.
> -         * This only takes effect when the packet has already been
> -         * load balanced once. */
> -        ds_put_format(match, "flags.force_snat_for_lb == 1 && ip6 && "
> -                      "outport == %s", op->json_key);
> -        ds_put_format(actions, "ct_snat(%s);",
> -                      op->lrp_networks.ipv6_addrs[0].addr_s);
> -        ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_SNAT, 110,
> -                      ds_cstr(match), ds_cstr(actions),
> -                      lflow_ref);
> -        if (op->lrp_networks.n_ipv6_addrs > 2) {
> -            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> -            VLOG_WARN_RL(&rl, "Logical router port %s is configured with "
> -                              "multiple IPv6 addresses.  Only the first "
> -                              "IP [%s] is considered as SNAT for load "
> -                              "balancer", op->json_key,
> -                              op->lrp_networks.ipv6_addrs[0].addr_s);

Same thing as for IPv4.  We need to check that the network IDs can fit
into our 4-bit field.  Assign ID 0 and warn if they do not.

> +            /* Higher priority rules to force SNAT with the router port ip.
> +             * This only takes effect when the packet has already been
> +             * load balanced once. */
> +            if (op->lrp_networks.n_ipv6_addrs == 1) {

This is impossible here.  We already checked twice for the number of networks
to be more than 1.

> +                ds_put_format(match, "flags.force_snat_for_lb == 1 && ip6 && "
> +                              "outport == %s", op->json_key);
> +            } else if (op->lrp_networks.n_ipv6_addrs > 1) {

And this is always true here.

> +                ds_put_format(match, "flags.force_snat_for_lb == 1 && "
> +                              "flags.network_id == %"PRIuSIZE" && ip6 && "
> +                              "outport == %s", i, op->json_key);
> +            }
> +            ds_put_format(actions, "ct_snat(%s);",
> +                          op->lrp_networks.ipv6_addrs[i].addr_s);
> +            ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_SNAT, 110,
> +                          ds_cstr(match), ds_cstr(actions),
> +                          lflow_ref);

And the same comment for the lower priority rule without match on network id
here as well.

>          }
>      }
>  }
> @@ -15038,6 +15031,56 @@ build_arp_request_flows_for_lrouter(
>                    lflow_ref);
>  }
>  
> +static void
> +build_lr_force_snat_network_id_flows(
> +            struct ovn_datapath *od, struct lflow_table *lflows,
> +            struct ds *match, struct ds *actions, struct lflow_ref *lflow_ref)
> +{
> +    const struct ovn_port *op;
> +    HMAP_FOR_EACH (op, dp_node, &od->ports) {
> +        for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
> +            ds_clear(match);
> +            ds_clear(actions);
> +
> +            ds_put_format(match, "flags.force_snat_for_lb == 1 && "
> +                          "outport == %s && " REG_NEXT_HOP_IPV4 " == %s/%d",
> +                          op->json_key, op->lrp_networks.ipv4_addrs[i].addr_s,
> +                         op->lrp_networks.ipv4_addrs[i].plen);

Nit: indentation is a little off.

> +
> +            ds_put_format(actions, "flags.network_id = %"PRIuSIZE"; ", i);

What if the number of networks is more than 16?  We should check for that
condition and assign, let's say, network id 0 for all the networks that can't
fit into our 4 bits of space.

And we may want to preserve a (modified) warning from the old
build_lrouter_force_snat_flows_op for that case as well.

> +            ds_put_format(actions, "next;");
> +
> +            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_NETWORK_ID, 110,
> +               ds_cstr(match), ds_cstr(actions),
> +               lflow_ref);

Nit: Arguments should be aligned to the opening parenthesis.

> +        }
> +
> +        /* op->lrp_networks.ipv6_addrs will always have LLA and that will be
> +         * last in the list. So add the flows only if n_ipv6_addrs > 1, and
> +         * loop n_ipv6_addrs - 1 times. */
> +        if (op->lrp_networks.n_ipv6_addrs > 1) {

Same here: the loop will not run if this condition is not met, so the if
statement is not needed.

> +            for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs - 1; i++) {
> +                ds_clear(match);
> +                ds_clear(actions);
> +
> +                ds_put_format(match, "flags.force_snat_for_lb == 1 && "
> +                              "outport == %s && " REG_NEXT_HOP_IPV6 " == %s/%d",
> +                              op->json_key, op->lrp_networks.ipv6_addrs[i].addr_s,
> +                             op->lrp_networks.ipv6_addrs[i].plen);

Nit: indentation.

> +
> +                ds_put_format(actions, "flags.network_id = %"PRIuSIZE"; ", i);

Same comment about network IDs larger than 4 bits.

> +                ds_put_format(actions, "next;");
> +
> +                ovn_lflow_add(lflows, op->od, S_ROUTER_IN_NETWORK_ID, 110,
> +                   ds_cstr(match), ds_cstr(actions),
> +                   lflow_ref);

Nit: indentation.

> +            }
> +        }
> +    }
> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_NETWORK_ID, 0,
> +                  "1", "next;", lflow_ref);
> +}
> +
>  /* Logical router egress table DELIVERY: Delivery (priority 100-110).
>   *
>   * Priority 100 rules deliver packets to enabled logical ports.
> @@ -17035,26 +17078,28 @@ build_lrouter_nat_defrag_and_lb(
>      /* Handle force SNAT options set in the gateway router. */
>      if (od->is_gw_router) {
>          if (dnat_force_snat_ip) {
> -            if (lrnat_rec->dnat_force_snat_addrs.n_ipv4_addrs) {
> +            struct lport_addresses dn_addrs = lrnat_rec->dnat_force_snat_addrs;
> +            for (size_t i = 0; i < dn_addrs.n_ipv4_addrs; i++) {
>                  build_lrouter_force_snat_flows(lflows, od, "4",
> -                    lrnat_rec->dnat_force_snat_addrs.ipv4_addrs[0].addr_s,
> +                    lrnat_rec->dnat_force_snat_addrs.ipv4_addrs[i].addr_s,
>                      "dnat", lflow_ref);
>              }
> -            if (lrnat_rec->dnat_force_snat_addrs.n_ipv6_addrs) {
> +            for (size_t i = 0; i < dn_addrs.n_ipv6_addrs; i++) {
>                  build_lrouter_force_snat_flows(lflows, od, "6",
> -                    lrnat_rec->dnat_force_snat_addrs.ipv6_addrs[0].addr_s,
> +                    lrnat_rec->dnat_force_snat_addrs.ipv6_addrs[i].addr_s,
>                      "dnat", lflow_ref);
>              }
>          }
>          if (lb_force_snat_ip) {
> -            if (lrnat_rec->lb_force_snat_addrs.n_ipv4_addrs) {
> +            struct lport_addresses lb_addrs = lrnat_rec->lb_force_snat_addrs;
> +            for (size_t i = 0; i < lb_addrs.n_ipv4_addrs; i++) {
>                  build_lrouter_force_snat_flows(lflows, od, "4",
> -                    lrnat_rec->lb_force_snat_addrs.ipv4_addrs[0].addr_s, "lb",
> +                    lrnat_rec->lb_force_snat_addrs.ipv4_addrs[i].addr_s, "lb",
>                      lflow_ref);
>              }
> -            if (lrnat_rec->lb_force_snat_addrs.n_ipv6_addrs) {
> +            for (size_t i = 0; i < lb_addrs.n_ipv6_addrs; i++) {
>                  build_lrouter_force_snat_flows(lflows, od, "6",
> -                    lrnat_rec->lb_force_snat_addrs.ipv6_addrs[0].addr_s, "lb",
> +                    lrnat_rec->lb_force_snat_addrs.ipv6_addrs[i].addr_s, "lb",
>                      lflow_ref);
>              }
>          }

I don't think the changes above are correct.  They are adding multiple
logical flows with the exact same match criteria but different actions.

This code should be left intact.

lb_force_snat_addrs are populated from explicitly provided IP addresses for the
force SNAT.  When lb_force_snat_ip=router_ip, the set is empty, so it's not the
case you're trying to solve.

See northd/en-lr-nat.c:lr_nat_record_init().  Your change should only cover the
case of lrnat_rec->lb_force_snat_router_ip = true.

The other part here is for dnat, not the load balancer at all
(dnat_force_snat_addrs), so it should remain intact as well.


> @@ -17415,6 +17460,8 @@ build_lswitch_and_lrouter_iterate_by_lr(struct ovn_datapath *od,
>                                          &lsi->actions,
>                                          lsi->meter_groups,
>                                          NULL);
> +    build_lr_force_snat_network_id_flows(od, lsi->lflows, &lsi->match,
> +                                         &lsi->actions, NULL);
>      build_misc_local_traffic_drop_flows_for_lrouter(od, lsi->lflows, NULL);
>  
>      build_lr_nat_defrag_and_lb_default_flows(od, lsi->lflows, NULL);
> diff --git a/northd/northd.h b/northd/northd.h
> index 1f29645c7..e89ab3c74 100644
> --- a/northd/northd.h
> +++ b/northd/northd.h
> @@ -537,7 +537,8 @@ enum ovn_stage {
>      PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     21, "lr_in_chk_pkt_len")     \
>      PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     22, "lr_in_larger_pkts")     \
>      PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     23, "lr_in_gw_redirect")     \
> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     24, "lr_in_arp_request")     \
> +    PIPELINE_STAGE(ROUTER, IN,  NETWORK_ID,      24, "lr_in_network_id")      \
> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     25, "lr_in_arp_request")     \
>                                                                        \
>      /* Logical router egress stages. */                               \
>      PIPELINE_STAGE(ROUTER, OUT, CHECK_DNAT_LOCAL,   0,                       \
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index 93b1a9135..e15b91f1f 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -5256,18 +5256,19 @@ nd_ns {
>            table="Logical_Router"/>:lb_force_snat_ip=router_ip), then for
>            each logical router port <var>P</var> attached to the Gateway
>            router, a priority-110 flow matches
> -          <code>flags.force_snat_for_lb == 1 &amp;&amp; outport == <var>P</var>
> -          </code> with an action <code>ct_snat(<var>R</var>);</code>
> -          where <var>R</var> is the IP configured on the router port.
> -          If <code>R</code> is an IPv4 address then the match will also
> -          include <code>ip4</code> and if it is an IPv6 address, then the
> -          match will also include <code>ip6</code>.
> +          <code>flags.force_snat_for_lb == 1 &amp;&amp; flags.network_id ==
> +          <var>I</var> &amp;&amp; outport == <var>P</var></code>, where
> +          <var>I</var> is the network index, with an action
> +          <code>ct_snat(<var>R</var>);</code> where <var>R</var> is the IP
> +          configured on the router port. If <code>R</code> is an IPv4 address
> +          then the match will also include <code>ip4</code> and if it is an
> +          IPv6 address, then the match will also include <code>ip6</code>.
>          </p>
>  
>          <p>
>            If the logical router port <var>P</var> is configured with multiple
> -          IPv4 and multiple IPv6 addresses, only the first IPv4 and first IPv6
> -          address is considered.
> +          IPv4 and multiple IPv6 addresses, the IPv4 and IPv6 address within
> +          the same network as the next-hop will be chosen.
>          </p>
>        </li>

This is part of the 'Egress Table 3: SNAT on Gateway Routers' description,
but you're also adding a new NETWORK_ID stage that needs to be documented.
And you changed the table number for the ARP_REQUEST stage, so you need
to update that in the doc as well.

>  
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 64991ff75..777f3b95b 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -4460,9 +4460,9 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | ovn_strip_lflows], [0], [dnl
>  
>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>  ])
>  
> @@ -4525,10 +4525,10 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | ovn_strip_lflows], [0], [dnl
>  
>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip6 && outport == "lr0-sw1"), action=(ct_snat(bef0::1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lr0-sw1"), action=(ct_snat(bef0::1);)
>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>  ])
>  
> @@ -6235,8 +6235,8 @@ AT_CHECK([grep "lr_out_post_undnat" lr0flows | ovn_strip_lflows], [0], [dnl
>  
>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>    table=??(lr_out_snat        ), priority=25   , match=(ip && ip4.src == 10.0.0.0/24 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.10);)
>    table=??(lr_out_snat        ), priority=33   , match=(ip && ip4.src == 10.0.0.10 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.30);)
> @@ -6301,8 +6301,8 @@ AT_CHECK([grep "lr_out_post_undnat" lr0flows | ovn_strip_lflows], [0], [dnl
>  
>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>    table=??(lr_out_snat        ), priority=25   , match=(ip && ip4.src == 10.0.0.0/24 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.10);)
>    table=??(lr_out_snat        ), priority=33   , match=(ip && ip4.src == 10.0.0.10 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.30);)
> @@ -6379,10 +6379,10 @@ AT_CHECK([grep "lr_out_post_undnat" lr0flows | ovn_strip_lflows], [0], [dnl
>  
>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip6 && outport == "lr0-public"), action=(ct_snat(def0::10);)
> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip6 && outport == "lr0-sw0"), action=(ct_snat(aef0::1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lr0-public"), action=(ct_snat(def0::10);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lr0-sw0"), action=(ct_snat(aef0::1);)
>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>    table=??(lr_out_snat        ), priority=25   , match=(ip && ip4.src == 10.0.0.0/24 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.10);)
>    table=??(lr_out_snat        ), priority=33   , match=(ip && ip4.src == 10.0.0.10 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.30);)
> @@ -15542,3 +15542,142 @@ CHECK_NO_CHANGE_AFTER_RECOMPUTE
>  
>  AT_CLEANUP
>  ])
> +
> +AT_SETUP([lb_force_snat_ip=routerip select correct network for snat])

This is an end-to-end test.  It should be in ovn.at, not in this file.

> +AT_SKIP_IF([test $HAVE_SCAPY = no])
> +ovn_start
> +
> +check ovn-nbctl lr-add lr
> +check ovn-nbctl set logical_router lr options:chassis=hv1
> +check ovn-nbctl set logical_router lr options:lb_force_snat_ip=router_ip
> +check ovn-nbctl lrp-add lr lrp-client 02:00:00:00:00:02 1.1.1.1/24
> +check ovn-nbctl lrp-add lr lrp-server 02:00:00:00:00:03 1.1.2.1/24 1.2.1.1/24 \
> +                    1.1.3.1/24 2.2.2.1/24 7.7.7.1/24 8.8.8.1/24
> +check ovn-nbctl ls-add ls-client
> +check ovn-nbctl ls-add ls-server
> +check ovn-nbctl lsp-add ls-client lsp-client-router
> +check ovn-nbctl lsp-set-type lsp-client-router router
> +check ovn-nbctl lsp-add ls-server lsp-server-router
> +check ovn-nbctl lsp-set-type lsp-server-router router
> +check ovn-nbctl set logical_switch_port lsp-client-router options:router-port=lrp-client
> +check ovn-nbctl set logical_switch_port lsp-server-router options:router-port=lrp-server
> +check ovn-nbctl lsp-add ls-client client
> +check ovn-nbctl lsp-add ls-server server
> +check ovn-nbctl lsp-set-addresses client "02:00:00:00:00:01 1.1.1.10"
> +check ovn-nbctl lsp-set-addresses server "02:00:00:00:00:04 2.2.2.10"
> +check ovn-nbctl lsp-set-addresses lsp-client-router router
> +check ovn-nbctl lsp-set-addresses lsp-server-router router
> +check ovn-nbctl lb-add lb 42.42.42.42:80 2.2.2.10:80 udp
> +check ovn-nbctl lr-lb-add lr lb
> +
> +# Create a hypervisor and create OVS ports corresponding to logical ports.
> +net_add n1
> +sim_add hv1
> +as hv1
> +ovs-vsctl add-br br-phys

Need a 'check'.

> +ovn_attach n1 br-phys 192.168.0.1
> +
> +ovs-vsctl -- add-port br-int hv1-vif1 -- \

check.

> +    set interface hv1-vif1 external-ids:iface-id=client \
> +    options:tx_pcap=hv1/client-tx.pcap \
> +    options:rxq_pcap=hv1/client-rx.pcap
> +
> +ovs-vsctl -- add-port br-int hv1-vif2 -- \

check.

> +    set interface hv1-vif2 external-ids:iface-id=server \
> +    options:tx_pcap=hv1/server-tx.pcap \
> +    options:rxq_pcap=hv1/server-rx.pcap
> +
> +wait_for_ports_up
> +check ovn-nbctl --wait=hv sync
> +
> +tx_src_mac="02:00:00:00:00:01"
> +tx_dst_mac="02:00:00:00:00:02"
> +tx_src_ip=1.1.1.10
> +tx_dst_ip=42.42.42.42
> +request=$(fmt_pkt "Ether(dst='${tx_dst_mac}', src='${tx_src_mac}')/ \
> +                  IP(src='${tx_src_ip}', dst='${tx_dst_ip}')/ \
> +                  UDP(sport=20001, dport=80)")
> +
> +as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $request

check.

> +
> +rx_src_mac="02:00:00:00:00:03"
> +rx_dst_mac="02:00:00:00:00:04"
> +rx_src_ip=2.2.2.1
> +rx_dst_ip=2.2.2.10
> +expected=$(fmt_pkt "Ether(dst='${rx_dst_mac}', src='${rx_src_mac}')/ \
> +                  IP(src='${rx_src_ip}', dst='${rx_dst_ip}', ttl=0x3F)/ \
> +                  UDP(sport=20001, dport=80)")
> +
> +echo $expected > expected
> +OVN_CHECK_PACKETS([hv1/server-tx.pcap], [expected])
> +AT_CLEANUP
> +
> +AT_SETUP([lb_force_snat_ip=routerip generate flags.network_id flows])
> +ovn_start
> +
> +check ovn-nbctl lr-add lr
> +check ovn-nbctl set logical_router lr options:chassis=hv1
> +check ovn-nbctl set logical_router lr options:lb_force_snat_ip=router_ip
> +check ovn-nbctl lrp-add lr lrp-client 02:00:00:00:00:02 1.1.1.1/24 ff01::01
> +check ovn-nbctl lrp-add lr lrp-server 02:00:00:00:00:03 1.1.2.1/24 7.7.7.1/24 8.8.8.1/24 \
> +                      ff01::02 ff01::03 ff01::06
> +check ovn-nbctl ls-add ls-client
> +check ovn-nbctl ls-add ls-server
> +check ovn-nbctl lsp-add ls-client lsp-client-router
> +check ovn-nbctl lsp-set-type lsp-client-router router
> +check ovn-nbctl lsp-add ls-server lsp-server-router
> +check ovn-nbctl lsp-set-type lsp-server-router router
> +check ovn-nbctl set logical_switch_port lsp-client-router options:router-port=lrp-client
> +check ovn-nbctl set logical_switch_port lsp-server-router options:router-port=lrp-server
> +check ovn-nbctl lsp-add ls-client client
> +check ovn-nbctl lsp-add ls-server server
> +check ovn-nbctl lsp-set-addresses client "02:00:00:00:00:01 1.1.1.10"
> +check ovn-nbctl lsp-set-addresses server "02:00:00:00:00:04 2.2.2.10"
> +check ovn-nbctl lsp-set-addresses lsp-client-router router
> +check ovn-nbctl lsp-set-addresses lsp-server-router router
> +check ovn-nbctl lb-add lb 42.42.42.42:80 2.2.2.10:80 udp
> +check ovn-nbctl lr-lb-add lr lb
> +
> +# Create a hypervisor and create OVS ports corresponding to logical ports.

We don't need that for the logical flow check; we only need the central
components running.

> +net_add n1
> +sim_add hv1
> +as hv1
> +ovs-vsctl add-br br-phys
> +ovn_attach n1 br-phys 192.168.0.1
> +
> +ovs-vsctl add-port br-int hv1-vif1 -- \
> +    set interface hv1-vif1 external-ids:iface-id=client \
> +    options:tx_pcap=hv1/client-tx.pcap \
> +    options:rxq_pcap=hv1/client-rx.pcap
> +
> +ovs-vsctl add-port br-int hv1-vif2 -- \
> +    set interface hv1-vif2 external-ids:iface-id=server \
> +    options:tx_pcap=hv1/server-tx.pcap \
> +    options:rxq_pcap=hv1/server-rx.pcap
> +
> +#OVN_POPULATE_ARP
> +wait_for_ports_up

All of the above up to the previous comment should not be needed for this test.

> +check ovn-nbctl --wait=hv sync
> +
> +ovn-sbctl dump-flows lr > lrflows
> +AT_CAPTURE_FILE([lrflows])
> +
> +AT_CHECK([grep -E flags.network_id lrflows | ovn_strip_lflows], [0], [dnl
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-client" && reg0 == 1.1.1.1/24), action=(flags.network_id = 0; next;)
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-client" && xxreg0 == ff01::1/128), action=(flags.network_id = 0; next;)
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && reg0 == 1.1.2.1/24), action=(flags.network_id = 0; next;)
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && reg0 == 7.7.7.1/24), action=(flags.network_id = 1; next;)
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && reg0 == 8.8.8.1/24), action=(flags.network_id = 2; next;)
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && xxreg0 == ff01::2/128), action=(flags.network_id = 0; next;)
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && xxreg0 == ff01::3/128), action=(flags.network_id = 1; next;)
> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && xxreg0 == ff01::6/128), action=(flags.network_id = 2; next;)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lrp-client"), action=(ct_snat(1.1.1.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lrp-server"), action=(ct_snat(1.1.2.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lrp-client"), action=(ct_snat(ff01::1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lrp-server"), action=(ct_snat(ff01::2);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 1 && ip4 && outport == "lrp-server"), action=(ct_snat(7.7.7.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 1 && ip6 && outport == "lrp-server"), action=(ct_snat(ff01::3);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 2 && ip4 && outport == "lrp-server"), action=(ct_snat(8.8.8.1);)
> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 2 && ip6 && outport == "lrp-server"), action=(ct_snat(ff01::6);)
> +])
> +AT_CLEANUP
Dumitru Ceara Feb. 12, 2025, 11:03 p.m. UTC | #3
On 2/12/25 11:48 PM, Ilya Maximets wrote:
> Hi, Rosemarie.  Thanks for the fix!  See some comments below.

Hi Rosemarie, Ilya,

Thanks for the patch and for the review.  I moved the state of this
patch in patchwork to "Changes Requested":

https://patchwork.ozlabs.org/project/ovn/patch/20250211194939.3877019-1-rosemarie@redhat.com/

Looking forward to v2.

Regards,
Dumitru

> 
> Best regards, Ilya Maximets.
> 
> On 2/11/25 20:49, Rosemarie O'Riorden wrote:
>> When a gateway router has a load balancer configured, the option
>> lb_force_snat_ip=routerip can be set so that OVN snats load balanced
>> packets to the source logical router port's IP address, that is, the
> 
> Nit: The word 'source' is a little confusing here, it makes me think
> of a router port from which the packet entered the router.  Maybe
> "logical router's egress interface's IP address" ?
> 
>> port chosen as "outport" in the lr_in_ip_routing stage.
>>
>> However, this was only designed to work when one network was configured
>> on the source logical router outport. When multiple networks are configured,
> 
> Nit: And drop the 'source' here, I guess.
> 
>> OVN's behavior was to simply choose the lexicographically first IP address for
>> snat. This led to an incorrect address often being used for snat.
>>
>> To fix this, two main components have been added:
>>  1. A new flag, flags.network_id. It is 4 bits and stores an index.
>>  2. A new stage in the router ingress pipeline, lr_in_network_id.
>>
>> Now in the stage lr_in_network_id, OVN generates flows that assign
>> flags.network_id with an index. This index is then matched on later and the
>> network at that index will be chosen for snat.
> 
> Nit: This description is missing a key thing - how the network id is assigned,
> i.e. what's the criteria of choosing a network id for a packet.
> 
>>
>> Two tests have been added to verify that:
>>  1. The correct network is chosen for snat.
>>  2. The new and updated flows with flags.network_id are correct.
>>
>> And tests that were broken by this new behavior have been updated.
>>
> 
> We may want to link the original mailing list discussion as well:
> 
> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2024-October/417717.html
> 
>> Signed-off-by: Rosemarie O'Riorden <rosemarie@redhat.com>
>> ---
>>  include/ovn/logical-fields.h |   5 ++
>>  lib/logical-fields.c         |   3 +
>>  northd/northd.c              | 141 +++++++++++++++++++----------
>>  northd/northd.h              |   3 +-
>>  northd/ovn-northd.8.xml      |  17 ++--
>>  tests/ovn-northd.at          | 169 +++++++++++++++++++++++++++++++----
>>  6 files changed, 267 insertions(+), 71 deletions(-)
>>
>> diff --git a/include/ovn/logical-fields.h b/include/ovn/logical-fields.h
>> index fbe73763e..73feb4b1b 100644
>> --- a/include/ovn/logical-fields.h
>> +++ b/include/ovn/logical-fields.h
>> @@ -95,6 +95,8 @@ enum mff_log_flags_bits {
>>      MLF_ICMP_SNAT_BIT = 17,
>>      MLF_OVERRIDE_LOCAL_ONLY_BIT = 18,
>>      MLF_FROM_CTRL_BIT = 19,
>> +    MLF_NETWORK_ID_START_BIT = 28,
>> +    MLF_NETWORK_ID_END_BIT = 31,
>>  };
>>  
>>  /* MFF_LOG_FLAGS_REG flag assignments */
>> @@ -152,6 +154,9 @@ enum mff_log_flags {
>>      MLF_ICMP_SNAT = (1 << MLF_ICMP_SNAT_BIT),
>>  
>>      MLF_OVERRIDE_LOCAL_ONLY = (1 << MLF_OVERRIDE_LOCAL_ONLY_BIT),
>> +
>> +    MLF_NETWORK_ID = ((MLF_NETWORK_ID_END_BIT - MLF_NETWORK_ID_START_BIT + 1)
>> +                       << MLF_NETWORK_ID_START_BIT) - 1,
> 
> Hmm, this doesn't seem right:
> 
>>>> hex(((31 - 28 + 1) << 28) - 1)
> '0x3fffffff'
> 
> And what we actually want is 0xf0000000.
> 
> This constant is not used anywhere in the code, but it should still be correct.
> 
>>  };
>>  
>>  /* OVN logical fields
>> diff --git a/lib/logical-fields.c b/lib/logical-fields.c
>> index df1b4243c..84a5dec99 100644
>> --- a/lib/logical-fields.c
>> +++ b/lib/logical-fields.c
>> @@ -139,6 +139,9 @@ ovn_init_symtab(struct shash *symtab)
>>                               flags_str);
>>      snprintf(flags_str, sizeof flags_str, "flags[%d]", MLF_RX_FROM_TUNNEL_BIT);
>>      expr_symtab_add_subfield(symtab, "flags.tunnel_rx", NULL, flags_str);
>> +    snprintf(flags_str, sizeof flags_str, "flags[%d..%d]",
>> +             MLF_NETWORK_ID_START_BIT, MLF_NETWORK_ID_END_BIT);
>> +    expr_symtab_add_subfield(symtab, "flags.network_id", NULL, flags_str);
>>  
>>      snprintf(flags_str, sizeof flags_str, "flags[%d]", MLF_FROM_CTRL_BIT);
>>      expr_symtab_add_subfield(symtab, "flags.from_ctrl", NULL, flags_str);
>> diff --git a/northd/northd.c b/northd/northd.c
>> index 1097bb159..ea784766b 100644
>> --- a/northd/northd.c
>> +++ b/northd/northd.c
>> @@ -13091,12 +13091,12 @@ build_lrouter_force_snat_flows_op(struct ovn_port *op,
>>          return;
>>      }
>>  
>> -    if (op->lrp_networks.n_ipv4_addrs) {
>> +    for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
>>          ds_clear(match);
>>          ds_clear(actions);
>>  
>>          ds_put_format(match, "inport == %s && ip4.dst == %s",
>> -                      op->json_key, op->lrp_networks.ipv4_addrs[0].addr_s);
>> +                      op->json_key, op->lrp_networks.ipv4_addrs[i].addr_s);
>>          ovn_lflow_add(lflows, op->od, S_ROUTER_IN_UNSNAT, 110,
>>                        ds_cstr(match), "ct_snat;", lflow_ref);
>>  
>> @@ -13105,53 +13105,46 @@ build_lrouter_force_snat_flows_op(struct ovn_port *op,
>>          /* Higher priority rules to force SNAT with the router port ip.
>>           * This only takes effect when the packet has already been
>>           * load balanced once. */
>> -        ds_put_format(match, "flags.force_snat_for_lb == 1 && ip4 && "
>> -                      "outport == %s", op->json_key);
>> +        ds_put_format(match, "flags.force_snat_for_lb == 1 && "
>> +                      "flags.network_id == %"PRIuSIZE" && ip4 && "
>> +                      "outport == %s", i, op->json_key);
>>          ds_put_format(actions, "ct_snat(%s);",
>> -                      op->lrp_networks.ipv4_addrs[0].addr_s);
>> +                      op->lrp_networks.ipv4_addrs[i].addr_s);
>>          ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_SNAT, 110,
>>                        ds_cstr(match), ds_cstr(actions),
>>                        lflow_ref);
>> -        if (op->lrp_networks.n_ipv4_addrs > 1) {
>> -            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> -            VLOG_WARN_RL(&rl, "Logical router port %s is configured with "
>> -                              "multiple IPv4 addresses.  Only the first "
>> -                              "IP [%s] is considered as SNAT for load "
>> -                              "balancer", op->json_key,
>> -                              op->lrp_networks.ipv4_addrs[0].addr_s);
>> -        }
> 
> What if we have more than 16 networks?  This loop will iterate over them and
> try to match on values larger than the register can hold.
> 
> We can't match on more than 16 networks, so we should not create broken
> flows for them.  We'll need to handle that case in the function that
> assigns the ids - build_lr_force_snat_network_id_flows.
> 
> 
> One more thing is that we should add an extra lower priority flow that
> will not match on the network id, but will directly use the first network
> for SNAT:
> 
>   stage:S_ROUTER_OUT_SNAT
>   priority:100
>   match: flags.force_snat_for_lb == 1 && ip4 && outport == X
>   actions: ct_snat(lrp_networks.ipv4_addrs[0]);
> 
> This is useful for upgrade scenarios, or in case this change is going to
> be backported.  If northd is upgraded first, an older ovn-controller that
> doesn't recognize the network_id flag will not install flows that use it, but
> it will install this one and keep the old behavior of SNATting to the
> first IP until it is upgraded.
> It is not a correct upgrade order, but if we want to backport this change,
> then it becomes reasonable as we do not enforce upgrade order for minor
> release updates.
> 
>>      }
>>  
>>      /* op->lrp_networks.ipv6_addrs will always have LLA and that will be
>> -     * last in the list. So add the flows only if n_ipv6_addrs > 1. */
>> +     * last in the list. So add the flows only if n_ipv6_addrs > 1, and loop
>> +     * n_ipv6_addrs - 1 times. */
>>      if (op->lrp_networks.n_ipv6_addrs > 1) {
> 
> The loop below will not run if n_ipv6_addrs <= 1, so this if condition
> is not needed.
> 
>> -        ds_clear(match);
>> -        ds_clear(actions);
>> -
>> -        ds_put_format(match, "inport == %s && ip6.dst == %s",
>> -                      op->json_key, op->lrp_networks.ipv6_addrs[0].addr_s);
>> -        ovn_lflow_add(lflows, op->od, S_ROUTER_IN_UNSNAT, 110,
>> -                      ds_cstr(match), "ct_snat;", lflow_ref);
>> +        for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs - 1; i++) {
>> +            ds_clear(match);
>> +            ds_clear(actions);
>>  
>> -        ds_clear(match);
>> +            ds_put_format(match, "inport == %s && ip6.dst == %s",
>> +                          op->json_key, op->lrp_networks.ipv6_addrs[i].addr_s);
>> +            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_UNSNAT, 110,
>> +                          ds_cstr(match), "ct_snat;", lflow_ref);
>> +            ds_clear(match);
>>  
>> -        /* Higher priority rules to force SNAT with the router port ip.
>> -         * This only takes effect when the packet has already been
>> -         * load balanced once. */
>> -        ds_put_format(match, "flags.force_snat_for_lb == 1 && ip6 && "
>> -                      "outport == %s", op->json_key);
>> -        ds_put_format(actions, "ct_snat(%s);",
>> -                      op->lrp_networks.ipv6_addrs[0].addr_s);
>> -        ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_SNAT, 110,
>> -                      ds_cstr(match), ds_cstr(actions),
>> -                      lflow_ref);
>> -        if (op->lrp_networks.n_ipv6_addrs > 2) {
>> -            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> -            VLOG_WARN_RL(&rl, "Logical router port %s is configured with "
>> -                              "multiple IPv6 addresses.  Only the first "
>> -                              "IP [%s] is considered as SNAT for load "
>> -                              "balancer", op->json_key,
>> -                              op->lrp_networks.ipv6_addrs[0].addr_s);
> 
> Same thing as for IPv4.  We need to check that the network IDs can fit
> into our 4-bit field.  Assign ID 0 and warn if they do not.
> 
>> +            /* Higher priority rules to force SNAT with the router port ip.
>> +             * This only takes effect when the packet has already been
>> +             * load balanced once. */
>> +            if (op->lrp_networks.n_ipv6_addrs == 1) {
> 
> This is impossible here.  We already checked twice for the number of networks
> to be more than 1.
> 
>> +                ds_put_format(match, "flags.force_snat_for_lb == 1 && ip6 && "
>> +                              "outport == %s", op->json_key);
>> +            } else if (op->lrp_networks.n_ipv6_addrs > 1) {
> 
> And this is always true here.
> 
>> +                ds_put_format(match, "flags.force_snat_for_lb == 1 && "
>> +                              "flags.network_id == %"PRIuSIZE" && ip6 && "
>> +                              "outport == %s", i, op->json_key);
>> +            }
>> +            ds_put_format(actions, "ct_snat(%s);",
>> +                          op->lrp_networks.ipv6_addrs[i].addr_s);
>> +            ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_SNAT, 110,
>> +                          ds_cstr(match), ds_cstr(actions),
>> +                          lflow_ref);
> 
> And the same comment for the lower priority rule without match on network id
> here as well.
> 
>>          }
>>      }
>>  }
>> @@ -15038,6 +15031,56 @@ build_arp_request_flows_for_lrouter(
>>                    lflow_ref);
>>  }
>>  
>> +static void
>> +build_lr_force_snat_network_id_flows(
>> +            struct ovn_datapath *od, struct lflow_table *lflows,
>> +            struct ds *match, struct ds *actions, struct lflow_ref *lflow_ref)
>> +{
>> +    const struct ovn_port *op;
>> +    HMAP_FOR_EACH (op, dp_node, &od->ports) {
>> +        for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
>> +            ds_clear(match);
>> +            ds_clear(actions);
>> +
>> +            ds_put_format(match, "flags.force_snat_for_lb == 1 && "
>> +                          "outport == %s && " REG_NEXT_HOP_IPV4 " == %s/%d",
>> +                          op->json_key, op->lrp_networks.ipv4_addrs[i].addr_s,
>> +                         op->lrp_networks.ipv4_addrs[i].plen);
> 
> Nit: indentation is a little off.
> 
>> +
>> +            ds_put_format(actions, "flags.network_id = %"PRIuSIZE"; ", i);
> 
> What if the number of networks is more than 16?  We should check for that
> condition and assign, let's say, network id 0 for all the networks that can't
> fit into our 4 bits of space.
> 
> And we may want to preserve a (modified) warning from the old
> build_lrouter_force_snat_flows_op for that case as well.
> 
>> +            ds_put_format(actions, "next;");
>> +
>> +            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_NETWORK_ID, 110,
>> +               ds_cstr(match), ds_cstr(actions),
>> +               lflow_ref);
> 
> Nit: Arguments should be aligned to the opening parenthesis.
> 
>> +        }
>> +
>> +        /* op->lrp_networks.ipv6_addrs will always have LLA and that will be
>> +         * last in the list. So add the flows only if n_ipv6_addrs > 1, and
>> +         * loop n_ipv6_addrs - 1 times. */
>> +        if (op->lrp_networks.n_ipv6_addrs > 1) {
> 
> Same here: the loop will not run if this condition is not met, so the if
> statement is not needed.
> 
>> +            for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs - 1; i++) {
>> +                ds_clear(match);
>> +                ds_clear(actions);
>> +
>> +                ds_put_format(match, "flags.force_snat_for_lb == 1 && "
>> +                              "outport == %s && " REG_NEXT_HOP_IPV6 " == %s/%d",
>> +                              op->json_key, op->lrp_networks.ipv6_addrs[i].addr_s,
>> +                             op->lrp_networks.ipv6_addrs[i].plen);
> 
> Nit: indentation.
> 
>> +
>> +                ds_put_format(actions, "flags.network_id = %"PRIuSIZE"; ", i);
> 
> Same comment about a network ID larger than 4 bits.
> 
>> +                ds_put_format(actions, "next;");
>> +
>> +                ovn_lflow_add(lflows, op->od, S_ROUTER_IN_NETWORK_ID, 110,
>> +                   ds_cstr(match), ds_cstr(actions),
>> +                   lflow_ref);
> 
> Nit: indentation.
> 
>> +            }
>> +        }
>> +    }
>> +    ovn_lflow_add(lflows, od, S_ROUTER_IN_NETWORK_ID, 0,
>> +                  "1", "next;", lflow_ref);
>> +}
>> +
>>  /* Logical router egress table DELIVERY: Delivery (priority 100-110).
>>   *
>>   * Priority 100 rules deliver packets to enabled logical ports.
>> @@ -17035,26 +17078,28 @@ build_lrouter_nat_defrag_and_lb(
>>      /* Handle force SNAT options set in the gateway router. */
>>      if (od->is_gw_router) {
>>          if (dnat_force_snat_ip) {
>> -            if (lrnat_rec->dnat_force_snat_addrs.n_ipv4_addrs) {
>> +            struct lport_addresses dn_addrs = lrnat_rec->dnat_force_snat_addrs;
>> +            for (size_t i = 0; i < dn_addrs.n_ipv4_addrs; i++) {
>>                  build_lrouter_force_snat_flows(lflows, od, "4",
>> -                    lrnat_rec->dnat_force_snat_addrs.ipv4_addrs[0].addr_s,
>> +                    lrnat_rec->dnat_force_snat_addrs.ipv4_addrs[i].addr_s,
>>                      "dnat", lflow_ref);
>>              }
>> -            if (lrnat_rec->dnat_force_snat_addrs.n_ipv6_addrs) {
>> +            for (size_t i = 0; i < dn_addrs.n_ipv6_addrs; i++) {
>>                  build_lrouter_force_snat_flows(lflows, od, "6",
>> -                    lrnat_rec->dnat_force_snat_addrs.ipv6_addrs[0].addr_s,
>> +                    lrnat_rec->dnat_force_snat_addrs.ipv6_addrs[i].addr_s,
>>                      "dnat", lflow_ref);
>>              }
>>          }
>>          if (lb_force_snat_ip) {
>> -            if (lrnat_rec->lb_force_snat_addrs.n_ipv4_addrs) {
>> +            struct lport_addresses lb_addrs = lrnat_rec->lb_force_snat_addrs;
>> +            for (size_t i = 0; i < lb_addrs.n_ipv4_addrs; i++) {
>>                  build_lrouter_force_snat_flows(lflows, od, "4",
>> -                    lrnat_rec->lb_force_snat_addrs.ipv4_addrs[0].addr_s, "lb",
>> +                    lrnat_rec->lb_force_snat_addrs.ipv4_addrs[i].addr_s, "lb",
>>                      lflow_ref);
>>              }
>> -            if (lrnat_rec->lb_force_snat_addrs.n_ipv6_addrs) {
>> +            for (size_t i = 0; i < lb_addrs.n_ipv6_addrs; i++) {
>>                  build_lrouter_force_snat_flows(lflows, od, "6",
>> -                    lrnat_rec->lb_force_snat_addrs.ipv6_addrs[0].addr_s, "lb",
>> +                    lrnat_rec->lb_force_snat_addrs.ipv6_addrs[i].addr_s, "lb",
>>                      lflow_ref);
>>              }
>>          }
> 
> I don't think the changes above are correct.  They are adding multiple logical
> flows with the exact same match criteria but different actions.
> 
> This code should be left intact.
> 
> lb_force_snat_addrs are populated from explicitly provided IP addresses for the
> force SNAT.  When  lb_force_snat_ip=router_ip, the set is empty, so it's not the
> case you're trying to solve.
> 
> See northd/en-lr-nat.c:lr_nat_record_init().  Your change should only cover the
> case of lrnat_rec->lb_force_snat_router_ip = true.
> 
> The other part here is for dnat and not even the load balancer (dnat_force_snat_addrs),
> so should remain intact as well.
> 
> 
>> @@ -17415,6 +17460,8 @@ build_lswitch_and_lrouter_iterate_by_lr(struct ovn_datapath *od,
>>                                          &lsi->actions,
>>                                          lsi->meter_groups,
>>                                          NULL);
>> +    build_lr_force_snat_network_id_flows(od, lsi->lflows, &lsi->match,
>> +                                         &lsi->actions, NULL);
>>      build_misc_local_traffic_drop_flows_for_lrouter(od, lsi->lflows, NULL);
>>  
>>      build_lr_nat_defrag_and_lb_default_flows(od, lsi->lflows, NULL);
>> diff --git a/northd/northd.h b/northd/northd.h
>> index 1f29645c7..e89ab3c74 100644
>> --- a/northd/northd.h
>> +++ b/northd/northd.h
>> @@ -537,7 +537,8 @@ enum ovn_stage {
>>      PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     21, "lr_in_chk_pkt_len")     \
>>      PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     22, "lr_in_larger_pkts")     \
>>      PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     23, "lr_in_gw_redirect")     \
>> -    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     24, "lr_in_arp_request")     \
>> +    PIPELINE_STAGE(ROUTER, IN,  NETWORK_ID,      24, "lr_in_network_id")      \
>> +    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     25, "lr_in_arp_request")     \
>>                                                                        \
>>      /* Logical router egress stages. */                               \
>>      PIPELINE_STAGE(ROUTER, OUT, CHECK_DNAT_LOCAL,   0,                       \
>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
>> index 93b1a9135..e15b91f1f 100644
>> --- a/northd/ovn-northd.8.xml
>> +++ b/northd/ovn-northd.8.xml
>> @@ -5256,18 +5256,19 @@ nd_ns {
>>            table="Logical_Router"/>:lb_force_snat_ip=router_ip), then for
>>            each logical router port <var>P</var> attached to the Gateway
>>            router, a priority-110 flow matches
>> -          <code>flags.force_snat_for_lb == 1 &amp;&amp; outport == <var>P</var>
>> -          </code> with an action <code>ct_snat(<var>R</var>);</code>
>> -          where <var>R</var> is the IP configured on the router port.
>> -          If <code>R</code> is an IPv4 address then the match will also
>> -          include <code>ip4</code> and if it is an IPv6 address, then the
>> -          match will also include <code>ip6</code>.
>> +          <code>flags.force_snat_for_lb == 1 &amp;&amp; flags.network_id ==
>> +          <var>I</var> &amp;&amp; outport == <var>P</var></code>, where
>> +          <var>I</var> is the network index, with an action
>> +          <code>ct_snat(<var>R</var>);</code> where <var>R</var> is the IP
>> +          configured on the router port. If <code>R</code> is an IPv4 address
>> +          then the match will also include <code>ip4</code> and if it is an
>> +          IPv6 address, then the match will also include <code>ip6</code>.
>>          </p>
>>  
>>          <p>
>>            If the logical router port <var>P</var> is configured with multiple
>> -          IPv4 and multiple IPv6 addresses, only the first IPv4 and first IPv6
>> -          address is considered.
>> +          IPv4 and multiple IPv6 addresses, the IPv4 and IPv6 address within
>> +          the same network as the next-hop will be chosen.
>>          </p>
>>        </li>
> 
> This is part of the 'Egress Table 3: SNAT on Gateway Routers' description,
> but you're also adding a new NETWORK_ID stage that needs to be documented.
> And you changed the table number for the ARP_REQUEST stage, so you need to
> update that in the doc as well.
> 
>>  
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index 64991ff75..777f3b95b 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -4460,9 +4460,9 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | ovn_strip_lflows], [0], [dnl
>>  
>>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
>>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>>  ])
>>  
>> @@ -4525,10 +4525,10 @@ AT_CHECK([grep "lr_in_dnat" lr0flows | ovn_strip_lflows], [0], [dnl
>>  
>>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip6 && outport == "lr0-sw1"), action=(ct_snat(bef0::1);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lr0-sw1"), action=(ct_snat(bef0::1);)
>>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>>  ])
>>  
>> @@ -6235,8 +6235,8 @@ AT_CHECK([grep "lr_out_post_undnat" lr0flows | ovn_strip_lflows], [0], [dnl
>>  
>>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>>    table=??(lr_out_snat        ), priority=25   , match=(ip && ip4.src == 10.0.0.0/24 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.10);)
>>    table=??(lr_out_snat        ), priority=33   , match=(ip && ip4.src == 10.0.0.10 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.30);)
>> @@ -6301,8 +6301,8 @@ AT_CHECK([grep "lr_out_post_undnat" lr0flows | ovn_strip_lflows], [0], [dnl
>>  
>>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>>    table=??(lr_out_snat        ), priority=25   , match=(ip && ip4.src == 10.0.0.0/24 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.10);)
>>    table=??(lr_out_snat        ), priority=33   , match=(ip && ip4.src == 10.0.0.10 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.30);)
>> @@ -6379,10 +6379,10 @@ AT_CHECK([grep "lr_out_post_undnat" lr0flows | ovn_strip_lflows], [0], [dnl
>>  
>>  AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
>>    table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip6 && outport == "lr0-public"), action=(ct_snat(def0::10);)
>> -  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip6 && outport == "lr0-sw0"), action=(ct_snat(aef0::1);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lr0-public"), action=(ct_snat(def0::10);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lr0-sw0"), action=(ct_snat(aef0::1);)
>>    table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
>>    table=??(lr_out_snat        ), priority=25   , match=(ip && ip4.src == 10.0.0.0/24 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.10);)
>>    table=??(lr_out_snat        ), priority=33   , match=(ip && ip4.src == 10.0.0.10 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.30);)
>> @@ -15542,3 +15542,142 @@ CHECK_NO_CHANGE_AFTER_RECOMPUTE
>>  
>>  AT_CLEANUP
>>  ])
>> +
>> +AT_SETUP([lb_force_snat_ip=routerip select correct network for snat])
> 
> This is an end-to-end test.  It should be in ovn.at, not in this file.
> 
>> +AT_SKIP_IF([test $HAVE_SCAPY = no])
>> +ovn_start
>> +
>> +check ovn-nbctl lr-add lr
>> +check ovn-nbctl set logical_router lr options:chassis=hv1
>> +check ovn-nbctl set logical_router lr options:lb_force_snat_ip=router_ip
>> +check ovn-nbctl lrp-add lr lrp-client 02:00:00:00:00:02 1.1.1.1/24
>> +check ovn-nbctl lrp-add lr lrp-server 02:00:00:00:00:03 1.1.2.1/24 1.2.1.1/24 \
>> +                    1.1.3.1/24 2.2.2.1/24 7.7.7.1/24 8.8.8.1/24
>> +check ovn-nbctl ls-add ls-client
>> +check ovn-nbctl ls-add ls-server
>> +check ovn-nbctl lsp-add ls-client lsp-client-router
>> +check ovn-nbctl lsp-set-type lsp-client-router router
>> +check ovn-nbctl lsp-add ls-server lsp-server-router
>> +check ovn-nbctl lsp-set-type lsp-server-router router
>> +check ovn-nbctl set logical_switch_port lsp-client-router options:router-port=lrp-client
>> +check ovn-nbctl set logical_switch_port lsp-server-router options:router-port=lrp-server
>> +check ovn-nbctl lsp-add ls-client client
>> +check ovn-nbctl lsp-add ls-server server
>> +check ovn-nbctl lsp-set-addresses client "02:00:00:00:00:01 1.1.1.10"
>> +check ovn-nbctl lsp-set-addresses server "02:00:00:00:00:04 2.2.2.10"
>> +check ovn-nbctl lsp-set-addresses lsp-client-router router
>> +check ovn-nbctl lsp-set-addresses lsp-server-router router
>> +check ovn-nbctl lb-add lb 42.42.42.42:80 2.2.2.10:80 udp
>> +check ovn-nbctl lr-lb-add lr lb
>> +
>> +# Create a hypervisor and create OVS ports corresponding to logical ports.
>> +net_add n1
>> +sim_add hv1
>> +as hv1
>> +ovs-vsctl add-br br-phys
> 
> Need a 'check'.
> 
>> +ovn_attach n1 br-phys 192.168.0.1
>> +
>> +ovs-vsctl -- add-port br-int hv1-vif1 -- \
> 
> check.
> 
>> +    set interface hv1-vif1 external-ids:iface-id=client \
>> +    options:tx_pcap=hv1/client-tx.pcap \
>> +    options:rxq_pcap=hv1/client-rx.pcap
>> +
>> +ovs-vsctl -- add-port br-int hv1-vif2 -- \
> 
> check.
> 
>> +    set interface hv1-vif2 external-ids:iface-id=server \
>> +    options:tx_pcap=hv1/server-tx.pcap \
>> +    options:rxq_pcap=hv1/server-rx.pcap
>> +
>> +wait_for_ports_up
>> +check ovn-nbctl --wait=hv sync
>> +
>> +tx_src_mac="02:00:00:00:00:01"
>> +tx_dst_mac="02:00:00:00:00:02"
>> +tx_src_ip=1.1.1.10
>> +tx_dst_ip=42.42.42.42
>> +request=$(fmt_pkt "Ether(dst='${tx_dst_mac}', src='${tx_src_mac}')/ \
>> +                  IP(src='${tx_src_ip}', dst='${tx_dst_ip}')/ \
>> +                  UDP(sport=20001, dport=80)")
>> +
>> +as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $request
> 
> check.
> 
>> +
>> +rx_src_mac="02:00:00:00:00:03"
>> +rx_dst_mac="02:00:00:00:00:04"
>> +rx_src_ip=2.2.2.1
>> +rx_dst_ip=2.2.2.10
>> +expected=$(fmt_pkt "Ether(dst='${rx_dst_mac}', src='${rx_src_mac}')/ \
>> +                  IP(src='${rx_src_ip}', dst='${rx_dst_ip}', ttl=0x3F)/ \
>> +                  UDP(sport=20001, dport=80)")
>> +
>> +echo $expected > expected
>> +OVN_CHECK_PACKETS([hv1/server-tx.pcap], [expected])
>> +AT_CLEANUP
>> +
>> +AT_SETUP([lb_force_snat_ip=routerip generate flags.network_id flows])
>> +ovn_start
>> +
>> +check ovn-nbctl lr-add lr
>> +check ovn-nbctl set logical_router lr options:chassis=hv1
>> +check ovn-nbctl set logical_router lr options:lb_force_snat_ip=router_ip
>> +check ovn-nbctl lrp-add lr lrp-client 02:00:00:00:00:02 1.1.1.1/24 ff01::01
>> +check ovn-nbctl lrp-add lr lrp-server 02:00:00:00:00:03 1.1.2.1/24 7.7.7.1/24 8.8.8.1/24 \
>> +                      ff01::02 ff01::03 ff01::06
>> +check ovn-nbctl ls-add ls-client
>> +check ovn-nbctl ls-add ls-server
>> +check ovn-nbctl lsp-add ls-client lsp-client-router
>> +check ovn-nbctl lsp-set-type lsp-client-router router
>> +check ovn-nbctl lsp-add ls-server lsp-server-router
>> +check ovn-nbctl lsp-set-type lsp-server-router router
>> +check ovn-nbctl set logical_switch_port lsp-client-router options:router-port=lrp-client
>> +check ovn-nbctl set logical_switch_port lsp-server-router options:router-port=lrp-server
>> +check ovn-nbctl lsp-add ls-client client
>> +check ovn-nbctl lsp-add ls-server server
>> +check ovn-nbctl lsp-set-addresses client "02:00:00:00:00:01 1.1.1.10"
>> +check ovn-nbctl lsp-set-addresses server "02:00:00:00:00:04 2.2.2.10"
>> +check ovn-nbctl lsp-set-addresses lsp-client-router router
>> +check ovn-nbctl lsp-set-addresses lsp-server-router router
>> +check ovn-nbctl lb-add lb 42.42.42.42:80 2.2.2.10:80 udp
>> +check ovn-nbctl lr-lb-add lr lb
>> +
>> +# Create a hypervisor and create OVS ports corresponding to logical ports.
> 
> We don't need that for the logical flow check; we only need the central
> components running.
> 
>> +net_add n1
>> +sim_add hv1
>> +as hv1
>> +ovs-vsctl add-br br-phys
>> +ovn_attach n1 br-phys 192.168.0.1
>> +
>> +ovs-vsctl add-port br-int hv1-vif1 -- \
>> +    set interface hv1-vif1 external-ids:iface-id=client \
>> +    options:tx_pcap=hv1/client-tx.pcap \
>> +    options:rxq_pcap=hv1/client-rx.pcap
>> +
>> +ovs-vsctl add-port br-int hv1-vif2 -- \
>> +    set interface hv1-vif2 external-ids:iface-id=server \
>> +    options:tx_pcap=hv1/server-tx.pcap \
>> +    options:rxq_pcap=hv1/server-rx.pcap
>> +
>> +#OVN_POPULATE_ARP
>> +wait_for_ports_up
> 
> Everything above, back to the previous comment, should not be needed for this test.
> 
>> +check ovn-nbctl --wait=hv sync
>> +
>> +ovn-sbctl dump-flows lr > lrflows
>> +AT_CAPTURE_FILE([lrflows])
>> +
>> +AT_CHECK([grep -E flags.network_id lrflows | ovn_strip_lflows], [0], [dnl
>> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-client" && reg0 == 1.1.1.1/24), action=(flags.network_id = 0; next;)
>> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-client" && xxreg0 == ff01::1/128), action=(flags.network_id = 0; next;)
>> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && reg0 == 1.1.2.1/24), action=(flags.network_id = 0; next;)
>> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && reg0 == 7.7.7.1/24), action=(flags.network_id = 1; next;)
>> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && reg0 == 8.8.8.1/24), action=(flags.network_id = 2; next;)
>> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && xxreg0 == ff01::2/128), action=(flags.network_id = 0; next;)
>> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && xxreg0 == ff01::3/128), action=(flags.network_id = 1; next;)
>> +  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && xxreg0 == ff01::6/128), action=(flags.network_id = 2; next;)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lrp-client"), action=(ct_snat(1.1.1.1);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lrp-server"), action=(ct_snat(1.1.2.1);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lrp-client"), action=(ct_snat(ff01::1);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lrp-server"), action=(ct_snat(ff01::2);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 1 && ip4 && outport == "lrp-server"), action=(ct_snat(7.7.7.1);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 1 && ip6 && outport == "lrp-server"), action=(ct_snat(ff01::3);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 2 && ip4 && outport == "lrp-server"), action=(ct_snat(8.8.8.1);)
>> +  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 2 && ip6 && outport == "lrp-server"), action=(ct_snat(ff01::6);)
>> +])
>> +AT_CLEANUP
>
Ilya Maximets Feb. 12, 2025, 11:05 p.m. UTC | #4
On 2/12/25 23:48, Ilya Maximets wrote:
>
> One more thing is that we should add an extra lower priority flow that
> will not match on the network id, but will directly use the first network
> for SNAT:
> 
>   stage:S_ROUTER_OUT_SNAT
>   priority:100

105 might be a better priority for this, as there are other flows in this
stage at 100 (the ones for lb_force_snat_ip=<some actual ip>).

>   match: flags.force_snat_for_lb == 1 && ip4 && outport == X
>   actions: ct_snat(lrp_networks.ipv4_addrs[0]);
> 
> This is useful for upgrade scenarios, or in case this change is going to
> be backported.  If northd is upgraded first, an older ovn-controller that
> doesn't recognize the network_id flag will not install flows that use it,
> but it will install this one and keep the old behavior of SNATting to the
> first IP until it is upgraded.
> That is not the correct upgrade order, but if we want to backport this
> change, it becomes a reasonable precaution, since we do not enforce an
> upgrade order for minor release updates.
diff mbox series

Patch

diff --git a/include/ovn/logical-fields.h b/include/ovn/logical-fields.h
index fbe73763e..73feb4b1b 100644
--- a/include/ovn/logical-fields.h
+++ b/include/ovn/logical-fields.h
@@ -95,6 +95,8 @@  enum mff_log_flags_bits {
     MLF_ICMP_SNAT_BIT = 17,
     MLF_OVERRIDE_LOCAL_ONLY_BIT = 18,
     MLF_FROM_CTRL_BIT = 19,
+    MLF_NETWORK_ID_START_BIT = 28,
+    MLF_NETWORK_ID_END_BIT = 31,
 };
 
 /* MFF_LOG_FLAGS_REG flag assignments */
@@ -152,6 +154,9 @@  enum mff_log_flags {
     MLF_ICMP_SNAT = (1 << MLF_ICMP_SNAT_BIT),
 
     MLF_OVERRIDE_LOCAL_ONLY = (1 << MLF_OVERRIDE_LOCAL_ONLY_BIT),
+
+    MLF_NETWORK_ID = ((MLF_NETWORK_ID_END_BIT - MLF_NETWORK_ID_START_BIT + 1)
+                       << MLF_NETWORK_ID_START_BIT) - 1,
 };
 
 /* OVN logical fields
diff --git a/lib/logical-fields.c b/lib/logical-fields.c
index df1b4243c..84a5dec99 100644
--- a/lib/logical-fields.c
+++ b/lib/logical-fields.c
@@ -139,6 +139,9 @@  ovn_init_symtab(struct shash *symtab)
                              flags_str);
     snprintf(flags_str, sizeof flags_str, "flags[%d]", MLF_RX_FROM_TUNNEL_BIT);
     expr_symtab_add_subfield(symtab, "flags.tunnel_rx", NULL, flags_str);
+    snprintf(flags_str, sizeof flags_str, "flags[%d..%d]",
+             MLF_NETWORK_ID_START_BIT, MLF_NETWORK_ID_END_BIT);
+    expr_symtab_add_subfield(symtab, "flags.network_id", NULL, flags_str);
 
     snprintf(flags_str, sizeof flags_str, "flags[%d]", MLF_FROM_CTRL_BIT);
     expr_symtab_add_subfield(symtab, "flags.from_ctrl", NULL, flags_str);
diff --git a/northd/northd.c b/northd/northd.c
index 1097bb159..ea784766b 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -13091,12 +13091,12 @@  build_lrouter_force_snat_flows_op(struct ovn_port *op,
         return;
     }
 
-    if (op->lrp_networks.n_ipv4_addrs) {
+    for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
         ds_clear(match);
         ds_clear(actions);
 
         ds_put_format(match, "inport == %s && ip4.dst == %s",
-                      op->json_key, op->lrp_networks.ipv4_addrs[0].addr_s);
+                      op->json_key, op->lrp_networks.ipv4_addrs[i].addr_s);
         ovn_lflow_add(lflows, op->od, S_ROUTER_IN_UNSNAT, 110,
                       ds_cstr(match), "ct_snat;", lflow_ref);
 
@@ -13105,53 +13105,46 @@  build_lrouter_force_snat_flows_op(struct ovn_port *op,
         /* Higher priority rules to force SNAT with the router port ip.
          * This only takes effect when the packet has already been
          * load balanced once. */
-        ds_put_format(match, "flags.force_snat_for_lb == 1 && ip4 && "
-                      "outport == %s", op->json_key);
+        ds_put_format(match, "flags.force_snat_for_lb == 1 && "
+                      "flags.network_id == %"PRIuSIZE" && ip4 && "
+                      "outport == %s", i, op->json_key);
         ds_put_format(actions, "ct_snat(%s);",
-                      op->lrp_networks.ipv4_addrs[0].addr_s);
+                      op->lrp_networks.ipv4_addrs[i].addr_s);
         ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_SNAT, 110,
                       ds_cstr(match), ds_cstr(actions),
                       lflow_ref);
-        if (op->lrp_networks.n_ipv4_addrs > 1) {
-            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
-            VLOG_WARN_RL(&rl, "Logical router port %s is configured with "
-                              "multiple IPv4 addresses.  Only the first "
-                              "IP [%s] is considered as SNAT for load "
-                              "balancer", op->json_key,
-                              op->lrp_networks.ipv4_addrs[0].addr_s);
-        }
     }
 
     /* op->lrp_networks.ipv6_addrs will always have LLA and that will be
-     * last in the list. So add the flows only if n_ipv6_addrs > 1. */
+     * last in the list. So add the flows only if n_ipv6_addrs > 1, and loop
+     * n_ipv6_addrs - 1 times. */
     if (op->lrp_networks.n_ipv6_addrs > 1) {
-        ds_clear(match);
-        ds_clear(actions);
-
-        ds_put_format(match, "inport == %s && ip6.dst == %s",
-                      op->json_key, op->lrp_networks.ipv6_addrs[0].addr_s);
-        ovn_lflow_add(lflows, op->od, S_ROUTER_IN_UNSNAT, 110,
-                      ds_cstr(match), "ct_snat;", lflow_ref);
+        for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs - 1; i++) {
+            ds_clear(match);
+            ds_clear(actions);
 
-        ds_clear(match);
+            ds_put_format(match, "inport == %s && ip6.dst == %s",
+                          op->json_key, op->lrp_networks.ipv6_addrs[i].addr_s);
+            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_UNSNAT, 110,
+                          ds_cstr(match), "ct_snat;", lflow_ref);
+            ds_clear(match);
 
-        /* Higher priority rules to force SNAT with the router port ip.
-         * This only takes effect when the packet has already been
-         * load balanced once. */
-        ds_put_format(match, "flags.force_snat_for_lb == 1 && ip6 && "
-                      "outport == %s", op->json_key);
-        ds_put_format(actions, "ct_snat(%s);",
-                      op->lrp_networks.ipv6_addrs[0].addr_s);
-        ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_SNAT, 110,
-                      ds_cstr(match), ds_cstr(actions),
-                      lflow_ref);
-        if (op->lrp_networks.n_ipv6_addrs > 2) {
-            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
-            VLOG_WARN_RL(&rl, "Logical router port %s is configured with "
-                              "multiple IPv6 addresses.  Only the first "
-                              "IP [%s] is considered as SNAT for load "
-                              "balancer", op->json_key,
-                              op->lrp_networks.ipv6_addrs[0].addr_s);
+            /* Higher priority rules to force SNAT with the router port ip.
+             * This only takes effect when the packet has already been
+             * load balanced once. */
+            if (op->lrp_networks.n_ipv6_addrs == 1) {
+                ds_put_format(match, "flags.force_snat_for_lb == 1 && ip6 && "
+                              "outport == %s", op->json_key);
+            } else if (op->lrp_networks.n_ipv6_addrs > 1) {
+                ds_put_format(match, "flags.force_snat_for_lb == 1 && "
+                              "flags.network_id == %"PRIuSIZE" && ip6 && "
+                              "outport == %s", i, op->json_key);
+            }
+            ds_put_format(actions, "ct_snat(%s);",
+                          op->lrp_networks.ipv6_addrs[i].addr_s);
+            ovn_lflow_add(lflows, op->od, S_ROUTER_OUT_SNAT, 110,
+                          ds_cstr(match), ds_cstr(actions),
+                          lflow_ref);
         }
     }
 }
@@ -15038,6 +15031,56 @@  build_arp_request_flows_for_lrouter(
                   lflow_ref);
 }
 
+static void
+build_lr_force_snat_network_id_flows(
+            struct ovn_datapath *od, struct lflow_table *lflows,
+            struct ds *match, struct ds *actions, struct lflow_ref *lflow_ref)
+{
+    const struct ovn_port *op;
+    HMAP_FOR_EACH (op, dp_node, &od->ports) {
+        for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
+            ds_clear(match);
+            ds_clear(actions);
+
+            ds_put_format(match, "flags.force_snat_for_lb == 1 && "
+                          "outport == %s && " REG_NEXT_HOP_IPV4 " == %s/%d",
+                          op->json_key, op->lrp_networks.ipv4_addrs[i].addr_s,
+                         op->lrp_networks.ipv4_addrs[i].plen);
+
+            ds_put_format(actions, "flags.network_id = %"PRIuSIZE"; ", i);
+            ds_put_format(actions, "next;");
+
+            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_NETWORK_ID, 110,
+               ds_cstr(match), ds_cstr(actions),
+               lflow_ref);
+        }
+
+        /* op->lrp_networks.ipv6_addrs will always have LLA and that will be
+         * last in the list. So add the flows only if n_ipv6_addrs > 1, and
+         * loop n_ipv6_addrs - 1 times. */
+        if (op->lrp_networks.n_ipv6_addrs > 1) {
+            for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs - 1; i++) {
+                ds_clear(match);
+                ds_clear(actions);
+
+                ds_put_format(match, "flags.force_snat_for_lb == 1 && "
+                              "outport == %s && " REG_NEXT_HOP_IPV6 " == %s/%d",
+                              op->json_key, op->lrp_networks.ipv6_addrs[i].addr_s,
+                             op->lrp_networks.ipv6_addrs[i].plen);
+
+                ds_put_format(actions, "flags.network_id = %"PRIuSIZE"; ", i);
+                ds_put_format(actions, "next;");
+
+                ovn_lflow_add(lflows, op->od, S_ROUTER_IN_NETWORK_ID, 110,
+                   ds_cstr(match), ds_cstr(actions),
+                   lflow_ref);
+            }
+        }
+    }
+    ovn_lflow_add(lflows, od, S_ROUTER_IN_NETWORK_ID, 0,
+                  "1", "next;", lflow_ref);
+}
+
 /* Logical router egress table DELIVERY: Delivery (priority 100-110).
  *
  * Priority 100 rules deliver packets to enabled logical ports.
@@ -17035,26 +17078,28 @@  build_lrouter_nat_defrag_and_lb(
     /* Handle force SNAT options set in the gateway router. */
     if (od->is_gw_router) {
         if (dnat_force_snat_ip) {
-            if (lrnat_rec->dnat_force_snat_addrs.n_ipv4_addrs) {
+            struct lport_addresses dn_addrs = lrnat_rec->dnat_force_snat_addrs;
+            for (size_t i = 0; i < dn_addrs.n_ipv4_addrs; i++) {
                 build_lrouter_force_snat_flows(lflows, od, "4",
-                    lrnat_rec->dnat_force_snat_addrs.ipv4_addrs[0].addr_s,
+                    lrnat_rec->dnat_force_snat_addrs.ipv4_addrs[i].addr_s,
                     "dnat", lflow_ref);
             }
-            if (lrnat_rec->dnat_force_snat_addrs.n_ipv6_addrs) {
+            for (size_t i = 0; i < dn_addrs.n_ipv6_addrs; i++) {
                 build_lrouter_force_snat_flows(lflows, od, "6",
-                    lrnat_rec->dnat_force_snat_addrs.ipv6_addrs[0].addr_s,
+                    lrnat_rec->dnat_force_snat_addrs.ipv6_addrs[i].addr_s,
                     "dnat", lflow_ref);
             }
         }
         if (lb_force_snat_ip) {
-            if (lrnat_rec->lb_force_snat_addrs.n_ipv4_addrs) {
+            struct lport_addresses lb_addrs = lrnat_rec->lb_force_snat_addrs;
+            for (size_t i = 0; i < lb_addrs.n_ipv4_addrs; i++) {
                 build_lrouter_force_snat_flows(lflows, od, "4",
-                    lrnat_rec->lb_force_snat_addrs.ipv4_addrs[0].addr_s, "lb",
+                    lrnat_rec->lb_force_snat_addrs.ipv4_addrs[i].addr_s, "lb",
                     lflow_ref);
             }
-            if (lrnat_rec->lb_force_snat_addrs.n_ipv6_addrs) {
+            for (size_t i = 0; i < lb_addrs.n_ipv6_addrs; i++) {
                 build_lrouter_force_snat_flows(lflows, od, "6",
-                    lrnat_rec->lb_force_snat_addrs.ipv6_addrs[0].addr_s, "lb",
+                    lrnat_rec->lb_force_snat_addrs.ipv6_addrs[i].addr_s, "lb",
                     lflow_ref);
             }
         }
@@ -17415,6 +17460,8 @@  build_lswitch_and_lrouter_iterate_by_lr(struct ovn_datapath *od,
                                         &lsi->actions,
                                         lsi->meter_groups,
                                         NULL);
+    build_lr_force_snat_network_id_flows(od, lsi->lflows, &lsi->match,
+                                         &lsi->actions, NULL);
     build_misc_local_traffic_drop_flows_for_lrouter(od, lsi->lflows, NULL);
 
     build_lr_nat_defrag_and_lb_default_flows(od, lsi->lflows, NULL);
diff --git a/northd/northd.h b/northd/northd.h
index 1f29645c7..e89ab3c74 100644
--- a/northd/northd.h
+++ b/northd/northd.h
@@ -537,7 +537,8 @@  enum ovn_stage {
     PIPELINE_STAGE(ROUTER, IN,  CHK_PKT_LEN,     21, "lr_in_chk_pkt_len")     \
     PIPELINE_STAGE(ROUTER, IN,  LARGER_PKTS,     22, "lr_in_larger_pkts")     \
     PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,     23, "lr_in_gw_redirect")     \
-    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     24, "lr_in_arp_request")     \
+    PIPELINE_STAGE(ROUTER, IN,  NETWORK_ID,      24, "lr_in_network_id")      \
+    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,     25, "lr_in_arp_request")     \
                                                                       \
     /* Logical router egress stages. */                               \
     PIPELINE_STAGE(ROUTER, OUT, CHECK_DNAT_LOCAL,   0,                       \
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 93b1a9135..e15b91f1f 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -5256,18 +5256,19 @@  nd_ns {
           table="Logical_Router"/>:lb_force_snat_ip=router_ip), then for
           each logical router port <var>P</var> attached to the Gateway
           router, a priority-110 flow matches
-          <code>flags.force_snat_for_lb == 1 &amp;&amp; outport == <var>P</var>
-          </code> with an action <code>ct_snat(<var>R</var>);</code>
-          where <var>R</var> is the IP configured on the router port.
-          If <code>R</code> is an IPv4 address then the match will also
-          include <code>ip4</code> and if it is an IPv6 address, then the
-          match will also include <code>ip6</code>.
+          <code>flags.force_snat_for_lb == 1 &amp;&amp; flags.network_id ==
+          <var>I</var> &amp;&amp; outport == <var>P</var></code>, where
+          <var>I</var> is the network index, with an action
+          <code>ct_snat(<var>R</var>);</code> where <var>R</var> is the IP
+          configured on the router port. If <code>R</code> is an IPv4 address
+          then the match will also include <code>ip4</code> and if it is an
+          IPv6 address, then the match will also include <code>ip6</code>.
         </p>
 
         <p>
           If the logical router port <var>P</var> is configured with multiple
-          IPv4 and multiple IPv6 addresses, only the first IPv4 and first IPv6
-          address is considered.
+          IPv4 and multiple IPv6 addresses, the IPv4 and IPv6 address within
+          the same network as the next-hop will be chosen.
         </p>
       </li>
 
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 64991ff75..777f3b95b 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -4460,9 +4460,9 @@  AT_CHECK([grep "lr_in_dnat" lr0flows | ovn_strip_lflows], [0], [dnl
 
 AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
   table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
   table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
 ])
 
@@ -4525,10 +4525,10 @@  AT_CHECK([grep "lr_in_dnat" lr0flows | ovn_strip_lflows], [0], [dnl
 
 AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
   table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip6 && outport == "lr0-sw1"), action=(ct_snat(bef0::1);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.100);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw1"), action=(ct_snat(20.0.0.1);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lr0-sw1"), action=(ct_snat(bef0::1);)
   table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
 ])
 
@@ -6235,8 +6235,8 @@  AT_CHECK([grep "lr_out_post_undnat" lr0flows | ovn_strip_lflows], [0], [dnl
 
 AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
   table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
   table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
   table=??(lr_out_snat        ), priority=25   , match=(ip && ip4.src == 10.0.0.0/24 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.10);)
   table=??(lr_out_snat        ), priority=33   , match=(ip && ip4.src == 10.0.0.10 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.30);)
@@ -6301,8 +6301,8 @@  AT_CHECK([grep "lr_out_post_undnat" lr0flows | ovn_strip_lflows], [0], [dnl
 
 AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
   table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
   table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
   table=??(lr_out_snat        ), priority=25   , match=(ip && ip4.src == 10.0.0.0/24 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.10);)
   table=??(lr_out_snat        ), priority=33   , match=(ip && ip4.src == 10.0.0.10 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.30);)
@@ -6379,10 +6379,10 @@  AT_CHECK([grep "lr_out_post_undnat" lr0flows | ovn_strip_lflows], [0], [dnl
 
 AT_CHECK([grep "lr_out_snat" lr0flows | ovn_strip_lflows], [0], [dnl
   table=??(lr_out_snat        ), priority=0    , match=(1), action=(next;)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip6 && outport == "lr0-public"), action=(ct_snat(def0::10);)
-  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && ip6 && outport == "lr0-sw0"), action=(ct_snat(aef0::1);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-public"), action=(ct_snat(172.168.0.10);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lr0-sw0"), action=(ct_snat(10.0.0.1);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lr0-public"), action=(ct_snat(def0::10);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lr0-sw0"), action=(ct_snat(aef0::1);)
   table=??(lr_out_snat        ), priority=120  , match=(nd_ns), action=(next;)
   table=??(lr_out_snat        ), priority=25   , match=(ip && ip4.src == 10.0.0.0/24 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.10);)
   table=??(lr_out_snat        ), priority=33   , match=(ip && ip4.src == 10.0.0.10 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.168.0.30);)
@@ -15542,3 +15542,142 @@  CHECK_NO_CHANGE_AFTER_RECOMPUTE
 
 AT_CLEANUP
 ])
+
+AT_SETUP([lb_force_snat_ip=routerip select correct network for snat])
+AT_SKIP_IF([test $HAVE_SCAPY = no])
+ovn_start
+
+check ovn-nbctl lr-add lr
+check ovn-nbctl set logical_router lr options:chassis=hv1
+check ovn-nbctl set logical_router lr options:lb_force_snat_ip=router_ip
+check ovn-nbctl lrp-add lr lrp-client 02:00:00:00:00:02 1.1.1.1/24
+check ovn-nbctl lrp-add lr lrp-server 02:00:00:00:00:03 1.1.2.1/24 1.2.1.1/24 \
+                    1.1.3.1/24 2.2.2.1/24 7.7.7.1/24 8.8.8.1/24
+check ovn-nbctl ls-add ls-client
+check ovn-nbctl ls-add ls-server
+check ovn-nbctl lsp-add ls-client lsp-client-router
+check ovn-nbctl lsp-set-type lsp-client-router router
+check ovn-nbctl lsp-add ls-server lsp-server-router
+check ovn-nbctl lsp-set-type lsp-server-router router
+check ovn-nbctl set logical_switch_port lsp-client-router options:router-port=lrp-client
+check ovn-nbctl set logical_switch_port lsp-server-router options:router-port=lrp-server
+check ovn-nbctl lsp-add ls-client client
+check ovn-nbctl lsp-add ls-server server
+check ovn-nbctl lsp-set-addresses client "02:00:00:00:00:01 1.1.1.10"
+check ovn-nbctl lsp-set-addresses server "02:00:00:00:00:04 2.2.2.10"
+check ovn-nbctl lsp-set-addresses lsp-client-router router
+check ovn-nbctl lsp-set-addresses lsp-server-router router
+check ovn-nbctl lb-add lb 42.42.42.42:80 2.2.2.10:80 udp
+check ovn-nbctl lr-lb-add lr lb
+
+# Create a hypervisor and OVS ports corresponding to the logical ports.
+net_add n1
+sim_add hv1
+as hv1
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+
+ovs-vsctl -- add-port br-int hv1-vif1 -- \
+    set interface hv1-vif1 external-ids:iface-id=client \
+    options:tx_pcap=hv1/client-tx.pcap \
+    options:rxq_pcap=hv1/client-rx.pcap
+
+ovs-vsctl -- add-port br-int hv1-vif2 -- \
+    set interface hv1-vif2 external-ids:iface-id=server \
+    options:tx_pcap=hv1/server-tx.pcap \
+    options:rxq_pcap=hv1/server-rx.pcap
+
+wait_for_ports_up
+check ovn-nbctl --wait=hv sync
+
+tx_src_mac="02:00:00:00:00:01"
+tx_dst_mac="02:00:00:00:00:02"
+tx_src_ip=1.1.1.10
+tx_dst_ip=42.42.42.42
+request=$(fmt_pkt "Ether(dst='${tx_dst_mac}', src='${tx_src_mac}')/ \
+                  IP(src='${tx_src_ip}', dst='${tx_dst_ip}')/ \
+                  UDP(sport=20001, dport=80)")
+
+as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $request
+
+rx_src_mac="02:00:00:00:00:03"
+rx_dst_mac="02:00:00:00:00:04"
+rx_src_ip=2.2.2.1
+rx_dst_ip=2.2.2.10
+expected=$(fmt_pkt "Ether(dst='${rx_dst_mac}', src='${rx_src_mac}')/ \
+                  IP(src='${rx_src_ip}', dst='${rx_dst_ip}', ttl=0x3F)/ \
+                  UDP(sport=20001, dport=80)")
+
+echo $expected > expected
+OVN_CHECK_PACKETS([hv1/server-tx.pcap], [expected])
+AT_CLEANUP
+
+AT_SETUP([lb_force_snat_ip=routerip generate flags.network_id flows])
+ovn_start
+
+check ovn-nbctl lr-add lr
+check ovn-nbctl set logical_router lr options:chassis=hv1
+check ovn-nbctl set logical_router lr options:lb_force_snat_ip=router_ip
+check ovn-nbctl lrp-add lr lrp-client 02:00:00:00:00:02 1.1.1.1/24 ff01::01
+check ovn-nbctl lrp-add lr lrp-server 02:00:00:00:00:03 1.1.2.1/24 7.7.7.1/24 8.8.8.1/24 \
+                      ff01::02 ff01::03 ff01::06
+check ovn-nbctl ls-add ls-client
+check ovn-nbctl ls-add ls-server
+check ovn-nbctl lsp-add ls-client lsp-client-router
+check ovn-nbctl lsp-set-type lsp-client-router router
+check ovn-nbctl lsp-add ls-server lsp-server-router
+check ovn-nbctl lsp-set-type lsp-server-router router
+check ovn-nbctl set logical_switch_port lsp-client-router options:router-port=lrp-client
+check ovn-nbctl set logical_switch_port lsp-server-router options:router-port=lrp-server
+check ovn-nbctl lsp-add ls-client client
+check ovn-nbctl lsp-add ls-server server
+check ovn-nbctl lsp-set-addresses client "02:00:00:00:00:01 1.1.1.10"
+check ovn-nbctl lsp-set-addresses server "02:00:00:00:00:04 2.2.2.10"
+check ovn-nbctl lsp-set-addresses lsp-client-router router
+check ovn-nbctl lsp-set-addresses lsp-server-router router
+check ovn-nbctl lb-add lb 42.42.42.42:80 2.2.2.10:80 udp
+check ovn-nbctl lr-lb-add lr lb
+
+# Create a hypervisor and OVS ports corresponding to the logical ports.
+net_add n1
+sim_add hv1
+as hv1
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+
+ovs-vsctl add-port br-int hv1-vif1 -- \
+    set interface hv1-vif1 external-ids:iface-id=client \
+    options:tx_pcap=hv1/client-tx.pcap \
+    options:rxq_pcap=hv1/client-rx.pcap
+
+ovs-vsctl add-port br-int hv1-vif2 -- \
+    set interface hv1-vif2 external-ids:iface-id=server \
+    options:tx_pcap=hv1/server-tx.pcap \
+    options:rxq_pcap=hv1/server-rx.pcap
+
+wait_for_ports_up
+check ovn-nbctl --wait=hv sync
+
+ovn-sbctl dump-flows lr > lrflows
+AT_CAPTURE_FILE([lrflows])
+
+AT_CHECK([grep 'flags\.network_id' lrflows | ovn_strip_lflows], [0], [dnl
+  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-client" && reg0 == 1.1.1.1/24), action=(flags.network_id = 0; next;)
+  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-client" && xxreg0 == ff01::1/128), action=(flags.network_id = 0; next;)
+  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && reg0 == 1.1.2.1/24), action=(flags.network_id = 0; next;)
+  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && reg0 == 7.7.7.1/24), action=(flags.network_id = 1; next;)
+  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && reg0 == 8.8.8.1/24), action=(flags.network_id = 2; next;)
+  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && xxreg0 == ff01::2/128), action=(flags.network_id = 0; next;)
+  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && xxreg0 == ff01::3/128), action=(flags.network_id = 1; next;)
+  table=??(lr_in_network_id   ), priority=110  , match=(flags.force_snat_for_lb == 1 && outport == "lrp-server" && xxreg0 == ff01::6/128), action=(flags.network_id = 2; next;)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lrp-client"), action=(ct_snat(1.1.1.1);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip4 && outport == "lrp-server"), action=(ct_snat(1.1.2.1);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lrp-client"), action=(ct_snat(ff01::1);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 0 && ip6 && outport == "lrp-server"), action=(ct_snat(ff01::2);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 1 && ip4 && outport == "lrp-server"), action=(ct_snat(7.7.7.1);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 1 && ip6 && outport == "lrp-server"), action=(ct_snat(ff01::3);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 2 && ip4 && outport == "lrp-server"), action=(ct_snat(8.8.8.1);)
+  table=??(lr_out_snat        ), priority=110  , match=(flags.force_snat_for_lb == 1 && flags.network_id == 2 && ip6 && outport == "lrp-server"), action=(ct_snat(ff01::6);)
+])
+AT_CLEANUP