[ovs-dev,v7,3/3] pinctrl: Handle arp/nd for other address families.

Message ID 20241204151207.39657-4-martin.kalcok@canonical.com
State Accepted
Delegated to: Dumitru Ceara
Series IPv4 routes over IPv6 next hop addresses.

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes success github build: passed

Commit Message

Martin Kalcok Dec. 4, 2024, 3:10 p.m. UTC
From: Felix Huettner <felix.huettner@mail.schwarz>

Previously we could only generate ARP requests from IPv4 packets
and NS requests from IPv6 packets, because we relied on information
in the packet to generate the ARP/NS requests.

However, for ARP/NS requests originating from the Logical_Router
pipeline for nexthop lookups, we overwrite the affected fields
afterwards. This overwrite is done by the OpenFlow actions carried in
the userdata. Because of this, we do not actually rely on any
information from the IPv4/IPv6 packets in these cases.

Unfortunately, we cannot easily determine whether the affected fields
will actually be overwritten later. The approach now is to use the
fields from the IP header if the IP version matches, and to fall back
to default values otherwise. If this data is overwritten afterwards,
we are fine. If it is not overwritten because of some bug, we will
send out invalid ARP/NS requests, which will hopefully be dropped by
the rest of the network.
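
To make the fallback concrete, here is a minimal sketch of the field
selection for the ARP case. It is not the pinctrl code: struct mini_flow,
struct mini_arp and fill_arp_addresses() are hypothetical stand-ins, and
addresses are kept in host byte order for brevity. It only illustrates the
rule "use the packet's addresses when the IP version matches, otherwise
write placeholders that the userdata actions are expected to overwrite".

```c
#include <stdint.h>

#define ETH_TYPE_IP   0x0800
#define ETH_TYPE_IPV6 0x86dd

/* Hypothetical, simplified view of the flow fields that matter here. */
struct mini_flow {
    uint16_t dl_type;   /* Ethertype of the packet that triggered the action. */
    uint32_t nw_src;    /* IPv4 source, only meaningful for ETH_TYPE_IP. */
    uint32_t nw_dst;    /* IPv4 destination, only meaningful for ETH_TYPE_IP. */
};

/* Hypothetical, simplified view of the ARP protocol addresses. */
struct mini_arp {
    uint32_t spa;       /* Sender protocol address. */
    uint32_t tpa;       /* Target protocol address. */
};

/* Fills the ARP protocol addresses.  Returns 0 on success and -1 if the
 * triggering packet is neither IPv4 nor IPv6. */
static int
fill_arp_addresses(const struct mini_flow *flow, struct mini_arp *arp)
{
    if (flow->dl_type != ETH_TYPE_IP && flow->dl_type != ETH_TYPE_IPV6) {
        return -1;
    }

    if (flow->dl_type == ETH_TYPE_IP) {
        /* Matching IP version: reuse the addresses from the packet. */
        arp->spa = flow->nw_src;
        arp->tpa = flow->nw_dst;
    } else {
        /* IPv6 packet: no IPv4 addresses to reuse.  These placeholders
         * are assumed to be overwritten by later actions. */
        arp->spa = 0;
        arp->tpa = 0;
    }
    return 0;
}
```

The zero placeholders are only harmless when the nested actions set arp.spa
and arp.tpa afterwards, which is the case for the Logical_Router flows.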

The alternative would have been to introduce new arp/nd_ns actions
that guarantee this overwrite. That would not suffer from the above
limitation, but would require coordinating upgrades between all
ovn-controllers and northd.

Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz>
Signed-off-by: Martin Kalcok <martin.kalcok@canonical.com>
Co-authored-by: Martin Kalcok <martin.kalcok@canonical.com>
---
 controller/pinctrl.c |  52 +++++--
 lib/actions.c        |   4 +-
 northd/northd.c      |   9 +-
 tests/ovn-northd.at  |   8 +-
 tests/ovn.at         | 268 +++++++++++++++++++++++++++++++++++-
 tests/system-ovn.at  | 318 +++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 638 insertions(+), 21 deletions(-)

Comments

Dumitru Ceara Dec. 17, 2024, 3:25 p.m. UTC | #1
On 12/4/24 4:10 PM, Martin Kalcok wrote:
> From: Felix Huettner <felix.huettner@mail.schwarz>
> 
> Previously we could only generate ARP requests from IPv4 packets
> and NS requests from IPv6 packets, because we relied on information
> in the packet to generate the ARP/NS requests.
> 
> However, for ARP/NS requests originating from the Logical_Router
> pipeline for nexthop lookups, we overwrite the affected fields
> afterwards. This overwrite is done by the OpenFlow actions carried in
> the userdata. Because of this, we do not actually rely on any
> information from the IPv4/IPv6 packets in these cases.
> 
> Unfortunately, we cannot easily determine whether the affected fields
> will actually be overwritten later. The approach now is to use the
> fields from the IP header if the IP version matches, and to fall back
> to default values otherwise. If this data is overwritten afterwards,
> we are fine. If it is not overwritten because of some bug, we will
> send out invalid ARP/NS requests, which will hopefully be dropped by
> the rest of the network.
> 
> The alternative would have been to introduce new arp/nd_ns actions
> that guarantee this overwrite. That would not suffer from the above
> limitation, but would require coordinating upgrades between all
> ovn-controllers and northd.
> 
> Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz>
> Signed-off-by: Martin Kalcok <martin.kalcok@canonical.com>
> Co-authored-by: Martin Kalcok <martin.kalcok@canonical.com>
> ---

Thanks, Felix and Martin!  This looks good to me.  I only had a few
minor comments (see inline) but I can take care of those when applying
the patch to main.

Just as for patch 2/3, you can find the rebased version here:
https://github.com/dceara/ovn/commits/bcba1b74

I'll wait for confirmation that it still looks OK before pushing the
series to main.

Regards,
Dumitru

>  controller/pinctrl.c |  52 +++++--
>  lib/actions.c        |   4 +-
>  northd/northd.c      |   9 +-
>  tests/ovn-northd.at  |   8 +-
>  tests/ovn.at         | 268 +++++++++++++++++++++++++++++++++++-
>  tests/system-ovn.at  | 318 +++++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 638 insertions(+), 21 deletions(-)
> 
> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> index 3fb7e2fd7..b05a0639b 100644
> --- a/controller/pinctrl.c
> +++ b/controller/pinctrl.c
> @@ -1567,9 +1567,11 @@ pinctrl_handle_arp(struct rconn *swconn, const struct flow *ip_flow,
>                     const struct ofputil_packet_in *pin,
>                     struct ofpbuf *userdata, const struct ofpbuf *continuation)
>  {
> -    /* This action only works for IP packets, and the switch should only send
> -     * us IP packets this way, but check here just to be sure. */
> -    if (ip_flow->dl_type != htons(ETH_TYPE_IP)) {
> +    uint16_t dl_type = ntohs(ip_flow->dl_type);
> +
> +    /* This action only works for IPv4 or IPv6 packets, and the switch should
> +     * only send us IP packets this way, but check here just to be sure. */
> +    if (dl_type != ETH_TYPE_IP && dl_type != ETH_TYPE_IPV6) {
>          static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>          VLOG_WARN_RL(&rl, "ARP action on non-IP packet (Ethertype %"PRIx16")",
>                       ntohs(ip_flow->dl_type));
> @@ -1593,9 +1595,25 @@ pinctrl_handle_arp(struct rconn *swconn, const struct flow *ip_flow,
>      struct arp_eth_header *arp = dp_packet_l3(&packet);
>      arp->ar_op = htons(ARP_OP_REQUEST);
>      arp->ar_sha = ip_flow->dl_src;
> -    put_16aligned_be32(&arp->ar_spa, ip_flow->nw_src);
>      arp->ar_tha = eth_addr_zero;
> -    put_16aligned_be32(&arp->ar_tpa, ip_flow->nw_dst);
> +
> +    /* We might be here without actually currently handling an IPv4 packet.
> +     * This can happen in the case where we route IPv6 packets over an IPv4
> +     * link.
> +     * In these cases we have no destination IPv4 address from the packet that
> +     * we can reuse. But we receive the actual destination IPv4 address via
> +     * userdata anyway, so what we set for now is irrelevant.
> +     * This is just a hope since we do not parse the userdata. If we land here
> +     * for whatever reason without being an IPv4 packet and without userdata we
> +     * will send out a wrong packet.
> +     */
> +    if (ip_flow->dl_type == htons(ETH_TYPE_IP)) {
> +        put_16aligned_be32(&arp->ar_spa, ip_flow->nw_src);
> +        put_16aligned_be32(&arp->ar_tpa, ip_flow->nw_dst);
> +    } else {
> +        put_16aligned_be32(&arp->ar_spa, 0);
> +        put_16aligned_be32(&arp->ar_tpa, 0);
> +    }
>  
>      if (ip_flow->vlans[0].tci & htons(VLAN_CFI)) {
>          eth_push_vlan(&packet, htons(ETH_TYPE_VLAN_8021Q),
> @@ -6620,8 +6638,11 @@ pinctrl_handle_nd_ns(struct rconn *swconn, const struct flow *ip_flow,
>                       struct ofpbuf *userdata,
>                       const struct ofpbuf *continuation)
>  {
> -    /* This action only works for IPv6 packets. */
> -    if (get_dl_type(ip_flow) != htons(ETH_TYPE_IPV6)) {
> +    uint16_t dl_type = ntohs(ip_flow->dl_type);
> +
> +    /* This action only works for IPv4 or IPv6 packets, and the switch should
> +     * only send us IP packets this way, but check here just to be sure. */
> +    if (dl_type != ETH_TYPE_IP && dl_type != ETH_TYPE_IPV6) {
>          static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>          VLOG_WARN_RL(&rl, "NS action on non-IPv6 packet");
>          return;
> @@ -6637,8 +6658,23 @@ pinctrl_handle_nd_ns(struct rconn *swconn, const struct flow *ip_flow,
>      dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);
>  
>      in6_generate_lla(ip_flow->dl_src, &ipv6_src);
> +
> +    /* We might be here without actually currently handling an IPv6 packet.
> +     * This can happen in the case where we route IPv4 packets over an IPv6
> +     * link.
> +     * In these cases we have no destination IPv6 address from the packet that
> +     * we can reuse. But we receive the actual destination IPv6 address via
> +     * userdata anyway, so what we pass to compose_nd_ns is irrelevant.
> +     * This is just a hope since we do not parse the userdata. If we land here
> +     * for whatever reason without being an IPv6 packet and without userdata we
> +     * will send out a wrong packet.
> +     */
> +    struct in6_addr ipv6_dst = IN6ADDR_EXACT_INIT;
> +    if (get_dl_type(ip_flow) == htons(ETH_TYPE_IPV6)) {
> +        ipv6_dst = ip_flow->ipv6_dst;
> +    }
>      compose_nd_ns(&packet, ip_flow->dl_src, &ipv6_src,
> -                  &ip_flow->ipv6_dst);
> +                  &ipv6_dst);
>  
>      /* Reload previous packet metadata and set actions from userdata. */
>      set_actions_and_enqueue_msg(swconn, &packet,
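
For reference, the link-local source address of the NS is derived from the
MAC alone, so it stays valid even when the triggering packet is IPv4. A
minimal sketch, assuming in6_generate_lla() follows the usual modified
EUI-64 construction (fe80::/64 plus the MAC with the universal/local bit
flipped and ff:fe inserted in the middle). lla_from_mac() is a hypothetical
name used only for illustration, not the OVS helper:

```c
#include <stdint.h>
#include <string.h>

/* Builds an IPv6 link-local address (16 bytes, network order) from a
 * 48-bit MAC using the modified EUI-64 scheme. */
static void
lla_from_mac(const uint8_t mac[6], uint8_t addr[16])
{
    memset(addr, 0, 16);
    addr[0] = 0xfe;                         /* fe80::/64 prefix. */
    addr[1] = 0x80;
    addr[8] = (uint8_t) (mac[0] ^ 0x02);    /* Flip the universal/local bit. */
    addr[9] = mac[1];
    addr[10] = mac[2];
    addr[11] = 0xff;                        /* Insert ff:fe between the */
    addr[12] = 0xfe;                        /* two halves of the MAC. */
    addr[13] = mac[3];
    addr[14] = mac[4];
    addr[15] = mac[5];
}
```

For a MAC of 00:00:00:02:03:04 this yields fe80::200:ff:fe02:304.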
> diff --git a/lib/actions.c b/lib/actions.c
> index d5fc30b27..ea30be767 100644
> --- a/lib/actions.c
> +++ b/lib/actions.c
> @@ -1765,7 +1765,7 @@ parse_nested_action(struct action_context *ctx, enum ovnact_type type,
>  static void
>  parse_ARP(struct action_context *ctx)
>  {
> -    parse_nested_action(ctx, OVNACT_ARP, "ip4", ctx->scope);
> +    parse_nested_action(ctx, OVNACT_ARP, "ip", ctx->scope);
>  }
>  
>  static void
> @@ -1819,7 +1819,7 @@ parse_ND_NA_ROUTER(struct action_context *ctx)
>  static void
>  parse_ND_NS(struct action_context *ctx)
>  {
> -    parse_nested_action(ctx, OVNACT_ND_NS, "ip6", ctx->scope);
> +    parse_nested_action(ctx, OVNACT_ND_NS, "ip", ctx->scope);
>  }
>  
>  static void
> diff --git a/northd/northd.c b/northd/northd.c
> index 4fb48838b..436e42248 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -14893,7 +14893,8 @@ build_arp_request_flows_for_lrouter(
>  
>          ds_clear(match);
>          ds_put_format(match, "eth.dst == 00:00:00:00:00:00 && "
> -                      "ip6 && " REG_NEXT_HOP_IPV6 " == %s",
> +                      REGBIT_NEXTHOP_IS_IPV4" == 0 && "
> +                      REG_NEXT_HOP_IPV6 " == %s",
>                        route->nexthop);
>          struct in6_addr sn_addr;
>          struct eth_addr eth_dst;
> @@ -14923,7 +14924,8 @@ build_arp_request_flows_for_lrouter(
>      }
>  
>      ovn_lflow_metered(lflows, od, S_ROUTER_IN_ARP_REQUEST, 100,
> -                      "eth.dst == 00:00:00:00:00:00 && ip4",
> +                      "eth.dst == 00:00:00:00:00:00 && "
> +                      REGBIT_NEXTHOP_IS_IPV4" == 1",
>                        "arp { "
>                        "eth.dst = ff:ff:ff:ff:ff:ff; "
>                        "arp.spa = " REG_SRC_IPV4 "; "
> @@ -14935,7 +14937,8 @@ build_arp_request_flows_for_lrouter(
>                                       meter_groups),
>                        lflow_ref);
>      ovn_lflow_metered(lflows, od, S_ROUTER_IN_ARP_REQUEST, 100,
> -                      "eth.dst == 00:00:00:00:00:00 && ip6",
> +                      "eth.dst == 00:00:00:00:00:00 && "
> +                      REGBIT_NEXTHOP_IS_IPV4" == 0",
>                        "nd_ns { "
>                        "nd.target = " REG_NEXT_HOP_IPV6 "; "
>                        "output; "
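
In other words, the choice between arp and nd_ns now follows the address
family of the next hop rather than that of the packet being routed. A
minimal sketch of that decision, using hypothetical names
(enum resolve_action, pick_resolve_action()) that do not exist in northd:

```c
#include <stdbool.h>

enum resolve_action {
    RESOLVE_NONE,     /* MAC already known: just output. */
    RESOLVE_ARP,      /* IPv4 next hop: send an ARP request. */
    RESOLVE_ND_NS     /* IPv6 next hop: send a neighbor solicitation. */
};

/* Mirrors the lr_in_arp_request matches: the packet's own IP version is
 * not consulted, only whether eth.dst is still unresolved and whether the
 * next hop is IPv4 (the REGBIT_NEXTHOP_IS_IPV4 bit in the flows). */
static enum resolve_action
pick_resolve_action(bool eth_dst_unresolved, bool nexthop_is_ipv4)
{
    if (!eth_dst_unresolved) {
        return RESOLVE_NONE;
    }
    return nexthop_is_ipv4 ? RESOLVE_ARP : RESOLVE_ND_NS;
}
```

An IPv4 packet routed over an IPv6 next hop therefore ends up in the nd_ns
branch, which is the case the pinctrl changes above have to handle.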
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 803823afa..4335baeec 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -6979,10 +6979,10 @@ AT_CHECK([grep -e "lr_in_arp_resolve" lr0flows | ovn_strip_lflows], [0], [dnl
>  
>  AT_CHECK([grep -e "lr_in_arp_request" lr0flows | ovn_strip_lflows], [0], [dnl
>    table=??(lr_in_arp_request  ), priority=0    , match=(1), action=(output;)
> -  table=??(lr_in_arp_request  ), priority=100  , match=(eth.dst == 00:00:00:00:00:00 && ip4), action=(arp { eth.dst = ff:ff:ff:ff:ff:ff; arp.spa = reg5; arp.tpa = reg0; arp.op = 1; output; }; output;)
> -  table=??(lr_in_arp_request  ), priority=100  , match=(eth.dst == 00:00:00:00:00:00 && ip6), action=(nd_ns { nd.target = xxreg0; output; }; output;)
> -  table=??(lr_in_arp_request  ), priority=200  , match=(eth.dst == 00:00:00:00:00:00 && ip6 && xxreg0 == 2001:db8::10), action=(nd_ns { eth.dst = 33:33:ff:00:00:10; ip6.dst = ff02::1:ff00:10; nd.target = 2001:db8::10; output; }; output;)
> -  table=??(lr_in_arp_request  ), priority=200  , match=(eth.dst == 00:00:00:00:00:00 && ip6 && xxreg0 == 2001:db8::20), action=(nd_ns { eth.dst = 33:33:ff:00:00:20; ip6.dst = ff02::1:ff00:20; nd.target = 2001:db8::20; output; }; output;)
> +  table=??(lr_in_arp_request  ), priority=100  , match=(eth.dst == 00:00:00:00:00:00 && reg9[[9]] == 0), action=(nd_ns { nd.target = xxreg0; output; }; output;)
> +  table=??(lr_in_arp_request  ), priority=100  , match=(eth.dst == 00:00:00:00:00:00 && reg9[[9]] == 1), action=(arp { eth.dst = ff:ff:ff:ff:ff:ff; arp.spa = reg5; arp.tpa = reg0; arp.op = 1; output; }; output;)
> +  table=??(lr_in_arp_request  ), priority=200  , match=(eth.dst == 00:00:00:00:00:00 && reg9[[9]] == 0 && xxreg0 == 2001:db8::10), action=(nd_ns { eth.dst = 33:33:ff:00:00:10; ip6.dst = ff02::1:ff00:10; nd.target = 2001:db8::10; output; }; output;)
> +  table=??(lr_in_arp_request  ), priority=200  , match=(eth.dst == 00:00:00:00:00:00 && reg9[[9]] == 0 && xxreg0 == 2001:db8::20), action=(nd_ns { eth.dst = 33:33:ff:00:00:20; ip6.dst = ff02::1:ff00:20; nd.target = 2001:db8::20; output; }; output;)
>  ])
>  
>  AT_CLEANUP
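
The eth.dst and ip6.dst values in the priority-200 flows are the
solicited-node multicast address of the next hop and its Ethernet mapping.
A minimal sketch of that derivation, following RFC 4291 and RFC 2464; the
function names are hypothetical and not taken from OVN:

```c
#include <stdint.h>
#include <string.h>

/* Solicited-node multicast address: ff02::1:ffXX:XXXX, where XX:XXXX are
 * the low 24 bits of the target address (16 bytes, network order). */
static void
solicited_node_addr(const uint8_t target[16], uint8_t sn[16])
{
    memset(sn, 0, 16);
    sn[0] = 0xff;
    sn[1] = 0x02;
    sn[11] = 0x01;
    sn[12] = 0xff;
    sn[13] = target[13];
    sn[14] = target[14];
    sn[15] = target[15];
}

/* Ethernet address for an IPv6 multicast destination: 33:33 followed by
 * the low 32 bits of the IPv6 address. */
static void
ipv6_multicast_mac(const uint8_t mcast[16], uint8_t mac[6])
{
    mac[0] = 0x33;
    mac[1] = 0x33;
    memcpy(&mac[2], &mcast[12], 4);
}
```

For 2001:db8::10 this produces ff02::1:ff00:10 and 33:33:ff:00:00:10, which
matches the expected flow above.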
> diff --git a/tests/ovn.at b/tests/ovn.at
> index ec90a3b4e..a29ec7114 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -1396,11 +1396,11 @@ clone { ip4.dst = 255.255.255.255; output; }; next;
>  # arp
>  arp { eth.dst = ff:ff:ff:ff:ff:ff; output; }; output;
>      encodes as controller(userdata=00.00.00.00.00.00.00.00.00.19.00.10.80.00.06.06.ff.ff.ff.ff.ff.ff.00.00.ff.ff.00.10.00.00.23.20.00.0e.ff.f8.OFTABLE_SAVE_INPORT_HEX.00.00.00,pause),resubmit(,OFTABLE_SAVE_INPORT)
> -    has prereqs ip4
> +    has prereqs ip
>  arp { };
>      formats as arp { drop; };
>      encodes as controller(userdata=00.00.00.00.00.00.00.00,pause)
> -    has prereqs ip4
> +    has prereqs ip
>  
>  # get_arp
>  get_arp(outport, ip4.dst);
> @@ -1564,12 +1564,12 @@ reg9[[8]] = dhcp_relay_resp_chk(192.168.1, 172.16.1.1);
>  # nd_ns
>  nd_ns { nd.target = xxreg0; output; };
>      encodes as controller(userdata=00.00.00.09.00.00.00.00.00.1c.00.18.00.80.00.00.00.00.00.00.00.01.de.10.80.00.3e.10.00.00.00.00.ff.ff.00.10.00.00.23.20.00.0e.ff.f8.OFTABLE_SAVE_INPORT_HEX.00.00.00,pause)
> -    has prereqs ip6
> +    has prereqs ip
>  
>  nd_ns { };
>      formats as nd_ns { drop; };
>      encodes as controller(userdata=00.00.00.09.00.00.00.00,pause)
> -    has prereqs ip6
> +    has prereqs ip
>  
>  # nd_na
>  nd_na { eth.src = 12:34:56:78:9a:bc; nd.tll = 12:34:56:78:9a:bc; outport = inport; inport = ""; /* Allow sending out inport. */ output; };
> @@ -40144,6 +40144,266 @@ OVN_CLEANUP([hv1],[hv2])
>  AT_CLEANUP
>  ])
>  
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([2 HVs, 2 LS, 1 lport/LS, LRs connected via LS, IPv4 over IPv6, dynamic])
> +AT_SKIP_IF([test $HAVE_SCAPY = no])
> +ovn_start
> +
> +# Logical network:
> +# Two LRs - R1 and R2 that are connected to ls-transfer in 2001:db8::/64
> +# network. R1 has a switch ls1 (192.168.1.0/24) connected to it.
> +# R2 has ls2 (172.16.1.0/24) connected to it.
> +
> +ls1_lp1_mac="f0:00:00:01:02:03"
> +rp_ls1_mac="00:00:00:01:02:03"
> +rp_ls2_mac="00:00:00:01:02:04"
> +ls2_lp1_mac="f0:00:00:01:02:04"
> +
> +ls1_lp1_ip="192.168.1.2"
> +ls2_lp1_ip="172.16.1.2"
> +
> +check ovn-nbctl lr-add R1
> +check ovn-nbctl lr-add R2
> +
> +check ovn-nbctl ls-add ls1
> +check ovn-nbctl ls-add ls2
> +check ovn-nbctl ls-add ls-transfer
> +
> +# Connect ls1 to R1
> +check ovn-nbctl lrp-add R1 ls1 $rp_ls1_mac 192.168.1.1/24
> +check ovn-nbctl set Logical_Router R1 options:dynamic_neigh_routers=true
> +
> +check ovn-nbctl lsp-add ls1 rp-ls1 -- set Logical_Switch_Port rp-ls1 type=router \
> +  options:router-port=ls1 addresses=\"$rp_ls1_mac\"
> +
> +# Connect ls2 to R2
> +check ovn-nbctl lrp-add R2 ls2 $rp_ls2_mac 172.16.1.1/24
> +check ovn-nbctl set Logical_Router R2 options:dynamic_neigh_routers=true
> +
> +check ovn-nbctl lsp-add ls2 rp-ls2 -- set Logical_Switch_Port rp-ls2 type=router \
> +  options:router-port=ls2 addresses=\"$rp_ls2_mac\"
> +
> +# Connect R1 to R2
> +check ovn-nbctl lrp-add R1 R1_ls-transfer 00:00:00:02:03:04 2001:db8::1/64
> +check ovn-nbctl lrp-add R2 R2_ls-transfer 00:00:00:02:03:05 2001:db8::2/64
> +
> +check ovn-nbctl lsp-add ls-transfer ls-transfer_r1 -- \
> +  set Logical_Switch_Port ls-transfer_r1 type=router \
> +  options:router-port=R1_ls-transfer addresses=\"router\"
> +check ovn-nbctl lsp-add ls-transfer ls-transfer_r2 -- \
> +  set Logical_Switch_Port ls-transfer_r2 type=router \
> +  options:router-port=R2_ls-transfer addresses=\"router\"
> +
> +AT_CHECK([ovn-nbctl lr-route-add R1 "0.0.0.0/0" 2001:db8::2])
> +AT_CHECK([ovn-nbctl lr-route-add R2 "0.0.0.0/0" 2001:db8::1])
> +
> +# Create logical port ls1-lp1 in ls1
> +check ovn-nbctl lsp-add ls1 ls1-lp1 \
> +-- lsp-set-addresses ls1-lp1 "$ls1_lp1_mac $ls1_lp1_ip"
> +
> +# Create logical port ls2-lp1 in ls2
> +check ovn-nbctl lsp-add ls2 ls2-lp1 \
> +-- lsp-set-addresses ls2-lp1 "$ls2_lp1_mac $ls2_lp1_ip"
> +
> +# Create two hypervisors and the OVS ports corresponding to logical ports.
> +net_add n1
> +
> +sim_add hv1
> +as hv1
> +check ovs-vsctl add-br br-phys
> +ovn_attach n1 br-phys 192.168.0.1
> +check ovs-vsctl -- add-port br-int hv1-vif1 -- \
> +    set interface hv1-vif1 external-ids:iface-id=ls1-lp1 \
> +    options:tx_pcap=hv1/vif1-tx.pcap \
> +    options:rxq_pcap=hv1/vif1-rx.pcap \
> +    ofport-request=1
> +
> +sim_add hv2
> +as hv2
> +check ovs-vsctl add-br br-phys
> +ovn_attach n1 br-phys 192.168.0.2
> +check ovs-vsctl -- add-port br-int hv2-vif1 -- \
> +    set interface hv2-vif1 external-ids:iface-id=ls2-lp1 \
> +    options:tx_pcap=hv2/vif1-tx.pcap \
> +    options:rxq_pcap=hv2/vif1-rx.pcap \
> +    ofport-request=1
> +
> +
> +# Pre-populate the hypervisors' ARP tables so that we don't lose any
> +# packets for ARP resolution (native tunneling doesn't queue packets
> +# for ARP resolution).
> +OVN_POPULATE_ARP
> +
> +# Allow some time for ovn-northd and ovn-controller to catch up.
> +wait_for_ports_up
> +check ovn-nbctl --wait=hv sync
> +
> +# Packet to send.
> +packet=$(fmt_pkt "Ether(dst='${rp_ls1_mac}', src='${ls1_lp1_mac}')/ \
> +                        IP(src='${ls1_lp1_ip}', dst='${ls2_lp1_ip}', ttl=64)/ \
> +                        UDP(sport=53, dport=4369)")

Nit: indentation.

> +check as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 "$packet"
> +
> +# Packet to Expect
> +# The TTL should be decremented by 2.
> +expected=$(fmt_pkt "Ether(dst='${ls2_lp1_mac}', src='${rp_ls2_mac}')/ \
> +                        IP(src='${ls1_lp1_ip}', dst='${ls2_lp1_ip}', ttl=62)/ \
> +                        UDP(sport=53, dport=4369)")

Nit: indentation.

> +echo ${expected} > expected
> +OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
> +
> +AT_CHECK([ovn-sbctl dump-flows | grep lr_in_arp_resolve | \
> +grep "reg0 == 172.16.1.2" | wc -l], [0], [1
> +])
> +
> +# Disable the ls2-lp1 port.
> +check ovn-nbctl --wait=hv set logical_switch_port ls2-lp1 enabled=false
> +
> +AT_CHECK([ovn-sbctl dump-flows | grep lr_in_arp_resolve | \
> +grep "reg0 == 172.16.1.2" | wc -l], [0], [0
> +])
> +
> +# Send the same packet again and it should not be delivered
> +check as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 "$packet"
> +
> +# The 2nd packet sent should not be received.
> +OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
> +
> +OVN_CLEANUP([hv1],[hv2])
> +
> +AT_CLEANUP
> +])
> +
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([2 HVs, 2 LS, 1 lport/LS, LRs connected via LS, IPv6 over IPv4, dynamic])
> +AT_SKIP_IF([test $HAVE_SCAPY = no])
> +ovn_start
> +
> +# Logical network:
> +# Two LRs - R1 and R2 that are connected to ls-transfer in 10.0.0.0/24
> +# network. R1 has a switch ls1 (2001:db8:1::/64) connected to it.
> +# R2 has ls2 (2001:db8:2::/64) connected to it.
> +
> +ls1_lp1_mac="f0:00:00:01:02:03"
> +rp_ls1_mac="00:00:00:01:02:03"
> +rp_ls2_mac="00:00:00:01:02:04"
> +ls2_lp1_mac="f0:00:00:01:02:04"
> +
> +ls1_lp1_ip="2001:db8:1::2"
> +ls2_lp1_ip="2001:db8:2::2"
> +
> +check ovn-nbctl lr-add R1
> +check ovn-nbctl lr-add R2
> +
> +check ovn-nbctl ls-add ls1
> +check ovn-nbctl ls-add ls2
> +check ovn-nbctl ls-add ls-transfer
> +
> +# Connect ls1 to R1
> +check ovn-nbctl lrp-add R1 ls1 $rp_ls1_mac 2001:db8:1::1/64
> +check ovn-nbctl set Logical_Router R1 options:dynamic_neigh_routers=true
> +
> +check ovn-nbctl lsp-add ls1 rp-ls1 -- set Logical_Switch_Port rp-ls1 type=router \
> +  options:router-port=ls1 addresses=\"$rp_ls1_mac\"
> +
> +# Connect ls2 to R2
> +check ovn-nbctl lrp-add R2 ls2 $rp_ls2_mac 2001:db8:2::1/64
> +check ovn-nbctl set Logical_Router R2 options:dynamic_neigh_routers=true
> +
> +check ovn-nbctl lsp-add ls2 rp-ls2 -- set Logical_Switch_Port rp-ls2 type=router \
> +  options:router-port=ls2 addresses=\"$rp_ls2_mac\"
> +
> +# Connect R1 to R2
> +check ovn-nbctl lrp-add R1 R1_ls-transfer 00:00:00:02:03:04 10.0.0.1/24
> +check ovn-nbctl lrp-add R2 R2_ls-transfer 00:00:00:02:03:05 10.0.0.2/24
> +
> +check ovn-nbctl lsp-add ls-transfer ls-transfer_r1 -- \
> +  set Logical_Switch_Port ls-transfer_r1 type=router \
> +  options:router-port=R1_ls-transfer addresses=\"router\"
> +check ovn-nbctl lsp-add ls-transfer ls-transfer_r2 -- \
> +  set Logical_Switch_Port ls-transfer_r2 type=router \
> +  options:router-port=R2_ls-transfer addresses=\"router\"
> +
> +AT_CHECK([ovn-nbctl lr-route-add R1 "::/0" 10.0.0.2])
> +AT_CHECK([ovn-nbctl lr-route-add R2 "::/0" 10.0.0.1])
> +
> +# Create logical port ls1-lp1 in ls1
> +check ovn-nbctl lsp-add ls1 ls1-lp1 \
> +-- lsp-set-addresses ls1-lp1 "$ls1_lp1_mac $ls1_lp1_ip"
> +
> +# Create logical port ls2-lp1 in ls2
> +check ovn-nbctl lsp-add ls2 ls2-lp1 \
> +-- lsp-set-addresses ls2-lp1 "$ls2_lp1_mac $ls2_lp1_ip"
> +
> +# Create two hypervisors and the OVS ports corresponding to logical ports.
> +net_add n1
> +
> +sim_add hv1
> +as hv1
> +check ovs-vsctl add-br br-phys
> +ovn_attach n1 br-phys 192.168.0.1
> +check ovs-vsctl -- add-port br-int hv1-vif1 -- \
> +    set interface hv1-vif1 external-ids:iface-id=ls1-lp1 \
> +    options:tx_pcap=hv1/vif1-tx.pcap \
> +    options:rxq_pcap=hv1/vif1-rx.pcap \
> +    ofport-request=1
> +
> +sim_add hv2
> +as hv2
> +check ovs-vsctl add-br br-phys
> +ovn_attach n1 br-phys 192.168.0.2
> +check ovs-vsctl -- add-port br-int hv2-vif1 -- \
> +    set interface hv2-vif1 external-ids:iface-id=ls2-lp1 \
> +    options:tx_pcap=hv2/vif1-tx.pcap \
> +    options:rxq_pcap=hv2/vif1-rx.pcap \
> +    ofport-request=1
> +
> +
> +# Pre-populate the hypervisors' ARP tables so that we don't lose any
> +# packets for ARP resolution (native tunneling doesn't queue packets
> +# for ARP resolution).
> +OVN_POPULATE_ARP
> +
> +# Allow some time for ovn-northd and ovn-controller to catch up.
> +wait_for_ports_up
> +check ovn-nbctl --wait=hv sync
> +
> +# Packet to send.
> +packet=$(fmt_pkt "Ether(dst='${rp_ls1_mac}', src='${ls1_lp1_mac}')/ \
> +                        IPv6(src='${ls1_lp1_ip}', dst='${ls2_lp1_ip}', hlim=64)/ \
> +                        UDP(sport=53, dport=4369)")

Nit: indentation.

> +check as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 "$packet"
> +
> +# Packet to Expect
> +# The hop limit should be decremented by 2.
> +expected=$(fmt_pkt "Ether(dst='${ls2_lp1_mac}', src='${rp_ls2_mac}')/ \
> +                        IPv6(src='${ls1_lp1_ip}', dst='${ls2_lp1_ip}', hlim=62)/ \
> +                        UDP(sport=53, dport=4369)")

Nit: indentation.

> +echo ${expected} > expected
> +OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
> +
> +AT_CHECK([ovn-sbctl dump-flows | grep lr_in_arp_resolve | \
> +grep "reg0 == 2001:db8:2::2" | wc -l], [0], [1
> +])
> +
> +# Disable the ls2-lp1 port.
> +check ovn-nbctl --wait=hv set logical_switch_port ls2-lp1 enabled=false
> +
> +AT_CHECK([ovn-sbctl dump-flows | grep lr_in_arp_resolve | \
> +grep "reg0 == 2001:db8:2::2" | wc -l], [0], [0
> +])
> +
> +# Send the same packet again and it should not be delivered
> +check as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 "$packet"
> +
> +# The 2nd packet sent should not be received.
> +OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
> +
> +OVN_CLEANUP([hv1],[hv2])
> +
> +AT_CLEANUP
> +])
> +
>  OVN_FOR_EACH_NORTHD([
>  AT_SETUP([2 HVs, 2 LS, 1 lport/LS, LRs connected via LS, IPv4 over IPv6, ECMP])
>  AT_SKIP_IF([test $HAVE_SCAPY = no])
> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> index 145399ded..2e7efb919 100644
> --- a/tests/system-ovn.at
> +++ b/tests/system-ovn.at
> @@ -14121,3 +14121,321 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
>  /.*terminating with signal 15.*/d"])
>  AT_CLEANUP
>  ])
> +
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([Routing IPv4 to external network over IPv6 next-hop, Distributed and Centralized NAT])
> +# Logical network:
> +#  * R1 is distributed LR
> +#    * connected to "physical" router "ext1" via IPv6 network
> +#    * connected to alice via LS "sw0" and bob via LS "sw1"
> +#    * provides distributed IPv4 dnat_and_snat for alice
> +#    * provides centralized IPv4 snat for bob
> +#  * ext1 acts as a "physical router" connected to R1 over IPv6 and to further
> +#    external networks via IPv4.
> +# This test ensures connectivity between hosts on internal IPv4 networks
> +# to the external IPv4 networks over IPv6 network connecting R1 and ext1.
> +# +------------+
> +# |    alice   |---+   +--------+
> +# +------------+   +---|   R1   |  IPv6 net. +------------+   |
> +#                      |        |------------|    ext1    |---| IPv4 net.
> +# +------------+   +---|        |            +------------+   |
> +# |     bob    |---+   +--------+
> +# +------------+
> +
> +ovn_start
> +OVS_TRAFFIC_VSWITCHD_START()
> +
> +ADD_BR([br-int])
> +ADD_BR([br-ext])
> +
> +ovs-vsctl set-fail-mode br-ext standalone
> +# Set external-ids in br-int needed for ovn-controller
> +check ovs-vsctl \
> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
> +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true \
> +        -- set Open_vSwitch . external-ids:ovn-bridge-mappings=phynet:br-ext
> +
> +# Start ovn-controller
> +start_daemon ovn-controller
> +
> +# Create router and switch facilitating external connectivity
> +check ovn-nbctl \
> +        -- lr-add R1 \
> +        -- lrp-add R1 lrp-r1-public 00:00:02:ff:ff:01 2001:1db8:ffff::1/64 \
> +        -- ls-add public \
> +        -- lsp-add public lsp-public-r1 \
> +        -- lsp-set-type lsp-public-r1 router \
> +        -- lsp-set-options lsp-public-r1 router-port=lrp-r1-public \
> +        -- lsp-set-addresses lsp-public-r1 router
> +
> +# Create HA Distributed GW Port
> +check ovn-nbctl \
> +        -- ha-chassis-group-add G1 \
> +        -- ha-chassis-group-add-chassis G1 hv1 10
> +
> +group_uuid=$(ovn-nbctl get ha_chassis_group G1 _uuid)
> +check ovn-nbctl set logical_router_port lrp-r1-public ha_chassis_group="$group_uuid"
> +
> +
> +# Interconnect external and internal bridges
> +check ovn-nbctl \
> +        -- lsp-add public  ext-patch \
> +        -- lsp-set-addresses ext-patch unknown \
> +        -- lsp-set-type ext-patch localnet \
> +        -- lsp-set-options ext-patch network_name=phynet
> +
> +check ovn-nbctl --wait=hv sync
> +
> +# Create "external host"
> +ADD_NAMESPACES(ext1)
> +ADD_VETH(ext1, ext1, br-ext, "2001:1db8:ffff::2/64", "00:00:02:ff:ff:02", [], [nodad])
> +OVS_WAIT_UNTIL([NS_EXEC([ext1], [ip a show dev ext1 | grep "fe80::" | grep -v tentative])])
> +
> +# Simulate external IPv4 network "behind" external host
> +NS_CHECK_EXEC([ext1], [ip link add dummy0 type dummy])
> +NS_CHECK_EXEC([ext1], [ip link set dummy0 up])
> +NS_CHECK_EXEC([ext1], [ip addr add 10.42.0.10/32 dev dummy0])
> +
> +# Add IPv4 route over IPv6 next-hop to the router
> +check ovn-nbctl lr-route-add R1 10.42.0.10/24 2001:1db8:ffff::2 lrp-r1-public
> +
> +# Test Distributed NAT
> +# Create internal network and connect it to router
> +check ovn-nbctl \
> +        -- lrp-add R1 lrp-r1-sw0 00:00:03:00:00:01 192.168.100.1/24 \
> +        -- ls-add sw0 \
> +        -- lsp-add sw0 lsp-sw0-r1 \
> +        -- lsp-set-type lsp-sw0-r1 router \
> +        -- lsp-set-options lsp-sw0-r1 router-port=lrp-r1-sw0 \
> +        -- lsp-set-addresses lsp-sw0-r1 router
> +
> +# Create "guest host" alice with distributed dnat_and_snat mapping
> +ADD_NAMESPACES(alice)
> +ADD_VETH(alice, alice, br-int, "192.168.100.10/24", "00:00:03:00:00:02", \
> +         "192.168.100.1")
> +check ovn-nbctl \
> +        -- lsp-add sw0 alice \
> +        -- lsp-set-addresses alice "00:00:03:00:00:02 192.168.100.10/24"
> +
> +check ovn-nbctl lr-nat-add R1 dnat_and_snat 172.16.10.1 192.168.100.10 alice 00:00:04:00:00:01
> +

This is racy; we need to wait for the ports to be up and for
ovn-controller to catch up:

wait_for_ports_up
check ovn-nbctl --wait=hv sync


> +# Add route for R1's external SNAT/DNAT address to external host
> +NS_EXEC([ext1], [ip route add 172.16.10.1/32 via inet6 2001:1db8:ffff::1])
> +
> +# Ping external network from alice via NAT and IPv6 next-hop
> +NS_CHECK_EXEC([alice], [ping -q -c 3 -i 0.3 -w 2 10.42.0.10 | FORMAT_PING], \
> +[0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +# Ping alice's DNAT address from external network
> +NS_EXEC([ext1], [ip neighbor flush dev ext1])
> +NS_CHECK_EXEC([ext1], [ping -q -c 3 -i 0.3 -w 2 -I 10.42.0.10 172.16.10.1 | FORMAT_PING], \
> +[0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +# Test Centralized NAT
> +# Create internal network and connect it to router
> +check ovn-nbctl \
> +        -- lrp-add R1 lrp-r1-sw1 00:00:05:00:00:01 192.168.200.1/24 \
> +        -- ls-add sw1 \
> +        -- lsp-add sw1 lsp-sw1-r1 \
> +        -- lsp-set-type lsp-sw1-r1 router \
> +        -- lsp-set-options lsp-sw1-r1 router-port=lrp-r1-sw1 \
> +        -- lsp-set-addresses lsp-sw1-r1 router
> +
> +# Create "guest host" bob with centralized SNAT
> +ADD_NAMESPACES(bob)
> +ADD_VETH(bob, bob, br-int, "192.168.200.10/24", "00:00:05:00:00:02", \
> +         "192.168.200.1")
> +check ovn-nbctl \
> +        -- lsp-add sw1 bob \
> +        -- lsp-set-addresses bob "00:00:05:00:00:02 192.168.200.10/24"
> +check ovn-nbctl lr-nat-add R1 snat 172.16.10.2 192.168.200.10
> +
> +# Add route for R1's external SNAT address to external host
> +NS_EXEC([ext1], [ip route add 172.16.10.2/32 via inet6 2001:1db8:ffff::1])
> +

Same here: wait for the ports to be up and sync before the ping.

> +# Ping external network from bob via NAT and IPv6 next-hop
> +NS_CHECK_EXEC([bob], [ping -q -c 3 -i 0.3 -w 2 10.42.0.10 | FORMAT_PING], \
> +[0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +
> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> +
> +as ovn-sb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as ovn-nb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as northd
> +OVS_APP_EXIT_AND_WAIT([ovn-northd])
> +
> +as
> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
> +/connection dropped.*/d"])
> +AT_CLEANUP
> +])
> +
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([Routing IPv4 to external network over IPv6 next-hop, Gateway router with NAT])
> +# Logical network:
> +#  * R1 is gateway LR
> +#    * connected to "physical" router "ext1" via IPv6 network
> +#    * connected to alice via LS "sw0" and bob via LS "sw1"
> +#    * provides IPv4 dnat_and_snat for alice
> +#    * provides IPv4 snat for bob
> +#  * ext1 acts as a "physical router" connected to R1 over IPv6 and to further
> +#    external networks via IPv4.
> +# This test ensures connectivity between hosts on internal IPv4 networks
> +# to the external IPv4 networks over IPv6 network connecting R1 and ext1.
> +# +------------+
> +# |    alice   |---+   +--------+
> +# +------------+   +---|   R1   |  IPv6 net. +------------+   |
> +#                      |        |------------|    ext1    |---| IPv4 net.
> +# +------------+   +---|        |            +------------+   |
> +# |     bob    |---+   +--------+
> +# +------------+
> +
> +ovn_start
> +OVS_TRAFFIC_VSWITCHD_START()
> +
> +ADD_BR([br-int])
> +ADD_BR([br-ext])
> +
> +ovs-vsctl set-fail-mode br-ext standalone
> +# Set external-ids in br-int needed for ovn-controller
> +check ovs-vsctl \
> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
> +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true \
> +        -- set Open_vSwitch . external-ids:ovn-bridge-mappings=phynet:br-ext
> +
> +# Start ovn-controller
> +start_daemon ovn-controller
> +
> +# Create gateway router and switch facilitating external connectivity
> +check ovn-nbctl \
> +        -- lr-add R1 \
> +        -- lrp-add R1 lrp-r1-public 00:00:02:ff:ff:01 2001:1db8:ffff::1/64 \
> +        -- set Logical_Router R1 options:chassis=hv1 \
> +        -- ls-add public \
> +        -- lsp-add public lsp-public-r1 \
> +        -- lsp-set-type lsp-public-r1 router \
> +        -- lsp-set-options lsp-public-r1 router-port=lrp-r1-public \
> +        -- lsp-set-addresses lsp-public-r1 router
> +
> +
> +# Interconnect external and internal bridges
> +check ovn-nbctl \
> +        -- lsp-add public  ext-patch \
> +        -- lsp-set-addresses ext-patch unknown \
> +        -- lsp-set-type ext-patch localnet \
> +        -- lsp-set-options ext-patch network_name=phynet
> +
> +check ovn-nbctl --wait=hv sync
> +
> +# Create "external host"
> +ADD_NAMESPACES(ext1)
> +ADD_VETH(ext1, ext1, br-ext, "2001:1db8:ffff::2/64", "00:00:02:ff:ff:02", [], [nodad])
> +OVS_WAIT_UNTIL([NS_EXEC([ext1], [ip a show dev ext1 | grep "fe80::" | grep -v tentative])])
> +
> +# Simulate external IPv4 network "behind" external host
> +NS_CHECK_EXEC([ext1], [ip link add dummy0 type dummy])
> +NS_CHECK_EXEC([ext1], [ip link set dummy0 up])
> +NS_CHECK_EXEC([ext1], [ip addr add 10.42.0.10/32 dev dummy0])
> +
> +# Add IPv4 route over IPv6 next-hop to the router
> +check ovn-nbctl lr-route-add R1 10.42.0.10/24 2001:1db8:ffff::2 lrp-r1-public
> +
> +# Test dnat_and_snat NAT type
> +# Create internal network and connect it to router
> +check ovn-nbctl \
> +        -- lrp-add R1 lrp-r1-sw0 00:00:03:00:00:01 192.168.100.1/24 \
> +        -- ls-add sw0 \
> +        -- lsp-add sw0 lsp-sw0-r1 \
> +        -- lsp-set-type lsp-sw0-r1 router \
> +        -- lsp-set-options lsp-sw0-r1 router-port=lrp-r1-sw0 \
> +        -- lsp-set-addresses lsp-sw0-r1 router
> +
> +# Create "guest host" alice with dnat_and_snat mapping
> +ADD_NAMESPACES(alice)
> +ADD_VETH(alice, alice, br-int, "192.168.100.10/24", "00:00:03:00:00:02", \
> +         "192.168.100.1")
> +check ovn-nbctl \
> +        -- lsp-add sw0 alice \
> +        -- lsp-set-addresses alice "00:00:03:00:00:02 192.168.100.10/24"
> +
> +check ovn-nbctl lr-nat-add R1 dnat_and_snat 172.16.10.1 192.168.100.10
> +
> +# Add route for R1's external SNAT/DNAT address to external host
> +NS_EXEC([ext1], [ip route add 172.16.10.1/32 via inet6 2001:1db8:ffff::1])
> +

Here too.

> +# Ping external network from alice via NAT and IPv6 next-hop
> +NS_CHECK_EXEC([alice], [ping -q -c 3 -i 0.3 -w 2 10.42.0.10 | FORMAT_PING], \
> +[0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +# Ping alice's DNAT address from external network
> +NS_EXEC([ext1], [ip neighbor flush dev ext1])
> +NS_CHECK_EXEC([ext1], [ping -q -c 3 -i 0.3 -w 2 -I 10.42.0.10 172.16.10.1 | FORMAT_PING], \
> +[0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +# Test SNAT
> +# Create internal network and connect it to router
> +check ovn-nbctl \
> +        -- lrp-add R1 lrp-r1-sw1 00:00:05:00:00:01 192.168.200.1/24 \
> +        -- ls-add sw1 \
> +        -- lsp-add sw1 lsp-sw1-r1 \
> +        -- lsp-set-type lsp-sw1-r1 router \
> +        -- lsp-set-options lsp-sw1-r1 router-port=lrp-r1-sw1 \
> +        -- lsp-set-addresses lsp-sw1-r1 router
> +
> +# Create "guest host" bob with SNAT
> +ADD_NAMESPACES(bob)
> +ADD_VETH(bob, bob, br-int, "192.168.200.10/24", "00:00:05:00:00:02", \
> +         "192.168.200.1")
> +check ovn-nbctl \
> +        -- lsp-add sw1 bob \
> +        -- lsp-set-addresses bob "00:00:05:00:00:02 192.168.200.10/24"
> +
> +check ovn-nbctl lr-nat-add R1 snat 172.16.10.2 192.168.200.10
> +
> +# Add route for R1's external SNAT address to external host
> +NS_EXEC([ext1], [ip route add 172.16.10.2/32 via inet6 2001:1db8:ffff::1])
> +

And here.

> +# Ping external network from bob via NAT and IPv6 next-hop
> +NS_CHECK_EXEC([bob], [ping -q -c 3 -i 0.3 -w 2 10.42.0.10 | FORMAT_PING], \
> +[0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +
> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> +
> +as ovn-sb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as ovn-nb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as northd
> +OVS_APP_EXIT_AND_WAIT([ovn-northd])
> +
> +as
> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
> +/connection dropped.*/d"])
> +AT_CLEANUP
> +])
Dumitru Ceara Dec. 17, 2024, 4:39 p.m. UTC | #2
On 12/17/24 4:25 PM, Dumitru Ceara wrote:
> On 12/4/24 4:10 PM, Martin Kalcok wrote:
>> From: Felix Huettner <felix.huettner@mail.schwarz>
>>
>> Previously we could only generate ARP requests from IPv4 packets
>> and NS requests from IPv6 packets. This was the case because we rely on
>> information in the packet to generate the ARP/NS requests.
>>
>> However in case of ARP/NS requests originating from the Logical_Router
>> pipeline for nexthop lookups we overwrite the affected fields
>> afterwards. This overwrite is done by the userdata openflow actions.
>> Because of this we actually do not rely on any information of the IPv4/6
>> packets in these cases.
>>
>> Unfortunately we can not easily determine if we are actually later
>> overwriting the affected fields. The approach now is to use the fields
>> from the IP header if we have a matching IP version and default to some
>> values otherwise. In case we overwrite this data afterwards we are
>> generally good. If we do not overwrite this data because of some bug we
>> will send out invalid ARP/NS requests. They will hopefully be dropped by
>> the rest of the network.
>>
>> The alternative would have been to introduce new arp/nd_ns actions where
>> we guarantee this overwrite. This would not suffer from the above
>> limitations, but would require a coordination on upgrades between all
>> ovn-controllers and northd.
>>
>> Signed-off-by: Felix Huettner <felix.huettner@mail.schwarz>
>> Signed-off-by: Martin Kalcok <martin.kalcok@canonical.com>
>> Co-authored-by: Martin Kalcok <martin.kalcok@canonical.com>
>> ---
> 
> Thanks, Felix and Martin!  This looks good to me.  I only had a few
> minor comments (see inline) but I can take care of those when applying
> the patch to main.
> 
> Just as for patch 2/3, you can find the rebased version here:
> https://github.com/dceara/ovn/commits/bcba1b74
> 
> I'll wait for confirmation that it still looks OK before pushing the
> series to main.
> 

Applied to main (as discussed on patch 2/3).

Regards,
Dumitru
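
For readers skimming the thread, the fallback described in the commit message can be sketched in isolation as below. This is only an illustration under stated assumptions, not the pinctrl code: `struct sketch_flow`, `struct sketch_arp_addrs`, `sketch_arp_addrs_from_flow()` and the `SKETCH_*` constants are hypothetical names made up for this example, and the real handler additionally relies on the userdata actions to overwrite the placeholder addresses afterwards.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htons(), ntohs(), htonl() */

#define SKETCH_ETH_TYPE_IP   0x0800
#define SKETCH_ETH_TYPE_IPV6 0x86dd

/* Hypothetical, simplified stand-in for the few flow fields used here. */
struct sketch_flow {
    uint16_t dl_type;   /* Ethertype, network byte order. */
    uint32_t nw_src;    /* IPv4 source, network byte order. */
    uint32_t nw_dst;    /* IPv4 destination, network byte order. */
};

/* Hypothetical holder for the ARP sender/target protocol addresses. */
struct sketch_arp_addrs {
    uint32_t spa;       /* Network byte order. */
    uint32_t tpa;       /* Network byte order. */
};

/* Fills 'out' for an ARP request triggered by 'flow'.
 *
 * IPv4 flows reuse the addresses from the IP header.  IPv6 flows get
 * all-zero placeholders, on the assumption that a later step overwrites
 * them.  Any other ethertype is rejected.
 *
 * Returns 0 on success, -1 if the flow is neither IPv4 nor IPv6. */
int
sketch_arp_addrs_from_flow(const struct sketch_flow *flow,
                           struct sketch_arp_addrs *out)
{
    assert(flow && out);

    uint16_t dl_type = ntohs(flow->dl_type);

    if (dl_type != SKETCH_ETH_TYPE_IP && dl_type != SKETCH_ETH_TYPE_IPV6) {
        return -1;
    }

    if (dl_type == SKETCH_ETH_TYPE_IP) {
        out->spa = flow->nw_src;
        out->tpa = flow->nw_dst;
    } else {
        out->spa = 0;
        out->tpa = 0;
    }
    return 0;
}
```

The same shape applies to the NS case with the address families swapped: an IPv6 flow supplies the destination address, an IPv4 flow gets a placeholder.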

Patch

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 3fb7e2fd7..b05a0639b 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -1567,9 +1567,11 @@  pinctrl_handle_arp(struct rconn *swconn, const struct flow *ip_flow,
                    const struct ofputil_packet_in *pin,
                    struct ofpbuf *userdata, const struct ofpbuf *continuation)
 {
-    /* This action only works for IP packets, and the switch should only send
-     * us IP packets this way, but check here just to be sure. */
-    if (ip_flow->dl_type != htons(ETH_TYPE_IP)) {
+    uint16_t dl_type = ntohs(ip_flow->dl_type);
+
+    /* This action only works for IPv4 or IPv6 packets, and the switch should
+     * only send us IP packets this way, but check here just to be sure. */
+    if (dl_type != ETH_TYPE_IP && dl_type != ETH_TYPE_IPV6) {
         static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
         VLOG_WARN_RL(&rl, "ARP action on non-IP packet (Ethertype %"PRIx16")",
                      ntohs(ip_flow->dl_type));
@@ -1593,9 +1595,25 @@  pinctrl_handle_arp(struct rconn *swconn, const struct flow *ip_flow,
     struct arp_eth_header *arp = dp_packet_l3(&packet);
     arp->ar_op = htons(ARP_OP_REQUEST);
     arp->ar_sha = ip_flow->dl_src;
-    put_16aligned_be32(&arp->ar_spa, ip_flow->nw_src);
     arp->ar_tha = eth_addr_zero;
-    put_16aligned_be32(&arp->ar_tpa, ip_flow->nw_dst);
+
+    /* We might be here without actually currently handling an IPv4 packet.
+     * This can happen in the case where we route IPv6 packets over an IPv4
+     * link.
+     * In these cases we have no destination IPv4 address from the packet that
+     * we can reuse. But we receive the actual destination IPv4 address via
+     * userdata anyway, so what we set for now is irrelevant.
+     * This is just a hope since we do not parse the userdata. If we land here
+     * for whatever reason without being an IPv4 packet and without userdata we
+     * will send out a wrong packet.
+     */
+    if (ip_flow->dl_type == htons(ETH_TYPE_IP)) {
+        put_16aligned_be32(&arp->ar_spa, ip_flow->nw_src);
+        put_16aligned_be32(&arp->ar_tpa, ip_flow->nw_dst);
+    } else {
+        put_16aligned_be32(&arp->ar_spa, 0);
+        put_16aligned_be32(&arp->ar_tpa, 0);
+    }
 
     if (ip_flow->vlans[0].tci & htons(VLAN_CFI)) {
         eth_push_vlan(&packet, htons(ETH_TYPE_VLAN_8021Q),
@@ -6620,8 +6638,11 @@  pinctrl_handle_nd_ns(struct rconn *swconn, const struct flow *ip_flow,
                      struct ofpbuf *userdata,
                      const struct ofpbuf *continuation)
 {
-    /* This action only works for IPv6 packets. */
-    if (get_dl_type(ip_flow) != htons(ETH_TYPE_IPV6)) {
+    uint16_t dl_type = ntohs(ip_flow->dl_type);
+
+    /* This action only works for IPv4 or IPv6 packets, and the switch should
+     * only send us IP packets this way, but check here just to be sure. */
+    if (dl_type != ETH_TYPE_IP && dl_type != ETH_TYPE_IPV6) {
         static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
         VLOG_WARN_RL(&rl, "NS action on non-IP packet");
         return;
@@ -6637,8 +6658,23 @@  pinctrl_handle_nd_ns(struct rconn *swconn, const struct flow *ip_flow,
     dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);
 
     in6_generate_lla(ip_flow->dl_src, &ipv6_src);
+
+    /* We might be here without actually currently handling an IPv6 packet.
+     * This can happen in the case where we route IPv4 packets over an IPv6
+     * link.
+     * In these cases we have no destination IPv6 address from the packet that
+     * we can reuse. But we receive the actual destination IPv6 address via
+     * userdata anyway, so what we pass to compose_nd_ns is irrelevant.
+     * This is just a hope since we do not parse the userdata. If we land here
+     * for whatever reason without being an IPv6 packet and without userdata we
+     * will send out a wrong packet.
+     */
+    struct in6_addr ipv6_dst = IN6ADDR_EXACT_INIT;
+    if (get_dl_type(ip_flow) == htons(ETH_TYPE_IPV6)) {
+        ipv6_dst = ip_flow->ipv6_dst;
+    }
     compose_nd_ns(&packet, ip_flow->dl_src, &ipv6_src,
-                  &ip_flow->ipv6_dst);
+                  &ipv6_dst);
 
     /* Reload previous packet metadata and set actions from userdata. */
     set_actions_and_enqueue_msg(swconn, &packet,
diff --git a/lib/actions.c b/lib/actions.c
index d5fc30b27..ea30be767 100644
--- a/lib/actions.c
+++ b/lib/actions.c
@@ -1765,7 +1765,7 @@  parse_nested_action(struct action_context *ctx, enum ovnact_type type,
 static void
 parse_ARP(struct action_context *ctx)
 {
-    parse_nested_action(ctx, OVNACT_ARP, "ip4", ctx->scope);
+    parse_nested_action(ctx, OVNACT_ARP, "ip", ctx->scope);
 }
 
 static void
@@ -1819,7 +1819,7 @@  parse_ND_NA_ROUTER(struct action_context *ctx)
 static void
 parse_ND_NS(struct action_context *ctx)
 {
-    parse_nested_action(ctx, OVNACT_ND_NS, "ip6", ctx->scope);
+    parse_nested_action(ctx, OVNACT_ND_NS, "ip", ctx->scope);
 }
 
 static void
diff --git a/northd/northd.c b/northd/northd.c
index 4fb48838b..436e42248 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -14893,7 +14893,8 @@  build_arp_request_flows_for_lrouter(
 
         ds_clear(match);
         ds_put_format(match, "eth.dst == 00:00:00:00:00:00 && "
-                      "ip6 && " REG_NEXT_HOP_IPV6 " == %s",
+                      REGBIT_NEXTHOP_IS_IPV4" == 0 && "
+                      REG_NEXT_HOP_IPV6 " == %s",
                       route->nexthop);
         struct in6_addr sn_addr;
         struct eth_addr eth_dst;
@@ -14923,7 +14924,8 @@  build_arp_request_flows_for_lrouter(
     }
 
     ovn_lflow_metered(lflows, od, S_ROUTER_IN_ARP_REQUEST, 100,
-                      "eth.dst == 00:00:00:00:00:00 && ip4",
+                      "eth.dst == 00:00:00:00:00:00 && "
+                      REGBIT_NEXTHOP_IS_IPV4" == 1",
                       "arp { "
                       "eth.dst = ff:ff:ff:ff:ff:ff; "
                       "arp.spa = " REG_SRC_IPV4 "; "
@@ -14935,7 +14937,8 @@  build_arp_request_flows_for_lrouter(
                                      meter_groups),
                       lflow_ref);
     ovn_lflow_metered(lflows, od, S_ROUTER_IN_ARP_REQUEST, 100,
-                      "eth.dst == 00:00:00:00:00:00 && ip6",
+                      "eth.dst == 00:00:00:00:00:00 && "
+                      REGBIT_NEXTHOP_IS_IPV4" == 0",
                       "nd_ns { "
                       "nd.target = " REG_NEXT_HOP_IPV6 "; "
                       "output; "
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 803823afa..4335baeec 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -6979,10 +6979,10 @@  AT_CHECK([grep -e "lr_in_arp_resolve" lr0flows | ovn_strip_lflows], [0], [dnl
 
 AT_CHECK([grep -e "lr_in_arp_request" lr0flows | ovn_strip_lflows], [0], [dnl
   table=??(lr_in_arp_request  ), priority=0    , match=(1), action=(output;)
-  table=??(lr_in_arp_request  ), priority=100  , match=(eth.dst == 00:00:00:00:00:00 && ip4), action=(arp { eth.dst = ff:ff:ff:ff:ff:ff; arp.spa = reg5; arp.tpa = reg0; arp.op = 1; output; }; output;)
-  table=??(lr_in_arp_request  ), priority=100  , match=(eth.dst == 00:00:00:00:00:00 && ip6), action=(nd_ns { nd.target = xxreg0; output; }; output;)
-  table=??(lr_in_arp_request  ), priority=200  , match=(eth.dst == 00:00:00:00:00:00 && ip6 && xxreg0 == 2001:db8::10), action=(nd_ns { eth.dst = 33:33:ff:00:00:10; ip6.dst = ff02::1:ff00:10; nd.target = 2001:db8::10; output; }; output;)
-  table=??(lr_in_arp_request  ), priority=200  , match=(eth.dst == 00:00:00:00:00:00 && ip6 && xxreg0 == 2001:db8::20), action=(nd_ns { eth.dst = 33:33:ff:00:00:20; ip6.dst = ff02::1:ff00:20; nd.target = 2001:db8::20; output; }; output;)
+  table=??(lr_in_arp_request  ), priority=100  , match=(eth.dst == 00:00:00:00:00:00 && reg9[[9]] == 0), action=(nd_ns { nd.target = xxreg0; output; }; output;)
+  table=??(lr_in_arp_request  ), priority=100  , match=(eth.dst == 00:00:00:00:00:00 && reg9[[9]] == 1), action=(arp { eth.dst = ff:ff:ff:ff:ff:ff; arp.spa = reg5; arp.tpa = reg0; arp.op = 1; output; }; output;)
+  table=??(lr_in_arp_request  ), priority=200  , match=(eth.dst == 00:00:00:00:00:00 && reg9[[9]] == 0 && xxreg0 == 2001:db8::10), action=(nd_ns { eth.dst = 33:33:ff:00:00:10; ip6.dst = ff02::1:ff00:10; nd.target = 2001:db8::10; output; }; output;)
+  table=??(lr_in_arp_request  ), priority=200  , match=(eth.dst == 00:00:00:00:00:00 && reg9[[9]] == 0 && xxreg0 == 2001:db8::20), action=(nd_ns { eth.dst = 33:33:ff:00:00:20; ip6.dst = ff02::1:ff00:20; nd.target = 2001:db8::20; output; }; output;)
 ])
 
 AT_CLEANUP
diff --git a/tests/ovn.at b/tests/ovn.at
index ec90a3b4e..a29ec7114 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -1396,11 +1396,11 @@  clone { ip4.dst = 255.255.255.255; output; }; next;
 # arp
 arp { eth.dst = ff:ff:ff:ff:ff:ff; output; }; output;
     encodes as controller(userdata=00.00.00.00.00.00.00.00.00.19.00.10.80.00.06.06.ff.ff.ff.ff.ff.ff.00.00.ff.ff.00.10.00.00.23.20.00.0e.ff.f8.OFTABLE_SAVE_INPORT_HEX.00.00.00,pause),resubmit(,OFTABLE_SAVE_INPORT)
-    has prereqs ip4
+    has prereqs ip
 arp { };
     formats as arp { drop; };
     encodes as controller(userdata=00.00.00.00.00.00.00.00,pause)
-    has prereqs ip4
+    has prereqs ip
 
 # get_arp
 get_arp(outport, ip4.dst);
@@ -1564,12 +1564,12 @@  reg9[[8]] = dhcp_relay_resp_chk(192.168.1, 172.16.1.1);
 # nd_ns
 nd_ns { nd.target = xxreg0; output; };
     encodes as controller(userdata=00.00.00.09.00.00.00.00.00.1c.00.18.00.80.00.00.00.00.00.00.00.01.de.10.80.00.3e.10.00.00.00.00.ff.ff.00.10.00.00.23.20.00.0e.ff.f8.OFTABLE_SAVE_INPORT_HEX.00.00.00,pause)
-    has prereqs ip6
+    has prereqs ip
 
 nd_ns { };
     formats as nd_ns { drop; };
     encodes as controller(userdata=00.00.00.09.00.00.00.00,pause)
-    has prereqs ip6
+    has prereqs ip
 
 # nd_na
 nd_na { eth.src = 12:34:56:78:9a:bc; nd.tll = 12:34:56:78:9a:bc; outport = inport; inport = ""; /* Allow sending out inport. */ output; };
@@ -40144,6 +40144,266 @@  OVN_CLEANUP([hv1],[hv2])
 AT_CLEANUP
 ])
 
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([2 HVs, 2 LS, 1 lport/LS, LRs connected via LS, IPv4 over IPv6, dynamic])
+AT_SKIP_IF([test $HAVE_SCAPY = no])
+ovn_start
+
+# Logical network:
+# Two LRs - R1 and R2 that are connected to ls-transfer in 2001:db8::/64
+# network. R1 has a switch ls1 (192.168.1.0/24) connected to it.
+# R2 has ls2 (172.16.1.0/24) connected to it.
+
+ls1_lp1_mac="f0:00:00:01:02:03"
+rp_ls1_mac="00:00:00:01:02:03"
+rp_ls2_mac="00:00:00:01:02:04"
+ls2_lp1_mac="f0:00:00:01:02:04"
+
+ls1_lp1_ip="192.168.1.2"
+ls2_lp1_ip="172.16.1.2"
+
+check ovn-nbctl lr-add R1
+check ovn-nbctl lr-add R2
+
+check ovn-nbctl ls-add ls1
+check ovn-nbctl ls-add ls2
+check ovn-nbctl ls-add ls-transfer
+
+# Connect ls1 to R1
+check ovn-nbctl lrp-add R1 ls1 $rp_ls1_mac 192.168.1.1/24
+check ovn-nbctl set Logical_Router R1 options:dynamic_neigh_routers=true
+
+check ovn-nbctl lsp-add ls1 rp-ls1 -- set Logical_Switch_Port rp-ls1 type=router \
+  options:router-port=ls1 addresses=\"$rp_ls1_mac\"
+
+# Connect ls2 to R2
+check ovn-nbctl lrp-add R2 ls2 $rp_ls2_mac 172.16.1.1/24
+check ovn-nbctl set Logical_Router R2 options:dynamic_neigh_routers=true
+
+check ovn-nbctl lsp-add ls2 rp-ls2 -- set Logical_Switch_Port rp-ls2 type=router \
+  options:router-port=ls2 addresses=\"$rp_ls2_mac\"
+
+# Connect R1 to R2
+check ovn-nbctl lrp-add R1 R1_ls-transfer 00:00:00:02:03:04 2001:db8::1/64
+check ovn-nbctl lrp-add R2 R2_ls-transfer 00:00:00:02:03:05 2001:db8::2/64
+
+check ovn-nbctl lsp-add ls-transfer ls-transfer_r1 -- \
+  set Logical_Switch_Port ls-transfer_r1 type=router \
+  options:router-port=R1_ls-transfer addresses=\"router\"
+check ovn-nbctl lsp-add ls-transfer ls-transfer_r2 -- \
+  set Logical_Switch_Port ls-transfer_r2 type=router \
+  options:router-port=R2_ls-transfer addresses=\"router\"
+
+AT_CHECK([ovn-nbctl lr-route-add R1 "0.0.0.0/0" 2001:db8::2])
+AT_CHECK([ovn-nbctl lr-route-add R2 "0.0.0.0/0" 2001:db8::1])
+
+# Create logical port ls1-lp1 in ls1
+check ovn-nbctl lsp-add ls1 ls1-lp1 \
+-- lsp-set-addresses ls1-lp1 "$ls1_lp1_mac $ls1_lp1_ip"
+
+# Create logical port ls2-lp1 in ls2
+check ovn-nbctl lsp-add ls2 ls2-lp1 \
+-- lsp-set-addresses ls2-lp1 "$ls2_lp1_mac $ls2_lp1_ip"
+
+# Create two hypervisors and create OVS ports corresponding to logical ports.
+net_add n1
+
+sim_add hv1
+as hv1
+check ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+check ovs-vsctl -- add-port br-int hv1-vif1 -- \
+    set interface hv1-vif1 external-ids:iface-id=ls1-lp1 \
+    options:tx_pcap=hv1/vif1-tx.pcap \
+    options:rxq_pcap=hv1/vif1-rx.pcap \
+    ofport-request=1
+
+sim_add hv2
+as hv2
+check ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.2
+check ovs-vsctl -- add-port br-int hv2-vif1 -- \
+    set interface hv2-vif1 external-ids:iface-id=ls2-lp1 \
+    options:tx_pcap=hv2/vif1-tx.pcap \
+    options:rxq_pcap=hv2/vif1-rx.pcap \
+    ofport-request=1
+
+
+# Pre-populate the hypervisors' ARP tables so that we don't lose any
+# packets for ARP resolution (native tunneling doesn't queue packets
+# for ARP resolution).
+OVN_POPULATE_ARP
+
+# Allow some time for ovn-northd and ovn-controller to catch up.
+wait_for_ports_up
+check ovn-nbctl --wait=hv sync
+
+# Packet to send.
+packet=$(fmt_pkt "Ether(dst='${rp_ls1_mac}', src='${ls1_lp1_mac}')/ \
+                        IP(src='${ls1_lp1_ip}', dst='${ls2_lp1_ip}', ttl=64)/ \
+                        UDP(sport=53, dport=4369)")
+check as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 "$packet"
+
+# Packet to Expect
+# The TTL should be decremented by 2.
+expected=$(fmt_pkt "Ether(dst='${ls2_lp1_mac}', src='${rp_ls2_mac}')/ \
+                        IP(src='${ls1_lp1_ip}', dst='${ls2_lp1_ip}', ttl=62)/ \
+                        UDP(sport=53, dport=4369)")
+echo ${expected} > expected
+OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
+
+AT_CHECK([ovn-sbctl dump-flows | grep lr_in_arp_resolve | \
+grep "reg0 == 172.16.1.2" | wc -l], [0], [1
+])
+
+# Disable the ls2-lp1 port.
+check ovn-nbctl --wait=hv set logical_switch_port ls2-lp1 enabled=false
+
+AT_CHECK([ovn-sbctl dump-flows | grep lr_in_arp_resolve | \
+grep "reg0 == 172.16.1.2" | wc -l], [0], [0
+])
+
+# Send the same packet again and it should not be delivered
+check as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 "$packet"
+
+# The 2nd packet sent should not be received.
+OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
+
+OVN_CLEANUP([hv1],[hv2])
+
+AT_CLEANUP
+])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([2 HVs, 2 LS, 1 lport/LS, LRs connected via LS, IPv6 over IPv4, dynamic])
+AT_SKIP_IF([test $HAVE_SCAPY = no])
+ovn_start
+
+# Logical network:
+# Two LRs - R1 and R2 that are connected to ls-transfer in 10.0.0.0/24
+# network. R1 has a switch ls1 (2001:db8:1::/64) connected to it.
+# R2 has ls2 (2001:db8:2::/64) connected to it.
+
+ls1_lp1_mac="f0:00:00:01:02:03"
+rp_ls1_mac="00:00:00:01:02:03"
+rp_ls2_mac="00:00:00:01:02:04"
+ls2_lp1_mac="f0:00:00:01:02:04"
+
+ls1_lp1_ip="2001:db8:1::2"
+ls2_lp1_ip="2001:db8:2::2"
+
+check ovn-nbctl lr-add R1
+check ovn-nbctl lr-add R2
+
+check ovn-nbctl ls-add ls1
+check ovn-nbctl ls-add ls2
+check ovn-nbctl ls-add ls-transfer
+
+# Connect ls1 to R1
+check ovn-nbctl lrp-add R1 ls1 $rp_ls1_mac 2001:db8:1::1/64
+check ovn-nbctl set Logical_Router R1 options:dynamic_neigh_routers=true
+
+check ovn-nbctl lsp-add ls1 rp-ls1 -- set Logical_Switch_Port rp-ls1 type=router \
+  options:router-port=ls1 addresses=\"$rp_ls1_mac\"
+
+# Connect ls2 to R2
+check ovn-nbctl lrp-add R2 ls2 $rp_ls2_mac 2001:db8:2::1/64
+check ovn-nbctl set Logical_Router R2 options:dynamic_neigh_routers=true
+
+check ovn-nbctl lsp-add ls2 rp-ls2 -- set Logical_Switch_Port rp-ls2 type=router \
+  options:router-port=ls2 addresses=\"$rp_ls2_mac\"
+
+# Connect R1 to R2
+check ovn-nbctl lrp-add R1 R1_ls-transfer 00:00:00:02:03:04 10.0.0.1/24
+check ovn-nbctl lrp-add R2 R2_ls-transfer 00:00:00:02:03:05 10.0.0.2/24
+
+check ovn-nbctl lsp-add ls-transfer ls-transfer_r1 -- \
+  set Logical_Switch_Port ls-transfer_r1 type=router \
+  options:router-port=R1_ls-transfer addresses=\"router\"
+check ovn-nbctl lsp-add ls-transfer ls-transfer_r2 -- \
+  set Logical_Switch_Port ls-transfer_r2 type=router \
+  options:router-port=R2_ls-transfer addresses=\"router\"
+
+AT_CHECK([ovn-nbctl lr-route-add R1 "::/0" 10.0.0.2])
+AT_CHECK([ovn-nbctl lr-route-add R2 "::/0" 10.0.0.1])
+
+# Create logical port ls1-lp1 in ls1
+check ovn-nbctl lsp-add ls1 ls1-lp1 \
+-- lsp-set-addresses ls1-lp1 "$ls1_lp1_mac $ls1_lp1_ip"
+
+# Create logical port ls2-lp1 in ls2
+check ovn-nbctl lsp-add ls2 ls2-lp1 \
+-- lsp-set-addresses ls2-lp1 "$ls2_lp1_mac $ls2_lp1_ip"
+
+# Create two hypervisors and create OVS ports corresponding to logical ports.
+net_add n1
+
+sim_add hv1
+as hv1
+check ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+check ovs-vsctl -- add-port br-int hv1-vif1 -- \
+    set interface hv1-vif1 external-ids:iface-id=ls1-lp1 \
+    options:tx_pcap=hv1/vif1-tx.pcap \
+    options:rxq_pcap=hv1/vif1-rx.pcap \
+    ofport-request=1
+
+sim_add hv2
+as hv2
+check ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.2
+check ovs-vsctl -- add-port br-int hv2-vif1 -- \
+    set interface hv2-vif1 external-ids:iface-id=ls2-lp1 \
+    options:tx_pcap=hv2/vif1-tx.pcap \
+    options:rxq_pcap=hv2/vif1-rx.pcap \
+    ofport-request=1
+
+
+# Pre-populate the hypervisors' ARP tables so that we don't lose any
+# packets for ARP resolution (native tunneling doesn't queue packets
+# for ARP resolution).
+OVN_POPULATE_ARP
+
+# Allow some time for ovn-northd and ovn-controller to catch up.
+wait_for_ports_up
+check ovn-nbctl --wait=hv sync
+
+# Packet to send.
+packet=$(fmt_pkt "Ether(dst='${rp_ls1_mac}', src='${ls1_lp1_mac}')/ \
+                        IPv6(src='${ls1_lp1_ip}', dst='${ls2_lp1_ip}', hlim=64)/ \
+                        UDP(sport=53, dport=4369)")
+check as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 "$packet"
+
+# Packet to Expect
+# The hop limit should be decremented by 2.
+expected=$(fmt_pkt "Ether(dst='${ls2_lp1_mac}', src='${rp_ls2_mac}')/ \
+                        IPv6(src='${ls1_lp1_ip}', dst='${ls2_lp1_ip}', hlim=62)/ \
+                        UDP(sport=53, dport=4369)")
+echo ${expected} > expected
+OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
+
+AT_CHECK([ovn-sbctl dump-flows | grep lr_in_arp_resolve | \
+grep "reg0 == 2001:db8:2::2" | wc -l], [0], [1
+])
+
+# Disable the ls2-lp1 port.
+check ovn-nbctl --wait=hv set logical_switch_port ls2-lp1 enabled=false
+
+AT_CHECK([ovn-sbctl dump-flows | grep lr_in_arp_resolve | \
+grep "reg0 == 2001:db8:2::2" | wc -l], [0], [0
+])
+
+# Send the same packet again and it should not be delivered
+check as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 "$packet"
+
+# The 2nd packet sent should not be received.
+OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
+
+OVN_CLEANUP([hv1],[hv2])
+
+AT_CLEANUP
+])
+
 OVN_FOR_EACH_NORTHD([
 AT_SETUP([2 HVs, 2 LS, 1 lport/LS, LRs connected via LS, IPv4 over IPv6, ECMP])
 AT_SKIP_IF([test $HAVE_SCAPY = no])
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 145399ded..2e7efb919 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -14121,3 +14121,321 @@  OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
 /.*terminating with signal 15.*/d"])
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([Routing IPv4 to external network over IPv6 next-hop, Distributed and Centralized NAT])
+# Logical network:
+#  * R1 is distributed LR
+#    * connected to "physical" router "ext1" via IPv6 network
+#    * connected to alice via LS "sw0" and bob via LS "sw1"
+#    * provides distributed IPv4 dnat_and_snat for alice
+#    * provides centralized IPv4 snat for bob
+#  * ext1 acts as a "physical router" connected to R1 over IPv6 and to further
+#    external networks via IPv4.
+# This test ensures connectivity between hosts on internal IPv4 networks
+# to the external IPv4 networks over IPv6 network connecting R1 and ext1.
+# +------------+
+# |    alice   |---+   +--------+
+# +------------+   +---|   R1   |  IPv6 net. +------------+   |
+#                      |        |------------|    ext1    |---| IPv4 net.
+# +------------+   +---|        |            +------------+   |
+# |     bob    |---+   +--------+
+# +------------+
+
+ovn_start
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_BR([br-int])
+ADD_BR([br-ext])
+
+ovs-vsctl set-fail-mode br-ext standalone
+# Set external-ids in br-int needed for ovn-controller
+check ovs-vsctl \
+        -- set Open_vSwitch . external-ids:system-id=hv1 \
+        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
+        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
+        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
+        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true \
+        -- set Open_vSwitch . external-ids:ovn-bridge-mappings=phynet:br-ext
+
+# Start ovn-controller
+start_daemon ovn-controller
+
+# Create router and switch facilitating external connectivity
+check ovn-nbctl \
+        -- lr-add R1 \
+        -- lrp-add R1 lrp-r1-public 00:00:02:ff:ff:01 2001:1db8:ffff::1/64 \
+        -- ls-add public \
+        -- lsp-add public lsp-public-r1 \
+        -- lsp-set-type lsp-public-r1 router \
+        -- lsp-set-options lsp-public-r1 router-port=lrp-r1-public \
+        -- lsp-set-addresses lsp-public-r1 router
+
+# Create HA Distributed GW Port
+check ovn-nbctl \
+        -- ha-chassis-group-add G1 \
+        -- ha-chassis-group-add-chassis G1 hv1 10
+
+group_uuid=$(ovn-nbctl get ha_chassis_group G1 _uuid)
+check ovn-nbctl set logical_router_port lrp-r1-public ha_chassis_group="$group_uuid"
+
+
+# Interconnect external and internal bridges
+check ovn-nbctl \
+        -- lsp-add public  ext-patch \
+        -- lsp-set-addresses ext-patch unknown \
+        -- lsp-set-type ext-patch localnet \
+        -- lsp-set-options ext-patch network_name=phynet
+
+check ovn-nbctl --wait=hv sync
+
+# Create "external host"
+ADD_NAMESPACES(ext1)
+ADD_VETH(ext1, ext1, br-ext, "2001:1db8:ffff::2/64", "00:00:02:ff:ff:02", [], [nodad])
+OVS_WAIT_UNTIL([NS_EXEC([ext1], [ip a show dev ext1 | grep "fe80::" | grep -v tentative])])
+
+# Simulate external IPv4 network "behind" external host
+NS_CHECK_EXEC([ext1], [ip link add dummy0 type dummy])
+NS_CHECK_EXEC([ext1], [ip link set dummy0 up])
+NS_CHECK_EXEC([ext1], [ip addr add 10.42.0.10/32 dev dummy0])
+
+# Add IPv4 route over IPv6 next-hop to the router
+check ovn-nbctl lr-route-add R1 10.42.0.10/24 2001:1db8:ffff::2 lrp-r1-public
+
+# Test Distributed NAT
+# Create internal network and connect it to router
+check ovn-nbctl \
+        -- lrp-add R1 lrp-r1-sw0 00:00:03:00:00:01 192.168.100.1/24 \
+        -- ls-add sw0 \
+        -- lsp-add sw0 lsp-sw0-r1 \
+        -- lsp-set-type lsp-sw0-r1 router \
+        -- lsp-set-options lsp-sw0-r1 router-port=lrp-r1-sw0 \
+        -- lsp-set-addresses lsp-sw0-r1 router
+
+# Create "guest host" alice with distributed dnat_and_snat mapping
+ADD_NAMESPACES(alice)
+ADD_VETH(alice, alice, br-int, "192.168.100.10/24", "00:00:03:00:00:02", \
+         "192.168.100.1")
+check ovn-nbctl \
+        -- lsp-add sw0 alice \
+        -- lsp-set-addresses alice "00:00:03:00:00:02 192.168.100.10/24"
+
+check ovn-nbctl lr-nat-add R1 dnat_and_snat 172.16.10.1 192.168.100.10 alice 00:00:04:00:00:01
+
+# Add route for R1's external SNAT/DNAT address to external host
+NS_EXEC([ext1], [ip route add 172.16.10.1/32 via inet6 2001:1db8:ffff::1])
+
+# Ping external network from alice via NAT and IPv6 next-hop
+NS_CHECK_EXEC([alice], [ping -q -c 3 -i 0.3 -w 2 10.42.0.10 | FORMAT_PING], \
+[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+# Ping alice's DNAT address from external network
+NS_EXEC([ext1], [ip neighbor flush dev ext1])
+NS_CHECK_EXEC([ext1], [ping -q -c 3 -i 0.3 -w 2 -I 10.42.0.10 172.16.10.1 | FORMAT_PING], \
+[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+# Test Centralized NAT
+# Create internal network and connect it to router
+check ovn-nbctl \
+        -- lrp-add R1 lrp-r1-sw1 00:00:05:00:00:01 192.168.200.1/24 \
+        -- ls-add sw1 \
+        -- lsp-add sw1 lsp-sw1-r1 \
+        -- lsp-set-type lsp-sw1-r1 router \
+        -- lsp-set-options lsp-sw1-r1 router-port=lrp-r1-sw1 \
+        -- lsp-set-addresses lsp-sw1-r1 router
+
+# Create "guest host" bob with centralized SNAT
+ADD_NAMESPACES(bob)
+ADD_VETH(bob, bob, br-int, "192.168.200.10/24", "00:00:05:00:00:02", \
+         "192.168.200.1")
+check ovn-nbctl \
+        -- lsp-add sw1 bob \
+        -- lsp-set-addresses bob "00:00:05:00:00:02 192.168.200.10/24"
+check ovn-nbctl lr-nat-add R1 snat 172.16.10.2 192.168.200.10
+
+# Add route for R1's external SNAT address to external host
+NS_EXEC([ext1], [ip route add 172.16.10.2/32 via inet6 2001:1db8:ffff::1])
+
+# Ping external network from bob via NAT and IPv6 next-hop
+NS_CHECK_EXEC([bob], [ping -q -c 3 -i 0.3 -w 2 10.42.0.10 | FORMAT_PING], \
+[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+
+OVS_APP_EXIT_AND_WAIT([ovn-controller])
+
+as ovn-sb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as ovn-nb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as northd
+OVS_APP_EXIT_AND_WAIT([ovn-northd])
+
+as
+OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
+/connection dropped.*/d"])
+AT_CLEANUP
+])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([Routing IPv4 to external network over IPv6 next-hop, Gateway router with NAT])
+# Logical network:
+#  * R1 is gateway LR
+#    * connected to "physical" router "ext1" via IPv6 network
+#    * connected to alice via LS "sw0" and bob via LS "sw1"
+#    * provides IPv4 dnat_and_snat for alice
+#    * provides IPv4 snat for bob
+#  * ext1 acts as a "physical router" connected to R1 over IPv6 and to further
+#    external networks via IPv4.
+# This test ensures connectivity between hosts on internal IPv4 networks
+# to the external IPv4 networks over IPv6 network connecting R1 and ext1.
+# +------------+
+# |    alice   |---+   +--------+
+# +------------+   +---|   R1   |  IPv6 net. +------------+   |
+#                      |        |------------|    ext1    |---| IPv4 net.
+# +------------+   +---|        |            +------------+   |
+# |     bob    |---+   +--------+
+# +------------+
+
+ovn_start
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_BR([br-int])
+ADD_BR([br-ext])
+
+ovs-vsctl set-fail-mode br-ext standalone
+# Set external-ids in br-int needed for ovn-controller
+check ovs-vsctl \
+        -- set Open_vSwitch . external-ids:system-id=hv1 \
+        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
+        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
+        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
+        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true \
+        -- set Open_vSwitch . external-ids:ovn-bridge-mappings=phynet:br-ext
+
+# Start ovn-controller
+start_daemon ovn-controller
+
+# Create gateway router and switch facilitating external connectivity
+check ovn-nbctl \
+        -- lr-add R1 \
+        -- lrp-add R1 lrp-r1-public 00:00:02:ff:ff:01 2001:1db8:ffff::1/64 \
+        -- set Logical_Router R1 options:chassis=hv1 \
+        -- ls-add public \
+        -- lsp-add public lsp-public-r1 \
+        -- lsp-set-type lsp-public-r1 router \
+        -- lsp-set-options lsp-public-r1 router-port=lrp-r1-public \
+        -- lsp-set-addresses lsp-public-r1 router
+
+
+# Interconnect external and internal bridges
+check ovn-nbctl \
+        -- lsp-add public  ext-patch \
+        -- lsp-set-addresses ext-patch unknown \
+        -- lsp-set-type ext-patch localnet \
+        -- lsp-set-options ext-patch network_name=phynet
+
+check ovn-nbctl --wait=hv sync
+
+# Create "external host"
+ADD_NAMESPACES(ext1)
+ADD_VETH(ext1, ext1, br-ext, "2001:1db8:ffff::2/64", "00:00:02:ff:ff:02", [], [nodad])
+OVS_WAIT_UNTIL([NS_EXEC([ext1], [ip a show dev ext1 | grep "fe80::" | grep -v tentative])])
+
+# Simulate external IPv4 network "behind" external host
+NS_CHECK_EXEC([ext1], [ip link add dummy0 type dummy])
+NS_CHECK_EXEC([ext1], [ip link set dummy0 up])
+NS_CHECK_EXEC([ext1], [ip addr add 10.42.0.10/32 dev dummy0])
+
+# Add IPv4 route over IPv6 next-hop to the router
+check ovn-nbctl lr-route-add R1 10.42.0.10/24 2001:1db8:ffff::2 lrp-r1-public
+
+# Test dnat_and_snat NAT type
+# Create internal network and connect it to router
+check ovn-nbctl \
+        -- lrp-add R1 lrp-r1-sw0 00:00:03:00:00:01 192.168.100.1/24 \
+        -- ls-add sw0 \
+        -- lsp-add sw0 lsp-sw0-r1 \
+        -- lsp-set-type lsp-sw0-r1 router \
+        -- lsp-set-options lsp-sw0-r1 router-port=lrp-r1-sw0 \
+        -- lsp-set-addresses lsp-sw0-r1 router
+
+# Create "guest host" alice with dnat_and_snat mapping
+ADD_NAMESPACES(alice)
+ADD_VETH(alice, alice, br-int, "192.168.100.10/24", "00:00:03:00:00:02", \
+         "192.168.100.1")
+check ovn-nbctl \
+        -- lsp-add sw0 alice \
+        -- lsp-set-addresses alice "00:00:03:00:00:02 192.168.100.10/24"
+
+check ovn-nbctl lr-nat-add R1 dnat_and_snat 172.16.10.1 192.168.100.10
+
+# Add route for R1's external SNAT/DNAT address to external host
+NS_EXEC([ext1], [ip route add 172.16.10.1/32 via inet6 2001:1db8:ffff::1])
+
+# Ping external network from alice via NAT and IPv6 next-hop
+NS_CHECK_EXEC([alice], [ping -q -c 3 -i 0.3 -w 2 10.42.0.10 | FORMAT_PING], \
+[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+# Ping alice's DNAT address from external network
+NS_EXEC([ext1], [ip neighbor flush dev ext1])
+NS_CHECK_EXEC([ext1], [ping -q -c 3 -i 0.3 -w 2 -I 10.42.0.10 172.16.10.1 | FORMAT_PING], \
+[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+# Test SNAT
+# Create internal network and connect it to router
+check ovn-nbctl \
+        -- lrp-add R1 lrp-r1-sw1 00:00:05:00:00:01 192.168.200.1/24 \
+        -- ls-add sw1 \
+        -- lsp-add sw1 lsp-sw1-r1 \
+        -- lsp-set-type lsp-sw1-r1 router \
+        -- lsp-set-options lsp-sw1-r1 router-port=lrp-r1-sw1 \
+        -- lsp-set-addresses lsp-sw1-r1 router
+
+# Create "guest host" bob with SNAT
+ADD_NAMESPACES(bob)
+ADD_VETH(bob, bob, br-int, "192.168.200.10/24", "00:00:05:00:00:02", \
+         "192.168.200.1")
+check ovn-nbctl \
+        -- lsp-add sw1 bob \
+        -- lsp-set-addresses bob "00:00:05:00:00:02 192.168.200.10/24"
+
+check ovn-nbctl lr-nat-add R1 snat 172.16.10.2 192.168.200.10
+
+# Add route for R1's external SNAT address to external host
+NS_EXEC([ext1], [ip route add 172.16.10.2/32 via inet6 2001:1db8:ffff::1])
+
+# Ping external network from bob via NAT and IPv6 next-hop
+NS_CHECK_EXEC([bob], [ping -q -c 3 -i 0.3 -w 2 10.42.0.10 | FORMAT_PING], \
+[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+
+OVS_APP_EXIT_AND_WAIT([ovn-controller])
+
+as ovn-sb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as ovn-nb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as northd
+OVS_APP_EXIT_AND_WAIT([ovn-northd])
+
+as
+OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
+/connection dropped.*/d"])
+AT_CLEANUP
+])