diff mbox series

[ovs-dev,v2] Reply only for the multicast ND solicitations.

Message ID 20240815030257.1091794-1-numans@ovn.org
State Changes Requested
Headers show
Series [ovs-dev,v2] Reply only for the multicast ND solicitations. | expand

Checks

Context Check Description
ovsrobot/apply-robot warning apply and check: warning
ovsrobot/github-robot-_Build_and_Test fail github build: failed
ovsrobot/github-robot-_ovn-kubernetes success github build: passed

Commit Message

Numan Siddique Aug. 15, 2024, 3:02 a.m. UTC
From: Numan Siddique <numans@ovn.org>

IPv6 ND Solicitation (NS) responder logical flows match on ip6.dst
field.  These flows when translated to datapath flows also match on
ip6.dst, which means a separate datapath flow per destination IP
address.  This may cause significant performance issues in some
setups (particularly ovs-dpdk telco deployments).

This patch addresses this issue by matching on eth.mcast6 so that
datapath flows for normal IPv6 traffic doesn't have to match on
ip6.dst.  IPv6 NS packets are generally multicast.  A new logical
match "nd_ns_mcast" is added for this purpose.

After this patch, We no longer respond to IPv6 NS unicast packets.
Let the target reply to it, so that the sender has the ability to
monitor the targe liveness via the unicast ND solicitations.
This behavior now matches the IPv4 ARP responder flows.  Note that
after the commit [1] which was recently added we now only respond
to IPv4 ARP broadcast packets.

A recent patch [2] from Ilya partially addressed the same datapath
flow explosion issue by matching on eth.mcast6 for MLD packets.
With this patch, we now completely address the datapath flow
explosion issue for IPv6 traffic provided all the logical ports
of a logical switch are not configured with port security.

[1] - c48ed1736a58 ("Do not reply on unicast arps for IPv4 targets.")
[2] - 43c34f2e6676 ("logical-fields: Add missing multicast matches for MLD and IGMP.")

Note: Documentation for 'eth.mcastv6' and 'ip6.mcast' predicates were
missing from ovn-sb.xml and this patch adds it.

Reported-at: https://issues.redhat.com/browse/FDP-728
Reported-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Numan Siddique <numans@ovn.org>
---

v1 -> v2
-------
  - Addressed review comments from Ilya.

 lib/logical-fields.c |   3 ++
 northd/northd.c      |  21 +++++---
 ovn-sb.xml           |   3 ++
 tests/ovn-northd.at  |   4 +-
 tests/ovn.at         |   4 +-
 tests/system-ovn.at  | 123 +++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 146 insertions(+), 12 deletions(-)

Comments

Dumitru Ceara Aug. 15, 2024, 10:37 a.m. UTC | #1
On 8/15/24 05:02, numans@ovn.org wrote:
> From: Numan Siddique <numans@ovn.org>
> 
> IPv6 ND Solicitation (NS) responder logical flows match on ip6.dst
> field.  These flows when translated to datapath flows also match on
> ip6.dst, which means a separate datapath flow per destination IP
> address.  This may cause significant performance issues in some
> setups (particularly ovs-dpdk telco deployments).
> 
> This patch addresses this issue by matching on eth.mcast6 so that
> datapath flows for normal IPv6 traffic doesn't have to match on
> ip6.dst.  IPv6 NS packets are generally multicast.  A new logical
> match "nd_ns_mcast" is added for this purpose.
> 
> After this patch, We no longer respond to IPv6 NS unicast packets.
> Let the target reply to it, so that the sender has the ability to
> monitor the targe liveness via the unicast ND solicitations.
> This behavior now matches the IPv4 ARP responder flows.  Note that
> after the commit [1] which was recently added we now only respond
> to IPv4 ARP broadcast packets.
> 
> A recent patch [2] from Ilya partially addressed the same datapath
> flow explosion issue by matching on eth.mcast6 for MLD packets.
> With this patch, we now completely address the datapath flow
> explosion issue for IPv6 traffic provided all the logical ports
> of a logical switch are not configured with port security.
> 
> [1] - c48ed1736a58 ("Do not reply on unicast arps for IPv4 targets.")
> [2] - 43c34f2e6676 ("logical-fields: Add missing multicast matches for MLD and IGMP.")
> 
> Note: Documentation for 'eth.mcastv6' and 'ip6.mcast' predicates were
> missing from ovn-sb.xml and this patch adds it.
> 
> Reported-at: https://issues.redhat.com/browse/FDP-728
> Reported-by: Mike Pattrick <mkp@redhat.com>
> Signed-off-by: Numan Siddique <numans@ovn.org>
> ---
> 
> v1 -> v2
> -------
>   - Addressed review comments from Ilya.
> 

Hi Numan,

For proxy-arp we still match on unicast ND as far as I can tell.  I'm
not sure if we can remove that part though.  Maybe it's something to
consider as a follow up?

build_lswitch_arp_nd_responder_known_ips()
[...]
        /* Add IPv6 NDP responses.
         * For ND solicitations, we need to listen for both the
         * unicast IPv6 address and its all-nodes multicast address,
         * but always respond with the unicast IPv6 address. */
        if (op->proxy_arp_addrs.n_ipv6_addrs) {
            struct ds ip6_dst_match = DS_EMPTY_INITIALIZER;
            struct ds nd_target_match = DS_EMPTY_INITIALIZER;
            for (size_t j = 0; j < op->proxy_arp_addrs.n_ipv6_addrs; j++) {
                ds_put_format(&ip6_dst_match, "%s/%u, %s/%u, ",
                        op->proxy_arp_addrs.ipv6_addrs[j].addr_s,
                        op->proxy_arp_addrs.ipv6_addrs[j].plen,
                        op->proxy_arp_addrs.ipv6_addrs[j].sn_addr_s,
                        op->proxy_arp_addrs.ipv6_addrs[j].plen);
                ds_put_format(&nd_target_match, "%s/%u, ",
                        op->proxy_arp_addrs.ipv6_addrs[j].addr_s,
                        op->proxy_arp_addrs.ipv6_addrs[j].plen);
            }

Aside from that I have a few small comments on the test.  In general
the change makes sense to me.

Thanks,
Dumitru

>  lib/logical-fields.c |   3 ++
>  northd/northd.c      |  21 +++++---
>  ovn-sb.xml           |   3 ++
>  tests/ovn-northd.at  |   4 +-
>  tests/ovn.at         |   4 +-
>  tests/system-ovn.at  | 123 +++++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 146 insertions(+), 12 deletions(-)
> 
> diff --git a/lib/logical-fields.c b/lib/logical-fields.c
> index 2c9d3c61bf..5a8b53f2b6 100644
> --- a/lib/logical-fields.c
> +++ b/lib/logical-fields.c
> @@ -293,6 +293,9 @@ ovn_init_symtab(struct shash *symtab)
>                "icmp6.type == {135, 136} && icmp6.code == 0 && ip.ttl == 255");
>      expr_symtab_add_predicate(symtab, "nd_ns",
>                "icmp6.type == 135 && icmp6.code == 0 && ip.ttl == 255");
> +    expr_symtab_add_predicate(symtab, "nd_ns_mcast",
> +              "ip6.mcast && icmp6.type == 135 && icmp6.code == 0 && "
> +              "ip.ttl == 255");
>      expr_symtab_add_predicate(symtab, "nd_na",
>                "icmp6.type == 136 && icmp6.code == 0 && ip.ttl == 255");
>      expr_symtab_add_predicate(symtab, "nd_rs",
> diff --git a/northd/northd.c b/northd/northd.c
> index c4824aae3b..c9ae1e958f 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -9633,16 +9633,21 @@ build_lswitch_arp_nd_responder_known_ips(struct ovn_port *op,
>                                                    op->lflow_ref);
>              }
>  
> -            /* For ND solicitations, we need to listen for both the
> -             * unicast IPv6 address and its all-nodes multicast address,
> -             * but always respond with the unicast IPv6 address. */
> +            /* For ND solicitations:
> +             *   - Reply only for the all-nodes multicast address(es) of the
> +             *     logical port IPv6 address(es).
> +             *
> +             *   - Do not reply for unicast ND solicitations.  Let the target
> +             *     reply it, so that the sender has the ability to monitor
> +             *     the targe liveness via the unicast ND solicitations.
> +             */
>              for (size_t j = 0; j < op->lsp_addrs[i].n_ipv6_addrs; j++) {
>                  ds_clear(match);
> -                ds_put_format(match,
> -                        "nd_ns && ip6.dst == {%s, %s} && nd.target == %s",
> -                        op->lsp_addrs[i].ipv6_addrs[j].addr_s,
> -                        op->lsp_addrs[i].ipv6_addrs[j].sn_addr_s,
> -                        op->lsp_addrs[i].ipv6_addrs[j].addr_s);
> +                ds_put_format(
> +                    match,
> +                    "nd_ns_mcast && ip6.dst == %s && nd.target == %s",
> +                    op->lsp_addrs[i].ipv6_addrs[j].sn_addr_s,
> +                    op->lsp_addrs[i].ipv6_addrs[j].addr_s);
>  
>                  ds_clear(actions);
>                  ds_put_format(actions,
> diff --git a/ovn-sb.xml b/ovn-sb.xml
> index de0bd636fd..c11296d7c4 100644
> --- a/ovn-sb.xml
> +++ b/ovn-sb.xml
> @@ -1146,6 +1146,7 @@
>        <ul>
>          <li><code>eth.bcast</code> expands to <code>eth.dst == ff:ff:ff:ff:ff:ff</code></li>
>          <li><code>eth.mcast</code> expands to <code>eth.dst[40]</code></li>
> +        <li><code>eth.mcastv6</code> expands to <code>eth.dst[32..47] == 0x3333</code></li>
>          <li><code>vlan.present</code> expands to <code>vlan.tci[12]</code></li>
>          <li><code>ip4</code> expands to <code>eth.type == 0x800</code></li>
>          <li><code>ip4.src_mcast</code> expands to
> @@ -1161,8 +1162,10 @@
>          <li><code>ip.first_frag</code> expands to <code>ip.is_frag &amp;&amp; !ip.later_frag</code></li>
>          <li><code>arp</code> expands to <code>eth.type == 0x806</code></li>
>          <li><code>rarp</code> expands to <code>eth.type == 0x8035</code></li>
> +        <li><code>ip6.mcast</code> expands to <code>eth.mcastv6 &amp;&amp; ip6.dst[120..127] == 0xff</code></li>
>          <li><code>nd</code> expands to <code>icmp6.type == {135, 136} &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
>          <li><code>nd_ns</code> expands to <code>icmp6.type == 135 &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
> +        <li><code>nd_ns_mcast</code> expands to <code>ip6.mcast  &amp;&amp; icmp6.type == 135 &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
>          <li><code>nd_na</code> expands to <code>icmp6.type == 136 &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
>          <li><code>nd_rs</code> expands to <code>icmp6.type == 133 &amp;&amp;
>          icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 3c49c0d43b..93ccbce6b0 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -9503,9 +9503,9 @@ AT_CAPTURE_FILE([S1flows])
>  AT_CHECK([grep -e "ls_in_arp_rsp" S1flows | ovn_strip_lflows], [0], [dnl
>    table=??(ls_in_arp_rsp      ), priority=0    , match=(1), action=(next;)
>    table=??(ls_in_arp_rsp      ), priority=100  , match=(arp.tpa == 192.168.0.10 && arp.op == 1 && eth.dst == ff:ff:ff:ff:ff:ff && inport == "S1-vm1"), action=(next;)
> -  table=??(ls_in_arp_rsp      ), priority=100  , match=(nd_ns && ip6.dst == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10 && inport == "S1-vm1"), action=(next;)
> +  table=??(ls_in_arp_rsp      ), priority=100  , match=(nd_ns_mcast && ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10 && inport == "S1-vm1"), action=(next;)
>    table=??(ls_in_arp_rsp      ), priority=50   , match=(arp.tpa == 192.168.0.10 && arp.op == 1 && eth.dst == ff:ff:ff:ff:ff:ff), action=(eth.dst = eth.src; eth.src = 50:54:00:00:00:10; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 50:54:00:00:00:10; arp.tpa = arp.spa; arp.spa = 192.168.0.10; outport = inport; flags.loopback = 1; output;)
> -  table=??(ls_in_arp_rsp      ), priority=50   , match=(nd_ns && ip6.dst == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10), action=(nd_na { eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; };)
> +  table=??(ls_in_arp_rsp      ), priority=50   , match=(nd_ns_mcast && ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10), action=(nd_na { eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; };)
>  ])
>  
>  #Set the disable_arp_nd_rsp option and verify the flow
> diff --git a/tests/ovn.at b/tests/ovn.at
> index a1d689e849..7583da7e07 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -8617,7 +8617,7 @@ done
>  # Complete Neighbor Solicitation packet and Neighbor Advertisement packet
>  # vif1 -> NS -> vif2.  vif1 <- NA <- ovn-controller.
>  # vif2 will not receive NS packet, since ovn-controller will reply for it.
> -ns_packet=3333ffa1f9aefa163e94059886dd6000000000203afffd81ce49a9480000f8163efffe940598fd81ce49a9480000f8163efffea1f9ae8700e01160000000fd81ce49a9480000f8163efffea1f9ae0101fa163e940598
> +ns_packet=3333ffa1f9aefa163e94059886dd6000000000203afffd81ce49a9480000f8163efffe940598ff0200000000000000000001ffa1f9ae8700e01160000000fd81ce49a9480000f8163efffea1f9ae0101fa163e940598
>  na_packet=fa163e940598fa163ea1f9ae86dd6000000000203afffd81ce49a9480000f8163efffea1f9aefd81ce49a9480000f8163efffe9405988800e9ed60000000fd81ce49a9480000f8163efffea1f9ae0201fa163ea1f9ae
>  
>  as hv1 ovs-appctl netdev-dummy/receive vif1 $ns_packet
> @@ -14417,7 +14417,7 @@ test_ns_na() {
>      local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5
>  
>      packet=$(fmt_pkt "
> -        Ether(dst='ff:ff:ff:ff:ff:ff', src='${src_mac}') /
> +        Ether(dst='33:33:ff:ff:ff:ff', src='${src_mac}') /
>          IPv6(src='${src_ip}', dst='ff02::1:ff00:2') /
>          ICMPv6ND_NS(tgt='${dst_ip}')
>      ")
> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> index 6e4ec42473..e43244ca60 100644
> --- a/tests/system-ovn.at
> +++ b/tests/system-ovn.at
> @@ -13899,3 +13899,126 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
>  /.*terminating with signal 15.*/d"])
>  AT_CLEANUP
>  ])
> +
> +dnl This test sends IPv6 TCP packets between two lports of the same
> +dnl logical switch and checks that the datapath flows don't match on
> +dnl IPv6 source and destination addresses.
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([ovn -- IPv6 switching - datapath flows])
> +AT_KEYWORDS([IPv6])
> +
> +ovn_start
> +OVS_TRAFFIC_VSWITCHD_START()
> +ADD_BR([br-int])
> +
> +dnl Set external-ids in br-int needed for ovn-controller
> +check ovs-vsctl \
> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
> +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
> +
> +dnl Start ovn-controller
> +start_daemon ovn-controller
> +
> +check ovn-nbctl ls-add sw0
> +
> +check ovn-nbctl lsp-add sw0 vm0
> +check ovn-nbctl lsp-set-addresses vm0 "f0:00:0f:01:02:03 10.0.0.3 1000::3"
> +
> +check ovn-nbctl lsp-add sw0 vm1
> +check ovn-nbctl lsp-set-addresses vm1 "f0:00:0f:01:02:04 10.0.0.4 1000::4"
> +
> +check ovn-nbctl --wait=hv sync
> +
> +ADD_NAMESPACES(vm0)
> +ADD_VETH(vm0, vm0, br-int, "1000::3/64", "f0:00:0f:01:02:03", "1000::1", "nodad")
> +
> +ADD_NAMESPACES(vm1)
> +ADD_VETH(vm1, vm1, br-int, "1000::4/64", "f0:00:0f:01:02:04", "1000::1", "nodad")
> +

It's probably best to ensure OVN bound the ports here, e.g.:

dnl Wait for ovn-controller to catch up.
wait_for_ports_up
check ovn-nbctl --wait=hv sync

The sync before ADD_NAMESPACES can probably be removed then.

> +dnl Checks the datapath flows and makes sure that there are no matches to
> +dnl IPv6 src or dst fields.
> +check_dp_flows() {
> +  ipv6_src_dst_match=$1
> +  dnl Dump the flows. Will be helpful while debugging.
> +  ovs-appctl dpctl/dump-flows

It might be better to change this to something like:

ovs-appctl dpctl/dump-flows > dp-flows
AT_CAPTURE_FILE([dp-flows])

> +
> +  eth_match1="eth(src=f0:00:0f:01:02:03,dst=f0:00:0f:01:02:04)"
> +  eth_match2="eth(src=f0:00:0f:01:02:04,dst=f0:00:0f:01:02:03)"
> +  ipv6_src_match="ipv6(src="
> +  ipv6_dst_match="ipv6(dst="
> +
> +  AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" -c], [0], [dnl
> +2
> +])
> +
> +  AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" -c], [0], [dnl
> +2
> +])
> +
> +  if [[ "$ipv6_src_dst_match" == "yes" ]]; then
> +    dnl Check that the dp flow for eth(src=f0:00:0f:01:02:03,dst=f0:00:0f:01:02:04)
> +    dnl matches on IPv6 src or dst.
> +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" | grep -e "$ipv6_src_match" -e "$ipv6_dst_match" -c], [0], [dnl
> +2
> +])
> +
> +    dnl Check that the dp flow for eth(src=f0:00:0f:01:02:04,dst=f0:00:0f:01:02:03)
> +    dnl matches on IPv6 src or dst.
> +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" | grep -e "$ipv6_src_match" -e "$ipv6_dst_match" -c], [0], [dnl
> +2
> +])
> +  else
> +    dnl Check that the dp flow for eth(src=f0:00:0f:01:02:03,dst=f0:00:0f:01:02:04)
> +    dnl doesn't match on IPv6 src or dst.
> +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" | grep "$ipv6_src_match"], [1], [])
> +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" | grep "$ipv6_dst_match"], [1], [])
> +
> +    dnl Check that the dp flow for eth(src=f0:00:0f:01:02:04,dst=f0:00:0f:01:02:03)
> +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" | grep "$ipv6_src_match"], [1], [])
> +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" | grep "$ipv6_dst_match"], [1], [])
> +  fi
> +}
> +
> +NS_CHECK_EXEC([vm0], [ping6 -q -c 3 -i 0.3 -w 2 1000::4 | FORMAT_PING], \
> +[0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +# Start IPv6 TCP server on vm0.
> +NETNS_DAEMONIZE([vm0], [nc -6 -l -k 1000::3 4242], [nc-vm1-1.pid])
> +NS_CHECK_EXEC([vm1], [nc -6 1000::3 4242 -z], [0], [ignore], [ignore])
> +
> +check_dp_flows no
> +
> +# Now configure port security on vm0.  dp flows should now match on IPv6 fields.
> +check ovn-nbctl lsp-set-port-security vm0 "f0:00:0f:01:02:03 10.0.0.3 1000::3"
> +check ovn-nbctl --wait=hv sync
> +
> +NS_CHECK_EXEC([vm0], [ping6 -q -c 3 -i 0.3 -w 2 1000::4 | FORMAT_PING], \
> +[0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +NS_CHECK_EXEC([vm1], [nc -6 1000::3 4242 -z], [0], [ignore], [ignore])
> +check_dp_flows yes
> +

The test failed here with the userspace datapath in the ovsrobot run:
https://github.com/ovsrobot/ovn/actions/runs/10398228623/job/28795300825

Is it a matter of waiting for output instead of checking for datapath flows
immediately?  Alternatively do we need to wait for the revalidator to run
(ovs-appctl revalidator/wait)?

> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> +
> +as ovn-sb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as ovn-nb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as northd
> +OVS_APP_EXIT_AND_WAIT([ovn-northd])
> +
> +as
> +OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
> +/failed to query port patch-.*/d
> +/.*terminating with signal 15.*/d"])
> +AT_CLEANUP
> +])
Numan Siddique Aug. 15, 2024, 4:56 p.m. UTC | #2
On Thu, Aug 15, 2024 at 6:38 AM Dumitru Ceara <dceara@redhat.com> wrote:
>
> On 8/15/24 05:02, numans@ovn.org wrote:
> > From: Numan Siddique <numans@ovn.org>
> >
> > IPv6 ND Solicitation (NS) responder logical flows match on ip6.dst
> > field.  These flows when translated to datapath flows also match on
> > ip6.dst, which means a separate datapath flow per destination IP
> > address.  This may cause significant performance issues in some
> > setups (particularly ovs-dpdk telco deployments).
> >
> > This patch addresses this issue by matching on eth.mcast6 so that
> > datapath flows for normal IPv6 traffic doesn't have to match on
> > ip6.dst.  IPv6 NS packets are generally multicast.  A new logical
> > match "nd_ns_mcast" is added for this purpose.
> >
> > After this patch, We no longer respond to IPv6 NS unicast packets.
> > Let the target reply to it, so that the sender has the ability to
> > monitor the targe liveness via the unicast ND solicitations.
> > This behavior now matches the IPv4 ARP responder flows.  Note that
> > after the commit [1] which was recently added we now only respond
> > to IPv4 ARP broadcast packets.
> >
> > A recent patch [2] from Ilya partially addressed the same datapath
> > flow explosion issue by matching on eth.mcast6 for MLD packets.
> > With this patch, we now completely address the datapath flow
> > explosion issue for IPv6 traffic provided all the logical ports
> > of a logical switch are not configured with port security.
> >
> > [1] - c48ed1736a58 ("Do not reply on unicast arps for IPv4 targets.")
> > [2] - 43c34f2e6676 ("logical-fields: Add missing multicast matches for MLD and IGMP.")
> >
> > Note: Documentation for 'eth.mcastv6' and 'ip6.mcast' predicates were
> > missing from ovn-sb.xml and this patch adds it.
> >
> > Reported-at: https://issues.redhat.com/browse/FDP-728
> > Reported-by: Mike Pattrick <mkp@redhat.com>
> > Signed-off-by: Numan Siddique <numans@ovn.org>
> > ---
> >
> > v1 -> v2
> > -------
> >   - Addressed review comments from Ilya.
> >
>
> Hi Numan,
>
> For proxy-arp we still match on unicast ND as far as I can tell.  I'm
> not sure if we can remove that part though.  Maybe it's something to
> consider as a follow up?

Hi Dumitru,
Thanks for the review.  If a logical switch has  a router port with
proxy_arp configured
with IPv6 addresses then the dp flow would match on the ipv6 src/dst fields.

I've updated the commit message with this limitation.  I'm not sure if
we can avoid matching
on unicast ND solicitation for proxy_arp.  I need to think about it.
I guess it can be a follow up
if it's possible to address that.

I've submitted v3.  Please take a look -
https://patchwork.ozlabs.org/project/ovn/patch/20240815165222.1144250-1-numans@ovn.org/.
I removed the  system test and added the test in ovn.at instead so
that the test is reliable.

Thanks
Numan

>
> build_lswitch_arp_nd_responder_known_ips()
> [...]
>         /* Add IPv6 NDP responses.
>          * For ND solicitations, we need to listen for both the
>          * unicast IPv6 address and its all-nodes multicast address,
>          * but always respond with the unicast IPv6 address. */
>         if (op->proxy_arp_addrs.n_ipv6_addrs) {
>             struct ds ip6_dst_match = DS_EMPTY_INITIALIZER;
>             struct ds nd_target_match = DS_EMPTY_INITIALIZER;
>             for (size_t j = 0; j < op->proxy_arp_addrs.n_ipv6_addrs; j++) {
>                 ds_put_format(&ip6_dst_match, "%s/%u, %s/%u, ",
>                         op->proxy_arp_addrs.ipv6_addrs[j].addr_s,
>                         op->proxy_arp_addrs.ipv6_addrs[j].plen,
>                         op->proxy_arp_addrs.ipv6_addrs[j].sn_addr_s,
>                         op->proxy_arp_addrs.ipv6_addrs[j].plen);
>                 ds_put_format(&nd_target_match, "%s/%u, ",
>                         op->proxy_arp_addrs.ipv6_addrs[j].addr_s,
>                         op->proxy_arp_addrs.ipv6_addrs[j].plen);
>             }
>
> Aside from that I have a few small comments on the test.  In general
> the change makes sense to me.
>
> Thanks,
> Dumitru
>
> >  lib/logical-fields.c |   3 ++
> >  northd/northd.c      |  21 +++++---
> >  ovn-sb.xml           |   3 ++
> >  tests/ovn-northd.at  |   4 +-
> >  tests/ovn.at         |   4 +-
> >  tests/system-ovn.at  | 123 +++++++++++++++++++++++++++++++++++++++++++
> >  6 files changed, 146 insertions(+), 12 deletions(-)
> >
> > diff --git a/lib/logical-fields.c b/lib/logical-fields.c
> > index 2c9d3c61bf..5a8b53f2b6 100644
> > --- a/lib/logical-fields.c
> > +++ b/lib/logical-fields.c
> > @@ -293,6 +293,9 @@ ovn_init_symtab(struct shash *symtab)
> >                "icmp6.type == {135, 136} && icmp6.code == 0 && ip.ttl == 255");
> >      expr_symtab_add_predicate(symtab, "nd_ns",
> >                "icmp6.type == 135 && icmp6.code == 0 && ip.ttl == 255");
> > +    expr_symtab_add_predicate(symtab, "nd_ns_mcast",
> > +              "ip6.mcast && icmp6.type == 135 && icmp6.code == 0 && "
> > +              "ip.ttl == 255");
> >      expr_symtab_add_predicate(symtab, "nd_na",
> >                "icmp6.type == 136 && icmp6.code == 0 && ip.ttl == 255");
> >      expr_symtab_add_predicate(symtab, "nd_rs",
> > diff --git a/northd/northd.c b/northd/northd.c
> > index c4824aae3b..c9ae1e958f 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -9633,16 +9633,21 @@ build_lswitch_arp_nd_responder_known_ips(struct ovn_port *op,
> >                                                    op->lflow_ref);
> >              }
> >
> > -            /* For ND solicitations, we need to listen for both the
> > -             * unicast IPv6 address and its all-nodes multicast address,
> > -             * but always respond with the unicast IPv6 address. */
> > +            /* For ND solicitations:
> > +             *   - Reply only for the all-nodes multicast address(es) of the
> > +             *     logical port IPv6 address(es).
> > +             *
> > +             *   - Do not reply for unicast ND solicitations.  Let the target
> > +             *     reply it, so that the sender has the ability to monitor
> > +             *     the targe liveness via the unicast ND solicitations.
> > +             */
> >              for (size_t j = 0; j < op->lsp_addrs[i].n_ipv6_addrs; j++) {
> >                  ds_clear(match);
> > -                ds_put_format(match,
> > -                        "nd_ns && ip6.dst == {%s, %s} && nd.target == %s",
> > -                        op->lsp_addrs[i].ipv6_addrs[j].addr_s,
> > -                        op->lsp_addrs[i].ipv6_addrs[j].sn_addr_s,
> > -                        op->lsp_addrs[i].ipv6_addrs[j].addr_s);
> > +                ds_put_format(
> > +                    match,
> > +                    "nd_ns_mcast && ip6.dst == %s && nd.target == %s",
> > +                    op->lsp_addrs[i].ipv6_addrs[j].sn_addr_s,
> > +                    op->lsp_addrs[i].ipv6_addrs[j].addr_s);
> >
> >                  ds_clear(actions);
> >                  ds_put_format(actions,
> > diff --git a/ovn-sb.xml b/ovn-sb.xml
> > index de0bd636fd..c11296d7c4 100644
> > --- a/ovn-sb.xml
> > +++ b/ovn-sb.xml
> > @@ -1146,6 +1146,7 @@
> >        <ul>
> >          <li><code>eth.bcast</code> expands to <code>eth.dst == ff:ff:ff:ff:ff:ff</code></li>
> >          <li><code>eth.mcast</code> expands to <code>eth.dst[40]</code></li>
> > +        <li><code>eth.mcastv6</code> expands to <code>eth.dst[32..47] == 0x3333</code></li>
> >          <li><code>vlan.present</code> expands to <code>vlan.tci[12]</code></li>
> >          <li><code>ip4</code> expands to <code>eth.type == 0x800</code></li>
> >          <li><code>ip4.src_mcast</code> expands to
> > @@ -1161,8 +1162,10 @@
> >          <li><code>ip.first_frag</code> expands to <code>ip.is_frag &amp;&amp; !ip.later_frag</code></li>
> >          <li><code>arp</code> expands to <code>eth.type == 0x806</code></li>
> >          <li><code>rarp</code> expands to <code>eth.type == 0x8035</code></li>
> > +        <li><code>ip6.mcast</code> expands to <code>eth.mcastv6 &amp;&amp; ip6.dst[120..127] == 0xff</code></li>
> >          <li><code>nd</code> expands to <code>icmp6.type == {135, 136} &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
> >          <li><code>nd_ns</code> expands to <code>icmp6.type == 135 &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
> > +        <li><code>nd_ns_mcast</code> expands to <code>ip6.mcast  &amp;&amp; icmp6.type == 135 &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
> >          <li><code>nd_na</code> expands to <code>icmp6.type == 136 &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
> >          <li><code>nd_rs</code> expands to <code>icmp6.type == 133 &amp;&amp;
> >          icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > index 3c49c0d43b..93ccbce6b0 100644
> > --- a/tests/ovn-northd.at
> > +++ b/tests/ovn-northd.at
> > @@ -9503,9 +9503,9 @@ AT_CAPTURE_FILE([S1flows])
> >  AT_CHECK([grep -e "ls_in_arp_rsp" S1flows | ovn_strip_lflows], [0], [dnl
> >    table=??(ls_in_arp_rsp      ), priority=0    , match=(1), action=(next;)
> >    table=??(ls_in_arp_rsp      ), priority=100  , match=(arp.tpa == 192.168.0.10 && arp.op == 1 && eth.dst == ff:ff:ff:ff:ff:ff && inport == "S1-vm1"), action=(next;)
> > -  table=??(ls_in_arp_rsp      ), priority=100  , match=(nd_ns && ip6.dst == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10 && inport == "S1-vm1"), action=(next;)
> > +  table=??(ls_in_arp_rsp      ), priority=100  , match=(nd_ns_mcast && ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10 && inport == "S1-vm1"), action=(next;)
> >    table=??(ls_in_arp_rsp      ), priority=50   , match=(arp.tpa == 192.168.0.10 && arp.op == 1 && eth.dst == ff:ff:ff:ff:ff:ff), action=(eth.dst = eth.src; eth.src = 50:54:00:00:00:10; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 50:54:00:00:00:10; arp.tpa = arp.spa; arp.spa = 192.168.0.10; outport = inport; flags.loopback = 1; output;)
> > -  table=??(ls_in_arp_rsp      ), priority=50   , match=(nd_ns && ip6.dst == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10), action=(nd_na { eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; };)
> > +  table=??(ls_in_arp_rsp      ), priority=50   , match=(nd_ns_mcast && ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10), action=(nd_na { eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; };)
> >  ])
> >
> >  #Set the disable_arp_nd_rsp option and verify the flow
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index a1d689e849..7583da7e07 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -8617,7 +8617,7 @@ done
> >  # Complete Neighbor Solicitation packet and Neighbor Advertisement packet
> >  # vif1 -> NS -> vif2.  vif1 <- NA <- ovn-controller.
> >  # vif2 will not receive NS packet, since ovn-controller will reply for it.
> > -ns_packet=3333ffa1f9aefa163e94059886dd6000000000203afffd81ce49a9480000f8163efffe940598fd81ce49a9480000f8163efffea1f9ae8700e01160000000fd81ce49a9480000f8163efffea1f9ae0101fa163e940598
> > +ns_packet=3333ffa1f9aefa163e94059886dd6000000000203afffd81ce49a9480000f8163efffe940598ff0200000000000000000001ffa1f9ae8700e01160000000fd81ce49a9480000f8163efffea1f9ae0101fa163e940598
> >  na_packet=fa163e940598fa163ea1f9ae86dd6000000000203afffd81ce49a9480000f8163efffea1f9aefd81ce49a9480000f8163efffe9405988800e9ed60000000fd81ce49a9480000f8163efffea1f9ae0201fa163ea1f9ae
> >
> >  as hv1 ovs-appctl netdev-dummy/receive vif1 $ns_packet
> > @@ -14417,7 +14417,7 @@ test_ns_na() {
> >      local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5
> >
> >      packet=$(fmt_pkt "
> > -        Ether(dst='ff:ff:ff:ff:ff:ff', src='${src_mac}') /
> > +        Ether(dst='33:33:ff:ff:ff:ff', src='${src_mac}') /
> >          IPv6(src='${src_ip}', dst='ff02::1:ff00:2') /
> >          ICMPv6ND_NS(tgt='${dst_ip}')
> >      ")
> > diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> > index 6e4ec42473..e43244ca60 100644
> > --- a/tests/system-ovn.at
> > +++ b/tests/system-ovn.at
> > @@ -13899,3 +13899,126 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
> >  /.*terminating with signal 15.*/d"])
> >  AT_CLEANUP
> >  ])
> > +
> > +dnl This test sends IPv6 TCP packets between two lports of the same
> > +dnl logical switch and checks that the datapath flows don't match on
> > +dnl IPv6 source and destination addresses.
> > +OVN_FOR_EACH_NORTHD([
> > +AT_SETUP([ovn -- IPv6 switching - datapath flows])
> > +AT_KEYWORDS([IPv6])
> > +
> > +ovn_start
> > +OVS_TRAFFIC_VSWITCHD_START()
> > +ADD_BR([br-int])
> > +
> > +dnl Set external-ids in br-int needed for ovn-controller
> > +check ovs-vsctl \
> > +        -- set Open_vSwitch . external-ids:system-id=hv1 \
> > +        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> > +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> > +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> > +        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
> > +
> > +dnl Start ovn-controller
> > +start_daemon ovn-controller
> > +
> > +check ovn-nbctl ls-add sw0
> > +
> > +check ovn-nbctl lsp-add sw0 vm0
> > +check ovn-nbctl lsp-set-addresses vm0 "f0:00:0f:01:02:03 10.0.0.3 1000::3"
> > +
> > +check ovn-nbctl lsp-add sw0 vm1
> > +check ovn-nbctl lsp-set-addresses vm1 "f0:00:0f:01:02:04 10.0.0.4 1000::4"
> > +
> > +check ovn-nbctl --wait=hv sync
> > +
> > +ADD_NAMESPACES(vm0)
> > +ADD_VETH(vm0, vm0, br-int, "1000::3/64", "f0:00:0f:01:02:03", "1000::1", "nodad")
> > +
> > +ADD_NAMESPACES(vm1)
> > +ADD_VETH(vm1, vm1, br-int, "1000::4/64", "f0:00:0f:01:02:04", "1000::1", "nodad")
> > +
>
> It's probably best to ensure OVN bound the ports here, e.g.:
>
> dnl Wait for ovn-controller to catch up.
> wait_for_ports_up
> check ovn-nbctl --wait=hv sync
>
> The sync before ADD_NAMESPACES can probably be removed then.
>
> > +dnl Checks the datapath flows and makes sure that there are no matches to
> > +dnl IPv6 src or dst fields.
> > +check_dp_flows() {
> > +  ipv6_src_dst_match=$1
> > +  dnl Dump the flows. Will be helpful while debugging.
> > +  ovs-appctl dpctl/dump-flows
>
> It might be better to change this to something like:
>
> ovs-appctl dpctl/dump-flows > dp-flows
> AT_CAPTURE_FILE([dp-flows])
>
> > +
> > +  eth_match1="eth(src=f0:00:0f:01:02:03,dst=f0:00:0f:01:02:04)"
> > +  eth_match2="eth(src=f0:00:0f:01:02:04,dst=f0:00:0f:01:02:03)"
> > +  ipv6_src_match="ipv6(src="
> > +  ipv6_dst_match="ipv6(dst="
> > +
> > +  AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" -c], [0], [dnl
> > +2
> > +])
> > +
> > +  AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" -c], [0], [dnl
> > +2
> > +])
> > +
> > +  if [[ "$ipv6_src_dst_match" == "yes" ]]; then
> > +    dnl Check that the dp flow for eth(src=f0:00:0f:01:02:03,dst=f0:00:0f:01:02:04)
> > +    dnl matches on IPv6 src or dst.
> > +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" | grep -e "$ipv6_src_match" -e "$ipv6_dst_match" -c], [0], [dnl
> > +2
> > +])
> > +
> > +    dnl Check that the dp flow for eth(src=f0:00:0f:01:02:04,dst=f0:00:0f:01:02:03)
> > +    dnl matches on IPv6 src or dst.
> > +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" | grep -e "$ipv6_src_match" -e "$ipv6_dst_match" -c], [0], [dnl
> > +2
> > +])
> > +  else
> > +    dnl Check that the dp flow for eth(src=f0:00:0f:01:02:03,dst=f0:00:0f:01:02:04)
> > +    dnl doesn't match on IPv6 src or dst.
> > +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" | grep "$ipv6_src_match"], [1], [])
> > +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" | grep "$ipv6_dst_match"], [1], [])
> > +
> > +    dnl Check that the dp flow for eth(src=f0:00:0f:01:02:04,dst=f0:00:0f:01:02:03)
> > +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" | grep "$ipv6_src_match"], [1], [])
> > +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" | grep "$ipv6_dst_match"], [1], [])
> > +  fi
> > +}
> > +
> > +NS_CHECK_EXEC([vm0], [ping6 -q -c 3 -i 0.3 -w 2 1000::4 | FORMAT_PING], \
> > +[0], [dnl
> > +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> > +])
> > +
> > +# Start IPv6 TCP server on vm0.
> > +NETNS_DAEMONIZE([vm0], [nc -6 -l -k 1000::3 4242], [nc-vm1-1.pid])
> > +NS_CHECK_EXEC([vm1], [nc -6 1000::3 4242 -z], [0], [ignore], [ignore])
> > +
> > +check_dp_flows no
> > +
> > +# Now configure port security on vm0.  dp flows should now match on IPv6 fields.
> > +check ovn-nbctl lsp-set-port-security vm0 "f0:00:0f:01:02:03 10.0.0.3 1000::3"
> > +check ovn-nbctl --wait=hv sync
> > +
> > +NS_CHECK_EXEC([vm0], [ping6 -q -c 3 -i 0.3 -w 2 1000::4 | FORMAT_PING], \
> > +[0], [dnl
> > +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> > +])
> > +
> > +NS_CHECK_EXEC([vm1], [nc -6 1000::3 4242 -z], [0], [ignore], [ignore])
> > +check_dp_flows yes
> > +
>
> The test failed here with the userspace datapath in the ovsrobot run:
> https://github.com/ovsrobot/ovn/actions/runs/10398228623/job/28795300825
>
> Is it a matter of waiting for output instead of checking for datapath flows
> immediately?  Alternatively do we need to wait for the revalidator to run
> (ovs-appctl revalidator/wait)?
>
> > +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> > +
> > +as ovn-sb
> > +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> > +
> > +as ovn-nb
> > +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> > +
> > +as northd
> > +OVS_APP_EXIT_AND_WAIT([ovn-northd])
> > +
> > +as
> > +OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
> > +/failed to query port patch-.*/d
> > +/.*terminating with signal 15.*/d"])
> > +AT_CLEANUP
> > +])
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Ilya Maximets Aug. 15, 2024, 5:30 p.m. UTC | #3
On 8/15/24 18:56, Numan Siddique wrote:
> On Thu, Aug 15, 2024 at 6:38 AM Dumitru Ceara <dceara@redhat.com> wrote:
>>
>> On 8/15/24 05:02, numans@ovn.org wrote:
>>> From: Numan Siddique <numans@ovn.org>
>>>
>>> IPv6 ND Solicitation (NS) responder logical flows match on ip6.dst
>>> field.  These flows when translated to datapath flows also match on
>>> ip6.dst, which means a separate datapath flow per destination IP
>>> address.  This may cause significant performance issues in some
>>> setups (particularly ovs-dpdk telco deployments).
>>>
>>> This patch addresses this issue by matching on eth.mcast6 so that
>>> datapath flows for normal IPv6 traffic doesn't have to match on
>>> ip6.dst.  IPv6 NS packets are generally multicast.  A new logical
>>> match "nd_ns_mcast" is added for this purpose.
>>>
>>> After this patch, We no longer respond to IPv6 NS unicast packets.
>>> Let the target reply to it, so that the sender has the ability to
>>> monitor the targe liveness via the unicast ND solicitations.
>>> This behavior now matches the IPv4 ARP responder flows.  Note that
>>> after the commit [1] which was recently added we now only respond
>>> to IPv4 ARP broadcast packets.
>>>
>>> A recent patch [2] from Ilya partially addressed the same datapath
>>> flow explosion issue by matching on eth.mcast6 for MLD packets.
>>> With this patch, we now completely address the datapath flow
>>> explosion issue for IPv6 traffic provided all the logical ports
>>> of a logical switch are not configured with port security.
>>>
>>> [1] - c48ed1736a58 ("Do not reply on unicast arps for IPv4 targets.")
>>> [2] - 43c34f2e6676 ("logical-fields: Add missing multicast matches for MLD and IGMP.")
>>>
>>> Note: Documentation for 'eth.mcastv6' and 'ip6.mcast' predicates were
>>> missing from ovn-sb.xml and this patch adds it.
>>>
>>> Reported-at: https://issues.redhat.com/browse/FDP-728
>>> Reported-by: Mike Pattrick <mkp@redhat.com>
>>> Signed-off-by: Numan Siddique <numans@ovn.org>
>>> ---
>>>
>>> v1 -> v2
>>> -------
>>>   - Addressed review comments from Ilya.
>>>
>>
>> Hi Numan,
>>
>> For proxy-arp we still match on unicast ND as far as I can tell.  I'm
>> not sure if we can remove that part though.  Maybe it's something to
>> consider as a follow up?
> 
> Hi Dumitru,
> Thanks for the review.  If a logical switch has  a router port with
> proxy_arp configured
> with IPv6 addresses then the dp flow would match on the ipv6 src/dst fields.
> 
> I've updated the commit message with this limitation.  I'm not sure if
> we can avoid matching
> on unicast ND solicitation for proxy_arp.  I need to think about it.
> I guess it can be a follow up
> if it's possible to address that.

We can still try to limit the impact by matching on the exact unicast
eth.dst in case of a unicast ND request, i.e. split the logical flow
in two:

 1. nd_ns_mcast && ip6.dst=IPv6_mcast && nd.target=IPv6_unicast
 2. eth.dst=proxy-mac-address && ipv6.dst=IPv6_unicast && nd.target=IPv6_unicast

This will limit the issue to multicast traffic and traffic targeted to
the address behind ARP proxy.  Datapath flows will have exact match on
eth.dst, but only ones going to ARP-proxied addresses will have ip6.dst
matches.  The ones going between LSPs on the same switch should not be
affected by the ip6.dst match.

I did not test this, but may be worth exploring.

Definitely can be a separate change.

Best regards, Ilya Maximets.

> 
> I've submitted v3.  Please take a look -
> https://patchwork.ozlabs.org/project/ovn/patch/20240815165222.1144250-1-numans@ovn.org/.
> I removed the  system test and added the test in ovn.at instead so
> that the test is reliable.
> 
> Thanks
> Numan
diff mbox series

Patch

diff --git a/lib/logical-fields.c b/lib/logical-fields.c
index 2c9d3c61bf..5a8b53f2b6 100644
--- a/lib/logical-fields.c
+++ b/lib/logical-fields.c
@@ -293,6 +293,9 @@  ovn_init_symtab(struct shash *symtab)
               "icmp6.type == {135, 136} && icmp6.code == 0 && ip.ttl == 255");
     expr_symtab_add_predicate(symtab, "nd_ns",
               "icmp6.type == 135 && icmp6.code == 0 && ip.ttl == 255");
+    expr_symtab_add_predicate(symtab, "nd_ns_mcast",
+              "ip6.mcast && icmp6.type == 135 && icmp6.code == 0 && "
+              "ip.ttl == 255");
     expr_symtab_add_predicate(symtab, "nd_na",
               "icmp6.type == 136 && icmp6.code == 0 && ip.ttl == 255");
     expr_symtab_add_predicate(symtab, "nd_rs",
diff --git a/northd/northd.c b/northd/northd.c
index c4824aae3b..c9ae1e958f 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -9633,16 +9633,21 @@  build_lswitch_arp_nd_responder_known_ips(struct ovn_port *op,
                                                   op->lflow_ref);
             }
 
-            /* For ND solicitations, we need to listen for both the
-             * unicast IPv6 address and its all-nodes multicast address,
-             * but always respond with the unicast IPv6 address. */
+            /* For ND solicitations:
+             *   - Reply only for the all-nodes multicast address(es) of the
+             *     logical port IPv6 address(es).
+             *
+             *   - Do not reply for unicast ND solicitations.  Let the target
+             *     reply it, so that the sender has the ability to monitor
+             *     the targe liveness via the unicast ND solicitations.
+             */
             for (size_t j = 0; j < op->lsp_addrs[i].n_ipv6_addrs; j++) {
                 ds_clear(match);
-                ds_put_format(match,
-                        "nd_ns && ip6.dst == {%s, %s} && nd.target == %s",
-                        op->lsp_addrs[i].ipv6_addrs[j].addr_s,
-                        op->lsp_addrs[i].ipv6_addrs[j].sn_addr_s,
-                        op->lsp_addrs[i].ipv6_addrs[j].addr_s);
+                ds_put_format(
+                    match,
+                    "nd_ns_mcast && ip6.dst == %s && nd.target == %s",
+                    op->lsp_addrs[i].ipv6_addrs[j].sn_addr_s,
+                    op->lsp_addrs[i].ipv6_addrs[j].addr_s);
 
                 ds_clear(actions);
                 ds_put_format(actions,
diff --git a/ovn-sb.xml b/ovn-sb.xml
index de0bd636fd..c11296d7c4 100644
--- a/ovn-sb.xml
+++ b/ovn-sb.xml
@@ -1146,6 +1146,7 @@ 
       <ul>
         <li><code>eth.bcast</code> expands to <code>eth.dst == ff:ff:ff:ff:ff:ff</code></li>
         <li><code>eth.mcast</code> expands to <code>eth.dst[40]</code></li>
+        <li><code>eth.mcastv6</code> expands to <code>eth.dst[32..47] == 0x3333</code></li>
         <li><code>vlan.present</code> expands to <code>vlan.tci[12]</code></li>
         <li><code>ip4</code> expands to <code>eth.type == 0x800</code></li>
         <li><code>ip4.src_mcast</code> expands to
@@ -1161,8 +1162,10 @@ 
         <li><code>ip.first_frag</code> expands to <code>ip.is_frag &amp;&amp; !ip.later_frag</code></li>
         <li><code>arp</code> expands to <code>eth.type == 0x806</code></li>
         <li><code>rarp</code> expands to <code>eth.type == 0x8035</code></li>
+        <li><code>ip6.mcast</code> expands to <code>eth.mcastv6 &amp;&amp; ip6.dst[120..127] == 0xff</code></li>
         <li><code>nd</code> expands to <code>icmp6.type == {135, 136} &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
         <li><code>nd_ns</code> expands to <code>icmp6.type == 135 &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
+        <li><code>nd_ns_mcast</code> expands to <code>ip6.mcast  &amp;&amp; icmp6.type == 135 &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
         <li><code>nd_na</code> expands to <code>icmp6.type == 136 &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
         <li><code>nd_rs</code> expands to <code>icmp6.type == 133 &amp;&amp;
         icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 3c49c0d43b..93ccbce6b0 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -9503,9 +9503,9 @@  AT_CAPTURE_FILE([S1flows])
 AT_CHECK([grep -e "ls_in_arp_rsp" S1flows | ovn_strip_lflows], [0], [dnl
   table=??(ls_in_arp_rsp      ), priority=0    , match=(1), action=(next;)
   table=??(ls_in_arp_rsp      ), priority=100  , match=(arp.tpa == 192.168.0.10 && arp.op == 1 && eth.dst == ff:ff:ff:ff:ff:ff && inport == "S1-vm1"), action=(next;)
-  table=??(ls_in_arp_rsp      ), priority=100  , match=(nd_ns && ip6.dst == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10 && inport == "S1-vm1"), action=(next;)
+  table=??(ls_in_arp_rsp      ), priority=100  , match=(nd_ns_mcast && ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10 && inport == "S1-vm1"), action=(next;)
   table=??(ls_in_arp_rsp      ), priority=50   , match=(arp.tpa == 192.168.0.10 && arp.op == 1 && eth.dst == ff:ff:ff:ff:ff:ff), action=(eth.dst = eth.src; eth.src = 50:54:00:00:00:10; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = 50:54:00:00:00:10; arp.tpa = arp.spa; arp.spa = 192.168.0.10; outport = inport; flags.loopback = 1; output;)
-  table=??(ls_in_arp_rsp      ), priority=50   , match=(nd_ns && ip6.dst == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10), action=(nd_na { eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; };)
+  table=??(ls_in_arp_rsp      ), priority=50   , match=(nd_ns_mcast && ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10), action=(nd_na { eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; };)
 ])
 
 #Set the disable_arp_nd_rsp option and verify the flow
diff --git a/tests/ovn.at b/tests/ovn.at
index a1d689e849..7583da7e07 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -8617,7 +8617,7 @@  done
 # Complete Neighbor Solicitation packet and Neighbor Advertisement packet
 # vif1 -> NS -> vif2.  vif1 <- NA <- ovn-controller.
 # vif2 will not receive NS packet, since ovn-controller will reply for it.
-ns_packet=3333ffa1f9aefa163e94059886dd6000000000203afffd81ce49a9480000f8163efffe940598fd81ce49a9480000f8163efffea1f9ae8700e01160000000fd81ce49a9480000f8163efffea1f9ae0101fa163e940598
+ns_packet=3333ffa1f9aefa163e94059886dd6000000000203afffd81ce49a9480000f8163efffe940598ff0200000000000000000001ffa1f9ae8700e01160000000fd81ce49a9480000f8163efffea1f9ae0101fa163e940598
 na_packet=fa163e940598fa163ea1f9ae86dd6000000000203afffd81ce49a9480000f8163efffea1f9aefd81ce49a9480000f8163efffe9405988800e9ed60000000fd81ce49a9480000f8163efffea1f9ae0201fa163ea1f9ae
 
 as hv1 ovs-appctl netdev-dummy/receive vif1 $ns_packet
@@ -14417,7 +14417,7 @@  test_ns_na() {
     local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5
 
     packet=$(fmt_pkt "
-        Ether(dst='ff:ff:ff:ff:ff:ff', src='${src_mac}') /
+        Ether(dst='33:33:ff:ff:ff:ff', src='${src_mac}') /
         IPv6(src='${src_ip}', dst='ff02::1:ff00:2') /
         ICMPv6ND_NS(tgt='${dst_ip}')
     ")
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 6e4ec42473..e43244ca60 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -13899,3 +13899,126 @@  OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
 /.*terminating with signal 15.*/d"])
 AT_CLEANUP
 ])
+
+dnl This test sends IPv6 TCP packets between two lports of the same
+dnl logical switch and checks that the datapath flows don't match on
+dnl IPv6 source and destination addresses.
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn -- IPv6 switching - datapath flows])
+AT_KEYWORDS([IPv6])
+
+ovn_start
+OVS_TRAFFIC_VSWITCHD_START()
+ADD_BR([br-int])
+
+dnl Set external-ids in br-int needed for ovn-controller
+check ovs-vsctl \
+        -- set Open_vSwitch . external-ids:system-id=hv1 \
+        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
+        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
+        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
+        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
+
+dnl Start ovn-controller
+start_daemon ovn-controller
+
+check ovn-nbctl ls-add sw0
+
+check ovn-nbctl lsp-add sw0 vm0
+check ovn-nbctl lsp-set-addresses vm0 "f0:00:0f:01:02:03 10.0.0.3 1000::3"
+
+check ovn-nbctl lsp-add sw0 vm1
+check ovn-nbctl lsp-set-addresses vm1 "f0:00:0f:01:02:04 10.0.0.4 1000::4"
+
+check ovn-nbctl --wait=hv sync
+
+ADD_NAMESPACES(vm0)
+ADD_VETH(vm0, vm0, br-int, "1000::3/64", "f0:00:0f:01:02:03", "1000::1", "nodad")
+
+ADD_NAMESPACES(vm1)
+ADD_VETH(vm1, vm1, br-int, "1000::4/64", "f0:00:0f:01:02:04", "1000::1", "nodad")
+
+dnl Checks the datapath flows and makes sure that there are no matches to
+dnl IPv6 src or dst fields.
+check_dp_flows() {
+  ipv6_src_dst_match=$1
+  dnl Dump the flows. Will be helpful while debugging.
+  ovs-appctl dpctl/dump-flows
+
+  eth_match1="eth(src=f0:00:0f:01:02:03,dst=f0:00:0f:01:02:04)"
+  eth_match2="eth(src=f0:00:0f:01:02:04,dst=f0:00:0f:01:02:03)"
+  ipv6_src_match="ipv6(src="
+  ipv6_dst_match="ipv6(dst="
+
+  AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" -c], [0], [dnl
+2
+])
+
+  AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" -c], [0], [dnl
+2
+])
+
+  if [[ "$ipv6_src_dst_match" == "yes" ]]; then
+    dnl Check that the dp flow for eth(src=f0:00:0f:01:02:03,dst=f0:00:0f:01:02:04)
+    dnl matches on IPv6 src or dst.
+    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" | grep -e "$ipv6_src_match" -e "$ipv6_dst_match" -c], [0], [dnl
+2
+])
+
+    dnl Check that the dp flow for eth(src=f0:00:0f:01:02:04,dst=f0:00:0f:01:02:03)
+    dnl matches on IPv6 src or dst.
+    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" | grep -e "$ipv6_src_match" -e "$ipv6_dst_match" -c], [0], [dnl
+2
+])
+  else
+    dnl Check that the dp flow for eth(src=f0:00:0f:01:02:03,dst=f0:00:0f:01:02:04)
+    dnl doesn't match on IPv6 src or dst.
+    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" | grep "$ipv6_src_match"], [1], [])
+    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" | grep "$ipv6_dst_match"], [1], [])
+
+    dnl Check that the dp flow for eth(src=f0:00:0f:01:02:04,dst=f0:00:0f:01:02:03)
+    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" | grep "$ipv6_src_match"], [1], [])
+    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" | grep "$ipv6_dst_match"], [1], [])
+  fi
+}
+
+NS_CHECK_EXEC([vm0], [ping6 -q -c 3 -i 0.3 -w 2 1000::4 | FORMAT_PING], \
+[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+# Start IPv6 TCP server on vm0.
+NETNS_DAEMONIZE([vm0], [nc -6 -l -k 1000::3 4242], [nc-vm1-1.pid])
+NS_CHECK_EXEC([vm1], [nc -6 1000::3 4242 -z], [0], [ignore], [ignore])
+
+check_dp_flows no
+
+# Now configure port security on vm0.  dp flows should now match on IPv6 fields.
+check ovn-nbctl lsp-set-port-security vm0 "f0:00:0f:01:02:03 10.0.0.3 1000::3"
+check ovn-nbctl --wait=hv sync
+
+NS_CHECK_EXEC([vm0], [ping6 -q -c 3 -i 0.3 -w 2 1000::4 | FORMAT_PING], \
+[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+NS_CHECK_EXEC([vm1], [nc -6 1000::3 4242 -z], [0], [ignore], [ignore])
+check_dp_flows yes
+
+OVS_APP_EXIT_AND_WAIT([ovn-controller])
+
+as ovn-sb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as ovn-nb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as northd
+OVS_APP_EXIT_AND_WAIT([ovn-northd])
+
+as
+OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
+/failed to query port patch-.*/d
+/.*terminating with signal 15.*/d"])
+AT_CLEANUP
+])