[ovs-dev,v3,4/7] controller: Introduce route-exchange-netlink.

Message ID 20240725140009.413791-4-fnordahl@ubuntu.com
State Changes Requested
Series [ovs-dev,v3,1/7] controller: Move address with port parser to lib.

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes success github build: passed

Commit Message

Frode Nordahl July 25, 2024, 2 p.m. UTC
Introduce route-exchange-netlink module which implements interface
for maintaining VRFs [0] and host routes through Netlink.

We want to export host routes to resources such as NAT addresses
and LB VIPs to routing protocol suite software running on the same
system, which subject to configuration can redistribute routes to
external systems such as one or more Top-of-Rack (ToR) switches.

There is a desire to do this without having to (re-)implement
routing protocol state machines in OVN, and to accomplish this we
make use of Netlink.

Netlink was chosen because:
* Its ubiquitous nature with availability on any Linux system as
  well as other platforms.
* Presence of a very good Netlink library implementation in our
  sibling project and library, Open vSwitch.
* Popular routing protocol software conveniently already have
  support for redistributing routes to/from Netlink.
* Support for interacting with Virtual Routing and Forwarding
  domains [0], allowing full isolation between virtual network
  resources defined within OVN and the hosting system while
  retaining access to all system network interfaces.

It is important to note that the purpose of this integration is
generic exchange of control plane information while keeping the
datapath in OVS/OVN, enabling users to leverage its full range of
user-, kernel- and mixed-space datapath implementations.

0: https://docs.kernel.org/networking/vrf.html

Signed-off-by: Frode Nordahl <fnordahl@ubuntu.com>
---
 configure.ac                                |   2 +
 controller/automake.mk                      |   7 +
 controller/route-exchange-netlink-private.h | 243 ++++++++++++++++++
 controller/route-exchange-netlink.c         | 264 ++++++++++++++++++++
 controller/route-exchange-netlink.h         |  40 +++
 controller/test-route-exchange-netlink.c    | 173 +++++++++++++
 m4/ovn.m4                                   |  25 ++
 tests/automake.mk                           |  13 +-
 tests/ovn-system-route-exchange.at          |  16 ++
 tests/system-common-macros.at               |  12 +
 tests/system-kmod-testsuite.at              |   1 +
 11 files changed, 795 insertions(+), 1 deletion(-)
 create mode 100644 controller/route-exchange-netlink-private.h
 create mode 100644 controller/route-exchange-netlink.c
 create mode 100644 controller/route-exchange-netlink.h
 create mode 100644 controller/test-route-exchange-netlink.c
 create mode 100644 tests/ovn-system-route-exchange.at

Comments

Roberto Bartzen Acosta Sept. 5, 2024, 1:29 p.m. UTC | #1
Hi Frode,

Thanks for the patch series. Have you tested integration with
routing-protocol-redirect for BGP?
I have performed internal tests and observed that the VRF route
manipulation via netlink needs a filter so that it only touches the routes
managed by the exchange module itself. Without this filter, BGP does not
work when integrated with the VRF exchange.

Is there any case where you need to remove link/broadcast routes from the
VRF to keep only the FIPs and LBs?

Another question: exchange_run does not maintain the routes in the VRF
if the system operator manually or automatically triggers a link event
(down/up) on the Linux VRF interface. That is, changing the admin state of
the VRF interface causes the addresses applied via the exchange to
disappear from the VRF, and they only reappear after a change to the
router/LRP configuration (modifying FIPs or LBs). Perhaps a trigger is
missing that calls exchange_run on VRF link events.


On Thu, Jul 25, 2024 at 11:00, Frode Nordahl <fnordahl@ubuntu.com>
wrote:

> diff --git a/configure.ac b/configure.ac
> index 6a6b0db6a..6f0f485c4 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -87,6 +87,8 @@ OVS_CHECK_WIN32
>  OVS_CHECK_VISUAL_STUDIO_DDK
>  OVN_CHECK_COVERAGE
>  OVS_CHECK_NDEBUG
> +OVS_CHECK_NETLINK
> +OVS_CHECK_LINUX_NETLINK
>  OVS_CHECK_OPENSSL
>  OVN_CHECK_LOGDIR
>  OVN_CHECK_PYTHON3
> diff --git a/controller/automake.mk b/controller/automake.mk
> index ed93cfb3c..006e884dc 100644
> --- a/controller/automake.mk
> +++ b/controller/automake.mk
> @@ -51,6 +51,13 @@ controller_ovn_controller_SOURCES = \
>         controller/ct-zone.h \
>         controller/ct-zone.c
>
> +if HAVE_NETLINK
> +controller_ovn_controller_SOURCES += \
> +       controller/route-exchange-netlink.h \
> +       controller/route-exchange-netlink-private.h \
> +       controller/route-exchange-netlink.c
> +endif
> +
>  controller_ovn_controller_LDADD = lib/libovn.la $(OVS_LIBDIR)/libopenvswitch.la
>  man_MANS += controller/ovn-controller.8
>  EXTRA_DIST += controller/ovn-controller.8.xml
> diff --git a/controller/route-exchange-netlink-private.h b/controller/route-exchange-netlink-private.h
> new file mode 100644
> index 000000000..4c2559895
> --- /dev/null
> +++ b/controller/route-exchange-netlink-private.h
> @@ -0,0 +1,243 @@
> +/*
> + * Copyright (c) 2024 Canonical, Ltd.
> + * Copyright (c) 2011, 2012, 2013, 2014, 2017 Nicira, Inc.
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at:
> + *
> + *     http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +#ifndef ROUTE_EXCHANGE_NETLINK_PRIVATE_H
> +#define ROUTE_EXCHANGE_NETLINK_PRIVATE_H 1
> +
> +/*
> + * NOTE(fnordahl): The below code is stolen directly from OVS lib/route-table.c
> + * with the addition of inlining of function definitions for practical reasons
> + * and modifications:
> + *
> + * struct route_data:
> + *
> + * - Add rta_table_id.
> + *
> + * route_table_parse():
> + *
> + * - Consider non-standard routing tables and store the table_id.
> + *
> + * route_table_dump_one_table():
> + *
> + * - Use uint32_t for table id and pass it to kernel using the RTA_TABLE
> + *   attribute to allow use of table IDs greater than 255.
> + * - Use callback with argument instead of hard coded call to static function
> + *   route_table_handle_msg().
> + *
> + * Ideally we would upstream those changes along with export of interesting
> + * data structures and functions to OVS, but in the interest of time we vendor
> + * the code here for now.
> + *
> + * BEGIN VENDORED CODE FROM OVS lib/route-table.c
> + */
> +struct route_data {
> +    /* Copied from struct rtmsg. */
> +    unsigned char rtm_dst_len;
> +    bool local;
> +
> +    /* Extracted from Netlink attributes. */
> +    struct in6_addr rta_dst; /* 0 if missing. */
> +    struct in6_addr rta_prefsrc; /* 0 if missing. */
> +    struct in6_addr rta_gw;
> +    char ifname[IFNAMSIZ]; /* Interface name. */
> +    uint32_t mark;
> +    uint32_t rta_table_id; /* 0 if missing. */
> +};
> +
> +/* A digested version of a route message sent down by the kernel to indicate
> + * that a route has changed. */
> +struct route_table_msg {
> +    bool relevant;        /* Should this message be processed? */
> +    int nlmsg_type;       /* e.g. RTM_NEWROUTE, RTM_DELROUTE. */
> +    struct route_data rd; /* Data parsed from this message. */
> +};
> +
> +/* Return RTNLGRP_IPV4_ROUTE or RTNLGRP_IPV6_ROUTE on success, 0 on parse
> + * error. */
> +static inline int
> +route_table_parse(struct ofpbuf *buf, struct route_table_msg *change)
> +{
> +    bool parsed, ipv4 = false;
> +
> +    static const struct nl_policy policy[] = {
> +        [RTA_DST] = { .type = NL_A_U32, .optional = true  },
> +        [RTA_OIF] = { .type = NL_A_U32, .optional = true },
> +        [RTA_GATEWAY] = { .type = NL_A_U32, .optional = true },
> +        [RTA_MARK] = { .type = NL_A_U32, .optional = true },
> +        [RTA_PREFSRC] = { .type = NL_A_U32, .optional = true },
> +        [RTA_TABLE] = { .type = NL_A_U32, .optional = true },
> +    };
> +
> +    static const struct nl_policy policy6[] = {
> +        [RTA_DST] = { .type = NL_A_IPV6, .optional = true },
> +        [RTA_OIF] = { .type = NL_A_U32, .optional = true },
> +        [RTA_MARK] = { .type = NL_A_U32, .optional = true },
> +        [RTA_GATEWAY] = { .type = NL_A_IPV6, .optional = true },
> +        [RTA_PREFSRC] = { .type = NL_A_IPV6, .optional = true },
> +        [RTA_TABLE] = { .type = NL_A_U32, .optional = true },
> +    };
> +
> +    struct nlattr *attrs[ARRAY_SIZE(policy)];
> +    const struct rtmsg *rtm;
> +
> +    rtm = ofpbuf_at(buf, NLMSG_HDRLEN, sizeof *rtm);
> +
> +    if (rtm->rtm_family == AF_INET) {
> +        parsed = nl_policy_parse(buf, NLMSG_HDRLEN + sizeof(struct rtmsg),
> +                                 policy, attrs, ARRAY_SIZE(policy));
> +        ipv4 = true;
> +    } else if (rtm->rtm_family == AF_INET6) {
> +        parsed = nl_policy_parse(buf, NLMSG_HDRLEN + sizeof(struct rtmsg),
> +                                 policy6, attrs, ARRAY_SIZE(policy6));
> +    } else {
> +        VLOG_DBG_RL(&rl, "received non AF_INET rtnetlink route message");
> +        return 0;
> +    }
> +
> +    if (parsed) {
> +        const struct nlmsghdr *nlmsg;
> +        uint32_t table_id;
> +        int rta_oif;      /* Output interface index. */
> +
> +        nlmsg = buf->data;
> +
> +        memset(change, 0, sizeof *change);
> +        change->relevant = true;
> +
> +        if (rtm->rtm_scope == RT_SCOPE_NOWHERE) {
> +            change->relevant = false;
> +        }
> +
> +        if (rtm->rtm_type != RTN_UNICAST &&
> +            rtm->rtm_type != RTN_LOCAL) {
> +            change->relevant = false;
> +        }
> +
>

I suggest you include a filter that marks as relevant only the routes that
the exchange module creates via netlink, since all routes created in this
context are marked with rtm_protocol RTPROT_BOOT. Otherwise, the routes
needed for BGP, for example, will be removed and the BGP connection will be
closed when we run exchange_run.

We will start seeing messages like this, caused by the removal of routes of
type 2 and 3 (local/broadcast):

2024-09-03T20:10:08.816Z|00067|route_exchange_netlink|WARN|Delete route table_id=1002 dst=172.16.1.0: No such process
2024-09-03T20:10:08.816Z|00068|route_exchange_netlink|WARN|Delete route table_id=1002 dst=fe80::: No such process
2024-09-03T20:10:08.816Z|00069|route_exchange_netlink|WARN|Delete route table_id=1002 dst=ff00::: No such process


diff --git a/controller/route-exchange-netlink-private.h b/controller/route-exchange-netlink-private.h
index 4c2559895..d18ec55ae 100644
--- a/controller/route-exchange-netlink-private.h
+++ b/controller/route-exchange-netlink-private.h
@@ -127,6 +127,11 @@ route_table_parse(struct ofpbuf *buf, struct route_table_msg *change)
             change->relevant = false;
         }

+        /* Manage only exchange-owned routes as relevant */
+        if (rtm->rtm_protocol != RTPROT_BOOT) {
+            change->relevant = false;
+        }
+
         table_id = rtm->rtm_table;
         if (attrs[RTA_TABLE]) {
             table_id = nl_attr_get_u32(attrs[RTA_TABLE]);
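
For illustration only, the combined relevance rule (the existing scope and
type checks plus the proposed protocol check) could be written as a
standalone predicate, which makes it easy to test in isolation. This is a
sketch under the assumption that the module keeps tagging its routes with
RTPROT_BOOT; route_msg_is_relevant() and classify_route() are hypothetical
names, not functions from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#include <linux/rtnetlink.h>

/* Hypothetical predicate mirroring the checks in route_table_parse() plus
 * the proposed RTPROT_BOOT filter. */
static bool
route_msg_is_relevant(const struct rtmsg *rtm)
{
    if (rtm->rtm_scope == RT_SCOPE_NOWHERE) {
        return false;
    }
    if (rtm->rtm_type != RTN_UNICAST && rtm->rtm_type != RTN_LOCAL) {
        return false;
    }
    /* Only routes carrying the protocol value used by modify_route(). */
    return rtm->rtm_protocol == RTPROT_BOOT;
}

/* Illustration-only scaffolding: fill a zeroed rtmsg and classify it. */
static bool
classify_route(unsigned char type, unsigned char scope, unsigned char protocol)
{
    struct rtmsg rtm;

    memset(&rtm, 0, sizeof rtm);
    rtm.rtm_type = type;
    rtm.rtm_scope = scope;
    rtm.rtm_protocol = protocol;

    return route_msg_is_relevant(&rtm);
}
```

One caveat with this approach: RTPROT_BOOT is also what "ip route add" uses
when no protocol is given, so the check narrows the set of routes the module
touches but does not uniquely identify routes created by it.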



> +        table_id = rtm->rtm_table;
> +        if (attrs[RTA_TABLE]) {
> +            table_id = nl_attr_get_u32(attrs[RTA_TABLE]);
> +            change->rd.rta_table_id = table_id;
> +        }
> +
> +        change->nlmsg_type     = nlmsg->nlmsg_type;
> +        change->rd.rtm_dst_len = rtm->rtm_dst_len + (ipv4 ? 96 : 0);
> +        change->rd.local = rtm->rtm_type == RTN_LOCAL;
> +        if (attrs[RTA_OIF]) {
> +            rta_oif = nl_attr_get_u32(attrs[RTA_OIF]);
> +
> +            if (!if_indextoname(rta_oif, change->rd.ifname)) {
> +                int error = errno;
> +
> +                VLOG_DBG_RL(&rl, "Could not find interface name[%u]: %s",
> +                            rta_oif, ovs_strerror(error));
> +                if (error == ENXIO) {
> +                    change->relevant = false;
> +                } else {
> +                    return 0;
> +                }
> +            }
> +        }
> +
> +        if (attrs[RTA_DST]) {
> +            if (ipv4) {
> +                ovs_be32 dst;
> +                dst = nl_attr_get_be32(attrs[RTA_DST]);
> +                in6_addr_set_mapped_ipv4(&change->rd.rta_dst, dst);
> +            } else {
> +                change->rd.rta_dst = nl_attr_get_in6_addr(attrs[RTA_DST]);
> +            }
> +        } else if (ipv4) {
> +            in6_addr_set_mapped_ipv4(&change->rd.rta_dst, 0);
> +        }
> +        if (attrs[RTA_PREFSRC]) {
> +            if (ipv4) {
> +                ovs_be32 prefsrc;
> +                prefsrc = nl_attr_get_be32(attrs[RTA_PREFSRC]);
> +                in6_addr_set_mapped_ipv4(&change->rd.rta_prefsrc, prefsrc);
> +            } else {
> +                change->rd.rta_prefsrc =
> +                    nl_attr_get_in6_addr(attrs[RTA_PREFSRC]);
> +            }
> +        }
> +        if (attrs[RTA_GATEWAY]) {
> +            if (ipv4) {
> +                ovs_be32 gw;
> +                gw = nl_attr_get_be32(attrs[RTA_GATEWAY]);
> +                in6_addr_set_mapped_ipv4(&change->rd.rta_gw, gw);
> +            } else {
> +                change->rd.rta_gw = nl_attr_get_in6_addr(attrs[RTA_GATEWAY]);
> +            }
> +        }
> +        if (attrs[RTA_MARK]) {
> +            change->rd.mark = nl_attr_get_u32(attrs[RTA_MARK]);
> +        }
> +    } else {
> +        VLOG_DBG_RL(&rl, "received unparseable rtnetlink route message");
> +        return 0;
> +    }
> +
> +    /* Success. */
> +    return ipv4 ? RTNLGRP_IPV4_ROUTE : RTNLGRP_IPV6_ROUTE;
> +}
> +
> +static inline bool
> +route_table_dump_one_table(
> +    uint32_t id,
> +    void (*handle_msg)(struct route_table_msg *, void *),
> +    void *data)
> +{
> +    uint64_t reply_stub[NL_DUMP_BUFSIZE / 8];
> +    struct ofpbuf request, reply, buf;
> +    struct rtmsg *rq_msg;
> +    bool filtered = true;
> +    struct nl_dump dump;
> +
> +    ofpbuf_init(&request, 0);
> +
> +    nl_msg_put_nlmsghdr(&request, 0, RTM_GETROUTE, NLM_F_REQUEST);
> +
> +    rq_msg = ofpbuf_put_zeros(&request, sizeof *rq_msg);
> +    rq_msg->rtm_family = AF_UNSPEC;
> +    rq_msg->rtm_table = RT_TABLE_UNSPEC;
> +
> +    nl_msg_put_u32(&request, RTA_TABLE, id);
> +
> +    nl_dump_start(&dump, NETLINK_ROUTE, &request);
> +    ofpbuf_uninit(&request);
> +
> +    ofpbuf_use_stub(&buf, reply_stub, sizeof reply_stub);
> +    while (nl_dump_next(&dump, &reply, &buf)) {
> +        struct route_table_msg msg;
> +
> +        if (route_table_parse(&reply, &msg)) {
> +            struct nlmsghdr *nlmsghdr = nl_msg_nlmsghdr(&reply);
> +
> +            /* Older kernels do not support filtering. */
> +            if (!(nlmsghdr->nlmsg_flags & NLM_F_DUMP_FILTERED)) {
> +                filtered = false;
> +            }
> +            (*handle_msg)(&msg, data);
> +        }
> +    }
> +    ofpbuf_uninit(&buf);
> +    nl_dump_done(&dump);
> +
> +    return filtered;
> +}
> +/* END VENDORED CODE */
> +
> +#endif /* route-exchange-netlink-private.h */
> diff --git a/controller/route-exchange-netlink.c b/controller/route-exchange-netlink.c
> new file mode 100644
> index 000000000..707676f33
> --- /dev/null
> +++ b/controller/route-exchange-netlink.c
> @@ -0,0 +1,264 @@
> +/*
> + * Copyright (c) 2024 Canonical, Ltd.
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at:
> + *
> + *     http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +#include <config.h>
> +
> +#include <errno.h>
> +#include <inttypes.h>
> +#include <linux/rtnetlink.h>
> +#include <net/if.h>
> +
> +#include "netlink-socket.h"
> +#include "netlink.h"
> +#include "openvswitch/hmap.h"
> +#include "openvswitch/ofpbuf.h"
> +#include "openvswitch/vlog.h"
> +#include "packets.h"
> +
> +#include "route-exchange-netlink.h"
> +
> +VLOG_DEFINE_THIS_MODULE(route_exchange_netlink);
> +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
> +
> +/* Due to inlining of vendored code from OVS lib/route-table.c, we need to
> + * include this after the above VLOG statements. */
> +#include "route-exchange-netlink-private.h"
> +
> +#define TABLE_ID_VALID(table_id) (table_id != RT_TABLE_UNSPEC &&              \
> +                                  table_id != RT_TABLE_COMPAT &&              \
> +                                  table_id != RT_TABLE_DEFAULT &&             \
> +                                  table_id != RT_TABLE_MAIN &&                \
> +                                  table_id != RT_TABLE_LOCAL &&               \
> +                                  table_id != RT_TABLE_MAX)
> +
> +static int
> +modify_vrf(uint32_t type, uint32_t flags_arg,
> +           const char *ifname, uint32_t table_id)
> +{
> +    uint32_t flags = NLM_F_REQUEST | NLM_F_ACK;
> +    size_t linkinfo_off, infodata_off;
> +    struct ifinfomsg *ifinfo;
> +    struct ofpbuf request;
> +    int err;
> +
> +    flags |= flags_arg;
> +
> +    ofpbuf_init(&request, 0);
> +    nl_msg_put_nlmsghdr(&request, 0, type, flags);
> +    ifinfo = ofpbuf_put_zeros(&request, sizeof *ifinfo);
> +    nl_msg_put_string(&request, IFLA_IFNAME, ifname);
> +    if (type == RTM_DELLINK) {
> +        goto out;
> +    }
> +
> +    ifinfo->ifi_change = ifinfo->ifi_flags = IFF_UP;
> +    linkinfo_off = nl_msg_start_nested(&request, IFLA_LINKINFO);
> +    nl_msg_put_string(&request, IFLA_INFO_KIND, "vrf");
> +    infodata_off = nl_msg_start_nested(&request, IFLA_INFO_DATA);
> +    nl_msg_put_u32(&request, IFLA_VRF_TABLE, table_id);
> +    nl_msg_end_nested(&request, infodata_off);
> +    nl_msg_end_nested(&request, linkinfo_off);
> +
> +out:
> +    err = nl_transact(NETLINK_ROUTE, &request, NULL);
> +
> +    ofpbuf_uninit(&request);
> +
> +    return err;
> +}
> +
> +int
> +re_nl_create_vrf(const char *ifname, uint32_t table_id)
> +{
> +    uint32_t flags = NLM_F_CREATE | NLM_F_EXCL;
> +    uint32_t type = RTM_NEWLINK;
> +
> +    if (!TABLE_ID_VALID(table_id)) {
> +        VLOG_WARN_RL(&rl,
> +                     "attempt to create VRF using invalid table id %"PRIu32,
> +                     table_id);
> +        return EINVAL;
> +    }
> +
> +    return modify_vrf(type, flags, ifname, table_id);
> +}
> +
> +int
> +re_nl_delete_vrf(const char *ifname)
> +{
> +    return modify_vrf(RTM_DELLINK, 0, ifname, 0);
> +}
> +
> +static int
> +modify_route(uint32_t type, uint32_t flags_arg, uint32_t table_id,
> +             struct in6_addr *dst, uint32_t oif)
> +{
> +    uint32_t flags = NLM_F_REQUEST | NLM_F_ACK;
> +    bool is_ipv4 = IN6_IS_ADDR_V4MAPPED(dst);
> +    struct ofpbuf request;
> +    struct rtmsg *rt;
> +    int err;
> +
> +    flags |= flags_arg;
> +
> +    ofpbuf_init(&request, 0);
> +    nl_msg_put_nlmsghdr(&request, 0, type, flags);
> +    rt = ofpbuf_put_zeros(&request, sizeof *rt);
> +    rt->rtm_family = is_ipv4 ? AF_INET : AF_INET6;
> +    rt->rtm_table = RT_TABLE_UNSPEC; /* RTA_TABLE attribute allows id > 255 */
> +    if (type == RTM_DELROUTE) {
> +        rt->rtm_scope = RT_SCOPE_NOWHERE;
> +    } else {
> +        rt->rtm_protocol = RTPROT_BOOT;
> +        rt->rtm_scope = RT_SCOPE_UNIVERSE;
> +        rt->rtm_type = RTN_UNICAST;
> +    }
> +    rt->rtm_dst_len = is_ipv4 ? 32 : 128;
> +
> +    nl_msg_put_u32(&request, RTA_TABLE, table_id);
> +
> +    if (is_ipv4) {
> +        nl_msg_put_be32(&request, RTA_DST, in6_addr_get_mapped_ipv4(dst));
> +    } else {
> +        nl_msg_put_in6_addr(&request, RTA_DST, dst);
> +    }
> +
> +    if (oif) {
> +        nl_msg_put_u32(&request, RTA_OIF, oif);
> +    }
> +
> +    err = nl_transact(NETLINK_ROUTE, &request, NULL);
> +    ofpbuf_uninit(&request);
> +
> +    return err;
> +}
> +
> +int
> +re_nl_add_route(uint32_t table_id, struct in6_addr *dst, const char *ifname)
> +{
> +    uint32_t flags = NLM_F_CREATE | NLM_F_EXCL;
> +    uint32_t type = RTM_NEWROUTE;
> +
> +    if (!TABLE_ID_VALID(table_id)) {
> +        VLOG_WARN_RL(&rl,
> +                     "attempt to add route using invalid table id %"PRIu32,
> +                     table_id);
> +        return EINVAL;
> +    }
> +
> +    return modify_route(type, flags, table_id, dst, if_nametoindex(ifname));
> +}
> +
> +int
> +re_nl_delete_route(uint32_t table_id, struct in6_addr *dst)
> +{
> +    if (!TABLE_ID_VALID(table_id)) {
> +        VLOG_WARN_RL(&rl,
> +                     "attempt to delete route using invalid table id %"PRIu32,
> +                     table_id);
> +        return EINVAL;
> +    }
> +
> +    return modify_route(RTM_DELROUTE, 0, table_id, dst, 0);
> +}
> +
> +struct host_route_node {
> +    struct hmap_node hmap_node;
> +    uint32_t table_id;
> +    struct in6_addr addr;
> +};
> +
> +static uint32_t
> +host_route_hash(const struct in6_addr *dst)
> +{
> +    return hash_bytes(dst->s6_addr, 16, 0);
> +}
> +
> +void
> +host_route_insert(struct hmap *host_routes, uint32_t table_id,
> +                  struct in6_addr *dst)
> +{
> +    struct host_route_node *hr = xzalloc(sizeof *hr);
> +    hmap_insert(host_routes, &hr->hmap_node, host_route_hash(dst));
> +    hr->table_id = table_id;
> +    hr->addr = *dst;
> +}
> +
> +void
> +host_routes_destroy(struct hmap *host_routes)
> +{
> +    struct host_route_node *hr;
> +    HMAP_FOR_EACH_SAFE (hr, hmap_node, host_routes) {
> +        hmap_remove(host_routes, &hr->hmap_node);
> +        free(hr);
> +    }
> +    hmap_destroy(host_routes);
> +}
> +
> +static void
> +handle_route_msg_delete_host_routes(struct route_table_msg *msg, void *data)
> +{
> +    struct route_data *rd = &msg->rd;
> +    struct hmap *host_routes = data;
> +    struct host_route_node *hr;
> +    int err;
> +
>

Same idea as the previous comment, filter the netlink deletion.

diff --git a/controller/route-exchange-netlink.c b/controller/route-exchange-netlink.c
index 707676f33..14dd828fd 100644
--- a/controller/route-exchange-netlink.c
+++ b/controller/route-exchange-netlink.c
@@ -216,6 +216,12 @@ handle_route_msg_delete_host_routes(struct route_table_msg *msg, void *data)
     struct host_route_node *hr;
     int err;

+    /* Delete only relevant exchange-owned routes removed from config and not
+     * present in netlink. */
+    if (!msg->relevant) {
+        return;
+    }
+
     uint32_t hash = host_route_hash(&rd->rta_dst);
     HMAP_FOR_EACH_WITH_HASH (hr, hmap_node, hash, host_routes) {
         if (ipv6_addr_equals(&hr->addr, &rd->rta_dst)) {
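
As a rough model of what the callback would do per dumped route once the
early return is in place, consider the sketch below. It is a simplification
and not the patch's code: destinations are plain strings instead of struct
in6_addr, the desired set is an array instead of an hmap, and the names
sync_action, classify_installed_route() and example_desired are hypothetical.
The real callback also removes a matched entry from host_routes so that it
is not re-added afterwards, which this model leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Possible outcomes for one route found in the VRF table. */
enum sync_action {
    SYNC_IGNORE,   /* Not owned by the exchange: leave it alone. */
    SYNC_KEEP,     /* Owned and still desired: nothing to do. */
    SYNC_DELETE,   /* Owned but no longer desired: remove it. */
};

/* Hypothetical stand-in for the host_routes hmap used by the patch. */
static const char *const example_desired[] = {
    "172.16.42.100",
    "2001:db8:42::100",
};

static enum sync_action
classify_installed_route(bool relevant, const char *dst,
                         const char *const desired[], size_t n_desired)
{
    /* The proposed filter: skip everything the module does not own. */
    if (!relevant) {
        return SYNC_IGNORE;
    }

    for (size_t i = 0; i < n_desired; i++) {
        if (!strcmp(dst, desired[i])) {
            return SYNC_KEEP;
        }
    }
    return SYNC_DELETE;
}
```

In this model the local/broadcast routes from the log above fall into
SYNC_IGNORE, so no delete request is sent for them.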

Best regards,
Roberto



> +    uint32_t hash = host_route_hash(&rd->rta_dst);
> +    HMAP_FOR_EACH_WITH_HASH (hr, hmap_node, hash, host_routes) {
> +        if (ipv6_addr_equals(&hr->addr, &rd->rta_dst)) {
> +            hmap_remove(host_routes, &hr->hmap_node);
> +            free(hr);
> +            return;
> +        }
> +    }
> +    err = re_nl_delete_route(rd->rta_table_id, &rd->rta_dst);
> +    if (err) {
> +        char addr_s[INET6_ADDRSTRLEN + 1];
> +        VLOG_WARN_RL(&rl, "Delete route table_id=%"PRIu32" dst=%s: %s",
> +                     rd->rta_table_id,
> +                     ipv6_string_mapped(
> +                         addr_s, &rd->rta_dst) ? addr_s : "(invalid)",
> +                     ovs_strerror(err));
> +    }
> +}
> +
> +void
> +re_nl_sync_routes(uint32_t table_id, const char *ifname,
> +                  struct hmap *host_routes)
> +{
> +    /* Remove routes from the system that are not in the host_routes hmap and
> +     * remove entries from host_routes hmap that match routes already installed
> +     * in the system. */
> +    route_table_dump_one_table(table_id, handle_route_msg_delete_host_routes,
> +                               host_routes);
> +
> +    /* Add any remaining routes in the host_routes hmap to the system routing
> +     * table. */
> +    struct host_route_node *hr;
> +    HMAP_FOR_EACH_SAFE (hr, hmap_node, host_routes) {
> +        int err = re_nl_add_route(table_id, &hr->addr, ifname);
> +        if (err) {
> +            char addr_s[INET6_ADDRSTRLEN + 1];
> +            VLOG_WARN_RL(&rl, "Add route table_id=%"PRIu32" dst=%s dev=%s: %s",
> +                         table_id,
> +                         ipv6_string_mapped(
> +                             addr_s, &hr->addr) ? addr_s : "(invalid)",
> +                         ifname,
> +                         ovs_strerror(err));
> +        }
> +        hmap_remove(host_routes, &hr->hmap_node);
> +        free(hr);
> +    }
> +}
> diff --git a/controller/route-exchange-netlink.h b/controller/route-exchange-netlink.h
> new file mode 100644
> index 000000000..10a60a60e
> --- /dev/null
> +++ b/controller/route-exchange-netlink.h
> @@ -0,0 +1,40 @@
> +/*
> + * Copyright (c) 2024 Canonical
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at:
> + *
> + *     http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +#ifndef ROUTE_EXCHANGE_NETLINK_H
> +#define ROUTE_EXCHANGE_NETLINK_H 1
> +
> +#include <stdint.h>
> +
> +struct in6_addr;
> +struct hmap;
> +
> +int re_nl_create_vrf(const char *ifname, uint32_t table_id);
> +int re_nl_delete_vrf(const char *ifname);
> +
> +int re_nl_add_route(uint32_t table_id, struct in6_addr *dst,
> +                    const char *ifname);
> +int re_nl_delete_route(uint32_t table_id, struct in6_addr *dst);
> +
> +void re_nl_dump(uint32_t table_id);
> +
> +void host_route_insert(struct hmap *host_routes, uint32_t table_id,
> +                       struct in6_addr *dst);
> +void host_routes_destroy(struct hmap *);
> +void re_nl_sync_routes(uint32_t table_id, const char *ifname,
> +                       struct hmap *host_routes);
> +
> +#endif /* route-exchange-netlink.h */
> diff --git a/controller/test-route-exchange-netlink.c b/controller/test-route-exchange-netlink.c
> new file mode 100644
> index 000000000..7097d5182
> --- /dev/null
> +++ b/controller/test-route-exchange-netlink.c
> @@ -0,0 +1,173 @@
> +/* Copyright (c) 2021, Canonical
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at:
> + *
> + *     http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +#include <config.h>
> +
> +#include <errno.h>
> +
> +#include "openvswitch/hmap.h"
> +#include "openvswitch/types.h"
> +#include "packets.h"
> +#include "route-exchange-netlink.h"
> +#include "tests/ovstest.h"
> +
> +#define VRF_IFNAME "ovnvrf42"
> +#define TABLE_ID 42
> +
> +static void
> +test_re_nl_sync_routes(struct ovs_cmdl_context *ctx OVS_UNUSED)
> +{
> +    struct hmap host_routes = HMAP_INITIALIZER(&host_routes);
> +    struct in6_addr dst4, dst6;
> +    ovs_be32 ip;
> +    int err;
> +
> +    ipv6_parse("2001:db8:42::100", &dst6);
> +    host_route_insert(&host_routes, TABLE_ID, &dst6);
> +
> +    ip_parse("172.16.42.100", &ip);
> +    in6_addr_set_mapped_ipv4(&dst4, ip);
> +    host_route_insert(&host_routes, TABLE_ID, &dst4);
> +
> +    err = re_nl_create_vrf(VRF_IFNAME, TABLE_ID);
> +    ovs_assert(err == 0);
> +    err = re_nl_create_vrf(VRF_IFNAME, TABLE_ID);
> +    ovs_assert(err == EEXIST);
> +    re_nl_sync_routes(TABLE_ID, VRF_IFNAME, &host_routes);
> +    host_routes_destroy(&host_routes);
> +
> +    err = re_nl_add_route(TABLE_ID, &dst6, VRF_IFNAME);
> +    ovs_assert(err == EEXIST);
> +    err = re_nl_add_route(TABLE_ID, &dst4, VRF_IFNAME);
> +    ovs_assert(err == EEXIST);
> +
> +    hmap_init(&host_routes);
> +    re_nl_sync_routes(TABLE_ID, VRF_IFNAME, &host_routes);
> +    host_routes_destroy(&host_routes);
> +
> +    err = re_nl_add_route(TABLE_ID, &dst6, VRF_IFNAME);
> +    ovs_assert(err == 0);
> +    err = re_nl_add_route(TABLE_ID, &dst4, VRF_IFNAME);
> +    ovs_assert(err == 0);
> +
> +    err = re_nl_delete_vrf(VRF_IFNAME);
> +    ovs_assert(err == 0);
> +}
> +
> +static void
> +test_re_nl_create_vrf(struct ovs_cmdl_context *ctx OVS_UNUSED)
> +{
> +    int err;
> +
> +    err = re_nl_create_vrf(VRF_IFNAME, TABLE_ID);
> +    ovs_assert(err == 0);
> +    err = re_nl_create_vrf(VRF_IFNAME, TABLE_ID);
> +    ovs_assert(err == EEXIST);
> +    err = re_nl_delete_vrf(VRF_IFNAME);
> +    ovs_assert(err == 0);
> +}
> +
> +static void
> +test_re_nl_delete_vrf(struct ovs_cmdl_context *ctx OVS_UNUSED)
> +{
> +    int err;
> +
> +    err = re_nl_create_vrf(VRF_IFNAME, TABLE_ID);
> +    ovs_assert(err == 0);
> +    err = re_nl_delete_vrf(VRF_IFNAME);
> +    ovs_assert(err == 0);
> +    err = re_nl_delete_vrf(VRF_IFNAME);
> +    ovs_assert(err == ENODEV);
> +}
> +
> +static void
> +test_re_nl_add_route(struct ovs_cmdl_context *ctx OVS_UNUSED)
> +{
> +    int err;
> +    struct in6_addr dst4, dst6;
> +    ovs_be32 ip;
> +
> +    ipv6_parse("2001:db8:42::100", &dst6);
> +    ip_parse("172.16.42.100", &ip);
> +    in6_addr_set_mapped_ipv4(&dst4, ip);
> +
> +    err = re_nl_create_vrf(VRF_IFNAME, TABLE_ID);
> +    ovs_assert(err == 0);
> +
> +    err = re_nl_add_route(TABLE_ID, &dst6, VRF_IFNAME);
> +    ovs_assert(err == 0);
> +    err = re_nl_add_route(TABLE_ID, &dst4, VRF_IFNAME);
> +    ovs_assert(err == 0);
> +    err = re_nl_add_route(TABLE_ID, &dst6, VRF_IFNAME);
> +    ovs_assert(err == EEXIST);
> +    err = re_nl_add_route(TABLE_ID, &dst4, VRF_IFNAME);
> +    ovs_assert(err == EEXIST);
> +
> +    err = re_nl_delete_vrf(VRF_IFNAME);
> +    ovs_assert(err == 0);
> +}
> +
> +static void
> +test_re_nl_delete_route(struct ovs_cmdl_context *ctx OVS_UNUSED)
> +{
> +    int err;
> +    struct in6_addr dst4, dst6;
> +    ovs_be32 ip;
> +
> +    ipv6_parse("2001:db8:42::100", &dst6);
> +    ip_parse("172.16.42.100", &ip);
> +    in6_addr_set_mapped_ipv4(&dst4, ip);
> +
> +    err = re_nl_create_vrf(VRF_IFNAME, TABLE_ID);
> +    ovs_assert(err == 0);
> +
> +    err = re_nl_add_route(TABLE_ID, &dst6, VRF_IFNAME);
> +    ovs_assert(err == 0);
> +    err = re_nl_add_route(TABLE_ID, &dst4, VRF_IFNAME);
> +    ovs_assert(err == 0);
> +
> +    err = re_nl_delete_route(TABLE_ID, &dst6);
> +    ovs_assert(err == 0);
> +    err = re_nl_delete_route(TABLE_ID, &dst4);
> +    ovs_assert(err == 0);
> +    err = re_nl_delete_route(TABLE_ID, &dst6);
> +    ovs_assert(err == ESRCH);
> +    err = re_nl_delete_route(TABLE_ID, &dst4);
> +    ovs_assert(err == ESRCH);
> +
> +    err = re_nl_delete_vrf(VRF_IFNAME);
> +    ovs_assert(err == 0);
> +}
> +
> +static void
> +test_route_exchange_netlink_main(int argc, char *argv[])
> +{
> +    set_program_name(argv[0]);
> +    static const struct ovs_cmdl_command commands[] = {
> +        {"sync-routes", NULL, 0, 0, test_re_nl_sync_routes, OVS_RO},
> +        {"create-vrf", NULL, 0, 0, test_re_nl_create_vrf, OVS_RO},
> +        {"delete-vrf", NULL, 0, 0, test_re_nl_delete_vrf, OVS_RO},
> +        {"add-route", NULL, 0, 0, test_re_nl_add_route, OVS_RO},
> +        {"delete-route", NULL, 0, 0, test_re_nl_delete_route, OVS_RO},
> +        {NULL, NULL, 0, 0, NULL, OVS_RO},
> +    };
> +    struct ovs_cmdl_context ctx;
> +    ctx.argc = argc - 1;
> +    ctx.argv = argv + 1;
> +    ovs_cmdl_run_command(&ctx, commands);
> +}
> +
> +OVSTEST_REGISTER("test-route-exchange-netlink",
> +                 test_route_exchange_netlink_main);
> diff --git a/m4/ovn.m4 b/m4/ovn.m4
> index ebe4c9612..e8f30e0ac 100644
> --- a/m4/ovn.m4
> +++ b/m4/ovn.m4
> @@ -576,3 +576,28 @@ AC_DEFUN([OVN_CHECK_UNBOUND],
>     fi
>     AM_CONDITIONAL([HAVE_UNBOUND], [test "$HAVE_UNBOUND" = yes])
>     AC_SUBST([HAVE_UNBOUND])])
> +
> +dnl Checks for Netlink support.
> +AC_DEFUN([OVS_CHECK_NETLINK],
> +  [AC_CHECK_HEADER([linux/netlink.h],
> +                   [HAVE_NETLINK=yes],
> +                   [HAVE_NETLINK=no],
> +                   [#include <sys/socket.h>
> +   ])
> +   AM_CONDITIONAL([HAVE_NETLINK], [test "$HAVE_NETLINK" = yes])
> +   if test "$HAVE_NETLINK" = yes; then
> +      AC_DEFINE([HAVE_NETLINK], [1],
> +                [Define to 1 if Netlink protocol is available.])
> +   fi])
> +
> +dnl OVS_CHECK_LINUX_NETLINK
> +dnl
> +dnl Configure Linux netlink compat.
> +AC_DEFUN([OVS_CHECK_LINUX_NETLINK], [
> +  AC_COMPILE_IFELSE([
> +    AC_LANG_PROGRAM([#include <linux/netlink.h>], [
> +        struct nla_bitfield32 x =  { 0 };
> +    ])],
> +    [AC_DEFINE([HAVE_NLA_BITFIELD32], [1],
> +    [Define to 1 if struct nla_bitfield32 is available.])])
> +])
> diff --git a/tests/automake.mk b/tests/automake.mk
> index 3899c9e80..0087bff69 100644
> --- a/tests/automake.mk
> +++ b/tests/automake.mk
> @@ -55,7 +55,8 @@ SYSTEM_DPDK_TESTSUITE_AT = \
>  SYSTEM_KMOD_TESTSUITE_AT = \
>         tests/system-kmod-macros.at \
>         tests/system-kmod-testsuite.at \
> -       tests/system-ovn-kmod.at
> +       tests/system-ovn-kmod.at \
> +       tests/ovn-system-route-exchange.at
>
>  SYSTEM_USERSPACE_TESTSUITE_AT = \
>         tests/system-userspace-testsuite.at \
> @@ -290,6 +291,11 @@ tests_ovstest_SOURCES = \
>         lib/test-ovn-features.c \
>         northd/test-ipam.c
>
> +if HAVE_NETLINK
> +tests_ovstest_SOURCES += \
> +       controller/test-route-exchange-netlink.c
> +endif
> +
>  tests_ovstest_LDADD = $(OVS_LIBDIR)/daemon.lo \
>      $(OVS_LIBDIR)/libopenvswitch.la lib/libovn.la \
>         controller/binding.$(OBJEXT) \
> @@ -307,6 +313,11 @@ tests_ovstest_LDADD = $(OVS_LIBDIR)/daemon.lo \
>         controller/vif-plug.$(OBJEXT) \
>         northd/ipam.$(OBJEXT)
>
> +if HAVE_NETLINK
> +tests_ovstest_LDADD += \
> +       controller/route-exchange-netlink.$(OBJEXT)
> +endif
> +
>  # Python tests.
>  CHECK_PYFILES = \
>         tests/test-l7.py \
> diff --git a/tests/ovn-system-route-exchange.at b/tests/ovn-system-route-exchange.at
> new file mode 100644
> index 000000000..36d7e3d2a
> --- /dev/null
> +++ b/tests/ovn-system-route-exchange.at
> @@ -0,0 +1,16 @@
> +#
> +# System level unit tests for controller/route-exchange-netlink.c module.
> +#
> +AT_BANNER([OVN system level unit tests])
> +
> +AT_SETUP([system level unit test -- route-exchange-netlink])
> +AT_KEYWORDS([route-exchange])
> +
> +CHECK_VRF()
> +
> +AT_CHECK([ovstest test-route-exchange-netlink sync-routes], [0], [])
> +AT_CHECK([ovstest test-route-exchange-netlink create-vrf], [0], [])
> +AT_CHECK([ovstest test-route-exchange-netlink delete-vrf], [0], [])
> +AT_CHECK([ovstest test-route-exchange-netlink add-route], [0], [])
> +AT_CHECK([ovstest test-route-exchange-netlink delete-route], [0], [])
> +AT_CLEANUP
> diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
> index 691c271a3..159396a40 100644
> --- a/tests/system-common-macros.at
> +++ b/tests/system-common-macros.at
> @@ -519,3 +519,15 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
>  /failed to query port patch-.*/d
>  /.*terminating with signal 15.*/d"])
>  ]))
> +
> +# CHECK_VRF()
> +#
> +# Perform a requirements check for running VRF tests.
> +#
> +m4_define([CHECK_VRF],
> +[
> +    rc=0
> +    modprobe vrf || rc=$?
> +    AT_SKIP_IF([test $rc -ne 0])
> +    on_exit 'modprobe -r vrf'
> +])
> diff --git a/tests/system-kmod-testsuite.at b/tests/system-kmod-testsuite.at
> index 5ba35babb..16b633ece 100644
> --- a/tests/system-kmod-testsuite.at
> +++ b/tests/system-kmod-testsuite.at
> @@ -25,3 +25,4 @@ m4_include([tests/system-kmod-macros.at])
>
>  m4_include([tests/system-ovn.at])
>  m4_include([tests/system-ovn-kmod.at])
> +m4_include([tests/ovn-system-route-exchange.at])
> --
> 2.45.2
>

Patch

diff --git a/configure.ac b/configure.ac
index 6a6b0db6a..6f0f485c4 100644
--- a/configure.ac
+++ b/configure.ac
@@ -87,6 +87,8 @@  OVS_CHECK_WIN32
 OVS_CHECK_VISUAL_STUDIO_DDK
 OVN_CHECK_COVERAGE
 OVS_CHECK_NDEBUG
+OVS_CHECK_NETLINK
+OVS_CHECK_LINUX_NETLINK
 OVS_CHECK_OPENSSL
 OVN_CHECK_LOGDIR
 OVN_CHECK_PYTHON3
diff --git a/controller/automake.mk b/controller/automake.mk
index ed93cfb3c..006e884dc 100644
--- a/controller/automake.mk
+++ b/controller/automake.mk
@@ -51,6 +51,13 @@  controller_ovn_controller_SOURCES = \
 	controller/ct-zone.h \
 	controller/ct-zone.c
 
+if HAVE_NETLINK
+controller_ovn_controller_SOURCES += \
+	controller/route-exchange-netlink.h \
+	controller/route-exchange-netlink-private.h \
+	controller/route-exchange-netlink.c
+endif
+
 controller_ovn_controller_LDADD = lib/libovn.la $(OVS_LIBDIR)/libopenvswitch.la
 man_MANS += controller/ovn-controller.8
 EXTRA_DIST += controller/ovn-controller.8.xml
diff --git a/controller/route-exchange-netlink-private.h b/controller/route-exchange-netlink-private.h
new file mode 100644
index 000000000..4c2559895
--- /dev/null
+++ b/controller/route-exchange-netlink-private.h
@@ -0,0 +1,243 @@ 
+/*
+ * Copyright (c) 2024 Canonical, Ltd.
+ * Copyright (c) 2011, 2012, 2013, 2014, 2017 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef ROUTE_EXCHANGE_NETLINK_PRIVATE_H
+#define ROUTE_EXCHANGE_NETLINK_PRIVATE_H 1
+
+/*
+ * NOTE(fnordahl): The below code is stolen directly from OVS lib/route-table.c
+ * with the addition of inlining of function definitions for practical reasons
+ * and modifications:
+ *
+ * struct route_data:
+ *
+ * - Add rta_table_id.
+ *
+ * route_table_parse():
+ *
+ * - Consider non-standard routing tables and store the table_id.
+ *
+ * route_table_dump_one_table():
+ *
+ * - Use uint32_t for table id and pass it to kernel using the RTA_TABLE
+ *   attribute to allow use of table IDs greater than 255.
+ * - Use callback with argument instead of hard coded call to static function
+ *   route_table_handle_msg().
+ *
+ * Ideally we would upstream those changes along with export of interesting
+ * data structures and functions to OVS, but in the interest of time we vendor
+ * the code here for now.
+ *
+ * BEGIN VENDORED CODE FROM OVS lib/route-table.c
+ */
+struct route_data {
+    /* Copied from struct rtmsg. */
+    unsigned char rtm_dst_len;
+    bool local;
+
+    /* Extracted from Netlink attributes. */
+    struct in6_addr rta_dst; /* 0 if missing. */
+    struct in6_addr rta_prefsrc; /* 0 if missing. */
+    struct in6_addr rta_gw;
+    char ifname[IFNAMSIZ]; /* Interface name. */
+    uint32_t mark;
+    uint32_t rta_table_id; /* 0 if missing. */
+};
+
+/* A digested version of a route message sent down by the kernel to indicate
+ * that a route has changed. */
+struct route_table_msg {
+    bool relevant;        /* Should this message be processed? */
+    int nlmsg_type;       /* e.g. RTM_NEWROUTE, RTM_DELROUTE. */
+    struct route_data rd; /* Data parsed from this message. */
+};
+
+/* Return RTNLGRP_IPV4_ROUTE or RTNLGRP_IPV6_ROUTE on success, 0 on parse
+ * error. */
+static inline int
+route_table_parse(struct ofpbuf *buf, struct route_table_msg *change)
+{
+    bool parsed, ipv4 = false;
+
+    static const struct nl_policy policy[] = {
+        [RTA_DST] = { .type = NL_A_U32, .optional = true  },
+        [RTA_OIF] = { .type = NL_A_U32, .optional = true },
+        [RTA_GATEWAY] = { .type = NL_A_U32, .optional = true },
+        [RTA_MARK] = { .type = NL_A_U32, .optional = true },
+        [RTA_PREFSRC] = { .type = NL_A_U32, .optional = true },
+        [RTA_TABLE] = { .type = NL_A_U32, .optional = true },
+    };
+
+    static const struct nl_policy policy6[] = {
+        [RTA_DST] = { .type = NL_A_IPV6, .optional = true },
+        [RTA_OIF] = { .type = NL_A_U32, .optional = true },
+        [RTA_MARK] = { .type = NL_A_U32, .optional = true },
+        [RTA_GATEWAY] = { .type = NL_A_IPV6, .optional = true },
+        [RTA_PREFSRC] = { .type = NL_A_IPV6, .optional = true },
+        [RTA_TABLE] = { .type = NL_A_U32, .optional = true },
+    };
+
+    struct nlattr *attrs[ARRAY_SIZE(policy)];
+    const struct rtmsg *rtm;
+
+    rtm = ofpbuf_at(buf, NLMSG_HDRLEN, sizeof *rtm);
+
+    if (rtm->rtm_family == AF_INET) {
+        parsed = nl_policy_parse(buf, NLMSG_HDRLEN + sizeof(struct rtmsg),
+                                 policy, attrs, ARRAY_SIZE(policy));
+        ipv4 = true;
+    } else if (rtm->rtm_family == AF_INET6) {
+        parsed = nl_policy_parse(buf, NLMSG_HDRLEN + sizeof(struct rtmsg),
+                                 policy6, attrs, ARRAY_SIZE(policy6));
+    } else {
+        VLOG_DBG_RL(&rl, "received non AF_INET rtnetlink route message");
+        return 0;
+    }
+
+    if (parsed) {
+        const struct nlmsghdr *nlmsg;
+        uint32_t table_id;
+        int rta_oif;      /* Output interface index. */
+
+        nlmsg = buf->data;
+
+        memset(change, 0, sizeof *change);
+        change->relevant = true;
+
+        if (rtm->rtm_scope == RT_SCOPE_NOWHERE) {
+            change->relevant = false;
+        }
+
+        if (rtm->rtm_type != RTN_UNICAST &&
+            rtm->rtm_type != RTN_LOCAL) {
+            change->relevant = false;
+        }
+
+        table_id = rtm->rtm_table;
+        if (attrs[RTA_TABLE]) {
+            table_id = nl_attr_get_u32(attrs[RTA_TABLE]);
+            change->rd.rta_table_id = table_id;
+        }
+
+        change->nlmsg_type     = nlmsg->nlmsg_type;
+        change->rd.rtm_dst_len = rtm->rtm_dst_len + (ipv4 ? 96 : 0);
+        change->rd.local = rtm->rtm_type == RTN_LOCAL;
+        if (attrs[RTA_OIF]) {
+            rta_oif = nl_attr_get_u32(attrs[RTA_OIF]);
+
+            if (!if_indextoname(rta_oif, change->rd.ifname)) {
+                int error = errno;
+
+                VLOG_DBG_RL(&rl, "Could not find interface name[%u]: %s",
+                            rta_oif, ovs_strerror(error));
+                if (error == ENXIO) {
+                    change->relevant = false;
+                } else {
+                    return 0;
+                }
+            }
+        }
+
+        if (attrs[RTA_DST]) {
+            if (ipv4) {
+                ovs_be32 dst;
+                dst = nl_attr_get_be32(attrs[RTA_DST]);
+                in6_addr_set_mapped_ipv4(&change->rd.rta_dst, dst);
+            } else {
+                change->rd.rta_dst = nl_attr_get_in6_addr(attrs[RTA_DST]);
+            }
+        } else if (ipv4) {
+            in6_addr_set_mapped_ipv4(&change->rd.rta_dst, 0);
+        }
+        if (attrs[RTA_PREFSRC]) {
+            if (ipv4) {
+                ovs_be32 prefsrc;
+                prefsrc = nl_attr_get_be32(attrs[RTA_PREFSRC]);
+                in6_addr_set_mapped_ipv4(&change->rd.rta_prefsrc, prefsrc);
+            } else {
+                change->rd.rta_prefsrc =
+                    nl_attr_get_in6_addr(attrs[RTA_PREFSRC]);
+            }
+        }
+        if (attrs[RTA_GATEWAY]) {
+            if (ipv4) {
+                ovs_be32 gw;
+                gw = nl_attr_get_be32(attrs[RTA_GATEWAY]);
+                in6_addr_set_mapped_ipv4(&change->rd.rta_gw, gw);
+            } else {
+                change->rd.rta_gw = nl_attr_get_in6_addr(attrs[RTA_GATEWAY]);
+            }
+        }
+        if (attrs[RTA_MARK]) {
+            change->rd.mark = nl_attr_get_u32(attrs[RTA_MARK]);
+        }
+    } else {
+        VLOG_DBG_RL(&rl, "received unparseable rtnetlink route message");
+        return 0;
+    }
+
+    /* Success. */
+    return ipv4 ? RTNLGRP_IPV4_ROUTE : RTNLGRP_IPV6_ROUTE;
+}
+
+static inline bool
+route_table_dump_one_table(
+    uint32_t id,
+    void (*handle_msg)(struct route_table_msg *, void *),
+    void *data)
+{
+    uint64_t reply_stub[NL_DUMP_BUFSIZE / 8];
+    struct ofpbuf request, reply, buf;
+    struct rtmsg *rq_msg;
+    bool filtered = true;
+    struct nl_dump dump;
+
+    ofpbuf_init(&request, 0);
+
+    nl_msg_put_nlmsghdr(&request, 0, RTM_GETROUTE, NLM_F_REQUEST);
+
+    rq_msg = ofpbuf_put_zeros(&request, sizeof *rq_msg);
+    rq_msg->rtm_family = AF_UNSPEC;
+    rq_msg->rtm_table = RT_TABLE_UNSPEC;
+
+    nl_msg_put_u32(&request, RTA_TABLE, id);
+
+    nl_dump_start(&dump, NETLINK_ROUTE, &request);
+    ofpbuf_uninit(&request);
+
+    ofpbuf_use_stub(&buf, reply_stub, sizeof reply_stub);
+    while (nl_dump_next(&dump, &reply, &buf)) {
+        struct route_table_msg msg;
+
+        if (route_table_parse(&reply, &msg)) {
+            struct nlmsghdr *nlmsghdr = nl_msg_nlmsghdr(&reply);
+
+            /* Older kernels do not support filtering. */
+            if (!(nlmsghdr->nlmsg_flags & NLM_F_DUMP_FILTERED)) {
+                filtered = false;
+            }
+            (*handle_msg)(&msg, data);
+        }
+    }
+    ofpbuf_uninit(&buf);
+    nl_dump_done(&dump);
+
+    return filtered;
+}
+/* END VENDORED CODE */
+
+#endif /* route-exchange-netlink-private.h */
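
Note on the address representation used by route_table_parse() above: IPv4 destinations are stored as IPv4-mapped IPv6 addresses, and the prefix length is shifted by 96 bits so that both families share one 128-bit space. The following is a minimal standalone sketch of that convention, not code from the patch. The helper names map_ipv4() and mapped_prefix_len() are hypothetical; the patch itself relies on in6_addr_set_mapped_ipv4() from OVS.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store an IPv4 address (network byte order) as an
 * IPv4-mapped IPv6 address, i.e. ::ffff:a.b.c.d. */
static void
map_ipv4(struct in6_addr *dst, uint32_t ip_be)
{
    assert(dst != NULL);
    memset(dst, 0, sizeof *dst);
    dst->s6_addr[10] = 0xff;
    dst->s6_addr[11] = 0xff;
    memcpy(&dst->s6_addr[12], &ip_be, sizeof ip_be);
}

/* Hypothetical helper: express a prefix length in the shared 128-bit space.
 * An IPv4 /32 host route becomes /128, matching the "+ 96" adjustment
 * applied to rtm_dst_len in the vendored parser. */
static unsigned int
mapped_prefix_len(unsigned int rtm_dst_len, bool is_ipv4)
{
    return rtm_dst_len + (is_ipv4 ? 96 : 0);
}
```

With this convention a single struct in6_addr key can index both IPv4 and IPv6 host routes, which is what lets the patch keep one hmap for both families.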
diff --git a/controller/route-exchange-netlink.c b/controller/route-exchange-netlink.c
new file mode 100644
index 000000000..707676f33
--- /dev/null
+++ b/controller/route-exchange-netlink.c
@@ -0,0 +1,264 @@ 
+/*
+ * Copyright (c) 2024 Canonical, Ltd.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <config.h>
+
+#include <errno.h>
+#include <inttypes.h>
+#include <linux/rtnetlink.h>
+#include <net/if.h>
+
+#include "netlink-socket.h"
+#include "netlink.h"
+#include "openvswitch/hmap.h"
+#include "openvswitch/ofpbuf.h"
+#include "openvswitch/vlog.h"
+#include "packets.h"
+
+#include "route-exchange-netlink.h"
+
+VLOG_DEFINE_THIS_MODULE(route_exchange_netlink);
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
+
+/* Due to inlining of vendored code from OVS lib/route-table.c, we need to
+ * include this after the above VLOG statements. */
+#include "route-exchange-netlink-private.h"
+
+#define TABLE_ID_VALID(table_id) (table_id != RT_TABLE_UNSPEC &&              \
+                                  table_id != RT_TABLE_COMPAT &&              \
+                                  table_id != RT_TABLE_DEFAULT &&             \
+                                  table_id != RT_TABLE_MAIN &&                \
+                                  table_id != RT_TABLE_LOCAL &&               \
+                                  table_id != RT_TABLE_MAX)
+
+static int
+modify_vrf(uint32_t type, uint32_t flags_arg,
+           const char *ifname, uint32_t table_id)
+{
+    uint32_t flags = NLM_F_REQUEST | NLM_F_ACK;
+    size_t linkinfo_off, infodata_off;
+    struct ifinfomsg *ifinfo;
+    struct ofpbuf request;
+    int err;
+
+    flags |= flags_arg;
+
+    ofpbuf_init(&request, 0);
+    nl_msg_put_nlmsghdr(&request, 0, type, flags);
+    ifinfo = ofpbuf_put_zeros(&request, sizeof *ifinfo);
+    nl_msg_put_string(&request, IFLA_IFNAME, ifname);
+    if (type == RTM_DELLINK) {
+        goto out;
+    }
+
+    ifinfo->ifi_change = ifinfo->ifi_flags = IFF_UP;
+    linkinfo_off = nl_msg_start_nested(&request, IFLA_LINKINFO);
+    nl_msg_put_string(&request, IFLA_INFO_KIND, "vrf");
+    infodata_off = nl_msg_start_nested(&request, IFLA_INFO_DATA);
+    nl_msg_put_u32(&request, IFLA_VRF_TABLE, table_id);
+    nl_msg_end_nested(&request, infodata_off);
+    nl_msg_end_nested(&request, linkinfo_off);
+
+out:
+    err = nl_transact(NETLINK_ROUTE, &request, NULL);
+
+    ofpbuf_uninit(&request);
+
+    return err;
+}
+
+int
+re_nl_create_vrf(const char *ifname, uint32_t table_id)
+{
+    uint32_t flags = NLM_F_CREATE | NLM_F_EXCL;
+    uint32_t type = RTM_NEWLINK;
+
+    if (!TABLE_ID_VALID(table_id)) {
+        VLOG_WARN_RL(&rl,
+                     "attempt to create VRF using invalid table id %"PRIu32,
+                     table_id);
+        return EINVAL;
+    }
+
+    return modify_vrf(type, flags, ifname, table_id);
+}
+
+int
+re_nl_delete_vrf(const char *ifname)
+{
+    return modify_vrf(RTM_DELLINK, 0, ifname, 0);
+}
+
+static int
+modify_route(uint32_t type, uint32_t flags_arg, uint32_t table_id,
+             struct in6_addr *dst, uint32_t oif)
+{
+    uint32_t flags = NLM_F_REQUEST | NLM_F_ACK;
+    bool is_ipv4 = IN6_IS_ADDR_V4MAPPED(dst);
+    struct ofpbuf request;
+    struct rtmsg *rt;
+    int err;
+
+    flags |= flags_arg;
+
+    ofpbuf_init(&request, 0);
+    nl_msg_put_nlmsghdr(&request, 0, type, flags);
+    rt = ofpbuf_put_zeros(&request, sizeof *rt);
+    rt->rtm_family = is_ipv4 ? AF_INET : AF_INET6;
+    rt->rtm_table = RT_TABLE_UNSPEC; /* RTA_TABLE attribute allows id > 255 */
+    if (type == RTM_DELROUTE) {
+        rt->rtm_scope = RT_SCOPE_NOWHERE;
+    } else {
+        rt->rtm_protocol = RTPROT_BOOT;
+        rt->rtm_scope = RT_SCOPE_UNIVERSE;
+        rt->rtm_type = RTN_UNICAST;
+    }
+    rt->rtm_dst_len = is_ipv4 ? 32 : 128;
+
+    nl_msg_put_u32(&request, RTA_TABLE, table_id);
+
+    if (is_ipv4) {
+        nl_msg_put_be32(&request, RTA_DST, in6_addr_get_mapped_ipv4(dst));
+    } else {
+        nl_msg_put_in6_addr(&request, RTA_DST, dst);
+    }
+
+    if (oif) {
+        nl_msg_put_u32(&request, RTA_OIF, oif);
+    }
+
+    err = nl_transact(NETLINK_ROUTE, &request, NULL);
+    ofpbuf_uninit(&request);
+
+    return err;
+}
+
+int
+re_nl_add_route(uint32_t table_id, struct in6_addr *dst, const char *ifname)
+{
+    uint32_t flags = NLM_F_CREATE | NLM_F_EXCL;
+    uint32_t type = RTM_NEWROUTE;
+
+    if (!TABLE_ID_VALID(table_id)) {
+        VLOG_WARN_RL(&rl,
+                     "attempt to add route using invalid table id %"PRIu32,
+                     table_id);
+        return EINVAL;
+    }
+
+    return modify_route(type, flags, table_id, dst, if_nametoindex(ifname));
+}
+
+int
+re_nl_delete_route(uint32_t table_id, struct in6_addr *dst)
+{
+    if (!TABLE_ID_VALID(table_id)) {
+        VLOG_WARN_RL(&rl,
+                     "attempt to delete route using invalid table id %"PRIu32,
+                     table_id);
+        return EINVAL;
+    }
+
+    return modify_route(RTM_DELROUTE, 0, table_id, dst, 0);
+}
+
+struct host_route_node {
+    struct hmap_node hmap_node;
+    uint32_t table_id;
+    struct in6_addr addr;
+};
+
+static uint32_t
+host_route_hash(const struct in6_addr *dst)
+{
+    return hash_bytes(dst->s6_addr, 16, 0);
+}
+
+void
+host_route_insert(struct hmap *host_routes, uint32_t table_id,
+                  struct in6_addr *dst)
+{
+    struct host_route_node *hr = xzalloc(sizeof *hr);
+    hmap_insert(host_routes, &hr->hmap_node, host_route_hash(dst));
+    hr->table_id = table_id;
+    hr->addr = *dst;
+}
+
+void
+host_routes_destroy(struct hmap *host_routes)
+{
+    struct host_route_node *hr;
+    HMAP_FOR_EACH_SAFE (hr, hmap_node, host_routes) {
+        hmap_remove(host_routes, &hr->hmap_node);
+        free(hr);
+    }
+    hmap_destroy(host_routes);
+}
+
+static void
+handle_route_msg_delete_host_routes(struct route_table_msg *msg, void *data)
+{
+    struct route_data *rd = &msg->rd;
+    struct hmap *host_routes = data;
+    struct host_route_node *hr;
+    int err;
+
+    uint32_t hash = host_route_hash(&rd->rta_dst);
+    HMAP_FOR_EACH_WITH_HASH (hr, hmap_node, hash, host_routes) {
+        if (ipv6_addr_equals(&hr->addr, &rd->rta_dst)) {
+            hmap_remove(host_routes, &hr->hmap_node);
+            free(hr);
+            return;
+        }
+    }
+    err = re_nl_delete_route(rd->rta_table_id, &rd->rta_dst);
+    if (err) {
+        char addr_s[INET6_ADDRSTRLEN + 1];
+        VLOG_WARN_RL(&rl, "Delete route table_id=%"PRIu32" dst=%s: %s",
+                     rd->rta_table_id,
+                     ipv6_string_mapped(
+                         addr_s, &rd->rta_dst) ? addr_s : "(invalid)",
+                     ovs_strerror(err));
+    }
+}
+
+void
+re_nl_sync_routes(uint32_t table_id, const char *ifname,
+                  struct hmap *host_routes)
+{
+    /* Remove routes from the system that are not in the host_routes hmap and
+     * remove entries from host_routes hmap that match routes already installed
+     * in the system. */
+    route_table_dump_one_table(table_id, handle_route_msg_delete_host_routes,
+                               host_routes);
+
+    /* Add any remaining routes in the host_routes hmap to the system routing
+     * table. */
+    struct host_route_node *hr;
+    HMAP_FOR_EACH_SAFE (hr, hmap_node, host_routes) {
+        int err = re_nl_add_route(table_id, &hr->addr, ifname);
+        if (err) {
+            char addr_s[INET6_ADDRSTRLEN + 1];
+            VLOG_WARN_RL(&rl, "Add route table_id=%"PRIu32" dst=%s dev=%s: %s",
+                         table_id,
+                         ipv6_string_mapped(
+                             addr_s, &hr->addr) ? addr_s : "(invalid)",
+                         ifname, ovs_strerror(err));
+        }
+        hmap_remove(host_routes, &hr->hmap_node);
+        free(hr);
+    }
+}
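
Note on re_nl_sync_routes() above: it reconciles two sets. Routes reported by the table dump that are not in host_routes are deleted, routes present in both are left alone, and whatever remains in host_routes is added. The sketch below shows the same two-pass idea on plain arrays so it can be read without the OVS hmap and Netlink helpers. Everything in it (struct route, reconcile(), count_to_add(), string keys) is a hypothetical stand-in and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one desired host route.  The patch keys on
 * struct in6_addr inside an hmap; a string in an array is used here only
 * to keep the sketch self-contained. */
struct route {
    const char *dst;
    bool matched;   /* Set when the route is already installed. */
};

/* Pass 1: walk 'installed' (what a table dump would report).  Installed
 * routes that are not desired are collected in 'to_delete'; desired routes
 * that are already installed are flagged so they are not added again.
 * 'to_delete' must have room for 'n_installed' entries.  Returns the number
 * of entries written to 'to_delete'. */
static size_t
reconcile(struct route *desired, size_t n_desired,
          const char *const *installed, size_t n_installed,
          const char **to_delete)
{
    size_t n_delete = 0;

    assert(to_delete != NULL);
    for (size_t i = 0; i < n_installed; i++) {
        bool found = false;

        for (size_t j = 0; j < n_desired; j++) {
            if (!strcmp(desired[j].dst, installed[i])) {
                desired[j].matched = true;
                found = true;
                break;
            }
        }
        if (!found) {
            to_delete[n_delete++] = installed[i];
        }
    }
    return n_delete;
}

/* Pass 2: count the desired routes that still have to be added. */
static size_t
count_to_add(const struct route *desired, size_t n_desired)
{
    size_t n = 0;

    for (size_t j = 0; j < n_desired; j++) {
        if (!desired[j].matched) {
            n++;
        }
    }
    return n;
}
```

The patch removes matched entries from the hmap instead of flagging them, which has the same effect: only unmatched entries survive to the add pass. A hash lookup also avoids the quadratic scan used in this sketch.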
diff --git a/controller/route-exchange-netlink.h b/controller/route-exchange-netlink.h
new file mode 100644
index 000000000..10a60a60e
--- /dev/null
+++ b/controller/route-exchange-netlink.h
@@ -0,0 +1,40 @@ 
+/*
+ * Copyright (c) 2024 Canonical
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef ROUTE_EXCHANGE_NETLINK_H
+#define ROUTE_EXCHANGE_NETLINK_H 1
+
+#include <stdint.h>
+
+struct in6_addr;
+struct hmap;
+
+int re_nl_create_vrf(const char *ifname, uint32_t table_id);
+int re_nl_delete_vrf(const char *ifname);
+
+int re_nl_add_route(uint32_t table_id, struct in6_addr *dst,
+                    const char *ifname);
+int re_nl_delete_route(uint32_t table_id, struct in6_addr *dst);
+
+void re_nl_dump(uint32_t table_id);
+
+void host_route_insert(struct hmap *host_routes, uint32_t table_id,
+                       struct in6_addr *dst);
+void host_routes_destroy(struct hmap *);
+void re_nl_sync_routes(uint32_t table_id, const char *ifname,
+                       struct hmap *host_routes);
+
+#endif /* route-exchange-netlink.h */
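
Note on the table_id argument taken by the functions declared above: the implementation rejects the kernel's reserved routing tables through the TABLE_ID_VALID() macro and returns EINVAL for them. A minimal sketch of the same check written as an inline function follows. The SKETCH_* constants are assumptions that mirror enum rt_class_t in <linux/rtnetlink.h>; they are repeated only so the example builds without kernel headers, and table_id_valid() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: values mirror enum rt_class_t from <linux/rtnetlink.h>. */
enum {
    SKETCH_RT_TABLE_UNSPEC = 0,
    SKETCH_RT_TABLE_COMPAT = 252,
    SKETCH_RT_TABLE_DEFAULT = 253,
    SKETCH_RT_TABLE_MAIN = 254,
    SKETCH_RT_TABLE_LOCAL = 255,
};
#define SKETCH_RT_TABLE_MAX UINT32_MAX

/* The named reserved ids all fit in the 8-bit rtm_table header field. */
static_assert(SKETCH_RT_TABLE_LOCAL <= UINT8_MAX,
              "reserved table ids fit in rtm_table");

/* Hypothetical function form of TABLE_ID_VALID(): true if 'table_id' is
 * not one of the reserved tables and may be dedicated to a VRF. */
static inline bool
table_id_valid(uint32_t table_id)
{
    return table_id != SKETCH_RT_TABLE_UNSPEC
           && table_id != SKETCH_RT_TABLE_COMPAT
           && table_id != SKETCH_RT_TABLE_DEFAULT
           && table_id != SKETCH_RT_TABLE_MAIN
           && table_id != SKETCH_RT_TABLE_LOCAL
           && table_id != SKETCH_RT_TABLE_MAX;
}
```

A function evaluates its argument exactly once and gives it a type, whereas the macro in the patch expands an unparenthesized table_id six times. That is harmless for the current callers, which pass a plain variable, but it is something to keep in mind if the macro is reused elsewhere.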
diff --git a/controller/test-route-exchange-netlink.c b/controller/test-route-exchange-netlink.c
new file mode 100644
index 000000000..7097d5182
--- /dev/null
+++ b/controller/test-route-exchange-netlink.c
@@ -0,0 +1,173 @@ 
+/* Copyright (c) 2021, Canonical
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <config.h>
+
+#include <errno.h>
+
+#include "openvswitch/hmap.h"
+#include "openvswitch/types.h"
+#include "packets.h"
+#include "route-exchange-netlink.h"
+#include "tests/ovstest.h"
+
+#define VRF_IFNAME "ovnvrf42"
+#define TABLE_ID 42
+
+static void
+test_re_nl_sync_routes(struct ovs_cmdl_context *ctx OVS_UNUSED)
+{
+    struct hmap host_routes = HMAP_INITIALIZER(&host_routes);
+    struct in6_addr dst4, dst6;
+    ovs_be32 ip;
+    int err;
+
+    ipv6_parse("2001:db8:42::100", &dst6);
+    host_route_insert(&host_routes, TABLE_ID, &dst6);
+
+    ip_parse("172.16.42.100", &ip);
+    in6_addr_set_mapped_ipv4(&dst4, ip);
+    host_route_insert(&host_routes, TABLE_ID, &dst4);
+
+    err = re_nl_create_vrf(VRF_IFNAME, TABLE_ID);
+    ovs_assert(err == 0);
+    err = re_nl_create_vrf(VRF_IFNAME, TABLE_ID);
+    ovs_assert(err == EEXIST);
+    re_nl_sync_routes(TABLE_ID, VRF_IFNAME, &host_routes);
+    host_routes_destroy(&host_routes);
+
+    err = re_nl_add_route(TABLE_ID, &dst6, VRF_IFNAME);
+    ovs_assert(err == EEXIST);
+    err = re_nl_add_route(TABLE_ID, &dst4, VRF_IFNAME);
+    ovs_assert(err == EEXIST);
+
+    hmap_init(&host_routes);
+    re_nl_sync_routes(TABLE_ID, VRF_IFNAME, &host_routes);
+    host_routes_destroy(&host_routes);
+
+    err = re_nl_add_route(TABLE_ID, &dst6, VRF_IFNAME);
+    ovs_assert(err == 0);
+    err = re_nl_add_route(TABLE_ID, &dst4, VRF_IFNAME);
+    ovs_assert(err == 0);
+
+    err = re_nl_delete_vrf(VRF_IFNAME);
+    ovs_assert(err == 0);
+}
+
+static void
+test_re_nl_create_vrf(struct ovs_cmdl_context *ctx OVS_UNUSED)
+{
+    int err;
+
+    err = re_nl_create_vrf(VRF_IFNAME, TABLE_ID);
+    ovs_assert(err == 0);
+    err = re_nl_create_vrf(VRF_IFNAME, TABLE_ID);
+    ovs_assert(err == EEXIST);
+    err = re_nl_delete_vrf(VRF_IFNAME);
+    ovs_assert(err == 0);
+}
+
+static void
+test_re_nl_delete_vrf(struct ovs_cmdl_context *ctx OVS_UNUSED)
+{
+    int err;
+
+    err = re_nl_create_vrf(VRF_IFNAME, TABLE_ID);
+    ovs_assert(err == 0);
+    err = re_nl_delete_vrf(VRF_IFNAME);
+    ovs_assert(err == 0);
+    err = re_nl_delete_vrf(VRF_IFNAME);
+    ovs_assert(err == ENODEV);
+}
+
+static void
+test_re_nl_add_route(struct ovs_cmdl_context *ctx OVS_UNUSED)
+{
+    int err;
+    struct in6_addr dst4, dst6;
+    ovs_be32 ip;
+
+    ipv6_parse("2001:db8:42::100", &dst6);
+    ip_parse("172.16.42.100", &ip);
+    in6_addr_set_mapped_ipv4(&dst4, ip);
+
+    err = re_nl_create_vrf(VRF_IFNAME, TABLE_ID);
+    ovs_assert(err == 0);
+
+    err = re_nl_add_route(TABLE_ID, &dst6, VRF_IFNAME);
+    ovs_assert(err == 0);
+    err = re_nl_add_route(TABLE_ID, &dst4, VRF_IFNAME);
+    ovs_assert(err == 0);
+    err = re_nl_add_route(TABLE_ID, &dst6, VRF_IFNAME);
+    ovs_assert(err == EEXIST);
+    err = re_nl_add_route(TABLE_ID, &dst4, VRF_IFNAME);
+    ovs_assert(err == EEXIST);
+
+    err = re_nl_delete_vrf(VRF_IFNAME);
+    ovs_assert(err == 0);
+}
+
+static void
+test_re_nl_delete_route(struct ovs_cmdl_context *ctx OVS_UNUSED)
+{
+    int err;
+    struct in6_addr dst4, dst6;
+    ovs_be32 ip;
+
+    ipv6_parse("2001:db8:42::100", &dst6);
+    ip_parse("172.16.42.100", &ip);
+    in6_addr_set_mapped_ipv4(&dst4, ip);
+
+    err = re_nl_create_vrf(VRF_IFNAME, TABLE_ID);
+    ovs_assert(err == 0);
+
+    err = re_nl_add_route(TABLE_ID, &dst6, VRF_IFNAME);
+    ovs_assert(err == 0);
+    err = re_nl_add_route(TABLE_ID, &dst4, VRF_IFNAME);
+    ovs_assert(err == 0);
+
+    err = re_nl_delete_route(TABLE_ID, &dst6);
+    ovs_assert(err == 0);
+    err = re_nl_delete_route(TABLE_ID, &dst4);
+    ovs_assert(err == 0);
+    err = re_nl_delete_route(TABLE_ID, &dst6);
+    ovs_assert(err == ESRCH);
+    err = re_nl_delete_route(TABLE_ID, &dst4);
+    ovs_assert(err == ESRCH);
+
+    err = re_nl_delete_vrf(VRF_IFNAME);
+    ovs_assert(err == 0);
+}
+
+static void
+test_route_exchange_netlink_main(int argc, char *argv[])
+{
+    set_program_name(argv[0]);
+    static const struct ovs_cmdl_command commands[] = {
+        {"sync-routes", NULL, 0, 0, test_re_nl_sync_routes, OVS_RO},
+        {"create-vrf", NULL, 0, 0, test_re_nl_create_vrf, OVS_RO},
+        {"delete-vrf", NULL, 0, 0, test_re_nl_delete_vrf, OVS_RO},
+        {"add-route", NULL, 0, 0, test_re_nl_add_route, OVS_RO},
+        {"delete-route", NULL, 0, 0, test_re_nl_delete_route, OVS_RO},
+        {NULL, NULL, 0, 0, NULL, OVS_RO},
+    };
+    struct ovs_cmdl_context ctx;
+    ctx.argc = argc - 1;
+    ctx.argv = argv + 1;
+    ovs_cmdl_run_command(&ctx, commands);
+}
+
+OVSTEST_REGISTER("test-route-exchange-netlink",
+                 test_route_exchange_netlink_main);
diff --git a/m4/ovn.m4 b/m4/ovn.m4
index ebe4c9612..e8f30e0ac 100644
--- a/m4/ovn.m4
+++ b/m4/ovn.m4
@@ -576,3 +576,28 @@  AC_DEFUN([OVN_CHECK_UNBOUND],
    fi
    AM_CONDITIONAL([HAVE_UNBOUND], [test "$HAVE_UNBOUND" = yes])
    AC_SUBST([HAVE_UNBOUND])])
+
+dnl Checks for Netlink support.
+AC_DEFUN([OVS_CHECK_NETLINK],
+  [AC_CHECK_HEADER([linux/netlink.h],
+                   [HAVE_NETLINK=yes],
+                   [HAVE_NETLINK=no],
+                   [#include <sys/socket.h>
+   ])
+   AM_CONDITIONAL([HAVE_NETLINK], [test "$HAVE_NETLINK" = yes])
+   if test "$HAVE_NETLINK" = yes; then
+      AC_DEFINE([HAVE_NETLINK], [1],
+                [Define to 1 if the Netlink protocol is available.])
+   fi])
+
+dnl OVS_CHECK_LINUX_NETLINK
+dnl
+dnl Configure Linux netlink compat.
+AC_DEFUN([OVS_CHECK_LINUX_NETLINK], [
+  AC_COMPILE_IFELSE([
+    AC_LANG_PROGRAM([#include <linux/netlink.h>], [
+        struct nla_bitfield32 x = { 0 };
+    ])],
+    [AC_DEFINE([HAVE_NLA_BITFIELD32], [1],
+    [Define to 1 if struct nla_bitfield32 is available.])])
+])
diff --git a/tests/automake.mk b/tests/automake.mk
index 3899c9e80..0087bff69 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -55,7 +55,8 @@  SYSTEM_DPDK_TESTSUITE_AT = \
 SYSTEM_KMOD_TESTSUITE_AT = \
 	tests/system-kmod-macros.at \
 	tests/system-kmod-testsuite.at \
-	tests/system-ovn-kmod.at
+	tests/system-ovn-kmod.at \
+	tests/ovn-system-route-exchange.at
 
 SYSTEM_USERSPACE_TESTSUITE_AT = \
 	tests/system-userspace-testsuite.at \
@@ -290,6 +291,11 @@  tests_ovstest_SOURCES = \
 	lib/test-ovn-features.c \
 	northd/test-ipam.c
 
+if HAVE_NETLINK
+tests_ovstest_SOURCES += \
+	controller/test-route-exchange-netlink.c
+endif
+
 tests_ovstest_LDADD = $(OVS_LIBDIR)/daemon.lo \
     $(OVS_LIBDIR)/libopenvswitch.la lib/libovn.la \
 	controller/binding.$(OBJEXT) \
@@ -307,6 +313,11 @@  tests_ovstest_LDADD = $(OVS_LIBDIR)/daemon.lo \
 	controller/vif-plug.$(OBJEXT) \
 	northd/ipam.$(OBJEXT)
 
+if HAVE_NETLINK
+tests_ovstest_LDADD += \
+	controller/route-exchange-netlink.$(OBJEXT)
+endif
+
 # Python tests.
 CHECK_PYFILES = \
 	tests/test-l7.py \
diff --git a/tests/ovn-system-route-exchange.at b/tests/ovn-system-route-exchange.at
new file mode 100644
index 000000000..36d7e3d2a
--- /dev/null
+++ b/tests/ovn-system-route-exchange.at
@@ -0,0 +1,16 @@ 
+#
+# System-level unit tests for the controller/route-exchange-netlink.c module.
+#
+AT_BANNER([OVN system level unit tests])
+
+AT_SETUP([system level unit test -- route-exchange-netlink])
+AT_KEYWORDS([route-exchange])
+
+CHECK_VRF()
+
+AT_CHECK([ovstest test-route-exchange-netlink sync-routes], [0], [])
+AT_CHECK([ovstest test-route-exchange-netlink create-vrf], [0], [])
+AT_CHECK([ovstest test-route-exchange-netlink delete-vrf], [0], [])
+AT_CHECK([ovstest test-route-exchange-netlink add-route], [0], [])
+AT_CHECK([ovstest test-route-exchange-netlink delete-route], [0], [])
+AT_CLEANUP
diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
index 691c271a3..159396a40 100644
--- a/tests/system-common-macros.at
+++ b/tests/system-common-macros.at
@@ -519,3 +519,15 @@  OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
 /failed to query port patch-.*/d
 /.*terminating with signal 15.*/d"])
 ]))
+
+# CHECK_VRF()
+#
+# Skip the test unless the vrf kernel module loads; unload it on exit.
+#
+m4_define([CHECK_VRF],
+[
+    rc=0
+    modprobe vrf || rc=$?
+    AT_SKIP_IF([test $rc -ne 0])
+    on_exit 'modprobe -r vrf'
+])
diff --git a/tests/system-kmod-testsuite.at b/tests/system-kmod-testsuite.at
index 5ba35babb..16b633ece 100644
--- a/tests/system-kmod-testsuite.at
+++ b/tests/system-kmod-testsuite.at
@@ -25,3 +25,4 @@  m4_include([tests/system-kmod-macros.at])
 
 m4_include([tests/system-ovn.at])
 m4_include([tests/system-ovn-kmod.at])
+m4_include([tests/ovn-system-route-exchange.at])