From patchwork Thu Mar 15 09:20:11 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Guoshuai Li
X-Patchwork-Id: 886171
From: Guoshuai Li
To: ovs-dev@openvswitch.org
Date: Thu, 15 Mar 2018 17:20:11 +0800
Message-Id: <20180315092012.5956-1-ligs@dtdream.com>
X-Mailer: git-send-email 2.13.2.windows.1
Subject: [ovs-dev] [PATCH 1/2] ovn-controller: support MAC_Binding aging

Add MAC_Binding aging.

The default aging time is 20 minutes. ARP requests are sent once 1/2,
2/3 and 3/4 of the aging time have elapsed, i.e. at about 10 (1*20/2),
13 (2*20/3) and 15 (3*20/4) minutes. If no ARP reply is received within
20 minutes, the MAC_Binding row is deleted.

ovn-controller keeps a cache of the MAC_Binding rows whose logical port
has its redirect chassis on the local chassis. Each cache entry records
a timestamp and the number of ARP requests sent. The cache is traversed
on a timer, either to send ARP requests or to age out entries.

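For illustration only, the probe schedule described above can be
reproduced with a small standalone sketch. It uses the same integer
expression as the patch, (n + 1) * T / (n + 2) for n = 0, 1, 2, where T
is the aging time in milliseconds. The names probe_offset_ms() and
print_probe_schedule() are hypothetical and do not appear in the patch.

```c
#include <stdio.h>

/* Hypothetical helper, not part of the patch: returns the offset in
 * milliseconds, counted from the entry's timestamp, at which the ARP
 * request number 'sent' (0-based) becomes due. */
static long long int
probe_offset_ms(int sent, long long int aging_ms)
{
    return (sent + 1) * aging_ms / (sent + 2);
}

/* Hypothetical helper: prints the three probe offsets followed by the
 * deletion deadline for the given aging time. */
static void
print_probe_schedule(long long int aging_ms)
{
    for (int sent = 0; sent < 3; sent++) {
        printf("ARP request %d due after %lld ms\n", sent + 1,
               probe_offset_ms(sent, aging_ms));
    }
    printf("entry deleted after %lld ms\n", aging_ms);
}
```

With the default of 20 * 60 * 1000 ms, the three offsets are 600000,
800000 and 900000 ms, so the second request is due at 13 minutes 20
seconds rather than at exactly 13 minutes.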
Signed-off-by: Guoshuai Li
---
 ovn/controller/pinctrl.c | 363 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 363 insertions(+)

diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index b4dbd8c29..b258a7f29 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -70,6 +70,18 @@ static void run_put_mac_bindings(struct controller_ctx *);
 static void wait_put_mac_bindings(struct controller_ctx *);
 static void flush_put_mac_bindings(void);
 
+static void init_aging_mac_bindings(void);
+static void destroy_aging_mac_bindings(void);
+static void aging_mac_bindings_wait(void);
+static void aging_mac_bindings_run(const struct controller_ctx *,
+                                   const struct ovsrec_bridge *,
+                                   const struct sbrec_chassis *,
+                                   const struct hmap *);
+static void init_aging_mac_binding(uint32_t,
+                                   uint32_t,
+                                   const char *,
+                                   uint32_t);
+
 static void init_send_garps(void);
 static void destroy_send_garps(void);
 static void send_garp_wait(void);
@@ -105,6 +117,7 @@ pinctrl_init(void)
     swconn = rconn_create(5, 0, DSCP_DEFAULT, 1 << OFP13_VERSION);
     conn_seq_no = 0;
     init_put_mac_bindings();
+    init_aging_mac_bindings();
     init_send_garps();
     init_ipv6_ras();
 }
@@ -1166,6 +1179,7 @@ pinctrl_run(struct controller_ctx *ctx,
     send_garp_run(ctx, br_int, chassis, chassis_index, local_datapaths,
                   active_tunnels);
     send_ipv6_ras(ctx, local_datapaths);
+    aging_mac_bindings_run(ctx, br_int, chassis, local_datapaths);
 }
 
 /* Table of ipv6_ra_state structures, keyed on logical port name */
@@ -1467,6 +1481,7 @@ pinctrl_wait(struct controller_ctx *ctx)
     rconn_recv_wait(swconn);
     send_garp_wait();
     ipv6_ra_wait();
+    aging_mac_bindings_wait();
 }
 
 void
@@ -1474,6 +1489,7 @@ pinctrl_destroy(void)
 {
     rconn_destroy(swconn);
     destroy_put_mac_bindings();
+    destroy_aging_mac_bindings();
     destroy_send_garps();
     destroy_ipv6_ras();
 }
@@ -1567,6 +1583,9 @@ pinctrl_handle_put_mac_binding(const struct flow *md,
     }
     pmb->timestamp = time_msec();
     pmb->mac = headers->dl_src;
+
+    /* init aging mac_binding timestamp and arp_send_count */
+    init_aging_mac_binding(dp_key, port_key, ip_s, hash);
 }
 
 static void
@@ -1647,6 +1666,350 @@ flush_put_mac_bindings(void)
     }
 }
 
+/* Buffered "aging_mac_binding" operation. */
+struct aging_mac_binding {
+    struct hmap_node hmap_node; /* In 'aging_mac_binding'. */
+
+    long long int timestamp;    /* In milliseconds. */
+
+    /* Key. */
+    uint32_t dp_key;
+    uint32_t port_key;
+    char ip_s[INET6_ADDRSTRLEN + 1];
+
+    int arp_send_count;
+};
+
+/* Contains "struct aging_mac_binding"s.
+   The cache for mac_bindings */
+static struct hmap aging_mac_bindings;
+
+/* Next aging mac binding time announcement in ms. */
+static long long int next_wait_time;
+
+static long long int base_reachable_time = 20 * 60 * 1000;
+
+static void
+init_aging_mac_bindings(void)
+{
+    hmap_init(&aging_mac_bindings);
+    next_wait_time = LLONG_MAX;
+}
+
+static void
+flush_aging_mac_bindings(void)
+{
+    struct aging_mac_binding *amb;
+    HMAP_FOR_EACH_POP (amb, hmap_node, &aging_mac_bindings) {
+        free(amb);
+    }
+}
+
+static struct aging_mac_binding *
+find_aging_mac_binding(uint32_t dp_key,
+                       uint32_t port_key,
+                       const char *ip_s,
+                       uint32_t hash)
+{
+    struct aging_mac_binding *amb;
+    HMAP_FOR_EACH_WITH_HASH (amb, hmap_node, hash, &aging_mac_bindings) {
+        if (amb->dp_key == dp_key && amb->port_key == port_key
+            && !strcmp(amb->ip_s, ip_s)) {
+            return amb;
+        }
+    }
+    return NULL;
+}
+
+static void
+insert_aging_mac_bindings(int64_t dp_key, int64_t port_key, const char *ip_s)
+{
+    uint32_t hash = hash_string(ip_s, hash_2words(dp_key, port_key));
+    struct aging_mac_binding *amb
+        = find_aging_mac_binding(dp_key, port_key, ip_s, hash);
+    if (!amb) {
+        amb = xmalloc(sizeof *amb);
+        hmap_insert(&aging_mac_bindings, &amb->hmap_node, hash);
+        amb->dp_key = dp_key;
+        amb->port_key = port_key;
+        amb->timestamp = time_msec();
+        amb->arp_send_count = 0;
+        ovs_strlcpy(amb->ip_s, ip_s, INET6_ADDRSTRLEN);
+    }
+}
+
+static void
+destroy_aging_mac_bindings(void)
+{
+    flush_aging_mac_bindings();
+    hmap_destroy(&aging_mac_bindings);
+}
+
+static void
+aging_mac_bindings_wait(void)
+{
+    poll_timer_wait_until(next_wait_time);
+}
+
+static void
+remove_mac_bindings(const struct controller_ctx *ctx,
+                    const char *logical_port, const char * ip_s)
+{
+    const struct sbrec_mac_binding *b, *n;
+    SBREC_MAC_BINDING_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) {
+        if (!strcmp(b->logical_port, logical_port)
+            && !strcmp(b->ip, ip_s)) {
+            sbrec_mac_binding_delete(b);
+            VLOG_INFO("logical_port:%s ip:%s MAC_Binding aging.",
+                      b->logical_port, b->ip);
+            return;
+        }
+    }
+}
+
+static void
+init_aging_mac_binding(uint32_t dp_key,
+                       uint32_t port_key,
+                       const char *ip_s,
+                       uint32_t hash)
+{
+    struct aging_mac_binding *amb
+        = find_aging_mac_binding(dp_key, port_key, ip_s, hash);
+    if (amb) {
+        amb->timestamp = time_msec();
+        amb->arp_send_count = 0;
+    }
+}
+
+static void
+get_localnet_vifs(const struct ovsrec_bridge *br_int,
+                  const struct sbrec_chassis *chassis,
+                  struct simap *localnet_ofports)
+{
+    for (int i = 0; i < br_int->n_ports; i++) {
+        const struct ovsrec_port *port_rec = br_int->ports[i];
+        if (!strcmp(port_rec->name, br_int->name)) {
+            continue;
+        }
+        const char *chassis_id = smap_get(&port_rec->external_ids,
+                                          "ovn-chassis-id");
+        if (chassis_id && !strcmp(chassis_id, chassis->name)) {
+            continue;
+        }
+        const char *localnet = smap_get(&port_rec->external_ids,
+                                        "ovn-localnet-port");
+        for (int j = 0; j < port_rec->n_interfaces; j++) {
+            const struct ovsrec_interface *iface = port_rec->interfaces[j];
+            if (!iface->n_ofport) {
+                continue;
+            }
+            /* Get localnet port with its ofport. */
+            if (localnet) {
+                int64_t ofport = iface->ofport[0];
+                if (ofport < 1 || ofport > ofp_to_u16(OFPP_MAX)) {
+                    continue;
+                }
+                simap_put(localnet_ofports, localnet, ofport);
+                continue;
+            }
+        }
+    }
+}
+
+static const struct sbrec_port_binding*
+get_localnet_port(const struct hmap *local_datapaths, int64_t tunnel_key)
+{
+    struct local_datapath *ld = get_local_datapath(local_datapaths,
+                                                   tunnel_key);
+    return ld ? ld->localnet_port : NULL;
+}
+
+static bool
+get_localnet_port_ofport_tag(const struct controller_ctx *ctx,
+                             const struct sbrec_port_binding *pb,
+                             const struct simap *localnet_ofports,
+                             const struct hmap *local_datapaths,
+                             ofp_port_t *ofport, int *tag)
+{
+    const char *peer_port = smap_get_def(&pb->options, "peer", "");
+    const struct sbrec_port_binding *peer_pb
+        = lport_lookup_by_name(ctx->ovnsb_idl, peer_port);
+    if (peer_pb) {
+        const struct sbrec_port_binding *localnet_port =
+            get_localnet_port(local_datapaths,
+                              peer_pb->datapath->tunnel_key);
+        if (localnet_port) {
+            *ofport = u16_to_ofp(simap_get(localnet_ofports,
+                                           localnet_port->logical_port));
+            *tag = localnet_port->n_tag ? *localnet_port->tag : -1;
+            return true;
+        }
+    }
+    return false;
+}
+
+static bool
+get_lport_addresses(const struct sbrec_port_binding *pb,
+                    struct lport_addresses *laddrs)
+{
+    for (int i = 0; i < pb->n_mac; i++) {
+        if (extract_lsp_addresses(pb->mac[i], laddrs)
+            && laddrs->n_ipv4_addrs) {
+            return true;
+        }
+    }
+    return false;
+}
+
+static const struct sbrec_chassis *
+get_lport_redirect_chassis(const struct controller_ctx *ctx,
+                           const char *logical_port)
+{
+    char *redirect_name = xasprintf("cr-%s", logical_port);
+    const struct sbrec_port_binding *pb
+        = lport_lookup_by_name(ctx->ovnsb_idl, redirect_name);
+    if (pb) {
+        free(redirect_name);
+        return pb->chassis;
+    }
+
+    free(redirect_name);
+    return NULL;
+}
+
+static void
+send_arp_request(const struct eth_addr arp_sha, ovs_be32 arp_spa,
+                 ovs_be32 arp_tpa, int port_key, int tag)
+{
+    /* Compose a arp request packet. */
+    uint64_t packet_stub[128 / 8];
+    struct dp_packet packet;
+    dp_packet_use_stub(&packet, packet_stub, sizeof packet_stub);
+    compose_arp(&packet, ARP_OP_REQUEST, arp_sha, eth_addr_zero, true,
+                arp_spa, arp_tpa);
+
+    /* Compose arp request packet's vlan if exist. */
+    if (tag >= 0) {
+        eth_push_vlan(&packet, htons(ETH_TYPE_VLAN), htons(tag));
+    }
+
+    /* Compose actions. The arp request is output on localnet ofport. */
+    uint64_t ofpacts_stub[4096 / 8];
+    struct ofpbuf ofpacts = OFPBUF_STUB_INITIALIZER(ofpacts_stub);
+    enum ofp_version version = rconn_get_version(swconn);
+    ofpact_put_OUTPUT(&ofpacts)->port = port_key;
+
+    struct ofputil_packet_out po = {
+        .packet = dp_packet_data(&packet),
+        .packet_len = dp_packet_size(&packet),
+        .buffer_id = UINT32_MAX,
+        .ofpacts = ofpacts.data,
+        .ofpacts_len = ofpacts.size,
+    };
+    match_set_in_port(&po.flow_metadata, OFPP_CONTROLLER);
+    enum ofputil_protocol proto = ofputil_protocol_from_ofp_version(version);
+    queue_msg(ofputil_encode_packet_out(&po, proto));
+    dp_packet_uninit(&packet);
+    ofpbuf_uninit(&ofpacts);
+}
+
+/* refresh mac bindings cache from ovn sb */
+static void
+refresh_aging_mac_bindings(const struct controller_ctx *ctx,
+                           const struct sbrec_chassis *chassis)
+{
+    const struct sbrec_mac_binding *mb;
+    SBREC_MAC_BINDING_FOR_EACH (mb, ctx->ovnsb_idl) {
+        /* Check logical_port redirect chassis */
+        if (chassis == get_lport_redirect_chassis(ctx, mb->logical_port)) {
+            const struct sbrec_port_binding *pb
+                = lport_lookup_by_name(ctx->ovnsb_idl, mb->logical_port);
+            if (pb) {
+                insert_aging_mac_bindings(pb->datapath->tunnel_key,
+                                          pb->tunnel_key, mb->ip);
+            }
+        }
+    }
+}
+
+static void
+aging_mac_bindings_run(const struct controller_ctx *ctx,
+                       const struct ovsrec_bridge *br_int,
+                       const struct sbrec_chassis *chassis,
+                       const struct hmap *local_datapaths)
+{
+    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+
+    refresh_aging_mac_bindings(ctx, chassis);
+
+    struct simap localnet_ofports = SIMAP_INITIALIZER(&localnet_ofports);
+    get_localnet_vifs(br_int, chassis, &localnet_ofports);
+
+    next_wait_time = LLONG_MAX;
+    long long int now = time_msec();
+
+    struct aging_mac_binding *amb, *next_amb;
+    HMAP_FOR_EACH_SAFE (amb, next_amb, hmap_node, &aging_mac_bindings) {
+        const struct sbrec_port_binding *pb
+            = lport_lookup_by_key(ctx->ovnsb_idl, amb->dp_key, amb->port_key);
+        /* Check logical_port redirect chassis */
+        if (chassis != get_lport_redirect_chassis(ctx, pb->logical_port)) {
+            /* chassisredirect port deleted or move to other chassis */
+            hmap_remove(&aging_mac_bindings, &amb->hmap_node);
+            free(amb);
+            continue;
+        }
+
+        /* mac_binding aging time reachable, aging mac_binding. */
+        if (now >= amb->timestamp + base_reachable_time) {
+            remove_mac_bindings(ctx, pb->logical_port, amb->ip_s);
+            hmap_remove(&aging_mac_bindings, &amb->hmap_node);
+            free(amb);
+            continue;
+        }
+
+        /* send arp request in 1/2 2/3 3/4 base_reachable_time, 3 times. */
+        long long int time = (amb->arp_send_count + 1) * base_reachable_time /
+                             (amb->arp_send_count + 2);
+        if (2 >= amb->arp_send_count && now >= amb->timestamp + time) {
+            amb->arp_send_count++;
+
+            struct lport_addresses laddrs;
+            if (!get_lport_addresses(pb, &laddrs)) {
+                VLOG_WARN_RL(&rl, "lport(%s) no ip addresses.",
+                             pb->logical_port);
+                continue;
+            }
+
+            ofp_port_t ofport;
+            int tag;
+            if (!get_localnet_port_ofport_tag(ctx, pb, &localnet_ofports,
+                                              local_datapaths, &ofport,
+                                              &tag)) {
+                VLOG_WARN_RL(&rl, "lport(%s) can not find localnet port.",
+                             pb->logical_port);
+                continue;
+            }
+            send_arp_request(laddrs.ea, laddrs.ipv4_addrs[0].addr,
+                             inet_addr(amb->ip_s), ofport, tag);
+        }
+
+        long long int amb_next_wait_time;
+        if (2 < amb->arp_send_count) {
+            amb_next_wait_time = amb->timestamp + base_reachable_time;
+        } else {
+            time = (amb->arp_send_count + 1) * base_reachable_time /
+                   (amb->arp_send_count + 2);
+            amb_next_wait_time = amb->timestamp + time;
+        }
+        if (next_wait_time > amb_next_wait_time) {
+            next_wait_time = amb_next_wait_time;
+        }
+    }
+
+    simap_destroy(&localnet_ofports);
+}
+
 /*
  * Send gratuitous ARP for vif on localnet.
  *