From patchwork Mon Aug 29 11:30:34 2022
X-Patchwork-Submitter: Xavier Simonart
X-Patchwork-Id: 1671398
X-Patchwork-Delegate: dceara@redhat.com
From: Xavier Simonart
To: xsimonar@redhat.com, dev@openvswitch.org
Cc: dceara@redhat.com
Date: Mon, 29 Aug 2022 07:30:34 -0400
Message-Id: <20220829113034.2850198-1-xsimonar@redhat.com>
In-Reply-To: <20220825101632.773113-1-xsimonar@redhat.com>
References: <20220825101632.773113-1-xsimonar@redhat.com>
Subject: [ovs-dev] [PATCH ovn v3] northd: Fix multicast table full

The active_v4_flows count was initialized only when the northd node was
computed. However, neither sb_multicast_group nor en_sb_igmp_group causes
northd updates. Hence this count could keep increasing while processing
igmp groups.

This issue was sometimes 'hidden' by northd recomputes, which happen when
lflow cannot be incrementally processed (sb busy).

active_v4_flows is now reinitialized right before building flows
(i.e. as part of the lflow node, which is computed on igmp group changes).

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2094710
Signed-off-by: Xavier Simonart
---
v2: - rebased on main
    - use platform independent print format
v3: - updated based on Dumitru's feedback
    - removed other unused ipv6 configuration and unused tcpdumps from testcase
Acked-by: Dumitru Ceara
---
 northd/northd.c     |  22 +++++++--
 tests/system-ovn.at | 117 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 136 insertions(+), 3 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 7e2681865..4f33ae3c3 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -1051,7 +1051,16 @@ init_mcast_info_for_switch_datapath(struct ovn_datapath *od)
     mcast_sw_info->query_max_response =
         smap_get_ullong(&od->nbs->other_config, "mcast_query_max_response",
                         OVN_MCAST_DEFAULT_QUERY_MAX_RESPONSE_S);
+}
+
+static void
+init_mcast_flow_count(struct ovn_datapath *od)
+{
+    if (od->nbr) {
+        return;
+    }
 
+    struct mcast_switch_info *mcast_sw_info = &od->mcast_info.sw;
     mcast_sw_info->active_v4_flows = ATOMIC_VAR_INIT(0);
     mcast_sw_info->active_v6_flows = ATOMIC_VAR_INIT(0);
 }
@@ -8368,6 +8377,10 @@ build_lswitch_ip_mcast_igmp_mld(struct ovn_igmp_group *igmp_group,
             if (atomic_compare_exchange_strong(
                         &mcast_sw_info->active_v4_flows, &table_size,
                         mcast_sw_info->table_size)) {
+                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+
+                VLOG_INFO_RL(&rl, "Too many active mcast flows: %"PRIu64,
+                             mcast_sw_info->active_v4_flows);
                 return;
             }
             atomic_add(&mcast_sw_info->active_v4_flows, 1, &dummy);
@@ -15069,6 +15082,11 @@ build_mcast_groups(struct lflow_input *input_data,
 
     hmap_init(mcast_groups);
     hmap_init(igmp_groups);
+    struct ovn_datapath *od;
+
+    HMAP_FOR_EACH (od, key_node, datapaths) {
+        init_mcast_flow_count(od);
+    }
 
     HMAP_FOR_EACH (op, key_node, ports) {
         if (op->nbrp && lrport_is_enabled(op->nbrp)) {
@@ -15126,8 +15144,7 @@ build_mcast_groups(struct lflow_input *input_data,
         }
 
         /* If the datapath value is stale, purge the group. */
-        struct ovn_datapath *od =
-            ovn_datapath_from_sbrec(datapaths, sb_igmp->datapath);
+        od = ovn_datapath_from_sbrec(datapaths, sb_igmp->datapath);
 
         if (!od || ovn_datapath_is_stale(od)) {
             sbrec_igmp_group_delete(sb_igmp);
@@ -15172,7 +15189,6 @@ build_mcast_groups(struct lflow_input *input_data,
      * IGMP groups are based on the groups learnt by their multicast enabled
      * peers.
      */
-    struct ovn_datapath *od;
 
     HMAP_FOR_EACH (od, key_node, datapaths) {
         if (ovs_list_is_empty(&od->mcast_info.groups)) {
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 992813614..7f919f0ed 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -8266,3 +8266,120 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
 
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([mcast flow count])
+AT_KEYWORDS([ovnigmp IP-multicast])
+AT_SKIP_IF([test $HAVE_TCPDUMP = no])
+ovn_start
+
+OVS_TRAFFIC_VSWITCHD_START()
+ADD_BR([br-int])
+
+# Set external-ids in br-int needed for ovn-controller
+ovs-vsctl \
+        -- set Open_vSwitch . external-ids:system-id=hv1 \
+        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
+        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
+        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
+        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
+
+# Start ovn-controller
+start_daemon ovn-controller
+
+check ovn-nbctl ls-add ls
+check ovn-nbctl lsp-add ls vm1
+check ovn-nbctl lsp-set-addresses vm1 00:00:00:00:00:01
+check ovn-nbctl lsp-add ls vm2
+check ovn-nbctl lsp-set-addresses vm2 00:00:00:00:00:02
+check ovn-nbctl lsp-add ls vm3
+check ovn-nbctl lsp-set-addresses vm3 00:00:00:00:00:03
+
+check ovn-nbctl set logical_switch ls other_config:mcast_querier=false other_config:mcast_snoop=true other_config:mcast_query_interval=30 other_config:mcast_eth_src=00:00:00:00:00:05 other_config:mcast_ip4_src=42.42.42.5 other_config:mcast_ip6_src=fe80::1 other_config:mcast_idle_timeout=3000
+ovn-sbctl list ip_multicast
+
+wait_igmp_flows_installed()
+{
+    OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int table=31 | \
+    grep 'priority=90' | grep "nw_dst=$1"])
+}
+
+ADD_NAMESPACES(vm1)
+ADD_INT([vm1], [vm1], [br-int], [42.42.42.1/24])
+NS_CHECK_EXEC([vm1], [ip link set vm1 address 00:00:00:00:00:01], [0])
+NS_CHECK_EXEC([vm1], [ip route add default via 42.42.42.5], [0])
+check ovs-vsctl set Interface vm1 external_ids:iface-id=vm1
+
+ADD_NAMESPACES(vm2)
+ADD_INT([vm2], [vm2], [br-int], [42.42.42.2/24])
+NS_CHECK_EXEC([vm2], [ip link set vm2 address 00:00:00:00:00:02], [0])
+NS_CHECK_EXEC([vm2], [ip link set lo up], [0])
+check ovs-vsctl set Interface vm2 external_ids:iface-id=vm2
+
+ADD_NAMESPACES(vm3)
+NETNS_DAEMONIZE([vm3], [tcpdump -n -i any -nnleX > vm3.pcap 2>/dev/null], [tcpdump3.pid])
+
+ADD_INT([vm3], [vm3], [br-int], [42.42.42.3/24])
+NS_CHECK_EXEC([vm3], [ip link set vm3 address 00:00:00:00:00:03], [0])
+NS_CHECK_EXEC([vm3], [ip link set lo up], [0])
+NS_CHECK_EXEC([vm3], [ip route add default via 42.42.42.5], [0])
+check ovs-vsctl set Interface vm3 external_ids:iface-id=vm3
+
+NS_CHECK_EXEC([vm2], [sysctl -w net.ipv4.igmp_max_memberships=100], [ignore], [ignore])
+NS_CHECK_EXEC([vm3], [sysctl -w net.ipv4.igmp_max_memberships=100], [ignore], [ignore])
+wait_for_ports_up
+
+NS_CHECK_EXEC([vm3], [ip addr add 228.0.0.1 dev vm3 autojoin], [0])
+wait_igmp_flows_installed 228.0.0.1
+
+NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 228.0.0.1], [ignore], [ignore])
+
+OVS_WAIT_UNTIL([
+    requests=`grep "ICMP echo request" -c vm3.pcap`
+    test "${requests}" -ge "3"
+])
+
+NETNS_DAEMONIZE([vm2], [tcpdump -n -i any -nnleX > vm2.pcap 2>/dev/null], [tcpdump2.pid])
+
+for i in `seq 1 40`;do
+    NS_CHECK_EXEC([vm2], [ip addr add 228.1.$i.1 dev vm2 autojoin &], [0])
+    NS_CHECK_EXEC([vm3], [ip addr add 229.1.$i.1 dev vm3 autojoin &], [0])
+    # Do not go too fast. If going fast, there is a higher chance of sb being busy, causing full recompute (engine has not run)
+    # In this test, we do not want too many recomputes as they might hide I+I related errors
+    sleep 0.2
+done
+
+for i in `seq 1 40`;do
+    wait_igmp_flows_installed 228.1.$i.1
+    wait_igmp_flows_installed 229.1.$i.1
+done
+ovn-sbctl list multicast_group
+
+NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 228.1.1.1], [ignore], [ignore])
+
+OVS_WAIT_UNTIL([
+    requests=`grep "ICMP echo request" -c vm2.pcap`
+    test "${requests}" -ge "3"
+])
+
+# The test could succeed thanks to a lucky northd recompute...after hitting too many flows
+# Double check we never hit error condition
+AT_CHECK([grep -qE 'Too many active mcast flows' northd/ovn-northd.log], [1])
+
+OVS_APP_EXIT_AND_WAIT([ovn-controller])
+
+as ovn-sb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as ovn-nb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as northd
+OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
+
+as
+OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
+/connection dropped.*/d
+/removing policing failed: No such device/d"])
+AT_CLEANUP
+])