From patchwork Thu Aug 25 10:16:32 2022
X-Patchwork-Submitter: Xavier Simonart
X-Patchwork-Id: 1670174
From: Xavier Simonart
To: xsimonar@redhat.com, dev@openvswitch.org
Date: Thu, 25 Aug 2022 06:16:32 -0400
Message-Id: <20220825101632.773113-1-xsimonar@redhat.com>
Subject: [ovs-dev] [PATCH ovn v2] northd: Fix multicast table full

The active_v4_flows count was initialized when the northd node was
computed. However, neither sb_multicast_group nor en_sb_igmp_group
causes northd updates, so this count could keep increasing while
processing IGMP groups. The issue was sometimes hidden by northd
recomputes triggered when lflows could not be incrementally processed
(sb busy).

active_v4_flows is now reinitialized right before building flows, i.e.
as part of the lflow node, which is computed on IGMP group changes.
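For context, the pattern the fix relies on is a per-build reset of an atomic
counter plus a compare-and-swap guard when accounting a new flow. Below is a
minimal standalone sketch of that pattern; the names (mcast_counts,
init_flow_count, take_flow_slot) are hypothetical and the real code in
northd.c logs a rate-limited warning instead of returning a status:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Simplified stand-in for the per-datapath multicast state. */
struct mcast_counts {
    atomic_uint_fast64_t active_v4_flows;  /* flows accounted so far */
    uint64_t table_size;                   /* capacity limit */
};

/* Reset the counter.  In the patch, the analogous helper runs at the
 * start of build_mcast_groups(), so every (possibly incremental) flow
 * build starts counting from zero. */
static void
init_flow_count(struct mcast_counts *c)
{
    atomic_store(&c->active_v4_flows, 0);
}

/* Try to account one more flow; returns 0 when the table is full. */
static int
take_flow_slot(struct mcast_counts *c)
{
    uint_fast64_t cur = atomic_load(&c->active_v4_flows);
    for (;;) {
        if (cur >= c->table_size) {
            return 0;  /* table full: caller would log and skip */
        }
        if (atomic_compare_exchange_weak(&c->active_v4_flows,
                                         &cur, cur + 1)) {
            return 1;  /* slot taken */
        }
        /* 'cur' was refreshed by the failed CAS; retry. */
    }
}
```

Without the reset step, the counter survives across builds and eventually
saturates at table_size even though the previous flows were discarded, which
is exactly the symptom the patch fixes.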
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2094710
Signed-off-by: Xavier Simonart
---
v2:
  - rebased on main
  - use platform independent print format
---
 northd/northd.c     |  18 +++-
 tests/system-ovn.at | 125 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 140 insertions(+), 3 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 7e2681865..d8a9ae769 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -1051,7 +1051,12 @@ init_mcast_info_for_switch_datapath(struct ovn_datapath *od)
     mcast_sw_info->query_max_response =
         smap_get_ullong(&od->nbs->other_config, "mcast_query_max_response",
                         OVN_MCAST_DEFAULT_QUERY_MAX_RESPONSE_S);
+}
+
+static void
+init_mcast_flow_count(struct ovn_datapath *od)
+{
+    struct mcast_switch_info *mcast_sw_info = &od->mcast_info.sw;
     mcast_sw_info->active_v4_flows = ATOMIC_VAR_INIT(0);
     mcast_sw_info->active_v6_flows = ATOMIC_VAR_INIT(0);
 }
@@ -8368,6 +8373,10 @@ build_lswitch_ip_mcast_igmp_mld(struct ovn_igmp_group *igmp_group,
         if (atomic_compare_exchange_strong(
                     &mcast_sw_info->active_v4_flows, &table_size,
                     mcast_sw_info->table_size)) {
+            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+
+            VLOG_INFO_RL(&rl, "Too many active mcast flows: %"PRIu64,
+                         mcast_sw_info->active_v4_flows);
             return;
         }
         atomic_add(&mcast_sw_info->active_v4_flows, 1, &dummy);
@@ -15069,6 +15078,11 @@ build_mcast_groups(struct lflow_input *input_data,
     hmap_init(mcast_groups);
     hmap_init(igmp_groups);
+    struct ovn_datapath *od;
+
+    HMAP_FOR_EACH (od, key_node, datapaths) {
+        init_mcast_flow_count(od);
+    }

     HMAP_FOR_EACH (op, key_node, ports) {
         if (op->nbrp && lrport_is_enabled(op->nbrp)) {
@@ -15126,8 +15140,7 @@ build_mcast_groups(struct lflow_input *input_data,
         }

         /* If the datapath value is stale, purge the group. */
-        struct ovn_datapath *od =
-            ovn_datapath_from_sbrec(datapaths, sb_igmp->datapath);
+        od = ovn_datapath_from_sbrec(datapaths, sb_igmp->datapath);

         if (!od || ovn_datapath_is_stale(od)) {
             sbrec_igmp_group_delete(sb_igmp);
@@ -15172,7 +15185,6 @@ build_mcast_groups(struct lflow_input *input_data,
      * IGMP groups are based on the groups learnt by their multicast enabled
      * peers. */
-    struct ovn_datapath *od;

     HMAP_FOR_EACH (od, key_node, datapaths) {
         if (ovs_list_is_empty(&od->mcast_info.groups)) {
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 992813614..87093fbcc 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -8266,3 +8266,128 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([mcast flow count])
+
+ovn_start
+
+OVS_TRAFFIC_VSWITCHD_START()
+ADD_BR([br-int])
+
+# Set external-ids in br-int needed for ovn-controller
+ovs-vsctl \
+    -- set Open_vSwitch . external-ids:system-id=hv1 \
+    -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
+    -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
+    -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
+    -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
+
+# Start ovn-controller
+start_daemon ovn-controller
+
+check ovn-nbctl ls-add ls
+check ovn-nbctl lsp-add ls vm1
+check ovn-nbctl lsp-set-addresses vm1 00:00:00:00:00:01
+check ovn-nbctl lsp-add ls vm2
+check ovn-nbctl lsp-set-addresses vm2 00:00:00:00:00:02
+check ovn-nbctl lsp-add ls vm3
+check ovn-nbctl lsp-set-addresses vm3 00:00:00:00:00:03
+
+check ovn-nbctl set logical_switch ls other_config:mcast_querier=false other_config:mcast_snoop=true other_config:mcast_query_interval=30 other_config:mcast_eth_src=00:00:00:00:00:05 other_config:mcast_ip4_src=42.42.42.5 other_config:mcast_ip6_src=fe80::1 other_config:mcast_idle_timeout=3000
+ovn-sbctl list ip_multicast
+
+wait_igmp_flows_installed()
+{
+    OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int table=31 | \
+    grep 'priority=90' | grep "nw_dst=$1"])
+}
+
+ADD_NAMESPACES(vm1)
+ADD_INT([vm1], [vm1], [br-int], [42.42.42.1/24])
+NS_CHECK_EXEC([vm1], [ip link set vm1 address 00:00:00:00:00:01], [0])
+NS_CHECK_EXEC([vm1], [ip route add default via 42.42.42.5], [0])
+NS_CHECK_EXEC([vm1], [ip -6 addr add 2000::1/24 dev vm1], [0])
+NS_CHECK_EXEC([vm1], [ip -6 route add default via 2000::5], [0])
+check ovs-vsctl set Interface vm1 external_ids:iface-id=vm1
+
+ADD_NAMESPACES(vm2)
+ADD_INT([vm2], [vm2], [br-int], [42.42.42.2/24])
+NS_CHECK_EXEC([vm2], [ip link set vm2 address 00:00:00:00:00:02], [0])
+NS_CHECK_EXEC([vm2], [ip -6 addr add 2000::2/64 dev vm2], [0])
+NS_CHECK_EXEC([vm2], [ip link set lo up], [0])
+check ovs-vsctl set Interface vm2 external_ids:iface-id=vm2
+
+ADD_NAMESPACES(vm3)
+NS_CHECK_EXEC([vm3], [tcpdump -n -i any -nnle > vm3.pcap 2>/dev/null &], [ignore], [ignore])
+
+ADD_INT([vm3], [vm3], [br-int], [42.42.42.3/24])
+NS_CHECK_EXEC([vm3], [ip link set vm3 address 00:00:00:00:00:03], [0])
+NS_CHECK_EXEC([vm3], [ip -6 addr add 2000::3/64 dev vm3], [0])
+NS_CHECK_EXEC([vm3], [ip link set lo up], [0])
+NS_CHECK_EXEC([vm3], [ip route add default via 42.42.42.5], [0])
+NS_CHECK_EXEC([vm3], [ip -6 route add default via 2000::5], [0])
+check ovs-vsctl set Interface vm3 external_ids:iface-id=vm3
+
+NS_CHECK_EXEC([vm2], [sysctl -w net.ipv4.igmp_max_memberships=100], [ignore], [ignore])
+NS_CHECK_EXEC([vm3], [sysctl -w net.ipv4.igmp_max_memberships=100], [ignore], [ignore])
+wait_for_ports_up
+
+NS_CHECK_EXEC([vm3], [ip addr add 228.0.0.1 dev vm3 autojoin], [0])
+wait_igmp_flows_installed 228.0.0.1
+
+NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 228.0.0.1], [ignore], [ignore])
+
+OVS_WAIT_UNTIL([
+    requests=`grep "ICMP echo request" -c vm3.pcap`
+    test "${requests}" -ge "3"
+])
+kill $(pidof tcpdump)
+
+NS_CHECK_EXEC([vm3], [tcpdump -n -i any -nnleX > vm3.pcap 2>/dev/null &], [ignore], [ignore])
+NS_CHECK_EXEC([vm2], [tcpdump -n -i any -nnleX > vm2.pcap 2>/dev/null &], [ignore], [ignore])
+NS_CHECK_EXEC([vm1], [tcpdump -n -i any -nnleX > vm1.pcap 2>/dev/null &], [ignore], [ignore])
+
+for i in `seq 1 40`;do
+    NS_CHECK_EXEC([vm2], [ip addr add 228.1.$i.1 dev vm2 autojoin &], [0])
+    NS_CHECK_EXEC([vm3], [ip addr add 229.1.$i.1 dev vm3 autojoin &], [0])
+    # Do not go too fast: going faster raises the chance of sb being busy,
+    # causing a full recompute (engine has not run).  In this test we do not
+    # want too many recomputes, as they might hide I-P related errors.
+    sleep 0.2
+done
+
+for i in `seq 1 40`;do
+    wait_igmp_flows_installed 228.1.$i.1
+    wait_igmp_flows_installed 229.1.$i.1
+done
+ovn-sbctl list multicast_group
+
+NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2 228.1.1.1], [ignore], [ignore])
+
+OVS_WAIT_UNTIL([
+    requests=`grep "ICMP echo request" -c vm2.pcap`
+    test "${requests}" -ge "3"
+])
+kill $(pidof tcpdump)
+
+# The test could succeed thanks to a lucky northd recompute after hitting
+# too many flows.  Double-check we never hit the error condition.
+AT_CHECK([grep -qE 'Too many active mcast flows' northd/ovn-northd.log], [1])
+
+OVS_APP_EXIT_AND_WAIT([ovn-controller])
+
+as ovn-sb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as ovn-nb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as northd
+OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
+
+as
+OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
+/connection dropped.*/d
+/removing policing failed: No such device/d"])
+AT_CLEANUP
+])