From patchwork Thu Mar 10 03:04:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Numan Siddique X-Patchwork-Id: 1603658 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.138; helo=smtp1.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Received: from smtp1.osuosl.org (smtp1.osuosl.org [140.211.166.138]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4KDYp75hrJz9sGD for ; Thu, 10 Mar 2022 14:05:02 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id 19570832BF; Thu, 10 Mar 2022 03:05:00 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 7on-lBonh1Qx; Thu, 10 Mar 2022 03:04:58 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp1.osuosl.org (Postfix) with ESMTPS id C9D04817AB; Thu, 10 Mar 2022 03:04:57 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id A4BC6C001D; Thu, 10 Mar 2022 03:04:57 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp4.osuosl.org (smtp4.osuosl.org [IPv6:2605:bc80:3010::137]) by lists.linuxfoundation.org (Postfix) with ESMTP id 099BCC000B for ; Thu, 10 Mar 2022 03:04:57 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id F340641740 for ; Thu, 10 Mar 2022 03:04:56 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 8FyzBr37V6EA for ; Thu, 10 Mar 2022 03:04:55 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from relay12.mail.gandi.net (relay12.mail.gandi.net [217.70.178.232]) by smtp4.osuosl.org (Postfix) with ESMTPS id 4D74C415E0 for ; Thu, 10 Mar 2022 03:04:55 +0000 (UTC) Received: (Authenticated sender: numans@ovn.org) by mail.gandi.net (Postfix) with ESMTPSA id 57B6E200007; Thu, 10 Mar 2022 03:04:51 +0000 (UTC) From: numans@ovn.org To: dev@openvswitch.org Date: Wed, 9 Mar 2022 22:04:38 -0500 Message-Id: <20220310030438.4183020-1-numans@ovn.org> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 Cc: Michael Washer Subject: [ovs-dev] [PATCH ovn] ovn-northd: Skip conntrack related stages for router ports in switch pipeline. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" From: Numan Siddique Presently, if the inport or outport is a peer port (of router port), then we skip sending the packet to conntrack by not setting the reg0[0]/reg0[1] bits. But the packet still goes through the stages - ls_in_acl_hint, ls_in_acl and ls_in_stateful in the ingress pipeline and similarly ls_out_acl_hint, ls_out_acl and ls_out_stateful in the egress pipeline of the logical switch. In these mentioned stages, the logical flows match on the ct states (ct.new, ct.est etc) and this can be problematic if the inport or outport is peer port. It is more problematic if the packet was sent to conntrack and committed in the ingress pipeline. When that packet enters the egress pipeline with outport set to the peer port, we skip sending the packet to conntrack (which is expected) but ct state values carry over from the ingress state. If the ct.new flag was set in the ingress pipeline, then the below flows are hit and the packet gets committed in the conntrack zone of the inport logical port. table=4 (ls_out_acl ), priority=1 , match=(ip && (!ct.est || (ct.est && ct_label.blocked == 1))), action=(reg0[1] = 1; next;) table=7 (ls_out_stateful ), priority=100 , match=(reg0[1] == 1 && reg0[13] == 0), action=(ct_commit { ct_label.blocked = 0; }; next;) With this extra commit to the same conntrack zone, sometimes the packet gets dropped or doesn't reach the destination. It is not very clear how the packet gets misplaced, but avoiding this resolves the issue. And OVN ideally should not do this extra commit. To address this issue, this patch adds the logical flows to skip all the conntrack related stages if the inport or outport is peer logical port. table=5 (ls_in_pre_acl ), priority=110 , match=(ip && inport == "sw0-lr0"), action=(next(pipeline=ingress,table=18);) table=0 (ls_out_pre_lb ), priority=110 , match=(ip && outport == "sw0-lr0"), action=(next(pipeline=egress,table=8);) Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2060688 Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2062431 Reported-by: Michael Washer Signed-off-by: Numan Siddique --- northd/northd.c | 27 +++++++++++++++++---------- northd/ovn-northd.8.xml | 6 ++++-- tests/ovn-northd.at | 26 ++++++++++++++++++++++---- 3 files changed, 43 insertions(+), 16 deletions(-) diff --git a/northd/northd.c b/northd/northd.c index b264fb850..15e8b147b 100644 --- a/northd/northd.c +++ b/northd/northd.c @@ -5661,20 +5661,27 @@ skip_port_from_conntrack(struct ovn_datapath *od, struct ovn_port *op, * know about the connection, as the icmp request went through the logical * router on hostA, not hostB. This would only work with distributed * conntrack state across all chassis. */ - struct ds match_in = DS_EMPTY_INITIALIZER; - struct ds match_out = DS_EMPTY_INITIALIZER; + char *ingress_action = + xasprintf("next(pipeline=ingress,table=%d);", + ovn_stage_get_table(S_SWITCH_IN_ARP_ND_RSP)); + char *egress_action = + xasprintf("next(pipeline=egress,table=%d);", + ovn_stage_get_table(S_SWITCH_OUT_PORT_SEC_IP)); + + char *ingress_match = xasprintf("ip && inport == %s", op->json_key); + char *egress_match = xasprintf("ip && outport == %s", op->json_key); - ds_put_format(&match_in, "ip && inport == %s", op->json_key); - ds_put_format(&match_out, "ip && outport == %s", op->json_key); ovn_lflow_add_with_lport_and_hint(lflows, od, in_stage, priority, - ds_cstr(&match_in), "next;", op->key, - &op->nbsp->header_); + ingress_match, ingress_action, + op->key, &op->nbsp->header_); ovn_lflow_add_with_lport_and_hint(lflows, od, out_stage, priority, - ds_cstr(&match_out), "next;", op->key, - &op->nbsp->header_); + egress_match, egress_action, + op->key, &op->nbsp->header_); - ds_destroy(&match_in); - ds_destroy(&match_out); + free(ingress_action); + free(egress_action); + free(ingress_match); + free(egress_match); } static void diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml index 4784bff04..217f10ead 100644 --- a/northd/ovn-northd.8.xml +++ b/northd/ovn-northd.8.xml @@ -505,7 +505,8 @@ Pre-stateful to send IP packets to the connection tracker before eventually advancing to ingress table ACLs. If special ports such as route ports or localnet ports can't use ct(), a - priority-110 flow is added to skip over stateful ACLs. Multicast, IPv6 + priority-110 flow is added which skips all the stateful stages and + advances to the next stage after the stateful. Multicast, IPv6 Neighbor Discovery and MLD traffic also skips stateful ACLs. For "allow-stateless" ACLs, a flow is added to bypass setting the hint for connection tracker processing. @@ -589,7 +590,8 @@

This table also has a priority-110 flow with the match inport == I for all logical switch - datapaths to move traffic to the next table. Where I + datapaths which skips all the stateful stages and advances + to the next stage after the stateful. Where I is the peer of a logical router port. This flow is added to skip the connection tracking of packets which enter from logical router datapath to logical switch datapath. diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at index 60d91a771..f228e07cb 100644 --- a/tests/ovn-northd.at +++ b/tests/ovn-northd.at @@ -3934,7 +3934,7 @@ check_stateful_flows() { table=6 (ls_in_pre_lb ), priority=100 , match=(ip), action=(reg0[[2]] = 1; next;) table=6 (ls_in_pre_lb ), priority=110 , match=(eth.dst == $svc_monitor_mac), action=(next;) table=6 (ls_in_pre_lb ), priority=110 , match=(eth.mcast), action=(next;) - table=6 (ls_in_pre_lb ), priority=110 , match=(ip && inport == "sw0-lr0"), action=(next;) + table=6 (ls_in_pre_lb ), priority=110 , match=(ip && inport == "sw0-lr0"), action=(next(pipeline=ingress,table=18);) table=6 (ls_in_pre_lb ), priority=110 , match=(nd || nd_rs || nd_ra || mldv1 || mldv2), action=(next;) ]) @@ -3967,7 +3967,7 @@ check_stateful_flows() { table=0 (ls_out_pre_lb ), priority=100 , match=(ip), action=(reg0[[2]] = 1; next;) table=0 (ls_out_pre_lb ), priority=110 , match=(eth.mcast), action=(next;) table=0 (ls_out_pre_lb ), priority=110 , match=(eth.src == $svc_monitor_mac), action=(next;) - table=0 (ls_out_pre_lb ), priority=110 , match=(ip && outport == "sw0-lr0"), action=(next;) + table=0 (ls_out_pre_lb ), priority=110 , match=(ip && outport == "sw0-lr0"), action=(next(pipeline=egress,table=8);) table=0 (ls_out_pre_lb ), priority=110 , match=(nd || nd_rs || nd_ra || mldv1 || mldv2), action=(next;) ]) @@ -4006,10 +4006,19 @@ AT_CHECK([grep "ls_in_pre_lb" sw0flows | sort], [0], [dnl table=6 (ls_in_pre_lb ), priority=0 , match=(1), action=(next;) table=6 (ls_in_pre_lb ), priority=110 , match=(eth.dst == $svc_monitor_mac), action=(next;) table=6 (ls_in_pre_lb ), priority=110 , match=(eth.mcast), action=(next;) - table=6 (ls_in_pre_lb ), priority=110 , match=(ip && inport == "sw0-lr0"), action=(next;) + table=6 (ls_in_pre_lb ), priority=110 , match=(ip && inport == "sw0-lr0"), action=(next(pipeline=ingress,table=18);) table=6 (ls_in_pre_lb ), priority=110 , match=(nd || nd_rs || nd_ra || mldv1 || mldv2), action=(next;) ]) +AT_CHECK([grep "ls_in_pre_acl" sw0flows | sed 's/table=./table=?/' | sort], [0], [dnl + table=? (ls_in_pre_acl ), priority=0 , match=(1), action=(next;) + table=? (ls_in_pre_acl ), priority=100 , match=(ip), action=(reg0[[0]] = 1; next;) + table=? (ls_in_pre_acl ), priority=110 , match=(eth.dst == $svc_monitor_mac), action=(next;) + table=? (ls_in_pre_acl ), priority=110 , match=(eth.mcast), action=(next;) + table=? (ls_in_pre_acl ), priority=110 , match=(ip && inport == "sw0-lr0"), action=(next(pipeline=ingress,table=18);) + table=? (ls_in_pre_acl ), priority=110 , match=(nd || nd_rs || nd_ra || mldv1 || mldv2 || (udp && udp.src == 546 && udp.dst == 547)), action=(next;) +]) + AT_CHECK([grep "ls_in_pre_stateful" sw0flows | sort], [0], [dnl table=7 (ls_in_pre_stateful ), priority=0 , match=(1), action=(next;) table=7 (ls_in_pre_stateful ), priority=100 , match=(reg0[[0]] == 1), action=(ct_next;) @@ -4036,10 +4045,19 @@ AT_CHECK([grep "ls_out_pre_lb" sw0flows | sort], [0], [dnl table=0 (ls_out_pre_lb ), priority=0 , match=(1), action=(next;) table=0 (ls_out_pre_lb ), priority=110 , match=(eth.mcast), action=(next;) table=0 (ls_out_pre_lb ), priority=110 , match=(eth.src == $svc_monitor_mac), action=(next;) - table=0 (ls_out_pre_lb ), priority=110 , match=(ip && outport == "sw0-lr0"), action=(next;) + table=0 (ls_out_pre_lb ), priority=110 , match=(ip && outport == "sw0-lr0"), action=(next(pipeline=egress,table=8);) table=0 (ls_out_pre_lb ), priority=110 , match=(nd || nd_rs || nd_ra || mldv1 || mldv2), action=(next;) ]) +AT_CHECK([grep "ls_out_pre_acl" sw0flows | sed 's/table=./table=?/' | sort], [0], [dnl + table=? (ls_out_pre_acl ), priority=0 , match=(1), action=(next;) + table=? (ls_out_pre_acl ), priority=100 , match=(ip), action=(reg0[[0]] = 1; next;) + table=? (ls_out_pre_acl ), priority=110 , match=(eth.mcast), action=(next;) + table=? (ls_out_pre_acl ), priority=110 , match=(eth.src == $svc_monitor_mac), action=(next;) + table=? (ls_out_pre_acl ), priority=110 , match=(ip && outport == "sw0-lr0"), action=(next(pipeline=egress,table=8);) + table=? (ls_out_pre_acl ), priority=110 , match=(nd || nd_rs || nd_ra || mldv1 || mldv2 || (udp && udp.src == 546 && udp.dst == 547)), action=(next;) +]) + AT_CHECK([grep "ls_out_pre_stateful" sw0flows | sort], [0], [dnl table=2 (ls_out_pre_stateful), priority=0 , match=(1), action=(next;) table=2 (ls_out_pre_stateful), priority=100 , match=(reg0[[0]] == 1), action=(ct_next;)