Message ID | 20200424075507.1811244-1-numans@ovn.org |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev,ovn] Fix conntrack entry leaks because of TCP RST packets not sent to conntrack. | expand |
On 4/24/20 9:55 AM, numans@ovn.org wrote: > From: Numan Siddique <numans@ovn.org> > > The commit [1] - 28097d5adb95("Fix tcp_reset action handling") fixed an issue > with tcp_reset OVN action. In order to fix that issue, this commit added > logical flows to skip all the TCP RST packets from conntrack. > Ideally it should have skipped only the TCP RST packets generated by > ovn-controller from conntrack. Since all the TCP RST packets are > skipped from conntrack, the connections in conntrack remain in > ESTABLISHED state even if the client/server sends TCP RST to close the > connection. And these entries live for a long time and this is > causing performance issues as reported in the BZ. > > This patch reverts the logical flows added in [1] and modifies the inner > actions of tcp_reset in the ingress logical switch pipeline > from - "tcp_reset { outport <-> inport; output; }" > to "tcp_reset { output <-> inport; next(pipeline=egress,table=5); }". > This causes the packet to resubmit to the egress table ls_out_qos_mark > skipping the egress ACL stage. Prior to this packet, next action was > not allowing a resubmit from ingress to egress pipeline. This patch > relaxes this limitation. > > For the tcp_reset action in the egress logical switch pipeline, this patch > modifies the inner action > from - "tcp_reset { outport <-> inport; next(pipeline=ingress,table=0); }" > to - "tcp_reset { outport <-> inport; next(pipeline=ingress,table=19); }". > This causes the packet to enter the ingress table ls_in_l2_lkup. > > We don't see similar conntrack leaks with UDP. Although there is an issue > with the acl reject action for UDP packets. When ovn-controller generates icmp > destination unreachable packet, it doesn't get delivered. And the IP checksum is > incorrect in this packet. A follow up patch will fix these issues. > > [1] - 28097d5adb95("Fix tcp_reset action handling") > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1819785 > Co-Authored-by: Tim Rozet <trozet@redhat.com> > Signed-off-by: Tim Rozet <trozet@redhat.com> > Signed-off-by: Numan Siddique <numans@ovn.org> Hi Numan, Looks good to me. Only a question inline regarding options for avoiding hardcoding table IDs. Except for that: Acked-by: Dumitru Ceara <dceara@redhat.com> Thanks, Dumitru > --- > lib/actions.c | 6 +- > northd/ovn-northd.8.xml | 8 ++ > northd/ovn-northd.c | 14 ++-- > ovn-sb.xml | 10 ++- > tests/automake.mk | 3 +- > tests/ovn.at | 6 +- > tests/system-ovn.at | 170 +++++++++++++++++++++++++++++++++++----- > tests/test-tcp-rst.py | 37 +++++++++ > 8 files changed, 216 insertions(+), 38 deletions(-) > create mode 100644 tests/test-tcp-rst.py > > diff --git a/lib/actions.c b/lib/actions.c > index c19813de0..91619c889 100644 > --- a/lib/actions.c > +++ b/lib/actions.c > @@ -319,11 +319,7 @@ parse_NEXT(struct action_context *ctx) > } > } > > - if (pipeline == OVNACT_P_EGRESS && ctx->pp->pipeline == OVNACT_P_INGRESS) { > - lexer_error(ctx->lexer, > - "\"next\" action cannot advance from ingress to egress " > - "pipeline (use \"output\" action instead)"); > - } else if (table >= ctx->pp->n_tables) { > + if (table >= ctx->pp->n_tables) { > lexer_error(ctx->lexer, > "\"next\" action cannot advance beyond table %d.", > ctx->pp->n_tables - 1); > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml > index efcc4b7fc..d39e259f6 100644 > --- a/northd/ovn-northd.8.xml > +++ b/northd/ovn-northd.8.xml > @@ -373,6 +373,14 @@ > for new connections and <code>reg0[1] = 1; next;</code> for existing > connections. > </li> > + <li> > + <code>reject</code> ACLs translate into logical > + flows with the > + <code>tcp_reset { output <-> inport; > + next(pipeline=egress,table=5);}</code> > + action for TCP connections and <code>icmp4/icmp6</code> action > + for UDP connections. > + </li> > <li> > Other ACLs translate to <code>drop;</code> for new or untracked > connections and <code>ct_commit(ct_label=1/1);</code> for known > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > index c4675bd68..f8ae155a2 100644 > --- a/northd/ovn-northd.c > +++ b/northd/ovn-northd.c > @@ -4721,11 +4721,11 @@ build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) > * unreachable packets. */ > ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110, > "nd || nd_rs || nd_ra || icmp4.type == 3 || " > - "icmp6.type == 1 || (tcp && tcp.flags == 20) || " > + "icmp6.type == 1 || " > "(udp && udp.src == 546 && udp.dst == 547)", "next;"); > ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110, > "nd || nd_rs || nd_ra || icmp4.type == 3 || " > - "icmp6.type == 1 || (tcp && tcp.flags == 20) ||" > + "icmp6.type == 1 || " > "(udp && udp.src == 546 && udp.dst == 547)", "next;"); > > /* Ingress and Egress Pre-ACL Table (Priority 100). > @@ -4838,11 +4838,11 @@ build_pre_lb(struct ovn_datapath *od, struct hmap *lflows, > /* Do not send ND packets to conntrack */ > ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 110, > "nd || nd_rs || nd_ra || icmp4.type == 3 ||" > - "icmp6.type == 1 || (tcp && tcp.flags == 20)", > + "icmp6.type == 1", > "next;"); > ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 110, > "nd || nd_rs || nd_ra || icmp4.type == 3 ||" > - "icmp6.type == 1 || (tcp && tcp.flags == 20)", > + "icmp6.type == 1", > "next;"); > > /* Do not send service monitor packets to conntrack. */ > @@ -4988,7 +4988,8 @@ build_reject_acl_rules(struct ovn_datapath *od, struct hmap *lflows, > ds_put_format(&actions, "reg0 = 0; " > "eth.dst <-> eth.src; ip4.dst <-> ip4.src; " > "tcp_reset { outport <-> inport; %s };", > - ingress ? "output;" : "next(pipeline=ingress,table=0);"); > + ingress ? "next(pipeline=egress,table=5);" > + : "next(pipeline=ingress,table=19);"); Do you think there's a way we could avoid hardcoding the table numbers? I'm thinking of the case when a stage is added/deleted and the ls_out_qos_mark/"ls_in_l2_lkup" end up having different table ids. I don't think a macro definition is enough as that still needs updating every time a table is added/deleted. > ovn_lflow_add_with_hint(lflows, od, stage, > acl->priority + OVN_ACL_PRI_OFFSET + 10, > ds_cstr(&match), ds_cstr(&actions), stage_hint); > @@ -5002,7 +5003,8 @@ build_reject_acl_rules(struct ovn_datapath *od, struct hmap *lflows, > ds_put_format(&actions, "reg0 = 0; " > "eth.dst <-> eth.src; ip6.dst <-> ip6.src; " > "tcp_reset { outport <-> inport; %s };", > - ingress ? "output;" : "next(pipeline=ingress,table=0);"); > + ingress ? "next(pipeline=egress,table=5);" > + : "next(pipeline=ingress,table=19);"); Same question as above. > ovn_lflow_add_with_hint(lflows, od, stage, > acl->priority + OVN_ACL_PRI_OFFSET + 10, > ds_cstr(&match), ds_cstr(&actions), stage_hint); > diff --git a/ovn-sb.xml b/ovn-sb.xml > index 5f8da534c..3aa7cd4da 100644 > --- a/ovn-sb.xml > +++ b/ovn-sb.xml > @@ -1112,10 +1112,12 @@ > <var>pipeline</var> as a subroutine. The default <var>table</var> is > just after the current one. If <var>pipeline</var> is specified, it > may be <code>ingress</code> or <code>egress</code>; the default > - <var>pipeline</var> is the one currently executing. Actions in the > - ingress pipeline may not use <code>next</code> to jump into the > - egress pipeline (use the <code>output</code> instead), but > - transitions in the opposite direction are allowed. > + <var>pipeline</var> is the one currently executing. Actions in the > + both ingress and egress pipeline can use <code>next</code> to jump > + across the other pipeline. Actions in the ingress pipeline should > + use <code>next</code> to jump into the specific table of egress > + pipeline only if it is certain that the packets are local and not > + tunnelled and wants to skip certain stages in the packet processing. > </dd> > > <dt><code><var>field</var> = <var>constant</var>;</code></dt> > diff --git a/tests/automake.mk b/tests/automake.mk > index 215fb432b..ed530dd77 100644 > --- a/tests/automake.mk > +++ b/tests/automake.mk > @@ -205,7 +205,8 @@ tests_ovstest_LDADD = $(OVS_LIBDIR)/libopenvswitch.la lib/libovn.la > # Python tests. > CHECK_PYFILES = \ > tests/test-l7.py \ > - tests/uuidfilt.py > + tests/uuidfilt.py \ > + tests/test-tcp-rst.py > > EXTRA_DIST += $(CHECK_PYFILES) > PYCOV_CLEAN_FILES += $(CHECK_PYFILES:.py=.py,cover) .coverage > diff --git a/tests/ovn.at b/tests/ovn.at > index 2a7ee7528..b00670816 100644 > --- a/tests/ovn.at > +++ b/tests/ovn.at > @@ -852,7 +852,11 @@ next(pipeline=ingress, table=11); > encodes as resubmit(,19) > > next(pipeline=egress); > - "next" action cannot advance from ingress to egress pipeline (use "output" action instead) > + formats as next(pipeline=egress, table=11); > + encodes as resubmit(,51) > + > +next(pipeline=egress, table=5); > + encodes as resubmit(,45) > > next(table=10); > formats as next(10); > diff --git a/tests/system-ovn.at b/tests/system-ovn.at > index bdb9768d2..3b11cf92b 100644 > --- a/tests/system-ovn.at > +++ b/tests/system-ovn.at > @@ -3719,60 +3719,86 @@ start_daemon ovn-controller > > ovn-nbctl ls-add sw0 > > -ovn-nbctl lsp-add sw0 sw0-p1 > -ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3 aef0::3" > -ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3 aef0::3" > +ovn-nbctl lsp-add sw0 sw0-p1-rej > +ovn-nbctl lsp-set-addresses sw0-p1-rej "50:54:00:00:00:03 10.0.0.3 aef0::3" > +ovn-nbctl lsp-set-port-security sw0-p1-rej "50:54:00:00:00:03 10.0.0.3 aef0::3" > > -ovn-nbctl lsp-add sw0 sw0-p2 > -ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4 aef0::4" > -ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:04 10.0.0.4 aef0::4" > +ovn-nbctl lsp-add sw0 sw0-p2-rej > +ovn-nbctl lsp-set-addresses sw0-p2-rej "50:54:00:00:00:04 10.0.0.4 aef0::4" > +ovn-nbctl lsp-set-port-security sw0-p2-rej "50:54:00:00:00:04 10.0.0.4 aef0::4" > + > +#ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p1\" && tcp && tcp.dst == 80" reject > +#ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p2\" && ip6 && tcp && tcp.dst == 80" reject > + > +# Create port group and ACLs for sw0 ports. > +ovn-nbctl pg-add pg0_drop sw0-p1-rej sw0-p2-rej > +ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && ip" drop > +ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && ip" drop > + > +ovn-nbctl pg-add pg0 sw0-p1-rej sw0-p2-rej > +ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4" allow-related > +ovn-nbctl --log acl-add pg0 from-lport 1004 "inport == @pg0 && ip && tcp && tcp.dst == 80" reject > > -ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p1\" && tcp && tcp.dst == 80" reject > -ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p2\" && ip6 && tcp && tcp.dst == 80" reject > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && icmp4" allow-related > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && tcp && tcp.dst == 82" allow-related > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && udp && udp.dst == 82" allow-related > +ovn-nbctl --log acl-add pg0 to-lport 1004 "inport == @pg0 && ip && tcp && tcp.dst == 84" reject > > OVN_POPULATE_ARP > ovn-nbctl --wait=hv sync > > -ADD_NAMESPACES(sw0-p1) > -ADD_VETH(sw0-p1, sw0-p1, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ > +ADD_NAMESPACES(sw0-p1-rej) > +ADD_VETH(sw0-p1-rej, sw0-p1-rej, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ > "10.0.0.1") > > -ADD_NAMESPACES(sw0-p2) > -ADD_VETH(sw0-p2, sw0-p2, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ > +ADD_NAMESPACES(sw0-p2-rej) > +ADD_VETH(sw0-p2-rej, sw0-p2-rej, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ > "10.0.0.1") > > -NS_CHECK_EXEC([sw0-p1], [ip a a aef0::3/64 dev sw0-p1], [0]) > -NS_CHECK_EXEC([sw0-p2], [ip a a aef0::4/64 dev sw0-p2], [0]) > +NS_CHECK_EXEC([sw0-p1-rej], [ip a a aef0::3/64 dev sw0-p1-rej], [0]) > +NS_CHECK_EXEC([sw0-p2-rej], [ip a a aef0::4/64 dev sw0-p2-rej], [0]) > > -# Capture packets in sw0-p1. > -NS_CHECK_EXEC([sw0-p1], [tcpdump -n -c 2 -i sw0-p1 tcp port 80 > sw0-p1-ip4.pcap &], [0]) > +# Capture packets in sw0-p1-rej. > +NS_CHECK_EXEC([sw0-p1-rej], [tcpdump -n -c 2 -i sw0-p1-rej tcp port 80 > sw0-p1-rej-ip4.pcap &], [0]) > sleep 1 > > -NS_CHECK_EXEC([sw0-p1], [nc 10.0.0.4 80], [1], [], > +NS_CHECK_EXEC([sw0-p1-rej], [nc 10.0.0.4 80], [1], [], > [dnl > Ncat: Connection refused. > ]) > > OVS_WAIT_UNTIL([ > - total=`cat sw0-p1-ip4.pcap | wc -l` > + total=`cat sw0-p1-rej-ip4.pcap | wc -l` > echo "total = $total" > test "${total}" = "2" > ]) > > +# Now send traffic to port 84 > +NS_CHECK_EXEC([sw0-p1-rej], [nc 10.0.0.4 84], [1], [], > +[dnl > +Ncat: Connection refused. > +]) > + > +AT_CHECK([ > + n_pkt=$(ovs-ofctl dump-flows br-int table=44 | grep -v n_packets=0 | \ > +grep controller | grep tp_dst=84 -c) > + test $n_pkt -eq 1 > +]) > + > # Without this sleep, test case fails intermittently. > sleep 3 > > -NS_CHECK_EXEC([sw0-p2], [tcpdump -n -c 2 -i sw0-p2 tcp port 80 > sw0-p2-ip6.pcap &], [0]) > +NS_CHECK_EXEC([sw0-p2-rej], [tcpdump -n -c 2 -i sw0-p2-rej tcp port 80 > sw0-p2-rej-ip6.pcap &], [0]) > > sleep 1 > > -NS_CHECK_EXEC([sw0-p2], [nc -6 aef0::3 80], [1], [], > +NS_CHECK_EXEC([sw0-p2-rej], [nc -6 aef0::3 80], [1], [], > [dnl > Ncat: Connection refused. > ]) > > OVS_WAIT_UNTIL([ > - total=`cat sw0-p2-ip6.pcap | wc -l` > + total=`cat sw0-p2-rej-ip6.pcap | wc -l` > echo "total = $total" > test "${total}" = "2" > ]) > @@ -3936,3 +3962,105 @@ as > OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d > /.*terminating with signal 15.*/d"]) > AT_CLEANUP > + > +# Tests that when an established connection sends TCP reset, > +# the conntrack entry is not in established state. > +AT_SETUP([ovn -- conntrack TCP reset]) > +AT_KEYWORDS([conntrack]) > +ovn_start > + > +OVS_TRAFFIC_VSWITCHD_START() > +ADD_BR([br-int]) > + > +# Set external-ids in br-int needed for ovn-controller > +ovs-vsctl \ > + -- set Open_vSwitch . external-ids:system-id=hv1 \ > + -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \ > + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \ > + -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \ > + -- set bridge br-int fail-mode=secure other-config:disable-in-band=true > + > +# Start ovn-controller > +start_daemon ovn-controller > + > +ovn-nbctl ls-add sw0 > + > +ovn-nbctl lsp-add sw0 rst-p1 > +ovn-nbctl lsp-set-addresses rst-p1 "50:54:00:00:00:03" > +ovn-nbctl lsp-set-port-security rst-p1 "50:54:00:00:00:03" > + > +ovn-nbctl lsp-add sw0 rst-p2 > +ovn-nbctl lsp-set-addresses rst-p2 "50:54:00:00:00:04 10.0.0.4" > +ovn-nbctl lsp-set-port-security rst-p2 "50:54:00:00:00:04 10.0.0.4" > + > +# Create port group and ACLs for sw0 ports. > +ovn-nbctl pg-add pg0_drop rst-p1 rst-p2 > +ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && ip" drop > +ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && ip" drop > + > +ovn-nbctl pg-add pg0 rst-p1 rst-p2 > +ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4" allow-related > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && icmp4" allow-related > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && tcp && tcp.dst == 80" allow-related > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && udp && udp.dst == 80" allow-related > + > +# Create a logical router and attach to logical switch. > +ovn-nbctl lr-add lr0 > +ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 > +ovn-nbctl lsp-add sw0 sw0-lr0 > +ovn-nbctl lsp-set-type sw0-lr0 router > +ovn-nbctl lsp-set-addresses sw0-lr0 router > +ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 > + > +ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80 > +ovn-nbctl --wait=sb ls-lb-add sw0 lb1 > +ovn-nbctl --wait=sb lr-lb-add lr0 lb1 > + > +OVN_POPULATE_ARP > +ovn-nbctl --wait=hv sync > + > +ADD_NAMESPACES(rst-p1) > +ADD_VETH(rst-p1, rst-p1, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ > + "10.0.0.1") > + > +ADD_NAMESPACES(rst-p2) > +ADD_VETH(rst-p2, rst-p2, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ > + "10.0.0.1") > + > +OVS_WAIT_UNTIL([test x$(ovn-nbctl lsp-get-up rst-p1) = xup]) > +OVS_WAIT_UNTIL([test x$(ovn-nbctl lsp-get-up rst-p2) = xup]) > + > +# Start webservers in 'rst-p1'. > +OVS_START_L7([rst-p1], [http]) > + > +NS_CHECK_EXEC([rst-p2], [$PYTHON $srcdir/test-tcp-rst.py --dst-port 80 --dst-ip 10.0.0.10]) > + > +# When tcp reset is sent, conntrack entry should be in the state - CLOSED or CLOSING. > +# But there is a bug where tcp reset packet was not sent to the conntrack. > +# This test case checks that the tcp reset packet is sent to conntrack > +# and the state is not in established state. > +AT_CHECK([ > + ct_est_count=$(ovs-appctl dpctl/dump-conntrack | grep 10.0.0.10 | grep state=ESTABLISHED -c) > + test $ct_est_count -eq 0 > + > + ct_est_count=$(ovs-appctl dpctl/dump-conntrack | grep 10.0.0.10 | grep state=CLOS -c) > + test $ct_est_count -eq 1 > +]) > + > +OVS_APP_EXIT_AND_WAIT([ovn-controller]) > + > +as ovn-sb > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > + > +as ovn-nb > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > + > +as northd > +OVS_APP_EXIT_AND_WAIT([ovn-northd]) > + > +as > +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d > +/connection dropped.*/d > +/Service monitor not found.*/d"]) > + > +AT_CLEANUP > diff --git a/tests/test-tcp-rst.py b/tests/test-tcp-rst.py > new file mode 100644 > index 000000000..6f96c5706 > --- /dev/null > +++ b/tests/test-tcp-rst.py > @@ -0,0 +1,37 @@ > +#!/usr/bin/env python3 > +# Copyright (c) 2020 Red Hat, Inc. > +# > +# Licensed under the Apache License, Version 2.0 (the "License"); > +# you may not use this file except in compliance with the License. > +# You may obtain a copy of the License at: > +# > +# http://www.apache.org/licenses/LICENSE-2.0 > +# > +# Unless required by applicable law or agreed to in writing, software > +# distributed under the License is distributed on an "AS IS" BASIS, > +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > +# See the License for the specific language governing permissions and > +# limitations under the License. > + > +# Simple python script which connects to tcp server and then > +# resets the connection. > +import argparse > +import socket > +import sys > +import struct > +import time > + > +parser = argparse.ArgumentParser(description='') > +parser.add_argument("--src-port", type=int, default=11337, help="source port to use") > +parser.add_argument("--dst-port", type=int, help="dst port to use") > +parser.add_argument("--dst-ip", help="server ip to use") > +args = parser.parse_args() > +sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) > +server_address = (args.dst_ip, args.dst_port) > +sock.bind(('0.0.0.0', args.src_port)) > +sock.connect(server_address) > +l_onoff = 1 > +l_linger = 0 > +time.sleep(1) > +sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', l_onoff, l_linger)) > +sock.close() >
> > On 4/24/20 9:55 AM, numans@ovn.org wrote: > > From: Numan Siddique <numans@ovn.org> > > > > The commit [1] - 28097d5adb95("Fix tcp_reset action handling") fixed an issue > > with tcp_reset OVN action. In order to fix that issue, this commit added > > logical flows to skip all the TCP RST packets from conntrack. > > Ideally it should have skipped only the TCP RST packets generated by > > ovn-controller from conntrack. Since all the TCP RST packets are > > skipped from conntrack, the connections in conntrack remain in > > ESTABLISHED state even if the client/server sends TCP RST to close the > > connection. And these entries live for a long time and this is > > causing performance issues as reported in the BZ. > > > > This patch reverts the logical flows added in [1] and modifies the inner > > actions of tcp_reset in the ingress logical switch pipeline > > from - "tcp_reset { outport <-> inport; output; }" > > to "tcp_reset { output <-> inport; next(pipeline=egress,table=5); }". > > This causes the packet to resubmit to the egress table ls_out_qos_mark > > skipping the egress ACL stage. Prior to this packet, next action was > > not allowing a resubmit from ingress to egress pipeline. This patch > > relaxes this limitation. > > > > For the tcp_reset action in the egress logical switch pipeline, this patch > > modifies the inner action > > from - "tcp_reset { outport <-> inport; next(pipeline=ingress,table=0); }" > > to - "tcp_reset { outport <-> inport; next(pipeline=ingress,table=19); }". > > This causes the packet to enter the ingress table ls_in_l2_lkup. > > > > We don't see similar conntrack leaks with UDP. Although there is an issue > > with the acl reject action for UDP packets. When ovn-controller generates icmp > > destination unreachable packet, it doesn't get delivered. And the IP checksum is > > incorrect in this packet. A follow up patch will fix these issues. > > > > [1] - 28097d5adb95("Fix tcp_reset action handling") > > > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1819785 > > Co-Authored-by: Tim Rozet <trozet@redhat.com> > > Signed-off-by: Tim Rozet <trozet@redhat.com> > > Signed-off-by: Numan Siddique <numans@ovn.org> Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> > > Hi Numan, > > Looks good to me. Only a question inline regarding options for avoiding > hardcoding table IDs. Except for that: > > Acked-by: Dumitru Ceara <dceara@redhat.com> > > Thanks, > Dumitru > > > --- > > lib/actions.c | 6 +- > > northd/ovn-northd.8.xml | 8 ++ > > northd/ovn-northd.c | 14 ++-- > > ovn-sb.xml | 10 ++- > > tests/automake.mk | 3 +- > > tests/ovn.at | 6 +- > > tests/system-ovn.at | 170 +++++++++++++++++++++++++++++++++++----- > > tests/test-tcp-rst.py | 37 +++++++++ > > 8 files changed, 216 insertions(+), 38 deletions(-) > > create mode 100644 tests/test-tcp-rst.py > > > > diff --git a/lib/actions.c b/lib/actions.c > > index c19813de0..91619c889 100644 > > --- a/lib/actions.c > > +++ b/lib/actions.c > > @@ -319,11 +319,7 @@ parse_NEXT(struct action_context *ctx) > > } > > } > > > > - if (pipeline == OVNACT_P_EGRESS && ctx->pp->pipeline == OVNACT_P_INGRESS) { > > - lexer_error(ctx->lexer, > > - "\"next\" action cannot advance from ingress to egress " > > - "pipeline (use \"output\" action instead)"); > > - } else if (table >= ctx->pp->n_tables) { > > + if (table >= ctx->pp->n_tables) { > > lexer_error(ctx->lexer, > > "\"next\" action cannot advance beyond table %d.", > > ctx->pp->n_tables - 1); > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml > > index efcc4b7fc..d39e259f6 100644 > > --- a/northd/ovn-northd.8.xml > > +++ b/northd/ovn-northd.8.xml > > @@ -373,6 +373,14 @@ > > for new connections and <code>reg0[1] = 1; next;</code> for existing > > connections. > > </li> > > + <li> > > + <code>reject</code> ACLs translate into logical > > + flows with the > > + <code>tcp_reset { output <-> inport; > > + next(pipeline=egress,table=5);}</code> > > + action for TCP connections and <code>icmp4/icmp6</code> action > > + for UDP connections. > > + </li> > > <li> > > Other ACLs translate to <code>drop;</code> for new or untracked > > connections and <code>ct_commit(ct_label=1/1);</code> for known > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > > index c4675bd68..f8ae155a2 100644 > > --- a/northd/ovn-northd.c > > +++ b/northd/ovn-northd.c > > @@ -4721,11 +4721,11 @@ build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) > > * unreachable packets. */ > > ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110, > > "nd || nd_rs || nd_ra || icmp4.type == 3 || " > > - "icmp6.type == 1 || (tcp && tcp.flags == 20) || " > > + "icmp6.type == 1 || " > > "(udp && udp.src == 546 && udp.dst == 547)", "next;"); > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110, > > "nd || nd_rs || nd_ra || icmp4.type == 3 || " > > - "icmp6.type == 1 || (tcp && tcp.flags == 20) ||" > > + "icmp6.type == 1 || " > > "(udp && udp.src == 546 && udp.dst == 547)", "next;"); > > > > /* Ingress and Egress Pre-ACL Table (Priority 100). > > @@ -4838,11 +4838,11 @@ build_pre_lb(struct ovn_datapath *od, struct hmap *lflows, > > /* Do not send ND packets to conntrack */ > > ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 110, > > "nd || nd_rs || nd_ra || icmp4.type == 3 ||" > > - "icmp6.type == 1 || (tcp && tcp.flags == 20)", > > + "icmp6.type == 1", > > "next;"); > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 110, > > "nd || nd_rs || nd_ra || icmp4.type == 3 ||" > > - "icmp6.type == 1 || (tcp && tcp.flags == 20)", > > + "icmp6.type == 1", > > "next;"); > > > > /* Do not send service monitor packets to conntrack. */ > > @@ -4988,7 +4988,8 @@ build_reject_acl_rules(struct ovn_datapath *od, struct hmap *lflows, > > ds_put_format(&actions, "reg0 = 0; " > > "eth.dst <-> eth.src; ip4.dst <-> ip4.src; " > > "tcp_reset { outport <-> inport; %s };", > > - ingress ? "output;" : "next(pipeline=ingress,table=0);"); > > + ingress ? "next(pipeline=egress,table=5);" > > + : "next(pipeline=ingress,table=19);"); > > Do you think there's a way we could avoid hardcoding the table numbers? > I'm thinking of the case when a stage is added/deleted and the > ls_out_qos_mark/"ls_in_l2_lkup" end up having different table ids. I > don't think a macro definition is enough as that still needs updating > every time a table is added/deleted. > > > ovn_lflow_add_with_hint(lflows, od, stage, > > acl->priority + OVN_ACL_PRI_OFFSET + 10, > > ds_cstr(&match), ds_cstr(&actions), stage_hint); > > @@ -5002,7 +5003,8 @@ build_reject_acl_rules(struct ovn_datapath *od, struct hmap *lflows, > > ds_put_format(&actions, "reg0 = 0; " > > "eth.dst <-> eth.src; ip6.dst <-> ip6.src; " > > "tcp_reset { outport <-> inport; %s };", > > - ingress ? "output;" : "next(pipeline=ingress,table=0);"); > > + ingress ? "next(pipeline=egress,table=5);" > > + : "next(pipeline=ingress,table=19);"); > > Same question as above. > > > ovn_lflow_add_with_hint(lflows, od, stage, > > acl->priority + OVN_ACL_PRI_OFFSET + 10, > > ds_cstr(&match), ds_cstr(&actions), stage_hint); > > diff --git a/ovn-sb.xml b/ovn-sb.xml > > index 5f8da534c..3aa7cd4da 100644 > > --- a/ovn-sb.xml > > +++ b/ovn-sb.xml > > @@ -1112,10 +1112,12 @@ > > <var>pipeline</var> as a subroutine. The default <var>table</var> is > > just after the current one. If <var>pipeline</var> is specified, it > > may be <code>ingress</code> or <code>egress</code>; the default > > - <var>pipeline</var> is the one currently executing. Actions in the > > - ingress pipeline may not use <code>next</code> to jump into the > > - egress pipeline (use the <code>output</code> instead), but > > - transitions in the opposite direction are allowed. > > + <var>pipeline</var> is the one currently executing. Actions in the > > + both ingress and egress pipeline can use <code>next</code> to jump > > + across the other pipeline. Actions in the ingress pipeline should > > + use <code>next</code> to jump into the specific table of egress > > + pipeline only if it is certain that the packets are local and not > > + tunnelled and wants to skip certain stages in the packet processing. > > </dd> > > > > <dt><code><var>field</var> = <var>constant</var>;</code></dt> > > diff --git a/tests/automake.mk b/tests/automake.mk > > index 215fb432b..ed530dd77 100644 > > --- a/tests/automake.mk > > +++ b/tests/automake.mk > > @@ -205,7 +205,8 @@ tests_ovstest_LDADD = $(OVS_LIBDIR)/libopenvswitch.la lib/libovn.la > > # Python tests. > > CHECK_PYFILES = \ > > tests/test-l7.py \ > > - tests/uuidfilt.py > > + tests/uuidfilt.py \ > > + tests/test-tcp-rst.py > > > > EXTRA_DIST += $(CHECK_PYFILES) > > PYCOV_CLEAN_FILES += $(CHECK_PYFILES:.py=.py,cover) .coverage > > diff --git a/tests/ovn.at b/tests/ovn.at > > index 2a7ee7528..b00670816 100644 > > --- a/tests/ovn.at > > +++ b/tests/ovn.at > > @@ -852,7 +852,11 @@ next(pipeline=ingress, table=11); > > encodes as resubmit(,19) > > > > next(pipeline=egress); > > - "next" action cannot advance from ingress to egress pipeline (use "output" action instead) > > + formats as next(pipeline=egress, table=11); > > + encodes as resubmit(,51) > > + > > +next(pipeline=egress, table=5); > > + encodes as resubmit(,45) > > > > next(table=10); > > formats as next(10); > > diff --git a/tests/system-ovn.at b/tests/system-ovn.at > > index bdb9768d2..3b11cf92b 100644 > > --- a/tests/system-ovn.at > > +++ b/tests/system-ovn.at > > @@ -3719,60 +3719,86 @@ start_daemon ovn-controller > > > > ovn-nbctl ls-add sw0 > > > > -ovn-nbctl lsp-add sw0 sw0-p1 > > -ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3 aef0::3" > > -ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3 aef0::3" > > +ovn-nbctl lsp-add sw0 sw0-p1-rej > > +ovn-nbctl lsp-set-addresses sw0-p1-rej "50:54:00:00:00:03 10.0.0.3 aef0::3" > > +ovn-nbctl lsp-set-port-security sw0-p1-rej "50:54:00:00:00:03 10.0.0.3 aef0::3" > > > > -ovn-nbctl lsp-add sw0 sw0-p2 > > -ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4 aef0::4" > > -ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:04 10.0.0.4 aef0::4" > > +ovn-nbctl lsp-add sw0 sw0-p2-rej > > +ovn-nbctl lsp-set-addresses sw0-p2-rej "50:54:00:00:00:04 10.0.0.4 aef0::4" > > +ovn-nbctl lsp-set-port-security sw0-p2-rej "50:54:00:00:00:04 10.0.0.4 aef0::4" > > + > > +#ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p1\" && tcp && tcp.dst == 80" reject > > +#ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p2\" && ip6 && tcp && tcp.dst == 80" reject > > + > > +# Create port group and ACLs for sw0 ports. > > +ovn-nbctl pg-add pg0_drop sw0-p1-rej sw0-p2-rej > > +ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && ip" drop > > +ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && ip" drop > > + > > +ovn-nbctl pg-add pg0 sw0-p1-rej sw0-p2-rej > > +ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4" allow-related > > +ovn-nbctl --log acl-add pg0 from-lport 1004 "inport == @pg0 && ip && tcp && tcp.dst == 80" reject > > > > -ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p1\" && tcp && tcp.dst == 80" reject > > -ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p2\" && ip6 && tcp && tcp.dst == 80" reject > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && icmp4" allow-related > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && tcp && tcp.dst == 82" allow-related > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && udp && udp.dst == 82" allow-related > > +ovn-nbctl --log acl-add pg0 to-lport 1004 "inport == @pg0 && ip && tcp && tcp.dst == 84" reject > > > > OVN_POPULATE_ARP > > ovn-nbctl --wait=hv sync > > > > -ADD_NAMESPACES(sw0-p1) > > -ADD_VETH(sw0-p1, sw0-p1, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ > > +ADD_NAMESPACES(sw0-p1-rej) > > +ADD_VETH(sw0-p1-rej, sw0-p1-rej, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ > > "10.0.0.1") > > > > -ADD_NAMESPACES(sw0-p2) > > -ADD_VETH(sw0-p2, sw0-p2, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ > > +ADD_NAMESPACES(sw0-p2-rej) > > +ADD_VETH(sw0-p2-rej, sw0-p2-rej, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ > > "10.0.0.1") > > > > -NS_CHECK_EXEC([sw0-p1], [ip a a aef0::3/64 dev sw0-p1], [0]) > > -NS_CHECK_EXEC([sw0-p2], [ip a a aef0::4/64 dev sw0-p2], [0]) > > +NS_CHECK_EXEC([sw0-p1-rej], [ip a a aef0::3/64 dev sw0-p1-rej], [0]) > > +NS_CHECK_EXEC([sw0-p2-rej], [ip a a aef0::4/64 dev sw0-p2-rej], [0]) > > > > -# Capture packets in sw0-p1. > > -NS_CHECK_EXEC([sw0-p1], [tcpdump -n -c 2 -i sw0-p1 tcp port 80 > sw0-p1-ip4.pcap &], [0]) > > +# Capture packets in sw0-p1-rej. > > +NS_CHECK_EXEC([sw0-p1-rej], [tcpdump -n -c 2 -i sw0-p1-rej tcp port 80 > sw0-p1-rej-ip4.pcap &], [0]) > > sleep 1 > > > > -NS_CHECK_EXEC([sw0-p1], [nc 10.0.0.4 80], [1], [], > > +NS_CHECK_EXEC([sw0-p1-rej], [nc 10.0.0.4 80], [1], [], > > [dnl > > Ncat: Connection refused. > > ]) > > > > OVS_WAIT_UNTIL([ > > - total=`cat sw0-p1-ip4.pcap | wc -l` > > + total=`cat sw0-p1-rej-ip4.pcap | wc -l` > > echo "total = $total" > > test "${total}" = "2" > > ]) > > > > +# Now send traffic to port 84 > > +NS_CHECK_EXEC([sw0-p1-rej], [nc 10.0.0.4 84], [1], [], > > +[dnl > > +Ncat: Connection refused. > > +]) > > + > > +AT_CHECK([ > > + n_pkt=$(ovs-ofctl dump-flows br-int table=44 | grep -v n_packets=0 | \ > > +grep controller | grep tp_dst=84 -c) > > + test $n_pkt -eq 1 > > +]) > > + > > # Without this sleep, test case fails intermittently. > > sleep 3 > > > > -NS_CHECK_EXEC([sw0-p2], [tcpdump -n -c 2 -i sw0-p2 tcp port 80 > sw0-p2-ip6.pcap &], [0]) > > +NS_CHECK_EXEC([sw0-p2-rej], [tcpdump -n -c 2 -i sw0-p2-rej tcp port 80 > sw0-p2-rej-ip6.pcap &], [0]) > > > > sleep 1 > > > > -NS_CHECK_EXEC([sw0-p2], [nc -6 aef0::3 80], [1], [], > > +NS_CHECK_EXEC([sw0-p2-rej], [nc -6 aef0::3 80], [1], [], > > [dnl > > Ncat: Connection refused. > > ]) > > > > OVS_WAIT_UNTIL([ > > - total=`cat sw0-p2-ip6.pcap | wc -l` > > + total=`cat sw0-p2-rej-ip6.pcap | wc -l` > > echo "total = $total" > > test "${total}" = "2" > > ]) > > @@ -3936,3 +3962,105 @@ as > > OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d > > /.*terminating with signal 15.*/d"]) > > AT_CLEANUP > > + > > +# Tests that when an established connection sends TCP reset, > > +# the conntrack entry is not in established state. > > +AT_SETUP([ovn -- conntrack TCP reset]) > > +AT_KEYWORDS([conntrack]) > > +ovn_start > > + > > +OVS_TRAFFIC_VSWITCHD_START() > > +ADD_BR([br-int]) > > + > > +# Set external-ids in br-int needed for ovn-controller > > +ovs-vsctl \ > > + -- set Open_vSwitch . external-ids:system-id=hv1 \ > > + -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \ > > + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \ > > + -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \ > > + -- set bridge br-int fail-mode=secure other-config:disable-in-band=true > > + > > +# Start ovn-controller > > +start_daemon ovn-controller > > + > > +ovn-nbctl ls-add sw0 > > + > > +ovn-nbctl lsp-add sw0 rst-p1 > > +ovn-nbctl lsp-set-addresses rst-p1 "50:54:00:00:00:03" > > +ovn-nbctl lsp-set-port-security rst-p1 "50:54:00:00:00:03" > > + > > +ovn-nbctl lsp-add sw0 rst-p2 > > +ovn-nbctl lsp-set-addresses rst-p2 "50:54:00:00:00:04 10.0.0.4" > > +ovn-nbctl lsp-set-port-security rst-p2 "50:54:00:00:00:04 10.0.0.4" > > + > > +# Create port group and ACLs for sw0 ports. > > +ovn-nbctl pg-add pg0_drop rst-p1 rst-p2 > > +ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && ip" drop > > +ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && ip" drop > > + > > +ovn-nbctl pg-add pg0 rst-p1 rst-p2 > > +ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4" allow-related > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && icmp4" allow-related > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && tcp && tcp.dst == 80" allow-related > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && udp && udp.dst == 80" allow-related > > + > > +# Create a logical router and attach to logical switch. > > +ovn-nbctl lr-add lr0 > > +ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 > > +ovn-nbctl lsp-add sw0 sw0-lr0 > > +ovn-nbctl lsp-set-type sw0-lr0 router > > +ovn-nbctl lsp-set-addresses sw0-lr0 router > > +ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 > > + > > +ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80 > > +ovn-nbctl --wait=sb ls-lb-add sw0 lb1 > > +ovn-nbctl --wait=sb lr-lb-add lr0 lb1 > > + > > +OVN_POPULATE_ARP > > +ovn-nbctl --wait=hv sync > > + > > +ADD_NAMESPACES(rst-p1) > > +ADD_VETH(rst-p1, rst-p1, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ > > + "10.0.0.1") > > + > > +ADD_NAMESPACES(rst-p2) > > +ADD_VETH(rst-p2, rst-p2, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ > > + "10.0.0.1") > > + > > +OVS_WAIT_UNTIL([test x$(ovn-nbctl lsp-get-up rst-p1) = xup]) > > +OVS_WAIT_UNTIL([test x$(ovn-nbctl lsp-get-up rst-p2) = xup]) > > + > > +# Start webservers in 'rst-p1'. > > +OVS_START_L7([rst-p1], [http]) > > + > > +NS_CHECK_EXEC([rst-p2], [$PYTHON $srcdir/test-tcp-rst.py --dst-port 80 --dst-ip 10.0.0.10]) > > + > > +# When tcp reset is sent, conntrack entry should be in the state - CLOSED or CLOSING. > > +# But there is a bug where tcp reset packet was not sent to the conntrack. > > +# This test case checks that the tcp reset packet is sent to conntrack > > +# and the state is not in established state. > > +AT_CHECK([ > > + ct_est_count=$(ovs-appctl dpctl/dump-conntrack | grep 10.0.0.10 | grep state=ESTABLISHED -c) > > + test $ct_est_count -eq 0 > > + > > + ct_est_count=$(ovs-appctl dpctl/dump-conntrack | grep 10.0.0.10 | grep state=CLOS -c) > > + test $ct_est_count -eq 1 > > +]) > > + > > +OVS_APP_EXIT_AND_WAIT([ovn-controller]) > > + > > +as ovn-sb > > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > > + > > +as ovn-nb > > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > > + > > +as northd > > +OVS_APP_EXIT_AND_WAIT([ovn-northd]) > > + > > +as > > +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d > > +/connection dropped.*/d > > +/Service monitor not found.*/d"]) > > + > > +AT_CLEANUP > > diff --git a/tests/test-tcp-rst.py b/tests/test-tcp-rst.py > > new file mode 100644 > > index 000000000..6f96c5706 > > --- /dev/null > > +++ b/tests/test-tcp-rst.py > > @@ -0,0 +1,37 @@ > > +#!/usr/bin/env python3 > > +# Copyright (c) 2020 Red Hat, Inc. > > +# > > +# Licensed under the Apache License, Version 2.0 (the "License"); > > +# you may not use this file except in compliance with the License. > > +# You may obtain a copy of the License at: > > +# > > +# http://www.apache.org/licenses/LICENSE-2.0 > > +# > > +# Unless required by applicable law or agreed to in writing, software > > +# distributed under the License is distributed on an "AS IS" BASIS, > > +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > > +# See the License for the specific language governing permissions and > > +# limitations under the License. > > + > > +# Simple python script which connects to tcp server and then > > +# resets the connection. > > +import argparse > > +import socket > > +import sys > > +import struct > > +import time > > + > > +parser = argparse.ArgumentParser(description='') > > +parser.add_argument("--src-port", type=int, default=11337, help="source port to use") > > +parser.add_argument("--dst-port", type=int, help="dst port to use") > > +parser.add_argument("--dst-ip", help="server ip to use") > > +args = parser.parse_args() > > +sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) > > +server_address = (args.dst_ip, args.dst_port) > > +sock.bind(('0.0.0.0', args.src_port)) > > +sock.connect(server_address) > > +l_onoff = 1 > > +l_linger = 0 > > +time.sleep(1) > > +sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', l_onoff, l_linger)) > > +sock.close() > > > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
On Mon, Apr 27, 2020 at 8:40 PM Lorenzo Bianconi <lorenzo.bianconi@redhat.com> wrote: > > > > > On 4/24/20 9:55 AM, numans@ovn.org wrote: > > > From: Numan Siddique <numans@ovn.org> > > > > > > The commit [1] - 28097d5adb95("Fix tcp_reset action handling") fixed an issue > > > with tcp_reset OVN action. In order to fix that issue, this commit added > > > logical flows to skip all the TCP RST packets from conntrack. > > > Ideally it should have skipped only the TCP RST packets generated by > > > ovn-controller from conntrack. Since all the TCP RST packets are > > > skipped from conntrack, the connections in conntrack remain in > > > ESTABLISHED state even if the client/server sends TCP RST to close the > > > connection. And these entries live for a long time and this is > > > causing performance issues as reported in the BZ. > > > > > > This patch reverts the logical flows added in [1] and modifies the inner > > > actions of tcp_reset in the ingress logical switch pipeline > > > from - "tcp_reset { outport <-> inport; output; }" > > > to "tcp_reset { output <-> inport; next(pipeline=egress,table=5); }". > > > This causes the packet to resubmit to the egress table ls_out_qos_mark > > > skipping the egress ACL stage. Prior to this packet, next action was > > > not allowing a resubmit from ingress to egress pipeline. This patch > > > relaxes this limitation. > > > > > > For the tcp_reset action in the egress logical switch pipeline, this patch > > > modifies the inner action > > > from - "tcp_reset { outport <-> inport; next(pipeline=ingress,table=0); }" > > > to - "tcp_reset { outport <-> inport; next(pipeline=ingress,table=19); }". > > > This causes the packet to enter the ingress table ls_in_l2_lkup. > > > > > > We don't see similar conntrack leaks with UDP. Although there is an issue > > > with the acl reject action for UDP packets. When ovn-controller generates icmp > > > destination unreachable packet, it doesn't get delivered. And the IP checksum is > > > incorrect in this packet. A follow up patch will fix these issues. > > > > > > [1] - 28097d5adb95("Fix tcp_reset action handling") > > > > > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1819785 > > > Co-Authored-by: Tim Rozet <trozet@redhat.com> > > > Signed-off-by: Tim Rozet <trozet@redhat.com> > > > Signed-off-by: Numan Siddique <numans@ovn.org> > > Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> > > > > > > Hi Numan, > > > > Looks good to me. Only a question inline regarding options for avoiding > > hardcoding table IDs. Except for that: > > > > Acked-by: Dumitru Ceara <dceara@redhat.com> > > Thanks Dumitru and Lorenzo for the reviews. I applied this patch to master. > > Thanks, > > Dumitru > > > > > --- > > > lib/actions.c | 6 +- > > > northd/ovn-northd.8.xml | 8 ++ > > > northd/ovn-northd.c | 14 ++-- > > > ovn-sb.xml | 10 ++- > > > tests/automake.mk | 3 +- > > > tests/ovn.at | 6 +- > > > tests/system-ovn.at | 170 +++++++++++++++++++++++++++++++++++----- > > > tests/test-tcp-rst.py | 37 +++++++++ > > > 8 files changed, 216 insertions(+), 38 deletions(-) > > > create mode 100644 tests/test-tcp-rst.py > > > > > > diff --git a/lib/actions.c b/lib/actions.c > > > index c19813de0..91619c889 100644 > > > --- a/lib/actions.c > > > +++ b/lib/actions.c > > > @@ -319,11 +319,7 @@ parse_NEXT(struct action_context *ctx) > > > } > > > } > > > > > > - if (pipeline == OVNACT_P_EGRESS && ctx->pp->pipeline == OVNACT_P_INGRESS) { > > > - lexer_error(ctx->lexer, > > > - "\"next\" action cannot advance from ingress to egress " > > > - "pipeline (use \"output\" action instead)"); > > > - } else if (table >= ctx->pp->n_tables) { > > > + if (table >= ctx->pp->n_tables) { > > > lexer_error(ctx->lexer, > > > "\"next\" action cannot advance beyond table %d.", > > > ctx->pp->n_tables - 1); > > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml > > > index efcc4b7fc..d39e259f6 100644 > > > --- a/northd/ovn-northd.8.xml > > > +++ b/northd/ovn-northd.8.xml > > > @@ -373,6 +373,14 @@ > > > for new connections and <code>reg0[1] = 1; next;</code> for existing > > > connections. > > > </li> > > > + <li> > > > + <code>reject</code> ACLs translate into logical > > > + flows with the > > > + <code>tcp_reset { output <-> inport; > > > + next(pipeline=egress,table=5);}</code> > > > + action for TCP connections and <code>icmp4/icmp6</code> action > > > + for UDP connections. > > > + </li> > > > <li> > > > Other ACLs translate to <code>drop;</code> for new or untracked > > > connections and <code>ct_commit(ct_label=1/1);</code> for known > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > > > index c4675bd68..f8ae155a2 100644 > > > --- a/northd/ovn-northd.c > > > +++ b/northd/ovn-northd.c > > > @@ -4721,11 +4721,11 @@ build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) > > > * unreachable packets. */ > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110, > > > "nd || nd_rs || nd_ra || icmp4.type == 3 || " > > > - "icmp6.type == 1 || (tcp && tcp.flags == 20) || " > > > + "icmp6.type == 1 || " > > > "(udp && udp.src == 546 && udp.dst == 547)", "next;"); > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110, > > > "nd || nd_rs || nd_ra || icmp4.type == 3 || " > > > - "icmp6.type == 1 || (tcp && tcp.flags == 20) ||" > > > + "icmp6.type == 1 || " > > > "(udp && udp.src == 546 && udp.dst == 547)", "next;"); > > > > > > /* Ingress and Egress Pre-ACL Table (Priority 100). > > > @@ -4838,11 +4838,11 @@ build_pre_lb(struct ovn_datapath *od, struct hmap *lflows, > > > /* Do not send ND packets to conntrack */ > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 110, > > > "nd || nd_rs || nd_ra || icmp4.type == 3 ||" > > > - "icmp6.type == 1 || (tcp && tcp.flags == 20)", > > > + "icmp6.type == 1", > > > "next;"); > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 110, > > > "nd || nd_rs || nd_ra || icmp4.type == 3 ||" > > > - "icmp6.type == 1 || (tcp && tcp.flags == 20)", > > > + "icmp6.type == 1", > > > "next;"); > > > > > > /* Do not send service monitor packets to conntrack. */ > > > @@ -4988,7 +4988,8 @@ build_reject_acl_rules(struct ovn_datapath *od, struct hmap *lflows, > > > ds_put_format(&actions, "reg0 = 0; " > > > "eth.dst <-> eth.src; ip4.dst <-> ip4.src; " > > > "tcp_reset { outport <-> inport; %s };", > > > - ingress ? "output;" : "next(pipeline=ingress,table=0);"); > > > + ingress ? "next(pipeline=egress,table=5);" > > > + : "next(pipeline=ingress,table=19);"); > > > > Do you think there's a way we could avoid hardcoding the table numbers? > > I'm thinking of the case when a stage is added/deleted and the > > ls_out_qos_mark/"ls_in_l2_lkup" end up having different table ids. I > > don't think a macro definition is enough as that still needs updating > > every time a table is added/deleted. I agree. Hardcoding table numbers are prone to regressions later in this case. As I discussed offline, I think it's better to have a macro defined for each table whose value gets generated. I'll work on a follow up patch to have this. Thanks Numan > > > > > ovn_lflow_add_with_hint(lflows, od, stage, > > > acl->priority + OVN_ACL_PRI_OFFSET + 10, > > > ds_cstr(&match), ds_cstr(&actions), stage_hint); > > > @@ -5002,7 +5003,8 @@ build_reject_acl_rules(struct ovn_datapath *od, struct hmap *lflows, > > > ds_put_format(&actions, "reg0 = 0; " > > > "eth.dst <-> eth.src; ip6.dst <-> ip6.src; " > > > "tcp_reset { outport <-> inport; %s };", > > > - ingress ? "output;" : "next(pipeline=ingress,table=0);"); > > > + ingress ? "next(pipeline=egress,table=5);" > > > + : "next(pipeline=ingress,table=19);"); > > > > Same question as above. > > > > > ovn_lflow_add_with_hint(lflows, od, stage, > > > acl->priority + OVN_ACL_PRI_OFFSET + 10, > > > ds_cstr(&match), ds_cstr(&actions), stage_hint); > > > diff --git a/ovn-sb.xml b/ovn-sb.xml > > > index 5f8da534c..3aa7cd4da 100644 > > > --- a/ovn-sb.xml > > > +++ b/ovn-sb.xml > > > @@ -1112,10 +1112,12 @@ > > > <var>pipeline</var> as a subroutine. The default <var>table</var> is > > > just after the current one. If <var>pipeline</var> is specified, it > > > may be <code>ingress</code> or <code>egress</code>; the default > > > - <var>pipeline</var> is the one currently executing. Actions in the > > > - ingress pipeline may not use <code>next</code> to jump into the > > > - egress pipeline (use the <code>output</code> instead), but > > > - transitions in the opposite direction are allowed. > > > + <var>pipeline</var> is the one currently executing. Actions in the > > > + both ingress and egress pipeline can use <code>next</code> to jump > > > + across the other pipeline. Actions in the ingress pipeline should > > > + use <code>next</code> to jump into the specific table of egress > > > + pipeline only if it is certain that the packets are local and not > > > + tunnelled and wants to skip certain stages in the packet processing. > > > </dd> > > > > > > <dt><code><var>field</var> = <var>constant</var>;</code></dt> > > > diff --git a/tests/automake.mk b/tests/automake.mk > > > index 215fb432b..ed530dd77 100644 > > > --- a/tests/automake.mk > > > +++ b/tests/automake.mk > > > @@ -205,7 +205,8 @@ tests_ovstest_LDADD = $(OVS_LIBDIR)/libopenvswitch.la lib/libovn.la > > > # Python tests. > > > CHECK_PYFILES = \ > > > tests/test-l7.py \ > > > - tests/uuidfilt.py > > > + tests/uuidfilt.py \ > > > + tests/test-tcp-rst.py > > > > > > EXTRA_DIST += $(CHECK_PYFILES) > > > PYCOV_CLEAN_FILES += $(CHECK_PYFILES:.py=.py,cover) .coverage > > > diff --git a/tests/ovn.at b/tests/ovn.at > > > index 2a7ee7528..b00670816 100644 > > > --- a/tests/ovn.at > > > +++ b/tests/ovn.at > > > @@ -852,7 +852,11 @@ next(pipeline=ingress, table=11); > > > encodes as resubmit(,19) > > > > > > next(pipeline=egress); > > > - "next" action cannot advance from ingress to egress pipeline (use "output" action instead) > > > + formats as next(pipeline=egress, table=11); > > > + encodes as resubmit(,51) > > > + > > > +next(pipeline=egress, table=5); > > > + encodes as resubmit(,45) > > > > > > next(table=10); > > > formats as next(10); > > > diff --git a/tests/system-ovn.at b/tests/system-ovn.at > > > index bdb9768d2..3b11cf92b 100644 > > > --- a/tests/system-ovn.at > > > +++ b/tests/system-ovn.at > > > @@ -3719,60 +3719,86 @@ start_daemon ovn-controller > > > > > > ovn-nbctl ls-add sw0 > > > > > > -ovn-nbctl lsp-add sw0 sw0-p1 > > > -ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3 aef0::3" > > > -ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3 aef0::3" > > > +ovn-nbctl lsp-add sw0 sw0-p1-rej > > > +ovn-nbctl lsp-set-addresses sw0-p1-rej "50:54:00:00:00:03 10.0.0.3 aef0::3" > > > +ovn-nbctl lsp-set-port-security sw0-p1-rej "50:54:00:00:00:03 10.0.0.3 aef0::3" > > > > > > -ovn-nbctl lsp-add sw0 sw0-p2 > > > -ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4 aef0::4" > > > -ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:04 10.0.0.4 aef0::4" > > > +ovn-nbctl lsp-add sw0 sw0-p2-rej > > > +ovn-nbctl lsp-set-addresses sw0-p2-rej "50:54:00:00:00:04 10.0.0.4 aef0::4" > > > +ovn-nbctl lsp-set-port-security sw0-p2-rej "50:54:00:00:00:04 10.0.0.4 aef0::4" > > > + > > > +#ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p1\" && tcp && tcp.dst == 80" reject > > > +#ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p2\" && ip6 && tcp && tcp.dst == 80" reject > > > + > > > +# Create port group and ACLs for sw0 ports. > > > +ovn-nbctl pg-add pg0_drop sw0-p1-rej sw0-p2-rej > > > +ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && ip" drop > > > +ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && ip" drop > > > + > > > +ovn-nbctl pg-add pg0 sw0-p1-rej sw0-p2-rej > > > +ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4" allow-related > > > +ovn-nbctl --log acl-add pg0 from-lport 1004 "inport == @pg0 && ip && tcp && tcp.dst == 80" reject > > > > > > -ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p1\" && tcp && tcp.dst == 80" reject > > > -ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p2\" && ip6 && tcp && tcp.dst == 80" reject > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && icmp4" allow-related > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && tcp && tcp.dst == 82" allow-related > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && udp && udp.dst == 82" allow-related > > > +ovn-nbctl --log acl-add pg0 to-lport 1004 "inport == @pg0 && ip && tcp && tcp.dst == 84" reject > > > > > > OVN_POPULATE_ARP > > > ovn-nbctl --wait=hv sync > > > > > > -ADD_NAMESPACES(sw0-p1) > > > -ADD_VETH(sw0-p1, sw0-p1, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ > > > +ADD_NAMESPACES(sw0-p1-rej) > > > +ADD_VETH(sw0-p1-rej, sw0-p1-rej, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ > > > "10.0.0.1") > > > > > > -ADD_NAMESPACES(sw0-p2) > > > -ADD_VETH(sw0-p2, sw0-p2, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ > > > +ADD_NAMESPACES(sw0-p2-rej) > > > +ADD_VETH(sw0-p2-rej, sw0-p2-rej, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ > > > "10.0.0.1") > > > > > > -NS_CHECK_EXEC([sw0-p1], [ip a a aef0::3/64 dev sw0-p1], [0]) > > > -NS_CHECK_EXEC([sw0-p2], [ip a a aef0::4/64 dev sw0-p2], [0]) > > > +NS_CHECK_EXEC([sw0-p1-rej], [ip a a aef0::3/64 dev sw0-p1-rej], [0]) > > > +NS_CHECK_EXEC([sw0-p2-rej], [ip a a aef0::4/64 dev sw0-p2-rej], [0]) > > > > > > -# Capture packets in sw0-p1. > > > -NS_CHECK_EXEC([sw0-p1], [tcpdump -n -c 2 -i sw0-p1 tcp port 80 > sw0-p1-ip4.pcap &], [0]) > > > +# Capture packets in sw0-p1-rej. > > > +NS_CHECK_EXEC([sw0-p1-rej], [tcpdump -n -c 2 -i sw0-p1-rej tcp port 80 > sw0-p1-rej-ip4.pcap &], [0]) > > > sleep 1 > > > > > > -NS_CHECK_EXEC([sw0-p1], [nc 10.0.0.4 80], [1], [], > > > +NS_CHECK_EXEC([sw0-p1-rej], [nc 10.0.0.4 80], [1], [], > > > [dnl > > > Ncat: Connection refused. > > > ]) > > > > > > OVS_WAIT_UNTIL([ > > > - total=`cat sw0-p1-ip4.pcap | wc -l` > > > + total=`cat sw0-p1-rej-ip4.pcap | wc -l` > > > echo "total = $total" > > > test "${total}" = "2" > > > ]) > > > > > > +# Now send traffic to port 84 > > > +NS_CHECK_EXEC([sw0-p1-rej], [nc 10.0.0.4 84], [1], [], > > > +[dnl > > > +Ncat: Connection refused. > > > +]) > > > + > > > +AT_CHECK([ > > > + n_pkt=$(ovs-ofctl dump-flows br-int table=44 | grep -v n_packets=0 | \ > > > +grep controller | grep tp_dst=84 -c) > > > + test $n_pkt -eq 1 > > > +]) > > > + > > > # Without this sleep, test case fails intermittently. > > > sleep 3 > > > > > > -NS_CHECK_EXEC([sw0-p2], [tcpdump -n -c 2 -i sw0-p2 tcp port 80 > sw0-p2-ip6.pcap &], [0]) > > > +NS_CHECK_EXEC([sw0-p2-rej], [tcpdump -n -c 2 -i sw0-p2-rej tcp port 80 > sw0-p2-rej-ip6.pcap &], [0]) > > > > > > sleep 1 > > > > > > -NS_CHECK_EXEC([sw0-p2], [nc -6 aef0::3 80], [1], [], > > > +NS_CHECK_EXEC([sw0-p2-rej], [nc -6 aef0::3 80], [1], [], > > > [dnl > > > Ncat: Connection refused. > > > ]) > > > > > > OVS_WAIT_UNTIL([ > > > - total=`cat sw0-p2-ip6.pcap | wc -l` > > > + total=`cat sw0-p2-rej-ip6.pcap | wc -l` > > > echo "total = $total" > > > test "${total}" = "2" > > > ]) > > > @@ -3936,3 +3962,105 @@ as > > > OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d > > > /.*terminating with signal 15.*/d"]) > > > AT_CLEANUP > > > + > > > +# Tests that when an established connection sends TCP reset, > > > +# the conntrack entry is not in established state. > > > +AT_SETUP([ovn -- conntrack TCP reset]) > > > +AT_KEYWORDS([conntrack]) > > > +ovn_start > > > + > > > +OVS_TRAFFIC_VSWITCHD_START() > > > +ADD_BR([br-int]) > > > + > > > +# Set external-ids in br-int needed for ovn-controller > > > +ovs-vsctl \ > > > + -- set Open_vSwitch . external-ids:system-id=hv1 \ > > > + -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \ > > > + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \ > > > + -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \ > > > + -- set bridge br-int fail-mode=secure other-config:disable-in-band=true > > > + > > > +# Start ovn-controller > > > +start_daemon ovn-controller > > > + > > > +ovn-nbctl ls-add sw0 > > > + > > > +ovn-nbctl lsp-add sw0 rst-p1 > > > +ovn-nbctl lsp-set-addresses rst-p1 "50:54:00:00:00:03" > > > +ovn-nbctl lsp-set-port-security rst-p1 "50:54:00:00:00:03" > > > + > > > +ovn-nbctl lsp-add sw0 rst-p2 > > > +ovn-nbctl lsp-set-addresses rst-p2 "50:54:00:00:00:04 10.0.0.4" > > > +ovn-nbctl lsp-set-port-security rst-p2 "50:54:00:00:00:04 10.0.0.4" > > > + > > > +# Create port group and ACLs for sw0 ports. > > > +ovn-nbctl pg-add pg0_drop rst-p1 rst-p2 > > > +ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && ip" drop > > > +ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && ip" drop > > > + > > > +ovn-nbctl pg-add pg0 rst-p1 rst-p2 > > > +ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4" allow-related > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && icmp4" allow-related > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && tcp && tcp.dst == 80" allow-related > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && udp && udp.dst == 80" allow-related > > > + > > > +# Create a logical router and attach to logical switch. > > > +ovn-nbctl lr-add lr0 > > > +ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 > > > +ovn-nbctl lsp-add sw0 sw0-lr0 > > > +ovn-nbctl lsp-set-type sw0-lr0 router > > > +ovn-nbctl lsp-set-addresses sw0-lr0 router > > > +ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 > > > + > > > +ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80 > > > +ovn-nbctl --wait=sb ls-lb-add sw0 lb1 > > > +ovn-nbctl --wait=sb lr-lb-add lr0 lb1 > > > + > > > +OVN_POPULATE_ARP > > > +ovn-nbctl --wait=hv sync > > > + > > > +ADD_NAMESPACES(rst-p1) > > > +ADD_VETH(rst-p1, rst-p1, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ > > > + "10.0.0.1") > > > + > > > +ADD_NAMESPACES(rst-p2) > > > +ADD_VETH(rst-p2, rst-p2, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ > > > + "10.0.0.1") > > > + > > > +OVS_WAIT_UNTIL([test x$(ovn-nbctl lsp-get-up rst-p1) = xup]) > > > +OVS_WAIT_UNTIL([test x$(ovn-nbctl lsp-get-up rst-p2) = xup]) > > > + > > > +# Start webservers in 'rst-p1'. > > > +OVS_START_L7([rst-p1], [http]) > > > + > > > +NS_CHECK_EXEC([rst-p2], [$PYTHON $srcdir/test-tcp-rst.py --dst-port 80 --dst-ip 10.0.0.10]) > > > + > > > +# When tcp reset is sent, conntrack entry should be in the state - CLOSED or CLOSING. > > > +# But there is a bug where tcp reset packet was not sent to the conntrack. > > > +# This test case checks that the tcp reset packet is sent to conntrack > > > +# and the state is not in established state. > > > +AT_CHECK([ > > > + ct_est_count=$(ovs-appctl dpctl/dump-conntrack | grep 10.0.0.10 | grep state=ESTABLISHED -c) > > > + test $ct_est_count -eq 0 > > > + > > > + ct_est_count=$(ovs-appctl dpctl/dump-conntrack | grep 10.0.0.10 | grep state=CLOS -c) > > > + test $ct_est_count -eq 1 > > > +]) > > > + > > > +OVS_APP_EXIT_AND_WAIT([ovn-controller]) > > > + > > > +as ovn-sb > > > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > > > + > > > +as ovn-nb > > > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > > > + > > > +as northd > > > +OVS_APP_EXIT_AND_WAIT([ovn-northd]) > > > + > > > +as > > > +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d > > > +/connection dropped.*/d > > > +/Service monitor not found.*/d"]) > > > + > > > +AT_CLEANUP > > > diff --git a/tests/test-tcp-rst.py b/tests/test-tcp-rst.py > > > new file mode 100644 > > > index 000000000..6f96c5706 > > > --- /dev/null > > > +++ b/tests/test-tcp-rst.py > > > @@ -0,0 +1,37 @@ > > > +#!/usr/bin/env python3 > > > +# Copyright (c) 2020 Red Hat, Inc. > > > +# > > > +# Licensed under the Apache License, Version 2.0 (the "License"); > > > +# you may not use this file except in compliance with the License. > > > +# You may obtain a copy of the License at: > > > +# > > > +# http://www.apache.org/licenses/LICENSE-2.0 > > > +# > > > +# Unless required by applicable law or agreed to in writing, software > > > +# distributed under the License is distributed on an "AS IS" BASIS, > > > +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > > > +# See the License for the specific language governing permissions and > > > +# limitations under the License. > > > + > > > +# Simple python script which connects to tcp server and then > > > +# resets the connection. > > > +import argparse > > > +import socket > > > +import sys > > > +import struct > > > +import time > > > + > > > +parser = argparse.ArgumentParser(description='') > > > +parser.add_argument("--src-port", type=int, default=11337, help="source port to use") > > > +parser.add_argument("--dst-port", type=int, help="dst port to use") > > > +parser.add_argument("--dst-ip", help="server ip to use") > > > +args = parser.parse_args() > > > +sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) > > > +server_address = (args.dst_ip, args.dst_port) > > > +sock.bind(('0.0.0.0', args.src_port)) > > > +sock.connect(server_address) > > > +l_onoff = 1 > > > +l_linger = 0 > > > +time.sleep(1) > > > +sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', l_onoff, l_linger)) > > > +sock.close() > > > > > > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
On Mon, Apr 27, 2020 at 8:41 AM Numan Siddique <numans@ovn.org> wrote: > > On Mon, Apr 27, 2020 at 8:40 PM Lorenzo Bianconi > <lorenzo.bianconi@redhat.com> wrote: > > > > > > > > On 4/24/20 9:55 AM, numans@ovn.org wrote: > > > > From: Numan Siddique <numans@ovn.org> > > > > > > > > The commit [1] - 28097d5adb95("Fix tcp_reset action handling") fixed an issue > > > > with tcp_reset OVN action. In order to fix that issue, this commit added > > > > logical flows to skip all the TCP RST packets from conntrack. > > > > Ideally it should have skipped only the TCP RST packets generated by > > > > ovn-controller from conntrack. Since all the TCP RST packets are > > > > skipped from conntrack, the connections in conntrack remain in > > > > ESTABLISHED state even if the client/server sends TCP RST to close the > > > > connection. And these entries live for a long time and this is > > > > causing performance issues as reported in the BZ. > > > > > > > > This patch reverts the logical flows added in [1] and modifies the inner > > > > actions of tcp_reset in the ingress logical switch pipeline > > > > from - "tcp_reset { outport <-> inport; output; }" > > > > to "tcp_reset { output <-> inport; next(pipeline=egress,table=5); }". > > > > This causes the packet to resubmit to the egress table ls_out_qos_mark > > > > skipping the egress ACL stage. Prior to this packet, next action was > > > > not allowing a resubmit from ingress to egress pipeline. This patch > > > > relaxes this limitation. > > > > > > > > For the tcp_reset action in the egress logical switch pipeline, this patch > > > > modifies the inner action > > > > from - "tcp_reset { outport <-> inport; next(pipeline=ingress,table=0); }" > > > > to - "tcp_reset { outport <-> inport; next(pipeline=ingress,table=19); }". > > > > This causes the packet to enter the ingress table ls_in_l2_lkup. > > > > > > > > We don't see similar conntrack leaks with UDP. Although there is an issue > > > > with the acl reject action for UDP packets. When ovn-controller generates icmp > > > > destination unreachable packet, it doesn't get delivered. And the IP checksum is > > > > incorrect in this packet. A follow up patch will fix these issues. > > > > > > > > [1] - 28097d5adb95("Fix tcp_reset action handling") > > > > > > > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1819785 > > > > Co-Authored-by: Tim Rozet <trozet@redhat.com> > > > > Signed-off-by: Tim Rozet <trozet@redhat.com> > > > > Signed-off-by: Numan Siddique <numans@ovn.org> > > > > Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> > > > > > > > > > > Hi Numan, > > > > > > Looks good to me. Only a question inline regarding options for avoiding > > > hardcoding table IDs. Except for that: > > > > > > Acked-by: Dumitru Ceara <dceara@redhat.com> > > > > > Thanks Dumitru and Lorenzo for the reviews. I applied this patch to master. Thanks Numan. Shall this be backported to 20.03? > > > > Thanks, > > > Dumitru > > > > > > > --- > > > > lib/actions.c | 6 +- > > > > northd/ovn-northd.8.xml | 8 ++ > > > > northd/ovn-northd.c | 14 ++-- > > > > ovn-sb.xml | 10 ++- > > > > tests/automake.mk | 3 +- > > > > tests/ovn.at | 6 +- > > > > tests/system-ovn.at | 170 +++++++++++++++++++++++++++++++++++----- > > > > tests/test-tcp-rst.py | 37 +++++++++ > > > > 8 files changed, 216 insertions(+), 38 deletions(-) > > > > create mode 100644 tests/test-tcp-rst.py > > > > > > > > diff --git a/lib/actions.c b/lib/actions.c > > > > index c19813de0..91619c889 100644 > > > > --- a/lib/actions.c > > > > +++ b/lib/actions.c > > > > @@ -319,11 +319,7 @@ parse_NEXT(struct action_context *ctx) > > > > } > > > > } > > > > > > > > - if (pipeline == OVNACT_P_EGRESS && ctx->pp->pipeline == OVNACT_P_INGRESS) { > > > > - lexer_error(ctx->lexer, > > > > - "\"next\" action cannot advance from ingress to egress " > > > > - "pipeline (use \"output\" action instead)"); > > > > - } else if (table >= ctx->pp->n_tables) { > > > > + if (table >= ctx->pp->n_tables) { > > > > lexer_error(ctx->lexer, > > > > "\"next\" action cannot advance beyond table %d.", > > > > ctx->pp->n_tables - 1); > > > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml > > > > index efcc4b7fc..d39e259f6 100644 > > > > --- a/northd/ovn-northd.8.xml > > > > +++ b/northd/ovn-northd.8.xml > > > > @@ -373,6 +373,14 @@ > > > > for new connections and <code>reg0[1] = 1; next;</code> for existing > > > > connections. > > > > </li> > > > > + <li> > > > > + <code>reject</code> ACLs translate into logical > > > > + flows with the > > > > + <code>tcp_reset { output <-> inport; > > > > + next(pipeline=egress,table=5);}</code> > > > > + action for TCP connections and <code>icmp4/icmp6</code> action > > > > + for UDP connections. > > > > + </li> > > > > <li> > > > > Other ACLs translate to <code>drop;</code> for new or untracked > > > > connections and <code>ct_commit(ct_label=1/1);</code> for known > > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > > > > index c4675bd68..f8ae155a2 100644 > > > > --- a/northd/ovn-northd.c > > > > +++ b/northd/ovn-northd.c > > > > @@ -4721,11 +4721,11 @@ build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) > > > > * unreachable packets. */ > > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110, > > > > "nd || nd_rs || nd_ra || icmp4.type == 3 || " > > > > - "icmp6.type == 1 || (tcp && tcp.flags == 20) || " > > > > + "icmp6.type == 1 || " > > > > "(udp && udp.src == 546 && udp.dst == 547)", "next;"); > > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110, > > > > "nd || nd_rs || nd_ra || icmp4.type == 3 || " > > > > - "icmp6.type == 1 || (tcp && tcp.flags == 20) ||" > > > > + "icmp6.type == 1 || " > > > > "(udp && udp.src == 546 && udp.dst == 547)", "next;"); > > > > > > > > /* Ingress and Egress Pre-ACL Table (Priority 100). > > > > @@ -4838,11 +4838,11 @@ build_pre_lb(struct ovn_datapath *od, struct hmap *lflows, > > > > /* Do not send ND packets to conntrack */ > > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 110, > > > > "nd || nd_rs || nd_ra || icmp4.type == 3 ||" > > > > - "icmp6.type == 1 || (tcp && tcp.flags == 20)", > > > > + "icmp6.type == 1", > > > > "next;"); > > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 110, > > > > "nd || nd_rs || nd_ra || icmp4.type == 3 ||" > > > > - "icmp6.type == 1 || (tcp && tcp.flags == 20)", > > > > + "icmp6.type == 1", > > > > "next;"); > > > > > > > > /* Do not send service monitor packets to conntrack. */ > > > > @@ -4988,7 +4988,8 @@ build_reject_acl_rules(struct ovn_datapath *od, struct hmap *lflows, > > > > ds_put_format(&actions, "reg0 = 0; " > > > > "eth.dst <-> eth.src; ip4.dst <-> ip4.src; " > > > > "tcp_reset { outport <-> inport; %s };", > > > > - ingress ? "output;" : "next(pipeline=ingress,table=0);"); > > > > + ingress ? "next(pipeline=egress,table=5);" > > > > + : "next(pipeline=ingress,table=19);"); > > > > > > Do you think there's a way we could avoid hardcoding the table numbers? > > > I'm thinking of the case when a stage is added/deleted and the > > > ls_out_qos_mark/"ls_in_l2_lkup" end up having different table ids. I > > > don't think a macro definition is enough as that still needs updating > > > every time a table is added/deleted. > > I agree. Hardcoding table numbers are prone to regressions later in this > case. As I discussed offline, I think it's better to have a macro defined > for each table whose value gets generated. > > I'll work on a follow up patch to have this. > > Thanks > Numan > > > > > > > > ovn_lflow_add_with_hint(lflows, od, stage, > > > > acl->priority + OVN_ACL_PRI_OFFSET + 10, > > > > ds_cstr(&match), ds_cstr(&actions), stage_hint); > > > > @@ -5002,7 +5003,8 @@ build_reject_acl_rules(struct ovn_datapath *od, struct hmap *lflows, > > > > ds_put_format(&actions, "reg0 = 0; " > > > > "eth.dst <-> eth.src; ip6.dst <-> ip6.src; " > > > > "tcp_reset { outport <-> inport; %s };", > > > > - ingress ? "output;" : "next(pipeline=ingress,table=0);"); > > > > + ingress ? "next(pipeline=egress,table=5);" > > > > + : "next(pipeline=ingress,table=19);"); > > > > > > Same question as above. > > > > > > > ovn_lflow_add_with_hint(lflows, od, stage, > > > > acl->priority + OVN_ACL_PRI_OFFSET + 10, > > > > ds_cstr(&match), ds_cstr(&actions), stage_hint); > > > > diff --git a/ovn-sb.xml b/ovn-sb.xml > > > > index 5f8da534c..3aa7cd4da 100644 > > > > --- a/ovn-sb.xml > > > > +++ b/ovn-sb.xml > > > > @@ -1112,10 +1112,12 @@ > > > > <var>pipeline</var> as a subroutine. The default <var>table</var> is > > > > just after the current one. If <var>pipeline</var> is specified, it > > > > may be <code>ingress</code> or <code>egress</code>; the default > > > > - <var>pipeline</var> is the one currently executing. Actions in the > > > > - ingress pipeline may not use <code>next</code> to jump into the > > > > - egress pipeline (use the <code>output</code> instead), but > > > > - transitions in the opposite direction are allowed. > > > > + <var>pipeline</var> is the one currently executing. Actions in the > > > > + both ingress and egress pipeline can use <code>next</code> to jump > > > > + across the other pipeline. Actions in the ingress pipeline should > > > > + use <code>next</code> to jump into the specific table of egress > > > > + pipeline only if it is certain that the packets are local and not > > > > + tunnelled and wants to skip certain stages in the packet processing. > > > > </dd> > > > > > > > > <dt><code><var>field</var> = <var>constant</var>;</code></dt> > > > > diff --git a/tests/automake.mk b/tests/automake.mk > > > > index 215fb432b..ed530dd77 100644 > > > > --- a/tests/automake.mk > > > > +++ b/tests/automake.mk > > > > @@ -205,7 +205,8 @@ tests_ovstest_LDADD = $(OVS_LIBDIR)/ libopenvswitch.la lib/libovn.la > > > > # Python tests. > > > > CHECK_PYFILES = \ > > > > tests/test-l7.py \ > > > > - tests/uuidfilt.py > > > > + tests/uuidfilt.py \ > > > > + tests/test-tcp-rst.py > > > > > > > > EXTRA_DIST += $(CHECK_PYFILES) > > > > PYCOV_CLEAN_FILES += $(CHECK_PYFILES:.py=.py,cover) .coverage > > > > diff --git a/tests/ovn.at b/tests/ovn.at > > > > index 2a7ee7528..b00670816 100644 > > > > --- a/tests/ovn.at > > > > +++ b/tests/ovn.at > > > > @@ -852,7 +852,11 @@ next(pipeline=ingress, table=11); > > > > encodes as resubmit(,19) > > > > > > > > next(pipeline=egress); > > > > - "next" action cannot advance from ingress to egress pipeline (use "output" action instead) > > > > + formats as next(pipeline=egress, table=11); > > > > + encodes as resubmit(,51) > > > > + > > > > +next(pipeline=egress, table=5); > > > > + encodes as resubmit(,45) > > > > > > > > next(table=10); > > > > formats as next(10); > > > > diff --git a/tests/system-ovn.at b/tests/system-ovn.at > > > > index bdb9768d2..3b11cf92b 100644 > > > > --- a/tests/system-ovn.at > > > > +++ b/tests/system-ovn.at > > > > @@ -3719,60 +3719,86 @@ start_daemon ovn-controller > > > > > > > > ovn-nbctl ls-add sw0 > > > > > > > > -ovn-nbctl lsp-add sw0 sw0-p1 > > > > -ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3 aef0::3" > > > > -ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3 aef0::3" > > > > +ovn-nbctl lsp-add sw0 sw0-p1-rej > > > > +ovn-nbctl lsp-set-addresses sw0-p1-rej "50:54:00:00:00:03 10.0.0.3 aef0::3" > > > > +ovn-nbctl lsp-set-port-security sw0-p1-rej "50:54:00:00:00:03 10.0.0.3 aef0::3" > > > > > > > > -ovn-nbctl lsp-add sw0 sw0-p2 > > > > -ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4 aef0::4" > > > > -ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:04 10.0.0.4 aef0::4" > > > > +ovn-nbctl lsp-add sw0 sw0-p2-rej > > > > +ovn-nbctl lsp-set-addresses sw0-p2-rej "50:54:00:00:00:04 10.0.0.4 aef0::4" > > > > +ovn-nbctl lsp-set-port-security sw0-p2-rej "50:54:00:00:00:04 10.0.0.4 aef0::4" > > > > + > > > > +#ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p1\" && tcp && tcp.dst == 80" reject > > > > +#ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p2\" && ip6 && tcp && tcp.dst == 80" reject > > > > + > > > > +# Create port group and ACLs for sw0 ports. > > > > +ovn-nbctl pg-add pg0_drop sw0-p1-rej sw0-p2-rej > > > > +ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && ip" drop > > > > +ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && ip" drop > > > > + > > > > +ovn-nbctl pg-add pg0 sw0-p1-rej sw0-p2-rej > > > > +ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4" allow-related > > > > +ovn-nbctl --log acl-add pg0 from-lport 1004 "inport == @pg0 && ip && tcp && tcp.dst == 80" reject > > > > > > > > -ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p1\" && tcp && tcp.dst == 80" reject > > > > -ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p2\" && ip6 && tcp && tcp.dst == 80" reject > > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && icmp4" allow-related > > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && tcp && tcp.dst == 82" allow-related > > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && udp && udp.dst == 82" allow-related > > > > +ovn-nbctl --log acl-add pg0 to-lport 1004 "inport == @pg0 && ip && tcp && tcp.dst == 84" reject > > > > > > > > OVN_POPULATE_ARP > > > > ovn-nbctl --wait=hv sync > > > > > > > > -ADD_NAMESPACES(sw0-p1) > > > > -ADD_VETH(sw0-p1, sw0-p1, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ > > > > +ADD_NAMESPACES(sw0-p1-rej) > > > > +ADD_VETH(sw0-p1-rej, sw0-p1-rej, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ > > > > "10.0.0.1") > > > > > > > > -ADD_NAMESPACES(sw0-p2) > > > > -ADD_VETH(sw0-p2, sw0-p2, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ > > > > +ADD_NAMESPACES(sw0-p2-rej) > > > > +ADD_VETH(sw0-p2-rej, sw0-p2-rej, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ > > > > "10.0.0.1") > > > > > > > > -NS_CHECK_EXEC([sw0-p1], [ip a a aef0::3/64 dev sw0-p1], [0]) > > > > -NS_CHECK_EXEC([sw0-p2], [ip a a aef0::4/64 dev sw0-p2], [0]) > > > > +NS_CHECK_EXEC([sw0-p1-rej], [ip a a aef0::3/64 dev sw0-p1-rej], [0]) > > > > +NS_CHECK_EXEC([sw0-p2-rej], [ip a a aef0::4/64 dev sw0-p2-rej], [0]) > > > > > > > > -# Capture packets in sw0-p1. > > > > -NS_CHECK_EXEC([sw0-p1], [tcpdump -n -c 2 -i sw0-p1 tcp port 80 > sw0-p1-ip4.pcap &], [0]) > > > > +# Capture packets in sw0-p1-rej. > > > > +NS_CHECK_EXEC([sw0-p1-rej], [tcpdump -n -c 2 -i sw0-p1-rej tcp port 80 > sw0-p1-rej-ip4.pcap &], [0]) > > > > sleep 1 > > > > > > > > -NS_CHECK_EXEC([sw0-p1], [nc 10.0.0.4 80], [1], [], > > > > +NS_CHECK_EXEC([sw0-p1-rej], [nc 10.0.0.4 80], [1], [], > > > > [dnl > > > > Ncat: Connection refused. > > > > ]) > > > > > > > > OVS_WAIT_UNTIL([ > > > > - total=`cat sw0-p1-ip4.pcap | wc -l` > > > > + total=`cat sw0-p1-rej-ip4.pcap | wc -l` > > > > echo "total = $total" > > > > test "${total}" = "2" > > > > ]) > > > > > > > > +# Now send traffic to port 84 > > > > +NS_CHECK_EXEC([sw0-p1-rej], [nc 10.0.0.4 84], [1], [], > > > > +[dnl > > > > +Ncat: Connection refused. > > > > +]) > > > > + > > > > +AT_CHECK([ > > > > + n_pkt=$(ovs-ofctl dump-flows br-int table=44 | grep -v n_packets=0 | \ > > > > +grep controller | grep tp_dst=84 -c) > > > > + test $n_pkt -eq 1 > > > > +]) > > > > + > > > > # Without this sleep, test case fails intermittently. > > > > sleep 3 > > > > > > > > -NS_CHECK_EXEC([sw0-p2], [tcpdump -n -c 2 -i sw0-p2 tcp port 80 > sw0-p2-ip6.pcap &], [0]) > > > > +NS_CHECK_EXEC([sw0-p2-rej], [tcpdump -n -c 2 -i sw0-p2-rej tcp port 80 > sw0-p2-rej-ip6.pcap &], [0]) > > > > > > > > sleep 1 > > > > > > > > -NS_CHECK_EXEC([sw0-p2], [nc -6 aef0::3 80], [1], [], > > > > +NS_CHECK_EXEC([sw0-p2-rej], [nc -6 aef0::3 80], [1], [], > > > > [dnl > > > > Ncat: Connection refused. > > > > ]) > > > > > > > > OVS_WAIT_UNTIL([ > > > > - total=`cat sw0-p2-ip6.pcap | wc -l` > > > > + total=`cat sw0-p2-rej-ip6.pcap | wc -l` > > > > echo "total = $total" > > > > test "${total}" = "2" > > > > ]) > > > > @@ -3936,3 +3962,105 @@ as > > > > OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d > > > > /.*terminating with signal 15.*/d"]) > > > > AT_CLEANUP > > > > + > > > > +# Tests that when an established connection sends TCP reset, > > > > +# the conntrack entry is not in established state. > > > > +AT_SETUP([ovn -- conntrack TCP reset]) > > > > +AT_KEYWORDS([conntrack]) > > > > +ovn_start > > > > + > > > > +OVS_TRAFFIC_VSWITCHD_START() > > > > +ADD_BR([br-int]) > > > > + > > > > +# Set external-ids in br-int needed for ovn-controller > > > > +ovs-vsctl \ > > > > + -- set Open_vSwitch . external-ids:system-id=hv1 \ > > > > + -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \ > > > > + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \ > > > > + -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \ > > > > + -- set bridge br-int fail-mode=secure other-config:disable-in-band=true > > > > + > > > > +# Start ovn-controller > > > > +start_daemon ovn-controller > > > > + > > > > +ovn-nbctl ls-add sw0 > > > > + > > > > +ovn-nbctl lsp-add sw0 rst-p1 > > > > +ovn-nbctl lsp-set-addresses rst-p1 "50:54:00:00:00:03" > > > > +ovn-nbctl lsp-set-port-security rst-p1 "50:54:00:00:00:03" > > > > + > > > > +ovn-nbctl lsp-add sw0 rst-p2 > > > > +ovn-nbctl lsp-set-addresses rst-p2 "50:54:00:00:00:04 10.0.0.4" > > > > +ovn-nbctl lsp-set-port-security rst-p2 "50:54:00:00:00:04 10.0.0.4" > > > > + > > > > +# Create port group and ACLs for sw0 ports. > > > > +ovn-nbctl pg-add pg0_drop rst-p1 rst-p2 > > > > +ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && ip" drop > > > > +ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && ip" drop > > > > + > > > > +ovn-nbctl pg-add pg0 rst-p1 rst-p2 > > > > +ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4" allow-related > > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && icmp4" allow-related > > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && tcp && tcp.dst == 80" allow-related > > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && udp && udp.dst == 80" allow-related > > > > + > > > > +# Create a logical router and attach to logical switch. > > > > +ovn-nbctl lr-add lr0 > > > > +ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 > > > > +ovn-nbctl lsp-add sw0 sw0-lr0 > > > > +ovn-nbctl lsp-set-type sw0-lr0 router > > > > +ovn-nbctl lsp-set-addresses sw0-lr0 router > > > > +ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 > > > > + > > > > +ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80 > > > > +ovn-nbctl --wait=sb ls-lb-add sw0 lb1 > > > > +ovn-nbctl --wait=sb lr-lb-add lr0 lb1 > > > > + > > > > +OVN_POPULATE_ARP > > > > +ovn-nbctl --wait=hv sync > > > > + > > > > +ADD_NAMESPACES(rst-p1) > > > > +ADD_VETH(rst-p1, rst-p1, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ > > > > + "10.0.0.1") > > > > + > > > > +ADD_NAMESPACES(rst-p2) > > > > +ADD_VETH(rst-p2, rst-p2, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ > > > > + "10.0.0.1") > > > > + > > > > +OVS_WAIT_UNTIL([test x$(ovn-nbctl lsp-get-up rst-p1) = xup]) > > > > +OVS_WAIT_UNTIL([test x$(ovn-nbctl lsp-get-up rst-p2) = xup]) > > > > + > > > > +# Start webservers in 'rst-p1'. > > > > +OVS_START_L7([rst-p1], [http]) > > > > + > > > > +NS_CHECK_EXEC([rst-p2], [$PYTHON $srcdir/test-tcp-rst.py --dst-port 80 --dst-ip 10.0.0.10]) > > > > + > > > > +# When tcp reset is sent, conntrack entry should be in the state - CLOSED or CLOSING. > > > > +# But there is a bug where tcp reset packet was not sent to the conntrack. > > > > +# This test case checks that the tcp reset packet is sent to conntrack > > > > +# and the state is not in established state. > > > > +AT_CHECK([ > > > > + ct_est_count=$(ovs-appctl dpctl/dump-conntrack | grep 10.0.0.10 | grep state=ESTABLISHED -c) > > > > + test $ct_est_count -eq 0 > > > > + > > > > + ct_est_count=$(ovs-appctl dpctl/dump-conntrack | grep 10.0.0.10 | grep state=CLOS -c) > > > > + test $ct_est_count -eq 1 > > > > +]) > > > > + > > > > +OVS_APP_EXIT_AND_WAIT([ovn-controller]) > > > > + > > > > +as ovn-sb > > > > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > > > > + > > > > +as ovn-nb > > > > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > > > > + > > > > +as northd > > > > +OVS_APP_EXIT_AND_WAIT([ovn-northd]) > > > > + > > > > +as > > > > +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d > > > > +/connection dropped.*/d > > > > +/Service monitor not found.*/d"]) > > > > + > > > > +AT_CLEANUP > > > > diff --git a/tests/test-tcp-rst.py b/tests/test-tcp-rst.py > > > > new file mode 100644 > > > > index 000000000..6f96c5706 > > > > --- /dev/null > > > > +++ b/tests/test-tcp-rst.py > > > > @@ -0,0 +1,37 @@ > > > > +#!/usr/bin/env python3 > > > > +# Copyright (c) 2020 Red Hat, Inc. > > > > +# > > > > +# Licensed under the Apache License, Version 2.0 (the "License"); > > > > +# you may not use this file except in compliance with the License. > > > > +# You may obtain a copy of the License at: > > > > +# > > > > +# http://www.apache.org/licenses/LICENSE-2.0 > > > > +# > > > > +# Unless required by applicable law or agreed to in writing, software > > > > +# distributed under the License is distributed on an "AS IS" BASIS, > > > > +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > > > > +# See the License for the specific language governing permissions and > > > > +# limitations under the License. > > > > + > > > > +# Simple python script which connects to tcp server and then > > > > +# resets the connection. > > > > +import argparse > > > > +import socket > > > > +import sys > > > > +import struct > > > > +import time > > > > + > > > > +parser = argparse.ArgumentParser(description='') > > > > +parser.add_argument("--src-port", type=int, default=11337, help="source port to use") > > > > +parser.add_argument("--dst-port", type=int, help="dst port to use") > > > > +parser.add_argument("--dst-ip", help="server ip to use") > > > > +args = parser.parse_args() > > > > +sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) > > > > +server_address = (args.dst_ip, args.dst_port) > > > > +sock.bind(('0.0.0.0', args.src_port)) > > > > +sock.connect(server_address) > > > > +l_onoff = 1 > > > > +l_linger = 0 > > > > +time.sleep(1) > > > > +sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', l_onoff, l_linger)) > > > > +sock.close() > > > > > > > > > > _______________________________________________ > > > dev mailing list > > > dev@openvswitch.org > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > > > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On Wed, Apr 29, 2020 at 12:21 PM Han Zhou <hzhou@ovn.org> wrote: > On Mon, Apr 27, 2020 at 8:41 AM Numan Siddique <numans@ovn.org> wrote: > > > > On Mon, Apr 27, 2020 at 8:40 PM Lorenzo Bianconi > > <lorenzo.bianconi@redhat.com> wrote: > > > > > > > > > > > On 4/24/20 9:55 AM, numans@ovn.org wrote: > > > > > From: Numan Siddique <numans@ovn.org> > > > > > > > > > > The commit [1] - 28097d5adb95("Fix tcp_reset action handling") > fixed an issue > > > > > with tcp_reset OVN action. In order to fix that issue, this commit > added > > > > > logical flows to skip all the TCP RST packets from conntrack. > > > > > Ideally it should have skipped only the TCP RST packets generated > by > > > > > ovn-controller from conntrack. Since all the TCP RST packets are > > > > > skipped from conntrack, the connections in conntrack remain in > > > > > ESTABLISHED state even if the client/server sends TCP RST to close > the > > > > > connection. And these entries live for a long time and this is > > > > > causing performance issues as reported in the BZ. > > > > > > > > > > This patch reverts the logical flows added in [1] and modifies the > inner > > > > > actions of tcp_reset in the ingress logical switch pipeline > > > > > from - "tcp_reset { outport <-> inport; output; }" > > > > > to "tcp_reset { output <-> inport; next(pipeline=egress,table=5); > }". > > > > > This causes the packet to resubmit to the egress table > ls_out_qos_mark > > > > > skipping the egress ACL stage. Prior to this packet, next action > was > > > > > not allowing a resubmit from ingress to egress pipeline. This patch > > > > > relaxes this limitation. > > > > > > > > > > For the tcp_reset action in the egress logical switch pipeline, > this patch > > > > > modifies the inner action > > > > > from - "tcp_reset { outport <-> inport; > next(pipeline=ingress,table=0); }" > > > > > to - "tcp_reset { outport <-> inport; > next(pipeline=ingress,table=19); }". > > > > > This causes the packet to enter the ingress table ls_in_l2_lkup. > > > > > > > > > > We don't see similar conntrack leaks with UDP. Although there is an > issue > > > > > with the acl reject action for UDP packets. When ovn-controller > generates icmp > > > > > destination unreachable packet, it doesn't get delivered. And the > IP checksum is > > > > > incorrect in this packet. A follow up patch will fix these issues. > > > > > > > > > > [1] - 28097d5adb95("Fix tcp_reset action handling") > > > > > > > > > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1819785 > > > > > Co-Authored-by: Tim Rozet <trozet@redhat.com> > > > > > Signed-off-by: Tim Rozet <trozet@redhat.com> > > > > > Signed-off-by: Numan Siddique <numans@ovn.org> > > > > > > Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> > > > > > > > > > > > > > > Hi Numan, > > > > > > > > Looks good to me. Only a question inline regarding options for > avoiding > > > > hardcoding table IDs. Except for that: > > > > > > > > Acked-by: Dumitru Ceara <dceara@redhat.com> > > > > > > > > Thanks Dumitru and Lorenzo for the reviews. I applied this patch to > master. > > Thanks Numan. Shall this be backported to 20.03? > I forgot to backport to 20.03 earlier when I applied to master. I backported it yesterday. Forgot to update here. Thanks Numan. > > > > > > Thanks, > > > > Dumitru > > > > > > > > > --- > > > > > lib/actions.c | 6 +- > > > > > northd/ovn-northd.8.xml | 8 ++ > > > > > northd/ovn-northd.c | 14 ++-- > > > > > ovn-sb.xml | 10 ++- > > > > > tests/automake.mk | 3 +- > > > > > tests/ovn.at | 6 +- > > > > > tests/system-ovn.at | 170 > +++++++++++++++++++++++++++++++++++----- > > > > > tests/test-tcp-rst.py | 37 +++++++++ > > > > > 8 files changed, 216 insertions(+), 38 deletions(-) > > > > > create mode 100644 tests/test-tcp-rst.py > > > > > > > > > > diff --git a/lib/actions.c b/lib/actions.c > > > > > index c19813de0..91619c889 100644 > > > > > --- a/lib/actions.c > > > > > +++ b/lib/actions.c > > > > > @@ -319,11 +319,7 @@ parse_NEXT(struct action_context *ctx) > > > > > } > > > > > } > > > > > > > > > > - if (pipeline == OVNACT_P_EGRESS && ctx->pp->pipeline == > OVNACT_P_INGRESS) { > > > > > - lexer_error(ctx->lexer, > > > > > - "\"next\" action cannot advance from ingress > to egress " > > > > > - "pipeline (use \"output\" action instead)"); > > > > > - } else if (table >= ctx->pp->n_tables) { > > > > > + if (table >= ctx->pp->n_tables) { > > > > > lexer_error(ctx->lexer, > > > > > "\"next\" action cannot advance beyond table > %d.", > > > > > ctx->pp->n_tables - 1); > > > > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml > > > > > index efcc4b7fc..d39e259f6 100644 > > > > > --- a/northd/ovn-northd.8.xml > > > > > +++ b/northd/ovn-northd.8.xml > > > > > @@ -373,6 +373,14 @@ > > > > > for new connections and <code>reg0[1] = 1; next;</code> > for existing > > > > > connections. > > > > > </li> > > > > > + <li> > > > > > + <code>reject</code> ACLs translate into logical > > > > > + flows with the > > > > > + <code>tcp_reset { output <-> inport; > > > > > + next(pipeline=egress,table=5);}</code> > > > > > + action for TCP connections and <code>icmp4/icmp6</code> > action > > > > > + for UDP connections. > > > > > + </li> > > > > > <li> > > > > > Other ACLs translate to <code>drop;</code> for new or > untracked > > > > > connections and <code>ct_commit(ct_label=1/1);</code> for > known > > > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > > > > > index c4675bd68..f8ae155a2 100644 > > > > > --- a/northd/ovn-northd.c > > > > > +++ b/northd/ovn-northd.c > > > > > @@ -4721,11 +4721,11 @@ build_pre_acls(struct ovn_datapath *od, > struct hmap *lflows) > > > > > * unreachable packets. */ > > > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110, > > > > > "nd || nd_rs || nd_ra || icmp4.type == 3 || > " > > > > > - "icmp6.type == 1 || (tcp && tcp.flags == 20) > || " > > > > > + "icmp6.type == 1 || " > > > > > "(udp && udp.src == 546 && udp.dst == 547)", > "next;"); > > > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110, > > > > > "nd || nd_rs || nd_ra || icmp4.type == 3 || > " > > > > > - "icmp6.type == 1 || (tcp && tcp.flags == 20) > ||" > > > > > + "icmp6.type == 1 || " > > > > > "(udp && udp.src == 546 && udp.dst == 547)", > "next;"); > > > > > > > > > > /* Ingress and Egress Pre-ACL Table (Priority 100). > > > > > @@ -4838,11 +4838,11 @@ build_pre_lb(struct ovn_datapath *od, > struct hmap *lflows, > > > > > /* Do not send ND packets to conntrack */ > > > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 110, > > > > > "nd || nd_rs || nd_ra || icmp4.type == 3 ||" > > > > > - "icmp6.type == 1 || (tcp && tcp.flags == 20)", > > > > > + "icmp6.type == 1", > > > > > "next;"); > > > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 110, > > > > > "nd || nd_rs || nd_ra || icmp4.type == 3 ||" > > > > > - "icmp6.type == 1 || (tcp && tcp.flags == 20)", > > > > > + "icmp6.type == 1", > > > > > "next;"); > > > > > > > > > > /* Do not send service monitor packets to conntrack. */ > > > > > @@ -4988,7 +4988,8 @@ build_reject_acl_rules(struct ovn_datapath > *od, struct hmap *lflows, > > > > > ds_put_format(&actions, "reg0 = 0; " > > > > > "eth.dst <-> eth.src; ip4.dst <-> ip4.src; " > > > > > "tcp_reset { outport <-> inport; %s };", > > > > > - ingress ? "output;" : > "next(pipeline=ingress,table=0);"); > > > > > + ingress ? "next(pipeline=egress,table=5);" > > > > > + : "next(pipeline=ingress,table=19);"); > > > > > > > > Do you think there's a way we could avoid hardcoding the table > numbers? > > > > I'm thinking of the case when a stage is added/deleted and the > > > > ls_out_qos_mark/"ls_in_l2_lkup" end up having different table ids. I > > > > don't think a macro definition is enough as that still needs updating > > > > every time a table is added/deleted. > > > > I agree. Hardcoding table numbers are prone to regressions later in this > > case. As I discussed offline, I think it's better to have a macro defined > > for each table whose value gets generated. > > > > I'll work on a follow up patch to have this. > > > > Thanks > > Numan > > > > > > > > > > > ovn_lflow_add_with_hint(lflows, od, stage, > > > > > acl->priority + OVN_ACL_PRI_OFFSET + > 10, > > > > > ds_cstr(&match), ds_cstr(&actions), > stage_hint); > > > > > @@ -5002,7 +5003,8 @@ build_reject_acl_rules(struct ovn_datapath > *od, struct hmap *lflows, > > > > > ds_put_format(&actions, "reg0 = 0; " > > > > > "eth.dst <-> eth.src; ip6.dst <-> ip6.src; " > > > > > "tcp_reset { outport <-> inport; %s };", > > > > > - ingress ? "output;" : > "next(pipeline=ingress,table=0);"); > > > > > + ingress ? "next(pipeline=egress,table=5);" > > > > > + : "next(pipeline=ingress,table=19);"); > > > > > > > > Same question as above. > > > > > > > > > ovn_lflow_add_with_hint(lflows, od, stage, > > > > > acl->priority + OVN_ACL_PRI_OFFSET + > 10, > > > > > ds_cstr(&match), ds_cstr(&actions), > stage_hint); > > > > > diff --git a/ovn-sb.xml b/ovn-sb.xml > > > > > index 5f8da534c..3aa7cd4da 100644 > > > > > --- a/ovn-sb.xml > > > > > +++ b/ovn-sb.xml > > > > > @@ -1112,10 +1112,12 @@ > > > > > <var>pipeline</var> as a subroutine. The default > <var>table</var> is > > > > > just after the current one. If <var>pipeline</var> is > specified, it > > > > > may be <code>ingress</code> or <code>egress</code>; the > default > > > > > - <var>pipeline</var> is the one currently executing. > Actions in the > > > > > - ingress pipeline may not use <code>next</code> to jump > into the > > > > > - egress pipeline (use the <code>output</code> instead), > but > > > > > - transitions in the opposite direction are allowed. > > > > > + <var>pipeline</var> is the one currently executing. > Actions in the > > > > > + both ingress and egress pipeline can use > <code>next</code> to jump > > > > > + across the other pipeline. Actions in the ingress > pipeline should > > > > > + use <code>next</code> to jump into the specific table of > egress > > > > > + pipeline only if it is certain that the packets are > local and not > > > > > + tunnelled and wants to skip certain stages in the packet > processing. > > > > > </dd> > > > > > > > > > > <dt><code><var>field</var> = > <var>constant</var>;</code></dt> > > > > > diff --git a/tests/automake.mk b/tests/automake.mk > > > > > index 215fb432b..ed530dd77 100644 > > > > > --- a/tests/automake.mk > > > > > +++ b/tests/automake.mk > > > > > @@ -205,7 +205,8 @@ tests_ovstest_LDADD = $(OVS_LIBDIR)/ > libopenvswitch.la lib/libovn.la > > > > > # Python tests. > > > > > CHECK_PYFILES = \ > > > > > tests/test-l7.py \ > > > > > - tests/uuidfilt.py > > > > > + tests/uuidfilt.py \ > > > > > + tests/test-tcp-rst.py > > > > > > > > > > EXTRA_DIST += $(CHECK_PYFILES) > > > > > PYCOV_CLEAN_FILES += $(CHECK_PYFILES:.py=.py,cover) .coverage > > > > > diff --git a/tests/ovn.at b/tests/ovn.at > > > > > index 2a7ee7528..b00670816 100644 > > > > > --- a/tests/ovn.at > > > > > +++ b/tests/ovn.at > > > > > @@ -852,7 +852,11 @@ next(pipeline=ingress, table=11); > > > > > encodes as resubmit(,19) > > > > > > > > > > next(pipeline=egress); > > > > > - "next" action cannot advance from ingress to egress pipeline > (use "output" action instead) > > > > > + formats as next(pipeline=egress, table=11); > > > > > + encodes as resubmit(,51) > > > > > + > > > > > +next(pipeline=egress, table=5); > > > > > + encodes as resubmit(,45) > > > > > > > > > > next(table=10); > > > > > formats as next(10); > > > > > diff --git a/tests/system-ovn.at b/tests/system-ovn.at > > > > > index bdb9768d2..3b11cf92b 100644 > > > > > --- a/tests/system-ovn.at > > > > > +++ b/tests/system-ovn.at > > > > > @@ -3719,60 +3719,86 @@ start_daemon ovn-controller > > > > > > > > > > ovn-nbctl ls-add sw0 > > > > > > > > > > -ovn-nbctl lsp-add sw0 sw0-p1 > > > > > -ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3 > aef0::3" > > > > > -ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3 > aef0::3" > > > > > +ovn-nbctl lsp-add sw0 sw0-p1-rej > > > > > +ovn-nbctl lsp-set-addresses sw0-p1-rej "50:54:00:00:00:03 10.0.0.3 > aef0::3" > > > > > +ovn-nbctl lsp-set-port-security sw0-p1-rej "50:54:00:00:00:03 > 10.0.0.3 aef0::3" > > > > > > > > > > -ovn-nbctl lsp-add sw0 sw0-p2 > > > > > -ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4 > aef0::4" > > > > > -ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:04 10.0.0.4 > aef0::4" > > > > > +ovn-nbctl lsp-add sw0 sw0-p2-rej > > > > > +ovn-nbctl lsp-set-addresses sw0-p2-rej "50:54:00:00:00:04 10.0.0.4 > aef0::4" > > > > > +ovn-nbctl lsp-set-port-security sw0-p2-rej "50:54:00:00:00:04 > 10.0.0.4 aef0::4" > > > > > + > > > > > +#ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p1\" > && tcp && tcp.dst == 80" reject > > > > > +#ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p2\" > && ip6 && tcp && tcp.dst == 80" reject > > > > > + > > > > > +# Create port group and ACLs for sw0 ports. > > > > > +ovn-nbctl pg-add pg0_drop sw0-p1-rej sw0-p2-rej > > > > > +ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && > ip" drop > > > > > +ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && > ip" drop > > > > > + > > > > > +ovn-nbctl pg-add pg0 sw0-p1-rej sw0-p2-rej > > > > > +ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4" > allow-related > > > > > +ovn-nbctl --log acl-add pg0 from-lport 1004 "inport == @pg0 && ip > && tcp && tcp.dst == 80" reject > > > > > > > > > > -ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p1\" > && tcp && tcp.dst == 80" reject > > > > > -ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p2\" > && ip6 && tcp && tcp.dst == 80" reject > > > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && > ip4.src == 0.0.0.0/0 && icmp4" allow-related > > > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && > ip4.src == 0.0.0.0/0 && tcp && tcp.dst == 82" allow-related > > > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && > ip4.src == 0.0.0.0/0 && udp && udp.dst == 82" allow-related > > > > > +ovn-nbctl --log acl-add pg0 to-lport 1004 "inport == @pg0 && ip && > tcp && tcp.dst == 84" reject > > > > > > > > > > OVN_POPULATE_ARP > > > > > ovn-nbctl --wait=hv sync > > > > > > > > > > -ADD_NAMESPACES(sw0-p1) > > > > > -ADD_VETH(sw0-p1, sw0-p1, br-int, "10.0.0.3/24", > "50:54:00:00:00:03", \ > > > > > +ADD_NAMESPACES(sw0-p1-rej) > > > > > +ADD_VETH(sw0-p1-rej, sw0-p1-rej, br-int, "10.0.0.3/24", > "50:54:00:00:00:03", \ > > > > > "10.0.0.1") > > > > > > > > > > -ADD_NAMESPACES(sw0-p2) > > > > > -ADD_VETH(sw0-p2, sw0-p2, br-int, "10.0.0.4/24", > "50:54:00:00:00:04", \ > > > > > +ADD_NAMESPACES(sw0-p2-rej) > > > > > +ADD_VETH(sw0-p2-rej, sw0-p2-rej, br-int, "10.0.0.4/24", > "50:54:00:00:00:04", \ > > > > > "10.0.0.1") > > > > > > > > > > -NS_CHECK_EXEC([sw0-p1], [ip a a aef0::3/64 dev sw0-p1], [0]) > > > > > -NS_CHECK_EXEC([sw0-p2], [ip a a aef0::4/64 dev sw0-p2], [0]) > > > > > +NS_CHECK_EXEC([sw0-p1-rej], [ip a a aef0::3/64 dev sw0-p1-rej], > [0]) > > > > > +NS_CHECK_EXEC([sw0-p2-rej], [ip a a aef0::4/64 dev sw0-p2-rej], > [0]) > > > > > > > > > > -# Capture packets in sw0-p1. > > > > > -NS_CHECK_EXEC([sw0-p1], [tcpdump -n -c 2 -i sw0-p1 tcp port 80 > > sw0-p1-ip4.pcap &], [0]) > > > > > +# Capture packets in sw0-p1-rej. > > > > > +NS_CHECK_EXEC([sw0-p1-rej], [tcpdump -n -c 2 -i sw0-p1-rej tcp > port 80 > sw0-p1-rej-ip4.pcap &], [0]) > > > > > sleep 1 > > > > > > > > > > -NS_CHECK_EXEC([sw0-p1], [nc 10.0.0.4 80], [1], [], > > > > > +NS_CHECK_EXEC([sw0-p1-rej], [nc 10.0.0.4 80], [1], [], > > > > > [dnl > > > > > Ncat: Connection refused. > > > > > ]) > > > > > > > > > > OVS_WAIT_UNTIL([ > > > > > - total=`cat sw0-p1-ip4.pcap | wc -l` > > > > > + total=`cat sw0-p1-rej-ip4.pcap | wc -l` > > > > > echo "total = $total" > > > > > test "${total}" = "2" > > > > > ]) > > > > > > > > > > +# Now send traffic to port 84 > > > > > +NS_CHECK_EXEC([sw0-p1-rej], [nc 10.0.0.4 84], [1], [], > > > > > +[dnl > > > > > +Ncat: Connection refused. > > > > > +]) > > > > > + > > > > > +AT_CHECK([ > > > > > + n_pkt=$(ovs-ofctl dump-flows br-int table=44 | grep -v > n_packets=0 | \ > > > > > +grep controller | grep tp_dst=84 -c) > > > > > + test $n_pkt -eq 1 > > > > > +]) > > > > > + > > > > > # Without this sleep, test case fails intermittently. > > > > > sleep 3 > > > > > > > > > > -NS_CHECK_EXEC([sw0-p2], [tcpdump -n -c 2 -i sw0-p2 tcp port 80 > > sw0-p2-ip6.pcap &], [0]) > > > > > +NS_CHECK_EXEC([sw0-p2-rej], [tcpdump -n -c 2 -i sw0-p2-rej tcp > port 80 > sw0-p2-rej-ip6.pcap &], [0]) > > > > > > > > > > sleep 1 > > > > > > > > > > -NS_CHECK_EXEC([sw0-p2], [nc -6 aef0::3 80], [1], [], > > > > > +NS_CHECK_EXEC([sw0-p2-rej], [nc -6 aef0::3 80], [1], [], > > > > > [dnl > > > > > Ncat: Connection refused. > > > > > ]) > > > > > > > > > > OVS_WAIT_UNTIL([ > > > > > - total=`cat sw0-p2-ip6.pcap | wc -l` > > > > > + total=`cat sw0-p2-rej-ip6.pcap | wc -l` > > > > > echo "total = $total" > > > > > test "${total}" = "2" > > > > > ]) > > > > > @@ -3936,3 +3962,105 @@ as > > > > > OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d > > > > > /.*terminating with signal 15.*/d"]) > > > > > AT_CLEANUP > > > > > + > > > > > +# Tests that when an established connection sends TCP reset, > > > > > +# the conntrack entry is not in established state. > > > > > +AT_SETUP([ovn -- conntrack TCP reset]) > > > > > +AT_KEYWORDS([conntrack]) > > > > > +ovn_start > > > > > + > > > > > +OVS_TRAFFIC_VSWITCHD_START() > > > > > +ADD_BR([br-int]) > > > > > + > > > > > +# Set external-ids in br-int needed for ovn-controller > > > > > +ovs-vsctl \ > > > > > + -- set Open_vSwitch . external-ids:system-id=hv1 \ > > > > > + -- set Open_vSwitch . > external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \ > > > > > + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \ > > > > > + -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 > \ > > > > > + -- set bridge br-int fail-mode=secure > other-config:disable-in-band=true > > > > > + > > > > > +# Start ovn-controller > > > > > +start_daemon ovn-controller > > > > > + > > > > > +ovn-nbctl ls-add sw0 > > > > > + > > > > > +ovn-nbctl lsp-add sw0 rst-p1 > > > > > +ovn-nbctl lsp-set-addresses rst-p1 "50:54:00:00:00:03" > > > > > +ovn-nbctl lsp-set-port-security rst-p1 "50:54:00:00:00:03" > > > > > + > > > > > +ovn-nbctl lsp-add sw0 rst-p2 > > > > > +ovn-nbctl lsp-set-addresses rst-p2 "50:54:00:00:00:04 10.0.0.4" > > > > > +ovn-nbctl lsp-set-port-security rst-p2 "50:54:00:00:00:04 > 10.0.0.4" > > > > > + > > > > > +# Create port group and ACLs for sw0 ports. > > > > > +ovn-nbctl pg-add pg0_drop rst-p1 rst-p2 > > > > > +ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && > ip" drop > > > > > +ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && > ip" drop > > > > > + > > > > > +ovn-nbctl pg-add pg0 rst-p1 rst-p2 > > > > > +ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4" > allow-related > > > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && > ip4.src == 0.0.0.0/0 && icmp4" allow-related > > > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && > ip4.src == 0.0.0.0/0 && tcp && tcp.dst == 80" allow-related > > > > > +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && > ip4.src == 0.0.0.0/0 && udp && udp.dst == 80" allow-related > > > > > + > > > > > +# Create a logical router and attach to logical switch. > > > > > +ovn-nbctl lr-add lr0 > > > > > +ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 > > > > > +ovn-nbctl lsp-add sw0 sw0-lr0 > > > > > +ovn-nbctl lsp-set-type sw0-lr0 router > > > > > +ovn-nbctl lsp-set-addresses sw0-lr0 router > > > > > +ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 > > > > > + > > > > > +ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80 > > > > > +ovn-nbctl --wait=sb ls-lb-add sw0 lb1 > > > > > +ovn-nbctl --wait=sb lr-lb-add lr0 lb1 > > > > > + > > > > > +OVN_POPULATE_ARP > > > > > +ovn-nbctl --wait=hv sync > > > > > + > > > > > +ADD_NAMESPACES(rst-p1) > > > > > +ADD_VETH(rst-p1, rst-p1, br-int, "10.0.0.3/24", > "50:54:00:00:00:03", \ > > > > > + "10.0.0.1") > > > > > + > > > > > +ADD_NAMESPACES(rst-p2) > > > > > +ADD_VETH(rst-p2, rst-p2, br-int, "10.0.0.4/24", > "50:54:00:00:00:04", \ > > > > > + "10.0.0.1") > > > > > + > > > > > +OVS_WAIT_UNTIL([test x$(ovn-nbctl lsp-get-up rst-p1) = xup]) > > > > > +OVS_WAIT_UNTIL([test x$(ovn-nbctl lsp-get-up rst-p2) = xup]) > > > > > + > > > > > +# Start webservers in 'rst-p1'. > > > > > +OVS_START_L7([rst-p1], [http]) > > > > > + > > > > > +NS_CHECK_EXEC([rst-p2], [$PYTHON $srcdir/test-tcp-rst.py > --dst-port 80 --dst-ip 10.0.0.10]) > > > > > + > > > > > +# When tcp reset is sent, conntrack entry should be in the state - > CLOSED or CLOSING. > > > > > +# But there is a bug where tcp reset packet was not sent to the > conntrack. > > > > > +# This test case checks that the tcp reset packet is sent to > conntrack > > > > > +# and the state is not in established state. > > > > > +AT_CHECK([ > > > > > + ct_est_count=$(ovs-appctl dpctl/dump-conntrack | grep > 10.0.0.10 | grep state=ESTABLISHED -c) > > > > > + test $ct_est_count -eq 0 > > > > > + > > > > > + ct_est_count=$(ovs-appctl dpctl/dump-conntrack | grep > 10.0.0.10 | grep state=CLOS -c) > > > > > + test $ct_est_count -eq 1 > > > > > +]) > > > > > + > > > > > +OVS_APP_EXIT_AND_WAIT([ovn-controller]) > > > > > + > > > > > +as ovn-sb > > > > > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > > > > > + > > > > > +as ovn-nb > > > > > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > > > > > + > > > > > +as northd > > > > > +OVS_APP_EXIT_AND_WAIT([ovn-northd]) > > > > > + > > > > > +as > > > > > +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d > > > > > +/connection dropped.*/d > > > > > +/Service monitor not found.*/d"]) > > > > > + > > > > > +AT_CLEANUP > > > > > diff --git a/tests/test-tcp-rst.py b/tests/test-tcp-rst.py > > > > > new file mode 100644 > > > > > index 000000000..6f96c5706 > > > > > --- /dev/null > > > > > +++ b/tests/test-tcp-rst.py > > > > > @@ -0,0 +1,37 @@ > > > > > +#!/usr/bin/env python3 > > > > > +# Copyright (c) 2020 Red Hat, Inc. > > > > > +# > > > > > +# Licensed under the Apache License, Version 2.0 (the "License"); > > > > > +# you may not use this file except in compliance with the License. > > > > > +# You may obtain a copy of the License at: > > > > > +# > > > > > +# http://www.apache.org/licenses/LICENSE-2.0 > > > > > +# > > > > > +# Unless required by applicable law or agreed to in writing, > software > > > > > +# distributed under the License is distributed on an "AS IS" > BASIS, > > > > > +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or > implied. > > > > > +# See the License for the specific language governing permissions > and > > > > > +# limitations under the License. > > > > > + > > > > > +# Simple python script which connects to tcp server and then > > > > > +# resets the connection. > > > > > +import argparse > > > > > +import socket > > > > > +import sys > > > > > +import struct > > > > > +import time > > > > > + > > > > > +parser = argparse.ArgumentParser(description='') > > > > > +parser.add_argument("--src-port", type=int, default=11337, > help="source port to use") > > > > > +parser.add_argument("--dst-port", type=int, help="dst port to > use") > > > > > +parser.add_argument("--dst-ip", help="server ip to use") > > > > > +args = parser.parse_args() > > > > > +sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) > > > > > +server_address = (args.dst_ip, args.dst_port) > > > > > +sock.bind(('0.0.0.0', args.src_port)) > > > > > +sock.connect(server_address) > > > > > +l_onoff = 1 > > > > > +l_linger = 0 > > > > > +time.sleep(1) > > > > > +sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, > struct.pack('ii', l_onoff, l_linger)) > > > > > +sock.close() > > > > > > > > > > > > > _______________________________________________ > > > > dev mailing list > > > > dev@openvswitch.org > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > > > > > > > _______________________________________________ > > > dev mailing list > > > dev@openvswitch.org > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >
diff --git a/lib/actions.c b/lib/actions.c index c19813de0..91619c889 100644 --- a/lib/actions.c +++ b/lib/actions.c @@ -319,11 +319,7 @@ parse_NEXT(struct action_context *ctx) } } - if (pipeline == OVNACT_P_EGRESS && ctx->pp->pipeline == OVNACT_P_INGRESS) { - lexer_error(ctx->lexer, - "\"next\" action cannot advance from ingress to egress " - "pipeline (use \"output\" action instead)"); - } else if (table >= ctx->pp->n_tables) { + if (table >= ctx->pp->n_tables) { lexer_error(ctx->lexer, "\"next\" action cannot advance beyond table %d.", ctx->pp->n_tables - 1); diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml index efcc4b7fc..d39e259f6 100644 --- a/northd/ovn-northd.8.xml +++ b/northd/ovn-northd.8.xml @@ -373,6 +373,14 @@ for new connections and <code>reg0[1] = 1; next;</code> for existing connections. </li> + <li> + <code>reject</code> ACLs translate into logical + flows with the + <code>tcp_reset { output <-> inport; + next(pipeline=egress,table=5);}</code> + action for TCP connections and <code>icmp4/icmp6</code> action + for UDP connections. + </li> <li> Other ACLs translate to <code>drop;</code> for new or untracked connections and <code>ct_commit(ct_label=1/1);</code> for known diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index c4675bd68..f8ae155a2 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -4721,11 +4721,11 @@ build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) * unreachable packets. */ ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_ACL, 110, "nd || nd_rs || nd_ra || icmp4.type == 3 || " - "icmp6.type == 1 || (tcp && tcp.flags == 20) || " + "icmp6.type == 1 || " "(udp && udp.src == 546 && udp.dst == 547)", "next;"); ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110, "nd || nd_rs || nd_ra || icmp4.type == 3 || " - "icmp6.type == 1 || (tcp && tcp.flags == 20) ||" + "icmp6.type == 1 || " "(udp && udp.src == 546 && udp.dst == 547)", "next;"); /* Ingress and Egress Pre-ACL Table (Priority 100). @@ -4838,11 +4838,11 @@ build_pre_lb(struct ovn_datapath *od, struct hmap *lflows, /* Do not send ND packets to conntrack */ ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 110, "nd || nd_rs || nd_ra || icmp4.type == 3 ||" - "icmp6.type == 1 || (tcp && tcp.flags == 20)", + "icmp6.type == 1", "next;"); ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 110, "nd || nd_rs || nd_ra || icmp4.type == 3 ||" - "icmp6.type == 1 || (tcp && tcp.flags == 20)", + "icmp6.type == 1", "next;"); /* Do not send service monitor packets to conntrack. */ @@ -4988,7 +4988,8 @@ build_reject_acl_rules(struct ovn_datapath *od, struct hmap *lflows, ds_put_format(&actions, "reg0 = 0; " "eth.dst <-> eth.src; ip4.dst <-> ip4.src; " "tcp_reset { outport <-> inport; %s };", - ingress ? "output;" : "next(pipeline=ingress,table=0);"); + ingress ? "next(pipeline=egress,table=5);" + : "next(pipeline=ingress,table=19);"); ovn_lflow_add_with_hint(lflows, od, stage, acl->priority + OVN_ACL_PRI_OFFSET + 10, ds_cstr(&match), ds_cstr(&actions), stage_hint); @@ -5002,7 +5003,8 @@ build_reject_acl_rules(struct ovn_datapath *od, struct hmap *lflows, ds_put_format(&actions, "reg0 = 0; " "eth.dst <-> eth.src; ip6.dst <-> ip6.src; " "tcp_reset { outport <-> inport; %s };", - ingress ? "output;" : "next(pipeline=ingress,table=0);"); + ingress ? "next(pipeline=egress,table=5);" + : "next(pipeline=ingress,table=19);"); ovn_lflow_add_with_hint(lflows, od, stage, acl->priority + OVN_ACL_PRI_OFFSET + 10, ds_cstr(&match), ds_cstr(&actions), stage_hint); diff --git a/ovn-sb.xml b/ovn-sb.xml index 5f8da534c..3aa7cd4da 100644 --- a/ovn-sb.xml +++ b/ovn-sb.xml @@ -1112,10 +1112,12 @@ <var>pipeline</var> as a subroutine. The default <var>table</var> is just after the current one. If <var>pipeline</var> is specified, it may be <code>ingress</code> or <code>egress</code>; the default - <var>pipeline</var> is the one currently executing. Actions in the - ingress pipeline may not use <code>next</code> to jump into the - egress pipeline (use the <code>output</code> instead), but - transitions in the opposite direction are allowed. + <var>pipeline</var> is the one currently executing. Actions in the + both ingress and egress pipeline can use <code>next</code> to jump + across the other pipeline. Actions in the ingress pipeline should + use <code>next</code> to jump into the specific table of egress + pipeline only if it is certain that the packets are local and not + tunnelled and wants to skip certain stages in the packet processing. </dd> <dt><code><var>field</var> = <var>constant</var>;</code></dt> diff --git a/tests/automake.mk b/tests/automake.mk index 215fb432b..ed530dd77 100644 --- a/tests/automake.mk +++ b/tests/automake.mk @@ -205,7 +205,8 @@ tests_ovstest_LDADD = $(OVS_LIBDIR)/libopenvswitch.la lib/libovn.la # Python tests. CHECK_PYFILES = \ tests/test-l7.py \ - tests/uuidfilt.py + tests/uuidfilt.py \ + tests/test-tcp-rst.py EXTRA_DIST += $(CHECK_PYFILES) PYCOV_CLEAN_FILES += $(CHECK_PYFILES:.py=.py,cover) .coverage diff --git a/tests/ovn.at b/tests/ovn.at index 2a7ee7528..b00670816 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -852,7 +852,11 @@ next(pipeline=ingress, table=11); encodes as resubmit(,19) next(pipeline=egress); - "next" action cannot advance from ingress to egress pipeline (use "output" action instead) + formats as next(pipeline=egress, table=11); + encodes as resubmit(,51) + +next(pipeline=egress, table=5); + encodes as resubmit(,45) next(table=10); formats as next(10); diff --git a/tests/system-ovn.at b/tests/system-ovn.at index bdb9768d2..3b11cf92b 100644 --- a/tests/system-ovn.at +++ b/tests/system-ovn.at @@ -3719,60 +3719,86 @@ start_daemon ovn-controller ovn-nbctl ls-add sw0 -ovn-nbctl lsp-add sw0 sw0-p1 -ovn-nbctl lsp-set-addresses sw0-p1 "50:54:00:00:00:03 10.0.0.3 aef0::3" -ovn-nbctl lsp-set-port-security sw0-p1 "50:54:00:00:00:03 10.0.0.3 aef0::3" +ovn-nbctl lsp-add sw0 sw0-p1-rej +ovn-nbctl lsp-set-addresses sw0-p1-rej "50:54:00:00:00:03 10.0.0.3 aef0::3" +ovn-nbctl lsp-set-port-security sw0-p1-rej "50:54:00:00:00:03 10.0.0.3 aef0::3" -ovn-nbctl lsp-add sw0 sw0-p2 -ovn-nbctl lsp-set-addresses sw0-p2 "50:54:00:00:00:04 10.0.0.4 aef0::4" -ovn-nbctl lsp-set-port-security sw0-p2 "50:54:00:00:00:04 10.0.0.4 aef0::4" +ovn-nbctl lsp-add sw0 sw0-p2-rej +ovn-nbctl lsp-set-addresses sw0-p2-rej "50:54:00:00:00:04 10.0.0.4 aef0::4" +ovn-nbctl lsp-set-port-security sw0-p2-rej "50:54:00:00:00:04 10.0.0.4 aef0::4" + +#ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p1\" && tcp && tcp.dst == 80" reject +#ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p2\" && ip6 && tcp && tcp.dst == 80" reject + +# Create port group and ACLs for sw0 ports. +ovn-nbctl pg-add pg0_drop sw0-p1-rej sw0-p2-rej +ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && ip" drop +ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && ip" drop + +ovn-nbctl pg-add pg0 sw0-p1-rej sw0-p2-rej +ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4" allow-related +ovn-nbctl --log acl-add pg0 from-lport 1004 "inport == @pg0 && ip && tcp && tcp.dst == 80" reject -ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p1\" && tcp && tcp.dst == 80" reject -ovn-nbctl --log acl-add sw0 from-lport 1000 "inport == \"sw0-p2\" && ip6 && tcp && tcp.dst == 80" reject +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && icmp4" allow-related +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && tcp && tcp.dst == 82" allow-related +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && udp && udp.dst == 82" allow-related +ovn-nbctl --log acl-add pg0 to-lport 1004 "inport == @pg0 && ip && tcp && tcp.dst == 84" reject OVN_POPULATE_ARP ovn-nbctl --wait=hv sync -ADD_NAMESPACES(sw0-p1) -ADD_VETH(sw0-p1, sw0-p1, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ +ADD_NAMESPACES(sw0-p1-rej) +ADD_VETH(sw0-p1-rej, sw0-p1-rej, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ "10.0.0.1") -ADD_NAMESPACES(sw0-p2) -ADD_VETH(sw0-p2, sw0-p2, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ +ADD_NAMESPACES(sw0-p2-rej) +ADD_VETH(sw0-p2-rej, sw0-p2-rej, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ "10.0.0.1") -NS_CHECK_EXEC([sw0-p1], [ip a a aef0::3/64 dev sw0-p1], [0]) -NS_CHECK_EXEC([sw0-p2], [ip a a aef0::4/64 dev sw0-p2], [0]) +NS_CHECK_EXEC([sw0-p1-rej], [ip a a aef0::3/64 dev sw0-p1-rej], [0]) +NS_CHECK_EXEC([sw0-p2-rej], [ip a a aef0::4/64 dev sw0-p2-rej], [0]) -# Capture packets in sw0-p1. -NS_CHECK_EXEC([sw0-p1], [tcpdump -n -c 2 -i sw0-p1 tcp port 80 > sw0-p1-ip4.pcap &], [0]) +# Capture packets in sw0-p1-rej. +NS_CHECK_EXEC([sw0-p1-rej], [tcpdump -n -c 2 -i sw0-p1-rej tcp port 80 > sw0-p1-rej-ip4.pcap &], [0]) sleep 1 -NS_CHECK_EXEC([sw0-p1], [nc 10.0.0.4 80], [1], [], +NS_CHECK_EXEC([sw0-p1-rej], [nc 10.0.0.4 80], [1], [], [dnl Ncat: Connection refused. ]) OVS_WAIT_UNTIL([ - total=`cat sw0-p1-ip4.pcap | wc -l` + total=`cat sw0-p1-rej-ip4.pcap | wc -l` echo "total = $total" test "${total}" = "2" ]) +# Now send traffic to port 84 +NS_CHECK_EXEC([sw0-p1-rej], [nc 10.0.0.4 84], [1], [], +[dnl +Ncat: Connection refused. +]) + +AT_CHECK([ + n_pkt=$(ovs-ofctl dump-flows br-int table=44 | grep -v n_packets=0 | \ +grep controller | grep tp_dst=84 -c) + test $n_pkt -eq 1 +]) + # Without this sleep, test case fails intermittently. sleep 3 -NS_CHECK_EXEC([sw0-p2], [tcpdump -n -c 2 -i sw0-p2 tcp port 80 > sw0-p2-ip6.pcap &], [0]) +NS_CHECK_EXEC([sw0-p2-rej], [tcpdump -n -c 2 -i sw0-p2-rej tcp port 80 > sw0-p2-rej-ip6.pcap &], [0]) sleep 1 -NS_CHECK_EXEC([sw0-p2], [nc -6 aef0::3 80], [1], [], +NS_CHECK_EXEC([sw0-p2-rej], [nc -6 aef0::3 80], [1], [], [dnl Ncat: Connection refused. ]) OVS_WAIT_UNTIL([ - total=`cat sw0-p2-ip6.pcap | wc -l` + total=`cat sw0-p2-rej-ip6.pcap | wc -l` echo "total = $total" test "${total}" = "2" ]) @@ -3936,3 +3962,105 @@ as OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d /.*terminating with signal 15.*/d"]) AT_CLEANUP + +# Tests that when an established connection sends TCP reset, +# the conntrack entry is not in established state. +AT_SETUP([ovn -- conntrack TCP reset]) +AT_KEYWORDS([conntrack]) +ovn_start + +OVS_TRAFFIC_VSWITCHD_START() +ADD_BR([br-int]) + +# Set external-ids in br-int needed for ovn-controller +ovs-vsctl \ + -- set Open_vSwitch . external-ids:system-id=hv1 \ + -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \ + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \ + -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \ + -- set bridge br-int fail-mode=secure other-config:disable-in-band=true + +# Start ovn-controller +start_daemon ovn-controller + +ovn-nbctl ls-add sw0 + +ovn-nbctl lsp-add sw0 rst-p1 +ovn-nbctl lsp-set-addresses rst-p1 "50:54:00:00:00:03" +ovn-nbctl lsp-set-port-security rst-p1 "50:54:00:00:00:03" + +ovn-nbctl lsp-add sw0 rst-p2 +ovn-nbctl lsp-set-addresses rst-p2 "50:54:00:00:00:04 10.0.0.4" +ovn-nbctl lsp-set-port-security rst-p2 "50:54:00:00:00:04 10.0.0.4" + +# Create port group and ACLs for sw0 ports. +ovn-nbctl pg-add pg0_drop rst-p1 rst-p2 +ovn-nbctl acl-add pg0_drop from-lport 1001 "inport == @pg0_drop && ip" drop +ovn-nbctl acl-add pg0_drop to-lport 1001 "outport == @pg0_drop && ip" drop + +ovn-nbctl pg-add pg0 rst-p1 rst-p2 +ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4" allow-related +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && icmp4" allow-related +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && tcp && tcp.dst == 80" allow-related +ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4 && ip4.src == 0.0.0.0/0 && udp && udp.dst == 80" allow-related + +# Create a logical router and attach to logical switch. +ovn-nbctl lr-add lr0 +ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.1/24 +ovn-nbctl lsp-add sw0 sw0-lr0 +ovn-nbctl lsp-set-type sw0-lr0 router +ovn-nbctl lsp-set-addresses sw0-lr0 router +ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 + +ovn-nbctl lb-add lb1 10.0.0.10:80 10.0.0.3:80 +ovn-nbctl --wait=sb ls-lb-add sw0 lb1 +ovn-nbctl --wait=sb lr-lb-add lr0 lb1 + +OVN_POPULATE_ARP +ovn-nbctl --wait=hv sync + +ADD_NAMESPACES(rst-p1) +ADD_VETH(rst-p1, rst-p1, br-int, "10.0.0.3/24", "50:54:00:00:00:03", \ + "10.0.0.1") + +ADD_NAMESPACES(rst-p2) +ADD_VETH(rst-p2, rst-p2, br-int, "10.0.0.4/24", "50:54:00:00:00:04", \ + "10.0.0.1") + +OVS_WAIT_UNTIL([test x$(ovn-nbctl lsp-get-up rst-p1) = xup]) +OVS_WAIT_UNTIL([test x$(ovn-nbctl lsp-get-up rst-p2) = xup]) + +# Start webservers in 'rst-p1'. +OVS_START_L7([rst-p1], [http]) + +NS_CHECK_EXEC([rst-p2], [$PYTHON $srcdir/test-tcp-rst.py --dst-port 80 --dst-ip 10.0.0.10]) + +# When tcp reset is sent, conntrack entry should be in the state - CLOSED or CLOSING. +# But there is a bug where tcp reset packet was not sent to the conntrack. +# This test case checks that the tcp reset packet is sent to conntrack +# and the state is not in established state. +AT_CHECK([ + ct_est_count=$(ovs-appctl dpctl/dump-conntrack | grep 10.0.0.10 | grep state=ESTABLISHED -c) + test $ct_est_count -eq 0 + + ct_est_count=$(ovs-appctl dpctl/dump-conntrack | grep 10.0.0.10 | grep state=CLOS -c) + test $ct_est_count -eq 1 +]) + +OVS_APP_EXIT_AND_WAIT([ovn-controller]) + +as ovn-sb +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) + +as ovn-nb +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) + +as northd +OVS_APP_EXIT_AND_WAIT([ovn-northd]) + +as +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d +/connection dropped.*/d +/Service monitor not found.*/d"]) + +AT_CLEANUP diff --git a/tests/test-tcp-rst.py b/tests/test-tcp-rst.py new file mode 100644 index 000000000..6f96c5706 --- /dev/null +++ b/tests/test-tcp-rst.py @@ -0,0 +1,37 @@ +#!/usr/bin/env python3 +# Copyright (c) 2020 Red Hat, Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at: +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# Simple python script which connects to tcp server and then +# resets the connection. +import argparse +import socket +import sys +import struct +import time + +parser = argparse.ArgumentParser(description='') +parser.add_argument("--src-port", type=int, default=11337, help="source port to use") +parser.add_argument("--dst-port", type=int, help="dst port to use") +parser.add_argument("--dst-ip", help="server ip to use") +args = parser.parse_args() +sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) +server_address = (args.dst_ip, args.dst_port) +sock.bind(('0.0.0.0', args.src_port)) +sock.connect(server_address) +l_onoff = 1 +l_linger = 0 +time.sleep(1) +sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', l_onoff, l_linger)) +sock.close()