Message ID | 20240604131042.825909-7-xsimonar@redhat.com |
---|---|
State | Superseded |
Headers | show |
Series | Flaky tests fixes and tests cleanup. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/github-robot-_ovn-kubernetes | success | github build: passed |
Hi Xavier, This is the only patch in the series for which I have a recommendation. The rest all look good to me. On 6/4/24 09:10, Xavier Simonart wrote: > In some rare cases, and despite "recent" changes to wait for cleanup > before replying to exit, ovn-controller was still running when trying > to restart it. > > Signed-off-by: Xavier Simonart <xsimonar@redhat.com> > --- > tests/ovn-macros.at | 9 +++++++++ > tests/ovn.at | 17 ++++++----------- > 2 files changed, 15 insertions(+), 11 deletions(-) > > diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at > index 47ada5c70..71a46db8b 100644 > --- a/tests/ovn-macros.at > +++ b/tests/ovn-macros.at > @@ -1107,6 +1107,15 @@ m4_define([OVN_CHECK_SCAPY_EDNS_CLIENT_SUBNET_SUPPORT], > AT_SKIP_IF([! echo "from scapy.layers.dns import EDNS0ClientSubnet" | python 2>&1 > /dev/null]) > ]) > > +m4_define([OVN_CONTROLLER_EXIT_RESTART], I think this should just be called "OVN_CONTROLLER_RESTART". The goal is to restart ovn-controller, so I think the simpler name makes sense. It's true that we are using an "exit" command for ovn-appctl, but that is more of an implementation detail than anything else. > + [TMPPID=$(cat $1/ovn-controller.pid) > + AT_CHECK([as $1 ovn-appctl -t ovn-controller exit --restart]) > + # Make sure ovn-controller stopped so that a future restart will not fail. > + # Checking debug/status is running is not enough as there might be a small race condition. > + echo "Waiting for pid $TMPPID" > + OVS_WAIT_WHILE([kill -0 $TMPPID 2>/dev/null]) > +]) > + > m4_define([OFTABLE_PHY_TO_LOG], [0]) > m4_define([OFTABLE_LOG_INGRESS_PIPELINE], [8]) > m4_define([OFTABLE_OUTPUT_LARGE_PKT_DETECT], [37]) > diff --git a/tests/ovn.at b/tests/ovn.at > index 2dd0dfd2e..8fa26c192 100644 > --- a/tests/ovn.at > +++ b/tests/ovn.at > @@ -20615,7 +20615,7 @@ echo $expected | ovstest test-ovn expr-to-packets > expected > OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected]) > > # Stop ovn-controller on hv2 with --restart flag > -as hv2 ovs-appctl -t ovn-controller exit --restart > +OVN_CONTROLLER_EXIT_RESTART([hv2]) > > # Now send the packet again. This time, it should still arrive > OVS_WAIT_UNTIL([as hv1 ovs-appctl -t ovn-controller inject-pkt "$packet"]) > @@ -29316,7 +29316,7 @@ check test "$hvt2" -gt 0 > # Kill ovn-controller on chassis hv3, so that it won't update nb_cfg. > # Then wait for 9 out of 10 > sleep 1 > -check as hv3 ovn-appctl -t ovn-controller exit --restart > +OVN_CONTROLLER_EXIT_RESTART([hv3]) > wait_for_ports_up > ovn-nbctl --wait=sb sync > wait_row_count Chassis_Private 9 name!=hv3 nb_cfg=2 > @@ -36252,7 +36252,7 @@ check_tunnel_port hv1 br-int hv2@192.168.0.2%192.168.0.1 > check_tunnel_port hv2 br-int hv1@192.168.0.1%192.168.0.2 > > # Stop ovn-controller on hv1 > -check as hv1 ovn-appctl -t ovn-controller exit --restart > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > # The tunnel should remain intact > check_tunnel_port hv1 br-int hv2@192.168.0.2%192.168.0.1 > @@ -36281,7 +36281,7 @@ check_tunnel_port hv2 br-int1 hv1@192.168.0.1%192.168.0.2 > check grep -q "Clearing old tunnel port \"ovn-hv1-0\" (hv1@192.168.0.1%192.168.0.2) from bridge \"br-int\"" hv2/ovn-controller.log > > # Stop ovn-controller on hv1 > -check as hv1 ovn-appctl -t ovn-controller exit --restart > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > # The tunnel should remain intact > check_tunnel_port hv1 br-int1 hv2@192.168.0.2%192.168.0.1 > @@ -36371,10 +36371,7 @@ prev_id2=$(ovs-vsctl --bare --columns _uuid find port external_ids:ovn-chassis-i > # The hv2 is running we can remove the override file > rm -f ${OVN_SYSCONFDIR}/system-id-override > > -check ovn-appctl -t ovn-controller exit --restart > - > -# Make sure ovn-controller stopped before restarting it > -OVS_WAIT_UNTIL([test x$(ovn-appctl -t ovn-controller debug/status) != "xrunning"]) > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > # for some reason SSL ovsdb configuration overrides CLI, so > # delete ssl config from ovsdb to give CLI arguments priority > @@ -37086,9 +37083,7 @@ AT_CHECK([grep -c "NXT_CT_FLUSH_ZONE" hv1/ovs-vswitchd.log], [0], [dnl > ]) > > AS_BOX([Check conversion from UUID - restart]) > -ovn-appctl -t ovn-controller exit --restart > -# Make sure ovn-controller stopped before restarting it > -OVS_WAIT_UNTIL([test "$(ovn-appctl -t ovn-controller debug/status)" != "running"]) > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > replace_with_uuid lr0 > replace_with_uuid sw0
On Thu, Jun 6, 2024 at 4:59 PM Mark Michelson <mmichels@redhat.com> wrote: > > Hi Xavier, > > This is the only patch in the series for which I have a recommendation. > The rest all look good to me. > > On 6/4/24 09:10, Xavier Simonart wrote: > > In some rare cases, and despite "recent" changes to wait for cleanup > > before replying to exit, ovn-controller was still running when trying > > to restart it. > > > > Signed-off-by: Xavier Simonart <xsimonar@redhat.com> > > --- > > tests/ovn-macros.at | 9 +++++++++ > > tests/ovn.at | 17 ++++++----------- > > 2 files changed, 15 insertions(+), 11 deletions(-) > > > > diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at > > index 47ada5c70..71a46db8b 100644 > > --- a/tests/ovn-macros.at > > +++ b/tests/ovn-macros.at > > @@ -1107,6 +1107,15 @@ m4_define([OVN_CHECK_SCAPY_EDNS_CLIENT_SUBNET_SUPPORT], > > AT_SKIP_IF([! echo "from scapy.layers.dns import EDNS0ClientSubnet" | python 2>&1 > /dev/null]) > > ]) > > > > +m4_define([OVN_CONTROLLER_EXIT_RESTART], > > I think this should just be called "OVN_CONTROLLER_RESTART". The goal is > to restart ovn-controller, so I think the simpler name makes sense. It's > true that we are using an "exit" command for ovn-appctl, but that is > more of an implementation detail than anything else. Thanks for fixing the flakes. I applied all the patches in this series to main except this one. Numan > > > + [TMPPID=$(cat $1/ovn-controller.pid) > > + AT_CHECK([as $1 ovn-appctl -t ovn-controller exit --restart]) > > + # Make sure ovn-controller stopped so that a future restart will not fail. > > + # Checking debug/status is running is not enough as there might be a small race condition. > > + echo "Waiting for pid $TMPPID" > > + OVS_WAIT_WHILE([kill -0 $TMPPID 2>/dev/null]) > > +]) > > + > > m4_define([OFTABLE_PHY_TO_LOG], [0]) > > m4_define([OFTABLE_LOG_INGRESS_PIPELINE], [8]) > > m4_define([OFTABLE_OUTPUT_LARGE_PKT_DETECT], [37]) > > diff --git a/tests/ovn.at b/tests/ovn.at > > index 2dd0dfd2e..8fa26c192 100644 > > --- a/tests/ovn.at > > +++ b/tests/ovn.at > > @@ -20615,7 +20615,7 @@ echo $expected | ovstest test-ovn expr-to-packets > expected > > OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected]) > > > > # Stop ovn-controller on hv2 with --restart flag > > -as hv2 ovs-appctl -t ovn-controller exit --restart > > +OVN_CONTROLLER_EXIT_RESTART([hv2]) > > > > # Now send the packet again. This time, it should still arrive > > OVS_WAIT_UNTIL([as hv1 ovs-appctl -t ovn-controller inject-pkt "$packet"]) > > @@ -29316,7 +29316,7 @@ check test "$hvt2" -gt 0 > > # Kill ovn-controller on chassis hv3, so that it won't update nb_cfg. > > # Then wait for 9 out of 10 > > sleep 1 > > -check as hv3 ovn-appctl -t ovn-controller exit --restart > > +OVN_CONTROLLER_EXIT_RESTART([hv3]) > > wait_for_ports_up > > ovn-nbctl --wait=sb sync > > wait_row_count Chassis_Private 9 name!=hv3 nb_cfg=2 > > @@ -36252,7 +36252,7 @@ check_tunnel_port hv1 br-int hv2@192.168.0.2%192.168.0.1 > > check_tunnel_port hv2 br-int hv1@192.168.0.1%192.168.0.2 > > > > # Stop ovn-controller on hv1 > > -check as hv1 ovn-appctl -t ovn-controller exit --restart > > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > > > # The tunnel should remain intact > > check_tunnel_port hv1 br-int hv2@192.168.0.2%192.168.0.1 > > @@ -36281,7 +36281,7 @@ check_tunnel_port hv2 br-int1 hv1@192.168.0.1%192.168.0.2 > > check grep -q "Clearing old tunnel port \"ovn-hv1-0\" (hv1@192.168.0.1%192.168.0.2) from bridge \"br-int\"" hv2/ovn-controller.log > > > > # Stop ovn-controller on hv1 > > -check as hv1 ovn-appctl -t ovn-controller exit --restart > > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > > > # The tunnel should remain intact > > check_tunnel_port hv1 br-int1 hv2@192.168.0.2%192.168.0.1 > > @@ -36371,10 +36371,7 @@ prev_id2=$(ovs-vsctl --bare --columns _uuid find port external_ids:ovn-chassis-i > > # The hv2 is running we can remove the override file > > rm -f ${OVN_SYSCONFDIR}/system-id-override > > > > -check ovn-appctl -t ovn-controller exit --restart > > - > > -# Make sure ovn-controller stopped before restarting it > > -OVS_WAIT_UNTIL([test x$(ovn-appctl -t ovn-controller debug/status) != "xrunning"]) > > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > > > # for some reason SSL ovsdb configuration overrides CLI, so > > # delete ssl config from ovsdb to give CLI arguments priority > > @@ -37086,9 +37083,7 @@ AT_CHECK([grep -c "NXT_CT_FLUSH_ZONE" hv1/ovs-vswitchd.log], [0], [dnl > > ]) > > > > AS_BOX([Check conversion from UUID - restart]) > > -ovn-appctl -t ovn-controller exit --restart > > -# Make sure ovn-controller stopped before restarting it > > -OVS_WAIT_UNTIL([test "$(ovn-appctl -t ovn-controller debug/status)" != "running"]) > > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > > > replace_with_uuid lr0 > > replace_with_uuid sw0 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
Hi Mark Thanks for the review. Sorry, I completely lost track of this one. On Thu, Jun 6, 2024 at 10:59 PM Mark Michelson <mmichels@redhat.com> wrote: > Hi Xavier, > > This is the only patch in the series for which I have a recommendation. > The rest all look good to me. > > On 6/4/24 09:10, Xavier Simonart wrote: > > In some rare cases, and despite "recent" changes to wait for cleanup > > before replying to exit, ovn-controller was still running when trying > > to restart it. > > > > Signed-off-by: Xavier Simonart <xsimonar@redhat.com> > > --- > > tests/ovn-macros.at | 9 +++++++++ > > tests/ovn.at | 17 ++++++----------- > > 2 files changed, 15 insertions(+), 11 deletions(-) > > > > diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at > > index 47ada5c70..71a46db8b 100644 > > --- a/tests/ovn-macros.at > > +++ b/tests/ovn-macros.at > > @@ -1107,6 +1107,15 @@ > m4_define([OVN_CHECK_SCAPY_EDNS_CLIENT_SUBNET_SUPPORT], > > AT_SKIP_IF([! echo "from scapy.layers.dns import > EDNS0ClientSubnet" | python 2>&1 > /dev/null]) > > ]) > > > > +m4_define([OVN_CONTROLLER_EXIT_RESTART], > > I think this should just be called "OVN_CONTROLLER_RESTART". The goal is > to restart ovn-controller, so I think the simpler name makes sense. It's > true that we are using an "exit" command for ovn-appctl, but that is > more of an implementation detail than anything else. > > I agree that the name is confusing, but I am not sure that OVN_CONTROLLER_RESTART is better. The macro is executing "ovn-appctl -t ovn-controller exit --restart" (i.e. exit the controller w/o cleaning up databases) , and then wait for controller exit. After the macro is executed, the controller is stopped (exited). ovn-controller restart is executed later, with a different command. OVN_CONTROLLER_RESTART would let me think that the macro is restarting ovn-controller. What about OVN_CONTROLLER_EXIT ? WDYT? > + [TMPPID=$(cat $1/ovn-controller.pid) > > + AT_CHECK([as $1 ovn-appctl -t ovn-controller exit --restart]) > > + # Make sure ovn-controller stopped so that a future restart will not > fail. > > + # Checking debug/status is running is not enough as there might be a > small race condition. > > + echo "Waiting for pid $TMPPID" > > + OVS_WAIT_WHILE([kill -0 $TMPPID 2>/dev/null]) > > +]) > > + > > m4_define([OFTABLE_PHY_TO_LOG], [0]) > > m4_define([OFTABLE_LOG_INGRESS_PIPELINE], [8]) > > m4_define([OFTABLE_OUTPUT_LARGE_PKT_DETECT], [37]) > > diff --git a/tests/ovn.at b/tests/ovn.at > > index 2dd0dfd2e..8fa26c192 100644 > > --- a/tests/ovn.at > > +++ b/tests/ovn.at > > @@ -20615,7 +20615,7 @@ echo $expected | ovstest test-ovn > expr-to-packets > expected > > OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected]) > > > > # Stop ovn-controller on hv2 with --restart flag > > -as hv2 ovs-appctl -t ovn-controller exit --restart > > +OVN_CONTROLLER_EXIT_RESTART([hv2]) > > > > # Now send the packet again. This time, it should still arrive > > OVS_WAIT_UNTIL([as hv1 ovs-appctl -t ovn-controller inject-pkt > "$packet"]) > > @@ -29316,7 +29316,7 @@ check test "$hvt2" -gt 0 > > # Kill ovn-controller on chassis hv3, so that it won't update nb_cfg. > > # Then wait for 9 out of 10 > > sleep 1 > > -check as hv3 ovn-appctl -t ovn-controller exit --restart > > +OVN_CONTROLLER_EXIT_RESTART([hv3]) > > wait_for_ports_up > > ovn-nbctl --wait=sb sync > > wait_row_count Chassis_Private 9 name!=hv3 nb_cfg=2 > > @@ -36252,7 +36252,7 @@ check_tunnel_port hv1 br-int hv2@192.168.0.2 > %192.168.0.1 > > check_tunnel_port hv2 br-int hv1@192.168.0.1%192.168.0.2 > > > > # Stop ovn-controller on hv1 > > -check as hv1 ovn-appctl -t ovn-controller exit --restart > > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > > > # The tunnel should remain intact > > check_tunnel_port hv1 br-int hv2@192.168.0.2%192.168.0.1 > > @@ -36281,7 +36281,7 @@ check_tunnel_port hv2 br-int1 hv1@192.168.0.1 > %192.168.0.2 > > check grep -q "Clearing old tunnel port \"ovn-hv1-0\" (hv1@192.168.0.1%192.168.0.2) > from bridge \"br-int\"" hv2/ovn-controller.log > > > > # Stop ovn-controller on hv1 > > -check as hv1 ovn-appctl -t ovn-controller exit --restart > > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > > > # The tunnel should remain intact > > check_tunnel_port hv1 br-int1 hv2@192.168.0.2%192.168.0.1 > > @@ -36371,10 +36371,7 @@ prev_id2=$(ovs-vsctl --bare --columns _uuid > find port external_ids:ovn-chassis-i > > # The hv2 is running we can remove the override file > > rm -f ${OVN_SYSCONFDIR}/system-id-override > > > > -check ovn-appctl -t ovn-controller exit --restart > > - > > -# Make sure ovn-controller stopped before restarting it > > -OVS_WAIT_UNTIL([test x$(ovn-appctl -t ovn-controller debug/status) != > "xrunning"]) > > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > > > # for some reason SSL ovsdb configuration overrides CLI, so > > # delete ssl config from ovsdb to give CLI arguments priority > > @@ -37086,9 +37083,7 @@ AT_CHECK([grep -c "NXT_CT_FLUSH_ZONE" > hv1/ovs-vswitchd.log], [0], [dnl > > ]) > > > > AS_BOX([Check conversion from UUID - restart]) > > -ovn-appctl -t ovn-controller exit --restart > > -# Make sure ovn-controller stopped before restarting it > > -OVS_WAIT_UNTIL([test "$(ovn-appctl -t ovn-controller debug/status)" != > "running"]) > > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > > > replace_with_uuid lr0 > > replace_with_uuid sw0 > > Thanks Xavier
On 7/26/24 05:59, Xavier Simonart wrote: > Hi Mark > > Thanks for the review. Sorry, I completely lost track of this one. > > On Thu, Jun 6, 2024 at 10:59 PM Mark Michelson <mmichels@redhat.com > <mailto:mmichels@redhat.com>> wrote: > > Hi Xavier, > > This is the only patch in the series for which I have a recommendation. > The rest all look good to me. > > On 6/4/24 09:10, Xavier Simonart wrote: > > In some rare cases, and despite "recent" changes to wait for cleanup > > before replying to exit, ovn-controller was still running when trying > > to restart it. > > > > Signed-off-by: Xavier Simonart <xsimonar@redhat.com > <mailto:xsimonar@redhat.com>> > > --- > > tests/ovn-macros.at <http://ovn-macros.at> | 9 +++++++++ > > tests/ovn.at <http://ovn.at> | 17 ++++++----------- > > 2 files changed, 15 insertions(+), 11 deletions(-) > > > > diff --git a/tests/ovn-macros.at <http://ovn-macros.at> > b/tests/ovn-macros.at <http://ovn-macros.at> > > index 47ada5c70..71a46db8b 100644 > > --- a/tests/ovn-macros.at <http://ovn-macros.at> > > +++ b/tests/ovn-macros.at <http://ovn-macros.at> > > @@ -1107,6 +1107,15 @@ > m4_define([OVN_CHECK_SCAPY_EDNS_CLIENT_SUBNET_SUPPORT], > > AT_SKIP_IF([! echo "from scapy.layers.dns import > EDNS0ClientSubnet" | python 2>&1 > /dev/null]) > > ]) > > > > +m4_define([OVN_CONTROLLER_EXIT_RESTART], > > I think this should just be called "OVN_CONTROLLER_RESTART". The > goal is > to restart ovn-controller, so I think the simpler name makes sense. > It's > true that we are using an "exit" command for ovn-appctl, but that is > more of an implementation detail than anything else. > > I agree that the name is confusing, but I am not sure that > OVN_CONTROLLER_RESTART is better. The macro is executing > "ovn-appctl -t ovn-controller exit --restart" (i.e. exit the controller > w/o cleaning up databases) , and then wait for controller exit. > After the macro is executed, the controller is stopped (exited). > ovn-controller restart is executed later, with a different command. > OVN_CONTROLLER_RESTART would let me think that the macro is restarting > ovn-controller. > What about OVN_CONTROLLER_EXIT ? > WDYT? That's fine by me. > > > + [TMPPID=$(cat $1/ovn-controller.pid) > > + AT_CHECK([as $1 ovn-appctl -t ovn-controller exit --restart]) > > + # Make sure ovn-controller stopped so that a future restart > will not fail. > > + # Checking debug/status is running is not enough as there > might be a small race condition. > > + echo "Waiting for pid $TMPPID" > > + OVS_WAIT_WHILE([kill -0 $TMPPID 2>/dev/null]) > > +]) > > + > > m4_define([OFTABLE_PHY_TO_LOG], [0]) > > m4_define([OFTABLE_LOG_INGRESS_PIPELINE], [8]) > > m4_define([OFTABLE_OUTPUT_LARGE_PKT_DETECT], [37]) > > diff --git a/tests/ovn.at <http://ovn.at> b/tests/ovn.at > <http://ovn.at> > > index 2dd0dfd2e..8fa26c192 100644 > > --- a/tests/ovn.at <http://ovn.at> > > +++ b/tests/ovn.at <http://ovn.at> > > @@ -20615,7 +20615,7 @@ echo $expected | ovstest test-ovn > expr-to-packets > expected > > OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected]) > > > > # Stop ovn-controller on hv2 with --restart flag > > -as hv2 ovs-appctl -t ovn-controller exit --restart > > +OVN_CONTROLLER_EXIT_RESTART([hv2]) > > > > # Now send the packet again. This time, it should still arrive > > OVS_WAIT_UNTIL([as hv1 ovs-appctl -t ovn-controller inject-pkt > "$packet"]) > > @@ -29316,7 +29316,7 @@ check test "$hvt2" -gt 0 > > # Kill ovn-controller on chassis hv3, so that it won't update > nb_cfg. > > # Then wait for 9 out of 10 > > sleep 1 > > -check as hv3 ovn-appctl -t ovn-controller exit --restart > > +OVN_CONTROLLER_EXIT_RESTART([hv3]) > > wait_for_ports_up > > ovn-nbctl --wait=sb sync > > wait_row_count Chassis_Private 9 name!=hv3 nb_cfg=2 > > @@ -36252,7 +36252,7 @@ check_tunnel_port hv1 br-int > hv2@192.168.0.2 <mailto:hv2@192.168.0.2>%192.168.0.1 > > check_tunnel_port hv2 br-int hv1@192.168.0.1 > <mailto:hv1@192.168.0.1>%192.168.0.2 > > > > # Stop ovn-controller on hv1 > > -check as hv1 ovn-appctl -t ovn-controller exit --restart > > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > > > # The tunnel should remain intact > > check_tunnel_port hv1 br-int hv2@192.168.0.2 > <mailto:hv2@192.168.0.2>%192.168.0.1 > > @@ -36281,7 +36281,7 @@ check_tunnel_port hv2 br-int1 > hv1@192.168.0.1 <mailto:hv1@192.168.0.1>%192.168.0.2 > > check grep -q "Clearing old tunnel port \"ovn-hv1-0\" > (hv1@192.168.0.1 <mailto:hv1@192.168.0.1>%192.168.0.2) from bridge > \"br-int\"" hv2/ovn-controller.log > > > > # Stop ovn-controller on hv1 > > -check as hv1 ovn-appctl -t ovn-controller exit --restart > > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > > > # The tunnel should remain intact > > check_tunnel_port hv1 br-int1 hv2@192.168.0.2 > <mailto:hv2@192.168.0.2>%192.168.0.1 > > @@ -36371,10 +36371,7 @@ prev_id2=$(ovs-vsctl --bare --columns > _uuid find port external_ids:ovn-chassis-i > > # The hv2 is running we can remove the override file > > rm -f ${OVN_SYSCONFDIR}/system-id-override > > > > -check ovn-appctl -t ovn-controller exit --restart > > - > > -# Make sure ovn-controller stopped before restarting it > > -OVS_WAIT_UNTIL([test x$(ovn-appctl -t ovn-controller > debug/status) != "xrunning"]) > > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > > > # for some reason SSL ovsdb configuration overrides CLI, so > > # delete ssl config from ovsdb to give CLI arguments priority > > @@ -37086,9 +37083,7 @@ AT_CHECK([grep -c "NXT_CT_FLUSH_ZONE" > hv1/ovs-vswitchd.log], [0], [dnl > > ]) > > > > AS_BOX([Check conversion from UUID - restart]) > > -ovn-appctl -t ovn-controller exit --restart > > -# Make sure ovn-controller stopped before restarting it > > -OVS_WAIT_UNTIL([test "$(ovn-appctl -t ovn-controller > debug/status)" != "running"]) > > +OVN_CONTROLLER_EXIT_RESTART([hv1]) > > > > replace_with_uuid lr0 > > replace_with_uuid sw0 > > Thanks > Xavier
diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at index 47ada5c70..71a46db8b 100644 --- a/tests/ovn-macros.at +++ b/tests/ovn-macros.at @@ -1107,6 +1107,15 @@ m4_define([OVN_CHECK_SCAPY_EDNS_CLIENT_SUBNET_SUPPORT], AT_SKIP_IF([! echo "from scapy.layers.dns import EDNS0ClientSubnet" | python 2>&1 > /dev/null]) ]) +m4_define([OVN_CONTROLLER_EXIT_RESTART], + [TMPPID=$(cat $1/ovn-controller.pid) + AT_CHECK([as $1 ovn-appctl -t ovn-controller exit --restart]) + # Make sure ovn-controller stopped so that a future restart will not fail. + # Checking debug/status is running is not enough as there might be a small race condition. + echo "Waiting for pid $TMPPID" + OVS_WAIT_WHILE([kill -0 $TMPPID 2>/dev/null]) +]) + m4_define([OFTABLE_PHY_TO_LOG], [0]) m4_define([OFTABLE_LOG_INGRESS_PIPELINE], [8]) m4_define([OFTABLE_OUTPUT_LARGE_PKT_DETECT], [37]) diff --git a/tests/ovn.at b/tests/ovn.at index 2dd0dfd2e..8fa26c192 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -20615,7 +20615,7 @@ echo $expected | ovstest test-ovn expr-to-packets > expected OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected]) # Stop ovn-controller on hv2 with --restart flag -as hv2 ovs-appctl -t ovn-controller exit --restart +OVN_CONTROLLER_EXIT_RESTART([hv2]) # Now send the packet again. This time, it should still arrive OVS_WAIT_UNTIL([as hv1 ovs-appctl -t ovn-controller inject-pkt "$packet"]) @@ -29316,7 +29316,7 @@ check test "$hvt2" -gt 0 # Kill ovn-controller on chassis hv3, so that it won't update nb_cfg. # Then wait for 9 out of 10 sleep 1 -check as hv3 ovn-appctl -t ovn-controller exit --restart +OVN_CONTROLLER_EXIT_RESTART([hv3]) wait_for_ports_up ovn-nbctl --wait=sb sync wait_row_count Chassis_Private 9 name!=hv3 nb_cfg=2 @@ -36252,7 +36252,7 @@ check_tunnel_port hv1 br-int hv2@192.168.0.2%192.168.0.1 check_tunnel_port hv2 br-int hv1@192.168.0.1%192.168.0.2 # Stop ovn-controller on hv1 -check as hv1 ovn-appctl -t ovn-controller exit --restart +OVN_CONTROLLER_EXIT_RESTART([hv1]) # The tunnel should remain intact check_tunnel_port hv1 br-int hv2@192.168.0.2%192.168.0.1 @@ -36281,7 +36281,7 @@ check_tunnel_port hv2 br-int1 hv1@192.168.0.1%192.168.0.2 check grep -q "Clearing old tunnel port \"ovn-hv1-0\" (hv1@192.168.0.1%192.168.0.2) from bridge \"br-int\"" hv2/ovn-controller.log # Stop ovn-controller on hv1 -check as hv1 ovn-appctl -t ovn-controller exit --restart +OVN_CONTROLLER_EXIT_RESTART([hv1]) # The tunnel should remain intact check_tunnel_port hv1 br-int1 hv2@192.168.0.2%192.168.0.1 @@ -36371,10 +36371,7 @@ prev_id2=$(ovs-vsctl --bare --columns _uuid find port external_ids:ovn-chassis-i # The hv2 is running we can remove the override file rm -f ${OVN_SYSCONFDIR}/system-id-override -check ovn-appctl -t ovn-controller exit --restart - -# Make sure ovn-controller stopped before restarting it -OVS_WAIT_UNTIL([test x$(ovn-appctl -t ovn-controller debug/status) != "xrunning"]) +OVN_CONTROLLER_EXIT_RESTART([hv1]) # for some reason SSL ovsdb configuration overrides CLI, so # delete ssl config from ovsdb to give CLI arguments priority @@ -37086,9 +37083,7 @@ AT_CHECK([grep -c "NXT_CT_FLUSH_ZONE" hv1/ovs-vswitchd.log], [0], [dnl ]) AS_BOX([Check conversion from UUID - restart]) -ovn-appctl -t ovn-controller exit --restart -# Make sure ovn-controller stopped before restarting it -OVS_WAIT_UNTIL([test "$(ovn-appctl -t ovn-controller debug/status)" != "running"]) +OVN_CONTROLLER_EXIT_RESTART([hv1]) replace_with_uuid lr0 replace_with_uuid sw0
In some rare cases, and despite "recent" changes to wait for cleanup before replying to exit, ovn-controller was still running when trying to restart it. Signed-off-by: Xavier Simonart <xsimonar@redhat.com> --- tests/ovn-macros.at | 9 +++++++++ tests/ovn.at | 17 ++++++----------- 2 files changed, 15 insertions(+), 11 deletions(-)