Message ID | 20221124140430.1026243-1-xsimonar@redhat.com |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev] tests: change recompute test case (reduce time) | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_ovn-kubernetes | success | github build: passed |
ovsrobot/github-robot-_Build_and_Test | fail | github build: failed |
On Thu, Nov 24, 2022 at 3:04 PM Xavier Simonart <xsimonar@redhat.com> wrote: > Recompute test case was checking whether there were no recomputes > when ports were ready to be claimed by ovn-controller but sb was read-only. > > However this was tested by running a test case reproducing how the issue > was > initially identified: running many ports (1000!) on a few hypervisors, > which ** usually ** produced the expected race condition. > > The new test case now simply makes sb to sleep, causing the read-only > condition > to always happen. This dramatically reduces the duration of the test. > > Signed-off-by: Xavier Simonart <xsimonar@redhat.com> > --- > tests/ovn.at | 115 +++++++++++---------------------------------------- > 1 file changed, 25 insertions(+), 90 deletions(-) > > diff --git a/tests/ovn.at b/tests/ovn.at > index fc74aa29a..ce9cf7a1c 100644 > --- a/tests/ovn.at > +++ b/tests/ovn.at > @@ -32779,116 +32779,51 @@ OVN_FOR_EACH_NORTHD([ > AT_SETUP([recomputes]) > ovn_start > > -n_hv=4 > - > -# Add chassis > net_add n1 > -for i in $(seq 1 $n_hv); do > - sim_add hv$i > - as hv$i > - check ovs-vsctl add-br br-phys > - ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys > - ovn_attach n1 br-phys 192.168.0.$i 24 geneve > -done > - > -check ovn-sbctl set connection . inactivity_probe=0 > - > -add_switch_ports() { > - start_port=$1 > - end_port=$2 > - nb_hv=$3 > - bulk_size=$4 > - for ((i=start_port; i<end_port; )) do > - start_bulk=$i > - for hv in $(seq 1 $nb_hv); do > - end_bulk=$((start_bulk+bulk_size-1)) > - for port in $(seq $start_bulk $end_bulk); do > - logical_switch_port=lsp${port} > - OVN_NBCTL(lsp-add ls1 $logical_switch_port) > - OVN_NBCTL(lsp-set-addresses $logical_switch_port dynamic) > - done > - start_bulk=$((end_bulk+1)) > - done > - RUN_OVN_NBCTL() > +sim_add hv1 > +as hv1 > +check ovs-vsctl add-br br-phys > +ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys > +ovn_attach n1 br-phys 192.168.0.1 24 geneve > > - start_bulk=$i > - for hv in $(seq 1 $nb_hv); do > - end_bulk=$((start_bulk+bulk_size-1)) > - for port in $(seq $start_bulk $end_bulk); do > - logical_switch_port=lsp${port} > - as hv$hv ovs-vsctl \ > - --no-wait -- add-port br-int vif${port} \ > - -- set Interface vif${port} > external_ids:iface-id=$logical_switch_port > - done > - start_bulk=$((end_bulk+1)) > - done > - i=$((end_bulk+1)) > - done > -} > check ovn-nbctl ls-add ls1 > check ovn-nbctl set Logical_Switch ls1 other_config:subnet=10.1.0.0/16 > check ovn-nbctl set Logical_Switch ls1 > other_config:exclude_ips=10.1.255.254 > > -check ovn-nbctl lr-add lr1 > -check ovn-nbctl lsp-add ls1 lsp0 -- set Logical_Switch_Port lsp0 > type=router options:router-port=lrp0 addresses=dynamic > -check ovn-nbctl lrp-add lr1 lrp0 "f0:00:00:01:00:01" 10.1.255.254/16 > -check <http://10.1.255.254/16-check> ovn-nbctl lr-nat-add lr1 snat > 10.2.0.1 10.1.0.0/16 > - > -lflow_run=0 > check ovn-nbctl --wait=hv sync > > -# Tunnel ports might not be added (yet) at this point on slow system. > -# Wait for flows related to such ports to ensure those ports have been > added > -# before we measure recomputes. Otherwise, ovs_interface handler might be > run > -# afterwards for tunnel ports, causing recomputes. > -for i in $(seq 1 $n_hv); do > - for j in $(seq 1 $n_hv); do > - if test $i != $j; then > - OVN_WAIT_REMOTE_INPUT_FLOWS(["hv$i"],["hv$j"]) > - fi > - done > -done > +as hv1 > +lflow_run=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run) > > -for i in $(seq 1 $n_hv); do > - as hv$i > - lflow_run1=$(ovn-appctl -t ovn-controller coverage/read-counter > lflow_run) > - lflow_run=`expr $lflow_run1 + $lflow_run` > -done > +check ovn-nbctl lsp-add ls1 lsp1 > +check ovn-nbctl lsp-add ls1 lsp2 > +check ovn-nbctl --wait=hv sync > + > +# Make sb sleep > +AT_CHECK([kill -STOP $(cat ovn-sb/ovsdb-server.pid)]) > > -add_switch_ports 1 1000 $n_hv 5 > +as hv1 > +ovs-vsctl --no-wait -- add-port br-int vif1 \ > + -- set Interface vif1 external_ids:iface-id=lsp1 > +ovs-vsctl --no-wait -- add-port br-int vif2 \ > + -- set Interface vif2 external_ids:iface-id=lsp2 > + > +# Check that ovn-controller handled the binding by checking that some > flows > +# are installed > +OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int | grep output | wc -l` > -eq 2]) > +AT_CHECK([kill -CONT $(cat ovn-sb/ovsdb-server.pid)]) > > wait_for_ports_up > check ovn-nbctl --wait=hv sync > > -for i in $(seq 1 $n_hv); do > - pid=$(cat hv${i}/ovn-controller.pid) > - u=$((u+$(cat "/proc/$pid/stat" | awk '{print $14}'))) > - s=$((s+$(cat "/proc/$pid/stat" | awk '{print $15}'))) > -done > - > -n_pid=$(cat northd/ovn-northd.pid) > -n_u=$(cat "/proc/$pid/stat" | awk '{print $14}') > -n_s=$(cat "/proc/$pid/stat" | awk '{print $15}') > - > -echo "Total Northd User Time: $n_u" > -echo "Total Northd System Time: $n_s" > -echo "Total Controller User Time: $u" > -echo "Total Controller System Time: $s" > +as hv1 > +lflow_run_end=$(ovn-appctl -t ovn-controller coverage/read-counter > lflow_run) > > -lflow_run_end=0 > -for i in $(seq 1 $n_hv); do > - as hv$i > - lflow_run1=$(ovn-appctl -t ovn-controller coverage/read-counter > lflow_run) > - lflow_run_end=`expr $lflow_run1 + $lflow_run_end` > -done > n_recomputes=`expr $lflow_run_end - $lflow_run` > echo "$n_recomputes recomputes" > > AT_CHECK([test $lflow_run_end == $lflow_run]) > > -for i in $(seq 2 $n_hv); do > - OVN_CLEANUP_SBOX([hv$i]) > -done > OVN_CLEANUP([hv1]) > AT_CLEANUP > ]) > -- > 2.31.1 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > Looks good to me, thanks! Acked-by: Ales Musil <amusil@redhat.com>
diff --git a/tests/ovn.at b/tests/ovn.at index fc74aa29a..ce9cf7a1c 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -32779,116 +32779,51 @@ OVN_FOR_EACH_NORTHD([ AT_SETUP([recomputes]) ovn_start -n_hv=4 - -# Add chassis net_add n1 -for i in $(seq 1 $n_hv); do - sim_add hv$i - as hv$i - check ovs-vsctl add-br br-phys - ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys - ovn_attach n1 br-phys 192.168.0.$i 24 geneve -done - -check ovn-sbctl set connection . inactivity_probe=0 - -add_switch_ports() { - start_port=$1 - end_port=$2 - nb_hv=$3 - bulk_size=$4 - for ((i=start_port; i<end_port; )) do - start_bulk=$i - for hv in $(seq 1 $nb_hv); do - end_bulk=$((start_bulk+bulk_size-1)) - for port in $(seq $start_bulk $end_bulk); do - logical_switch_port=lsp${port} - OVN_NBCTL(lsp-add ls1 $logical_switch_port) - OVN_NBCTL(lsp-set-addresses $logical_switch_port dynamic) - done - start_bulk=$((end_bulk+1)) - done - RUN_OVN_NBCTL() +sim_add hv1 +as hv1 +check ovs-vsctl add-br br-phys +ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys +ovn_attach n1 br-phys 192.168.0.1 24 geneve - start_bulk=$i - for hv in $(seq 1 $nb_hv); do - end_bulk=$((start_bulk+bulk_size-1)) - for port in $(seq $start_bulk $end_bulk); do - logical_switch_port=lsp${port} - as hv$hv ovs-vsctl \ - --no-wait -- add-port br-int vif${port} \ - -- set Interface vif${port} external_ids:iface-id=$logical_switch_port - done - start_bulk=$((end_bulk+1)) - done - i=$((end_bulk+1)) - done -} check ovn-nbctl ls-add ls1 check ovn-nbctl set Logical_Switch ls1 other_config:subnet=10.1.0.0/16 check ovn-nbctl set Logical_Switch ls1 other_config:exclude_ips=10.1.255.254 -check ovn-nbctl lr-add lr1 -check ovn-nbctl lsp-add ls1 lsp0 -- set Logical_Switch_Port lsp0 type=router options:router-port=lrp0 addresses=dynamic -check ovn-nbctl lrp-add lr1 lrp0 "f0:00:00:01:00:01" 10.1.255.254/16 -check ovn-nbctl lr-nat-add lr1 snat 10.2.0.1 10.1.0.0/16 - -lflow_run=0 check ovn-nbctl --wait=hv sync -# Tunnel ports might not be added (yet) at this point on slow system. -# Wait for flows related to such ports to ensure those ports have been added -# before we measure recomputes. Otherwise, ovs_interface handler might be run -# afterwards for tunnel ports, causing recomputes. -for i in $(seq 1 $n_hv); do - for j in $(seq 1 $n_hv); do - if test $i != $j; then - OVN_WAIT_REMOTE_INPUT_FLOWS(["hv$i"],["hv$j"]) - fi - done -done +as hv1 +lflow_run=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run) -for i in $(seq 1 $n_hv); do - as hv$i - lflow_run1=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run) - lflow_run=`expr $lflow_run1 + $lflow_run` -done +check ovn-nbctl lsp-add ls1 lsp1 +check ovn-nbctl lsp-add ls1 lsp2 +check ovn-nbctl --wait=hv sync + +# Make sb sleep +AT_CHECK([kill -STOP $(cat ovn-sb/ovsdb-server.pid)]) -add_switch_ports 1 1000 $n_hv 5 +as hv1 +ovs-vsctl --no-wait -- add-port br-int vif1 \ + -- set Interface vif1 external_ids:iface-id=lsp1 +ovs-vsctl --no-wait -- add-port br-int vif2 \ + -- set Interface vif2 external_ids:iface-id=lsp2 + +# Check that ovn-controller handled the binding by checking that some flows +# are installed +OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int | grep output | wc -l` -eq 2]) +AT_CHECK([kill -CONT $(cat ovn-sb/ovsdb-server.pid)]) wait_for_ports_up check ovn-nbctl --wait=hv sync -for i in $(seq 1 $n_hv); do - pid=$(cat hv${i}/ovn-controller.pid) - u=$((u+$(cat "/proc/$pid/stat" | awk '{print $14}'))) - s=$((s+$(cat "/proc/$pid/stat" | awk '{print $15}'))) -done - -n_pid=$(cat northd/ovn-northd.pid) -n_u=$(cat "/proc/$pid/stat" | awk '{print $14}') -n_s=$(cat "/proc/$pid/stat" | awk '{print $15}') - -echo "Total Northd User Time: $n_u" -echo "Total Northd System Time: $n_s" -echo "Total Controller User Time: $u" -echo "Total Controller System Time: $s" +as hv1 +lflow_run_end=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run) -lflow_run_end=0 -for i in $(seq 1 $n_hv); do - as hv$i - lflow_run1=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run) - lflow_run_end=`expr $lflow_run1 + $lflow_run_end` -done n_recomputes=`expr $lflow_run_end - $lflow_run` echo "$n_recomputes recomputes" AT_CHECK([test $lflow_run_end == $lflow_run]) -for i in $(seq 2 $n_hv); do - OVN_CLEANUP_SBOX([hv$i]) -done OVN_CLEANUP([hv1]) AT_CLEANUP ])
Recompute test case was checking whether there were no recomputes when ports were ready to be claimed by ovn-controller but sb was read-only. However this was tested by running a test case reproducing how the issue was initially identified: running many ports (1000!) on a few hypervisors, which ** usually ** produced the expected race condition. The new test case now simply makes sb to sleep, causing the read-only condition to always happen. This dramatically reduces the duration of the test. Signed-off-by: Xavier Simonart <xsimonar@redhat.com> --- tests/ovn.at | 115 +++++++++++---------------------------------------- 1 file changed, 25 insertions(+), 90 deletions(-)