Message ID | 1483438205-31110-3-git-send-email-peterx@redhat.com |
---|---|
State | New |
Headers | show |
On Tue, Jan 03, 2017 at 06:10:05PM +0800, Peter Xu wrote: [...] > diff --git a/scripts/global.bash b/scripts/global.bash > index 77b0b29..52095bd 100644 > --- a/scripts/global.bash > +++ b/scripts/global.bash > @@ -1,2 +1,13 @@ > : ${ut_log_dir:=logs} > : ${ut_log_summary:=${ut_log_dir}/SUMMARY} > +: ${ut_run_queues:=1} > + > +function ut_in_parallel() > +{ > + [[ $ut_run_queues != 1 ]] > +} > + This function is useless now. Should remove above five lines. Sorry. -- peterx
2017-01-03 18:10+0800, Peter Xu: > run_task.sh is getting slow. This patch is trying to make it faster by > running the tests concurrently. > > We provide a new parameter "-j" for the run_tests.sh, which can be used > to specify how many run queues we want for the tests. Default queue > length is 1, which is the old behavior. > > Quick test on my laptop (4 cores, 2 threads each) shows 3x speed boost: > > |-----------------+-----------| > | command | time used | > |-----------------+-----------| > | run_test.sh | 75s | > | run_test.sh -j8 | 27s | > |-----------------+-----------| > > Suggested-by: Radim Krčmář <rkrcmar@redhat.com> > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > run_tests.sh | 12 ++++++++++-- > scripts/functions.bash | 15 ++++++++++++++- > scripts/global.bash | 11 +++++++++++ > 3 files changed, 35 insertions(+), 3 deletions(-) I like this diffstat a lot more, thanks :) The script doesn't handle ^C well now (at least), which can be worked around with trap exit SIGINT but it would be nice to know if receiving signals in `wait` can't be fixed.
On Wed, Jan 04, 2017 at 04:09:39PM +0100, Radim Krčmář wrote: > 2017-01-03 18:10+0800, Peter Xu: > > run_task.sh is getting slow. This patch is trying to make it faster by > > running the tests concurrently. > > > > We provide a new parameter "-j" for the run_tests.sh, which can be used > > to specify how many run queues we want for the tests. Default queue > > length is 1, which is the old behavior. > > > > Quick test on my laptop (4 cores, 2 threads each) shows 3x speed boost: > > > > |-----------------+-----------| > > | command | time used | > > |-----------------+-----------| > > | run_test.sh | 75s | > > | run_test.sh -j8 | 27s | > > |-----------------+-----------| > > > > Suggested-by: Radim Krčmář <rkrcmar@redhat.com> > > Signed-off-by: Peter Xu <peterx@redhat.com> > > --- > > run_tests.sh | 12 ++++++++++-- > > scripts/functions.bash | 15 ++++++++++++++- > > scripts/global.bash | 11 +++++++++++ > > 3 files changed, 35 insertions(+), 3 deletions(-) > > I like this diffstat a lot more, thanks :) > > The script doesn't handle ^C well now (at least), which can be worked > around with > > trap exit SIGINT > > but it would be nice to know if receiving signals in `wait` can't be > fixed. When I send SIGINT to "run_tests.sh -j8", I see process hang dead. Not sure whether you see the same thing: #0 0x00007f7af2e1559a in waitpid () from /lib64/libc.so.6 #1 0x00005613edf8953e in waitchld.isra () #2 0x00005613edf8aae5 in wait_for () #3 0x00005613edf8b682 in wait_for_any_job () #4 0x00005613edfc7e64 in wait_builtin () #5 0x00005613edf616ea in execute_builtin.isra () #6 0x00005613edf623ee in execute_simple_command () #7 0x00005613edf79e77 in execute_command_internal () #8 0x00005613edf7b972 in execute_command () #9 0x00005613edf62aca in execute_while_or_until () #10 0x00005613edf7a156 in execute_command_internal () #11 0x00005613edf79d88 in execute_command_internal () ... If I change the "wait -n" into "wait" (this will work, but of course slower since we'll wait for all subprocesses end before we start another one), problem disappears. Not sure whether that means a "wait -n" bug. Anyway, IMHO squashing you suggestion of "trap exit SIGINT" at the entry of for_each_unittest() is an acceptable solution - it works in all cases. Thanks, -- peterx
2017-01-05 11:07+0800, Peter Xu: > On Wed, Jan 04, 2017 at 04:09:39PM +0100, Radim Krčmář wrote: >> 2017-01-03 18:10+0800, Peter Xu: >> > run_task.sh is getting slow. This patch is trying to make it faster by >> > running the tests concurrently. >> > >> > We provide a new parameter "-j" for the run_tests.sh, which can be used >> > to specify how many run queues we want for the tests. Default queue >> > length is 1, which is the old behavior. >> > >> > Quick test on my laptop (4 cores, 2 threads each) shows 3x speed boost: >> > >> > |-----------------+-----------| >> > | command | time used | >> > |-----------------+-----------| >> > | run_test.sh | 75s | >> > | run_test.sh -j8 | 27s | >> > |-----------------+-----------| >> > >> > Suggested-by: Radim Krčmář <rkrcmar@redhat.com> >> > Signed-off-by: Peter Xu <peterx@redhat.com> >> > --- >> > run_tests.sh | 12 ++++++++++-- >> > scripts/functions.bash | 15 ++++++++++++++- >> > scripts/global.bash | 11 +++++++++++ >> > 3 files changed, 35 insertions(+), 3 deletions(-) >> >> I like this diffstat a lot more, thanks :) >> >> The script doesn't handle ^C well now (at least), which can be worked >> around with >> >> trap exit SIGINT >> >> but it would be nice to know if receiving signals in `wait` can't be >> fixed. > > When I send SIGINT to "run_tests.sh -j8", I see process hang dead. Not > sure whether you see the same thing: > > #0 0x00007f7af2e1559a in waitpid () from /lib64/libc.so.6 > #1 0x00005613edf8953e in waitchld.isra () > #2 0x00005613edf8aae5 in wait_for () > #3 0x00005613edf8b682 in wait_for_any_job () > #4 0x00005613edfc7e64 in wait_builtin () > #5 0x00005613edf616ea in execute_builtin.isra () > #6 0x00005613edf623ee in execute_simple_command () > #7 0x00005613edf79e77 in execute_command_internal () > #8 0x00005613edf7b972 in execute_command () > #9 0x00005613edf62aca in execute_while_or_until () > #10 0x00005613edf7a156 in execute_command_internal () > #11 0x00005613edf79d88 in execute_command_internal () > ... I do. And sometimes, I also caught it in a signal handler: #0 0x00007f7461bcd637 in kill () at ../sysdeps/unix/syscall-template.S:84 #1 0x00000000004476b9 in wait_sigint_handler (sig=<optimized out>) at jobs.c:2504 #2 <signal handler called> #3 0x00007f7461c674ca in __GI___waitpid (pid=pid@entry=-1, stat_loc=stat_loc@entry=0x7ffdb49dfbd8, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:29 #4 0x0000000000449cfb in waitchld (block=block@entry=1, wpid=-1) at jobs.c:3474 #5 0x000000000044b2eb in wait_for (pid=pid@entry=-1) at jobs.c:2718 #6 0x000000000044becd in wait_for_any_job () at jobs.c:3015 #7 0x000000000048c38b in wait_builtin (list=0x0) at ./wait.def:154 ... > > If I change the "wait -n" into "wait" (this will work, but of course > slower since we'll wait for all subprocesses end before we start > another one), problem disappears. Yeah, and just `wait` doesn't work with -j1 ... using `jobs -r` gets rid of it. > Not sure whether that means a "wait > -n" bug. Most likely a bash bug: - doesn't happen if you SIGINT before the first job has finished - doesn't happen with normal wait > Anyway, IMHO squashing you suggestion of "trap exit SIGINT" at the > entry of for_each_unittest() is an acceptable solution - it works in > all cases. Seems like the best solution at this time ... We actually want to "exit 130", as it would when using normal wait does, and maybe it could be improved a bit by adding a wait: trap 'wait; exit 130' SIGINT
On Thu, Jan 05, 2017 at 08:44:02PM +0100, Radim Krčmář wrote: [...] > > Anyway, IMHO squashing you suggestion of "trap exit SIGINT" at the > > entry of for_each_unittest() is an acceptable solution - it works in > > all cases. > > Seems like the best solution at this time ... > We actually want to "exit 130", as it would when using normal wait does, > and maybe it could be improved a bit by adding a wait: > > trap 'wait; exit 130' SIGINT Sounds better, thanks! Will repost with this one. -- peterx
diff --git a/run_tests.sh b/run_tests.sh index e1bb3a6..a4fc895 100755 --- a/run_tests.sh +++ b/run_tests.sh @@ -14,10 +14,11 @@ function usage() { cat <<EOF -Usage: $0 [-g group] [-h] [-v] +Usage: $0 [-g group] [-h] [-v] [-j N] -g: Only execute tests in the given group -h: Output this help text + -j: Execute tests in parallel -v: Enables verbose mode Set the environment variable QEMU=/path/to/qemu-system-ARCH to @@ -29,7 +30,7 @@ EOF RUNTIME_arch_run="./$TEST_DIR/run" source scripts/runtime.bash -while getopts "g:hv" opt; do +while getopts "g:hj:v" opt; do case $opt in g) only_group=$OPTARG @@ -38,6 +39,13 @@ while getopts "g:hv" opt; do usage exit ;; + j) + ut_run_queues=$OPTARG + if ! is_number "$ut_run_queues"; then + echo "Invalid -j option: $ut_run_queues" + exit 1 + fi + ;; v) verbose="yes" ;; diff --git a/scripts/functions.bash b/scripts/functions.bash index d1d2e1c..acb306b 100644 --- a/scripts/functions.bash +++ b/scripts/functions.bash @@ -1,9 +1,18 @@ +source scripts/global.bash + function run_task() { local testname="$2" + while [[ "$(jobs | wc -l)" == $ut_run_queues ]]; do + # wait for any background test to finish + wait -n + done + RUNTIME_log_file="${ut_log_dir}/${testname}.log" - "$@" + + # start the testcase in the background + "$@" & } function for_each_unittest() @@ -53,5 +62,9 @@ function for_each_unittest() fi done run_task "$cmd" "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch" "$check" "$accel" "$timeout" + + # wait all task finish + wait + exec {fd}<&- } diff --git a/scripts/global.bash b/scripts/global.bash index 77b0b29..52095bd 100644 --- a/scripts/global.bash +++ b/scripts/global.bash @@ -1,2 +1,13 @@ : ${ut_log_dir:=logs} : ${ut_log_summary:=${ut_log_dir}/SUMMARY} +: ${ut_run_queues:=1} + +function ut_in_parallel() +{ + [[ $ut_run_queues != 1 ]] +} + +function is_number() +{ + [[ "$1" =~ ^[0-9]+$ ]] +}
run_task.sh is getting slow. This patch is trying to make it faster by running the tests concurrently. We provide a new parameter "-j" for the run_tests.sh, which can be used to specify how many run queues we want for the tests. Default queue length is 1, which is the old behavior. Quick test on my laptop (4 cores, 2 threads each) shows 3x speed boost: |-----------------+-----------| | command | time used | |-----------------+-----------| | run_test.sh | 75s | | run_test.sh -j8 | 27s | |-----------------+-----------| Suggested-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> --- run_tests.sh | 12 ++++++++++-- scripts/functions.bash | 15 ++++++++++++++- scripts/global.bash | 11 +++++++++++ 3 files changed, 35 insertions(+), 3 deletions(-)