[bpf-next] selftests/bpf: fix possible hang in sockopt_inherit

Message ID 20200715224107.3591967-1-sdf@google.com
State Accepted
Delegated to: BPF Maintainers

Commit Message

Stanislav Fomichev July 15, 2020, 10:41 p.m. UTC
Andrii reported that sockopt_inherit occasionally hangs up on 5.5 kernel [0].
This can happen if server_thread runs faster than the main thread.
In that case, pthread_cond_wait will wait forever because
pthread_cond_signal was executed before the main thread was blocking.
Let's move pthread_mutex_lock up a bit to make sure server_thread
runs strictly after the main thread goes to sleep.

(Not sure why this is 5.5 specific, maybe scheduling is less
deterministic? But I was able to confirm that it does indeed
happen in a VM.)
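
The locking order described above can be sketched in isolation. This is a minimal illustration, not the selftest's code: `worker`, `start_and_wait`, `started_mtx` and `started` are hypothetical names standing in for `server_thread`, `run_test`, `server_started_mtx` and `server_started`. Because the mutex is taken before `pthread_create`, the new thread cannot reach `pthread_cond_signal` until the main thread has released the mutex inside `pthread_cond_wait`, so the signal cannot be lost.

```c
#include <pthread.h>

static pthread_mutex_t started_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t started = PTHREAD_COND_INITIALIZER;

/* Hypothetical stand-in for server_thread: announce startup, then exit. */
static void *worker(void *arg)
{
	(void)arg;
	/* Blocks until the main thread drops the mutex in pthread_cond_wait(). */
	pthread_mutex_lock(&started_mtx);
	pthread_cond_signal(&started);
	pthread_mutex_unlock(&started_mtx);
	return NULL;
}

/* Returns 0 after waiting on the condition, -1 if thread creation failed. */
static int start_and_wait(void)
{
	pthread_t tid;

	/* Take the mutex first so the worker cannot signal too early. */
	pthread_mutex_lock(&started_mtx);
	if (pthread_create(&tid, NULL, worker, NULL)) {
		/* Assumption of this sketch: release the mutex on failure. */
		pthread_mutex_unlock(&started_mtx);
		return -1;
	}
	/* Atomically releases the mutex and sleeps until signalled. */
	pthread_cond_wait(&started, &started_mtx);
	pthread_mutex_unlock(&started_mtx);

	pthread_join(tid, NULL);
	return 0;
}
```

Calling `start_and_wait()` from a `main` function and linking with `-pthread` is enough to try it. Like the patch, the sketch waits without a predicate, so it does not guard against a spurious wakeup from `pthread_cond_wait`; in that case it would still return, only earlier than intended.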

[0] https://lore.kernel.org/bpf/CAEf4BzY0-bVNHmCkMFPgObs=isUAyg-dFzGDY7QWYkmm7rmTSg@mail.gmail.com/

Reported-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/testing/selftests/bpf/prog_tests/sockopt_inherit.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Andrii Nakryiko July 16, 2020, 4:41 a.m. UTC | #1
On Wed, Jul 15, 2020 at 3:41 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> Andrii reported that sockopt_inherit occasionally hangs up on 5.5 kernel [0].
> This can happen if server_thread runs faster than the main thread.
> In that case, pthread_cond_wait will wait forever because
> pthread_cond_signal was executed before the main thread was blocking.
> Let's move pthread_mutex_lock up a bit to make sure server_thread
> runs strictly after the main thread goes to sleep.
>
> (Not sure why this is 5.5 specific, maybe scheduling is less
> deterministic? But I was able to confirm that it does indeed
> happen in a VM.)
>
> [0] https://lore.kernel.org/bpf/CAEf4BzY0-bVNHmCkMFPgObs=isUAyg-dFzGDY7QWYkmm7rmTSg@mail.gmail.com/
>
> Reported-by: Andrii Nakryiko <andriin@fb.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---

Great, thanks for figuring this out! Hopefully this is it.

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  tools/testing/selftests/bpf/prog_tests/sockopt_inherit.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt_inherit.c b/tools/testing/selftests/bpf/prog_tests/sockopt_inherit.c
> index 8547ecbdc61f..ec281b0363b8 100644
> --- a/tools/testing/selftests/bpf/prog_tests/sockopt_inherit.c
> +++ b/tools/testing/selftests/bpf/prog_tests/sockopt_inherit.c
> @@ -193,11 +193,10 @@ static void run_test(int cgroup_fd)
>         if (CHECK_FAIL(server_fd < 0))
>                 goto close_bpf_object;
>
> +       pthread_mutex_lock(&server_started_mtx);
>         if (CHECK_FAIL(pthread_create(&tid, NULL, server_thread,
>                                       (void *)&server_fd)))
>                 goto close_server_fd;
> -
> -       pthread_mutex_lock(&server_started_mtx);
>         pthread_cond_wait(&server_started, &server_started_mtx);
>         pthread_mutex_unlock(&server_started_mtx);
>
> --
> 2.27.0.389.gc38d7665816-goog
>
Daniel Borkmann July 16, 2020, 7:02 p.m. UTC | #2
On 7/16/20 12:41 AM, Stanislav Fomichev wrote:
> Andrii reported that sockopt_inherit occasionally hangs up on 5.5 kernel [0].
> This can happen if server_thread runs faster than the main thread.
> In that case, pthread_cond_wait will wait forever because
> pthread_cond_signal was executed before the main thread was blocking.
> Let's move pthread_mutex_lock up a bit to make sure server_thread
> runs strictly after the main thread goes to sleep.
> 
> (Not sure why this is 5.5 specific, maybe scheduling is less
> deterministic? But I was able to confirm that it does indeed
> happen in a VM.)
> 
> [0] https://lore.kernel.org/bpf/CAEf4BzY0-bVNHmCkMFPgObs=isUAyg-dFzGDY7QWYkmm7rmTSg@mail.gmail.com/
> 
> Reported-by: Andrii Nakryiko <andriin@fb.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>

Applied, thanks!

Patch

diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt_inherit.c b/tools/testing/selftests/bpf/prog_tests/sockopt_inherit.c
index 8547ecbdc61f..ec281b0363b8 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockopt_inherit.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockopt_inherit.c
@@ -193,11 +193,10 @@  static void run_test(int cgroup_fd)
 	if (CHECK_FAIL(server_fd < 0))
 		goto close_bpf_object;
 
+	pthread_mutex_lock(&server_started_mtx);
 	if (CHECK_FAIL(pthread_create(&tid, NULL, server_thread,
 				      (void *)&server_fd)))
 		goto close_server_fd;
-
-	pthread_mutex_lock(&server_started_mtx);
 	pthread_cond_wait(&server_started, &server_started_mtx);
 	pthread_mutex_unlock(&server_started_mtx);
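
For comparison, a different and commonly used way to avoid a lost wakeup is to guard the wait with a flag. This is not what the patch does; it is a sketch under the assumption that a shared boolean may be added, and `ready`, `ready_mtx`, `ready_cond`, `flag_worker` and `start_and_wait_flag` are all hypothetical names. With the flag, the order in which the two threads reach the mutex no longer matters, and a spurious wakeup just re-checks the flag.

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t ready_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cond = PTHREAD_COND_INITIALIZER;
static bool ready; /* hypothetical flag, protected by ready_mtx */

/* Hypothetical worker: record that it started, then wake any waiter. */
static void *flag_worker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&ready_mtx);
	ready = true;
	pthread_cond_signal(&ready_cond);
	pthread_mutex_unlock(&ready_mtx);
	return NULL;
}

/* Returns 0 once the flag is set, -1 if thread creation failed. */
static int start_and_wait_flag(void)
{
	pthread_t tid;

	if (pthread_create(&tid, NULL, flag_worker, NULL))
		return -1;

	pthread_mutex_lock(&ready_mtx);
	/* If the worker already ran, ready is true and the wait is skipped. */
	while (!ready)
		pthread_cond_wait(&ready_cond, &ready_mtx);
	pthread_mutex_unlock(&ready_mtx);

	pthread_join(tid, NULL);
	return 0;
}
```

The trade-off is one extra variable in exchange for not depending on which thread runs first.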