diff mbox

Increase fork signal safety for single-threaded processes [BZ #19703]

Message ID 20160511134121.CDFB841B3C727@oldenburg.str.redhat.com
State New
Headers show

Commit Message

Florian Weimer May 11, 2016, 1:41 p.m. UTC
This provides a band-aid and addresses the scenario where fork is
called from a signal handler while the process is in the malloc
subsystem (or has acquired the libio list lock).  It does not
address the general issue of async-signal-safety of fork;
multi-threaded processes are not covered, and some glibc
subsystems have fork handlers which are not async-signal-safe.

2016-05-11  Florian Weimer  <fweimer@redhat.com>

	[BZ #19703]
	Partially async-signal-safe fork for single-threaded processes.
	* sysdeps/nptl/fork.c (__libc_fork): Introduce multiple_threads
	variable.  Do not acquire and reset/release malloc and libio locks
	in single-threaded processes.
	* malloc/tst-mallocfork2.c: New file.
	* malloc/Makefile (tests): Add it.

Comments

Tulio Magno Quites Machado Filho May 11, 2016, 7:59 p.m. UTC | #1
Florian Weimer <fweimer@redhat.com> writes:

> This provides a band-aid and addresses the scenario where fork is
> called from a signal handler while the process is in the malloc
> subsystem (or has acquired the libio list lock).  It does not
> address the general issue of async-signal-safety of fork;
> multi-threaded processes are not covered, and some glibc
> subsystems have fork handlers which are not async-signal-safe.
>
> 2016-05-11  Florian Weimer  <fweimer@redhat.com>
>
> 	[BZ #19703]
> 	Partially async-signal-safe fork for single-threaded processes.
> 	* sysdeps/nptl/fork.c (__libc_fork): Introduce multiple_threads
> 	variable.  Do not acquire and reset/release malloc and libio locks
> 	in single-threaded processes.
> 	* malloc/tst-mallocfork2.c: New file.

Do you have plans to remove malloc/tst-mallocfork?

> diff --git a/malloc/Makefile b/malloc/Makefile
> index 3283f4f..fa1730e 100644
> 
> +  sigusr1_sender_pid = fork ();
> +  if (sigusr1_sender_pid == 0)
> +    signal_sender (SIGUSR1, false);
> +  pid_t sigusr2_sender_pid = fork ();
> +  if (sigusr2_sender_pid == 0)
> +    signal_sender (SIGUSR2, true);

I suggest to fork the SIGUSR2 sender first.
Sometimes the SIGUSR1 sender floods the parent to the point it takes seconds
to fork the second sender.
Florian Weimer May 12, 2016, 7:06 a.m. UTC | #2
On 05/11/2016 09:59 PM, Tulio Magno Quites Machado Filho wrote:
> Florian Weimer <fweimer@redhat.com> writes:
>
>> This provides a band-aid and addresses the scenario where fork is
>> called from a signal handler while the process is in the malloc
>> subsystem (or has acquired the libio list lock).  It does not
>> address the general issue of async-signal-safety of fork;
>> multi-threaded processes are not covered, and some glibc
>> subsystems have fork handlers which are not async-signal-safe.
>>
>> 2016-05-11  Florian Weimer  <fweimer@redhat.com>
>>
>> 	[BZ #19703]
>> 	Partially async-signal-safe fork for single-threaded processes.
>> 	* sysdeps/nptl/fork.c (__libc_fork): Introduce multiple_threads
>> 	variable.  Do not acquire and reset/release malloc and libio locks
>> 	in single-threaded processes.
>> 	* malloc/tst-mallocfork2.c: New file.
>
> Do you have plans to remove malloc/tst-mallocfork?

No, I would just change exit to _exit there.

>> diff --git a/malloc/Makefile b/malloc/Makefile
>> index 3283f4f..fa1730e 100644
>>
>> +  sigusr1_sender_pid = fork ();
>> +  if (sigusr1_sender_pid == 0)
>> +    signal_sender (SIGUSR1, false);
>> +  pid_t sigusr2_sender_pid = fork ();
>> +  if (sigusr2_sender_pid == 0)
>> +    signal_sender (SIGUSR2, true);
>
> I suggest to fork the SIGUSR2 sender first.
> Sometimes the SIGUSR1 sender floods the parent to the point it takes seconds
> to fork the second sender.

Interesting.  I have made the change locally, as you suggested.

Any comments on the change to the fork implementation?

Thanks,
Florian
Tulio Magno Quites Machado Filho May 12, 2016, 12:42 p.m. UTC | #3
Florian Weimer <fweimer@redhat.com> writes:

> On 05/11/2016 09:59 PM, Tulio Magno Quites Machado Filho wrote:
>> Florian Weimer <fweimer@redhat.com> writes:
>>
>> Do you have plans to remove malloc/tst-mallocfork?
>
> No, I would just change exit to _exit there.

Ack.

> Any comments on the change to the fork implementation?

No.  LGTM.
Florian Weimer May 13, 2016, 2:40 p.m. UTC | #4
On 05/11/2016 09:59 PM, Tulio Magno Quites Machado Filho wrote:
> I suggest to fork the SIGUSR2 sender first.
> Sometimes the SIGUSR1 sender floods the parent to the point it takes seconds
> to fork the second sender.

I believe this is a different issue: sigusr1_sender_pid may be 0 if 
SIGUSR1 arrives very early, and we send SIGSTOP to the current process 
group, not just the subprocess.  This is consistent with no output at 
all from the test program (which is what I saw in this case).

I will commit a fix for this test issue shortly.

Florian
diff mbox

Patch

diff --git a/malloc/Makefile b/malloc/Makefile
index 3283f4f..fa1730e 100644
--- a/malloc/Makefile
+++ b/malloc/Makefile
@@ -29,7 +29,8 @@  tests := mallocbug tst-malloc tst-valloc tst-calloc tst-obstack \
 	 tst-malloc-usable tst-realloc tst-posix_memalign \
 	 tst-pvalloc tst-memalign tst-mallopt tst-scratch_buffer \
 	 tst-malloc-backtrace tst-malloc-thread-exit \
-	 tst-malloc-thread-fail tst-malloc-fork-deadlock
+	 tst-malloc-thread-fail tst-malloc-fork-deadlock \
+	 tst-mallocfork2
 test-srcs = tst-mtrace
 
 routines = malloc morecore mcheck mtrace obstack \
diff --git a/malloc/tst-mallocfork2.c b/malloc/tst-mallocfork2.c
new file mode 100644
index 0000000..237da92
--- /dev/null
+++ b/malloc/tst-mallocfork2.c
@@ -0,0 +1,212 @@ 
+/* Test case for async-signal-safe fork (with respect to malloc).
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public License as
+   published by the Free Software Foundation; either version 2.1 of the
+   License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; see the file COPYING.LIB.  If
+   not, see <http://www.gnu.org/licenses/>.  */
+
+/* This test will fail if the process is multi-threaded because we
+   only have an async-signal-safe fork in the single-threaded case
+   (where we skip acquiring the malloc heap locks).
+
+   This test only checks async-signal-safety with regards to malloc;
+   other, more rarely-used glibc subsystems could have locks which
+   still make fork unsafe, even in single-threaded processes.  */
+
+#include <errno.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/wait.h>
+#include <time.h>
+#include <unistd.h>
+
+/* How many malloc objects to keep arond.  */
+enum { malloc_objects = 1009 };
+
+/* The maximum size of an object.  */
+enum { malloc_maximum_size = 70000 };
+
+/* How many signals need to be delivered before the test exits.  */
+enum { signal_count = 1000 };
+
+
+/* Process ID of the subprocess which sends SIGUSR1 signals.  */
+static pid_t sigusr1_sender_pid;
+
+/* Set to 1 if SIGUSR1 is received.  Used to detect a signal during
+   malloc/free.  */
+static volatile sig_atomic_t sigusr1_received;
+
+/* Periodically set to 1, to indicate that the process is making
+   progress.  Checked by liveness_signal_handler.  */
+static volatile sig_atomic_t progress_indicator = 1;
+
+/* Write the message to standard output.  Usable from signal
+   handlers.  */
+static void
+write_message (const char *str)
+{
+  write (STDOUT_FILENO, str, strlen (str));
+}
+
+static void
+sigusr1_handler (int signo)
+{
+  /* Let the main program make progress, by temporarily suspending
+     signals from the subprocess.  */
+  if (sigusr1_received)
+    return;
+  if (kill (sigusr1_sender_pid, SIGSTOP) != 0)
+    {
+      write_message ("error: kill (SIGSTOP)\n");
+      abort ();
+    }
+  sigusr1_received = 1;
+
+  /* Perform a fork with a trivial subprocess.  */
+  pid_t pid = fork ();
+  if (pid == -1)
+    {
+      write_message ("error: fork\n");
+      abort ();
+    }
+  if (pid == 0)
+    _exit (0);
+  int status;
+  int ret = TEMP_FAILURE_RETRY (waitpid (pid, &status, 0));
+  if (ret < 0)
+    {
+      write_message ("error: waitpid\n");
+      abort ();
+    }
+  if (status != 0)
+    {
+      write_message ("error: unexpected exit status from subprocess\n");
+      abort ();
+    }
+}
+
+static void
+liveness_signal_handler (int signo)
+{
+  if (progress_indicator)
+    progress_indicator = 0;
+  else
+    write_message ("warning: process seems to be stuck\n");
+}
+
+static void
+__attribute__ ((noreturn))
+signal_sender (int signo, bool sleep)
+{
+  pid_t target = getppid ();
+  while (true)
+    {
+      if (kill (target, signo) != 0)
+        {
+          dprintf (STDOUT_FILENO, "error: kill: %m\n");
+          abort ();
+        }
+      if (sleep)
+        usleep (1 * 1000 * 1000);
+    }
+}
+
+static int
+do_test (void)
+{
+  struct sigaction action =
+    {
+      .sa_handler = sigusr1_handler,
+    };
+  sigemptyset (&action.sa_mask);
+
+  if (sigaction (SIGUSR1, &action, NULL) != 0)
+    {
+      printf ("error: sigaction: %m");
+      return 1;
+    }
+
+  action.sa_handler = liveness_signal_handler;
+  if (sigaction (SIGUSR2, &action, NULL) != 0)
+    {
+      printf ("error: sigaction: %m");
+      return 1;
+    }
+
+  sigusr1_sender_pid = fork ();
+  if (sigusr1_sender_pid == 0)
+    signal_sender (SIGUSR1, false);
+  pid_t sigusr2_sender_pid = fork ();
+  if (sigusr2_sender_pid == 0)
+    signal_sender (SIGUSR2, true);
+
+  void *objects[malloc_objects] = {};
+  unsigned signals = 0;
+  unsigned seed = 1;
+  time_t last_report = 0;
+  while (signals < signal_count)
+    {
+      progress_indicator = 1;
+      int slot = rand_r (&seed) % malloc_objects;
+      size_t size = rand_r (&seed) % malloc_maximum_size;
+      if (kill (sigusr1_sender_pid, SIGCONT) != 0)
+        {
+          printf ("error: kill (SIGCONT): %m\n");
+          signal (SIGUSR1, SIG_IGN);
+          kill (sigusr1_sender_pid, SIGKILL);
+          kill (sigusr2_sender_pid, SIGKILL);
+          return 1;
+        }
+      sigusr1_received = false;
+      free (objects[slot]);
+      objects[slot] = malloc (size);
+      if (sigusr1_received)
+        {
+          ++signals;
+          time_t current = time (0);
+          if (current != last_report)
+            {
+              printf ("info: SIGUSR1 signal count: %u\n", signals);
+              last_report = current;
+            }
+        }
+      if (objects[slot] == NULL)
+        {
+          printf ("error: malloc: %m\n");
+          signal (SIGUSR1, SIG_IGN);
+          kill (sigusr1_sender_pid, SIGKILL);
+          kill (sigusr2_sender_pid, SIGKILL);
+          return 1;
+        }
+    }
+
+  /* Clean up allocations.  */
+  for (int slot = 0; slot < malloc_objects; ++slot)
+    free (objects[slot]);
+
+  /* Terminate the signal-sending subprocess.  The SIGUSR1 handler
+     should no longer run because it uses sigusr1_sender_pid.  */
+  signal (SIGUSR1, SIG_IGN);
+  kill (sigusr1_sender_pid, SIGKILL);
+  kill (sigusr2_sender_pid, SIGKILL);
+
+  return 0;
+}
+
+#define TEST_FUNCTION do_test ()
+#include "../test-skeleton.c"
diff --git a/sysdeps/nptl/fork.c b/sysdeps/nptl/fork.c
index 1a68cbd..616d897 100644
--- a/sysdeps/nptl/fork.c
+++ b/sysdeps/nptl/fork.c
@@ -54,6 +54,12 @@  __libc_fork (void)
     struct used_handler *next;
   } *allp = NULL;
 
+  /* Determine if we are running multiple threads.  We skip some fork
+     handlers in the single-thread case, to make fork safer to use in
+     signal handlers.  POSIX requires that fork is async-signal-safe,
+     but our current fork implementation is not.  */
+  bool multiple_threads = THREAD_GETMEM (THREAD_SELF, header.multiple_threads);
+
   /* Run all the registered preparation handlers.  In reverse order.
      While doing this we build up a list of all the entries.  */
   struct fork_handler *runp;
@@ -109,12 +115,21 @@  __libc_fork (void)
       break;
     }
 
-  _IO_list_lock ();
+  /* If we are not running multiple threads, we do not have to
+     preserve lock state.  If fork runs from a signal handler, only
+     async-signal-safe functions can be used in the child.  These data
+     structures are only used by unsafe functions, so their state does
+     not matter if fork was called from a signal handler.  */
+  if (multiple_threads)
+    {
+      _IO_list_lock ();
 
-  /* Acquire malloc locks.  This needs to come last because fork
-     handlers may use malloc, and the libio list lock has an indirect
-     malloc dependency as well (via the getdelim function).  */
-  __malloc_fork_lock_parent ();
+      /* Acquire malloc locks.  This needs to come last because fork
+	 handlers may use malloc, and the libio list lock has an
+	 indirect malloc dependency as well (via the getdelim
+	 function).  */
+      __malloc_fork_lock_parent ();
+    }
 
 #ifndef NDEBUG
   pid_t ppid = THREAD_GETMEM (THREAD_SELF, tid);
@@ -173,14 +188,18 @@  __libc_fork (void)
 # endif
 #endif
 
-      /* Release malloc locks.  */
-      __malloc_fork_unlock_child ();
+      /* Reset the lock state in the multi-threaded case.  */
+      if (multiple_threads)
+	{
+	  /* Release malloc locks.  */
+	  __malloc_fork_unlock_child ();
 
-      /* Reset the file list.  These are recursive mutexes.  */
-      fresetlockfiles ();
+	  /* Reset the file list.  These are recursive mutexes.  */
+	  fresetlockfiles ();
 
-      /* Reset locks in the I/O code.  */
-      _IO_list_resetlock ();
+	  /* Reset locks in the I/O code.  */
+	  _IO_list_resetlock ();
+	}
 
       /* Reset the lock the dynamic loader uses to protect its data.  */
       __rtld_lock_initialize (GL(dl_load_lock));
@@ -217,11 +236,15 @@  __libc_fork (void)
       /* Restore the PID value.  */
       THREAD_SETMEM (THREAD_SELF, pid, parentpid);
 
-      /* Release malloc locks, parent process variant.  */
-      __malloc_fork_unlock_parent ();
+      /* Release acquired locks in the multi-threaded case.  */
+      if (multiple_threads)
+	{
+	  /* Release malloc locks, parent process variant.  */
+	  __malloc_fork_unlock_parent ();
 
-      /* We execute this even if the 'fork' call failed.  */
-      _IO_list_unlock ();
+	  /* We execute this even if the 'fork' call failed.  */
+	  _IO_list_unlock ();
+	}
 
       /* Run the handlers registered for the parent.  */
       while (allp != NULL)