diff mbox series

[ovs-dev] fatal-signal: Don't share signal fds/handles with forked process.

Message ID 20230608191659.937009-1-i.maximets@ovn.org
State Accepted
Commit 04f854f938b3310ee2ac08400d3b4bf72d070935
Headers show
Series [ovs-dev] fatal-signal: Don't share signal fds/handles with forked process. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test fail github build: failed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Ilya Maximets June 8, 2023, 7:16 p.m. UTC
The signal_fds pipe and wevent are a mechanism to wake up the process
after it received a signal and stored the number for the future
processing.  They are not intended for inter-process communication.
However, in the current code, descriptors are not closed on fork().

The main scenario where we use fork() is a monitor process.  Monitor
doesn't actually use poll loops and doesn't wait on the descriptor.
But when a child process is killed, it (child) sends a byte to itself,
then it wakes up due to POLLIN on the pipe and terminates itself after
processing all the callbacks.  The byte stays unread.  And the pipe is
still open in the monitor process.  When child dies, the monitor wakes
up and forks again.  New child inherits the same pipe that still
contains unread data.  This data is never read, so the child will
constantly wake itself up for no reason.

Interestingly enough raise(SIGSEGV) doesn't immediately kill the
process.  The execution continues til the end of a signal handler,
so we're still able to write a byte to a pipe even in this case.
Presumably because we don't have SA_NODEFER.

Fix the issue by re-creating the pipe/event on fork.  This way
every new child will have its own notification channel and will
not wake up any other processes.

There was already an attempt to fix the issue, but it didn't get a
follow up (see the reported-at tag).  This is an alternative solution.

Fixes: ff8decf1a318 ("daemon: Add support for process monitoring and restart.")
Reported-at: https://patchwork.ozlabs.org/project/openvswitch/patch/20221019093147.2072-1-lifengqi@inspur.com/
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---
 lib/fatal-signal.c | 49 ++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 41 insertions(+), 8 deletions(-)

Comments

Eelco Chaudron June 9, 2023, 9:18 a.m. UTC | #1
On 8 Jun 2023, at 21:16, Ilya Maximets wrote:

> The signal_fds pipe and wevent are a mechanism to wake up the process
> after it received a signal and stored the number for the future
> processing.  They are not intended for inter-process communication.
> However, in the current code, descriptors are not closed on fork().
>
> The main scenario where we use fork() is a monitor process.  Monitor
> doesn't actually use poll loops and doesn't wait on the descriptor.
> But when a child process is killed, it (child) sends a byte to itself,
> then it wakes up due to POLLIN on the pipe and terminates itself after
> processing all the callbacks.  The byte stays unread.  And the pipe is
> still open in the monitor process.  When child dies, the monitor wakes
> up and forks again.  New child inherits the same pipe that still
> contains unread data.  This data is never read, so the child will
> constantly wake itself up for no reason.
>
> Interestingly enough raise(SIGSEGV) doesn't immediately kill the
> process.  The execution continues til the end of a signal handler,
> so we're still able to write a byte to a pipe even in this case.
> Presumably because we don't have SA_NODEFER.
>
> Fix the issue by re-creating the pipe/event on fork.  This way
> every new child will have its own notification channel and will
> not wake up any other processes.
>
> There was already an attempt to fix the issue, but it didn't get a
> follow up (see the reported-at tag).  This is an alternative solution.
>
> Fixes: ff8decf1a318 ("daemon: Add support for process monitoring and restart.")
> Reported-at: https://patchwork.ozlabs.org/project/openvswitch/patch/20221019093147.2072-1-lifengqi@inspur.com/
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>


The changes look good to me.

Only tested on linux, and looking up the APIs on Windows, that part seems fine.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Ilya Maximets June 9, 2023, 2:28 p.m. UTC | #2
On 6/9/23 11:18, Eelco Chaudron wrote:
> 
> 
> On 8 Jun 2023, at 21:16, Ilya Maximets wrote:
> 
>> The signal_fds pipe and wevent are a mechanism to wake up the process
>> after it received a signal and stored the number for the future
>> processing.  They are not intended for inter-process communication.
>> However, in the current code, descriptors are not closed on fork().
>>
>> The main scenario where we use fork() is a monitor process.  Monitor
>> doesn't actually use poll loops and doesn't wait on the descriptor.
>> But when a child process is killed, it (child) sends a byte to itself,
>> then it wakes up due to POLLIN on the pipe and terminates itself after
>> processing all the callbacks.  The byte stays unread.  And the pipe is
>> still open in the monitor process.  When child dies, the monitor wakes
>> up and forks again.  New child inherits the same pipe that still
>> contains unread data.  This data is never read, so the child will
>> constantly wake itself up for no reason.
>>
>> Interestingly enough raise(SIGSEGV) doesn't immediately kill the
>> process.  The execution continues til the end of a signal handler,
>> so we're still able to write a byte to a pipe even in this case.
>> Presumably because we don't have SA_NODEFER.
>>
>> Fix the issue by re-creating the pipe/event on fork.  This way
>> every new child will have its own notification channel and will
>> not wake up any other processes.
>>
>> There was already an attempt to fix the issue, but it didn't get a
>> follow up (see the reported-at tag).  This is an alternative solution.
>>
>> Fixes: ff8decf1a318 ("daemon: Add support for process monitoring and restart.")
>> Reported-at: https://patchwork.ozlabs.org/project/openvswitch/patch/20221019093147.2072-1-lifengqi@inspur.com/
>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> 
> 
> The changes look good to me.
> 
> Only tested on linux, and looking up the APIs on Windows, that part seems fine.
> 
> Acked-by: Eelco Chaudron <echaudro@redhat.com>

Thanks!  Applied and backported down to 2.17.

Best regards, Ilya Maximets.
diff mbox series

Patch

diff --git a/lib/fatal-signal.c b/lib/fatal-signal.c
index f80f32182..77f0c87dd 100644
--- a/lib/fatal-signal.c
+++ b/lib/fatal-signal.c
@@ -82,6 +82,39 @@  static void call_hooks(int sig_nr);
 static BOOL WINAPI ConsoleHandlerRoutine(DWORD dwCtrlType);
 #endif
 
+/* Sets up a pipe or event handle that will be used to wake up the current
+ * process after signal is received, so it can be processed outside of the
+ * signal handler context in fatal_signal_run(). */
+static void
+fatal_signal_create_wakeup_events(void)
+{
+#ifndef _WIN32
+    xpipe_nonblocking(signal_fds);
+#else
+    wevent = CreateEvent(NULL, TRUE, FALSE, NULL);
+    if (!wevent) {
+        char *msg_buf = ovs_lasterror_to_string();
+        VLOG_FATAL("Failed to create a event (%s).", msg_buf);
+    }
+#endif
+}
+
+static void
+fatal_signal_destroy_wakeup_events(void)
+{
+#ifndef _WIN32
+    close(signal_fds[0]);
+    signal_fds[0] = -1;
+    close(signal_fds[1]);
+    signal_fds[1] = -1;
+#else
+    ResetEvent(wevent);
+    CloseHandle(wevent);
+    wevent = NULL;
+#endif
+}
+
+
 /* Initializes the fatal signal handling module.  Calling this function is
  * optional, because calling any other function in the module will also
  * initialize it.  However, in a multithreaded program, the module must be
@@ -109,15 +142,9 @@  fatal_signal_init(void)
             VLOG_DBG("Capturing of dummy backtrace has failed.");
         }
 
-#ifndef _WIN32
-        xpipe_nonblocking(signal_fds);
-#else
-        wevent = CreateEvent(NULL, TRUE, FALSE, NULL);
-        if (!wevent) {
-            char *msg_buf = ovs_lasterror_to_string();
-            VLOG_FATAL("Failed to create a event (%s).", msg_buf);
-        }
+        fatal_signal_create_wakeup_events();
 
+#ifdef _WIN32
         /* Register a function to handle Ctrl+C. */
         SetConsoleCtrlHandler(ConsoleHandlerRoutine, true);
 #endif
@@ -501,6 +528,9 @@  do_unlink_files(void)
  * hooks passed a 'cancel_cb' function to fatal_signal_add_hook(), then those
  * functions will be called, allowing them to free resources, etc.
  *
+ * Also re-creates wake-up events, so signals in one of the processes do not
+ * wake up the other one.
+ *
  * Following a fork, one of the resulting processes can call this function to
  * allow it to terminate without calling the hooks registered before calling
  * this function.  New hooks registered after calling this function will take
@@ -512,6 +542,9 @@  fatal_signal_fork(void)
 
     assert_single_threaded();
 
+    fatal_signal_destroy_wakeup_events();
+    fatal_signal_create_wakeup_events();
+
     for (i = 0; i < n_hooks; i++) {
         struct hook *h = &hooks[i];
         if (h->cancel_cb) {