From patchwork Fri Mar 17 13:03:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ales Musil X-Patchwork-Id: 1758264 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.138; helo=smtp1.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=BfUOsyCc; dkim-atps=neutral Received: from smtp1.osuosl.org (smtp1.osuosl.org [140.211.166.138]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4PdPTF1xKgz2473 for ; Sat, 18 Mar 2023 00:03:43 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id 5022C8128B; Fri, 17 Mar 2023 13:03:41 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org 5022C8128B Authentication-Results: smtp1.osuosl.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=BfUOsyCc X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id hbXlTwyaKCOH; Fri, 17 Mar 2023 13:03:39 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp1.osuosl.org (Postfix) with ESMTPS id AFA1D81241; Fri, 17 Mar 2023 13:03:38 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org AFA1D81241 Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 87266C0035; Fri, 17 Mar 2023 13:03:38 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp3.osuosl.org (smtp3.osuosl.org [IPv6:2605:bc80:3010::136]) by lists.linuxfoundation.org (Postfix) with ESMTP id C235CC0032 for ; Fri, 17 Mar 2023 13:03:36 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp3.osuosl.org (Postfix) with ESMTP id A2D906178B for ; Fri, 17 Mar 2023 13:03:36 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org A2D906178B Authentication-Results: smtp3.osuosl.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=BfUOsyCc X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp3.osuosl.org ([127.0.0.1]) by localhost (smtp3.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id lHo0EPIlT2eP for ; Fri, 17 Mar 2023 13:03:34 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org 49C1860B49 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by smtp3.osuosl.org (Postfix) with ESMTPS id 49C1860B49 for ; Fri, 17 Mar 2023 13:03:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1679058212; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=ukjrtzjujsWznZ40wORZ3CrH1XKbde3oMz4DtdU4RSc=; b=BfUOsyCcsaY2TI6H9oBmwrjDW/3nops5+esS42SYB9Bjk7yDGcwmBTL0btjGFWN5wWc8xO RVJSL61YiQq+11bKVXki5ufIKrjHdFEzhDPQ0NgtoYXOYbvv3fH4IJPKYgZZaAh7irR6Hw fs12+vUHK9v0bWwmTDOB5CroIvLAGKU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-248-1ZTKhyZlOVe8fTO0cuWn3w-1; Fri, 17 Mar 2023 09:03:30 -0400 X-MC-Unique: 1ZTKhyZlOVe8fTO0cuWn3w-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 0B43A811E7E; Fri, 17 Mar 2023 13:03:30 +0000 (UTC) Received: from amusil.redhat.com (unknown [10.45.224.103]) by smtp.corp.redhat.com (Postfix) with ESMTP id 786C3492B00; Fri, 17 Mar 2023 13:03:28 +0000 (UTC) From: Ales Musil To: dev@openvswitch.org Date: Fri, 17 Mar 2023 14:03:27 +0100 Message-Id: <20230317130327.662166-1-amusil@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Cc: dceara@redhat.com, i.maximets@ovn.org, jmeng@redhat.com Subject: [ovs-dev] [PATCH] backtrace: Replace usage of libunwind with libc backtrace X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" Use the backtrace functions that is provided by libc this allows us to get backtrace that is independent of the current memory ap of the process. Which in turn can be used for debugging/tracing purpose. The backtrace is not 100% accurate due to various optimizations, most notably "-fomit-frame-pointer" and LTO. This might result that the line in source file doesn't correspond to the real line. However, it should be able to pinpoint at least the function where the backtrace was called. The backtrace is not marked as signal safe however the backtrace manual page gives more detailed explanation why it might be the case [0]. Load the "libgcc" in advance within the "fatal_signal_init" which should ensure that subsequent calls to backtrace* do not call malloc and are signal safe. The typical backtrace will look similar to the one below: /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] ovsdb-server(+0xa661) [0x562cfce2e661] ovsdb-server(+0x7e39) [0x562cfce2be39] /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] ovsdb-server(+0x8c35) [0x562cfce2cc35] backtrace.h elaborates on how to effectively get the line information associated with the addressed presented in the backtrace. [0] backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand Reported-at: https://bugzilla.redhat.com/2177760 Signed-off-by: Ales Musil --- include/openvswitch/vlog.h | 4 +- lib/backtrace.c | 71 ++++++++------------------ lib/backtrace.h | 76 +++++++++++++--------------- lib/daemon-unix.c | 1 - lib/fatal-signal.c | 100 +++++++++---------------------------- lib/ovsdb-error.c | 15 ++---- lib/vlog.c | 14 ++---- m4/openvswitch.m4 | 20 +++----- tests/atlocal.in | 1 + tests/daemon.at | 71 ++++++++++++++++++++++++++ 10 files changed, 165 insertions(+), 208 deletions(-) diff --git a/include/openvswitch/vlog.h b/include/openvswitch/vlog.h index e53ce6d81..98edc3452 100644 --- a/include/openvswitch/vlog.h +++ b/include/openvswitch/vlog.h @@ -145,8 +145,8 @@ void vlog_set_syslog_method(const char *method); /* Configure syslog target. */ void vlog_set_syslog_target(const char *target); -/* Write directly to log file. */ -void vlog_direct_write_to_log_file_unsafe(const char *s); +/* Return the current vlog file descriptor. */ +int vlog_fd(void); /* Initialization. */ void vlog_init(void); diff --git a/lib/backtrace.c b/lib/backtrace.c index 2853d5ff1..713b99e7b 100644 --- a/lib/backtrace.c +++ b/lib/backtrace.c @@ -32,13 +32,23 @@ VLOG_DEFINE_THIS_MODULE(backtrace); void backtrace_capture(struct backtrace *b) { - void *frames[BACKTRACE_MAX_FRAMES]; - int i; + b->n_frames = backtrace(b->frames, BACKTRACE_MAX_FRAMES); +} + +char * +backtrace_format(const struct backtrace *bt, struct ds *ds) +{ + if (bt->n_frames) { + char **symbols = backtrace_symbols(bt->frames, bt->n_frames); - b->n_frames = backtrace(frames, BACKTRACE_MAX_FRAMES); - for (i = 0; i < b->n_frames; i++) { - b->frames[i] = (uintptr_t) frames[i]; + for (int i = 0; i < bt->n_frames; i++) { + ds_put_format(ds, "%s\n", symbols[i]); + } + + free(symbols); } + + return ds_cstr(ds); } #else @@ -47,23 +57,15 @@ backtrace_capture(struct backtrace *backtrace) { backtrace->n_frames = 0; } -#endif -static char * -backtrace_format(const struct backtrace *b, struct ds *ds) +char * +backtrace_format(const struct backtrace *bt OVS_UNUSED, struct ds *ds) { - if (b->n_frames) { - int i; - - ds_put_cstr(ds, " (backtrace:"); - for (i = 0; i < b->n_frames; i++) { - ds_put_format(ds, " 0x%08"PRIxPTR, b->frames[i]); - } - ds_put_cstr(ds, ")"); - } + ds_put_cstr(ds, BACKTRACE_UNSUPPORTED_MSG); return ds_cstr(ds); } +#endif void log_backtrace_at(const char *msg, const char *where) @@ -77,41 +79,8 @@ log_backtrace_at(const char *msg, const char *where) } ds_put_cstr(&ds, where); + ds_put_cstr(&ds, " backtrace:\n"); VLOG_ERR("%s", backtrace_format(&b, &ds)); ds_destroy(&ds); } - -#ifdef HAVE_UNWIND -void -log_received_backtrace(int fd) { - int byte_read; - struct unw_backtrace backtrace[UNW_MAX_DEPTH]; - - VLOG_WARN("%s fd %d", __func__, fd); - fcntl(fd, F_SETFL, O_NONBLOCK); - memset(backtrace, 0, UNW_MAX_BUF); - - byte_read = read(fd, backtrace, UNW_MAX_BUF); - if (byte_read < 0) { - VLOG_ERR("Read fd %d failed: %s", fd, - ovs_strerror(errno)); - } else if (byte_read > 0) { - VLOG_WARN("SIGSEGV detected, backtrace:"); - for (int i = 0; i < UNW_MAX_DEPTH; i++) { - if (backtrace[i].func[0] == 0) { - break; - } - VLOG_WARN("0x%016"PRIxPTR" <%s+0x%"PRIxPTR">\n", - backtrace[i].ip, - backtrace[i].func, - backtrace[i].offset); - } - } -} -#else /* !HAVE_UNWIND */ -void -log_received_backtrace(int daemonize_fd OVS_UNUSED) { - VLOG_WARN("Backtrace using libunwind not supported."); -} -#endif /* HAVE_UNWIND */ diff --git a/lib/backtrace.h b/lib/backtrace.h index 5708bf9c6..55b25dca0 100644 --- a/lib/backtrace.h +++ b/lib/backtrace.h @@ -20,11 +20,6 @@ #include #include "openvswitch/dynamic-string.h" -#ifdef HAVE_UNWIND -#define UNW_LOCAL_ONLY -#include -#endif - /* log_backtrace() will save the backtrace of a running program * into the log at the DEBUG level. * @@ -36,58 +31,57 @@ * log_backtrace_msg("your message"); <-- with a message * * - * A typical log will look like the following. The hex numbers listed after - * "backtrace" are the addresses of the backtrace. + * A typical backtrace will look like the following example: + * /lib64/libopenvswitch-3.1.so.0(backtrace_capture+0x1e) [0x7fc5db298dfe] + * /lib64/libopenvswitch-3.1.so.0(log_backtrace_at+0x57) [0x7fc5db2999e7] + * /lib64/libovsdb-3.1.so.0(ovsdb_txn_complete+0x7b) [0x7fc5db56247b] + * /lib64/libovsdb-3.1.so.0(ovsdb_txn_propose_commit_block+0x8d) [0x7fc5db563a8d] + * ovsdb-server(+0xa661) [0x562cfce2e661] + * ovsdb-server(+0x7e39) [0x562cfce2be39] + * /lib64/libc.so.6(+0x27b4a) [0x7fc5db048b4a] + * /lib64/libc.so.6(__libc_start_main+0x8b) [0x7fc5db048c0b] + * ovsdb-server(+0x8c35) [0x562cfce2cc35] * - * 2014-03-13T23:18:11.979Z|00002|backtrace(revalidator_6)|ERR|lib/dpif-netdev.c:1312: (backtrace: 0x00521f57 0x00460365 0x00463ea4 0x0046470b 0x0043b32d 0x0043bac3 0x0043bae2 0x0043943b 0x004c22b3 0x2b5b3ac94e9a 0x2b5b3b4a33fd) + * GDB can be used to view the exact line of the code for particular backtrace. + * One thing to keep in mind is that the lines in source files might not + * 100% correspond with the backtrace due to various optimizations as LTO etc. + * (The effect can be seen in this example). * - * The following bash command can be used to view backtrace in - * a more readable form. - * addr2line -p -e vswitchd/ovs-vswitchd + * Assuming that debuginfo for the library or binary is installed load it to + * GDB: + * $ gdb ovsdb-server + * (gdb) list *(+0x7e39) + * 0x7e39 is in main (ovsdb/ovsdb-server.c:278). + * (gdb) list *(+0xa661) + * 0xa661 is in commit_txn (ovsdb/ovsdb-server.c:1173) * - * An typical run and output will look like: - * addr2line -p -e vswitchd/ovs-vswitchd 0x00521f57 0x00460365 0x00463ea4 - * 0x0046470b 0x0043b32d 0x0043bac3 0x0043bae2 0x0043943b 0x004c22b3 - * 0x2b5b3ac94e9a 0x2b5b3b4a33fd + * $ gdb /lib64/libovsdb-3.1.so.0 + * (gdb) list *(ovsdb_txn_propose_commit_block+0x8d) + * 0x3aa8d is in ovsdb_txn_propose_commit_block (ovsdb/transaction.c:1328) + * (gdb) list *(ovsdb_txn_complete+0x7b) + * 0x3947b is in ovsdb_txn_complete (./include/openvswitch/list.h:321) * - * openvswitch/lib/backtrace.c:33 - * openvswitch/lib/dpif-netdev.c:1312 - * openvswitch/lib/dpif.c:937 - * openvswitch/lib/dpif.c:1258 - * openvswitch/ofproto/ofproto-dpif-upcall.c:1440 - * openvswitch/ofproto/ofproto-dpif-upcall.c:1595 - * openvswitch/ofproto/ofproto-dpif-upcall.c:160 - * openvswitch/ofproto/ofproto-dpif-upcall.c:717 - * openvswitch/lib/ovs-thread.c:268 - * ??:0 - * ??:0 + * $ gdb /lib64/libopenvswitch-3.1.so.0 + * (gdb) list *(log_backtrace_at+0x57) + * 0x999e7 is in log_backtrace_at (lib/backtrace.c:77) + * (gdb) list *(backtrace_capture+0x1e) + * 0x98dfe is in backtrace_capture (lib/backtrace.c:35) */ #define log_backtrace() log_backtrace_at(NULL, OVS_SOURCE_LOCATOR); #define log_backtrace_msg(msg) log_backtrace_at(msg, OVS_SOURCE_LOCATOR); #define BACKTRACE_MAX_FRAMES 31 +#define BACKTRACE_DUMP_MSG "SIGSEGV detected, backtrace:\n" +#define BACKTRACE_UNSUPPORTED_MSG "backtrace() is not supported!\n" struct backtrace { int n_frames; - uintptr_t frames[BACKTRACE_MAX_FRAMES]; -}; - -#ifdef HAVE_UNWIND -#define UNW_MAX_DEPTH 32 -#define UNW_MAX_FUNCN 32 -#define UNW_MAX_BUF \ - (UNW_MAX_DEPTH * sizeof(struct unw_backtrace)) - -struct unw_backtrace { - char func[UNW_MAX_FUNCN]; - unw_word_t ip; - unw_word_t offset; + void *frames[BACKTRACE_MAX_FRAMES]; }; -#endif void backtrace_capture(struct backtrace *); void log_backtrace_at(const char *msg, const char *where); -void log_received_backtrace(int fd); +char *backtrace_format(const struct backtrace *bt, struct ds *ds); #endif /* backtrace.h */ diff --git a/lib/daemon-unix.c b/lib/daemon-unix.c index 1a7ba427d..68706f73a 100644 --- a/lib/daemon-unix.c +++ b/lib/daemon-unix.c @@ -395,7 +395,6 @@ monitor_daemon(pid_t daemon_pid) } } - log_received_backtrace(daemonize_fd); close(daemonize_fd); daemonize_fd = -1; diff --git a/lib/fatal-signal.c b/lib/fatal-signal.c index bbb31ef27..b822fc774 100644 --- a/lib/fatal-signal.c +++ b/lib/fatal-signal.c @@ -35,8 +35,8 @@ #include "openvswitch/type-props.h" -#ifdef HAVE_UNWIND -#include "daemon-private.h" +#ifdef HAVE_BACKTRACE +#include #endif #ifndef SIG_ATOMIC_MAX @@ -94,6 +94,13 @@ fatal_signal_init(void) inited = true; ovs_mutex_init_recursive(&mutex); + + /* The dummy backtrace is needed see comment for + * send_backtrace_to_monitor(). */ + struct backtrace dummy_bt; + backtrace_capture(&dummy_bt); + VLOG_DBG("Load \"libgcc\" by capturing dummy backtrace, n_frames=%d", + dummy_bt.n_frames); #ifndef _WIN32 xpipe_nonblocking(signal_fds); #else @@ -157,94 +164,33 @@ fatal_signal_add_hook(void (*hook_cb)(void *aux), void (*cancel_cb)(void *aux), ovs_mutex_unlock(&mutex); } -#ifdef HAVE_UNWIND -/* Convert unsigned long long to string. This is needed because - * using snprintf() is not async signal safe. */ -static inline int -llong_to_hex_str(unsigned long long value, char *str) -{ - int i = 0, res; - - if (value / 16 > 0) { - i = llong_to_hex_str(value / 16, str); - } - - res = value % 16; - str[i] = "0123456789abcdef"[res]; - - return i + 1; -} - /* Send the backtrace buffer to monitor thread. * * Note that this runs in the signal handling context, any system * library functions used here must be async-signal-safe. + * backtrace() is only signal safe if the "libgcc" was loaded + * before the signal handler. In order to keep it safe the fatal_signal_init() + * should always call backtrace_capture which will ensure that "libgcc" is + * loaded. */ static inline void send_backtrace_to_monitor(void) { - /* volatile added to prevent a "clobbered" error on ppc64le with gcc */ - volatile int dep; - struct unw_backtrace unw_bt[UNW_MAX_DEPTH]; - unw_cursor_t cursor; - unw_context_t uc; - - if (daemonize_fd == -1) { + int log_fd = vlog_fd(); + if (!log_fd) { return; } - dep = 0; - unw_getcontext(&uc); - unw_init_local(&cursor, &uc); + struct backtrace bt; + backtrace_capture(&bt); - while (dep < UNW_MAX_DEPTH && unw_step(&cursor)) { - memset(unw_bt[dep].func, 0, UNW_MAX_FUNCN); - unw_get_reg(&cursor, UNW_REG_IP, &unw_bt[dep].ip); - unw_get_proc_name(&cursor, unw_bt[dep].func, UNW_MAX_FUNCN, - &unw_bt[dep].offset); - dep++; - } - - if (monitor) { - ignore(write(daemonize_fd, unw_bt, - dep * sizeof(struct unw_backtrace))); - } else { - /* Since there is no monitor daemon running, write backtrace - * in current process. - */ - char str[] = "SIGSEGV detected, backtrace:\n"; - char ip_str[16], offset_str[6]; - char line[64], fn_name[UNW_MAX_FUNCN]; - - vlog_direct_write_to_log_file_unsafe(str); - - for (int i = 0; i < dep; i++) { - memset(line, 0, sizeof line); - memset(fn_name, 0, sizeof fn_name); - memset(offset_str, 0, sizeof offset_str); - memset(ip_str, ' ', sizeof ip_str); - ip_str[sizeof(ip_str) - 1] = 0; - - llong_to_hex_str(unw_bt[i].ip, ip_str); - llong_to_hex_str(unw_bt[i].offset, offset_str); - - strcat(line, "0x"); - strcat(line, ip_str); - strcat(line, "<"); - memcpy(fn_name, unw_bt[i].func, UNW_MAX_FUNCN - 1); - strcat(line, fn_name); - strcat(line, "+0x"); - strcat(line, offset_str); - strcat(line, ">\n"); - vlog_direct_write_to_log_file_unsafe(line); - } - } -} + ignore(write(log_fd, BACKTRACE_DUMP_MSG, strlen(BACKTRACE_DUMP_MSG))); +#ifdef HAVE_BACKTRACE + backtrace_symbols_fd(bt.frames, bt.n_frames, log_fd); #else -static inline void -send_backtrace_to_monitor(void) { - /* Nothing. */ -} + ignore(write(log_fd, BACKTRACE_UNSUPPORTED_MSG, + strlen(BACKTRACE_UNSUPPORTED_MSG))); #endif +} /* Handles fatal signal number 'sig_nr'. * diff --git a/lib/ovsdb-error.c b/lib/ovsdb-error.c index a75ad36b7..2cd6242f3 100644 --- a/lib/ovsdb-error.c +++ b/lib/ovsdb-error.c @@ -139,17 +139,6 @@ ovsdb_internal_error(struct ovsdb_error *inner_error, va_end(args); } - backtrace_capture(&backtrace); - if (backtrace.n_frames) { - int i; - - ds_put_cstr(&ds, " (backtrace:"); - for (i = 0; i < backtrace.n_frames; i++) { - ds_put_format(&ds, " 0x%08"PRIxPTR, backtrace.frames[i]); - } - ds_put_char(&ds, ')'); - } - ds_put_format(&ds, " (%s %s)", program_name, VERSION); if (inner_error) { @@ -158,6 +147,10 @@ ovsdb_internal_error(struct ovsdb_error *inner_error, free(s); } + ds_put_cstr(&ds, ", backtrace:"); + backtrace_capture(&backtrace); + backtrace_format(&backtrace, &ds); + error = ovsdb_error("internal error", "%s", ds_cstr(&ds)); ds_destroy(&ds); diff --git a/lib/vlog.c b/lib/vlog.c index 9ddea48b8..b40c6909d 100644 --- a/lib/vlog.c +++ b/lib/vlog.c @@ -649,19 +649,11 @@ vlog_set_syslog_target(const char *target) ovs_rwlock_unlock(&pattern_rwlock); } -/* - * This function writes directly to log file without using async writer or - * taking a lock. Caller must hold 'log_file_mutex' or be sure that it's - * not necessary. Could be used in exceptional cases like dumping of backtrace - * on fatal signals. - */ -void -vlog_direct_write_to_log_file_unsafe(const char *s) +int +vlog_fd(void) OVS_NO_THREAD_SAFETY_ANALYSIS { - if (log_fd >= 0) { - ignore(write(log_fd, s, strlen(s))); - } + return log_fd; } /* Returns 'false' if 'facility' is not a valid string. If 'facility' diff --git a/m4/openvswitch.m4 b/m4/openvswitch.m4 index 14d9249b8..64a097d90 100644 --- a/m4/openvswitch.m4 +++ b/m4/openvswitch.m4 @@ -359,9 +359,12 @@ AC_DEFUN([OVS_CHECK_DBDIR], dnl Defines HAVE_BACKTRACE if backtrace() is found. AC_DEFUN([OVS_CHECK_BACKTRACE], - [AC_SEARCH_LIBS([backtrace], [execinfo ubacktrace], - [AC_DEFINE([HAVE_BACKTRACE], [1], - [Define to 1 if you have backtrace(3).])])]) + [AC_SEARCH_LIBS([backtrace], [execinfo ubacktrace], [HAVE_BACKTRACE=yes], [HAVE_BACKTRACE=no]) + if test "$HAVE_BACKTRACE" = "yes"; then + AC_DEFINE([HAVE_BACKTRACE], [1], [Define to 1 if you have backtrace(3).]) + fi + AM_CONDITIONAL([HAVE_BACKTRACE], [test "$HAVE_BACKTRACE" = "yes"]) + AC_SUBST([HAVE_BACKTRACE])]) dnl Defines HAVE_PERF_EVENT if linux/perf_event.h is found. AC_DEFUN([OVS_CHECK_PERF_EVENT], @@ -738,14 +741,3 @@ AC_DEFUN([OVS_CHECK_UNBOUND], AM_CONDITIONAL([HAVE_UNBOUND], [test "$HAVE_UNBOUND" = yes]) AC_SUBST([HAVE_UNBOUND])]) -dnl Checks for libunwind. -AC_DEFUN([OVS_CHECK_UNWIND], - [AC_CHECK_LIB([unwind], [unw_backtrace], - [AC_CHECK_HEADERS([libunwind.h], [HAVE_UNWIND=yes], [HAVE_UNWIND=no])], - [HAVE_UNWIND=no]) - if test "$HAVE_UNWIND" = yes; then - AC_DEFINE([HAVE_UNWIND], [1], [Define to 1 if unwind is detected.]) - LIBS="$LIBS -lunwind" - fi - AM_CONDITIONAL([HAVE_UNWIND], [test "$HAVE_UNWIND" = yes]) - AC_SUBST([HAVE_UNWIND])]) diff --git a/tests/atlocal.in b/tests/atlocal.in index 859668586..83f9f865c 100644 --- a/tests/atlocal.in +++ b/tests/atlocal.in @@ -2,6 +2,7 @@ HAVE_OPENSSL='@HAVE_OPENSSL@' OPENSSL_SUPPORTS_SNI='@OPENSSL_SUPPORTS_SNI@' HAVE_UNBOUND='@HAVE_UNBOUND@' +HAVE_BACKTRACE='@HAVE_BACKTRACE@' EGREP='@EGREP@' PYTHON3='@PYTHON3@' CFLAGS='@CFLAGS@' diff --git a/tests/daemon.at b/tests/daemon.at index d7981f9d2..b92b45744 100644 --- a/tests/daemon.at +++ b/tests/daemon.at @@ -234,3 +234,74 @@ OVS_WAIT_UNTIL([sc query ovsdb-server | grep STATE | grep STOPPED > /dev/null 2> AT_CHECK([sc delete ovsdb-server], [0], [[[SC]] DeleteService SUCCESS ]) AT_CLEANUP + + +AT_SETUP([backtrace --detach]) +AT_SKIP_IF([test "$HAVE_BACKTRACE" = "no"]) + +# This test intentionally causes SIGSEGV, so make Address Sanitizer ignore it. +ASAN_OPTIONS=$ASAN_OPTIONS:handle_segv=0; export ASAN_OPTIONS + +# Skip it if UB Sanitizer is being used. There's no way to disable the +# SEGV check at runtime. +AT_SKIP_IF([test $TESTS_WITH_UBSAN = yes]) + +# Start the daemon and make sure that the pidfile exists immediately. +# We don't wait for the pidfile to get created because the daemon is +# supposed to do so before the parent exits. +AT_CHECK([ovsdb-server --detach --no-chdir --pidfile --no-db --log-file=`pwd`/ovsdb-server.log --verbose=DBG], [0], [ignore], [ignore]) + +AT_CHECK([test -s ovsdb-server.pid]) +child=$(cat ovsdb-server.pid) + +OVS_WAIT_WHILE([kill -SEGV $child]) +OVS_WAIT_UNTIL([grep -q "^SIGSEGV detected, backtrace:" ovsdb-server.log]) +AT_CHECK([grep -q "fatal_signal|DBG|Load \"libgcc\" by capturing dummy backtrace" ovsdb-server.log]) + +AT_CLEANUP + +AT_SETUP([backtrace --detach --monitor]) +AT_SKIP_IF([test "$HAVE_BACKTRACE" = "no"]) + +# This test intentionally causes SIGSEGV, so make Address Sanitizer ignore it. +ASAN_OPTIONS=$ASAN_OPTIONS:handle_segv=0; export ASAN_OPTIONS + +# Skip it if UB Sanitizer is being used. There's no way to disable the +# SEGV check at runtime. +AT_SKIP_IF([test $TESTS_WITH_UBSAN = yes]) + +on_exit 'kill $(cat *.pid)' + +# Start the daemon and make sure that the pidfile exists immediately. +# We don't wait for the pidfile to get created because the daemon is +# supposed to do so before the parent exits. +AT_CHECK([ovsdb-server --detach --monitor --no-chdir --pidfile --no-db --log-file=`pwd`/ovsdb-server.log --verbose=DBG], [0], [ignore], [ignore]) +AT_CHECK([test -s ovsdb-server.pid]) +child=$(cat ovsdb-server.pid) + +# Check process naming and ancestry. +monitor=$(parent_pid $child) +check_process_name $child ovsdb-server +check_ancestors $child $monitor 1 + +# Kill the daemon process, making it look like a segfault, +# and wait for a new daemon process to get spawned. +AT_CHECK([kill -SEGV $child], [0]) +OVS_WAIT_WHILE([kill -0 $child]) +OVS_WAIT_UNTIL([test -s ovsdb-server.pid && test `cat ovsdb-server.pid` != $child]) +child2=$(cat ovsdb-server.pid) + +# Check process naming and ancestry. +check_process_name $child2 ovsdb-server +check_ancestors $child2 $monitor 1 + +# Kill the daemon process with SIGTERM, and wait for the daemon +# and the monitor processes to go away and the pidfile to get deleted. +AT_CHECK([kill $child2]) +OVS_WAIT_WHILE([kill -0 $monitor || kill -0 $child2 || test -e ovsdb-server.pid]) + +AT_CHECK([grep -q "fatal_signal|DBG|Load \"libgcc\" by capturing dummy backtrace" ovsdb-server.log]) +AT_CHECK([grep -q "^SIGSEGV detected, backtrace:" ovsdb-server.log]) +AT_CHECK([grep -q "daemon_unix(monitor)|ERR|1 crashes: pid .* died, killed (Segmentation fault), core dumped, restarting" ovsdb-server.log]) + +AT_CLEANUP