Message ID | 56B9F83A.1000500@redhat.com |
---|---|
State | New |
Headers | show |
On 02/09/2016 07:31 AM, Florian Weimer wrote: > I'm not sure if this is the right fix. I traced back the lack of error > reporting to the original addition of the RTLD_NEXT functionality. I > don't know why it was omitted from it. I happened to come across the corresponding bug in Red Hat Bugzilla and wanted to chime in on the patch. POSIX requires dlerror() to return a non-null string after a dynamic linking error. After a failed call to dlopen(), dlclose(), or dlsym(), POSIX says that "More detailed diagnostic information shall be available through dlerror()." dlerror() is required to return null only when no error has occurred since the last call to the function, or on failure. (It doesn't matter that RTLD_NEXT isn't fully specified by POSIX.) So AFAICS, the patch does the right thing. I would only suggest to enhance the test to verify that dlerror() does return a null pointer when no error has occurred (e.g., the second call to dlerror() with no other intervening dlxxx() calls should return null). If there isn't already a test in the test suite that exercises the interaction of dladdr() and dlerrror() (I couldn't find one) it would be worth adding one to verify that a failed dladdr() doesn't affect dlerrror() (it doesn't seem to in my testing). But that's probably outside of the scope of this bug fix. Martin
On 02/09/2016 04:10 PM, Martin Sebor wrote: > On 02/09/2016 07:31 AM, Florian Weimer wrote: >> I'm not sure if this is the right fix. I traced back the lack of error >> reporting to the original addition of the RTLD_NEXT functionality. I >> don't know why it was omitted from it. > > I happened to come across the corresponding bug in Red Hat Bugzilla > and wanted to chime in on the patch. > > POSIX requires dlerror() to return a non-null string after a dynamic > linking error. After a failed call to dlopen(), dlclose(), or dlsym(), > POSIX says that "More detailed diagnostic information shall be > available through dlerror()." dlerror() is required to return null > only when no error has occurred since the last call to the function, > or on failure. (It doesn't matter that RTLD_NEXT isn't fully > specified by POSIX.) What does "failure" mean? The definition of the API is such that if no symbol is found (is that an error or not?) then the function returns NULL. So it might be seen as a normal form of operation as opposed to real errors like ENOMEM. A POSIX clarification should probably be made here, but I agree that from a QoI it is better to return some kind of error because that's users expect. I also expect that's what other implementations do (but haven't verified). Cheers, Carlos.
On 02/09/2016 02:37 PM, Carlos O'Donell wrote: > On 02/09/2016 04:10 PM, Martin Sebor wrote: >> On 02/09/2016 07:31 AM, Florian Weimer wrote: >>> I'm not sure if this is the right fix. I traced back the lack of error >>> reporting to the original addition of the RTLD_NEXT functionality. I >>> don't know why it was omitted from it. >> >> I happened to come across the corresponding bug in Red Hat Bugzilla >> and wanted to chime in on the patch. >> >> POSIX requires dlerror() to return a non-null string after a dynamic >> linking error. After a failed call to dlopen(), dlclose(), or dlsym(), >> POSIX says that "More detailed diagnostic information shall be >> available through dlerror()." dlerror() is required to return null >> only when no error has occurred since the last call to the function, >> or on failure. (It doesn't matter that RTLD_NEXT isn't fully >> specified by POSIX.) > > What does "failure" mean? The definition of the API is such that > if no symbol is found (is that an error or not?) then the function > returns NULL. So it might be seen as a normal form of operation > as opposed to real errors like ENOMEM. My reading of the text in dlsym() (for example) is that a successful completion implies that name refers to either a function or an object. Otherwise "If handle does not refer to a valid symbol table handle " or "if the symbol named by name cannot be found in the symbol table associated with handle" that's a failure. The text doesn't explicitly use the word failure but I think that's implied. Martin > > A POSIX clarification should probably be made here, but I agree > that from a QoI it is better to return some kind of error because > that's users expect. I also expect that's what other implementations > do (but haven't verified). > > Cheers, > Carlos. >
* Carlos O'Donell: > A POSIX clarification should probably be made here, but I agree > that from a QoI it is better to return some kind of error because > that's users expect. I also expect that's what other implementations > do (but haven't verified). It's also what the Solaris examples suggest: <http://docs.oracle.com/cd/E19683-01/816-1386/chapter3-26/index.html> I don't see any reason why RTLD_NEXT should behave differently from other handles.
* Martin Sebor: > So AFAICS, the patch does the right thing. I would only suggest to > enhance the test to verify that dlerror() does return a null pointer > when no error has occurred (e.g., the second call to dlerror() with > no other intervening dlxxx() calls should return null). Fair enough, I will put this into the test. Have you and Carlos looked at more than the test? :) I'm worried that I'm missing something in the callgraph—as far as I can tell, the change only affects RTLD_NEXT, but it seems bizarre that there wasn't any error reporting specifically for that.
On 02/09/2016 05:35 PM, Florian Weimer wrote: > * Carlos O'Donell: > >> A POSIX clarification should probably be made here, but I agree >> that from a QoI it is better to return some kind of error because >> that's users expect. I also expect that's what other implementations >> do (but haven't verified). > > It's also what the Solaris examples suggest: > > <http://docs.oracle.com/cd/E19683-01/816-1386/chapter3-26/index.html> > > I don't see any reason why RTLD_NEXT should behave differently from > other handles. Agreed, and that's the *biggest* reason to fix this, namely that the interfaces should behave the same (principle of least surprise). I am reviewing your change. I expect you are suggesting this change for 2.24 and not "right now" for the upcoming release? I fear the change might change something of consequence and I'd rather the checkin wait until Adhemerval cuts the branch on the 18th. Cheers, Carlos.
* Carlos O'Donell: > I expect you are suggesting this change for 2.24 and not "right now" > for the upcoming release? Yes, all patches are for after 2.24 branches. I'm familiar with the concept of a freeze. :) > I fear the change might change something of consequence and I'd > rather the checkin wait until Adhemerval cuts the branch on the > 18th. Absolutely. Florian
On 02/09/2016 03:31 PM, Florian Weimer wrote: > I'm not sure if this is the right fix. I traced back the lack of error > reporting to the original addition of the RTLD_NEXT functionality. I > don't know why it was omitted from it. > > I could not find any callers of _dl_lookup_symbol_x which with skip_map > != NULL and which are not related to RTLD_NEXT, so this change looks > pretty conservative to me. Ping? Thanks, Florian
On 02/09/2016 03:31 PM, Florian Weimer wrote: > 2016-02-09 Florian Weimer <fweimer@redhat.com> > > [BZ #19509] > * elf/dl-lookup.c (_dl_lookup_symbol_x): Report error even if > skip_map != NULL. > * elf/bug19509.c: New file. > * elf/Makefile (tests): Add bug19509. > (bug19509): Link against libdl. Ping? https://sourceware.org/ml/libc-alpha/2016-02/msg00172.html Thanks, Florian
On 03/30/2016 07:53 AM, Florian Weimer wrote: > On 02/09/2016 03:31 PM, Florian Weimer wrote: >> 2016-02-09 Florian Weimer <fweimer@redhat.com> >> >> [BZ #19509] >> * elf/dl-lookup.c (_dl_lookup_symbol_x): Report error even if >> skip_map != NULL. >> * elf/bug19509.c: New file. >> * elf/Makefile (tests): Add bug19509. >> (bug19509): Link against libdl. > > Ping? > > https://sourceware.org/ml/libc-alpha/2016-02/msg00172.html I agree. The only caller with non-NULL skip_map is: 146 else if (handle == RTLD_NEXT) 147 { 148 if (__glibc_unlikely (match == GL(dl_ns)[LM_ID_BASE]._ns_loaded)) 149 { 150 if (match == NULL 151 || caller < match->l_map_start 152 || caller >= match->l_map_end) 153 GLRO(dl_signal_error) (0, NULL, NULL, N_("\ 154 RTLD_NEXT used in code not dynamically loaded")); 155 } 156 157 struct link_map *l = match; 158 while (l->l_loader != NULL) 159 l = l->l_loader; 160 161 result = GLRO(dl_lookup_symbol_x) (name, match, &ref, l->l_local_scope, 162 vers, 0, 0, match); 163 } Whose purpose it is to skip the current map for lookup. Your patch and testcase look good to me.
2016-02-09 Florian Weimer <fweimer@redhat.com> [BZ #19509] * elf/dl-lookup.c (_dl_lookup_symbol_x): Report error even if skip_map != NULL. * elf/bug19509.c: New file. * elf/Makefile (tests): Add bug19509. (bug19509): Link against libdl. diff --git a/elf/Makefile b/elf/Makefile index 63a5355..18baa23 100644 --- a/elf/Makefile +++ b/elf/Makefile @@ -149,7 +149,8 @@ tests += loadtest restest1 preloadtest loadfail multiload origtest resolvfail \ tst-nodelete) \ tst-initorder tst-initorder2 tst-relsort1 tst-null-argv \ tst-ptrguard1 tst-tlsalign tst-tlsalign-extern tst-nodelete-opened \ - tst-nodelete2 tst-audit11 tst-audit12 + tst-nodelete2 tst-audit11 tst-audit12 \ + bug19509 # reldep9 ifeq ($(build-hardcoded-path-in-tests),yes) tests += tst-dlopen-aout @@ -1252,3 +1253,5 @@ $(objpfx)tst-prelink-cmp.out: tst-prelink.exp \ $(objpfx)tst-prelink-conflict.out cmp $^ > $@; \ $(evaluate-test) + +$(objpfx)bug19509: $(libdl) diff --git a/elf/bug19509.c b/elf/bug19509.c new file mode 100644 index 0000000..e7ccf88 --- /dev/null +++ b/elf/bug19509.c @@ -0,0 +1,106 @@ +/* Test error reporting for dlsym, dlvsym failures. + Copyright (C) 2016 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include <dlfcn.h> +#include <gnu/lib-names.h> +#include <stdio.h> +#include <stdlib.h> + +/* Used to disambiguate symbol names. */ +static int counter; + +static void +test_one (void *handle, const char *name, void *(func) (void *, const char *), + const char *suffix) +{ + ++counter; + char symbol[32]; + snprintf (symbol, sizeof (symbol), "no_such_symbol_%d", counter); + char *expected_message; + if (asprintf (&expected_message, ": undefined symbol: %s%s", + symbol, suffix) < 0) + { + printf ("error: asprintf: %m\n"); + abort (); + } + + void *addr = func (handle, symbol); + if (addr != NULL) + { + printf ("error: %s: found symbol \"no_such_symbol\"\n", name); + abort (); + } + const char *message = dlerror (); + if (message == NULL) + { + printf ("error: %s: missing error message\n", name); + abort (); + } + const char *message_without_path = strchrnul (message, ':'); + if (strcmp (message_without_path, expected_message) != 0) + { + printf ("error: %s: unexpected error message: %s\n", name, message); + abort (); + } + free (expected_message); +} + +static void +test_handles (const char *name, void *(func) (void *, const char *), + const char *suffix) +{ + test_one (RTLD_DEFAULT, name, func, suffix); + test_one (RTLD_NEXT, name, func, suffix); + + void *handle = dlopen (LIBC_SO, RTLD_LAZY); + if (handle == NULL) + { + printf ("error: cannot dlopen %s: %s\n", LIBC_SO, dlerror ()); + abort (); + } + test_one (handle, name, func, suffix); + dlclose (handle); +} + +static void * +dlvsym_no_such_version (void *handle, const char *name) +{ + return dlvsym (handle, name, "NO_SUCH_VERSION"); +} + +static void * +dlvsym_glibc_private (void *handle, const char *name) +{ + return dlvsym (handle, name, "GLIBC_PRIVATE"); +} + +static int +do_test (void) +{ + test_handles ("dlsym", dlsym, ""); + test_handles ("dlvsym", dlvsym_no_such_version, + ", version NO_SUCH_VERSION"); + test_handles ("dlvsym", dlvsym_glibc_private, + ", version GLIBC_PRIVATE"); + + return 0; +} + + +#define TEST_FUNCTION do_test () +#include "../test-skeleton.c" diff --git a/elf/dl-lookup.c b/elf/dl-lookup.c index f577759..6d299c1 100644 --- a/elf/dl-lookup.c +++ b/elf/dl-lookup.c @@ -858,7 +858,6 @@ _dl_lookup_symbol_x (const char *undef_name, struct link_map *undef_map, if (__glibc_unlikely (current_value.s == NULL)) { if ((*ref == NULL || ELFW(ST_BIND) ((*ref)->st_info) != STB_WEAK) - && skip_map == NULL && !(GLRO(dl_debug_mask) & DL_DEBUG_UNUSED)) { /* We could find no value for a strong reference. */ -- 2.4.3