Message ID | 20150709180544.GA8522@spoyarek.pnq.redhat.com |
---|---|
State | New |
The tests should have comments explaining what they are testing.

If it's important that all TLS accesses in libc code be IE, then there
should be a static test for that. Perhaps it could grep readelf -r output
for the TLS dynamic reloc types. Or perhaps there is something we could do
to make references to __tls_get_addr when linking libc.so be a hard failure.

If the intent is to catch all TLS access in libc code, then why are you
touching the source instead of just compiling with -ftls-model=initial-exec
across the board? If we do that, then it's probably fine not to have the
static test.

Thanks,
Roland
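A minimal sketch of the kind of static check suggested here, written in C to match the rest of the patch. It scans text in the format printed by `readelf -r` and flags lines that name a dynamic TLS relocation. The function names and the list of substrings are assumptions made for illustration; relocation names differ per architecture, so the list would need reviewing for each target before anything like this could serve as a real test.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if LINE (one line of `readelf -r` output) names a dynamic
   TLS relocation, 0 otherwise.  The marker list is an illustrative
   assumption: it matches names such as R_X86_64_DTPMOD64 and
   R_X86_64_TLSDESC, but not IE relocations such as R_X86_64_TPOFF64.  */
int
is_dynamic_tls_reloc (const char *line)
{
  static const char *const markers[] =
    { "DTPMOD", "DTPOFF", "TLSDESC", "TLS_DESC" };

  for (size_t i = 0; i < sizeof markers / sizeof markers[0]; i++)
    if (strstr (line, markers[i]) != NULL)
      return 1;
  return 0;
}

/* Count the flagged lines read from STREAM.  A line longer than the
   buffer is seen as several pieces, which is good enough for a sketch.  */
int
count_dynamic_tls_relocs (FILE *stream)
{
  char buf[512];
  int count = 0;

  while (fgets (buf, sizeof buf, stream) != NULL)
    count += is_dynamic_tls_reloc (buf);
  return count;
}
```

A small driver could call `count_dynamic_tls_relocs (stdin)` on the piped output of `readelf -r libc.so` and fail when the count is nonzero.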
On Thu, Jul 09, 2015 at 11:35:45PM +0530, Siddhesh Poyarekar wrote:
> The recently introduced TLS variables in the thread-local destructor
> implementation (__cxa_thread_atexit_impl) used the default GD access
> model, resulting in a call to __tls_get_addr. This causes a deadlock
> with recent changes to the way TLS is initialized because DTV
> allocations are delayed and hence despite knowing the offset to the
> variable inside its TLS block, the thread has to take the global rtld
> lock to safely update the TLS offset.
>
> This causes deadlocks when a thread is instantiated and joined inside
> a destructor of a dlopen'd DSO. The correct long term fix is to
> somehow not take the lock, but that will need a lot deeper change set
> to alter the way in which the big rtld lock is used.
>
> Instead, this patch just eliminates the call to __tls_get_addr for the
> thread-local variables inside libc.so. The variables changed are the
> 3 in cxa_thread_atexit and the strerror thread-local variable.
>
> There were concerns that the static storage for TLS is limited and
> hence we should not be using it. Additionally, dynamically loaded
> modules may result in libc.so looking for this static storage pretty
> late in static binaries. Both concerns are valid when using TLSDESC
> since that is where one may attempt to allocate a TLS block from
> static storage for even those variables that are not IE. They're not
> very strong arguments for the traditional TLS model though, since it
> assumes that the static storage would be used sparingly and definitely
> not by default. Hence, for now this would only theoretically affect
> ARM architectures.
>
> The impact is hence limited to statically linked binaries that dlopen
> modules that in turn load libc.so, all that on arm hardware. It seems
> like a small enough impact to justify fixing the larger problem that
> currently affects everything everywhere.
>
If we are at dlopen what happens if user uses older glibc version?

Otherwise increased storage doesn't matter much; you could bump the reserved
size if necessary. The only other use of non-IE TLS is 18 bytes in inet_ntoa
for the return value.
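For readers less familiar with the access models being discussed, here is a small sketch of what marking a variable as IE looks like in source. The variable and function names are made up for illustration, and the macro definition is an assumption: it is written out here as GCC's `tls_model` attribute, which is what `attribute_tls_model_ie` in the patch is understood to expand to.

```c
/* Assumption: this mirrors what attribute_tls_model_ie stands for in
   the patch, namely GCC's tls_model variable attribute.  */
#define attribute_tls_model_ie __attribute__ ((tls_model ("initial-exec")))

/* Hypothetical variables.  Built with -fPIC, the first one gets a
   dynamic TLS model chosen by the compiler, the second is forced to
   initial-exec.  */
static __thread int counter_default;
static __thread int counter_ie attribute_tls_model_ie;

/* Each accessor bumps the calling thread's own copy and returns it.  */
int
bump_default (void)
{
  return ++counter_default;
}

int
bump_ie (void)
{
  return ++counter_ie;
}
```

Both accessors behave the same at the C level. The difference should only show up in the generated code when this is compiled as position-independent code for a shared object: comparing the output of `gcc -O2 -fPIC -S` for the two functions is one way to see whether a call to `__tls_get_addr` (or a TLS descriptor sequence) remains.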
On Thu, Jul 09, 2015 at 01:40:33PM -0700, Roland McGrath wrote:
> The tests should have comments explaining what they are testing.

Ugh, I actually looked at the test now - I had just lifted it from
Alex's patch. I'll fix it up with the comment, copyright notice, etc.

> If it's important that all TLS accesses in libc code be IE, then there
> should be a static test for that. Perhaps it could grep readelf -r output
> for the TLS dynamic reloc types. Or perhaps there is something we could do
> to make references to __tls_get_addr when linking libc.so be a hard failure.
>
> If the intent is to catch all TLS access in libc code, then why are you
> touching the source instead of just compiling with -ftls-model=initial-exec
> across the board? If we do that, then it's probably fine not to have the
> static test.

I'll need to ensure that test cases are not built with
-ftls-model=initial-exec and also specific cases like memusage.
Basically, it is more work and is probably not something I can finish
in time for 2.22, given that I have other stuff to finish in the near
term. I'll add the readelf test case now to ensure that all libc.so
and libpthread.so code is IE. I've not reviewed the other modules and
I suspect that some of them should not have it either. Do you think
it would be OK if I do this in 2.23?

I also have to look at the impact on ARM since it uses
-ftls-model=gnu2 to get tls descriptors. I reckon it would actually
be an improvement, but I'd like to make sure that it is. There's also
a good case IMO to somehow compute static TLS usage within libc.so and
libpthread.so and add that to the surplus. That way the surplus would
be reserved specifically for user DSOs that absolutely want to use IE
and libc will never encroach that. Again a good project for 2.23.

Siddhesh
On Fri, Jul 10, 2015 at 01:43:14AM +0200, Ondřej Bílka wrote:
> If we are at dlopen what happens if user uses older glibc version?
It shouldn't matter since the variables are static. If we are to
compile all of libc.so with -ftls-model=initial-exec though, then I'll
need to make sure that public accesses such as errno remain unchanged.
Siddhesh
> I'll need to ensure that test cases are not built with
> -ftls-model=initial-exec and also specific cases like memusage.

Right.

> Basically, it is more work and is probably not something I can finish
> in time for 2.22, given that I have other stuff to finish in the near
> term. I'll add the readelf test case now to ensure that all libc.so
> and libpthread.so code is IE. I've not reviewed the other modules and
> I suspect that some of them should not have it either. Do you think
> it would be OK if I do this in 2.23?

If we have the readelf test in then I think that's fine for the time being.
Cleaning up further is just an ease-of-maintenance issue. The test will
break when such maintenance is required, so worst case we won't actually
revisit the issue until that happens.

> I also have to look at the impact on ARM since it uses -ftls-model=gnu2
> to get tls descriptors. I reckon it would actually be an improvement,
> but I'd like to make sure that it is.

Fair enough.

> There's also a good case IMO to somehow compute static TLS usage within
> libc.so and libpthread.so and add that to the surplus. That way the
> surplus would be reserved specifically for user DSOs that absolutely want
> to use IE and libc will never encroach that. Again a good project for
> 2.23.

That sounds reasonable.

Thanks,
Roland
On Fri, Jul 10, 2015 at 6:18 AM, Siddhesh Poyarekar <siddhesh@redhat.com> wrote:
> On Thu, Jul 09, 2015 at 01:40:33PM -0700, Roland McGrath wrote:

<snip>

>
> I also have to look at the impact on ARM since it uses
> -ftls-model=gnu2 to get tls descriptors. I reckon it would actually
> be an improvement, but I'd like to make sure that it is. There's also
> a good case IMO to somehow compute static TLS usage within libc.so and
> libpthread.so and add that to the surplus. That way the surplus would
> be reserved specifically for user DSOs that absolutely want to use IE
> and libc will never encroach that. Again a good project for 2.23.

FYI, while -ftls-model=gnu2 isn't default on AArch32 - on AArch64 tls
descriptors are the default, so you could test it there if you had
access to such hardware.

Ramana

>
> Siddhesh
On 07/10/2015 08:37 PM, Ramana Radhakrishnan wrote:
> On Fri, Jul 10, 2015 at 6:18 AM, Siddhesh Poyarekar <siddhesh@redhat.com> wrote:
>> On Thu, Jul 09, 2015 at 01:40:33PM -0700, Roland McGrath wrote:
>
> <snip>
>>
>> I also have to look at the impact on ARM since it uses
>> -ftls-model=gnu2 to get tls descriptors. I reckon it would actually
>> be an improvement, but I'd like to make sure that it is. There's also
>> a good case IMO to somehow compute static TLS usage within libc.so and
>> libpthread.so and add that to the surplus. That way the surplus would
>> be reserved specifically for user DSOs that absolutely want to use IE
>> and libc will never encroach that. Again a good project for 2.23.
>
> FYI, while -ftls-model=gnu2 isn't default on AArch32 - on AArch64 tls
> descriptors are the default, so you could test it there if you had
> access to such hardware.

We absolutely have access to aarch64 hardware. So we'll test there.

c.
On 07/09/2015 07:43 PM, Ondřej Bílka wrote:
>> The impact is hence limited to statically linked binaries that dlopen
>> modules that in turn load libc.so, all that on arm hardware. It seems
>> like a small enough impact to justify fixing the larger problem that
>> currently affects everything everywhere.
>>
> If we are at dlopen what happens if user uses older glibc version?

That is an unsupported scenario.

You must only dlopen a runtime that exactly matches the runtime you were
statically linked with. Anything other than that is not supported.

Cheers,
Carlos.
On 07/09/2015 02:05 PM, Siddhesh Poyarekar wrote:
>   [BZ #18457]
>   * nptl/Makefile (tests): New test case tst-join7.
>   (modules-names): New test case module tst-join7mod.
>   * nptl/tst-join7.c: New file.
>   * nptl/tst-join7mod.c: New file.
>   * stdlib/cxa_thread_atexit_impl.c (tls_dtor_list,
>   dso_symbol_cache, lm_cache): Mark variables as IE.
>   * string/strerror_l.c (last_value): Likewise.

OK if you fix the nits below.

This is now a release blocker :-)

> ---
>  nptl/Makefile                   | 10 ++++++++--
>  nptl/tst-join7.c                | 12 ++++++++++++
>  nptl/tst-join7mod.c             | 30 ++++++++++++++++++++++++++++++
>  stdlib/cxa_thread_atexit_impl.c |  6 +++---
>  string/strerror_l.c             |  2 +-
>  5 files changed, 54 insertions(+), 6 deletions(-)
>  create mode 100644 nptl/tst-join7.c
>  create mode 100644 nptl/tst-join7mod.c
>
> diff --git a/nptl/Makefile b/nptl/Makefile
> index 4544aa2..f14f4d6 100644
> --- a/nptl/Makefile
> +++ b/nptl/Makefile
> @@ -245,7 +245,7 @@ tests = tst-typesizes \
>    tst-basic7 \
>    tst-kill1 tst-kill2 tst-kill3 tst-kill4 tst-kill5 tst-kill6 \
>    tst-raise1 \
> -  tst-join1 tst-join2 tst-join3 tst-join4 tst-join5 tst-join6 \
> +  tst-join1 tst-join2 tst-join3 tst-join4 tst-join5 tst-join6 tst-join7 \

OK

>    tst-detach1 \
>    tst-eintr1 tst-eintr2 tst-eintr3 tst-eintr4 tst-eintr5 \
>    tst-tsd1 tst-tsd2 tst-tsd3 tst-tsd4 tst-tsd5 tst-tsd6 \
> @@ -323,7 +323,8 @@ endif
>  modules-names = tst-atfork2mod tst-tls3mod tst-tls4moda tst-tls4modb \
>    tst-tls5mod tst-tls5moda tst-tls5modb tst-tls5modc \
>    tst-tls5modd tst-tls5mode tst-tls5modf tst-stack4mod \
> -  tst-_res1mod1 tst-_res1mod2 tst-execstack-mod tst-fini1mod
> +  tst-_res1mod1 tst-_res1mod2 tst-execstack-mod tst-fini1mod \
> +  tst-join7mod

OK.

>  extra-test-objs += $(addsuffix .os,$(strip $(modules-names))) tst-cleanup4aux.o
>  test-extras += $(modules-names) tst-cleanup4aux
>  test-modules = $(addprefix $(objpfx),$(addsuffix .so,$(modules-names)))
> @@ -528,6 +529,11 @@ $(objpfx)tst-tls6.out: tst-tls6.sh $(objpfx)tst-tls5 \
>    $(evaluate-test)
>  endif
>
> +$(objpfx)tst-join7: $(libdl) $(shared-thread-library)
> +$(objpfx)tst-join7.out: $(objpfx)tst-join7mod.so
> +$(objpfx)tst-join7mod.so: $(shared-thread-library)
> +LDFLAGS-tst-join7mod.so = -Wl,-soname,tst-join7mod.so
> +

OK.

>  $(objpfx)tst-dlsym1: $(libdl) $(shared-thread-library)
>
>  $(objpfx)tst-fini1: $(shared-thread-library) $(objpfx)tst-fini1mod.so
> diff --git a/nptl/tst-join7.c b/nptl/tst-join7.c
> new file mode 100644
> index 0000000..bf6fc76
> --- /dev/null
> +++ b/nptl/tst-join7.c
> @@ -0,0 +1,12 @@

Needs header, and one line description of test, along with bug reference.

> +#include <dlfcn.h>
> +

As Roland pointed out, this needs to describe what it's doing in
a lengthy comment :-)

> +int
> +do_test (void)
> +{
> +  void *f = dlopen ("tst-join7mod.so", RTLD_NOW | RTLD_GLOBAL);
> +  if (f) dlclose (f); else return 1;
> +  return 0;
> +}
> +
> +#define TEST_FUNCTION do_test ()
> +#include "../test-skeleton.c"
> diff --git a/nptl/tst-join7mod.c b/nptl/tst-join7mod.c
> new file mode 100644
> index 0000000..9960b76
> --- /dev/null
> +++ b/nptl/tst-join7mod.c

Needs header, one line bug description, and bug ID.

> @@ -0,0 +1,30 @@
> +#include <stdio.h>
> +#include <pthread.h>
> +#include <atomic.h>
> +
> +static pthread_t th;
> +static int running = 1;
> +
> +static void *
> +test_run (void *p)
> +{
> +  while (atomic_load_relaxed (&running))
> +    printf ("XXX test_run\n");
> +  printf ("XXX test_run FINISHED\n");

I don't like the use of "XXX" since it indicates unfinished code or
other bad comments.

Why not just "Test running..." and "Test finished."?

> +  return NULL;
> +}
> +
> +static void __attribute__ ((constructor))
> +do_init (void)
> +{
> +  pthread_create (&th, NULL, test_run, NULL);

Check error.

> +}
> +
> +static void __attribute__ ((destructor))
> +do_end (void)
> +{
> +  atomic_store_relaxed (&running, 0);
> +  printf ("thread_join...\n");

Similar complaint: "Calling pthread_join...\n"

> +  pthread_join (th, NULL);

Check error.

> +  printf ("thread_join DONE\n");

"Thread joined"

> +}
> diff --git a/stdlib/cxa_thread_atexit_impl.c b/stdlib/cxa_thread_atexit_impl.c
> index 54e2812..9120162 100644
> --- a/stdlib/cxa_thread_atexit_impl.c
> +++ b/stdlib/cxa_thread_atexit_impl.c
> @@ -29,9 +29,9 @@ struct dtor_list
>    struct dtor_list *next;
>  };
>
> -static __thread struct dtor_list *tls_dtor_list;
> -static __thread void *dso_symbol_cache;
> -static __thread struct link_map *lm_cache;
> +static __thread struct dtor_list *tls_dtor_list attribute_tls_model_ie;
> +static __thread void *dso_symbol_cache attribute_tls_model_ie;
> +static __thread struct link_map *lm_cache attribute_tls_model_ie;

OK.

>
>  /* Register a destructor for TLS variables declared with the 'thread_local'
>     keyword. This function is only called from code generated by the C++
> diff --git a/string/strerror_l.c b/string/strerror_l.c
> index 2ed78b5..0b8bf2a 100644
> --- a/string/strerror_l.c
> +++ b/string/strerror_l.c
> @@ -23,7 +23,7 @@
>  #include <sys/param.h>
>
>
> -static __thread char *last_value;
> +static __thread char *last_value attribute_tls_model_ie;
>

OK.

>
>  static const char *
>

Cheers,
Carlos.
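To make the review nits concrete, here is a rough sketch of the thread handling with the return values checked and the messages reworded as requested. It is not the revised test module: it uses plain functions instead of the constructor and destructor, C11 `<stdatomic.h>` instead of glibc's internal `<atomic.h>`, and the names `start_worker` and `stop_worker` are hypothetical.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

static pthread_t th;
static atomic_int running = 1;

/* Worker: spin until the flag is cleared, yielding so the loop does
   not flood the output the way the original printf loop did.  */
static void *
test_run (void *p)
{
  (void) p;
  while (atomic_load_explicit (&running, memory_order_relaxed))
    sched_yield ();
  puts ("Test finished.");
  return NULL;
}

/* Stand-in for the constructor body.  Returns 0 on success.  */
int
start_worker (void)
{
  int err = pthread_create (&th, NULL, test_run, NULL);
  if (err != 0)
    {
      fprintf (stderr, "pthread_create failed: %s\n", strerror (err));
      return 1;
    }
  return 0;
}

/* Stand-in for the destructor body.  Returns 0 on success.  */
int
stop_worker (void)
{
  atomic_store_explicit (&running, 0, memory_order_relaxed);
  puts ("Calling pthread_join...");
  int err = pthread_join (th, NULL);
  if (err != 0)
    {
      fprintf (stderr, "pthread_join failed: %s\n", strerror (err));
      return 1;
    }
  puts ("Thread joined.");
  return 0;
}
```

In the real module the two bodies would stay inside `do_init` and `do_end`, and a failure would have to be reported in whatever way the test skeleton expects, which this sketch does not attempt.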
diff --git a/nptl/Makefile b/nptl/Makefile
index 4544aa2..f14f4d6 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -245,7 +245,7 @@ tests = tst-typesizes \
   tst-basic7 \
   tst-kill1 tst-kill2 tst-kill3 tst-kill4 tst-kill5 tst-kill6 \
   tst-raise1 \
-  tst-join1 tst-join2 tst-join3 tst-join4 tst-join5 tst-join6 \
+  tst-join1 tst-join2 tst-join3 tst-join4 tst-join5 tst-join6 tst-join7 \
   tst-detach1 \
   tst-eintr1 tst-eintr2 tst-eintr3 tst-eintr4 tst-eintr5 \
   tst-tsd1 tst-tsd2 tst-tsd3 tst-tsd4 tst-tsd5 tst-tsd6 \
@@ -323,7 +323,8 @@ endif
 modules-names = tst-atfork2mod tst-tls3mod tst-tls4moda tst-tls4modb \
   tst-tls5mod tst-tls5moda tst-tls5modb tst-tls5modc \
   tst-tls5modd tst-tls5mode tst-tls5modf tst-stack4mod \
-  tst-_res1mod1 tst-_res1mod2 tst-execstack-mod tst-fini1mod
+  tst-_res1mod1 tst-_res1mod2 tst-execstack-mod tst-fini1mod \
+  tst-join7mod
 extra-test-objs += $(addsuffix .os,$(strip $(modules-names))) tst-cleanup4aux.o
 test-extras += $(modules-names) tst-cleanup4aux
 test-modules = $(addprefix $(objpfx),$(addsuffix .so,$(modules-names)))
@@ -528,6 +529,11 @@ $(objpfx)tst-tls6.out: tst-tls6.sh $(objpfx)tst-tls5 \
   $(evaluate-test)
 endif

+$(objpfx)tst-join7: $(libdl) $(shared-thread-library)
+$(objpfx)tst-join7.out: $(objpfx)tst-join7mod.so
+$(objpfx)tst-join7mod.so: $(shared-thread-library)
+LDFLAGS-tst-join7mod.so = -Wl,-soname,tst-join7mod.so
+
 $(objpfx)tst-dlsym1: $(libdl) $(shared-thread-library)

 $(objpfx)tst-fini1: $(shared-thread-library) $(objpfx)tst-fini1mod.so
diff --git a/nptl/tst-join7.c b/nptl/tst-join7.c
new file mode 100644
index 0000000..bf6fc76
--- /dev/null
+++ b/nptl/tst-join7.c
@@ -0,0 +1,12 @@
+#include <dlfcn.h>
+
+int
+do_test (void)
+{
+  void *f = dlopen ("tst-join7mod.so", RTLD_NOW | RTLD_GLOBAL);
+  if (f) dlclose (f); else return 1;
+  return 0;
+}
+
+#define TEST_FUNCTION do_test ()
+#include "../test-skeleton.c"
diff --git a/nptl/tst-join7mod.c b/nptl/tst-join7mod.c
new file mode 100644
index 0000000..9960b76
--- /dev/null
+++ b/nptl/tst-join7mod.c
@@ -0,0 +1,30 @@
+#include <stdio.h>
+#include <pthread.h>
+#include <atomic.h>
+
+static pthread_t th;
+static int running = 1;
+
+static void *
+test_run (void *p)
+{
+  while (atomic_load_relaxed (&running))
+    printf ("XXX test_run\n");
+  printf ("XXX test_run FINISHED\n");
+  return NULL;
+}
+
+static void __attribute__ ((constructor))
+do_init (void)
+{
+  pthread_create (&th, NULL, test_run, NULL);
+}
+
+static void __attribute__ ((destructor))
+do_end (void)
+{
+  atomic_store_relaxed (&running, 0);
+  printf ("thread_join...\n");
+  pthread_join (th, NULL);
+  printf ("thread_join DONE\n");
+}
diff --git a/stdlib/cxa_thread_atexit_impl.c b/stdlib/cxa_thread_atexit_impl.c
index 54e2812..9120162 100644
--- a/stdlib/cxa_thread_atexit_impl.c
+++ b/stdlib/cxa_thread_atexit_impl.c
@@ -29,9 +29,9 @@ struct dtor_list
   struct dtor_list *next;
 };

-static __thread struct dtor_list *tls_dtor_list;
-static __thread void *dso_symbol_cache;
-static __thread struct link_map *lm_cache;
+static __thread struct dtor_list *tls_dtor_list attribute_tls_model_ie;
+static __thread void *dso_symbol_cache attribute_tls_model_ie;
+static __thread struct link_map *lm_cache attribute_tls_model_ie;

 /* Register a destructor for TLS variables declared with the 'thread_local'
    keyword. This function is only called from code generated by the C++
diff --git a/string/strerror_l.c b/string/strerror_l.c
index 2ed78b5..0b8bf2a 100644
--- a/string/strerror_l.c
+++ b/string/strerror_l.c
@@ -23,7 +23,7 @@
 #include <sys/param.h>


-static __thread char *last_value;
+static __thread char *last_value attribute_tls_model_ie;


 static const char *