Message ID | 5537B943.2060001@linaro.org |
---|---|
State | New |
Headers | show |
On 22/04/15 16:07, Adhemerval Zanella wrote: > On 22-04-2015 09:27, Szabolcs Nagy wrote: >> On the write side I used a full barrier (__sync_synchronize) although >> >> dmb ishst >> >> would be enough, but write side is not performance critical. > > My understanding is you do not need to push a seq-consistent, but rather > a store release on the 'td->arg', since the code only requires that > 'td->entry' should not be reordered with 'td->arg'. So, to adjust to > C11 semantics I would change: > > diff --git a/sysdeps/aarch64/tlsdesc.c b/sysdeps/aarch64/tlsdesc.c > index 4821f8c..f738cc6 100644 > --- a/sysdeps/aarch64/tlsdesc.c > +++ b/sysdeps/aarch64/tlsdesc.c > @@ -87,6 +87,8 @@ _dl_tlsdesc_resolve_rela_fixup (struct tlsdesc volatile *td, > if (!sym) > { > td->arg = (void*) reloc->r_addend; > + /* Barrier so readers see the write above before the one below. */ > + __sync_synchronize (); > > To 'atomic_store_relase (td->arg, (void*) reloc->r_addend))' > > i think atomic_store_release(&td->entry, (void*) reloc->r_addend)) is the correct replacement, because moving stores before a store_release is valid >> - Lazy binding for static TLS may be unnecessary complexity: it seems >> that gcc generates at most two static TLSDESC relocation entries for a >> translation unit (zero and non-zero initialized TLS), so there has to be >> a lot of object files with static TLS linked into a shared object to >> make the load time relocation overhead significant. Unless there is some >> evidence that lazy static TLS relocation makes sense I would suggest >> removing that logic (from all archs). (The dynamic case is probably a >> similar micro-optimization, but there the lazy vs non-lazy semantics are >> different in case of undefined symbols and some applications may depend >> on that). > > Recently I am seeing that all lazy relocation yields less gains than > before and add a lot of complexity. I would like to see an usercase > where lazy TLS or even normal relocation shows a real gain. > i'm not sure if glibc considers lazy relocation as part of the abi contract, but i think static tls is always safe to do non-lazily
diff --git a/sysdeps/aarch64/tlsdesc.c b/sysdeps/aarch64/tlsdesc.c index 4821f8c..f738cc6 100644 --- a/sysdeps/aarch64/tlsdesc.c +++ b/sysdeps/aarch64/tlsdesc.c @@ -87,6 +87,8 @@ _dl_tlsdesc_resolve_rela_fixup (struct tlsdesc volatile *td, if (!sym) { td->arg = (void*) reloc->r_addend; + /* Barrier so readers see the write above before the one below. */ + __sync_synchronize (); To 'atomic_store_relase (td->arg, (void*) reloc->r_addend))'