Message ID | AS4PR08MB7901C79CB44AE4213BD171C9837F9@AS4PR08MB7901.eurprd08.prod.outlook.com
---|---
State | New
Series | Use C11 atomics instead of atomic_bit_set/bit_test_set
* Wilco Dijkstra via Libc-alpha:
> Replace the 3 uses of atomic_bit_set and atomic_bit_test_set with atomic_fetch_or_acquire.
Long line.
Why is acquire MO required here? I don't see any synchronizing store.
Isn't this mostly a compiler barrier for use-after-free detection?
Thanks,
Florian
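For context on the question above: an acquire operation only establishes ordering when it reads a value written by a matching release (or stronger) store. A minimal sketch of that pairing in ISO C11 stdatomic; the variable and bit names are illustrative and are not glibc's internal atomic macros or the real cancelhandling bits:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative stand-ins: FLAGS plays the role of a word such as
   cancelhandling; PAYLOAD is ordinary data the flag is meant to guard.  */
static atomic_int flags;
static int payload;

#define EXAMPLE_BITMASK 0x40	/* hypothetical bit */

void
writer (void)
{
  payload = 42;
  /* The release store is what a later acquire synchronizes with.  */
  atomic_fetch_or_explicit (&flags, EXAMPLE_BITMASK, memory_order_release);
}

bool
reader (void)
{
  /* Acquire pays off only because the writer used release.  If every
     store to FLAGS were relaxed -- as is the case for cancelhandling,
     which has no release stores -- this acquire would not order
     anything extra.  */
  if (atomic_load_explicit (&flags, memory_order_acquire) & EXAMPLE_BITMASK)
    return payload == 42;	/* guaranteed by the release/acquire pair */
  return false;
}
```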
Hi Florian,

> Why is acquire MO required here? I don't see any synchronizing store.
> Isn't this mostly a compiler barrier for use-after-free detection?

You're right, it looks like the only reason for atomic is to ensure memory
is only freed once. A few cancelhandling accesses use acquire, and there
are various relaxed loads and even non-atomic loads of it, but not a
single release store, so there is no use for acquire MO here.

Cheers,
Wilco


v2: Use relaxed atomics since there is no MO dependence

Replace the 3 uses of atomic_bit_set and atomic_bit_test_set with
atomic_fetch_or_relaxed.

---

diff --git a/malloc/malloc.c b/malloc/malloc.c
index 29fa71b3b2a3d0a671149eaf619e4d518c56aef5..ecec901b14f602e3c93da1a847f043ffee41a1f4 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -2460,11 +2460,11 @@ sysmalloc_mmap (INTERNAL_SIZE_T nb, size_t pagesize, int extra_flags, mstate av)
     }

   /* update statistics */
-  int new = atomic_exchange_and_add (&mp_.n_mmaps, 1) + 1;
+  int new = atomic_fetch_add_relaxed (&mp_.n_mmaps, 1) + 1;
   atomic_max (&mp_.max_n_mmaps, new);

   unsigned long sum;
-  sum = atomic_exchange_and_add (&mp_.mmapped_mem, size) + size;
+  sum = atomic_fetch_add_relaxed (&mp_.mmapped_mem, size) + size;
   atomic_max (&mp_.max_mmapped_mem, sum);

   check_chunk (av, p);
@@ -3084,7 +3084,7 @@ mremap_chunk (mchunkptr p, size_t new_size)
       set_head (p, (new_size - offset) | IS_MMAPPED);

       INTERNAL_SIZE_T new;
-      new = atomic_exchange_and_add (&mp_.mmapped_mem, new_size - size - offset)
+      new = atomic_fetch_add_relaxed (&mp_.mmapped_mem, new_size - size - offset)
             + new_size - size - offset;
       atomic_max (&mp_.max_mmapped_mem, new);
       return p;
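Even with relaxed MO, an atomic read-modify-write is enough for the "freed only once" property described above: the RMW is indivisible, so only one thread can observe the bit as previously clear. A rough standalone sketch of that idea in C11, with a hypothetical struct, helper name, and bitmask value rather than the actual glibc code:

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Illustrative value; the real constant lives in nptl's descr.h.  */
#define TERMINATED_BITMASK 0x20

/* Hypothetical stand-in for the parts of a TCB that matter here.  */
struct tcb_like
{
  atomic_int cancelhandling;
  void *storage;
};

/* Even if several threads race to call this, the atomic RMW is
   indivisible, so exactly one caller sees the bit still clear and
   performs the free.  Relaxed MO is enough for that guarantee; the
   operation is not being used to order any other memory accesses.  */
void
free_tcb_once (struct tcb_like *pd)
{
  int old = atomic_fetch_or_explicit (&pd->cancelhandling,
                                      TERMINATED_BITMASK,
                                      memory_order_relaxed);
  if ((old & TERMINATED_BITMASK) == 0)
    free (pd->storage);
}
```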
* Wilco Dijkstra:

> Hi Florian,
>
>> Why is acquire MO required here? I don't see any synchronizing store.
>> Isn't this mostly a compiler barrier for use-after-free detection?
>
> You're right, it looks like the only reason for atomic is to ensure memory
> is only freed once. A few cancelhandling accesses use acquire, and there
> are various relaxed loads and even non-atomic loads of it, but not a
> single release store, so there is no use for acquire MO here.
>
> Cheers,
> Wilco
>
>
> v2: Use relaxed atomics since there is no MO dependence
>
> Replace the 3 uses of atomic_bit_set and atomic_bit_test_set with
> atomic_fetch_or_relaxed.
>
> ---
>
> diff --git a/malloc/malloc.c b/malloc/malloc.c
> index 29fa71b3b2a3d0a671149eaf619e4d518c56aef5..ecec901b14f602e3c93da1a847f043ffee41a1f4 100644
> --- a/malloc/malloc.c
> +++ b/malloc/malloc.c
> @@ -2460,11 +2460,11 @@ sysmalloc_mmap (INTERNAL_SIZE_T nb, size_t pagesize, int extra_flags, mstate av)
>      }
>
>    /* update statistics */
> -  int new = atomic_exchange_and_add (&mp_.n_mmaps, 1) + 1;
> +  int new = atomic_fetch_add_relaxed (&mp_.n_mmaps, 1) + 1;
>    atomic_max (&mp_.max_n_mmaps, new);
>
>    unsigned long sum;
> -  sum = atomic_exchange_and_add (&mp_.mmapped_mem, size) + size;
> +  sum = atomic_fetch_add_relaxed (&mp_.mmapped_mem, size) + size;
>    atomic_max (&mp_.max_mmapped_mem, sum);
>
>    check_chunk (av, p);
> @@ -3084,7 +3084,7 @@ mremap_chunk (mchunkptr p, size_t new_size)
>        set_head (p, (new_size - offset) | IS_MMAPPED);
>
>        INTERNAL_SIZE_T new;
> -      new = atomic_exchange_and_add (&mp_.mmapped_mem, new_size - size - offset)
> +      new = atomic_fetch_add_relaxed (&mp_.mmapped_mem, new_size - size - offset)
>              + new_size - size - offset;
>        atomic_max (&mp_.max_mmapped_mem, new);
>        return p;

Wrong patch, I think?

Thanks,
Florian
Hi Florian,

This time with the right patch:

v2: Use relaxed atomics since there is no MO dependence

Replace the 3 uses of atomic_bit_set and atomic_bit_test_set with
atomic_fetch_or_relaxed.

---

diff --git a/nptl/nptl_free_tcb.c b/nptl/nptl_free_tcb.c
index 0d31143ac849de6398e06c399b94813ae57dcff3..86a89048984da79476095dc64c4d3d2edf5eca74 100644
--- a/nptl/nptl_free_tcb.c
+++ b/nptl/nptl_free_tcb.c
@@ -24,7 +24,8 @@ void
 __nptl_free_tcb (struct pthread *pd)
 {
   /* The thread is exiting now. */
-  if (atomic_bit_test_set (&pd->cancelhandling, TERMINATED_BIT) == 0)
+  if ((atomic_fetch_or_relaxed (&pd->cancelhandling, TERMINATED_BITMASK)
+       & TERMINATED_BITMASK) == 0)
     {
       /* Free TPP data. */
       if (pd->tpp != NULL)
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 870a8fcb34eb43b58c2260fee6a4624f0fbbd469..77048bd4c148b0b2fe4ef8b4a2bf82593fef2d57 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -487,7 +487,7 @@ start_thread (void *arg)
   /* The thread is exiting now. Don't set this bit until after we've hit
      the event-reporting breakpoint, so that td_thr_get_info on us while at
      the breakpoint reports TD_THR_RUN state rather than TD_THR_ZOMBIE. */
-  atomic_bit_set (&pd->cancelhandling, EXITING_BIT);
+  atomic_fetch_or_relaxed (&pd->cancelhandling, EXITING_BITMASK);

   if (__glibc_unlikely (atomic_decrement_and_test (&__nptl_nthreads)))
     /* This was the last thread. */
diff --git a/sysdeps/nptl/pthreadP.h b/sysdeps/nptl/pthreadP.h
index 39af275c254ef3e737736bd5c38099bada8746d6..c8521b0b232d9f9eeb09a7f62e81e8068411b9f3 100644
--- a/sysdeps/nptl/pthreadP.h
+++ b/sysdeps/nptl/pthreadP.h
@@ -43,12 +43,6 @@
     atomic_compare_and_exchange_val_acq (&(descr)->member, new, old)
 #endif

-#ifndef THREAD_ATOMIC_BIT_SET
-# define THREAD_ATOMIC_BIT_SET(descr, member, bit) \
-  atomic_bit_set (&(descr)->member, bit)
-#endif
-
-
 static inline short max_adaptive_count (void)
 {
 #if HAVE_TUNABLES
@@ -276,7 +270,7 @@ __do_cancel (void)
   struct pthread *self = THREAD_SELF;

   /* Make sure we get no more cancellations. */
-  atomic_bit_set (&self->cancelhandling, EXITING_BIT);
+  atomic_fetch_or_relaxed (&self->cancelhandling, EXITING_BITMASK);

   __pthread_unwind ((__pthread_unwind_buf_t *)
                     THREAD_GETMEM (self, cleanup_jmp_buf));
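The transformation in the patch is mechanical: atomic_bit_test_set returned the old value of the chosen bit, and the C11-style form recovers it by masking the value returned by the fetch-or. A sketch of the correspondence in C11 stdatomic; the macro and the bit values here are illustrative, not the glibc definitions:

```c
#include <stdatomic.h>

/* Old glibc idiom (sketched): set bit number BIT in *MEM and return
   its previous value (nonzero if it was already set).  */
#define old_style_bit_test_set(mem, bit) \
  (atomic_fetch_or ((mem), 1 << (bit)) & (1 << (bit)))

/* Illustrative values only; the real bit layout lives in nptl's descr.h.  */
#define EXITING_BIT 4
#define EXITING_BITMASK (1 << EXITING_BIT)

/* New idiom from the patch: pass a mask instead of a bit number and
   test the returned old value against the same mask.  Returns nonzero
   if EXITING was already set before this call.  */
int
mark_exiting (atomic_int *cancelhandling)
{
  return (atomic_fetch_or_explicit (cancelhandling, EXITING_BITMASK,
                                    memory_order_relaxed)
          & EXITING_BITMASK) != 0;
}
```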
* Wilco Dijkstra:

> Hi Florian,
>
> This time with the right patch:
>
> v2: Use relaxed atomics since there is no MO dependence
>
> Replace the 3 uses of atomic_bit_set and atomic_bit_test_set with
> atomic_fetch_or_relaxed.

Patch looks okay. Would you mind adding a sentence to the commit message
at least that this is for use-after-free detection only? Thanks.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Florian
diff --git a/nptl/nptl_free_tcb.c b/nptl/nptl_free_tcb.c
index 0d31143ac849de6398e06c399b94813ae57dcff3..0ba291e0d87435b7f551ab8d64f73fb5abec92fc 100644
--- a/nptl/nptl_free_tcb.c
+++ b/nptl/nptl_free_tcb.c
@@ -24,7 +24,8 @@ void
 __nptl_free_tcb (struct pthread *pd)
 {
   /* The thread is exiting now. */
-  if (atomic_bit_test_set (&pd->cancelhandling, TERMINATED_BIT) == 0)
+  if ((atomic_fetch_or_acquire (&pd->cancelhandling, TERMINATED_BITMASK)
+       & TERMINATED_BITMASK) == 0)
     {
       /* Free TPP data. */
       if (pd->tpp != NULL)
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 870a8fcb34eb43b58c2260fee6a4624f0fbbd469..d206ed7bf4c2305c0d65bc2a47baefe02969e3d2 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -487,7 +487,7 @@ start_thread (void *arg)
   /* The thread is exiting now. Don't set this bit until after we've hit
      the event-reporting breakpoint, so that td_thr_get_info on us while at
      the breakpoint reports TD_THR_RUN state rather than TD_THR_ZOMBIE. */
-  atomic_bit_set (&pd->cancelhandling, EXITING_BIT);
+  atomic_fetch_or_acquire (&pd->cancelhandling, EXITING_BITMASK);

   if (__glibc_unlikely (atomic_decrement_and_test (&__nptl_nthreads)))
     /* This was the last thread. */
diff --git a/sysdeps/nptl/pthreadP.h b/sysdeps/nptl/pthreadP.h
index 39af275c254ef3e737736bd5c38099bada8746d6..7db1b71f1d5b6b6925a5405090a83697545b6798 100644
--- a/sysdeps/nptl/pthreadP.h
+++ b/sysdeps/nptl/pthreadP.h
@@ -43,12 +43,6 @@
     atomic_compare_and_exchange_val_acq (&(descr)->member, new, old)
 #endif

-#ifndef THREAD_ATOMIC_BIT_SET
-# define THREAD_ATOMIC_BIT_SET(descr, member, bit) \
-  atomic_bit_set (&(descr)->member, bit)
-#endif
-
-
 static inline short max_adaptive_count (void)
 {
 #if HAVE_TUNABLES
@@ -276,7 +270,7 @@ __do_cancel (void)
   struct pthread *self = THREAD_SELF;

   /* Make sure we get no more cancellations. */
-  atomic_bit_set (&self->cancelhandling, EXITING_BIT);
+  atomic_fetch_or_acquire (&self->cancelhandling, EXITING_BITMASK);

   __pthread_unwind ((__pthread_unwind_buf_t *)
                     THREAD_GETMEM (self, cleanup_jmp_buf));