Use C11 atomics instead of atomic_bit_set/bit_test_set

Message ID AS4PR08MB7901C79CB44AE4213BD171C9837F9@AS4PR08MB7901.eurprd08.prod.outlook.com

Commit Message

Wilco Dijkstra Sept. 5, 2022, 4:35 p.m. UTC
Replace the 3 uses of atomic_bit_set and atomic_bit_test_set with atomic_fetch_or_acquire.

Passes regression testing on AArch64.

---
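
For reference, a minimal standalone sketch of the mapping this change relies on (illustrative only, not glibc code; the bit and mask values below are placeholders): atomic_bit_test_set returns the previous value of one bit, which in C11 terms is a fetch-or of the bit's mask followed by testing the bit in the returned (previous) value.

#include <stdatomic.h>
#include <stdio.h>

/* Placeholder bit/mask values, chosen only for this example.  */
#define TERMINATED_BIT      5
#define TERMINATED_BITMASK  (1 << TERMINATED_BIT)

int
main (void)
{
  _Atomic int cancelhandling = 0;

  /* Old idiom: atomic_bit_test_set (&cancelhandling, TERMINATED_BIT) == 0
     tests and sets one bit, returning its previous value.
     C11 idiom: fetch-or the bit's mask and test the bit in the returned
     (previous) value.  The memory order (acquire in this version of the
     patch, relaxed in the later v2) is orthogonal to the equivalence.  */
  int old = atomic_fetch_or_explicit (&cancelhandling, TERMINATED_BITMASK,
                                      memory_order_relaxed);
  if ((old & TERMINATED_BITMASK) == 0)
    puts ("bit was previously clear; this caller set it");

  /* atomic_bit_set ignores the previous value, so the same fetch-or is
     used with the result discarded.  */
  return 0;
}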

Comments

Florian Weimer Sept. 6, 2022, 12:11 p.m. UTC | #1
* Wilco Dijkstra via Libc-alpha:

> Replace the 3 uses of atomic_bit_set and atomic_bit_test_set with atomic_fetch_or_acquire.

Long line.

Why is acquire MO required here?  I don't see any synchronizing store.
Isn't this mostly a compiler barrier for use-after-free detection?

Thanks,
Florian
Wilco Dijkstra Sept. 6, 2022, 4:49 p.m. UTC | #2
Hi Florian,

> Why is acquire MO required here?  I don't see any synchronizing store.
> Isn't this mostly a compiler barrier for use-after-free detection?

You're right, it looks like the only reason for atomic is to ensure memory
is only freed once. A few cancelhandling accesses use acquire, and there
are various relaxed loads and even non-atomic loads of it, but not a
single release store, so there is no use for acquire MO here.

Cheers,
Wilco
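
As an illustration of the point above, a minimal standalone sketch (not the glibc code; the struct, function name and mask value are placeholders) of why a relaxed fetch-or is sufficient here: the atomic read-modify-write alone guarantees that exactly one caller observes the bit as previously clear, so only one caller takes the free path.

#include <stdatomic.h>
#include <stdlib.h>

#define TERMINATED_BITMASK 0x20   /* placeholder value */

struct tcb
{
  _Atomic int cancelhandling;
  void *stack;
};

/* Whichever caller flips the bit first sees it as previously clear and
   frees the memory; every later caller sees it already set and skips the
   free.  Relaxed ordering suffices because no other data is published
   through this flag.  */
static void
free_tcb_once (struct tcb *pd)
{
  if ((atomic_fetch_or_explicit (&pd->cancelhandling, TERMINATED_BITMASK,
                                 memory_order_relaxed)
       & TERMINATED_BITMASK) == 0)
    free (pd->stack);
}

int
main (void)
{
  struct tcb pd = { 0, malloc (16) };
  free_tcb_once (&pd);   /* frees pd.stack */
  free_tcb_once (&pd);   /* bit already set: no double free */
  return 0;
}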

Wilco Dijkstra Sept. 6, 2022, 4:52 p.m. UTC | #3
Hi Florian,

> Why is acquire MO required here?  I don't see any synchronizing store.
> Isn't this mostly a compiler barrier for use-after-free detection?

You're right, it looks like the only reason for atomic is to ensure memory
is only freed once. A few cancelhandling accesses use acquire, and there
are various relaxed loads and even non-atomic loads of it, but not a
single release store, so there is no use for acquire MO here.

Cheers,
Wilco


v2: Use relaxed atomics since there is no MO dependence

Replace the 3 uses of atomic_bit_set and atomic_bit_test_set with
atomic_fetch_or_relaxed.

---

diff --git a/malloc/malloc.c b/malloc/malloc.c
index 29fa71b3b2a3d0a671149eaf619e4d518c56aef5..ecec901b14f602e3c93da1a847f043ffee41a1f4 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -2460,11 +2460,11 @@ sysmalloc_mmap (INTERNAL_SIZE_T nb, size_t pagesize, int extra_flags, mstate av)
     }
 
   /* update statistics */
-  int new = atomic_exchange_and_add (&mp_.n_mmaps, 1) + 1;
+  int new = atomic_fetch_add_relaxed (&mp_.n_mmaps, 1) + 1;
   atomic_max (&mp_.max_n_mmaps, new);
 
   unsigned long sum;
-  sum = atomic_exchange_and_add (&mp_.mmapped_mem, size) + size;
+  sum = atomic_fetch_add_relaxed (&mp_.mmapped_mem, size) + size;
   atomic_max (&mp_.max_mmapped_mem, sum);
 
   check_chunk (av, p);
@@ -3084,7 +3084,7 @@ mremap_chunk (mchunkptr p, size_t new_size)
   set_head (p, (new_size - offset) | IS_MMAPPED);
 
   INTERNAL_SIZE_T new;
-  new = atomic_exchange_and_add (&mp_.mmapped_mem, new_size - size - offset)
+  new = atomic_fetch_add_relaxed (&mp_.mmapped_mem, new_size - size - offset)
         + new_size - size - offset;
   atomic_max (&mp_.max_mmapped_mem, new);
   return p;
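
For context on the statistics updates in these hunks, a minimal standalone sketch (illustrative only, not the glibc implementation; names are placeholders) of the pattern the atomic_fetch_add_relaxed / atomic_max pair expresses: a relaxed counter increment plus a CAS loop that tracks the running maximum.

#include <stdatomic.h>
#include <stddef.h>

/* Statistics only: nothing else is published through these counters,
   so relaxed ordering is sufficient.  */
static _Atomic size_t mmapped_mem;
static _Atomic size_t max_mmapped_mem;

static void
note_mmap (size_t size)
{
  size_t sum = atomic_fetch_add_explicit (&mmapped_mem, size,
                                          memory_order_relaxed) + size;

  /* CAS loop keeping a running maximum, analogous to atomic_max: retry
     until either the stored maximum is already >= sum or the exchange
     succeeds.  */
  size_t old = atomic_load_explicit (&max_mmapped_mem, memory_order_relaxed);
  while (old < sum
         && !atomic_compare_exchange_weak_explicit (&max_mmapped_mem, &old,
                                                    sum, memory_order_relaxed,
                                                    memory_order_relaxed))
    ;
}

int
main (void)
{
  note_mmap (4096);
  note_mmap (8192);
  return 0;
}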
Florian Weimer Sept. 6, 2022, 5:42 p.m. UTC | #4
* Wilco Dijkstra:

> Hi Florian,
>
>> Why is acquire MO required here?  I don't see any synchronizing store.
>> Isn't this mostly a compiler barrier for use-after-free detection?
>
> You're right, it looks like the only reason for atomic is to ensure memory
> is only freed once. A few cancelhandling accesses use acquire, and there
> are various relaxed loads and even non-atomic loads of it, but not a
> single release store, so there is no use for acquire MO here.
>
> Cheers,
> Wilco
>
>
> v2: Use relaxed atomics since there is no MO dependence
>
> Replace the 3 uses of atomic_bit_set and atomic_bit_test_set with
> atomic_fetch_or_relaxed.
>
> ---
>
> diff --git a/malloc/malloc.c b/malloc/malloc.c
> index 29fa71b3b2a3d0a671149eaf619e4d518c56aef5..ecec901b14f602e3c93da1a847f043ffee41a1f4 100644
> --- a/malloc/malloc.c
> +++ b/malloc/malloc.c
> @@ -2460,11 +2460,11 @@ sysmalloc_mmap (INTERNAL_SIZE_T nb, size_t pagesize, int extra_flags, mstate av)
>      }
>  
>    /* update statistics */
> -  int new = atomic_exchange_and_add (&mp_.n_mmaps, 1) + 1;
> +  int new = atomic_fetch_add_relaxed (&mp_.n_mmaps, 1) + 1;
>    atomic_max (&mp_.max_n_mmaps, new);
>  
>    unsigned long sum;
> -  sum = atomic_exchange_and_add (&mp_.mmapped_mem, size) + size;
> +  sum = atomic_fetch_add_relaxed (&mp_.mmapped_mem, size) + size;
>    atomic_max (&mp_.max_mmapped_mem, sum);
>  
>    check_chunk (av, p);
> @@ -3084,7 +3084,7 @@ mremap_chunk (mchunkptr p, size_t new_size)
>    set_head (p, (new_size - offset) | IS_MMAPPED);
>  
>    INTERNAL_SIZE_T new;
> -  new = atomic_exchange_and_add (&mp_.mmapped_mem, new_size - size - offset)
> +  new = atomic_fetch_add_relaxed (&mp_.mmapped_mem, new_size - size - offset)
>          + new_size - size - offset;
>    atomic_max (&mp_.max_mmapped_mem, new);
>    return p;

Wrong patch, I think?

Thanks,
Florian
Wilco Dijkstra Sept. 6, 2022, 6:41 p.m. UTC | #5
Hi Florian,

This time with the right patch:

v2: Use relaxed atomics since there is no MO dependence

Replace the 3 uses of atomic_bit_set and atomic_bit_test_set with
atomic_fetch_or_relaxed.

---

diff --git a/nptl/nptl_free_tcb.c b/nptl/nptl_free_tcb.c
index 0d31143ac849de6398e06c399b94813ae57dcff3..86a89048984da79476095dc64c4d3d2edf5eca74 100644
--- a/nptl/nptl_free_tcb.c
+++ b/nptl/nptl_free_tcb.c
@@ -24,7 +24,8 @@ void
 __nptl_free_tcb (struct pthread *pd)
 {
   /* The thread is exiting now.  */
-  if (atomic_bit_test_set (&pd->cancelhandling, TERMINATED_BIT) == 0)
+  if ((atomic_fetch_or_relaxed (&pd->cancelhandling, TERMINATED_BITMASK)
+      & TERMINATED_BITMASK) == 0)
     {
       /* Free TPP data.  */
       if (pd->tpp != NULL)
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 870a8fcb34eb43b58c2260fee6a4624f0fbbd469..77048bd4c148b0b2fe4ef8b4a2bf82593fef2d57 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -487,7 +487,7 @@ start_thread (void *arg)
   /* The thread is exiting now.  Don't set this bit until after we've hit
      the event-reporting breakpoint, so that td_thr_get_info on us while at
      the breakpoint reports TD_THR_RUN state rather than TD_THR_ZOMBIE.  */
-  atomic_bit_set (&pd->cancelhandling, EXITING_BIT);
+  atomic_fetch_or_relaxed (&pd->cancelhandling, EXITING_BITMASK);
 
   if (__glibc_unlikely (atomic_decrement_and_test (&__nptl_nthreads)))
     /* This was the last thread.  */
diff --git a/sysdeps/nptl/pthreadP.h b/sysdeps/nptl/pthreadP.h
index 39af275c254ef3e737736bd5c38099bada8746d6..c8521b0b232d9f9eeb09a7f62e81e8068411b9f3 100644
--- a/sysdeps/nptl/pthreadP.h
+++ b/sysdeps/nptl/pthreadP.h
@@ -43,12 +43,6 @@
   atomic_compare_and_exchange_val_acq (&(descr)->member, new, old)
 #endif
 
-#ifndef THREAD_ATOMIC_BIT_SET
-# define THREAD_ATOMIC_BIT_SET(descr, member, bit) \
-  atomic_bit_set (&(descr)->member, bit)
-#endif
-
-
 static inline short max_adaptive_count (void)
 {
 #if HAVE_TUNABLES
@@ -276,7 +270,7 @@ __do_cancel (void)
   struct pthread *self = THREAD_SELF;
 
   /* Make sure we get no more cancellations.  */
-  atomic_bit_set (&self->cancelhandling, EXITING_BIT);
+  atomic_fetch_or_relaxed (&self->cancelhandling, EXITING_BITMASK);
 
   __pthread_unwind ((__pthread_unwind_buf_t *)
 		    THREAD_GETMEM (self, cleanup_jmp_buf));
Florian Weimer Sept. 13, 2022, 10:02 a.m. UTC | #6
* Wilco Dijkstra:

> Hi Florian,
>
> This time with the right patch:
>
> v2: Use relaxed atomics since there is no MO dependence
>
> Replace the 3 uses of atomic_bit_set and atomic_bit_test_set with
> atomic_fetch_or_relaxed.

Patch looks okay.  Would you mind adding at least a sentence to the commit
message noting that this is for use-after-free detection only?  Thanks.

Reviewed-by: Florian Weimer <fweimer@redhat.com>

Florian

Patch

diff --git a/nptl/nptl_free_tcb.c b/nptl/nptl_free_tcb.c
index 0d31143ac849de6398e06c399b94813ae57dcff3..0ba291e0d87435b7f551ab8d64f73fb5abec92fc 100644
--- a/nptl/nptl_free_tcb.c
+++ b/nptl/nptl_free_tcb.c
@@ -24,7 +24,8 @@  void
 __nptl_free_tcb (struct pthread *pd)
 {
   /* The thread is exiting now.  */
-  if (atomic_bit_test_set (&pd->cancelhandling, TERMINATED_BIT) == 0)
+  if ((atomic_fetch_or_acquire (&pd->cancelhandling, TERMINATED_BITMASK)
+      & TERMINATED_BITMASK) == 0)
     {
       /* Free TPP data.  */
       if (pd->tpp != NULL)
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 870a8fcb34eb43b58c2260fee6a4624f0fbbd469..d206ed7bf4c2305c0d65bc2a47baefe02969e3d2 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -487,7 +487,7 @@  start_thread (void *arg)
   /* The thread is exiting now.  Don't set this bit until after we've hit
      the event-reporting breakpoint, so that td_thr_get_info on us while at
      the breakpoint reports TD_THR_RUN state rather than TD_THR_ZOMBIE.  */
-  atomic_bit_set (&pd->cancelhandling, EXITING_BIT);
+  atomic_fetch_or_acquire (&pd->cancelhandling, EXITING_BITMASK);
 
   if (__glibc_unlikely (atomic_decrement_and_test (&__nptl_nthreads)))
     /* This was the last thread.  */
diff --git a/sysdeps/nptl/pthreadP.h b/sysdeps/nptl/pthreadP.h
index 39af275c254ef3e737736bd5c38099bada8746d6..7db1b71f1d5b6b6925a5405090a83697545b6798 100644
--- a/sysdeps/nptl/pthreadP.h
+++ b/sysdeps/nptl/pthreadP.h
@@ -43,12 +43,6 @@ 
   atomic_compare_and_exchange_val_acq (&(descr)->member, new, old)
 #endif
 
-#ifndef THREAD_ATOMIC_BIT_SET
-# define THREAD_ATOMIC_BIT_SET(descr, member, bit) \
-  atomic_bit_set (&(descr)->member, bit)
-#endif
-
-
 static inline short max_adaptive_count (void)
 {
 #if HAVE_TUNABLES
@@ -276,7 +270,7 @@  __do_cancel (void)
   struct pthread *self = THREAD_SELF;
 
   /* Make sure we get no more cancellations.  */
-  atomic_bit_set (&self->cancelhandling, EXITING_BIT);
+  atomic_fetch_or_acquire (&self->cancelhandling, EXITING_BITMASK);
 
   __pthread_unwind ((__pthread_unwind_buf_t *)
 		    THREAD_GETMEM (self, cleanup_jmp_buf));