Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock.

Message ID fcc30533-41b1-a159-7019-290d2c489242@linux.ibm.com
State New

Commit Message

Stefan Liebler Feb. 5, 2019, 4:21 p.m. UTC
Hi,

while debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
instructions:
140:   a5 1b 00 01             oill    %r1,1
144:   e5 48 a0 f0 00 00       mvghi   240(%r10),0   <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
14a:   e3 10 a0 e0 00 24       stg     %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI

vs (with compiler barriers):
140:   a5 1b 00 01             oill    %r1,1
144:   e3 10 a0 e0 00 24       stg     %r1,224(%r10)
14a:   e5 48 a0 f0 00 00       mvghi   240(%r10),0

Please have a look at the discussion:
"Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
(https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
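
For context, here is a minimal, self-contained C sketch of the ordering
the barriers enforce. All types and helpers below are illustrative
stand-ins, not the actual glibc internals (THREAD_SETMEM, ENQUEUE_MUTEX,
the real thread descriptor, ...):

#include <stdatomic.h>

/* Stand-in for the robust list node embedded in each mutex.  */
struct robust_list { struct robust_list *next; };

/* Stand-in for the thread descriptor's robust_head.  */
struct thread_descr
{
  struct robust_list *list_op_pending;  /* like robust_head.list_op_pending */
  struct robust_list list;              /* like robust_head.list */
};

struct my_mutex
{
  _Atomic int lock;                     /* like mutex->__data.__lock */
  struct robust_list node;              /* like mutex->__data.__list */
};

static int
try_lock_robust (struct thread_descr *self, struct my_mutex *m, int tid)
{
  /* 1. Publish the pending operation so the kernel can recover the
     mutex if this thread dies mid-operation.  */
  self->list_op_pending = &m->node;
  /* op_pending must be visible before the acquire-only CAS below.  */
  __asm ("" ::: "memory");

  int expected = 0;
  if (!atomic_compare_exchange_strong_explicit (&m->lock, &expected, tid,
                                                memory_order_acquire,
                                                memory_order_relaxed))
    {
      /* Did not acquire the lock; clearing op_pending needs no
         ordering wrt other memory accesses.  */
      self->list_op_pending = NULL;
      return 1;   /* EBUSY-like failure.  */
    }

  /* 2. We own the mutex; only now may it appear on the robust list.  */
  __asm ("" ::: "memory");
  m->node.next = self->list.next;       /* simplified ENQUEUE_MUTEX */
  self->list.next = &m->node;
  __asm ("" ::: "memory");

  /* 3. Only after the enqueue stores is the operation complete.  */
  self->list_op_pending = NULL;
  return 0;
}

The stores around the CAS are plain stores to distinct objects, and the
CAS itself only has acquire semantics, so without the barriers the
compiler is free to sink the op_pending store below the CAS, or to swap
the enqueue stores with the final op_pending clear -- the latter being
exactly the s390x reordering shown above.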

This patch is introducing the same compiler barriers and comments
for pthread_mutex_trylock as introduced for pthread_mutex_lock and
pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
"Add compiler barriers around modifications of the robust mutex list."

Okay to commit?

The original commit was first available with glibc release 2.25.
Once this patch is committed, we should at least backport it to
glibc release branches 2.25 - 2.28?

Does anybody know if and where the original commit was backported to?
I've found at least "Bug 1401665 - Fix process shared robust mutex defects." (https://bugzilla.redhat.com/show_bug.cgi?id=1401665#c34)

Bye
Stefan

ChangeLog:

	* nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
	Add compiler barriers and comments.

Comments

Carlos O'Donell Feb. 5, 2019, 9 p.m. UTC | #1
On 2/5/19 11:21 AM, Stefan Liebler wrote:
> while debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
> Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
> instructions:
> 140:   a5 1b 00 01             oill    %r1,1
> 144:   e5 48 a0 f0 00 00       mvghi   240(%r10),0   <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> 14a:   e3 10 a0 e0 00 24       stg     %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
> 
> vs (with compiler barriers):
> 140:   a5 1b 00 01             oill    %r1,1
> 144:   e3 10 a0 e0 00 24       stg     %r1,224(%r10)
> 14a:   e5 48 a0 f0 00 00       mvghi   240(%r10),0
> 
> Please have a look at the discussion:
> "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
> (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)

What a great read! Thank you to everyone who had to spend time tracking
this down. Thank you for the patch, Stefan.

> This patch is introducing the same compiler barriers and comments
> for pthread_mutex_trylock as introduced for pthread_mutex_lock and
> pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
> "Add compiler barriers around modifications of the robust mutex list."
> 
> Okay to commit?

OK for master with 3 additional comments about ordering.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

> The original commit was first available with glibc release 2.25.
> Once this patch is committed, we should at least backport it to
> glibc release branches 2.25 - 2.28?

... and 2.29.

Yes, absolutely, once you commit to master you are free to use
git cherry-pick -x to put the fix on all the branches you need.
Please post your work to libc-stable@sourceware.org.

> Does anybody know if and where the original commit was backported to?
> I've found at least "Bug 1401665 - Fix process shared robust mutex defects." (https://bugzilla.redhat.com/show_bug.cgi?id=1401665#c34)

Yes, I did that backport to RHEL 7.6. These fixes are just "further"
fixes, right? I'll work on getting this fixed in RHEL 7.7, and RHEL 8
for all arches.

>     * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
>     Add compiler barriers and comments.
> 
> 20190205_pthread_mutex_trylock_barriers.patch
> 
> commit 9efa39ef04961397e39e7a9d3c11a33937755aec
> Author: Stefan Liebler <stli@linux.ibm.com>
> Date:   Tue Feb 5 12:37:42 2019 +0100
> 
>     Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock.

OK.

>     While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
>     Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
>     instructions:
>     140:   a5 1b 00 01             oill    %r1,1
>     144:   e5 48 a0 f0 00 00       mvghi   240(%r10),0   <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>     14a:   e3 10 a0 e0 00 24       stg     %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
>     
>     vs (with compiler barriers):
>     140:   a5 1b 00 01             oill    %r1,1
>     144:   e3 10 a0 e0 00 24       stg     %r1,224(%r10)
>     14a:   e5 48 a0 f0 00 00       mvghi   240(%r10),0
>     
>     Please have a look at the discussion:
>     "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
>     (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)

OK.

>     This patch is introducing the same compiler barriers and comments
>     for pthread_mutex_trylock as introduced for pthread_mutex_lock and
>     pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
>     "Add compiler barriers around modifications of the robust mutex list."
>     
>     ChangeLog:
>     
>             * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
>             Add compiler barriers and comments.

OK.

> 

I reviewed:

nptl/descr.h - OK
nptl/pthread_mutex_unlock.c - OK
nptl/pthread_mutex_timedlock.c - OK
nptl/allocatestack.c - OK
nptl/pthread_create.c - OK
nptl/pthread_mutex_lock.c - OK
nptl/nptl-init.c - OK
nptl/pthread_mutex_trylock.c <--- I count 10 missing barriers, and 9 missing ordering comments.
sysdeps/nptl/fork.c - OK

> diff --git a/nptl/pthread_mutex_trylock.c b/nptl/pthread_mutex_trylock.c
> index 8fe43b8f0f..ff1d7282ab 100644
> --- a/nptl/pthread_mutex_trylock.c
> +++ b/nptl/pthread_mutex_trylock.c
> @@ -94,6 +94,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>      case PTHREAD_MUTEX_ROBUST_ADAPTIVE_NP:
>        THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>  		     &mutex->__data.__list.__next);
> +      /* We need to set op_pending before starting the operation.  Also
> +	 see comments at ENQUEUE_MUTEX.  */
> +      __asm ("" ::: "memory");

OK. 1/10 barriers.

>  
>        oldval = mutex->__data.__lock;
>        do
> @@ -119,7 +122,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	      /* But it is inconsistent unless marked otherwise.  */
>  	      mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>  
> +	      /* We must not enqueue the mutex before we have acquired it.
> +		 Also see comments at ENQUEUE_MUTEX.  */
> +	      __asm ("" ::: "memory");

OK. 2/10 barriers.

>  	      ENQUEUE_MUTEX (mutex);
> +	      /* We need to clear op_pending after we enqueue the mutex.  */
> +	      __asm ("" ::: "memory");

OK. 3/10 barriers.

>  	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  	      /* Note that we deliberately exist here.  If we fall
> @@ -135,6 +143,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	      int kind = PTHREAD_MUTEX_TYPE (mutex);
>  	      if (kind == PTHREAD_MUTEX_ROBUST_ERRORCHECK_NP)
>  		{
> +		  /* We do not need to ensure ordering wrt another memory
> +		     access.  Also see comments at ENQUEUE_MUTEX. */

OK. 1/9 comments.

>  		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>  				 NULL);
>  		  return EDEADLK;
> @@ -142,6 +152,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  
>  	      if (kind == PTHREAD_MUTEX_ROBUST_RECURSIVE_NP)
>  		{
> +		  /* We do not need to ensure ordering wrt another memory
> +		     access.  */

OK. 2/9 comments.

>  		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>  				 NULL);
>  

Missing comment.

159           oldval = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,

Did not acquire the lock.

160                                                         id, 0);
161           if (oldval != 0 && (oldval & FUTEX_OWNER_DIED) == 0)
162             {
163               THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);

Clearing list_op_pending has no ordering requirement, and so could use a comment?

164 
165               return EBUSY;
166             }


> @@ -173,13 +185,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	      if (oldval == id)
>  		lll_unlock (mutex->__data.__lock,
>  			    PTHREAD_ROBUST_MUTEX_PSHARED (mutex));
> +	      /* FIXME This violates the mutex destruction requirements.  See
> +		 __pthread_mutex_unlock_full.  */

OK. 3/9 comments.

>  	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  	      return ENOTRECOVERABLE;
>  	    }
>  	}
>        while ((oldval & FUTEX_OWNER_DIED) != 0);
>  
> +      /* We must not enqueue the mutex before we have acquired it.
> +	 Also see comments at ENQUEUE_MUTEX.  */
> +      __asm ("" ::: "memory");

OK. 4/10 barriers.

>        ENQUEUE_MUTEX (mutex);
> +      /* We need to clear op_pending after we enqueue the mutex.  */
> +      __asm ("" ::: "memory");

OK. 5/10 barriers.

>        THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>        mutex->__data.__owner = id;
> @@ -211,10 +230,15 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	}
>  
>  	if (robust)
> -	  /* Note: robust PI futexes are signaled by setting bit 0.  */
> -	  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> -			 (void *) (((uintptr_t) &mutex->__data.__list.__next)
> -				   | 1));
> +	  {
> +	    /* Note: robust PI futexes are signaled by setting bit 0.  */
> +	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> +			   (void *) (((uintptr_t) &mutex->__data.__list.__next)
> +				     | 1));
> +	    /* We need to set op_pending before starting the operation.  Also
> +	       see comments at ENQUEUE_MUTEX.  */
> +	    __asm ("" ::: "memory");

OK. 6/10 barriers.

> +	  }
>  
>  	oldval = mutex->__data.__lock;
>  
> @@ -223,12 +247,16 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	  {
>  	    if (kind == PTHREAD_MUTEX_ERRORCHECK_NP)
>  	      {
> +		/* We do not need to ensure ordering wrt another memory
> +		   access.  */

OK. 4/9 comments.

>  		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  		return EDEADLK;
>  	      }
>  
>  	    if (kind == PTHREAD_MUTEX_RECURSIVE_NP)
>  	      {
> +		/* We do not need to ensure ordering wrt another memory
> +		   access.  */

OK. 5/9 comments.

>  		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  		/* Just bump the counter.  */

Missing comment?

249         if (oldval != 0)

Failed to get the lock.

250           {
251             if ((oldval & FUTEX_OWNER_DIED) == 0)
252               {
253                 THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);

Clearing list_op_pending has no ordering requirement, and so could use a comment?

254 
255                 return EBUSY;
256               }
...
270             if (INTERNAL_SYSCALL_ERROR_P (e, __err)
271                 && INTERNAL_SYSCALL_ERRNO (e, __err) == EWOULDBLOCK)
272               {
273                 THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);

Clearing list_op_pending has no ordering requirement, and so could use a comment?

274 
275                 return EBUSY;
276               }


> @@ -287,7 +315,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	    /* But it is inconsistent unless marked otherwise.  */
>  	    mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>  
> +	    /* We must not enqueue the mutex before we have acquired it.
> +	       Also see comments at ENQUEUE_MUTEX.  */
> +	    __asm ("" ::: "memory");

OK. 7/10 barriers.

>  	    ENQUEUE_MUTEX (mutex);
> +	    /* We need to clear op_pending after we enqueue the mutex.  */
> +	    __asm ("" ::: "memory");

OK. 8/10 barriers.

>  	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  	    /* Note that we deliberately exit here.  If we fall
> @@ -310,13 +343,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  						  PTHREAD_ROBUST_MUTEX_PSHARED (mutex)),
>  			      0, 0);
>  
> +	    /* To the kernel, this will be visible after the kernel has
> +	       acquired the mutex in the syscall.  */

OK. 6/9 comments. 3 missing comments noted.

>  	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  	    return ENOTRECOVERABLE;
>  	  }
>  
>  	if (robust)
>  	  {
> +	    /* We must not enqueue the mutex before we have acquired it.
> +	       Also see comments at ENQUEUE_MUTEX.  */
> +	    __asm ("" ::: "memory");

OK. 9/10 barriers.

>  	    ENQUEUE_MUTEX_PI (mutex);
> +	    /* We need to clear op_pending after we enqueue the mutex.  */
> +	    __asm ("" ::: "memory");

OK. 10/10 barriers.

>  	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  	  }
>
Florian Weimer Feb. 6, 2019, 7:02 a.m. UTC | #2
* Stefan Liebler:

> 	* nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
> 	Add compiler barriers and comments.

Please file a bug on sourceware.org and reference it in the commit.
Thanks.

Florian
Stefan Liebler Feb. 6, 2019, 11:25 a.m. UTC | #3
On 02/06/2019 08:02 AM, Florian Weimer wrote:
> * Stefan Liebler:
> 
>> 	* nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
>> 	Add compiler barriers and comments.
> 
> Please file a bug on sourceware.org and reference it in the commit.
> Thanks.
> 
> Florian
> 

Okay. I've filed the following bug:
"Bug 24180 - pthread_mutex_trylock does not use the correct order of 
instructions while maintaining the robust mutex list due to missing 
compiler barriers."
https://sourceware.org/bugzilla/show_bug.cgi?id=24180
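
For background, the ordering matters because of the walk the kernel
performs when a thread dies. A rough, hypothetical model of that
cleanup (cf. the kernel's exit_robust_list() and handle_futex_death();
this is illustrative C, not the actual kernel code):

/* Kernel ABI: one list node per robust mutex, linked from a per-thread
   head that glibc registers via the set_robust_list syscall.  */
struct robust_list { struct robust_list *next; };

struct robust_list_head
{
  struct robust_list list;             /* circular list of held robust mutexes */
  long futex_offset;                   /* offset from list entry to futex word */
  struct robust_list *list_op_pending; /* entry being enqueued/dequeued, or NULL */
};

static void
mark_owner_died (struct robust_list *entry, long futex_offset)
{
  /* The kernel would set FUTEX_OWNER_DIED in the futex word (if the
     dead thread owned it) and wake a waiter; details omitted.  */
  (void) entry;
  (void) futex_offset;
}

static void
handle_thread_death (struct robust_list_head *head)
{
  /* Recover every mutex the dead thread had enqueued.  */
  for (struct robust_list *e = head->list.next;
       e != &head->list; e = e->next)
    if (e != head->list_op_pending)
      mark_owner_died (e, head->futex_offset);

  /* Also recover a mutex caught mid-operation.  */
  if (head->list_op_pending != NULL)
    mark_owner_died (head->list_op_pending, head->futex_offset);
}

If the enqueue stores become visible only after list_op_pending has
already been cleared, a thread dying in that window leaves an acquired
mutex on neither the list nor op_pending, so the kernel never recovers
it -- the lost-owner window behind the warning discussed in the lkml
thread referenced earlier.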
Stefan Liebler Feb. 6, 2019, 11:25 a.m. UTC | #4
Hi Carlos,
I've updated the patch with three additional comments and I've mentioned 
the filed bug.
Please review it once again before I commit it to master and cherry pick 
it to the release branches.

On 02/05/2019 10:00 PM, Carlos O'Donell wrote:
> On 2/5/19 11:21 AM, Stefan Liebler wrote:
>> while debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
>> Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
>> instructions:
>> 140:   a5 1b 00 01             oill    %r1,1
>> 144:   e5 48 a0 f0 00 00       mvghi   240(%r10),0   <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>> 14a:   e3 10 a0 e0 00 24       stg     %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
>>
>> vs (with compiler barriers):
>> 140:   a5 1b 00 01             oill    %r1,1
>> 144:   e3 10 a0 e0 00 24       stg     %r1,224(%r10)
>> 14a:   e5 48 a0 f0 00 00       mvghi   240(%r10),0
>>
>> Please have a look at the discussion:
>> "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
>> (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
> 
> What a great read! Thank you to everyone who had to spend time tracking
> this down. Thank you for the patch, Stefan.
> 
>> This patch is introducing the same compiler barriers and comments
>> for pthread_mutex_trylock as introduced for pthread_mutex_lock and
>> pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
>> "Add compiler barriers around modifications of the robust mutex list."
>>
>> Okay to commit?
> 
> OK for master with 3 additional comments about ordering.
> 
> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
> 
>> The original commit was first available with glibc release 2.25.
>> Once this patch is committed, we should at least backport it to
>> glibc release branches 2.25 - 2.28?
> 
> ... and 2.29.
Yes, of course.

> 
> Yes, absolutely, once you commit to master you are free to use
> git cherry-pick -x to put the fix on all the branches you need.
> Please post your work to libc-stable@sourceware.org.
> 
>> Does anybody know if and where the original commit was backported to?
>> I've found at least "Bug 1401665 - Fix process shared robust mutex defects." (https://bugzilla.redhat.com/show_bug.cgi?id=1401665#c34)
> 
> Yes, I did that backport to RHEL 7.6. These fixes are just "further"
> fixes, right? I'll work on getting this fixed in RHEL 7.7, and RHEL 8
> for all arches.
Sounds great.
That's the same fix for pthread_mutex_trylock as previously done for 
pthread_mutex_lock and pthread_mutex_timedlock.

> 
>>      * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
>>      Add compiler barriers and comments.
>>
>> 20190205_pthread_mutex_trylock_barriers.patch
>>
>> commit 9efa39ef04961397e39e7a9d3c11a33937755aec
>> Author: Stefan Liebler <stli@linux.ibm.com>
>> Date:   Tue Feb 5 12:37:42 2019 +0100
>>
>>      Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock.
> 
> OK.
> 
>>      While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
>>      Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
>>      instructions:
>>      140:   a5 1b 00 01             oill    %r1,1
>>      144:   e5 48 a0 f0 00 00       mvghi   240(%r10),0   <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>      14a:   e3 10 a0 e0 00 24       stg     %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
>>      
>>      vs (with compiler barriers):
>>      140:   a5 1b 00 01             oill    %r1,1
>>      144:   e3 10 a0 e0 00 24       stg     %r1,224(%r10)
>>      14a:   e5 48 a0 f0 00 00       mvghi   240(%r10),0
>>      
>>      Please have a look at the discussion:
>>      "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
>>      (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
> 
> OK.
> 
>>      This patch is introducing the same compiler barriers and comments
>>      for pthread_mutex_trylock as introduced for pthread_mutex_lock and
>>      pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
>>      "Add compiler barriers around modifications of the robust mutex list."
>>      
>>      ChangeLog:
>>      
>>              * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
>>              Add compiler barriers and comments.
> 
> OK.
> 
>>
> 
> I reviewed:
> 
> nptl/descr.h - OK
> nptl/pthread_mutex_unlock.c - OK
> nptl/pthread_mutex_timedlock.c - OK
> nptl/allocatestack.c - OK
> nptl/pthread_create.c - OK
> nptl/pthread_mutex_lock.c - OK
> nptl/nptl-init.c - OK
> nptl/pthread_mutex_trylock.c <--- I count 10 missing barriers, and 9 missing ordering comments.
> sysdeps/nptl/fork.c - OK
> 
>> diff --git a/nptl/pthread_mutex_trylock.c b/nptl/pthread_mutex_trylock.c
>> index 8fe43b8f0f..ff1d7282ab 100644
>> --- a/nptl/pthread_mutex_trylock.c
>> +++ b/nptl/pthread_mutex_trylock.c
>> @@ -94,6 +94,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>       case PTHREAD_MUTEX_ROBUST_ADAPTIVE_NP:
>>         THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>>   		     &mutex->__data.__list.__next);
>> +      /* We need to set op_pending before starting the operation.  Also
>> +	 see comments at ENQUEUE_MUTEX.  */
>> +      __asm ("" ::: "memory");
> 
> OK. 1/10 barriers.
> 
>>   
>>         oldval = mutex->__data.__lock;
>>         do
>> @@ -119,7 +122,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   	      /* But it is inconsistent unless marked otherwise.  */
>>   	      mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>>   
>> +	      /* We must not enqueue the mutex before we have acquired it.
>> +		 Also see comments at ENQUEUE_MUTEX.  */
>> +	      __asm ("" ::: "memory");
> 
> OK. 2/10 barriers.
> 
>>   	      ENQUEUE_MUTEX (mutex);
>> +	      /* We need to clear op_pending after we enqueue the mutex.  */
>> +	      __asm ("" ::: "memory");
> 
> OK. 3/10 barriers.
> 
>>   	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   
>>   	      /* Note that we deliberately exist here.  If we fall
>> @@ -135,6 +143,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   	      int kind = PTHREAD_MUTEX_TYPE (mutex);
>>   	      if (kind == PTHREAD_MUTEX_ROBUST_ERRORCHECK_NP)
>>   		{
>> +		  /* We do not need to ensure ordering wrt another memory
>> +		     access.  Also see comments at ENQUEUE_MUTEX. */
> 
> OK. 1/9 comments.
> 
>>   		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>>   				 NULL);
>>   		  return EDEADLK;
>> @@ -142,6 +152,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   
>>   	      if (kind == PTHREAD_MUTEX_ROBUST_RECURSIVE_NP)
>>   		{
>> +		  /* We do not need to ensure ordering wrt another memory
>> +		     access.  */
> 
> OK. 2/9 comments.
> 
>>   		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>>   				 NULL);
>>   
> 
> Missing comment.
> 
> 159           oldval = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
> 
> Did not acquire the lock.
> 
> 160                                                         id, 0);
> 161           if (oldval != 0 && (oldval & FUTEX_OWNER_DIED) == 0)
> 162             {
> 163               THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> 
> Clearing list_op_pending has no ordering requirement, and so could use a comment?
> 
Added a comment. See new patch.
> 164
> 165               return EBUSY;
> 166             }
> 
> 
>> @@ -173,13 +185,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   	      if (oldval == id)
>>   		lll_unlock (mutex->__data.__lock,
>>   			    PTHREAD_ROBUST_MUTEX_PSHARED (mutex));
>> +	      /* FIXME This violates the mutex destruction requirements.  See
>> +		 __pthread_mutex_unlock_full.  */
> 
> OK. 3/9 comments.
> 
>>   	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   	      return ENOTRECOVERABLE;
>>   	    }
>>   	}
>>         while ((oldval & FUTEX_OWNER_DIED) != 0);
>>   
>> +      /* We must not enqueue the mutex before we have acquired it.
>> +	 Also see comments at ENQUEUE_MUTEX.  */
>> +      __asm ("" ::: "memory");
> 
> OK. 4/10 barriers.
> 
>>         ENQUEUE_MUTEX (mutex);
>> +      /* We need to clear op_pending after we enqueue the mutex.  */
>> +      __asm ("" ::: "memory");
> 
> OK. 5/10 barriers.
> 
>>         THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   
>>         mutex->__data.__owner = id;
>> @@ -211,10 +230,15 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   	}
>>   
>>   	if (robust)
>> -	  /* Note: robust PI futexes are signaled by setting bit 0.  */
>> -	  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>> -			 (void *) (((uintptr_t) &mutex->__data.__list.__next)
>> -				   | 1));
>> +	  {
>> +	    /* Note: robust PI futexes are signaled by setting bit 0.  */
>> +	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>> +			   (void *) (((uintptr_t) &mutex->__data.__list.__next)
>> +				     | 1));
>> +	    /* We need to set op_pending before starting the operation.  Also
>> +	       see comments at ENQUEUE_MUTEX.  */
>> +	    __asm ("" ::: "memory");
> 
> OK. 6/10 barriers.
> 
>> +	  }
>>   
>>   	oldval = mutex->__data.__lock;
>>   
>> @@ -223,12 +247,16 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   	  {
>>   	    if (kind == PTHREAD_MUTEX_ERRORCHECK_NP)
>>   	      {
>> +		/* We do not need to ensure ordering wrt another memory
>> +		   access.  */
> 
> OK. 4/9 comments.
> 
>>   		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   		return EDEADLK;
>>   	      }
>>   
>>   	    if (kind == PTHREAD_MUTEX_RECURSIVE_NP)
>>   	      {
>> +		/* We do not need to ensure ordering wrt another memory
>> +		   access.  */
> 
> OK. 5/9 comments.
> 
>>   		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   
>>   		/* Just bump the counter.  */
> 
> Missing comment?
> 
> 249         if (oldval != 0)
> 
> Failed to get the lock.
> 
> 250           {
> 251             if ((oldval & FUTEX_OWNER_DIED) == 0)
> 252               {
> 253                 THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> 
> Clearing list_op_pending has no ordering requirement, and so could use a comment?
Added a comment. See new patch.
> 
> 254
> 255                 return EBUSY;
> 256               }
> ...
> 270             if (INTERNAL_SYSCALL_ERROR_P (e, __err)
> 271                 && INTERNAL_SYSCALL_ERRNO (e, __err) == EWOULDBLOCK)
> 272               {
> 273                 THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> 
> Clearing list_op_pending has no ordering requirement, and so could use a comment?
Added a comment. See new patch.
> 
> 274
> 275                 return EBUSY;
> 276               }
> 
> 
>> @@ -287,7 +315,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   	    /* But it is inconsistent unless marked otherwise.  */
>>   	    mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>>   
>> +	    /* We must not enqueue the mutex before we have acquired it.
>> +	       Also see comments at ENQUEUE_MUTEX.  */
>> +	    __asm ("" ::: "memory");
> 
> OK. 7/10 barriers.
> 
>>   	    ENQUEUE_MUTEX (mutex);
>> +	    /* We need to clear op_pending after we enqueue the mutex.  */
>> +	    __asm ("" ::: "memory");
> 
> OK. 8/10 barriers.
> 
>>   	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   
>>   	    /* Note that we deliberately exit here.  If we fall
>> @@ -310,13 +343,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   						  PTHREAD_ROBUST_MUTEX_PSHARED (mutex)),
>>   			      0, 0);
>>   
>> +	    /* To the kernel, this will be visible after the kernel has
>> +	       acquired the mutex in the syscall.  */
> 
> OK. 6/9 comments. 3 missing comments noted.
> 
>>   	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   	    return ENOTRECOVERABLE;
>>   	  }
>>   
>>   	if (robust)
>>   	  {
>> +	    /* We must not enqueue the mutex before we have acquired it.
>> +	       Also see comments at ENQUEUE_MUTEX.  */
>> +	    __asm ("" ::: "memory");
> 
> OK. 9/10 barriers.
> 
>>   	    ENQUEUE_MUTEX_PI (mutex);
>> +	    /* We need to clear op_pending after we enqueue the mutex.  */
>> +	    __asm ("" ::: "memory");
> 
> OK. 10/10 barriers.
> 
>>   	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   	  }
>>   
> 
>
commit b4c6ee19e804b0e90c117ec353ce67d321f0319b
Author: Stefan Liebler <stli@linux.ibm.com>
Date:   Wed Feb 6 11:27:03 2019 +0100

    Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock. [BZ #24180]
    
    While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
    Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
    instructions:
    140:   a5 1b 00 01             oill    %r1,1
    144:   e5 48 a0 f0 00 00       mvghi   240(%r10),0   <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
    14a:   e3 10 a0 e0 00 24       stg     %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
    
    vs (with compiler barriers):
    140:   a5 1b 00 01             oill    %r1,1
    144:   e3 10 a0 e0 00 24       stg     %r1,224(%r10)
    14a:   e5 48 a0 f0 00 00       mvghi   240(%r10),0
    
    Please have a look at the discussion:
    "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
    (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
    
    This patch is introducing the same compiler barriers and comments
    for pthread_mutex_trylock as introduced for pthread_mutex_lock and
    pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
    "Add compiler barriers around modifications of the robust mutex list."
    
    ChangeLog:
    
            [BZ #24180]
            * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
            Add compiler barriers and comments.

diff --git a/nptl/pthread_mutex_trylock.c b/nptl/pthread_mutex_trylock.c
index 8fe43b8f0f..bf2869eca2 100644
--- a/nptl/pthread_mutex_trylock.c
+++ b/nptl/pthread_mutex_trylock.c
@@ -94,6 +94,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
     case PTHREAD_MUTEX_ROBUST_ADAPTIVE_NP:
       THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
 		     &mutex->__data.__list.__next);
+      /* We need to set op_pending before starting the operation.  Also
+	 see comments at ENQUEUE_MUTEX.  */
+      __asm ("" ::: "memory");
 
       oldval = mutex->__data.__lock;
       do
@@ -119,7 +122,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	      /* But it is inconsistent unless marked otherwise.  */
 	      mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
 
+	      /* We must not enqueue the mutex before we have acquired it.
+		 Also see comments at ENQUEUE_MUTEX.  */
+	      __asm ("" ::: "memory");
 	      ENQUEUE_MUTEX (mutex);
+	      /* We need to clear op_pending after we enqueue the mutex.  */
+	      __asm ("" ::: "memory");
 	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 	      /* Note that we deliberately exist here.  If we fall
@@ -135,6 +143,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	      int kind = PTHREAD_MUTEX_TYPE (mutex);
 	      if (kind == PTHREAD_MUTEX_ROBUST_ERRORCHECK_NP)
 		{
+		  /* We do not need to ensure ordering wrt another memory
+		     access.  Also see comments at ENQUEUE_MUTEX. */
 		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
 				 NULL);
 		  return EDEADLK;
@@ -142,6 +152,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
 
 	      if (kind == PTHREAD_MUTEX_ROBUST_RECURSIVE_NP)
 		{
+		  /* We do not need to ensure ordering wrt another memory
+		     access.  */
 		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
 				 NULL);
 
@@ -160,6 +172,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
 							id, 0);
 	  if (oldval != 0 && (oldval & FUTEX_OWNER_DIED) == 0)
 	    {
+	      /* We haven't acquired the lock as it is already acquired by
+		 another owner.  We do not need to ensure ordering wrt another
+		 memory access.  */
 	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 	      return EBUSY;
@@ -173,13 +188,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	      if (oldval == id)
 		lll_unlock (mutex->__data.__lock,
 			    PTHREAD_ROBUST_MUTEX_PSHARED (mutex));
+	      /* FIXME This violates the mutex destruction requirements.  See
+		 __pthread_mutex_unlock_full.  */
 	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 	      return ENOTRECOVERABLE;
 	    }
 	}
       while ((oldval & FUTEX_OWNER_DIED) != 0);
 
+      /* We must not enqueue the mutex before we have acquired it.
+	 Also see comments at ENQUEUE_MUTEX.  */
+      __asm ("" ::: "memory");
       ENQUEUE_MUTEX (mutex);
+      /* We need to clear op_pending after we enqueue the mutex.  */
+      __asm ("" ::: "memory");
       THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
       mutex->__data.__owner = id;
@@ -211,10 +233,15 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	}
 
 	if (robust)
-	  /* Note: robust PI futexes are signaled by setting bit 0.  */
-	  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
-			 (void *) (((uintptr_t) &mutex->__data.__list.__next)
-				   | 1));
+	  {
+	    /* Note: robust PI futexes are signaled by setting bit 0.  */
+	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
+			   (void *) (((uintptr_t) &mutex->__data.__list.__next)
+				     | 1));
+	    /* We need to set op_pending before starting the operation.  Also
+	       see comments at ENQUEUE_MUTEX.  */
+	    __asm ("" ::: "memory");
+	  }
 
 	oldval = mutex->__data.__lock;
 
@@ -223,12 +250,16 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	  {
 	    if (kind == PTHREAD_MUTEX_ERRORCHECK_NP)
 	      {
+		/* We do not need to ensure ordering wrt another memory
+		   access.  */
 		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 		return EDEADLK;
 	      }
 
 	    if (kind == PTHREAD_MUTEX_RECURSIVE_NP)
 	      {
+		/* We do not need to ensure ordering wrt another memory
+		   access.  */
 		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 		/* Just bump the counter.  */
@@ -250,6 +281,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	  {
 	    if ((oldval & FUTEX_OWNER_DIED) == 0)
 	      {
+		/* We haven't acquired the lock as it is already acquired by
+		   another owner.  We do not need to ensure ordering wrt another
+		   memory access.  */
 		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 		return EBUSY;
@@ -270,6 +304,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	    if (INTERNAL_SYSCALL_ERROR_P (e, __err)
 		&& INTERNAL_SYSCALL_ERRNO (e, __err) == EWOULDBLOCK)
 	      {
+		/* The kernel has not yet finished the mutex owner death.
+		   We do not need to ensure ordering wrt another memory
+		   access.  */
 		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 		return EBUSY;
@@ -287,7 +324,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	    /* But it is inconsistent unless marked otherwise.  */
 	    mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
 
+	    /* We must not enqueue the mutex before we have acquired it.
+	       Also see comments at ENQUEUE_MUTEX.  */
+	    __asm ("" ::: "memory");
 	    ENQUEUE_MUTEX (mutex);
+	    /* We need to clear op_pending after we enqueue the mutex.  */
+	    __asm ("" ::: "memory");
 	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 	    /* Note that we deliberately exit here.  If we fall
@@ -310,13 +352,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
 						  PTHREAD_ROBUST_MUTEX_PSHARED (mutex)),
 			      0, 0);
 
+	    /* To the kernel, this will be visible after the kernel has
+	       acquired the mutex in the syscall.  */
 	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 	    return ENOTRECOVERABLE;
 	  }
 
 	if (robust)
 	  {
+	    /* We must not enqueue the mutex before we have acquired it.
+	       Also see comments at ENQUEUE_MUTEX.  */
+	    __asm ("" ::: "memory");
 	    ENQUEUE_MUTEX_PI (mutex);
+	    /* We need to clear op_pending after we enqueue the mutex.  */
+	    __asm ("" ::: "memory");
 	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 	  }
Carlos O'Donell Feb. 6, 2019, 3:59 p.m. UTC | #5
On 2/6/19 6:25 AM, Stefan Liebler wrote:
> Hi Carlos,
> I've updated the patch with three additional comments and I've mentioned the filed bug.
> Please review it once again before I commit it to master and cherry pick it to the release branches.

Thank you! Reviewed.

>> Yes, I did that backport to RHEL 7.6. These fixes are just "further"
>> fixes right? I'll work on getting this fixed in RHEL 7.7, and RHEL 8
>> for all arches.
> Sounds great.
> That's the same fix for pthread_mutex_trylock as previously done for pthread_mutex_lock and pthread_mutex_timedlock.

I've filed these:
https://bugzilla.redhat.com/show_bug.cgi?id=1672771
https://bugzilla.redhat.com/show_bug.cgi?id=1672773

Feel free to comment or verify if they are going to be needed in RHEL7.7
or RHEL8. I haven't done the analysis of the disassembly yet. If you
could have a look that would help.

> 20190206_pthread_mutex_trylock_barriers.patch
> 
> commit b4c6ee19e804b0e90c117ec353ce67d321f0319b
> Author: Stefan Liebler <stli@linux.ibm.com>
> Date:   Wed Feb 6 11:27:03 2019 +0100
> 
>     Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock. [BZ #24180]
>     
>     While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
>     Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
>     instructions:
>     140:   a5 1b 00 01             oill    %r1,1
>     144:   e5 48 a0 f0 00 00       mvghi   240(%r10),0   <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>     14a:   e3 10 a0 e0 00 24       stg     %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
>     
>     vs (with compiler barriers):
>     140:   a5 1b 00 01             oill    %r1,1
>     144:   e3 10 a0 e0 00 24       stg     %r1,224(%r10)
>     14a:   e5 48 a0 f0 00 00       mvghi   240(%r10),0
>     
>     Please have a look at the discussion:
>     "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
>     (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
>     
>     This patch is introducing the same compiler barriers and comments
>     for pthread_mutex_trylock as introduced for pthread_mutex_lock and
>     pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
>     "Add compiler barriers around modifications of the robust mutex list."
>     
>     ChangeLog:
>     
>             [BZ #24180]
>             * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
>             Add compiler barriers and comments.

OK for master.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

> diff --git a/nptl/pthread_mutex_trylock.c b/nptl/pthread_mutex_trylock.c
> index 8fe43b8f0f..bf2869eca2 100644
> --- a/nptl/pthread_mutex_trylock.c
> +++ b/nptl/pthread_mutex_trylock.c
> @@ -94,6 +94,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>      case PTHREAD_MUTEX_ROBUST_ADAPTIVE_NP:
>        THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>  		     &mutex->__data.__list.__next);
> +      /* We need to set op_pending before starting the operation.  Also
> +	 see comments at ENQUEUE_MUTEX.  */
> +      __asm ("" ::: "memory");

OK. 1/10 barriers.

>  
>        oldval = mutex->__data.__lock;
>        do
> @@ -119,7 +122,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	      /* But it is inconsistent unless marked otherwise.  */
>  	      mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>  
> +	      /* We must not enqueue the mutex before we have acquired it.
> +		 Also see comments at ENQUEUE_MUTEX.  */
> +	      __asm ("" ::: "memory");

OK. 2/10 barriers.

>  	      ENQUEUE_MUTEX (mutex);
> +	      /* We need to clear op_pending after we enqueue the mutex.  */
> +	      __asm ("" ::: "memory");

OK. 3/10 barriers.

>  	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  	      /* Note that we deliberately exist here.  If we fall
> @@ -135,6 +143,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	      int kind = PTHREAD_MUTEX_TYPE (mutex);
>  	      if (kind == PTHREAD_MUTEX_ROBUST_ERRORCHECK_NP)
>  		{
> +		  /* We do not need to ensure ordering wrt another memory
> +		     access.  Also see comments at ENQUEUE_MUTEX. */

OK. 1/9 comments.

>  		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>  				 NULL);
>  		  return EDEADLK;
> @@ -142,6 +152,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  
>  	      if (kind == PTHREAD_MUTEX_ROBUST_RECURSIVE_NP)
>  		{
> +		  /* We do not need to ensure ordering wrt another memory
> +		     access.  */

OK. 2/9 comments.

>  		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>  				 NULL);
>  
> @@ -160,6 +172,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  							id, 0);
>  	  if (oldval != 0 && (oldval & FUTEX_OWNER_DIED) == 0)
>  	    {
> +	      /* We haven't acquired the lock as it is already acquired by
> +		 another owner.  We do not need to ensure ordering wrt another
> +		 memory access.  */

OK. 3/9 comments.

>  	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  	      return EBUSY;
> @@ -173,13 +188,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	      if (oldval == id)
>  		lll_unlock (mutex->__data.__lock,
>  			    PTHREAD_ROBUST_MUTEX_PSHARED (mutex));
> +	      /* FIXME This violates the mutex destruction requirements.  See
> +		 __pthread_mutex_unlock_full.  */

OK. 4/9 comments.

>  	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  	      return ENOTRECOVERABLE;
>  	    }
>  	}
>        while ((oldval & FUTEX_OWNER_DIED) != 0);
>  
> +      /* We must not enqueue the mutex before we have acquired it.
> +	 Also see comments at ENQUEUE_MUTEX.  */
> +      __asm ("" ::: "memory");

OK. 4/10 barriers.

>        ENQUEUE_MUTEX (mutex);
> +      /* We need to clear op_pending after we enqueue the mutex.  */
> +      __asm ("" ::: "memory");

OK. 5/10 barriers.

>        THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>        mutex->__data.__owner = id;
> @@ -211,10 +233,15 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	}
>  
>  	if (robust)
> -	  /* Note: robust PI futexes are signaled by setting bit 0.  */
> -	  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> -			 (void *) (((uintptr_t) &mutex->__data.__list.__next)
> -				   | 1));
> +	  {
> +	    /* Note: robust PI futexes are signaled by setting bit 0.  */
> +	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> +			   (void *) (((uintptr_t) &mutex->__data.__list.__next)
> +				     | 1));
> +	    /* We need to set op_pending before starting the operation.  Also
> +	       see comments at ENQUEUE_MUTEX.  */
> +	    __asm ("" ::: "memory");

OK. 6/10 barriers.

> +	  }
>  
>  	oldval = mutex->__data.__lock;
>  
> @@ -223,12 +250,16 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	  {
>  	    if (kind == PTHREAD_MUTEX_ERRORCHECK_NP)
>  	      {
> +		/* We do not need to ensure ordering wrt another memory
> +		   access.  */

OK. 5/9 comments.

>  		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  		return EDEADLK;
>  	      }
>  
>  	    if (kind == PTHREAD_MUTEX_RECURSIVE_NP)
>  	      {
> +		/* We do not need to ensure ordering wrt another memory
> +		   access.  */

OK. 6/9 comments.

>  		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  		/* Just bump the counter.  */
> @@ -250,6 +281,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	  {
>  	    if ((oldval & FUTEX_OWNER_DIED) == 0)
>  	      {
> +		/* We haven't acquired the lock as it is already acquired by
> +		   another owner.  We do not need to ensure ordering wrt another
> +		   memory access.  */

OK. 7/9 comments.

>  		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  		return EBUSY;
> @@ -270,6 +304,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	    if (INTERNAL_SYSCALL_ERROR_P (e, __err)
>  		&& INTERNAL_SYSCALL_ERRNO (e, __err) == EWOULDBLOCK)
>  	      {
> +		/* The kernel has not yet finished the mutex owner death.
> +		   We do not need to ensure ordering wrt another memory
> +		   access.  */

OK. 8/9 comments.

>  		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  		return EBUSY;
> @@ -287,7 +324,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	    /* But it is inconsistent unless marked otherwise.  */
>  	    mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>  
> +	    /* We must not enqueue the mutex before we have acquired it.
> +	       Also see comments at ENQUEUE_MUTEX.  */
> +	    __asm ("" ::: "memory");

OK. 7/10 barriers.

>  	    ENQUEUE_MUTEX (mutex);
> +	    /* We need to clear op_pending after we enqueue the mutex.  */
> +	    __asm ("" ::: "memory");

OK. 8/10 barriers.

>  	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  	    /* Note that we deliberately exit here.  If we fall
> @@ -310,13 +352,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  						  PTHREAD_ROBUST_MUTEX_PSHARED (mutex)),
>  			      0, 0);
>  
> +	    /* To the kernel, this will be visible after the kernel has
> +	       acquired the mutex in the syscall.  */

OK. 9/9 comments.

>  	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  	    return ENOTRECOVERABLE;
>  	  }
>  
>  	if (robust)
>  	  {
> +	    /* We must not enqueue the mutex before we have acquired it.
> +	       Also see comments at ENQUEUE_MUTEX.  */
> +	    __asm ("" ::: "memory");

OK. 9/10 barriers.

>  	    ENQUEUE_MUTEX_PI (mutex);
> +	    /* We need to clear op_pending after we enqueue the mutex.  */
> +	    __asm ("" ::: "memory");


OK. 10/10 barriers.

>  	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  	  }
>
Stefan Liebler Feb. 7, 2019, 3:05 p.m. UTC | #6
On 02/06/2019 04:59 PM, Carlos O'Donell wrote:
> On 2/6/19 6:25 AM, Stefan Liebler wrote:
>> Hi Carlos,
>> I've updated the patch with three additional comments and I've mentioned the filed bug.
>> Please review it once again before I commit it to master and cherry pick it to the release branches.
> 
> Thank you! Reviewed.
Committed to master (2.29.9000):
"Add compiler barriers around modifications of the robust mutex list for 
pthread_mutex_trylock. [BZ #24180]"
(https://sourceware.org/git/?p=glibc.git;a=commit;h=823624bdc47f1f80109c9c52dee7939b9386d708)

and backported it to glibc 2.25 ... 2.29 release branches.

Thanks.
Stefan

> 
>>> Yes, I did that backport to RHEL 7.6. These fixes are just "further"
>>> fixes right? I'll work on getting this fixed in RHEL 7.7, and RHEL 8
>>> for all arches.
>> Sounds great.
>> That's the same fix for pthread_mutex_trylock as previously done for pthread_mutex_lock and pthread_mutex_timedlock.
> 
> I've filed these:
> https://bugzilla.redhat.com/show_bug.cgi?id=1672771
> https://bugzilla.redhat.com/show_bug.cgi?id=1672773
> 
> Feel free to comment or verify if they are going to be needed in RHEL7.7
> or RHEL8. I haven't done the analysis of the disassembly yet. If you
> could have a look that would help.
>
Carlos O'Donell Feb. 7, 2019, 8:32 p.m. UTC | #7
On 2/7/19 10:05 AM, Stefan Liebler wrote:
> On 02/06/2019 04:59 PM, Carlos O'Donell wrote:
>> On 2/6/19 6:25 AM, Stefan Liebler wrote:
>>> Hi Carlos,
>>> I've updated the patch with three additional comments and I've mentioned the filed bug.
>>> Please review it once again before I commit it to master and cherry pick it to the release branches.
>>
>> Thank you! Reviewed.
> Committed to master (2.29.9000):
> "Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock. [BZ #24180]"
> (https://sourceware.org/git/?p=glibc.git;a=commit;h=823624bdc47f1f80109c9c52dee7939b9386d708)
> 
> and backported it to glibc 2.25 ... 2.29 release branches.

Thank you! Now if only we could fix the final robust mutex
design flaw[1] ;-)

Patch

commit 9efa39ef04961397e39e7a9d3c11a33937755aec
Author: Stefan Liebler <stli@linux.ibm.com>
Date:   Tue Feb 5 12:37:42 2019 +0100

    Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock.
    
    While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
    Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
    instructions:
    140:   a5 1b 00 01             oill    %r1,1
    144:   e5 48 a0 f0 00 00       mvghi   240(%r10),0   <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
    14a:   e3 10 a0 e0 00 24       stg     %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
    
    vs (with compiler barriers):
    140:   a5 1b 00 01             oill    %r1,1
    144:   e3 10 a0 e0 00 24       stg     %r1,224(%r10)
    14a:   e5 48 a0 f0 00 00       mvghi   240(%r10),0
    
    Please have a look at the discussion:
    "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
    (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
    
    This patch is introducing the same compiler barriers and comments
    for pthread_mutex_trylock as introduced for pthread_mutex_lock and
    pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
    "Add compiler barriers around modifications of the robust mutex list."
    
    ChangeLog:
    
            * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
            Add compiler barriers and comments.

diff --git a/nptl/pthread_mutex_trylock.c b/nptl/pthread_mutex_trylock.c
index 8fe43b8f0f..ff1d7282ab 100644
--- a/nptl/pthread_mutex_trylock.c
+++ b/nptl/pthread_mutex_trylock.c
@@ -94,6 +94,9 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
     case PTHREAD_MUTEX_ROBUST_ADAPTIVE_NP:
       THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
 		     &mutex->__data.__list.__next);
+      /* We need to set op_pending before starting the operation.  Also
+	 see comments at ENQUEUE_MUTEX.  */
+      __asm ("" ::: "memory");
 
       oldval = mutex->__data.__lock;
       do
@@ -119,7 +122,12 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	      /* But it is inconsistent unless marked otherwise.  */
 	      mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
 
+	      /* We must not enqueue the mutex before we have acquired it.
+		 Also see comments at ENQUEUE_MUTEX.  */
+	      __asm ("" ::: "memory");
 	      ENQUEUE_MUTEX (mutex);
+	      /* We need to clear op_pending after we enqueue the mutex.  */
+	      __asm ("" ::: "memory");
 	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 	      /* Note that we deliberately exist here.  If we fall
@@ -135,6 +143,8 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	      int kind = PTHREAD_MUTEX_TYPE (mutex);
 	      if (kind == PTHREAD_MUTEX_ROBUST_ERRORCHECK_NP)
 		{
+		  /* We do not need to ensure ordering wrt another memory
+		     access.  Also see comments at ENQUEUE_MUTEX. */
 		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
 				 NULL);
 		  return EDEADLK;
@@ -142,6 +152,8 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 
 	      if (kind == PTHREAD_MUTEX_ROBUST_RECURSIVE_NP)
 		{
+		  /* We do not need to ensure ordering wrt another memory
+		     access.  */
 		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
 				 NULL);
 
@@ -173,13 +185,20 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	      if (oldval == id)
 		lll_unlock (mutex->__data.__lock,
 			    PTHREAD_ROBUST_MUTEX_PSHARED (mutex));
+	      /* FIXME This violates the mutex destruction requirements.  See
+		 __pthread_mutex_unlock_full.  */
 	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 	      return ENOTRECOVERABLE;
 	    }
 	}
       while ((oldval & FUTEX_OWNER_DIED) != 0);
 
+      /* We must not enqueue the mutex before we have acquired it.
+	 Also see comments at ENQUEUE_MUTEX.  */
+      __asm ("" ::: "memory");
       ENQUEUE_MUTEX (mutex);
+      /* We need to clear op_pending after we enqueue the mutex.  */
+      __asm ("" ::: "memory");
       THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
       mutex->__data.__owner = id;
@@ -211,10 +230,15 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	}
 
 	if (robust)
-	  /* Note: robust PI futexes are signaled by setting bit 0.  */
-	  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
-			 (void *) (((uintptr_t) &mutex->__data.__list.__next)
-				   | 1));
+	  {
+	    /* Note: robust PI futexes are signaled by setting bit 0.  */
+	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
+			   (void *) (((uintptr_t) &mutex->__data.__list.__next)
+				     | 1));
+	    /* We need to set op_pending before starting the operation.  Also
+	       see comments at ENQUEUE_MUTEX.  */
+	    __asm ("" ::: "memory");
+	  }
 
 	oldval = mutex->__data.__lock;
 
@@ -223,12 +247,16 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	  {
 	    if (kind == PTHREAD_MUTEX_ERRORCHECK_NP)
 	      {
+		/* We do not need to ensure ordering wrt another memory
+		   access.  */
 		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 		return EDEADLK;
 	      }
 
 	    if (kind == PTHREAD_MUTEX_RECURSIVE_NP)
 	      {
+		/* We do not need to ensure ordering wrt another memory
+		   access.  */
 		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 		/* Just bump the counter.  */
@@ -287,7 +315,12 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	    /* But it is inconsistent unless marked otherwise.  */
 	    mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
 
+	    /* We must not enqueue the mutex before we have acquired it.
+	       Also see comments at ENQUEUE_MUTEX.  */
+	    __asm ("" ::: "memory");
 	    ENQUEUE_MUTEX (mutex);
+	    /* We need to clear op_pending after we enqueue the mutex.  */
+	    __asm ("" ::: "memory");
 	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 	    /* Note that we deliberately exit here.  If we fall
@@ -310,13 +343,20 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 						  PTHREAD_ROBUST_MUTEX_PSHARED (mutex)),
 			      0, 0);
 
+	    /* To the kernel, this will be visible after the kernel has
+	       acquired the mutex in the syscall.  */
 	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 	    return ENOTRECOVERABLE;
 	  }
 
 	if (robust)
 	  {
+	    /* We must not enqueue the mutex before we have acquired it.
+	       Also see comments at ENQUEUE_MUTEX.  */
+	    __asm ("" ::: "memory");
 	    ENQUEUE_MUTEX_PI (mutex);
+	    /* We need to clear op_pending after we enqueue the mutex.  */
+	    __asm ("" ::: "memory");
 	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 	  }