diff mbox

[2/2] S390: Use generic spinlock code.

Message ID 6910c6f2-b763-e311-88c9-91d774f796e0@linux.vnet.ibm.com
State New
Headers show

Commit Message

Stefan Liebler March 14, 2017, 3:55 p.m. UTC
On 02/18/2017 06:05 PM, Torvald Riegel wrote:
> On Wed, 2017-02-15 at 17:26 +0100, Stefan Liebler wrote:
>> On 02/13/2017 09:39 PM, Torvald Riegel wrote:
>>> On Wed, 2017-02-08 at 15:49 +0100, Stefan Liebler wrote:
>>>> This is an updated version of the patch, which adjusts the s390 specific
>>>> atomic-macros in the same way as in include/atomic.h.
>>>> Thus passing a volatile int pointer is fine, too.
>>>
>>> The general direction of this is okay.
>>> Some of my comments for patch 1/2 apply here as well (eg, volatile vs.
>>> atomics).
>>>
>> See answer in patch 1/2.
>>
>>> What I don't like is the choice of 1000 for
>>> SPIN_LOCK_READS_BETWEEN_CMPXCHG.  Have you run benchmarks to come up
>>> with this value, or is it a guess?  Why isn't it documented how you end
>>> up with this number?
>>> We can keep these with a choice such as this, but then we need to have a
>>> FIXME comment in the code, explaining that this is just an arbitrary
>>> choice.
>>>
>>> I would guess that just spinning forever is sufficient, and that we
>>> don't need to throw in a CAS every now and then; using randomized
>>> exponential back-off might be more important.  This is something that we
>>> would be in a better position to answer if you'd provide a
>>> microbenchmark for this choice too.
>>> At the end of 2016, I've posted a draft of a microbenchmark for rwlocks.
>>> Maybe you can use this as a start and run a few experiments?
>>  >
>> I've run own benchmarks in the same manner as your mentioned
>> microbenchmark for rwlocks below.
>> You are right, I can't recognize a real difference between
>> #define SPIN_LOCK_READS_BETWEEN_CMPXCHG 1000
>> and
>> #define SPIN_LOCK_READS_BETWEEN_CMPXCHG -1
>>
>> As it does not hurt, I prefer to use a CAS every 1000 plain reads.
>> A CAS is not necessary on current CPUs but from architecture
>> perspective, it is more correct if there is such a serialization
>> instruction.
>
> What do you mean by "more correct"?  I'm not aware of an architecture
> that would not ensure that stores on one CPU will eventually become
> visible to other CPUs.
>
> Thus, as I wrote in my review of patch 1/2, I think we should just
> remove the occassional CAS (ie, just do the equivalent of the -1
> setting, always).
>
>> There is a difference between
>> #define SPIN_LOCK_READS_BETWEEN_CMPXCHG 0
>> and one of the others.
>
> Right.  We do want to spin if the lock is acquired by another thread.
> What we should do in a future patch is to experiment with the back-off
> between loads.  tile already has some code for this, which we at least
> need to integrate at some point.
>
>> The same applies to
>> #define SPIN_TRYLOCK_LOAD_AND_TEST_BEFORE_XCHG 1
>> It does not hurt if the lock is free, but there is a difference if the
>> lock is already acquired and trylock is called often.
>
> Yes.  I've replied to this point in the 1/2 patch thread.
>
>> I've saw your microbenchmark-post and added some notes.
>
> Thanks.  I'll get to them later.
>
>> I added a FIXME comment to re-evaluate the choice once we have the
>> appropriate microbenchmarks.
>>
>>> Also, I'm not quite sure whether this number is really
>>> spinlock-specific, and I would like to find a better place for these.
>>> IMO, they should be in some header that contains default tuning
>>> parameters for synchronization code, which is provided by each
>>> architecture that uses the generic spinlock; we'd have no #ifdef for the
>>> tuning parameters, so we'd catch typos in those headers.
>>>
>> See pthread_spin_parameters.h in updated patch 1/2.
>
> I suggest that we'll work towards consensus on patch 1/2 first.  I
> believe once that is done, patch 2/2 would likely just remove s390 code.
>
I've attached an updated patch which just removes the s390 code.


> Thanks for continuing the work on this.  I know we have some back and
> forth here in terms of overall direction, but I also think we're making
> progress and are continually improving the changes.
>

Comments

Stefan Liebler March 21, 2017, 3:43 p.m. UTC | #1
On 03/14/2017 04:55 PM, Stefan Liebler wrote:
> On 02/18/2017 06:05 PM, Torvald Riegel wrote:
>> On Wed, 2017-02-15 at 17:26 +0100, Stefan Liebler wrote:
>>> On 02/13/2017 09:39 PM, Torvald Riegel wrote:
>>>> On Wed, 2017-02-08 at 15:49 +0100, Stefan Liebler wrote:
>>>>> This is an updated version of the patch, which adjusts the s390
>>>>> specific
>>>>> atomic-macros in the same way as in include/atomic.h.
>>>>> Thus passing a volatile int pointer is fine, too.
>>>>
>>>> The general direction of this is okay.
>>>> Some of my comments for patch 1/2 apply here as well (eg, volatile vs.
>>>> atomics).
>>>>
>>> See answer in patch 1/2.
>>>
>>>> What I don't like is the choice of 1000 for
>>>> SPIN_LOCK_READS_BETWEEN_CMPXCHG.  Have you run benchmarks to come up
>>>> with this value, or is it a guess?  Why isn't it documented how you end
>>>> up with this number?
>>>> We can keep these with a choice such as this, but then we need to
>>>> have a
>>>> FIXME comment in the code, explaining that this is just an arbitrary
>>>> choice.
>>>>
>>>> I would guess that just spinning forever is sufficient, and that we
>>>> don't need to throw in a CAS every now and then; using randomized
>>>> exponential back-off might be more important.  This is something
>>>> that we
>>>> would be in a better position to answer if you'd provide a
>>>> microbenchmark for this choice too.
>>>> At the end of 2016, I've posted a draft of a microbenchmark for
>>>> rwlocks.
>>>> Maybe you can use this as a start and run a few experiments?
>>>  >
>>> I've run own benchmarks in the same manner as your mentioned
>>> microbenchmark for rwlocks below.
>>> You are right, I can't recognize a real difference between
>>> #define SPIN_LOCK_READS_BETWEEN_CMPXCHG 1000
>>> and
>>> #define SPIN_LOCK_READS_BETWEEN_CMPXCHG -1
>>>
>>> As it does not hurt, I prefer to use a CAS every 1000 plain reads.
>>> A CAS is not necessary on current CPUs but from architecture
>>> perspective, it is more correct if there is such a serialization
>>> instruction.
>>
>> What do you mean by "more correct"?  I'm not aware of an architecture
>> that would not ensure that stores on one CPU will eventually become
>> visible to other CPUs.
>>
>> Thus, as I wrote in my review of patch 1/2, I think we should just
>> remove the occassional CAS (ie, just do the equivalent of the -1
>> setting, always).
>>
>>> There is a difference between
>>> #define SPIN_LOCK_READS_BETWEEN_CMPXCHG 0
>>> and one of the others.
>>
>> Right.  We do want to spin if the lock is acquired by another thread.
>> What we should do in a future patch is to experiment with the back-off
>> between loads.  tile already has some code for this, which we at least
>> need to integrate at some point.
>>
>>> The same applies to
>>> #define SPIN_TRYLOCK_LOAD_AND_TEST_BEFORE_XCHG 1
>>> It does not hurt if the lock is free, but there is a difference if the
>>> lock is already acquired and trylock is called often.
>>
>> Yes.  I've replied to this point in the 1/2 patch thread.
>>
>>> I've saw your microbenchmark-post and added some notes.
>>
>> Thanks.  I'll get to them later.
>>
>>> I added a FIXME comment to re-evaluate the choice once we have the
>>> appropriate microbenchmarks.
>>>
>>>> Also, I'm not quite sure whether this number is really
>>>> spinlock-specific, and I would like to find a better place for these.
>>>> IMO, they should be in some header that contains default tuning
>>>> parameters for synchronization code, which is provided by each
>>>> architecture that uses the generic spinlock; we'd have no #ifdef for
>>>> the
>>>> tuning parameters, so we'd catch typos in those headers.
>>>>
>>> See pthread_spin_parameters.h in updated patch 1/2.
>>
>> I suggest that we'll work towards consensus on patch 1/2 first.  I
>> believe once that is done, patch 2/2 would likely just remove s390 code.
>>
> I've attached an updated patch which just removes the s390 code.
>
>

PING

>> Thanks for continuing the work on this.  I know we have some back and
>> forth here in terms of overall direction, but I also think we're making
>> progress and are continually improving the changes.
>>
Torvald Riegel April 6, 2017, 12:27 p.m. UTC | #2
On Tue, 2017-03-14 at 16:55 +0100, Stefan Liebler wrote:
> On 02/18/2017 06:05 PM, Torvald Riegel wrote:
> > On Wed, 2017-02-15 at 17:26 +0100, Stefan Liebler wrote:
> >> On 02/13/2017 09:39 PM, Torvald Riegel wrote:
> >>> On Wed, 2017-02-08 at 15:49 +0100, Stefan Liebler wrote:
> >>> Also, I'm not quite sure whether this number is really
> >>> spinlock-specific, and I would like to find a better place for these.
> >>> IMO, they should be in some header that contains default tuning
> >>> parameters for synchronization code, which is provided by each
> >>> architecture that uses the generic spinlock; we'd have no #ifdef for the
> >>> tuning parameters, so we'd catch typos in those headers.
> >>>
> >> See pthread_spin_parameters.h in updated patch 1/2.
> >
> > I suggest that we'll work towards consensus on patch 1/2 first.  I
> > believe once that is done, patch 2/2 would likely just remove s390 code.
> >
> I've attached an updated patch which just removes the s390 code.

Okay.
diff mbox

Patch

From 78888be8fab0f3e988360c77240d8aa08fcc980c Mon Sep 17 00:00:00 2001
From: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Tue, 14 Mar 2017 14:10:05 +0100
Subject: [PATCH 2/2] S390: Use generic spinlock code.

This patch removes the s390 specific implementation of spinlock code
and is now using the generic one.

ChangeLog:

	* sysdeps/s390/nptl/pthread_spin_init.c: Delete File.
	* sysdeps/s390/nptl/pthread_spin_lock.c: Likewise.
	* sysdeps/s390/nptl/pthread_spin_trylock.c: Likewise.
	* sysdeps/s390/nptl/pthread_spin_unlock.c: Likewise.
---
 sysdeps/s390/nptl/pthread_spin_init.c    | 19 -------------------
 sysdeps/s390/nptl/pthread_spin_lock.c    | 32 --------------------------------
 sysdeps/s390/nptl/pthread_spin_trylock.c | 32 --------------------------------
 sysdeps/s390/nptl/pthread_spin_unlock.c  | 32 --------------------------------
 4 files changed, 115 deletions(-)
 delete mode 100644 sysdeps/s390/nptl/pthread_spin_init.c
 delete mode 100644 sysdeps/s390/nptl/pthread_spin_lock.c
 delete mode 100644 sysdeps/s390/nptl/pthread_spin_trylock.c
 delete mode 100644 sysdeps/s390/nptl/pthread_spin_unlock.c

diff --git a/sysdeps/s390/nptl/pthread_spin_init.c b/sysdeps/s390/nptl/pthread_spin_init.c
deleted file mode 100644
index d826871..0000000
--- a/sysdeps/s390/nptl/pthread_spin_init.c
+++ /dev/null
@@ -1,19 +0,0 @@ 
-/* Copyright (C) 2003-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Martin Schwidefsky <schwidefsky@de.ibm.com>, 2003.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-/* Not needed.  pthread_spin_init is an alias for pthread_spin_unlock.  */
diff --git a/sysdeps/s390/nptl/pthread_spin_lock.c b/sysdeps/s390/nptl/pthread_spin_lock.c
deleted file mode 100644
index 7349940..0000000
--- a/sysdeps/s390/nptl/pthread_spin_lock.c
+++ /dev/null
@@ -1,32 +0,0 @@ 
-/* Copyright (C) 2003-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Martin Schwidefsky <schwidefsky@de.ibm.com>, 2003.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include "pthreadP.h"
-
-int
-pthread_spin_lock (pthread_spinlock_t *lock)
-{
-  int oldval;
-
-  __asm__ __volatile__ ("0: lhi %0,0\n"
-			"   cs  %0,%2,%1\n"
-			"   jl  0b"
-			: "=&d" (oldval), "=Q" (*lock)
-			: "d" (1), "m" (*lock) : "cc" );
-  return 0;
-}
diff --git a/sysdeps/s390/nptl/pthread_spin_trylock.c b/sysdeps/s390/nptl/pthread_spin_trylock.c
deleted file mode 100644
index 0e848da..0000000
--- a/sysdeps/s390/nptl/pthread_spin_trylock.c
+++ /dev/null
@@ -1,32 +0,0 @@ 
-/* Copyright (C) 2003-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Martin Schwidefsky <schwidefsky@de.ibm.com>, 2003.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <errno.h>
-#include "pthreadP.h"
-
-int
-pthread_spin_trylock (pthread_spinlock_t *lock)
-{
-  int old;
-
-  __asm__ __volatile__ ("cs %0,%3,%1"
-			: "=d" (old), "=Q" (*lock)
-			: "0" (0), "d" (1), "m" (*lock) : "cc" );
-
-  return old != 0 ? EBUSY : 0;
-}
diff --git a/sysdeps/s390/nptl/pthread_spin_unlock.c b/sysdeps/s390/nptl/pthread_spin_unlock.c
deleted file mode 100644
index 54e7378..0000000
--- a/sysdeps/s390/nptl/pthread_spin_unlock.c
+++ /dev/null
@@ -1,32 +0,0 @@ 
-/* Copyright (C) 2003-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Martin Schwidefsky <schwidefsky@de.ibm.com>, 2003.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-/* Ugly hack to avoid the declaration of pthread_spin_init.  */
-#define pthread_spin_init pthread_spin_init_XXX
-#include "pthreadP.h"
-#undef pthread_spin_init
-
-int
-pthread_spin_unlock (pthread_spinlock_t *lock)
-{
-  __asm__ __volatile__ ("   xc  %O0(4,%R0),%0\n"
-			"   bcr 15,0"
-			: "=Q" (*lock) : "m" (*lock) : "cc" );
-  return 0;
-}
-strong_alias (pthread_spin_unlock, pthread_spin_init)
-- 
2.7.4