From patchwork Mon Oct 26 09:50:57 2015
X-Patchwork-Submitter: Boqun Feng
X-Patchwork-Id: 535805
From: Boqun Feng
To: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Cc: Waiman Long, Davidlohr Bueso, Peter Zijlstra, Boqun Feng, Will Deacon,
 Paul Mackerras, Thomas Gleixner, "Paul E. McKenney", Ingo Molnar
Subject: [PATCH tip/locking/core v5 6/6] powerpc: atomic: Implement
 cmpxchg{,64}_* and atomic{,64}_cmpxchg_* variants
Date: Mon, 26 Oct 2015 17:50:57 +0800
Message-Id: <1445853057-20735-7-git-send-email-boqun.feng@gmail.com>
In-Reply-To: <1445853057-20735-1-git-send-email-boqun.feng@gmail.com>
References: <1445853057-20735-1-git-send-email-boqun.feng@gmail.com>
X-Mailer: git-send-email 2.6.2
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

Implement cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed, based on
which _release variants can be built.

To avoid superfluous barriers in _acquire variants, we implement these
operations with assembly code rather than using __atomic_op_acquire() to
build them automatically.

For the same reason, we keep the assembly implementation of fully
ordered cmpxchg operations.

However, we don't do the same for _release, because that would require
putting barriers in the middle of ll/sc loops, which is probably a bad
idea.

Note that cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed are not
compiler barriers.

Signed-off-by: Boqun Feng
---
 arch/powerpc/include/asm/atomic.h  |  10 +++
 arch/powerpc/include/asm/cmpxchg.h | 149 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 158 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 2c3d4f0..195dc85 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -176,6 +176,11 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v)
 #define atomic_dec_return_relaxed atomic_dec_return_relaxed
 
 #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
+#define atomic_cmpxchg_relaxed(v, o, n) \
+	cmpxchg_relaxed(&((v)->counter), (o), (n))
+#define atomic_cmpxchg_acquire(v, o, n) \
+	cmpxchg_acquire(&((v)->counter), (o), (n))
+
 #define atomic_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
 
@@ -444,6 +449,11 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v)
 }
 
 #define atomic64_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
+#define atomic64_cmpxchg_relaxed(v, o, n) \
+	cmpxchg_relaxed(&((v)->counter), (o), (n))
+#define atomic64_cmpxchg_acquire(v, o, n) \
+	cmpxchg_acquire(&((v)->counter), (o), (n))
+
 #define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
 #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new))
 
diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h
index 17c7e14..cae4fa8 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -181,6 +181,56 @@ __cmpxchg_u32_local(volatile unsigned int *p, unsigned long old,
 	return prev;
 }
 
+static __always_inline unsigned long
+__cmpxchg_u32_relaxed(u32 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+"1:	lwarx	%0,0,%2		# __cmpxchg_u32_relaxed\n"
+"	cmpw	0,%0,%3\n"
+"	bne-	2f\n"
+	PPC405_ERR77(0, %2)
+"	stwcx.	%4,0,%2\n"
+"	bne-	1b\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc");
+
+	return prev;
+}
+
+/*
+ * cmpxchg family don't have order guarantee if cmp part fails, therefore we
+ * can avoid superfluous barriers if we use assembly code to implement
+ * cmpxchg() and cmpxchg_acquire(), however we don't do the similar for
+ * cmpxchg_release() because that will result in putting a barrier in the
+ * middle of a ll/sc loop, which is probably a bad idea. For example, this
+ * might cause the conditional store more likely to fail.
+ */
+static __always_inline unsigned long
+__cmpxchg_u32_acquire(u32 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+"1:	lwarx	%0,0,%2		# __cmpxchg_u32_acquire\n"
+"	cmpw	0,%0,%3\n"
+"	bne-	2f\n"
+	PPC405_ERR77(0, %2)
+"	stwcx.	%4,0,%2\n"
+"	bne-	1b\n"
+	PPC_ACQUIRE_BARRIER
+	"\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc", "memory");
+
+	return prev;
+}
+
 #ifdef CONFIG_PPC64
 static __always_inline unsigned long
 __cmpxchg_u64(volatile unsigned long *p, unsigned long old, unsigned long new)
@@ -224,6 +274,46 @@ __cmpxchg_u64_local(volatile unsigned long *p, unsigned long old,
 
 	return prev;
 }
+
+static __always_inline unsigned long
+__cmpxchg_u64_relaxed(u64 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+"1:	ldarx	%0,0,%2		# __cmpxchg_u64_relaxed\n"
+"	cmpd	0,%0,%3\n"
+"	bne-	2f\n"
+"	stdcx.	%4,0,%2\n"
+"	bne-	1b\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc");
+
+	return prev;
+}
+
+static __always_inline unsigned long
+__cmpxchg_u64_acquire(u64 *p, unsigned long old, unsigned long new)
+{
+	unsigned long prev;
+
+	__asm__ __volatile__ (
+"1:	ldarx	%0,0,%2		# __cmpxchg_u64_acquire\n"
+"	cmpd	0,%0,%3\n"
+"	bne-	2f\n"
+"	stdcx.	%4,0,%2\n"
+"	bne-	1b\n"
+	PPC_ACQUIRE_BARRIER
+	"\n"
+"2:"
+	: "=&r" (prev), "+m" (*p)
+	: "r" (p), "r" (old), "r" (new)
+	: "cc", "memory");
+
+	return prev;
+}
 #endif
 
 /* This function doesn't exist, so you'll get a linker error
@@ -262,6 +352,37 @@ __cmpxchg_local(volatile void *ptr, unsigned long old, unsigned long new,
 	return old;
 }
 
+static __always_inline unsigned long
+__cmpxchg_relaxed(void *ptr, unsigned long old, unsigned long new,
+		  unsigned int size)
+{
+	switch (size) {
+	case 4:
+		return __cmpxchg_u32_relaxed(ptr, old, new);
+#ifdef CONFIG_PPC64
+	case 8:
+		return __cmpxchg_u64_relaxed(ptr, old, new);
+#endif
+	}
+	__cmpxchg_called_with_bad_pointer();
+	return old;
+}
+
+static __always_inline unsigned long
+__cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new,
+		  unsigned int size)
+{
+	switch (size) {
+	case 4:
+		return __cmpxchg_u32_acquire(ptr, old, new);
+#ifdef CONFIG_PPC64
+	case 8:
+		return __cmpxchg_u64_acquire(ptr, old, new);
+#endif
+	}
+	__cmpxchg_called_with_bad_pointer();
+	return old;
+}
 #define cmpxchg(ptr, o, n)						 \
   ({									 \
      __typeof__(*(ptr)) _o_ = (o);					 \
@@ -279,6 +400,23 @@ __cmpxchg_local(volatile void *ptr, unsigned long old, unsigned long new,
 				    (unsigned long)_n_, sizeof(*(ptr))); \
   })
 
+#define cmpxchg_relaxed(ptr, o, n)					\
+({									\
+	__typeof__(*(ptr)) _o_ = (o);					\
+	__typeof__(*(ptr)) _n_ = (n);					\
+	(__typeof__(*(ptr))) __cmpxchg_relaxed((ptr),			\
+			(unsigned long)_o_, (unsigned long)_n_,		\
+			sizeof(*(ptr)));				\
+})
+
+#define cmpxchg_acquire(ptr, o, n)					\
+({									\
+	__typeof__(*(ptr)) _o_ = (o);					\
+	__typeof__(*(ptr)) _n_ = (n);					\
+	(__typeof__(*(ptr))) __cmpxchg_acquire((ptr),			\
+			(unsigned long)_o_, (unsigned long)_n_,		\
+			sizeof(*(ptr)));				\
+})
 #ifdef CONFIG_PPC64
 #define cmpxchg64(ptr, o, n)						\
   ({									\
@@ -290,7 +428,16 @@ __cmpxchg_local(volatile void *ptr, unsigned long old, unsigned long new,
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_local((ptr), (o), (n));					\
   })
-#define cmpxchg64_relaxed	cmpxchg64_local
+#define cmpxchg64_relaxed(ptr, o, n)					\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	cmpxchg_relaxed((ptr), (o), (n));				\
+})
+#define cmpxchg64_acquire(ptr, o, n)					\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	cmpxchg_acquire((ptr), (o), (n));				\
+})
 #else
 #include
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))
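
Illustration only, not part of the patch: the ordering difference between
the _relaxed and _acquire variants can be sketched in userspace with C11
atomics. This is a rough analogy under stated assumptions. The names
demo_cmpxchg_relaxed(), demo_cmpxchg_acquire() and demo_trylock() are
hypothetical, and <stdatomic.h> stands in for the lwarx/stwcx. loops above.
As in the patch, both helpers return the value observed at *p, and the
acquire variant requests ordering only when the compare succeeds, which
mirrors the "bne- 2f" branch skipping PPC_ACQUIRE_BARRIER.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical analogue of cmpxchg_relaxed(): atomically replace *p with
 * new_val if it equals old, with no ordering guarantee either way.
 * Returns the value that was observed at *p.
 */
static inline uint32_t demo_cmpxchg_relaxed(_Atomic uint32_t *p,
					    uint32_t old, uint32_t new_val)
{
	uint32_t expected = old;

	/* On failure, 'expected' is overwritten with the observed value. */
	atomic_compare_exchange_strong_explicit(p, &expected, new_val,
						memory_order_relaxed,
						memory_order_relaxed);
	return expected;
}

/*
 * Hypothetical analogue of cmpxchg_acquire(): acquire ordering is
 * requested only for the success case; a failed compare stays relaxed.
 */
static inline uint32_t demo_cmpxchg_acquire(_Atomic uint32_t *p,
					    uint32_t old, uint32_t new_val)
{
	uint32_t expected = old;

	atomic_compare_exchange_strong_explicit(p, &expected, new_val,
						memory_order_acquire,
						memory_order_relaxed);
	return expected;
}

/* Typical use of the acquire form: take a lock word that is 0 when free. */
static inline bool demo_trylock(_Atomic uint32_t *lock)
{
	return demo_cmpxchg_acquire(lock, 0, 1) == 0;
}
```

A caller that only needs atomicity, such as a statistics counter update,
could use the relaxed form, while a lock-style caller such as demo_trylock()
wants the acquire form so that accesses inside the critical section are
ordered after the successful exchange.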