From patchwork Mon Oct 12 14:14:04 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Boqun Feng
X-Patchwork-Id: 529121
From: Boqun Feng
To: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v3 4/6] powerpc: atomic: Implement atomic{,64}_*_return_* variants
Date: Mon, 12 Oct 2015 22:14:04 +0800
Message-Id: <1444659246-24769-5-git-send-email-boqun.feng@gmail.com>
X-Mailer: git-send-email 2.5.3
In-Reply-To: <1444659246-24769-1-git-send-email-boqun.feng@gmail.com>
References: <1444659246-24769-1-git-send-email-boqun.feng@gmail.com>
List-Id: Linux on PowerPC Developers Mail List
Cc: Waiman Long, Davidlohr Bueso, Peter Zijlstra, Boqun Feng, Will Deacon,
 Paul Mackerras, Thomas Gleixner, "Paul E. McKenney", Ingo Molnar
McKenney" , Ingo Molnar MIME-Version: 1.0 Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" On powerpc, acquire and release semantics can be achieved with lightweight barriers("lwsync" and "ctrl+isync"), which can be used to implement __atomic_op_{acquire,release}. For release semantics, since we only need to ensure all memory accesses that issue before must take effects before the -store- part of the atomics, "lwsync" is what we only need. On the platform without "lwsync", "sync" should be used. Therefore, smp_lwsync() is used here. For acquire semantics, "lwsync" is what we only need for the similar reason. However on the platform without "lwsync", we can use "isync" rather than "sync" as an acquire barrier. So a new kind of barrier smp_acquire_barrier__after_atomic() is introduced, which is barrier() on UP, "lwsync" if available and "isync" otherwise. __atomic_op_fence is defined as smp_lwsync() + _relaxed + smp_mb__after_atomic() to guarantee a full barrier. Implement atomic{,64}_{add,sub,inc,dec}_return_relaxed, and build other variants with these helpers. Signed-off-by: Boqun Feng --- arch/powerpc/include/asm/atomic.h | 122 +++++++++++++++++++++++++------------- 1 file changed, 80 insertions(+), 42 deletions(-) diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h index 55f106e..3143af9 100644 --- a/arch/powerpc/include/asm/atomic.h +++ b/arch/powerpc/include/asm/atomic.h @@ -12,6 +12,39 @@ #define ATOMIC_INIT(i) { (i) } +/* + * Since {add,sub}_return_relaxed and xchg_relaxed are implemented with + * a "bne-" instruction at the end, so an isync is enough as a acquire barrier + * on the platform without lwsync. + */ +#ifdef CONFIG_SMP +#define smp_acquire_barrier__after_atomic() \ + __asm__ __volatile__(PPC_ACQUIRE_BARRIER : : : "memory") +#else +#define smp_acquire_barrier__after_atomic() barrier() +#endif +#define __atomic_op_acquire(op, args...) \ +({ \ + typeof(op##_relaxed(args)) __ret = op##_relaxed(args); \ + smp_acquire_barrier__after_atomic(); \ + __ret; \ +}) + +#define __atomic_op_release(op, args...) \ +({ \ + smp_lwsync(); \ + op##_relaxed(args); \ +}) + +#define __atomic_op_fence(op, args...) \ +({ \ + typeof(op##_relaxed(args)) __ret; \ + smp_lwsync(); \ + __ret = op##_relaxed(args); \ + smp_mb__after_atomic(); \ + __ret; \ +}) + static __inline__ int atomic_read(const atomic_t *v) { int t; @@ -42,27 +75,27 @@ static __inline__ void atomic_##op(int a, atomic_t *v) \ : "cc"); \ } \ -#define ATOMIC_OP_RETURN(op, asm_op) \ -static __inline__ int atomic_##op##_return(int a, atomic_t *v) \ +#define ATOMIC_OP_RETURN_RELAXED(op, asm_op) \ +static inline int atomic_##op##_return_relaxed(int a, atomic_t *v) \ { \ int t; \ \ __asm__ __volatile__( \ - PPC_ATOMIC_ENTRY_BARRIER \ -"1: lwarx %0,0,%2 # atomic_" #op "_return\n" \ - #asm_op " %0,%1,%0\n" \ - PPC405_ERR77(0,%2) \ -" stwcx. %0,0,%2 \n" \ +"1: lwarx %0,0,%3 # atomic_" #op "_return_relaxed\n" \ + #asm_op " %0,%2,%0\n" \ + PPC405_ERR77(0, %3) \ +" stwcx. 
%0,0,%3\n" \ " bne- 1b\n" \ - PPC_ATOMIC_EXIT_BARRIER \ - : "=&r" (t) \ + : "=&r" (t), "+m" (v->counter) \ : "r" (a), "r" (&v->counter) \ - : "cc", "memory"); \ + : "cc"); \ \ return t; \ } -#define ATOMIC_OPS(op, asm_op) ATOMIC_OP(op, asm_op) ATOMIC_OP_RETURN(op, asm_op) +#define ATOMIC_OPS(op, asm_op) \ + ATOMIC_OP(op, asm_op) \ + ATOMIC_OP_RETURN_RELAXED(op, asm_op) ATOMIC_OPS(add, add) ATOMIC_OPS(sub, subf) @@ -71,8 +104,11 @@ ATOMIC_OP(and, and) ATOMIC_OP(or, or) ATOMIC_OP(xor, xor) +#define atomic_add_return_relaxed atomic_add_return_relaxed +#define atomic_sub_return_relaxed atomic_sub_return_relaxed + #undef ATOMIC_OPS -#undef ATOMIC_OP_RETURN +#undef ATOMIC_OP_RETURN_RELAXED #undef ATOMIC_OP #define atomic_add_negative(a, v) (atomic_add_return((a), (v)) < 0) @@ -92,21 +128,19 @@ static __inline__ void atomic_inc(atomic_t *v) : "cc", "xer"); } -static __inline__ int atomic_inc_return(atomic_t *v) +static __inline__ int atomic_inc_return_relaxed(atomic_t *v) { int t; __asm__ __volatile__( - PPC_ATOMIC_ENTRY_BARRIER -"1: lwarx %0,0,%1 # atomic_inc_return\n\ +"1: lwarx %0,0,%1 # atomic_inc_return_relaxed\n\ addic %0,%0,1\n" PPC405_ERR77(0,%1) " stwcx. %0,0,%1 \n\ bne- 1b" - PPC_ATOMIC_EXIT_BARRIER : "=&r" (t) : "r" (&v->counter) - : "cc", "xer", "memory"); + : "cc", "xer"); return t; } @@ -136,25 +170,26 @@ static __inline__ void atomic_dec(atomic_t *v) : "cc", "xer"); } -static __inline__ int atomic_dec_return(atomic_t *v) +static __inline__ int atomic_dec_return_relaxed(atomic_t *v) { int t; __asm__ __volatile__( - PPC_ATOMIC_ENTRY_BARRIER -"1: lwarx %0,0,%1 # atomic_dec_return\n\ +"1: lwarx %0,0,%1 # atomic_dec_return_relaxed\n\ addic %0,%0,-1\n" PPC405_ERR77(0,%1) " stwcx. %0,0,%1\n\ bne- 1b" - PPC_ATOMIC_EXIT_BARRIER : "=&r" (t) : "r" (&v->counter) - : "cc", "xer", "memory"); + : "cc", "xer"); return t; } +#define atomic_inc_return_relaxed atomic_inc_return_relaxed +#define atomic_dec_return_relaxed atomic_dec_return_relaxed + #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n))) #define atomic_xchg(v, new) (xchg(&((v)->counter), new)) @@ -285,26 +320,27 @@ static __inline__ void atomic64_##op(long a, atomic64_t *v) \ : "cc"); \ } -#define ATOMIC64_OP_RETURN(op, asm_op) \ -static __inline__ long atomic64_##op##_return(long a, atomic64_t *v) \ +#define ATOMIC64_OP_RETURN_RELAXED(op, asm_op) \ +static inline long \ +atomic64_##op##_return_relaxed(long a, atomic64_t *v) \ { \ long t; \ \ __asm__ __volatile__( \ - PPC_ATOMIC_ENTRY_BARRIER \ -"1: ldarx %0,0,%2 # atomic64_" #op "_return\n" \ - #asm_op " %0,%1,%0\n" \ -" stdcx. %0,0,%2 \n" \ +"1: ldarx %0,0,%3 # atomic64_" #op "_return_relaxed\n" \ + #asm_op " %0,%2,%0\n" \ +" stdcx. 
%0,0,%3\n" \ " bne- 1b\n" \ - PPC_ATOMIC_EXIT_BARRIER \ - : "=&r" (t) \ + : "=&r" (t), "+m" (v->counter) \ : "r" (a), "r" (&v->counter) \ - : "cc", "memory"); \ + : "cc"); \ \ return t; \ } -#define ATOMIC64_OPS(op, asm_op) ATOMIC64_OP(op, asm_op) ATOMIC64_OP_RETURN(op, asm_op) +#define ATOMIC64_OPS(op, asm_op) \ + ATOMIC64_OP(op, asm_op) \ + ATOMIC64_OP_RETURN_RELAXED(op, asm_op) ATOMIC64_OPS(add, add) ATOMIC64_OPS(sub, subf) @@ -312,8 +348,11 @@ ATOMIC64_OP(and, and) ATOMIC64_OP(or, or) ATOMIC64_OP(xor, xor) -#undef ATOMIC64_OPS -#undef ATOMIC64_OP_RETURN +#define atomic64_add_return_relaxed atomic64_add_return_relaxed +#define atomic64_sub_return_relaxed atomic64_sub_return_relaxed + +#undef ATOPIC64_OPS +#undef ATOMIC64_OP_RETURN_RELAXED #undef ATOMIC64_OP #define atomic64_add_negative(a, v) (atomic64_add_return((a), (v)) < 0) @@ -332,20 +371,18 @@ static __inline__ void atomic64_inc(atomic64_t *v) : "cc", "xer"); } -static __inline__ long atomic64_inc_return(atomic64_t *v) +static __inline__ long atomic64_inc_return_relaxed(atomic64_t *v) { long t; __asm__ __volatile__( - PPC_ATOMIC_ENTRY_BARRIER "1: ldarx %0,0,%1 # atomic64_inc_return\n\ addic %0,%0,1\n\ stdcx. %0,0,%1 \n\ bne- 1b" - PPC_ATOMIC_EXIT_BARRIER : "=&r" (t) : "r" (&v->counter) - : "cc", "xer", "memory"); + : "cc", "xer"); return t; } @@ -374,24 +411,25 @@ static __inline__ void atomic64_dec(atomic64_t *v) : "cc", "xer"); } -static __inline__ long atomic64_dec_return(atomic64_t *v) +static __inline__ long atomic64_dec_return_relaxed(atomic64_t *v) { long t; __asm__ __volatile__( - PPC_ATOMIC_ENTRY_BARRIER "1: ldarx %0,0,%1 # atomic64_dec_return\n\ addic %0,%0,-1\n\ stdcx. %0,0,%1\n\ bne- 1b" - PPC_ATOMIC_EXIT_BARRIER : "=&r" (t) : "r" (&v->counter) - : "cc", "xer", "memory"); + : "cc", "xer"); return t; } +#define atomic64_inc_return_relaxed atomic64_inc_return_relaxed +#define atomic64_dec_return_relaxed atomic64_dec_return_relaxed + #define atomic64_sub_and_test(a, v) (atomic64_sub_return((a), (v)) == 0) #define atomic64_dec_and_test(v) (atomic64_dec_return((v)) == 0)