From patchwork Fri Aug 28 02:48:17 2015
X-Patchwork-Submitter: Boqun Feng
X-Patchwork-Id: 511776
From: Boqun Feng
To: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Cc: Waiman Long, Peter Zijlstra, Boqun Feng, Will Deacon, Paul Mackerras,
 Thomas Gleixner, "Paul E. McKenney", Ingo Molnar
Subject: [RFC 3/5] powerpc: atomic: implement atomic{,64}_{add,sub}_return_* variants
Date: Fri, 28 Aug 2015 10:48:17 +0800
Message-Id: <1440730099-29133-4-git-send-email-boqun.feng@gmail.com>
In-Reply-To: <1440730099-29133-1-git-send-email-boqun.feng@gmail.com>
References: <1440730099-29133-1-git-send-email-boqun.feng@gmail.com>

On powerpc, we don't need a general memory barrier to achieve acquire and
release semantics, so arch_atomic_op_{acquire,release} can be implemented
using "lwsync" and "isync".

For release semantics, we only need to ensure that all memory accesses
issued before the atomic take effect before the -store- part of the atomic,
so "lwsync" is sufficient. On platforms without "lwsync", "sync" should be
used instead; therefore smp_lwsync() is used here.

For acquire semantics, "lwsync" is likewise sufficient. However, on
platforms without "lwsync" we can use "isync" rather than "sync" as the
acquire barrier, so a new barrier, smp_acquire_barrier__after_atomic(), is
introduced: it is barrier() on UP, "lwsync" where available, and "isync"
otherwise.

For fully ordered semantics, as in the original variants, smp_lwsync() is
placed before the relaxed variant and smp_mb__after_atomic() after it.
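
As an example (illustration only, not part of the diff below; "i" and "v"
are just sample arguments), arch_atomic_op_acquire(atomic_add_return, i, v)
expands to roughly:

	({
		int __ret = atomic_add_return_relaxed(i, v);
		/* acquire barrier: "lwsync", or "isync" on platforms without lwsync */
		smp_acquire_barrier__after_atomic();
		__ret;
	})

and arch_atomic_op_release(atomic_add_return, i, v) to:

	({
		/* release barrier: earlier accesses take effect before the store part */
		smp_lwsync();
		atomic_add_return_relaxed(i, v);
	})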

Signed-off-by: Boqun Feng
---
 arch/powerpc/include/asm/atomic.h | 88 ++++++++++++++++++++++++++++-----------
 1 file changed, 64 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 55f106e..806ce50 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -12,6 +12,39 @@
 
 #define ATOMIC_INIT(i)		{ (i) }
 
+/*
+ * Since {add,sub}_return_relaxed and xchg_relaxed are implemented with a
+ * "bne-" instruction at the end, an isync is enough as an acquire barrier
+ * on platforms without lwsync.
+ */
+#ifdef CONFIG_SMP
+#define smp_acquire_barrier__after_atomic() \
+	__asm__ __volatile__(PPC_ACQUIRE_BARRIER : : : "memory")
+#else
+#define smp_acquire_barrier__after_atomic() barrier()
+#endif
+#define arch_atomic_op_acquire(op, args...)				\
+({									\
+	typeof(op##_relaxed(args)) __ret = op##_relaxed(args);		\
+	smp_acquire_barrier__after_atomic();				\
+	__ret;								\
+})
+
+#define arch_atomic_op_release(op, args...)				\
+({									\
+	smp_lwsync();							\
+	op##_relaxed(args);						\
+})
+
+#define arch_atomic_op_fence(op, args...)				\
+({									\
+	typeof(op##_relaxed(args)) __ret;				\
+	smp_lwsync();							\
+	__ret = op##_relaxed(args);					\
+	smp_mb__after_atomic();						\
+	__ret;								\
+})
+
 static __inline__ int atomic_read(const atomic_t *v)
 {
 	int t;
@@ -42,27 +75,27 @@ static __inline__ void atomic_##op(int a, atomic_t *v)	\
 	: "cc");							\
 }									\
 
-#define ATOMIC_OP_RETURN(op, asm_op)					\
-static __inline__ int atomic_##op##_return(int a, atomic_t *v)		\
+#define ATOMIC_OP_RETURN_RELAXED(op, asm_op)				\
+static inline int atomic_##op##_return_relaxed(int a, atomic_t *v)	\
 {									\
 	int t;								\
 									\
 	__asm__ __volatile__(						\
-	PPC_ATOMIC_ENTRY_BARRIER					\
-"1:	lwarx	%0,0,%2		# atomic_" #op "_return\n"		\
-	#asm_op " %0,%1,%0\n"						\
-	PPC405_ERR77(0,%2)						\
-"	stwcx.	%0,0,%2 \n"						\
+"1:	lwarx	%0,0,%3		# atomic_" #op "_return_relaxed\n"	\
+	#asm_op " %0,%2,%0\n"						\
+	PPC405_ERR77(0, %3)						\
+"	stwcx.	%0,0,%3\n"						\
 "	bne-	1b\n"							\
-	PPC_ATOMIC_EXIT_BARRIER						\
-	: "=&r" (t)							\
+	: "=&r" (t), "+m" (v->counter)					\
 	: "r" (a), "r" (&v->counter)					\
-	: "cc", "memory");						\
+	: "cc");							\
 									\
 	return t;							\
 }
 
-#define ATOMIC_OPS(op, asm_op) ATOMIC_OP(op, asm_op) ATOMIC_OP_RETURN(op, asm_op)
+#define ATOMIC_OPS(op, asm_op)						\
+	ATOMIC_OP(op, asm_op)						\
+	ATOMIC_OP_RETURN_RELAXED(op, asm_op)
 
 ATOMIC_OPS(add, add)
 ATOMIC_OPS(sub, subf)
@@ -71,8 +104,11 @@ ATOMIC_OP(and, and)
 ATOMIC_OP(or, or)
 ATOMIC_OP(xor, xor)
 
+#define atomic_add_return_relaxed atomic_add_return_relaxed
+#define atomic_sub_return_relaxed atomic_sub_return_relaxed
+
 #undef ATOMIC_OPS
-#undef ATOMIC_OP_RETURN
+#undef ATOMIC_OP_RETURN_RELAXED
 #undef ATOMIC_OP
 
 #define atomic_add_negative(a, v)	(atomic_add_return((a), (v)) < 0)
@@ -285,26 +321,27 @@ static __inline__ void atomic64_##op(long a, atomic64_t *v)	\
 	: "cc");							\
 }
 
-#define ATOMIC64_OP_RETURN(op, asm_op)					\
-static __inline__ long atomic64_##op##_return(long a, atomic64_t *v)	\
+#define ATOMIC64_OP_RETURN_RELAXED(op, asm_op)				\
+static inline long							\
+atomic64_##op##_return_relaxed(long a, atomic64_t *v)			\
 {									\
 	long t;								\
 									\
 	__asm__ __volatile__(						\
-	PPC_ATOMIC_ENTRY_BARRIER					\
-"1:	ldarx	%0,0,%2		# atomic64_" #op "_return\n"		\
-	#asm_op " %0,%1,%0\n"						\
-"	stdcx.	%0,0,%2 \n"						\
+"1:	ldarx	%0,0,%3		# atomic64_" #op "_return_relaxed\n"	\
+	#asm_op " %0,%2,%0\n"						\
+"	stdcx.	%0,0,%3\n"						\
 "	bne-	1b\n"							\
-	PPC_ATOMIC_EXIT_BARRIER						\
-	: "=&r" (t)							\
+	: "=&r" (t), "+m" (v->counter)					\
 	: "r" (a), "r" (&v->counter)					\
-	: "cc", "memory");						\
+	: "cc");							\
 									\
 	return t;							\
 }
 
-#define ATOMIC64_OPS(op, asm_op) ATOMIC64_OP(op, asm_op) ATOMIC64_OP_RETURN(op, asm_op)
+#define ATOMIC64_OPS(op, asm_op)					\
+	ATOMIC64_OP(op, asm_op)						\
+	ATOMIC64_OP_RETURN_RELAXED(op, asm_op)
 
 ATOMIC64_OPS(add, add)
 ATOMIC64_OPS(sub, subf)
@@ -312,8 +349,11 @@ ATOMIC64_OP(and, and)
 ATOMIC64_OP(or, or)
 ATOMIC64_OP(xor, xor)
 
-#undef ATOMIC64_OPS
-#undef ATOMIC64_OP_RETURN
+#define atomic64_add_return_relaxed atomic64_add_return_relaxed
+#define atomic64_sub_return_relaxed atomic64_sub_return_relaxed
+
+#undef ATOMIC64_OPS
+#undef ATOMIC64_OP_RETURN_RELAXED
 #undef ATOMIC64_OP
 
 #define atomic64_add_negative(a, v)	(atomic64_add_return((a), (v)) < 0)
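
For completeness, the fully ordered composition (again illustration only,
using the same sample arguments): arch_atomic_op_fence(atomic_add_return, i, v)
expands to roughly:

	({
		int __ret;
		/* "lwsync" before the ll/sc loop ("sync" on platforms without lwsync) */
		smp_lwsync();
		__ret = atomic_add_return_relaxed(i, v);
		/* full memory barrier after the ll/sc loop */
		smp_mb__after_atomic();
		__ret;
	})

i.e. smp_lwsync() before the relaxed variant and smp_mb__after_atomic() after
it, as described in the changelog above.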