Message ID | 20230222090112.187583-1-kconsul@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Superseded |
Series | arch/powerpc/include/asm/barrier.h: redefine rmb and wmb to lwsync |

Context | Check | Description |
---|---|---|
snowpatch_ozlabs/github-powerpc_sparse | fail | sparse (ppc64, ubuntu-22.04, ppc64) failed at step Build. |
snowpatch_ozlabs/github-powerpc_kernel_qemu | fail | 2 of 20 jobs failed. |
snowpatch_ozlabs/github-powerpc_ppctests | success | Successfully ran 8 jobs. |
snowpatch_ozlabs/github-powerpc_selftests | success | Successfully ran 8 jobs. |
snowpatch_ozlabs/github-powerpc_clang | success | Successfully ran 6 jobs. |

Sorry, sent the wrong patch! Please ignore this one. Sending the v2 in another email.

On Wed, Feb 22, 2023 at 02:31:12PM +0530, Kautuk Consul wrote:
> A link from ibm.com states:
> "Ensures that all instructions preceding the call to __lwsync
> complete before any subsequent store instructions can be executed
> on the processor that executed the function. Also, it ensures that
> all load instructions preceding the call to __lwsync complete before
> any subsequent load instructions can be executed on the processor
> that executed the function. This allows you to synchronize between
> multiple processors with minimal performance impact, as __lwsync
> does not wait for confirmation from each processor."
>
> Thats why smp_rmb() and smp_wmb() are defined to lwsync.
> But this same understanding applies to parallel pipeline
> execution on each PowerPC processor.
> So, use the lwsync instruction for rmb() and wmb() on the PPC
> architectures that support it.
>
> Also removed some useless spaces.
>
> Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/barrier.h | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
> index e80b2c0e9315..553f5a5d20bd 100644
> --- a/arch/powerpc/include/asm/barrier.h
> +++ b/arch/powerpc/include/asm/barrier.h
> @@ -41,11 +41,17 @@
>
>  /* The sub-arch has lwsync */
>  #if defined(CONFIG_PPC64) || defined(CONFIG_PPC_E500MC)
> -# define SMPWMB LWSYNC
> +#undef rmb
> +#undef wmb
> +/* Redefine rmb() to lwsync. */
> +#define rmb() ({__asm__ __volatile__ ("lwsync" : : : "memory"); })
> +/* Redefine wmb() to lwsync. */
> +#define wmb() ({__asm__ __volatile__ ("lwsync" : : : "memory"); })
> +#define SMPWMB LWSYNC
>  #elif defined(CONFIG_BOOKE)
> -# define SMPWMB mbar
> +#define SMPWMB mbar
>  #else
> -# define SMPWMB eieio
> +#define SMPWMB eieio
>  #endif
>
>  /* clang defines this macro for a builtin, which will not work with runtime patching */
> --
> 2.31.1
>

diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
index e80b2c0e9315..553f5a5d20bd 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -41,11 +41,17 @@

 /* The sub-arch has lwsync */
 #if defined(CONFIG_PPC64) || defined(CONFIG_PPC_E500MC)
-# define SMPWMB LWSYNC
+#undef rmb
+#undef wmb
+/* Redefine rmb() to lwsync. */
+#define rmb() ({__asm__ __volatile__ ("lwsync" : : : "memory"); })
+/* Redefine wmb() to lwsync. */
+#define wmb() ({__asm__ __volatile__ ("lwsync" : : : "memory"); })
+#define SMPWMB LWSYNC
 #elif defined(CONFIG_BOOKE)
-# define SMPWMB mbar
+#define SMPWMB mbar
 #else
-# define SMPWMB eieio
+#define SMPWMB eieio
 #endif

 /* clang defines this macro for a builtin, which will not work with runtime patching */

A link from ibm.com states:
"Ensures that all instructions preceding the call to __lwsync
complete before any subsequent store instructions can be executed
on the processor that executed the function. Also, it ensures that
all load instructions preceding the call to __lwsync complete before
any subsequent load instructions can be executed on the processor
that executed the function. This allows you to synchronize between
multiple processors with minimal performance impact, as __lwsync
does not wait for confirmation from each processor."

That's why smp_rmb() and smp_wmb() are defined to lwsync.
But this same understanding applies to parallel pipeline
execution on each PowerPC processor.
So, use the lwsync instruction for rmb() and wmb() on the PPC
architectures that support it.

Also removed some useless spaces.

Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/barrier.h | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
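
The store-before-store and load-before-load ordering quoted above is what a paired write barrier and read barrier rely on when one CPU publishes data to another. The following is a minimal user-space sketch of that pairing, not kernel code: it assumes C11 atomics and POSIX threads, and every name in it (payload, ready, writer, reader, run_message_passing) is hypothetical. The release and acquire fences only approximate smp_wmb() and smp_rmb(); on PowerPC, GCC typically lowers both fences to lwsync, but that is a property of the compiler, not of this patch. The sketch covers ordinary cacheable memory shared between CPUs and says nothing about accesses to device memory.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* All names below are hypothetical and exist only for this illustration. */
static int payload;        /* ordinary data published by the writer */
static atomic_int ready;   /* flag announcing that payload is valid */

static void *writer(void *unused)
{
    (void)unused;
    payload = 42;                                  /* store the data first */
    /* Plays the role of smp_wmb(): keep the data store ahead of the flag store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return NULL;
}

static void *reader(void *out)
{
    /* Spin until the writer's flag becomes visible. */
    while (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        ;
    /* Plays the role of smp_rmb(): keep the flag load ahead of the data load. */
    atomic_thread_fence(memory_order_acquire);
    *(int *)out = payload;
    return NULL;
}

/* Runs one writer/reader pair and returns the payload the reader observed. */
int run_message_passing(void)
{
    pthread_t w, r;
    int seen = -1;
    int rc;

    payload = 0;
    atomic_store(&ready, 0);

    rc = pthread_create(&r, NULL, reader, &seen);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

Calling run_message_passing() from a test harness should always yield the value stored by the writer, because the reader cannot see the flag without also seeing the payload. Dropping either fence removes that guarantee on weakly ordered hardware.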