powerpc: Reduce csum_add() complexity for PPC64

Message ID a4ca63dd4c4b09e1906d08fb814af5a41d0f3fcb.1644651363.git.christophe.leroy@csgroup.eu (mailing list archive)
State Accepted
Series powerpc: Reduce csum_add() complexity for PPC64

Checks

Context Check Description
snowpatch_ozlabs/github-powerpc_selftests success Successfully ran 8 jobs.
snowpatch_ozlabs/github-powerpc_ppctests success Successfully ran 8 jobs.
snowpatch_ozlabs/github-powerpc_clang success Successfully ran 7 jobs.
snowpatch_ozlabs/github-powerpc_sparse success Successfully ran 4 jobs.
snowpatch_ozlabs/github-powerpc_kernel_qemu success Successfully ran 24 jobs.

Commit Message

Christophe Leroy Feb. 12, 2022, 7:36 a.m. UTC
PPC64 does everything in C, so GCC is able to skip the calculation
when one of the operands is zero.

Move the constant folding into the PPC32 part.

This helps GCC and reduces ppc64_defconfig by 170 bytes.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/checksum.h | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)
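
For readers who want to try the arithmetic outside the kernel, here is a minimal user-space sketch of the two code paths the patch touches. It is an illustration under stated assumptions, not the kernel code: `wsum_t`, `csum_add64()` and `csum_add32_portable()` are hypothetical names, `__wsum` and `__force` are replaced by plain `uint32_t`, and the PPC32 `addc`/`addze` pair is emulated in portable C rather than written as inline assembly.

```c
#include <stdint.h>

/* Hypothetical user-space stand-in for the kernel's __wsum type. */
typedef uint32_t wsum_t;

/*
 * Mirrors the PPC64 branch of the patch: add in 64 bits, then fold the
 * carry (bit 32) back into the low 32 bits. Because this is plain C, a
 * compiler can simplify it by itself when an operand is a known zero,
 * so no explicit __builtin_constant_p() check is needed here.
 */
static inline wsum_t csum_add64(wsum_t csum, wsum_t addend)
{
	uint64_t res = (uint64_t)csum;

	res += (uint64_t)addend;
	return (wsum_t)((uint32_t)res + (res >> 32));
}

/*
 * Assumed portable emulation of the PPC32 "addc; addze" sequence:
 * a 32-bit add followed by adding the carry-out back in.
 * The real code uses inline asm, which the compiler cannot see through;
 * that is why the constant-zero shortcuts stay on the PPC32 side.
 */
static inline wsum_t csum_add32_portable(wsum_t csum, wsum_t addend)
{
	uint32_t sum = csum + addend;

	/* Unsigned wrap-around occurred exactly when sum < addend. */
	return sum + (sum < addend);
}
```

Both helpers return the other operand unchanged when one operand is zero, and both wrap the carry around, for example 0xffffffff + 1 gives 1.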

Comments

Michael Ellerman May 15, 2022, 10:28 a.m. UTC | #1
On Sat, 12 Feb 2022 08:36:17 +0100, Christophe Leroy wrote:
> PPC64 does everything in C, so GCC is able to skip the calculation
> when one of the operands is zero.
> 
> Move the constant folding into the PPC32 part.
> 
> This helps GCC and reduces ppc64_defconfig by 170 bytes.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc: Reduce csum_add() complexity for PPC64
      https://git.kernel.org/powerpc/c/f206fdd9d41bf7deb96219b8ca3499a5abd79b83

cheers
Patch

diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h
index 3288a1bf5e8d..e4e25b46ac49 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -95,16 +95,15 @@  static __always_inline __wsum csum_add(__wsum csum, __wsum addend)
 {
 #ifdef __powerpc64__
 	u64 res = (__force u64)csum;
-#endif
+
+	res += (__force u64)addend;
+	return (__force __wsum)((u32)res + (res >> 32));
+#else
 	if (__builtin_constant_p(csum) && csum == 0)
 		return addend;
 	if (__builtin_constant_p(addend) && addend == 0)
 		return csum;
 
-#ifdef __powerpc64__
-	res += (__force u64)addend;
-	return (__force __wsum)((u32)res + (res >> 32));
-#else
 	asm("addc %0,%0,%1;"
 	    "addze %0,%0;"
 	    : "+r" (csum) : "r" (addend) : "xer");