[PATCH-2,rs6000] Reverse V8HI on Power8 by vector rotation [PR100866]

Message ID a4b84496-105a-200f-3a88-2b0a33ce638d@linux.ibm.com
State New

Commit Message

HAO CHEN GUI Oct. 24, 2022, 3:14 a.m. UTC
Hi,
  This patch implements V8HI byte reverse on Power8 by vector rotation.
It should be more efficient than the original vector permute. The patch comes from
Xionghu's comments in the PR. I just added a test case for it.
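
For illustration only, here is a minimal scalar sketch of why rotating each
16-bit element by 8 bits is the same as reversing its two bytes. The helper
names rotl16 and revb_v8hi are hypothetical and model the vspltish/vrlh
sequence in plain C; they are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 16-bit value left by n bits, 0 < n < 16.
   Hypothetical helper: models what vrlh does to one halfword.  */
static uint16_t rotl16(uint16_t x, unsigned n)
{
    assert(n > 0 && n < 16);
    return (uint16_t)((x << n) | (x >> (16 - n)));
}

/* Byte-reverse every halfword of an 8-element vector.
   The constant 8 plays the role of the vspltish splat value.  */
static void revb_v8hi(uint16_t out[8], const uint16_t in[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = rotl16(in[i], 8);
}
```

With this model, an element holding 0x1234 becomes 0x3412. The result does not
depend on endianness because the rotate works on element values, not on byte
positions in memory.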

  Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.



ChangeLog
2022-10-24  Xionghu Luo <xionghuluo@tencent.com>

gcc/
	PR target/100866
	* config/rs6000/altivec.md (*altivec_vrl<VI_char>): Rename to...
	(altivec_vrl<VI_char>): ...this.
	* config/rs6000/vsx.md (revb_<mode>): Call vspltish and vrlh when
	target is Power8 and mode is V8HI.

gcc/testsuite/
	PR target/100866
	* gcc.target/powerpc/pr100866-2.c: New.

patch.diff

Comments

Segher Boessenkool Oct. 24, 2022, 9:40 p.m. UTC | #1
Hi!

On Mon, Oct 24, 2022 at 11:14:20AM +0800, HAO CHEN GUI wrote:
>   This patch implements V8HI byte reverse on Power8 by vector rotation.

Please put *byte* reverse in the commit subject as well?

> It should be more efficient than the original vector permute. The patch comes from
> Xionghu's comments in the PR. I just added a test case for it.

Yeah, on all existing CPUs such a rotate is as fast as or faster than a
permute insn.  And for bigger modes, we need more insns: two dependent
rotates for V4SI, and that is unlikely to be faster than a single
permutation, certainly not if code can be unrolled.
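
As a sketch of one possible reading of "two dependent rotates" for a 32-bit
element: rotate each halfword by 8, then rotate the whole word by 16. The
second rotate consumes the result of the first, which is the dependency. The
helper names below are hypothetical and the code is a scalar model, not
something the patch emits.

```c
#include <stdint.h>

/* Rotate a 16-bit value left by 8 bits (models a halfword rotate).  */
static uint16_t rotate_half_by_8(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Rotate a 32-bit value left by 16 bits (models a word rotate).  */
static uint32_t rotate_word_by_16(uint32_t x)
{
    return (x << 16) | (x >> 16);
}

/* Byte-reverse one 32-bit element with two dependent rotates.  */
static uint32_t revb_word(uint32_t x)
{
    /* Step 1: swap the bytes inside each halfword.  */
    uint16_t hi = rotate_half_by_8((uint16_t)(x >> 16));
    uint16_t lo = rotate_half_by_8((uint16_t)x);
    uint32_t halves_swapped = ((uint32_t)hi << 16) | lo;

    /* Step 2: swap the two halfwords; this needs the result of step 1.  */
    return rotate_word_by_16(halves_swapped);
}
```

In this model 0x11223344 becomes 0x22114433 after the first step and
0x44332211 after the second.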

Okay for trunk.  Thanks!


Segher

Patch

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 2c4940f2e21..84660073f32 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -1875,7 +1875,7 @@  (define_insn "altivec_vpku<VI_char>um_direct"
 }
   [(set_attr "type" "vecperm")])

-(define_insn "*altivec_vrl<VI_char>"
+(define_insn "altivec_vrl<VI_char>"
   [(set (match_operand:VI2 0 "register_operand" "=v")
         (rotate:VI2 (match_operand:VI2 1 "register_operand" "v")
 		    (match_operand:VI2 2 "register_operand" "v")))]
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index e226a93bbe5..34662a7252d 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -6092,12 +6092,21 @@  (define_expand "revb_<mode>"
     emit_insn (gen_p9_xxbr<VSX_XXBR>_<mode> (operands[0], operands[1]));
   else
     {
-      /* Want to have the elements in reverse order relative
-	 to the endian mode in use, i.e. in LE mode, put elements
-	 in BE order.  */
-      rtx sel = swap_endian_selector_for_mode(<MODE>mode);
-      emit_insn (gen_altivec_vperm_<mode> (operands[0], operands[1],
-					   operands[1], sel));
+      if (<MODE>mode == V8HImode)
+	{
+	  rtx splt = gen_reg_rtx (V8HImode);
+	  emit_insn (gen_altivec_vspltish (splt, GEN_INT (8)));
+	  emit_insn (gen_altivec_vrlh (operands[0], operands[1], splt));
+	}
+      else
+	{
+	  /* Want to have the elements in reverse order relative
+	     to the endian mode in use, i.e. in LE mode, put elements
+	     in BE order.  */
+	  rtx sel = swap_endian_selector_for_mode (<MODE>mode);
+	  emit_insn (gen_altivec_vperm_<mode> (operands[0], operands[1],
+					       operands[1], sel));
+	}
     }

   DONE;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr100866-2.c b/gcc/testsuite/gcc.target/powerpc/pr100866-2.c
new file mode 100644
index 00000000000..4357d1beb09
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr100866-2.c
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
+/* { dg-final { scan-assembler {\mvspltish\M} } } */
+/* { dg-final { scan-assembler {\mvrlh\M} } } */
+
+#include <altivec.h>
+
+vector unsigned short revb(vector unsigned short a)
+{
+   return vec_revb(a);
+}
+