
rs6000: Add vector count under mask

Message ID 20200508021132.113615-1-wschmidt@linux.ibm.com
State New

Commit Message

Bill Schmidt May 8, 2020, 2:11 a.m. UTC
From: Kelvin Nilsen <kelvin@gcc.gnu.org>

Add support for new vclzdm and vctzdm vector instructions that
count leading and trailing zeros under control of a mask.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Is this okay for master?

Thanks,
Bill

[gcc]

2020-05-07  Kelvin Nilsen  <kelvin@gcc.gnu.org>
	    Bill Schmidt  <wschmidt@linux.ibm.com>

	* config/rs6000/altivec.h (vec_clzm): New macro.
	(vec_ctzm): Likewise.
	* config/rs6000/altivec.md (UNSPEC_VCLZDM): New constant.
	(UNSPEC_VCTZDM): Likewise.
	(vclzdm): New insn.
	(vctzdm): Likewise.
	* config/rs6000/rs6000-builtin.def (BU_FUTURE_V_0): New macro.
	(BU_FUTURE_V_1): Likewise.
	(BU_FUTURE_V_2): Likewise.
	(BU_FUTURE_V_3): Likewise.
	(__builtin_altivec_vclzdm): New builtin definition.
	(__builtin_altivec_vctzdm): Likewise.
	* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Cause
	_ARCH_PWR_FUTURE macro to be defined if OPTION_MASK_FUTURE flag is
	set.
	* config/rs6000/rs6000-call.c (builtin_function_type): Set return
	value and parameter types to be unsigned for VCLZDM and VCTZDM.
	* config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): Add
	support for TARGET_FUTURE flag.
	* config/rs6000/rs6000.h (RS6000_BTM_FUTURE): New macro constant.
	* doc/extend.texi (PowerPC AltiVec Built-in Functions Available
	for a Future Architecture): New subsubsection.

[gcc/testsuite]

2020-05-07  Kelvin Nilsen  <kelvin@gcc.gnu.org>

	* gcc.target/powerpc/vec-clzm-0.c: New test.
	* gcc.target/powerpc/vec-clzm-1.c: New test.
	* gcc.target/powerpc/vec-ctzm-0.c: New test.
	* gcc.target/powerpc/vec-ctzm-1.c: New test.
---
 gcc/config/rs6000/altivec.h                   |  7 +++
 gcc/config/rs6000/altivec.md                  | 21 ++++++++
 gcc/config/rs6000/rs6000-builtin.def          | 40 ++++++++++++++
 gcc/config/rs6000/rs6000-c.c                  |  2 +
 gcc/config/rs6000/rs6000-call.c               |  2 +
 gcc/config/rs6000/rs6000.c                    |  3 +-
 gcc/config/rs6000/rs6000.h                    |  2 +
 gcc/doc/extend.texi                           | 27 ++++++++++
 gcc/testsuite/gcc.target/powerpc/vec-clzm-0.c | 54 +++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/vec-clzm-1.c | 54 +++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/vec-ctzm-0.c | 54 +++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/vec-ctzm-1.c | 53 ++++++++++++++++++
 12 files changed, 318 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-clzm-0.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-clzm-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-ctzm-0.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-ctzm-1.c

Comments

Segher Boessenkool May 8, 2020, 7 p.m. UTC | #1
On Thu, May 07, 2020 at 09:11:32PM -0500, Bill Schmidt wrote:
> From: Kelvin Nilsen <kelvin@gcc.gnu.org>
> 
> Add support for new vclzdm and vctzdm vector instructions that
> count leading and trailing zeros under control of a mask.

> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
> regressions.  Is this okay for master?

(On what CPU / with what -mcpu= settings?)

> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 6b1d987913c..5ef4889ba55 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -160,6 +160,8 @@ (define_c_enum "unspec"
>     UNSPEC_BCD_OVERFLOW
>     UNSPEC_VRLMI
>     UNSPEC_VRLNM
> +   UNSPEC_VCLZDM
> +   UNSPEC_VCTZDM

Hrm, this can actually be the same unspecs as used for the GPR version,
the mode will make the difference already?  Doesn't really matter of
course.

(This needs an unspec because it isn't viable to describe in RTL what
this op does -- it is not an AND  with the mask and then a count, the
masked-out bits are actually skipped for the count).
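
To make that distinction concrete, here is a minimal scalar sketch of one
doubleword lane. It assumes the semantics implied by the expected results in
the vec-clzm and vec-ctzm tests in this patch. The function names are
hypothetical and are not part of the patch or of GCC.

```c
#include <stdint.h>

/* Hypothetical model of one 64-bit lane of vclzdm: walk from the most
   significant bit down, look only at positions selected by MASK, and
   count selected zero bits until the first selected one bit.  */
uint64_t
clzdm_model (uint64_t src, uint64_t mask)
{
  uint64_t count = 0;
  for (int bit = 63; bit >= 0; bit--)
    {
      uint64_t sel = 1ull << bit;
      if (!(mask & sel))
	continue;		/* Masked-out bit: skipped, not counted.  */
      if (src & sel)
	break;			/* First selected one bit ends the count.  */
      count++;
    }
  return count;
}

/* Same idea for vctzdm, walking up from the least significant bit.  */
uint64_t
ctzdm_model (uint64_t src, uint64_t mask)
{
  uint64_t count = 0;
  for (int bit = 0; bit <= 63; bit++)
    {
      uint64_t sel = 1ull << bit;
      if (!(mask & sel))
	continue;
      if (src & sel)
	break;
      count++;
    }
  return count;
}

/* The naive alternative: AND with the mask, then an ordinary count of
   leading zeros.  Masked-out positions are counted here.  */
uint64_t
and_then_clz (uint64_t src, uint64_t mask)
{
  uint64_t v = src & mask;
  uint64_t count = 0;
  for (int bit = 63; bit >= 0 && !(v & (1ull << bit)); bit--)
    count++;
  return count;
}
```

With source 0xa5f07e3c and mask 0xffff0000 (the first lane of source_a and
mask_a in the tests), clzdm_model returns 0, which matches result_aa, while
and_then_clz returns 32 because it also counts the 32 masked-out high bits.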

Looks fine to me, thanks,


Segher
Li, Pan2 via Gcc-patches May 8, 2020, 7:08 p.m. UTC | #2
On 5/8/20 2:00 PM, Segher Boessenkool wrote:
> On Thu, May 07, 2020 at 09:11:32PM -0500, Bill Schmidt wrote:
>> From: Kelvin Nilsen <kelvin@gcc.gnu.org>
>>
>> Add support for new vclzdm and vctzdm vector instructions that
>> count leading and trailing zeros under control of a mask.
>> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
>> regressions.  Is this okay for master?
> (On what CPU / with what -mcpu= settings?)


Sorry for lack of clarity.  All of these patches are tested on a P9.  
The test cases have appropriate -mcpu= settings.  Those run-time tests 
requiring an architecture that supports these instructions show up as 
UNSUPPORTED in that configuration, of course.  My understanding is that 
Kelvin ran these tests on a simulator, but I do not know that for 
certain and haven't repeated those tests.  Any problems that may have 
crept in since then will get caught at such time that hardware is available.

>
>> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
>> index 6b1d987913c..5ef4889ba55 100644
>> --- a/gcc/config/rs6000/altivec.md
>> +++ b/gcc/config/rs6000/altivec.md
>> @@ -160,6 +160,8 @@ (define_c_enum "unspec"
>>      UNSPEC_BCD_OVERFLOW
>>      UNSPEC_VRLMI
>>      UNSPEC_VRLNM
>> +   UNSPEC_VCLZDM
>> +   UNSPEC_VCTZDM
> Hrm, this can actually be the same unspecs as used for the GPR version,
> the mode will make the difference already?  Doesn't really matter of
> course.
True!
>
> (This needs an unspec because it isn't viable to describe in RTL what
> this op does -- it is not an AND  with the mask and then a count, the
> masked-out bits are actually skipped for the count).
>
> Looks fine to me, thanks,

Thanks,
Bill

>
>
> Segher

Patch

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 5f1f5924488..e1e75ad0f1e 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -686,4 +686,11 @@  __altivec_scalar_pred(vec_any_nle,
    to #define vec_step to __builtin_vec_step.  */
 #define vec_step(x) __builtin_vec_step (* (__typeof__ (x) *) 0)
 
+#ifdef _ARCH_PWR_FUTURE
+/* May modify these macro definitions if future capabilities overload
+   with support for different vector argument and result types.  */
+#define vec_clzm(a, b)	__builtin_altivec_vclzdm (a, b)
+#define vec_ctzm(a, b)	__builtin_altivec_vctzdm (a, b)
+#endif
+
 #endif /* _ALTIVEC_H */
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 6b1d987913c..5ef4889ba55 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -160,6 +160,8 @@  (define_c_enum "unspec"
    UNSPEC_BCD_OVERFLOW
    UNSPEC_VRLMI
    UNSPEC_VRLNM
+   UNSPEC_VCLZDM
+   UNSPEC_VCTZDM
 ])
 
 (define_c_enum "unspecv"
@@ -4096,6 +4098,25 @@  (define_insn "*bcd<bcd_add_sub>_test2"
   "bcd<bcd_add_sub>. %0,%1,%2,%3"
   [(set_attr "type" "vecsimple")])
 
+(define_insn "vclzdm"
+  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
+	(unspec:V2DI [(match_operand:V2DI 1 "altivec_register_operand" "v")
+		      (match_operand:V2DI 2 "altivec_register_operand" "v")]
+	 UNSPEC_VCLZDM))]
+   "TARGET_FUTURE"
+   "vclzdm %0,%1,%2"
+   [(set_attr "type" "vecsimple")])
+
+(define_insn "vctzdm"
+  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
+	(unspec:V2DI [(match_operand:V2DI 1 "altivec_register_operand" "v")
+		      (match_operand:V2DI 2 "altivec_register_operand" "v")]
+	 UNSPEC_VCTZDM))]
+   "TARGET_FUTURE"
+   "vctzdm %0,%1,%2"
+   [(set_attr "type" "vecsimple")])
+
+
 (define_expand "bcd<bcd_add_sub>_<code>"
   [(parallel [(set (reg:CCFP CR6_REGNO)
 		   (compare:CCFP
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 54f750c8384..9293e7cf4fb 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -933,6 +933,42 @@ 
 		     | RS6000_BTC_BINARY),				\
 		    CODE_FOR_ ## ICODE)			/* ICODE */
 
+/* For vector builtins for instructions which may be added at some point in
+   the future that are encoded as altivec instructions, use
+   __builtin_altivec_ as the builtin name.  */
+
+#define BU_FUTURE_V_0(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_0 (FUTURE_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_altivec_" NAME,		/* NAME */	\
+		    RS6000_BTM_FUTURE,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_SPECIAL),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+#define BU_FUTURE_V_1(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_1 (FUTURE_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_altivec_" NAME,		/* NAME */	\
+		    RS6000_BTM_FUTURE,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_UNARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+#define BU_FUTURE_V_2(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_2 (FUTURE_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_altivec_" NAME,		/* NAME */	\
+		    RS6000_BTM_FUTURE,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_BINARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+#define BU_FUTURE_V_3(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_3 (FUTURE_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_altivec_" NAME,		/* NAME */	\
+		    RS6000_BTM_FUTURE,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_TERNARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
 #endif
 
 
@@ -2479,6 +2515,10 @@  BU_P9_OVERLOAD_2 (CMPRB,	"byte_in_range")
 BU_P9_OVERLOAD_2 (CMPRB2,	"byte_in_either_range")
 BU_P9_OVERLOAD_2 (CMPEQB,	"byte_in_set")
 
+/* Future architecture vector built-ins.  */
+BU_FUTURE_V_2 (VCLZDM, "vclzdm", CONST, vclzdm)
+BU_FUTURE_V_2 (VCTZDM, "vctzdm", CONST, vctzdm)
+
 /* 1 argument crypto functions.  */
 BU_CRYPTO_1 (VSBOX,		"vsbox",	  CONST, crypto_vsbox_v2di)
 BU_CRYPTO_1 (VSBOX_BE,		"vsbox_be",	  CONST, crypto_vsbox_v16qi)
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index e59eff95cf4..ee2db96f2bd 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -448,6 +448,8 @@  rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags,
     rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8");
   if ((flags & OPTION_MASK_MODULO) != 0)
     rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR9");
+  if ((flags & OPTION_MASK_FUTURE) != 0)
+    rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR_FUTURE");
   if ((flags & OPTION_MASK_SOFT_FLOAT) != 0)
     rs6000_define_or_undefine_macro (define_p, "_SOFT_FLOAT");
   if ((flags & OPTION_MASK_RECIP_PRECISION) != 0)
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 68164b912f0..2a4ce5bd340 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -12926,6 +12926,8 @@  builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0,
     case P8V_BUILTIN_ORC_V4SI_UNS:
     case P8V_BUILTIN_ORC_V2DI_UNS:
     case P8V_BUILTIN_ORC_V1TI_UNS:
+    case FUTURE_BUILTIN_VCLZDM:
+    case FUTURE_BUILTIN_VCTZDM:
       h.uns_p[0] = 1;
       h.uns_p[1] = 1;
       h.uns_p[2] = 1;
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 355aea8628d..273a7215bc5 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -3336,7 +3336,8 @@  rs6000_builtin_mask_calculate (void)
 	      && TARGET_HARD_FLOAT
 	      && !TARGET_IEEEQUAD)	    ? RS6000_BTM_LDBL128   : 0)
 	  | ((TARGET_FLOAT128_TYPE)	    ? RS6000_BTM_FLOAT128  : 0)
-	  | ((TARGET_FLOAT128_HW)	    ? RS6000_BTM_FLOAT128_HW : 0));
+	  | ((TARGET_FLOAT128_HW)	    ? RS6000_BTM_FLOAT128_HW : 0)
+	  | ((TARGET_FUTURE)                ? RS6000_BTM_FUTURE    : 0));
 }
 
 /* Implement TARGET_MD_ASM_ADJUST.  All asm statements are considered
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 1adc371d70f..5603af994fa 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -2304,6 +2304,8 @@  extern int frame_pointer_needed;
 #define RS6000_BTM_POWERPC64	MASK_POWERPC64	/* 64-bit registers.  */
 #define RS6000_BTM_FLOAT128	MASK_FLOAT128_KEYWORD /* IEEE 128-bit float.  */
 #define RS6000_BTM_FLOAT128_HW	MASK_FLOAT128_HW /* IEEE 128-bit float h/w.  */
+#define RS6000_BTM_FUTURE	MASK_FUTURE
+
 
 #define RS6000_BTM_COMMON	(RS6000_BTM_ALTIVEC			\
 				 | RS6000_BTM_VSX			\
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 936c22e2fe7..aa8ab3a8dc5 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -17645,6 +17645,7 @@  briefly described below.
 * PowerPC AltiVec Built-in Functions Available on ISA 2.06::
 * PowerPC AltiVec Built-in Functions Available on ISA 2.07::
 * PowerPC AltiVec Built-in Functions Available on ISA 3.0::
+* PowerPC AltiVec Built-in Functions Available for a Future Architecture::
 @end menu
 
 @node PowerPC AltiVec Built-in Functions on ISA 2.05
@@ -20687,6 +20688,32 @@  void vec_xst (vector char, int, char *);
 void vec_xst (vector unsigned char, int, vector unsigned char *);
 void vec_xst (vector unsigned char, int, unsigned char *);
 @end smallexample
+
+@node PowerPC AltiVec Built-in Functions Available for a Future Architecture
+@subsubsection PowerPC AltiVec Built-in Functions Available for a Future Architecture
+
+The following additional built-in functions are also available for the
+PowerPC family of processors, starting with a hypothetical CPU
+which may or may not be available in the future
+(@option{-mcpu=future}) or later:
+
+@smallexample
+@exdent vector unsigned long long int
+@exdent vec_clzm (vector unsigned long long int, vector unsigned long long int)
+@end smallexample
+Perform a vector count leading zeros under bit mask operation, as if
+implemented by the Future @code{vclzdm} instruction.
+@findex vec_clzm
+
+@smallexample
+@exdent vector unsigned long long int
+@exdent vec_ctzm (vector unsigned long long int, vector unsigned long long int)
+@end smallexample
+Perform a vector count trailing zeros under bit mask operation, as if
+implemented by the Future @code{vctzdm} instruction.
+@findex vec_ctzm
+
+
 @node PowerPC Hardware Transactional Memory Built-in Functions
 @subsection PowerPC Hardware Transactional Memory Built-in Functions
 GCC provides two interfaces for accessing the Hardware Transactional
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-clzm-0.c b/gcc/testsuite/gcc.target/powerpc/vec-clzm-0.c
new file mode 100644
index 00000000000..099c5dc99bf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-clzm-0.c
@@ -0,0 +1,54 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=future" } */
+
+#include <altivec.h>
+
+extern void abort (void);
+
+vector unsigned long long int
+do_vec_clzm (vector unsigned long long int source,
+	     vector unsigned long long int mask)
+{
+  return vec_clzm (source, mask);
+}
+
+int main (int argc, char *argv [])
+{
+  vector unsigned long long int source_a = { 0xa5f07e3cull, 0x7e3ca5f0ull };
+  vector unsigned long long int source_b = { 0x3ca5f07eull, 0x5a0fe7c3ull };
+
+  vector unsigned long long int mask_a = { 0xffff0000ull, 0x0000ffffull };
+  vector unsigned long long int mask_b = { 0x0f0f0f0full, 0xf0f0f0f0ull };
+
+  /* See cntlzdm-0.c for derivation of expected results.
+
+     result_aa [0] is compute (source [0], mask [0]);
+     result_aa [1] is compute (source [1], mask [1]).
+
+     result_ab [0] is compute (source [0], mask [2]);
+     result_ab [1] is compute (source [1], mask [3]).
+
+     result_ba [0] is compute (source [2], mask [0]);
+     result_ba [1] is compute (source [3], mask [1]).
+
+     result_bb [0] is compute (source [2], mask [2]);
+     result_bb [1] is compute (source [3], mask [3]).  */
+
+  vector unsigned long long int result_aa = { 0, 0 };
+  vector unsigned long long int result_ab = { 1, 1 };
+  vector unsigned long long int result_ba = { 2, 0 };
+  vector unsigned long long int result_bb = { 0, 1 };
+
+  if (!vec_all_eq (do_vec_clzm (source_a, mask_a), result_aa))
+    abort ();
+  if (!vec_all_eq (do_vec_clzm (source_a, mask_b), result_ab))
+    abort ();
+  if (!vec_all_eq (do_vec_clzm (source_b, mask_a), result_ba))
+    abort ();
+  if (!vec_all_eq (do_vec_clzm (source_b, mask_b), result_bb))
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler {\mvclzdm\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-clzm-1.c b/gcc/testsuite/gcc.target/powerpc/vec-clzm-1.c
new file mode 100644
index 00000000000..43b86114487
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-clzm-1.c
@@ -0,0 +1,54 @@ 
+/* { dg-do run } */
+/* { dg-require-effective-target powerpc_future_hw } */
+/* { dg-options "-mdejagnu-cpu=future" } */
+
+#include <altivec.h>
+
+extern void abort (void);
+
+vector unsigned long long int
+do_vec_clzm (vector unsigned long long int source,
+	     vector unsigned long long int mask)
+{
+  return vec_clzm (source, mask);
+}
+
+int main (int argc, char *argv [])
+{
+  vector unsigned long long int source_a = { 0xa5f07e3cull, 0x7e3ca5f0ull };
+  vector unsigned long long int source_b = { 0x3ca5f07eull, 0x5a0fe7c3ull };
+
+  vector unsigned long long int mask_a = { 0xffff0000ull, 0x0000ffffull };
+  vector unsigned long long int mask_b = { 0x0f0f0f0full, 0xf0f0f0f0ull };
+
+  /* See cntlzdm-0.c for derivation of expected results.
+
+     result_aa [0] is compute (source [0], mask [0]);
+     result_aa [1] is compute (source [1], mask [1]).
+
+     result_ab [0] is compute (source [0], mask [2]);
+     result_ab [1] is compute (source [1], mask [3]).
+
+     result_ba [0] is compute (source [2], mask [0]);
+     result_ba [1] is compute (source [3], mask [1]).
+
+     result_bb [0] is compute (source [2], mask [2]);
+     result_bb [1] is compute (source [3], mask [3]).  */
+
+  vector unsigned long long int result_aa = { 0, 0 };
+  vector unsigned long long int result_ab = { 1, 1 };
+  vector unsigned long long int result_ba = { 2, 0 };
+  vector unsigned long long int result_bb = { 0, 1 };
+
+  if (!vec_all_eq (do_vec_clzm (source_a, mask_a), result_aa))
+    abort ();
+  if (!vec_all_eq (do_vec_clzm (source_a, mask_b), result_ab))
+    abort ();
+  if (!vec_all_eq (do_vec_clzm (source_b, mask_a), result_ba))
+    abort ();
+  if (!vec_all_eq (do_vec_clzm (source_b, mask_b), result_bb))
+    abort ();
+
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-ctzm-0.c b/gcc/testsuite/gcc.target/powerpc/vec-ctzm-0.c
new file mode 100644
index 00000000000..315edf4d4cd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-ctzm-0.c
@@ -0,0 +1,54 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=future" } */
+
+#include <altivec.h>
+
+extern void abort (void);
+
+vector unsigned long long int
+do_vec_ctzm (vector unsigned long long int source,
+	     vector unsigned long long int mask)
+{
+  return vec_ctzm (source, mask);
+}
+
+int main (int argc, char *argv [])
+{
+  vector unsigned long long int source_a = { 0xa5f07e3cull, 0x7e3ca5f0ull };
+  vector unsigned long long int source_b = { 0x3ca5f07eull, 0x5a0fe7c3ull };
+
+  vector unsigned long long int mask_a = { 0xffff0000ull, 0x0000ffffull };
+  vector unsigned long long int mask_b = { 0x0f0f0f0full, 0xf0f0f0f0ull };
+
+  /* See cnttzdm-0.c for derivation of expected results.
+
+     result_aa [0] is compute (source [0], mask [0]);
+     result_aa [1] is compute (source [1], mask [1]).
+
+     result_ab [0] is compute (source [0], mask [2]);
+     result_ab [1] is compute (source [1], mask [3]).
+
+     result_ba [0] is compute (source [2], mask [0]);
+     result_ba [1] is compute (source [3], mask [1]).
+
+     result_bb [0] is compute (source [2], mask [2]);
+     result_bb [1] is compute (source [3], mask [3]).  */
+
+  vector unsigned long long int result_aa = { 4, 4 };
+  vector unsigned long long int result_ab = { 2, 0 };
+  vector unsigned long long int result_ba = { 0, 0 };
+  vector unsigned long long int result_bb = { 1, 2 };
+
+  if (!vec_all_eq (do_vec_ctzm (source_a, mask_a), result_aa))
+    abort ();
+  if (!vec_all_eq (do_vec_ctzm (source_a, mask_b), result_ab))
+    abort ();
+  if (!vec_all_eq (do_vec_ctzm (source_b, mask_a), result_ba))
+    abort ();
+  if (!vec_all_eq (do_vec_ctzm (source_b, mask_b), result_bb))
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler {\mvctzdm\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-ctzm-1.c b/gcc/testsuite/gcc.target/powerpc/vec-ctzm-1.c
new file mode 100644
index 00000000000..3dc4a03ab09
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-ctzm-1.c
@@ -0,0 +1,53 @@ 
+/* { dg-do run } */
+/* { dg-require-effective-target powerpc_future_hw } */
+/* { dg-options "-mdejagnu-cpu=future" } */
+
+#include <altivec.h>
+
+extern void abort (void);
+
+vector unsigned long long int
+do_vec_ctzm (vector unsigned long long int source,
+	     vector unsigned long long int mask)
+{
+  return vec_ctzm (source, mask);
+}
+
+int main (int argc, char *argv [])
+{
+  vector unsigned long long int source_a = { 0xa5f07e3cull, 0x7e3ca5f0ull };
+  vector unsigned long long int source_b = { 0x3ca5f07eull, 0x5a0fe7c3ull };
+
+  vector unsigned long long int mask_a = { 0xffff0000ull, 0x0000ffffull };
+  vector unsigned long long int mask_b = { 0x0f0f0f0full, 0xf0f0f0f0ull };
+
+  /* See cnttzdm-0.c for derivation of expected results.
+
+     result_aa [0] is compute (source [0], mask [0]);
+     result_aa [1] is compute (source [1], mask [1]).
+
+     result_ab [0] is compute (source [0], mask [2]);
+     result_ab [1] is compute (source [1], mask [3]).
+
+     result_ba [0] is compute (source [2], mask [0]);
+     result_ba [1] is compute (source [3], mask [1]).
+
+     result_bb [0] is compute (source [2], mask [2]);
+     result_bb [1] is compute (source [3], mask [3]).  */
+
+  vector unsigned long long int result_aa = { 4, 4 };
+  vector unsigned long long int result_ab = { 2, 0 };
+  vector unsigned long long int result_ba = { 0, 0 };
+  vector unsigned long long int result_bb = { 1, 2 };
+
+  if (!vec_all_eq (do_vec_ctzm (source_a, mask_a), result_aa))
+    abort ();
+  if (!vec_all_eq (do_vec_ctzm (source_a, mask_b), result_ab))
+    abort ();
+  if (!vec_all_eq (do_vec_ctzm (source_b, mask_a), result_ba))
+    abort ();
+  if (!vec_all_eq (do_vec_ctzm (source_b, mask_b), result_bb))
+    abort ();
+
+  return 0;
+}