Message ID | 20240904132650.2720446-25-christophe.lyon@linaro.org |
---|---|
State | New |
Headers | show |
Series | arm: [MVE intrinsics] Re-implement more intrinsics | expand |
On 04/09/2024 14:26, Christophe Lyon wrote: > This patch adds the vidwdup shape description for vdwdup and viwdup. > > It is very similar to viddup, but accounts for the additional 'wrap' > scalar parameter. > > 2024-08-21 Christophe Lyon <christophe.lyon@linaro.org> > > gcc/ > * config/arm/arm-mve-builtins-shapes.cc (vidwdup): New. > * config/arm/arm-mve-builtins-shapes.h (vidwdup): New. OK. R. > --- > gcc/config/arm/arm-mve-builtins-shapes.cc | 88 +++++++++++++++++++++++ > gcc/config/arm/arm-mve-builtins-shapes.h | 1 + > 2 files changed, 89 insertions(+) > > diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc > index a1d2e243128..510f15ae73a 100644 > --- a/gcc/config/arm/arm-mve-builtins-shapes.cc > +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc > @@ -2291,6 +2291,94 @@ struct viddup_def : public overloaded_base<0> > }; > SHAPE (viddup) > > +/* <T0>_t vfoo[_n]_t0(uint32_t, uint32_t, const int) > + <T0>_t vfoo[_wb]_t0(uint32_t *, uint32_t, const int) > + > + Shape for vector increment or decrement with wrap and duplicate operations > + that take an integer or pointer to integer first argument, an integer second > + argument and an immediate, and produce a vector. > + > + Check that 'imm' is one of 1, 2, 4 or 8. > + > + Example: vdwdupq. 
> + uint8x16_t [__arm_]vdwdupq[_n]_u8(uint32_t a, uint32_t b, const int imm) > + uint8x16_t [__arm_]vdwdupq[_wb]_u8(uint32_t *a, uint32_t b, const int imm) > + uint8x16_t [__arm_]vdwdupq_m[_n_u8](uint8x16_t inactive, uint32_t a, uint32_t b, const int imm, mve_pred16_t p) > + uint8x16_t [__arm_]vdwdupq_m[_wb_u8](uint8x16_t inactive, uint32_t *a, uint32_t b, const int imm, mve_pred16_t p) > + uint8x16_t [__arm_]vdwdupq_x[_n]_u8(uint32_t a, uint32_t b, const int imm, mve_pred16_t p) > + uint8x16_t [__arm_]vdwdupq_x[_wb]_u8(uint32_t *a, uint32_t b, const int imm, mve_pred16_t p) */ > +struct vidwdup_def : public overloaded_base<0> > +{ > + bool > + explicit_type_suffix_p (unsigned int i, enum predication_index pred, > + enum mode_suffix_index, > + type_suffix_info) const override > + { > + return ((i == 0) && (pred != PRED_m)); > + } > + > + bool > + skip_overload_p (enum predication_index, enum mode_suffix_index mode) const override > + { > + /* For MODE_wb, share the overloaded instance with MODE_n. */ > + if (mode == MODE_wb) > + return true; > + > + return false; > + } > + > + void > + build (function_builder &b, const function_group_info &group, > + bool preserve_user_namespace) const override > + { > + b.add_overloaded_functions (group, MODE_none, preserve_user_namespace); > + build_all (b, "v0,su32,su32,su64", group, MODE_n, preserve_user_namespace); > + build_all (b, "v0,as,su32,su64", group, MODE_wb, preserve_user_namespace); > + } > + > + tree > + resolve (function_resolver &r) const override > + { > + unsigned int i, nargs; > + type_suffix_index type_suffix = NUM_TYPE_SUFFIXES; > + if (!r.check_gp_argument (3, i, nargs)) > + return error_mark_node; > + > + type_suffix = r.type_suffix_ids[0]; > + /* With PRED_m, there is no type suffix, so infer it from the first (inactive) > + argument. 
*/ > + if (type_suffix == NUM_TYPE_SUFFIXES) > + type_suffix = r.infer_vector_type (0); > + > + unsigned int last_arg = i - 2; > + /* Check that last_arg is either scalar or pointer. */ > + if (!r.scalar_argument_p (last_arg)) > + return error_mark_node; > + > + if (!r.scalar_argument_p (last_arg + 1)) > + return error_mark_node; > + > + if (!r.require_integer_immediate (last_arg + 2)) > + return error_mark_node; > + > + /* With MODE_n we expect a scalar, with MODE_wb we expect a pointer. */ > + mode_suffix_index mode_suffix; > + if (POINTER_TYPE_P (r.get_argument_type (last_arg))) > + mode_suffix = MODE_wb; > + else > + mode_suffix = MODE_n; > + > + return r.resolve_to (mode_suffix, type_suffix); > + } > + > + bool > + check (function_checker &c) const override > + { > + return c.require_immediate_one_of (2, 1, 2, 4, 8); > + } > +}; > +SHAPE (vidwdup) > + > /* <T0>_t vfoo[_t0](<T0>_t, <T0>_t, mve_pred16_t) > > i.e. a version of the standard ternary shape in which > diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h > index 186287c1620..b3d08ab3866 100644 > --- a/gcc/config/arm/arm-mve-builtins-shapes.h > +++ b/gcc/config/arm/arm-mve-builtins-shapes.h > @@ -83,6 +83,7 @@ namespace arm_mve > extern const function_shape *const vcvt_f32_f16; > extern const function_shape *const vcvtx; > extern const function_shape *const viddup; > + extern const function_shape *const vidwdup; > extern const function_shape *const vpsel; > > } /* end namespace arm_mve::shapes */
diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc index a1d2e243128..510f15ae73a 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.cc +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -2291,6 +2291,94 @@ struct viddup_def : public overloaded_base<0> }; SHAPE (viddup) +/* <T0>_t vfoo[_n]_t0(uint32_t, uint32_t, const int) + <T0>_t vfoo[_wb]_t0(uint32_t *, uint32_t, const int) + + Shape for vector increment or decrement with wrap and duplicate operations + that take an integer or pointer to integer first argument, an integer second + argument and an immediate, and produce a vector. + + Check that 'imm' is one of 1, 2, 4 or 8. + + Example: vdwdupq. + uint8x16_t [__arm_]vdwdupq[_n]_u8(uint32_t a, uint32_t b, const int imm) + uint8x16_t [__arm_]vdwdupq[_wb]_u8(uint32_t *a, uint32_t b, const int imm) + uint8x16_t [__arm_]vdwdupq_m[_n_u8](uint8x16_t inactive, uint32_t a, uint32_t b, const int imm, mve_pred16_t p) + uint8x16_t [__arm_]vdwdupq_m[_wb_u8](uint8x16_t inactive, uint32_t *a, uint32_t b, const int imm, mve_pred16_t p) + uint8x16_t [__arm_]vdwdupq_x[_n]_u8(uint32_t a, uint32_t b, const int imm, mve_pred16_t p) + uint8x16_t [__arm_]vdwdupq_x[_wb]_u8(uint32_t *a, uint32_t b, const int imm, mve_pred16_t p) */ +struct vidwdup_def : public overloaded_base<0> +{ + bool + explicit_type_suffix_p (unsigned int i, enum predication_index pred, + enum mode_suffix_index, + type_suffix_info) const override + { + return ((i == 0) && (pred != PRED_m)); + } + + bool + skip_overload_p (enum predication_index, enum mode_suffix_index mode) const override + { + /* For MODE_wb, share the overloaded instance with MODE_n. 
*/ + if (mode == MODE_wb) + return true; + + return false; + } + + void + build (function_builder &b, const function_group_info &group, + bool preserve_user_namespace) const override + { + b.add_overloaded_functions (group, MODE_none, preserve_user_namespace); + build_all (b, "v0,su32,su32,su64", group, MODE_n, preserve_user_namespace); + build_all (b, "v0,as,su32,su64", group, MODE_wb, preserve_user_namespace); + } + + tree + resolve (function_resolver &r) const override + { + unsigned int i, nargs; + type_suffix_index type_suffix = NUM_TYPE_SUFFIXES; + if (!r.check_gp_argument (3, i, nargs)) + return error_mark_node; + + type_suffix = r.type_suffix_ids[0]; + /* With PRED_m, there is no type suffix, so infer it from the first (inactive) + argument. */ + if (type_suffix == NUM_TYPE_SUFFIXES) + type_suffix = r.infer_vector_type (0); + + unsigned int last_arg = i - 2; + /* Check that last_arg is either scalar or pointer. */ + if (!r.scalar_argument_p (last_arg)) + return error_mark_node; + + if (!r.scalar_argument_p (last_arg + 1)) + return error_mark_node; + + if (!r.require_integer_immediate (last_arg + 2)) + return error_mark_node; + + /* With MODE_n we expect a scalar, with MODE_wb we expect a pointer. */ + mode_suffix_index mode_suffix; + if (POINTER_TYPE_P (r.get_argument_type (last_arg))) + mode_suffix = MODE_wb; + else + mode_suffix = MODE_n; + + return r.resolve_to (mode_suffix, type_suffix); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_one_of (2, 1, 2, 4, 8); + } +}; +SHAPE (vidwdup) + /* <T0>_t vfoo[_t0](<T0>_t, <T0>_t, mve_pred16_t) i.e. 
a version of the standard ternary shape in which diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h index 186287c1620..b3d08ab3866 100644 --- a/gcc/config/arm/arm-mve-builtins-shapes.h +++ b/gcc/config/arm/arm-mve-builtins-shapes.h @@ -83,6 +83,7 @@ namespace arm_mve extern const function_shape *const vcvt_f32_f16; extern const function_shape *const vcvtx; extern const function_shape *const viddup; + extern const function_shape *const vidwdup; extern const function_shape *const vpsel; } /* end namespace arm_mve::shapes */