gimple-isel: Recognize vec_extract pattern.

Message ID 5104f927-d7db-c148-911c-0ca8e783a609@gmail.com
State New

Commit Message

Robin Dapp July 3, 2023, 10:19 a.m. UTC
Hi,

In gimple-isel we already deduce a vec_set pattern from an
ARRAY_REF(VIEW_CONVERT_EXPR).  This patch does the same for a
vec_extract.
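
For illustration, C source of roughly the following shape is the kind of
input that can reach gimple-isel as an ARRAY_REF of a VIEW_CONVERT_EXPR
when the index is not a compile-time constant.  This is a minimal sketch
using the GNU vector extension; the names v4si, set_elt and get_elt are
invented for this example, and whether a given target really ends up with
.VEC_SET or .VEC_EXTRACT depends on the backend and the options used.

```c
/* Sketch only: v4si, set_elt and get_elt are names made up for this
   example, not taken from the patch or from the GCC testsuite.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Store with a variable index: the vec_set shape.  */
v4si
set_elt (v4si v, int val, int idx)
{
  v[idx] = val;
  return v;
}

/* Load with a variable index: the vec_extract shape.  */
int
get_elt (v4si v, int idx)
{
  return v[idx];
}
```

Dumping the GIMPLE before and after the isel pass (for example with
-fdump-tree-all) is one way to check what a particular target does with
these two functions.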

The code is largely similar to the vec_set one including
the addition of a can_vec_extract_var_idx_p function
in optabs.cc to check if the backend can handle a register
operand as index.  We already have can_vec_extract in
optabs-query but that one checks whether we can extract
specific modes.

With the introduction of an internal function for vec_extract
the expander must not FAIL.  For vec_set this has already been
the case so adjust the documentation accordingly.

Additionally, clarify the wording of the vector-vector case for
vec_extract.
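
As I read the revised wording, a sub-vector extract with index IDX starts
at element IDX times the number of elements in the extracted mode.  The
following is a plain C model of that rule on int arrays, not GCC code;
extract_subvector and its parameters are hypothetical names chosen for
this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Model of the indexing rule only: copy the IDX-th group of SUB_NELTS
   elements from SRC into DST.  The first element copied is
   SRC[IDX * SUB_NELTS].  */
void
extract_subvector (int *dst, const int *src, size_t sub_nelts, size_t idx)
{
  memcpy (dst, src + idx * sub_nelts, sub_nelts * sizeof (int));
}
```

With an eight-element source and a two-element result, index 1 selects
elements 2 and 3 rather than elements 1 and 2.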

During testing I noticed that the aarch64 simd vec_extract
expander is the only one that FAILs.  Richard is currently
testing a patch that tries to remove this.  Bootstrap and
testsuite were unchanged on x86 and power.

I was a bit torn whether to add a separate function to recognize
vec_extract or not and ended up doing it inline with several
is_extract checks.

Regards
 Robin

gcc/ChangeLog:

	* doc/md.texi: Document that vec_set and vec_extract must not
	fail.
	* gimple-isel.cc (gimple_expand_vec_set_expr): Rename this...
	(gimple_expand_vec_set_extract_expr): ...to this.
	(gimple_expand_vec_exprs): Call renamed function.
	* internal-fn.cc (vec_extract_direct): Add.
	(expand_vec_extract_optab_fn): New function to expand
	vec_extract optab.
	(direct_vec_extract_optab_supported_p): Add.
	* internal-fn.def (VEC_EXTRACT): Add.
	* optabs.cc (can_vec_extract_var_idx_p): New function.
	* optabs.h (can_vec_extract_var_idx_p): Declare.
---
 gcc/doc/md.texi     |  7 +++-
 gcc/gimple-isel.cc  | 85 +++++++++++++++++++++++++++++++++++++--------
 gcc/internal-fn.cc  | 39 +++++++++++++++++++++
 gcc/internal-fn.def |  1 +
 gcc/optabs.cc       | 24 +++++++++++++
 gcc/optabs.h        |  1 +
 6 files changed, 141 insertions(+), 16 deletions(-)

Comments

Richard Biener July 3, 2023, 10:55 a.m. UTC | #1
On Mon, 3 Jul 2023, Robin Dapp wrote:

> Hi,
> 
> In gimple-isel we already deduce a vec_set pattern from an
> ARRAY_REF(VIEW_CONVERT_EXPR).  This patch does the same for a
> vec_extract.
> 
> The code is largely similar to the vec_set one including
> the addition of a can_vec_extract_var_idx_p function
> in optabs.cc to check if the backend can handle a register
> operand as index.  We already have can_vec_extract in
> optabs-query but that one checks whether we can extract
> specific modes.
> 
> With the introduction of an internal function for vec_extract
> the expander must not FAIL.  For vec_set this has already been
> the case so adjust the documentation accordingly.
> 
> Additionally, clarify the wording of the vector-vector case for
> vec_extract.
> 
> During testing I noticed that the aarch64 simd vec_extract
> expander is the only one that FAILs.  Richard is currently
> testing a patch that tries to remove this.  Bootstrap and
> testsuite were unchanged on x86 and power.
> 
> I was a bit torn whether to add a separate function to recognize
> vec_extract or not and ended up doing it inline with several
> is_extract checks.
> 
> Regards
>  Robin
> 
> gcc/ChangeLog:
> 
> 	* doc/md.texi: Document that vec_set and vec_extract must not
> 	fail.
> 	* gimple-isel.cc (gimple_expand_vec_set_expr): Rename this...
> 	(gimple_expand_vec_set_extract_expr): ...to this.
> 	(gimple_expand_vec_exprs): Call renamed function.
> 	* internal-fn.cc (vec_extract_direct): Add.
> 	(expand_vec_extract_optab_fn): New function to expand
> 	vec_extract optab.
> 	(direct_vec_extract_optab_supported_p): Add.
> 	* internal-fn.def (VEC_EXTRACT): Add.
> 	* optabs.cc (can_vec_extract_var_idx_p): New function.
> 	* optabs.h (can_vec_extract_var_idx_p): Declare.
> ---
>  gcc/doc/md.texi     |  7 +++-
>  gcc/gimple-isel.cc  | 85 +++++++++++++++++++++++++++++++++++++--------
>  gcc/internal-fn.cc  | 39 +++++++++++++++++++++
>  gcc/internal-fn.def |  1 +
>  gcc/optabs.cc       | 24 +++++++++++++
>  gcc/optabs.h        |  1 +
>  6 files changed, 141 insertions(+), 16 deletions(-)
> 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 9648fdc846a..c61602fb04d 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5074,6 +5074,8 @@ of the result should be stored to memory.
>  Set given field in the vector value.  Operand 0 is the vector to modify,
>  operand 1 is new value of field and operand 2 specify the field index.
>  
> +This pattern is not allowed to @code{FAIL}.
> +
>  @cindex @code{vec_extract@var{m}@var{n}} instruction pattern
>  @item @samp{vec_extract@var{m}@var{n}}
>  Extract given field from the vector value.  Operand 1 is the vector, operand 2
> @@ -5081,7 +5083,10 @@ specify field index and operand 0 place to store value into.  The
>  @var{n} mode is the mode of the field or vector of fields that should be
>  extracted, should be either element mode of the vector mode @var{m}, or
>  a vector mode with the same element mode and smaller number of elements.
> -If @var{n} is a vector mode, the index is counted in units of that mode.
> +If @var{n} is a vector mode the index is counted in multiples of
> +mode @var{n}.
> +
> +This pattern is not allowed to @code{FAIL}.
>  
>  @cindex @code{vec_init@var{m}@var{n}} instruction pattern
>  @item @samp{vec_init@var{m}@var{n}}
> diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
> index ef688ddb57f..d6c560b63dd 100644
> --- a/gcc/gimple-isel.cc
> +++ b/gcc/gimple-isel.cc
> @@ -42,15 +42,26 @@ along with GCC; see the file COPYING3.  If not see
>  
>  /* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to
>     internal function based on vector type of selected expansion.
> -   i.e.:
> +
> +   For vec_set:
> +
>       VIEW_CONVERT_EXPR<int[4]>(u)[_1] = i_4(D);
>     =>
>       _7 = u;
>       _8 = .VEC_SET (_7, i_4(D), _1);
> -     u = _8;  */
> +     u = _8;
> +
> +   For vec_extract:
> +
> +      _3 = VIEW_CONVERT_EXPR<intD.1[4]>(vD.2208)[idx_2(D)];
> +   =>
> +      _4 = vD.2208;
> +      _5 = .VEC_EXTRACT (_4, idx_2(D));
> +      _3 = _5;  */
>  
>  static bool
> -gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
> +gimple_expand_vec_set_extract_expr (struct function *fun,
> +				    gimple_stmt_iterator *gsi)
>  {
>    enum tree_code code;
>    gcall *new_stmt = NULL;
> @@ -62,45 +73,88 @@ gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
>    if (!stmt)
>      return false;
>  
> +  bool is_extract = false;
> +
>    tree lhs = gimple_assign_lhs (stmt);
> +  tree rhs = gimple_assign_rhs1 (stmt);
> +  tree val, op0;
>    code = TREE_CODE (lhs);
> -  if (code != ARRAY_REF)
> -    return false;
> +  if (code == ARRAY_REF)
> +    {
> +      /* Assume it is a vec_set.  */
> +      val = rhs;
> +      op0 = TREE_OPERAND (lhs, 0);
> +    }
> +  else
> +    {
> +      /* It can still be a vec_extract.  */
> +      code = TREE_CODE (rhs);
> +      if (code != ARRAY_REF)
> +	return false;
> +

can you make this

      if (TREE_CODE (lhs) == ARRAY_REF)
        ..
      else if (TREE_CODE (rhs) == ARRAY_REF)
        ..
      else
        return false;

that's way easier to read.
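
Spelled out as a standalone toy, the chain above would look roughly like
this.  It is only a sketch of the control flow: toy_code, toy_kind and
classify are stand-ins invented here, not GCC's tree codes or any
function in gimple-isel.cc.

```c
/* Toy stand-ins for tree codes; GCC's real enum tree_code is far larger.  */
enum toy_code { TOY_SSA_NAME, TOY_ARRAY_REF, TOY_OTHER };
enum toy_kind { KIND_NONE, KIND_VEC_SET, KIND_VEC_EXTRACT };

/* Classify an assignment by which side is an ARRAY_REF.  The lhs is
   tested first, so a statement with an ARRAY_REF on both sides is
   treated as a set.  */
enum toy_kind
classify (enum toy_code lhs, enum toy_code rhs)
{
  if (lhs == TOY_ARRAY_REF)
    return KIND_VEC_SET;
  else if (rhs == TOY_ARRAY_REF)
    return KIND_VEC_EXTRACT;
  else
    return KIND_NONE;
}
```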

> +      /* Sides are swapped for vec_extract.  */
> +      is_extract = true;
> +      val = lhs;
> +      op0 = TREE_OPERAND (rhs, 0);
> +    }
>  
> -  tree val = gimple_assign_rhs1 (stmt);
> -  tree op0 = TREE_OPERAND (lhs, 0);
>    if (TREE_CODE (op0) == VIEW_CONVERT_EXPR && DECL_P (TREE_OPERAND (op0, 0))
>        && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
>        && TYPE_MODE (TREE_TYPE (lhs))

this 'lhs' would need to be 'rhs' for the extract?  So in addition
to 'val' add 'ref' and initialize that?

>  	   == TYPE_MODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0)))))
>      {
> -      tree pos = TREE_OPERAND (lhs, 1);
> +      tree pos;
> +      if (!is_extract)
> +	pos = TREE_OPERAND (lhs, 1);
> +      else
> +	pos = TREE_OPERAND (rhs, 1);
> +

which then makes this TREE_OPERAND (ref, 1);

>        tree view_op0 = TREE_OPERAND (op0, 0);
>        machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
> +      machine_mode extract_mode;
> +      if (is_extract)
> +	extract_mode = TYPE_MODE (TREE_TYPE (lhs));
> +

and this would be 'val', though using 'ref' looks better here.

>        if (auto_var_in_fn_p (view_op0, fun->decl)
> -	  && !TREE_ADDRESSABLE (view_op0) && can_vec_set_var_idx_p (outermode))
> +	  && !TREE_ADDRESSABLE (view_op0)
> +	  && ((!is_extract && can_vec_set_var_idx_p (outermode))
> +	      || (is_extract &&
> +		  can_vec_extract_var_idx_p (outermode, extract_mode))))
>  	{
>  	  location_t loc = gimple_location (stmt);
>  	  tree var_src = make_ssa_name (TREE_TYPE (view_op0));
> -	  tree var_dst = make_ssa_name (TREE_TYPE (view_op0));
> +	  tree var_dst;
> +	  if (!is_extract)
> +	    var_dst = make_ssa_name (TREE_TYPE (view_op0));
> +	  else
> +	    var_dst = make_ssa_name (TREE_TYPE (lhs));
>  
>  	  ass_stmt = gimple_build_assign (var_src, view_op0);
>  	  gimple_set_vuse (ass_stmt, gimple_vuse (stmt));
>  	  gimple_set_location (ass_stmt, loc);
>  	  gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
>  
> -	  new_stmt
> -	    = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos);
> +	  if (!is_extract)
> +	    new_stmt
> +	      = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos);
> +	  else
> +	    new_stmt
> +	      = gimple_build_call_internal (IFN_VEC_EXTRACT, 2, var_src, pos);
> +
>  	  gimple_call_set_lhs (new_stmt, var_dst);
>  	  gimple_set_location (new_stmt, loc);
>  	  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
>  
> -	  ass_stmt = gimple_build_assign (view_op0, var_dst);
> +	  if (!is_extract)
> +	    ass_stmt = gimple_build_assign (view_op0, var_dst);
> +	  else
> +	    ass_stmt = gimple_build_assign (lhs, var_dst);

as you preserve the LHS for the original testcase you should
use gsi_replace () for it.  I think that making a bigger if (is_extract)
group covering all stmt generation would make this easier to read.

Thanks,
Richard.

>  	  gimple_set_location (ass_stmt, loc);
> +	  if (!is_extract)
> +	    gimple_move_vops (ass_stmt, stmt);
>  	  gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
>  
>  	  basic_block bb = gimple_bb (stmt);
> -	  gimple_move_vops (ass_stmt, stmt);
>  	  if (gsi_remove (gsi, true)
>  	      && gimple_purge_dead_eh_edges (bb))
>  	    cfg_changed = true;
> @@ -317,7 +371,8 @@ gimple_expand_vec_exprs (struct function *fun)
>  	      gsi_replace (&gsi, g, false);
>  	    }
>  
> -	  cfg_changed |= gimple_expand_vec_set_expr (fun, &gsi);
> +	  cfg_changed |= gimple_expand_vec_set_extract_expr (fun, &gsi);
> +
>  	  if (gsi_end_p (gsi))
>  	    break;
>  	}
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 9017176dc7a..cbdb7e64d7c 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -175,6 +175,7 @@ init_internal_fns ()
>  #define len_store_direct { 3, 3, false }
>  #define len_maskstore_direct { 4, 3, false }
>  #define vec_set_direct { 3, 3, false }
> +#define vec_extract_direct { 3, 3, false }
>  #define unary_direct { 0, 0, true }
>  #define unary_convert_direct { -1, 0, true }
>  #define binary_direct { 0, 0, true }
> @@ -3126,6 +3127,43 @@ expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
>    gcc_unreachable ();
>  }
>  
> +/* Expand VEC_EXTRACT optab internal function.  */
> +
> +static void
> +expand_vec_extract_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
> +{
> +  tree lhs = gimple_call_lhs (stmt);
> +  tree op0 = gimple_call_arg (stmt, 0);
> +  tree op1 = gimple_call_arg (stmt, 1);
> +
> +  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
> +
> +  machine_mode outermode = TYPE_MODE (TREE_TYPE (op0));
> +  machine_mode extract_mode = TYPE_MODE (TREE_TYPE (lhs));
> +
> +  rtx src = expand_normal (op0);
> +  rtx pos = expand_normal (op1);
> +
> +  class expand_operand ops[3];
> +  enum insn_code icode = convert_optab_handler (optab, outermode,
> +						extract_mode);
> +
> +  if (icode != CODE_FOR_nothing)
> +    {
> +      create_output_operand (&ops[0], target, extract_mode);
> +      create_input_operand (&ops[1], src, outermode);
> +      create_convert_operand_from (&ops[2], pos,
> +				   TYPE_MODE (TREE_TYPE (op1)), true);
> +      if (maybe_expand_insn (icode, 3, ops))
> +	{
> +	  if (!rtx_equal_p (target, ops[0].value))
> +	    emit_move_insn (target, ops[0].value);
> +	  return;
> +	}
> +    }
> +  gcc_unreachable ();
> +}
> +
>  static void
>  expand_ABNORMAL_DISPATCHER (internal_fn, gcall *)
>  {
> @@ -3976,6 +4014,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
>  #define direct_mask_fold_left_optab_supported_p direct_optab_supported_p
>  #define direct_check_ptrs_optab_supported_p direct_optab_supported_p
>  #define direct_vec_set_optab_supported_p direct_optab_supported_p
> +#define direct_vec_extract_optab_supported_p direct_optab_supported_p
>  
>  /* Return the optab used by internal function FN.  */
>  
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index bc947c0fde7..032624b1547 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -176,6 +176,7 @@ DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_cond)
>  DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask)
>  
>  DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set)
> +DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, 0, vec_extract, vec_extract)
>  
>  DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store)
>  DEF_INTERNAL_OPTAB_FN (LEN_MASK_STORE, 0, len_maskstore, len_maskstore)
> diff --git a/gcc/optabs.cc b/gcc/optabs.cc
> index 89ab55b72ae..4440bc505b1 100644
> --- a/gcc/optabs.cc
> +++ b/gcc/optabs.cc
> @@ -4417,6 +4417,30 @@ can_vec_set_var_idx_p (machine_mode vec_mode)
>  	 && insn_operand_matches (icode, 2, reg3);
>  }
>  
> +/* Return whether the backend can emit a vec_extract instruction with
> +   a non-constant index.  */
> +bool
> +can_vec_extract_var_idx_p (machine_mode vec_mode, machine_mode extr_mode)
> +{
> +  if (!VECTOR_MODE_P (vec_mode))
> +    return false;
> +
> +  rtx reg1 = alloca_raw_REG (extr_mode, LAST_VIRTUAL_REGISTER + 1);
> +  rtx reg2 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 2);
> +
> +  enum insn_code icode = convert_optab_handler (vec_extract_optab,
> +						vec_mode, extr_mode);
> +
> +  const struct insn_data_d *data = &insn_data[icode];
> +  machine_mode idx_mode = data->operand[2].mode;
> +
> +  rtx reg3 = alloca_raw_REG (idx_mode, LAST_VIRTUAL_REGISTER + 3);
> +
> +  return icode != CODE_FOR_nothing && insn_operand_matches (icode, 0, reg1)
> +	 && insn_operand_matches (icode, 1, reg2)
> +	 && insn_operand_matches (icode, 2, reg3);
> +}
> +
>  /* This function is called when we are going to emit a compare instruction that
>     compares the values found in X and Y, using the rtl operator COMPARISON.
>  
> diff --git a/gcc/optabs.h b/gcc/optabs.h
> index 29ccbe9235e..f14d012d92f 100644
> --- a/gcc/optabs.h
> +++ b/gcc/optabs.h
> @@ -259,6 +259,7 @@ extern bool can_vcond_compare_p (enum rtx_code, machine_mode, machine_mode);
>  /* Return whether the backend can emit vector set instructions for inserting
>     element into vector at variable index position.  */
>  extern bool can_vec_set_var_idx_p (machine_mode);
> +extern bool can_vec_extract_var_idx_p (machine_mode, machine_mode);
>  
>  extern rtx prepare_operand (enum insn_code, rtx, int, machine_mode,
>  			    machine_mode, int);
>
Robin Dapp July 4, 2023, 6:46 p.m. UTC | #2
Hi Richard,

I changed the patch according to your comments and I agree that
it is more readable that way.  I hope using lhs as target for
the extract directly is possible the way I did it.  Richard's
patch for aarch64 is already in, therefore testsuites on aarch64 and
i386 are unchanged.

Regards
 Robin

Subject: [PATCH v2] gimple-isel: Recognize vec_extract pattern.

In gimple-isel we already deduce a vec_set pattern from an
ARRAY_REF(VIEW_CONVERT_EXPR).  This patch does the same for a
vec_extract.

The code is largely similar to the vec_set one
including the addition of a can_vec_extract_var_idx_p function
in optabs.cc to check if the backend can handle a register
operand as index.  We already have can_vec_extract in
optabs-query but that one checks whether we can extract
specific modes.

With the introduction of an internal function for vec_extract
the expander must not FAIL.  For vec_set this has already been
the case so adjust the documentation accordingly.

Additionally, clarify the wording of the vector-vector case for
vec_extract.

gcc/ChangeLog:

	* doc/md.texi: Document that vec_set and vec_extract must not
	fail.
	* gimple-isel.cc (gimple_expand_vec_set_expr): Rename this...
	(gimple_expand_vec_set_extract_expr): ...to this.
	(gimple_expand_vec_exprs): Call renamed function.
	* internal-fn.cc (vec_extract_direct): Add.
	(expand_vec_extract_optab_fn): New function to expand
	vec_extract optab.
	(direct_vec_extract_optab_supported_p): Add.
	* internal-fn.def (VEC_EXTRACT): Add.
	* optabs.cc (can_vec_extract_var_idx_p): New function.
	* optabs.h (can_vec_extract_var_idx_p): Declare.
---
 gcc/doc/md.texi     |   7 +++-
 gcc/gimple-isel.cc  | 100 ++++++++++++++++++++++++++++++++------------
 gcc/internal-fn.cc  |  39 +++++++++++++++++
 gcc/internal-fn.def |   1 +
 gcc/optabs.cc       |  24 +++++++++++
 gcc/optabs.h        |   1 +
 6 files changed, 144 insertions(+), 28 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index f14dd32b2dc..b30a824488b 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5091,6 +5091,8 @@ Mask elements @var{i} with @var{i} > (operand 5 + operand 6) are ignored.
 Set given field in the vector value.  Operand 0 is the vector to modify,
 operand 1 is new value of field and operand 2 specify the field index.
 
+This pattern is not allowed to @code{FAIL}.
+
 @cindex @code{vec_extract@var{m}@var{n}} instruction pattern
 @item @samp{vec_extract@var{m}@var{n}}
 Extract given field from the vector value.  Operand 1 is the vector, operand 2
@@ -5098,7 +5100,10 @@ specify field index and operand 0 place to store value into.  The
 @var{n} mode is the mode of the field or vector of fields that should be
 extracted, should be either element mode of the vector mode @var{m}, or
 a vector mode with the same element mode and smaller number of elements.
-If @var{n} is a vector mode, the index is counted in units of that mode.
+If @var{n} is a vector mode the index is counted in multiples of
+mode @var{n}.
+
+This pattern is not allowed to @code{FAIL}.
 
 @cindex @code{vec_init@var{m}@var{n}} instruction pattern
 @item @samp{vec_init@var{m}@var{n}}
diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index ef688ddb57f..a18b26dec7b 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -42,17 +42,27 @@ along with GCC; see the file COPYING3.  If not see
 
 /* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to
    internal function based on vector type of selected expansion.
-   i.e.:
+
+   For vec_set:
+
      VIEW_CONVERT_EXPR<int[4]>(u)[_1] = i_4(D);
    =>
      _7 = u;
      _8 = .VEC_SET (_7, i_4(D), _1);
-     u = _8;  */
+     u = _8;
+
+   For vec_extract:
+
+      _3 = VIEW_CONVERT_EXPR<intD.1[4]>(vD.2208)[idx_2(D)];
+   =>
+      _4 = vD.2208;
+      _5 = .VEC_EXTRACT (_4, idx_2(D));
+      _3 = _5;  */
 
 static bool
-gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
+gimple_expand_vec_set_extract_expr (struct function *fun,
+				    gimple_stmt_iterator *gsi)
 {
-  enum tree_code code;
   gcall *new_stmt = NULL;
   gassign *ass_stmt = NULL;
   bool cfg_changed = false;
@@ -62,49 +72,84 @@ gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
   if (!stmt)
     return false;
 
+  bool is_extract = false;
+
   tree lhs = gimple_assign_lhs (stmt);
-  code = TREE_CODE (lhs);
-  if (code != ARRAY_REF)
+  tree rhs = gimple_assign_rhs1 (stmt);
+  tree val, ref;
+  if (TREE_CODE (lhs) == ARRAY_REF)
+    {
+      /* Assume it is a vec_set.  */
+      val = rhs;
+      ref = lhs;
+    }
+  else if (TREE_CODE (rhs) == ARRAY_REF)
+    {
+      /* vec_extract.  */
+      is_extract = true;
+      val = lhs;
+      ref = rhs;
+    }
+  else
     return false;
 
-  tree val = gimple_assign_rhs1 (stmt);
-  tree op0 = TREE_OPERAND (lhs, 0);
+  tree op0 = TREE_OPERAND (ref, 0);
   if (TREE_CODE (op0) == VIEW_CONVERT_EXPR && DECL_P (TREE_OPERAND (op0, 0))
       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
-      && TYPE_MODE (TREE_TYPE (lhs))
+      && TYPE_MODE (TREE_TYPE (ref))
 	   == TYPE_MODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0)))))
     {
-      tree pos = TREE_OPERAND (lhs, 1);
+      tree pos = TREE_OPERAND (ref, 1);
+
       tree view_op0 = TREE_OPERAND (op0, 0);
       machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
+      machine_mode extract_mode = TYPE_MODE (TREE_TYPE (ref));
+
       if (auto_var_in_fn_p (view_op0, fun->decl)
-	  && !TREE_ADDRESSABLE (view_op0) && can_vec_set_var_idx_p (outermode))
+	  && !TREE_ADDRESSABLE (view_op0)
+	  && ((!is_extract && can_vec_set_var_idx_p (outermode))
+	      || (is_extract
+		  && can_vec_extract_var_idx_p (outermode, extract_mode))))
 	{
 	  location_t loc = gimple_location (stmt);
 	  tree var_src = make_ssa_name (TREE_TYPE (view_op0));
-	  tree var_dst = make_ssa_name (TREE_TYPE (view_op0));
 
 	  ass_stmt = gimple_build_assign (var_src, view_op0);
 	  gimple_set_vuse (ass_stmt, gimple_vuse (stmt));
 	  gimple_set_location (ass_stmt, loc);
 	  gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
 
-	  new_stmt
-	    = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos);
-	  gimple_call_set_lhs (new_stmt, var_dst);
-	  gimple_set_location (new_stmt, loc);
-	  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+	  if (!is_extract)
+	    {
+	      tree var_dst = make_ssa_name (TREE_TYPE (view_op0));
 
-	  ass_stmt = gimple_build_assign (view_op0, var_dst);
-	  gimple_set_location (ass_stmt, loc);
-	  gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
+	      new_stmt = gimple_build_call_internal (IFN_VEC_SET, 3, var_src,
+						     val, pos);
+
+	      gimple_call_set_lhs (new_stmt, var_dst);
+	      gimple_set_location (new_stmt, loc);
+	      gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
+
+	      ass_stmt = gimple_build_assign (view_op0, var_dst);
+	      gimple_set_location (ass_stmt, loc);
+	      gimple_move_vops (ass_stmt, stmt);
+	      gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
 
-	  basic_block bb = gimple_bb (stmt);
-	  gimple_move_vops (ass_stmt, stmt);
-	  if (gsi_remove (gsi, true)
-	      && gimple_purge_dead_eh_edges (bb))
-	    cfg_changed = true;
-	  *gsi = gsi_for_stmt (ass_stmt);
+	      basic_block bb = gimple_bb (stmt);
+	      if (gsi_remove (gsi, true)
+		  && gimple_purge_dead_eh_edges (bb))
+		cfg_changed = true;
+	      *gsi = gsi_for_stmt (ass_stmt);
+	    }
+	  else
+	    {
+	      new_stmt
+		= gimple_build_call_internal (IFN_VEC_EXTRACT, 2, var_src, pos);
+	      gimple_call_set_lhs (new_stmt, lhs);
+
+	      gsi_replace (gsi, new_stmt, true);
+	      cfg_changed = true;
+	    }
 	}
     }
 
@@ -317,7 +362,8 @@ gimple_expand_vec_exprs (struct function *fun)
 	      gsi_replace (&gsi, g, false);
 	    }
 
-	  cfg_changed |= gimple_expand_vec_set_expr (fun, &gsi);
+	  cfg_changed |= gimple_expand_vec_set_extract_expr (fun, &gsi);
+
 	  if (gsi_end_p (gsi))
 	    break;
 	}
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 303df102d81..23f9309572b 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -175,6 +175,7 @@ init_internal_fns ()
 #define len_store_direct { 3, 3, false }
 #define len_maskstore_direct { 4, 5, false }
 #define vec_set_direct { 3, 3, false }
+#define vec_extract_direct { 3, 3, false }
 #define unary_direct { 0, 0, true }
 #define unary_convert_direct { -1, 0, true }
 #define binary_direct { 0, 0, true }
@@ -3107,6 +3108,43 @@ expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
   gcc_unreachable ();
 }
 
+/* Expand VEC_EXTRACT optab internal function.  */
+
+static void
+expand_vec_extract_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
+{
+  tree lhs = gimple_call_lhs (stmt);
+  tree op0 = gimple_call_arg (stmt, 0);
+  tree op1 = gimple_call_arg (stmt, 1);
+
+  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+
+  machine_mode outermode = TYPE_MODE (TREE_TYPE (op0));
+  machine_mode extract_mode = TYPE_MODE (TREE_TYPE (lhs));
+
+  rtx src = expand_normal (op0);
+  rtx pos = expand_normal (op1);
+
+  class expand_operand ops[3];
+  enum insn_code icode = convert_optab_handler (optab, outermode,
+						extract_mode);
+
+  if (icode != CODE_FOR_nothing)
+    {
+      create_output_operand (&ops[0], target, extract_mode);
+      create_input_operand (&ops[1], src, outermode);
+      create_convert_operand_from (&ops[2], pos,
+				   TYPE_MODE (TREE_TYPE (op1)), true);
+      if (maybe_expand_insn (icode, 3, ops))
+	{
+	  if (!rtx_equal_p (target, ops[0].value))
+	    emit_move_insn (target, ops[0].value);
+	  return;
+	}
+    }
+  gcc_unreachable ();
+}
+
 static void
 expand_ABNORMAL_DISPATCHER (internal_fn, gcall *)
 {
@@ -3946,6 +3984,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 #define direct_mask_fold_left_optab_supported_p direct_optab_supported_p
 #define direct_check_ptrs_optab_supported_p direct_optab_supported_p
 #define direct_vec_set_optab_supported_p direct_optab_supported_p
+#define direct_vec_extract_optab_supported_p direct_optab_supported_p
 
 /* Return the optab used by internal function FN.  */
 
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 9b73e540d55..238b7ee0bc9 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -180,6 +180,7 @@ DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_cond)
 DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask)
 
 DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set)
+DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, 0, vec_extract, vec_extract)
 
 DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store)
 DEF_INTERNAL_OPTAB_FN (LEN_MASK_STORE, 0, len_maskstore, len_maskstore)
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index 0a2bcf3620c..4e9f58f8060 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -4417,6 +4417,30 @@ can_vec_set_var_idx_p (machine_mode vec_mode)
 	 && insn_operand_matches (icode, 2, reg3);
 }
 
+/* Return whether the backend can emit a vec_extract instruction with
+   a non-constant index.  */
+bool
+can_vec_extract_var_idx_p (machine_mode vec_mode, machine_mode extr_mode)
+{
+  if (!VECTOR_MODE_P (vec_mode))
+    return false;
+
+  rtx reg1 = alloca_raw_REG (extr_mode, LAST_VIRTUAL_REGISTER + 1);
+  rtx reg2 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 2);
+
+  enum insn_code icode = convert_optab_handler (vec_extract_optab,
+						vec_mode, extr_mode);
+
+  const struct insn_data_d *data = &insn_data[icode];
+  machine_mode idx_mode = data->operand[2].mode;
+
+  rtx reg3 = alloca_raw_REG (idx_mode, LAST_VIRTUAL_REGISTER + 3);
+
+  return icode != CODE_FOR_nothing && insn_operand_matches (icode, 0, reg1)
+	 && insn_operand_matches (icode, 1, reg2)
+	 && insn_operand_matches (icode, 2, reg3);
+}
+
 /* This function is called when we are going to emit a compare instruction that
    compares the values found in X and Y, using the rtl operator COMPARISON.
 
diff --git a/gcc/optabs.h b/gcc/optabs.h
index 781d5548be9..c80b7f4dc1b 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -259,6 +259,7 @@ extern bool can_vcond_compare_p (enum rtx_code, machine_mode, machine_mode);
 /* Return whether the backend can emit vector set instructions for inserting
    element into vector at variable index position.  */
 extern bool can_vec_set_var_idx_p (machine_mode);
+extern bool can_vec_extract_var_idx_p (machine_mode, machine_mode);
 
 extern rtx prepare_operand (enum insn_code, rtx, int, machine_mode,
 			    machine_mode, int);
Richard Biener July 5, 2023, 8:10 a.m. UTC | #3
On Tue, 4 Jul 2023, Robin Dapp wrote:

> Hi Richard,
> 
> I changed the patch according to your comments and I agree that
> it is more readable that way.  I hope using lhs as target for
> the extract directly is possible the way I did it.  Richard's
> patch for aarch64 is already in, therefore testsuites on aarch64 and
> i386 are unchanged.
> 
> Regards
>  Robin
> 
> Subject: [PATCH v2] gimple-isel: Recognize vec_extract pattern.
> 
> In gimple-isel we already deduce a vec_set pattern from an
> ARRAY_REF(VIEW_CONVERT_EXPR).  This patch does the same for a
> vec_extract.
> 
> The code is largely similar to the vec_set one
> including the addition of a can_vec_extract_var_idx_p function
> in optabs.cc to check if the backend can handle a register
> operand as index.  We already have can_vec_extract in
> optabs-query but that one checks whether we can extract
> specific modes.
> 
> With the introduction of an internal function for vec_extract
> the expander must not FAIL.  For vec_set this has already been
> the case so adjust the documentation accordingly.
> 
> Additionally, clarify the wording of the vector-vector case for
> vec_extract.
> 
> gcc/ChangeLog:
> 
> 	* doc/md.texi: Document that vec_set and vec_extract must not
> 	fail.
> 	* gimple-isel.cc (gimple_expand_vec_set_expr): Rename this...
> 	(gimple_expand_vec_set_extract_expr): ...to this.
> 	(gimple_expand_vec_exprs): Call renamed function.
> 	* internal-fn.cc (vec_extract_direct): Add.
> 	(expand_vec_extract_optab_fn): New function to expand
> 	vec_extract optab.
> 	(direct_vec_extract_optab_supported_p): Add.
> 	* internal-fn.def (VEC_EXTRACT): Add.
> 	* optabs.cc (can_vec_extract_var_idx_p): New function.
> 	* optabs.h (can_vec_extract_var_idx_p): Declare.
> ---
>  gcc/doc/md.texi     |   7 +++-
>  gcc/gimple-isel.cc  | 100 ++++++++++++++++++++++++++++++++------------
>  gcc/internal-fn.cc  |  39 +++++++++++++++++
>  gcc/internal-fn.def |   1 +
>  gcc/optabs.cc       |  24 +++++++++++
>  gcc/optabs.h        |   1 +
>  6 files changed, 144 insertions(+), 28 deletions(-)
> 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index f14dd32b2dc..b30a824488b 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5091,6 +5091,8 @@ Mask elements @var{i} with @var{i} > (operand 5 + operand 6) are ignored.
>  Set given field in the vector value.  Operand 0 is the vector to modify,
>  operand 1 is new value of field and operand 2 specify the field index.
>  
> +This pattern is not allowed to @code{FAIL}.
> +
>  @cindex @code{vec_extract@var{m}@var{n}} instruction pattern
>  @item @samp{vec_extract@var{m}@var{n}}
>  Extract given field from the vector value.  Operand 1 is the vector, operand 2
> @@ -5098,7 +5100,10 @@ specify field index and operand 0 place to store value into.  The
>  @var{n} mode is the mode of the field or vector of fields that should be
>  extracted, should be either element mode of the vector mode @var{m}, or
>  a vector mode with the same element mode and smaller number of elements.
> -If @var{n} is a vector mode, the index is counted in units of that mode.
> +If @var{n} is a vector mode the index is counted in multiples of
> +mode @var{n}.
> +
> +This pattern is not allowed to @code{FAIL}.
>  
>  @cindex @code{vec_init@var{m}@var{n}} instruction pattern
>  @item @samp{vec_init@var{m}@var{n}}
> diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
> index ef688ddb57f..a18b26dec7b 100644
> --- a/gcc/gimple-isel.cc
> +++ b/gcc/gimple-isel.cc
> @@ -42,17 +42,27 @@ along with GCC; see the file COPYING3.  If not see
>  
>  /* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to
>     internal function based on vector type of selected expansion.
> -   i.e.:
> +
> +   For vec_set:
> +
>       VIEW_CONVERT_EXPR<int[4]>(u)[_1] = i_4(D);
>     =>
>       _7 = u;
>       _8 = .VEC_SET (_7, i_4(D), _1);
> -     u = _8;  */
> +     u = _8;
> +
> +   For vec_extract:
> +
> +      _3 = VIEW_CONVERT_EXPR<intD.1[4]>(vD.2208)[idx_2(D)];
> +   =>
> +      _4 = vD.2208;
> +      _5 = .VEC_EXTRACT (_4, idx_2(D));
> +      _3 = _5;  */

I think you are doing

         _3 = .VEC_EXTRACT (_4, idx_2(D));

and avoiding the SSA name copy correctly.  Can you double-check?

OK with the comment adjusted.

Thanks,
Richard.
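
For context, here is a minimal source-level sketch of the kind of code that should reach this transform.  With the GCC vector extension, subscripting a vector variable with a non-constant index is, as far as I understand it, represented as an ARRAY_REF of a VIEW_CONVERT_EXPR on that variable.  The names v4si, extract_elt, set_elt and self_check are made up for illustration, and whether .VEC_EXTRACT or .VEC_SET is actually emitted depends on the target providing an expander that accepts a register index.

```cpp
// Hypothetical 4 x int vector type using the GCC vector extension.
typedef int v4si __attribute__ ((vector_size (16)));

// Read with a run-time index: the candidate for .VEC_EXTRACT.
int
extract_elt (v4si v, int idx)
{
  return v[idx];
}

// Write with a run-time index: the already handled .VEC_SET case.
v4si
set_elt (v4si v, int idx, int val)
{
  v[idx] = val;
  return v;
}

// Small usage check; returns true if both accessors behave as expected.
bool
self_check ()
{
  v4si v = { 10, 20, 30, 40 };
  v4si w = set_elt (v, 1, 99);
  return extract_elt (v, 2) == 30 && extract_elt (w, 1) == 99;
}
```

Dumping the function with -fdump-tree-isel on a target that supports variable-index extraction should show which form is generated.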

>  
>  static bool
> -gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
> +gimple_expand_vec_set_extract_expr (struct function *fun,
> +				    gimple_stmt_iterator *gsi)
>  {
> -  enum tree_code code;
>    gcall *new_stmt = NULL;
>    gassign *ass_stmt = NULL;
>    bool cfg_changed = false;
> @@ -62,49 +72,84 @@ gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
>    if (!stmt)
>      return false;
>  
> +  bool is_extract = false;
> +
>    tree lhs = gimple_assign_lhs (stmt);
> -  code = TREE_CODE (lhs);
> -  if (code != ARRAY_REF)
> +  tree rhs = gimple_assign_rhs1 (stmt);
> +  tree val, ref;
> +  if (TREE_CODE (lhs) == ARRAY_REF)
> +    {
> +      /* Assume it is a vec_set.  */
> +      val = rhs;
> +      ref = lhs;
> +    }
> +  else if (TREE_CODE (rhs) == ARRAY_REF)
> +    {
> +      /* vec_extract.  */
> +      is_extract = true;
> +      val = lhs;
> +      ref = rhs;
> +    }
> +  else
>      return false;
>  
> -  tree val = gimple_assign_rhs1 (stmt);
> -  tree op0 = TREE_OPERAND (lhs, 0);
> +  tree op0 = TREE_OPERAND (ref, 0);
>    if (TREE_CODE (op0) == VIEW_CONVERT_EXPR && DECL_P (TREE_OPERAND (op0, 0))
>        && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
> -      && TYPE_MODE (TREE_TYPE (lhs))
> +      && TYPE_MODE (TREE_TYPE (ref))
>  	   == TYPE_MODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0)))))
>      {
> -      tree pos = TREE_OPERAND (lhs, 1);
> +      tree pos = TREE_OPERAND (ref, 1);
> +
>        tree view_op0 = TREE_OPERAND (op0, 0);
>        machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
> +      machine_mode extract_mode = TYPE_MODE (TREE_TYPE (ref));
> +
>        if (auto_var_in_fn_p (view_op0, fun->decl)
> -	  && !TREE_ADDRESSABLE (view_op0) && can_vec_set_var_idx_p (outermode))
> +	  && !TREE_ADDRESSABLE (view_op0)
> +	  && ((!is_extract && can_vec_set_var_idx_p (outermode))
> +	      || (is_extract
> +		  && can_vec_extract_var_idx_p (outermode, extract_mode))))
>  	{
>  	  location_t loc = gimple_location (stmt);
>  	  tree var_src = make_ssa_name (TREE_TYPE (view_op0));
> -	  tree var_dst = make_ssa_name (TREE_TYPE (view_op0));
>  
>  	  ass_stmt = gimple_build_assign (var_src, view_op0);
>  	  gimple_set_vuse (ass_stmt, gimple_vuse (stmt));
>  	  gimple_set_location (ass_stmt, loc);
>  	  gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
>  
> -	  new_stmt
> -	    = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos);
> -	  gimple_call_set_lhs (new_stmt, var_dst);
> -	  gimple_set_location (new_stmt, loc);
> -	  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
> +	  if (!is_extract)
> +	    {
> +	      tree var_dst = make_ssa_name (TREE_TYPE (view_op0));
>  
> -	  ass_stmt = gimple_build_assign (view_op0, var_dst);
> -	  gimple_set_location (ass_stmt, loc);
> -	  gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
> +	      new_stmt = gimple_build_call_internal (IFN_VEC_SET, 3, var_src,
> +						     val, pos);
> +
> +	      gimple_call_set_lhs (new_stmt, var_dst);
> +	      gimple_set_location (new_stmt, loc);
> +	      gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
> +
> +	      ass_stmt = gimple_build_assign (view_op0, var_dst);
> +	      gimple_set_location (ass_stmt, loc);
> +	      gimple_move_vops (ass_stmt, stmt);
> +	      gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
>  
> -	  basic_block bb = gimple_bb (stmt);
> -	  gimple_move_vops (ass_stmt, stmt);
> -	  if (gsi_remove (gsi, true)
> -	      && gimple_purge_dead_eh_edges (bb))
> -	    cfg_changed = true;
> -	  *gsi = gsi_for_stmt (ass_stmt);
> +	      basic_block bb = gimple_bb (stmt);
> +	      if (gsi_remove (gsi, true)
> +		  && gimple_purge_dead_eh_edges (bb))
> +		cfg_changed = true;
> +	      *gsi = gsi_for_stmt (ass_stmt);
> +	    }
> +	  else
> +	    {
> +	      new_stmt
> +		= gimple_build_call_internal (IFN_VEC_EXTRACT, 2, var_src, pos);
> +	      gimple_call_set_lhs (new_stmt, lhs);
> +
> +	      gsi_replace (gsi, new_stmt, true);
> +	      cfg_changed = true;
> +	    }
>  	}
>      }
>  
> @@ -317,7 +362,8 @@ gimple_expand_vec_exprs (struct function *fun)
>  	      gsi_replace (&gsi, g, false);
>  	    }
>  
> -	  cfg_changed |= gimple_expand_vec_set_expr (fun, &gsi);
> +	  cfg_changed |= gimple_expand_vec_set_extract_expr (fun, &gsi);
> +
>  	  if (gsi_end_p (gsi))
>  	    break;
>  	}
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 303df102d81..23f9309572b 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -175,6 +175,7 @@ init_internal_fns ()
>  #define len_store_direct { 3, 3, false }
>  #define len_maskstore_direct { 4, 5, false }
>  #define vec_set_direct { 3, 3, false }
> +#define vec_extract_direct { 3, 3, false }
>  #define unary_direct { 0, 0, true }
>  #define unary_convert_direct { -1, 0, true }
>  #define binary_direct { 0, 0, true }
> @@ -3107,6 +3108,43 @@ expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
>    gcc_unreachable ();
>  }
>  
> +/* Expand VEC_EXTRACT optab internal function.  */
> +
> +static void
> +expand_vec_extract_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
> +{
> +  tree lhs = gimple_call_lhs (stmt);
> +  tree op0 = gimple_call_arg (stmt, 0);
> +  tree op1 = gimple_call_arg (stmt, 1);
> +
> +  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
> +
> +  machine_mode outermode = TYPE_MODE (TREE_TYPE (op0));
> +  machine_mode extract_mode = TYPE_MODE (TREE_TYPE (lhs));
> +
> +  rtx src = expand_normal (op0);
> +  rtx pos = expand_normal (op1);
> +
> +  class expand_operand ops[3];
> +  enum insn_code icode = convert_optab_handler (optab, outermode,
> +						extract_mode);
> +
> +  if (icode != CODE_FOR_nothing)
> +    {
> +      create_output_operand (&ops[0], target, extract_mode);
> +      create_input_operand (&ops[1], src, outermode);
> +      create_convert_operand_from (&ops[2], pos,
> +				   TYPE_MODE (TREE_TYPE (op1)), true);
> +      if (maybe_expand_insn (icode, 3, ops))
> +	{
> +	  if (!rtx_equal_p (target, ops[0].value))
> +	    emit_move_insn (target, ops[0].value);
> +	  return;
> +	}
> +    }
> +  gcc_unreachable ();
> +}
> +
>  static void
>  expand_ABNORMAL_DISPATCHER (internal_fn, gcall *)
>  {
> @@ -3946,6 +3984,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
>  #define direct_mask_fold_left_optab_supported_p direct_optab_supported_p
>  #define direct_check_ptrs_optab_supported_p direct_optab_supported_p
>  #define direct_vec_set_optab_supported_p direct_optab_supported_p
> +#define direct_vec_extract_optab_supported_p direct_optab_supported_p
>  
>  /* Return the optab used by internal function FN.  */
>  
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index 9b73e540d55..238b7ee0bc9 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -180,6 +180,7 @@ DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_cond)
>  DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask)
>  
>  DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set)
> +DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, 0, vec_extract, vec_extract)
>  
>  DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store)
>  DEF_INTERNAL_OPTAB_FN (LEN_MASK_STORE, 0, len_maskstore, len_maskstore)
> diff --git a/gcc/optabs.cc b/gcc/optabs.cc
> index 0a2bcf3620c..4e9f58f8060 100644
> --- a/gcc/optabs.cc
> +++ b/gcc/optabs.cc
> @@ -4417,6 +4417,30 @@ can_vec_set_var_idx_p (machine_mode vec_mode)
>  	 && insn_operand_matches (icode, 2, reg3);
>  }
>  
> +/* Return whether the backend can emit a vec_extract instruction with
> +   a non-constant index.  */
> +bool
> +can_vec_extract_var_idx_p (machine_mode vec_mode, machine_mode extr_mode)
> +{
> +  if (!VECTOR_MODE_P (vec_mode))
> +    return false;
> +
> +  rtx reg1 = alloca_raw_REG (extr_mode, LAST_VIRTUAL_REGISTER + 1);
> +  rtx reg2 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 2);
> +
> +  enum insn_code icode = convert_optab_handler (vec_extract_optab,
> +						vec_mode, extr_mode);
> +
> +  const struct insn_data_d *data = &insn_data[icode];
> +  machine_mode idx_mode = data->operand[2].mode;
> +
> +  rtx reg3 = alloca_raw_REG (idx_mode, LAST_VIRTUAL_REGISTER + 3);
> +
> +  return icode != CODE_FOR_nothing && insn_operand_matches (icode, 0, reg1)
> +	 && insn_operand_matches (icode, 1, reg2)
> +	 && insn_operand_matches (icode, 2, reg3);
> +}
> +
>  /* This function is called when we are going to emit a compare instruction that
>     compares the values found in X and Y, using the rtl operator COMPARISON.
>  
> diff --git a/gcc/optabs.h b/gcc/optabs.h
> index 781d5548be9..c80b7f4dc1b 100644
> --- a/gcc/optabs.h
> +++ b/gcc/optabs.h
> @@ -259,6 +259,7 @@ extern bool can_vcond_compare_p (enum rtx_code, machine_mode, machine_mode);
>  /* Return whether the backend can emit vector set instructions for inserting
>     element into vector at variable index position.  */
>  extern bool can_vec_set_var_idx_p (machine_mode);
> +extern bool can_vec_extract_var_idx_p (machine_mode, machine_mode);
>  
>  extern rtx prepare_operand (enum insn_code, rtx, int, machine_mode,
>  			    machine_mode, int);
>
Robin Dapp July 5, 2023, 8:14 a.m. UTC | #4
>> +      _4 = vD.2208;
>> +      _5 = .VEC_EXTRACT (_4, idx_2(D));
>> +      _3 = _5;  */
> 
> I think you are doing
> 
>          _3 = .VEC_EXTRACT (_4, idx_2(D));
> 
> and avoiding the SSA name copy correctly.  Can you double-check?
> 
> OK with the comment adjusted.

Argh, yes, thanks.

Regards
 Robin
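
As a side note on the md.texi wording in the patch ("the index is counted in multiples of mode n"), the following is a plain C++ model of how I read it, not GCC code.  The function name extract_subvector and its template parameters are hypothetical, and the requirement that the sub-vector length divides the vector length is an assumption of this sketch.

```cpp
#include <array>
#include <cstddef>

// Model of vec_extract with a vector result: return the IDX-th group of
// N consecutive elements of VEC, i.e. start at element IDX * N.
template <std::size_t N, typename T, std::size_t M>
std::array<T, N>
extract_subvector (const std::array<T, M> &vec, std::size_t idx)
{
  // Assumption of this sketch: the sub-vector length divides M.
  static_assert (N <= M && M % N == 0, "N must divide M");
  std::array<T, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = vec[idx * N + i];
  return out;
}
```

Under this reading, index 1 of a four-element extraction from an eight-element vector selects the upper half, not the elements starting at position 1.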
Patch

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 9648fdc846a..c61602fb04d 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5074,6 +5074,8 @@  of the result should be stored to memory.
 Set given field in the vector value.  Operand 0 is the vector to modify,
 operand 1 is new value of field and operand 2 specify the field index.
 
+This pattern is not allowed to @code{FAIL}.
+
 @cindex @code{vec_extract@var{m}@var{n}} instruction pattern
 @item @samp{vec_extract@var{m}@var{n}}
 Extract given field from the vector value.  Operand 1 is the vector, operand 2
@@ -5081,7 +5083,10 @@  specify field index and operand 0 place to store value into.  The
 @var{n} mode is the mode of the field or vector of fields that should be
 extracted, should be either element mode of the vector mode @var{m}, or
 a vector mode with the same element mode and smaller number of elements.
-If @var{n} is a vector mode, the index is counted in units of that mode.
+If @var{n} is a vector mode the index is counted in multiples of
+mode @var{n}.
+
+This pattern is not allowed to @code{FAIL}.
 
 @cindex @code{vec_init@var{m}@var{n}} instruction pattern
 @item @samp{vec_init@var{m}@var{n}}
diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index ef688ddb57f..d6c560b63dd 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -42,15 +42,26 @@  along with GCC; see the file COPYING3.  If not see
 
 /* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to
    internal function based on vector type of selected expansion.
-   i.e.:
+
+   For vec_set:
+
      VIEW_CONVERT_EXPR<int[4]>(u)[_1] = i_4(D);
    =>
      _7 = u;
      _8 = .VEC_SET (_7, i_4(D), _1);
-     u = _8;  */
+     u = _8;
+
+   For vec_extract:
+
+      _3 = VIEW_CONVERT_EXPR<intD.1[4]>(vD.2208)[idx_2(D)];
+   =>
+      _4 = vD.2208;
+      _5 = .VEC_EXTRACT (_4, idx_2(D));
+      _3 = _5;  */
 
 static bool
-gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
+gimple_expand_vec_set_extract_expr (struct function *fun,
+				    gimple_stmt_iterator *gsi)
 {
   enum tree_code code;
   gcall *new_stmt = NULL;
@@ -62,45 +73,88 @@  gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi)
   if (!stmt)
     return false;
 
+  bool is_extract = false;
+
   tree lhs = gimple_assign_lhs (stmt);
+  tree rhs = gimple_assign_rhs1 (stmt);
+  tree val, op0;
   code = TREE_CODE (lhs);
-  if (code != ARRAY_REF)
-    return false;
+  if (code == ARRAY_REF)
+    {
+      /* Assume it is a vec_set.  */
+      val = rhs;
+      op0 = TREE_OPERAND (lhs, 0);
+    }
+  else
+    {
+      /* It can still be a vec_extract.  */
+      code = TREE_CODE (rhs);
+      if (code != ARRAY_REF)
+	return false;
+
+      /* Sides are swapped for vec_extract.  */
+      is_extract = true;
+      val = lhs;
+      op0 = TREE_OPERAND (rhs, 0);
+    }
 
-  tree val = gimple_assign_rhs1 (stmt);
-  tree op0 = TREE_OPERAND (lhs, 0);
   if (TREE_CODE (op0) == VIEW_CONVERT_EXPR && DECL_P (TREE_OPERAND (op0, 0))
       && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
       && TYPE_MODE (TREE_TYPE (lhs))
 	   == TYPE_MODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0)))))
     {
-      tree pos = TREE_OPERAND (lhs, 1);
+      tree pos;
+      if (!is_extract)
+	pos = TREE_OPERAND (lhs, 1);
+      else
+	pos = TREE_OPERAND (rhs, 1);
+
       tree view_op0 = TREE_OPERAND (op0, 0);
       machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
+      machine_mode extract_mode;
+      if (is_extract)
+	extract_mode = TYPE_MODE (TREE_TYPE (lhs));
+
       if (auto_var_in_fn_p (view_op0, fun->decl)
-	  && !TREE_ADDRESSABLE (view_op0) && can_vec_set_var_idx_p (outermode))
+	  && !TREE_ADDRESSABLE (view_op0)
+	  && ((!is_extract && can_vec_set_var_idx_p (outermode))
+	      || (is_extract &&
+		  can_vec_extract_var_idx_p (outermode, extract_mode))))
 	{
 	  location_t loc = gimple_location (stmt);
 	  tree var_src = make_ssa_name (TREE_TYPE (view_op0));
-	  tree var_dst = make_ssa_name (TREE_TYPE (view_op0));
+	  tree var_dst;
+	  if (!is_extract)
+	    var_dst = make_ssa_name (TREE_TYPE (view_op0));
+	  else
+	    var_dst = make_ssa_name (TREE_TYPE (lhs));
 
 	  ass_stmt = gimple_build_assign (var_src, view_op0);
 	  gimple_set_vuse (ass_stmt, gimple_vuse (stmt));
 	  gimple_set_location (ass_stmt, loc);
 	  gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
 
-	  new_stmt
-	    = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos);
+	  if (!is_extract)
+	    new_stmt
+	      = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos);
+	  else
+	    new_stmt
+	      = gimple_build_call_internal (IFN_VEC_EXTRACT, 2, var_src, pos);
+
 	  gimple_call_set_lhs (new_stmt, var_dst);
 	  gimple_set_location (new_stmt, loc);
 	  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
 
-	  ass_stmt = gimple_build_assign (view_op0, var_dst);
+	  if (!is_extract)
+	    ass_stmt = gimple_build_assign (view_op0, var_dst);
+	  else
+	    ass_stmt = gimple_build_assign (lhs, var_dst);
 	  gimple_set_location (ass_stmt, loc);
+	  if (!is_extract)
+	    gimple_move_vops (ass_stmt, stmt);
 	  gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT);
 
 	  basic_block bb = gimple_bb (stmt);
-	  gimple_move_vops (ass_stmt, stmt);
 	  if (gsi_remove (gsi, true)
 	      && gimple_purge_dead_eh_edges (bb))
 	    cfg_changed = true;
@@ -317,7 +371,8 @@  gimple_expand_vec_exprs (struct function *fun)
 	      gsi_replace (&gsi, g, false);
 	    }
 
-	  cfg_changed |= gimple_expand_vec_set_expr (fun, &gsi);
+	  cfg_changed |= gimple_expand_vec_set_extract_expr (fun, &gsi);
+
 	  if (gsi_end_p (gsi))
 	    break;
 	}
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 9017176dc7a..cbdb7e64d7c 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -175,6 +175,7 @@  init_internal_fns ()
 #define len_store_direct { 3, 3, false }
 #define len_maskstore_direct { 4, 3, false }
 #define vec_set_direct { 3, 3, false }
+#define vec_extract_direct { 3, 3, false }
 #define unary_direct { 0, 0, true }
 #define unary_convert_direct { -1, 0, true }
 #define binary_direct { 0, 0, true }
@@ -3126,6 +3127,43 @@  expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
   gcc_unreachable ();
 }
 
+/* Expand VEC_EXTRACT optab internal function.  */
+
+static void
+expand_vec_extract_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
+{
+  tree lhs = gimple_call_lhs (stmt);
+  tree op0 = gimple_call_arg (stmt, 0);
+  tree op1 = gimple_call_arg (stmt, 1);
+
+  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+
+  machine_mode outermode = TYPE_MODE (TREE_TYPE (op0));
+  machine_mode extract_mode = TYPE_MODE (TREE_TYPE (lhs));
+
+  rtx src = expand_normal (op0);
+  rtx pos = expand_normal (op1);
+
+  class expand_operand ops[3];
+  enum insn_code icode = convert_optab_handler (optab, outermode,
+						extract_mode);
+
+  if (icode != CODE_FOR_nothing)
+    {
+      create_output_operand (&ops[0], target, extract_mode);
+      create_input_operand (&ops[1], src, outermode);
+      create_convert_operand_from (&ops[2], pos,
+				   TYPE_MODE (TREE_TYPE (op1)), true);
+      if (maybe_expand_insn (icode, 3, ops))
+	{
+	  if (!rtx_equal_p (target, ops[0].value))
+	    emit_move_insn (target, ops[0].value);
+	  return;
+	}
+    }
+  gcc_unreachable ();
+}
+
 static void
 expand_ABNORMAL_DISPATCHER (internal_fn, gcall *)
 {
@@ -3976,6 +4014,7 @@  multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 #define direct_mask_fold_left_optab_supported_p direct_optab_supported_p
 #define direct_check_ptrs_optab_supported_p direct_optab_supported_p
 #define direct_vec_set_optab_supported_p direct_optab_supported_p
+#define direct_vec_extract_optab_supported_p direct_optab_supported_p
 
 /* Return the optab used by internal function FN.  */
 
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index bc947c0fde7..032624b1547 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -176,6 +176,7 @@  DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_cond)
 DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask)
 
 DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set)
+DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, 0, vec_extract, vec_extract)
 
 DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store)
 DEF_INTERNAL_OPTAB_FN (LEN_MASK_STORE, 0, len_maskstore, len_maskstore)
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index 89ab55b72ae..4440bc505b1 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -4417,6 +4417,30 @@  can_vec_set_var_idx_p (machine_mode vec_mode)
 	 && insn_operand_matches (icode, 2, reg3);
 }
 
+/* Return whether the backend can emit a vec_extract instruction with
+   a non-constant index.  */
+bool
+can_vec_extract_var_idx_p (machine_mode vec_mode, machine_mode extr_mode)
+{
+  if (!VECTOR_MODE_P (vec_mode))
+    return false;
+
+  rtx reg1 = alloca_raw_REG (extr_mode, LAST_VIRTUAL_REGISTER + 1);
+  rtx reg2 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 2);
+
+  enum insn_code icode = convert_optab_handler (vec_extract_optab,
+						vec_mode, extr_mode);
+
+  const struct insn_data_d *data = &insn_data[icode];
+  machine_mode idx_mode = data->operand[2].mode;
+
+  rtx reg3 = alloca_raw_REG (idx_mode, LAST_VIRTUAL_REGISTER + 3);
+
+  return icode != CODE_FOR_nothing && insn_operand_matches (icode, 0, reg1)
+	 && insn_operand_matches (icode, 1, reg2)
+	 && insn_operand_matches (icode, 2, reg3);
+}
+
 /* This function is called when we are going to emit a compare instruction that
    compares the values found in X and Y, using the rtl operator COMPARISON.
 
diff --git a/gcc/optabs.h b/gcc/optabs.h
index 29ccbe9235e..f14d012d92f 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -259,6 +259,7 @@  extern bool can_vcond_compare_p (enum rtx_code, machine_mode, machine_mode);
 /* Return whether the backend can emit vector set instructions for inserting
    element into vector at variable index position.  */
 extern bool can_vec_set_var_idx_p (machine_mode);
+extern bool can_vec_extract_var_idx_p (machine_mode, machine_mode);
 
 extern rtx prepare_operand (enum insn_code, rtx, int, machine_mode,
 			    machine_mode, int);