Message ID | 5104f927-d7db-c148-911c-0ca8e783a609@gmail.com |
---|---|
State | New |
Headers | show |
Series | gimple-isel: Recognize vec_extract pattern. | expand |
On Mon, 3 Jul 2023, Robin Dapp wrote: > Hi, > > In gimple-isel we already deduce a vec_set pattern from an > ARRAY_REF(VIEW_CONVERT_EXPR). This patch does the same for a > vec_extract. > > The code is largely similar to the vec_set one including > the addition of a can_vec_extract_var_idx_p function > in optabs.cc to check if the backend can handle a register > operand as index. We already have can_vec_extract in > optabs-query but that one checks whether we can extract > specific modes. > > With the introduction of an internal function for vec_extract > the expander must not FAIL. For vec_set this has already been > the case so adjust the documentation accordingly. > > Additionally, clarify the wording of the vector-vector case for > vec_extract. > > During testing I noticed that the aarch64 simd vec_extract > expander is the only one that FAILs. Richard is currently > testing a patch that tries to remove this. Bootstrap and > testsuite was unchanged on x86 and power. > > I was a bit torn whether to add a separate function to recognize > vec_extract or not and ended up doing it inline with several > is_extract checks. > > Regards > Robin > > gcc/ChangeLog: > > * doc/md.texi: Document that vec_set and vec_extract must not > fail. > * gimple-isel.cc (gimple_expand_vec_set_expr): Rename this... > (gimple_expand_vec_set_extract_expr): ...to this. > (gimple_expand_vec_exprs): Call renamed function. > * internal-fn.cc (vec_extract_direct): Add. > (expand_vec_extract_optab_fn): New function to expand > vec_extract optab. > (direct_vec_extract_optab_supported_p): Add. > * internal-fn.def (VEC_EXTRACT): Add. > * optabs.cc (can_vec_extract_var_idx_p): New function. > * optabs.h (can_vec_extract_var_idx_p): Declare. > --- > gcc/doc/md.texi | 7 +++- > gcc/gimple-isel.cc | 85 +++++++++++++++++++++++++++++++++++++-------- > gcc/internal-fn.cc | 39 +++++++++++++++++++++ > gcc/internal-fn.def | 1 + > gcc/optabs.cc | 24 +++++++++++++ > gcc/optabs.h | 1 + > 6 files changed, 141 insertions(+), 16 deletions(-) > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > index 9648fdc846a..c61602fb04d 100644 > --- a/gcc/doc/md.texi > +++ b/gcc/doc/md.texi > @@ -5074,6 +5074,8 @@ of the result should be stored to memory. > Set given field in the vector value. Operand 0 is the vector to modify, > operand 1 is new value of field and operand 2 specify the field index. > > +This pattern is not allowed to @code{FAIL}. > + > @cindex @code{vec_extract@var{m}@var{n}} instruction pattern > @item @samp{vec_extract@var{m}@var{n}} > Extract given field from the vector value. Operand 1 is the vector, operand 2 > @@ -5081,7 +5083,10 @@ specify field index and operand 0 place to store value into. The > @var{n} mode is the mode of the field or vector of fields that should be > extracted, should be either element mode of the vector mode @var{m}, or > a vector mode with the same element mode and smaller number of elements. > -If @var{n} is a vector mode, the index is counted in units of that mode. > +If @var{n} is a vector mode the index is counted in multiples of > +mode @var{n}. > + > +This pattern is not allowed to @code{FAIL}. > > @cindex @code{vec_init@var{m}@var{n}} instruction pattern > @item @samp{vec_init@var{m}@var{n}} > diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc > index ef688ddb57f..d6c560b63dd 100644 > --- a/gcc/gimple-isel.cc > +++ b/gcc/gimple-isel.cc > @@ -42,15 +42,26 @@ along with GCC; see the file COPYING3. If not see > > /* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to > internal function based on vector type of selected expansion. > - i.e.: > + > + For vec_set: > + > VIEW_CONVERT_EXPR<int[4]>(u)[_1] = i_4(D); > => > _7 = u; > _8 = .VEC_SET (_7, i_4(D), _1); > - u = _8; */ > + u = _8; > + > + For vec_extract: > + > + _3 = VIEW_CONVERT_EXPR<intD.1[4]>(vD.2208)[idx_2(D)]; > + => > + _4 = vD.2208; > + _5 = .VEC_EXTRACT (_4, idx_2(D)); > + _3 = _5; */ > > static bool > -gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi) > +gimple_expand_vec_set_extract_expr (struct function *fun, > + gimple_stmt_iterator *gsi) > { > enum tree_code code; > gcall *new_stmt = NULL; > @@ -62,45 +73,88 @@ gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi) > if (!stmt) > return false; > > + bool is_extract = false; > + > tree lhs = gimple_assign_lhs (stmt); > + tree rhs = gimple_assign_rhs1 (stmt); > + tree val, op0; > code = TREE_CODE (lhs); > - if (code != ARRAY_REF) > - return false; > + if (code == ARRAY_REF) > + { > + /* Assume it is a vec_set. */ > + val = rhs; > + op0 = TREE_OPERAND (lhs, 0); > + } > + else > + { > + /* It can still be a vec_extract. */ > + code = TREE_CODE (rhs); > + if (code != ARRAY_REF) > + return false; > + can you make this if (TREE_CODE (lhs) == ARRAY_REF) .. else if (TREE_CODE (rhs) == ARRAY_REF) .. else return false; that's way easier to read. > + /* Sides are swapped for vec_extract. */ > + is_extract = true; > + val = lhs; > + op0 = TREE_OPERAND (rhs, 0); > + } > > - tree val = gimple_assign_rhs1 (stmt); > - tree op0 = TREE_OPERAND (lhs, 0); > if (TREE_CODE (op0) == VIEW_CONVERT_EXPR && DECL_P (TREE_OPERAND (op0, 0)) > && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0))) > && TYPE_MODE (TREE_TYPE (lhs)) this 'lhs' would need to be 'rhs' for the extract? So in addition to 'val' add 'ref' and initialize that? > == TYPE_MODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0))))) > { > - tree pos = TREE_OPERAND (lhs, 1); > + tree pos; > + if (!is_extract) > + pos = TREE_OPERAND (lhs, 1); > + else > + pos = TREE_OPERAND (rhs, 1); > + then makes TREE_OPERAND (ref, 1); > tree view_op0 = TREE_OPERAND (op0, 0); > machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0)); > + machine_mode extract_mode; > + if (is_extract) > + extract_mode = TYPE_MODE (TREE_TYPE (lhs)); > + and this would be 'val', though looking at 'ref' looks better here. > if (auto_var_in_fn_p (view_op0, fun->decl) > - && !TREE_ADDRESSABLE (view_op0) && can_vec_set_var_idx_p (outermode)) > + && !TREE_ADDRESSABLE (view_op0) > + && ((!is_extract && can_vec_set_var_idx_p (outermode)) > + || (is_extract && > + can_vec_extract_var_idx_p (outermode, extract_mode)))) > { > location_t loc = gimple_location (stmt); > tree var_src = make_ssa_name (TREE_TYPE (view_op0)); > - tree var_dst = make_ssa_name (TREE_TYPE (view_op0)); > + tree var_dst; > + if (!is_extract) > + var_dst = make_ssa_name (TREE_TYPE (view_op0)); > + else > + var_dst = make_ssa_name (TREE_TYPE (lhs)); > > ass_stmt = gimple_build_assign (var_src, view_op0); > gimple_set_vuse (ass_stmt, gimple_vuse (stmt)); > gimple_set_location (ass_stmt, loc); > gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT); > > - new_stmt > - = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos); > + if (!is_extract) > + new_stmt > + = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos); > + else > + new_stmt > + = gimple_build_call_internal (IFN_VEC_EXTRACT, 2, var_src, pos); > + > gimple_call_set_lhs (new_stmt, var_dst); > gimple_set_location (new_stmt, loc); > gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); > > - ass_stmt = gimple_build_assign (view_op0, var_dst); > + if (!is_extract) > + ass_stmt = gimple_build_assign (view_op0, var_dst); > + else > + ass_stmt = gimple_build_assign (lhs, var_dst); as you preserve the LHS for the original testcase you should use gsi_replace () for it. I think that making a bigger if (is_extract) group covering all stmt generation would make this easier to read. Thanks, Richard. > gimple_set_location (ass_stmt, loc); > + if (!is_extract) > + gimple_move_vops (ass_stmt, stmt); > gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT); > > basic_block bb = gimple_bb (stmt); > - gimple_move_vops (ass_stmt, stmt); > if (gsi_remove (gsi, true) > && gimple_purge_dead_eh_edges (bb)) > cfg_changed = true; > @@ -317,7 +371,8 @@ gimple_expand_vec_exprs (struct function *fun) > gsi_replace (&gsi, g, false); > } > > - cfg_changed |= gimple_expand_vec_set_expr (fun, &gsi); > + cfg_changed |= gimple_expand_vec_set_extract_expr (fun, &gsi); > + > if (gsi_end_p (gsi)) > break; > } > diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc > index 9017176dc7a..cbdb7e64d7c 100644 > --- a/gcc/internal-fn.cc > +++ b/gcc/internal-fn.cc > @@ -175,6 +175,7 @@ init_internal_fns () > #define len_store_direct { 3, 3, false } > #define len_maskstore_direct { 4, 3, false } > #define vec_set_direct { 3, 3, false } > +#define vec_extract_direct { 3, 3, false } > #define unary_direct { 0, 0, true } > #define unary_convert_direct { -1, 0, true } > #define binary_direct { 0, 0, true } > @@ -3126,6 +3127,43 @@ expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab) > gcc_unreachable (); > } > > +/* Expand VEC_EXTRACT optab internal function. */ > + > +static void > +expand_vec_extract_optab_fn (internal_fn, gcall *stmt, convert_optab optab) > +{ > + tree lhs = gimple_call_lhs (stmt); > + tree op0 = gimple_call_arg (stmt, 0); > + tree op1 = gimple_call_arg (stmt, 1); > + > + rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); > + > + machine_mode outermode = TYPE_MODE (TREE_TYPE (op0)); > + machine_mode extract_mode = TYPE_MODE (TREE_TYPE (lhs)); > + > + rtx src = expand_normal (op0); > + rtx pos = expand_normal (op1); > + > + class expand_operand ops[3]; > + enum insn_code icode = convert_optab_handler (optab, outermode, > + extract_mode); > + > + if (icode != CODE_FOR_nothing) > + { > + create_output_operand (&ops[0], target, extract_mode); > + create_input_operand (&ops[1], src, outermode); > + create_convert_operand_from (&ops[2], pos, > + TYPE_MODE (TREE_TYPE (op1)), true); > + if (maybe_expand_insn (icode, 3, ops)) > + { > + if (!rtx_equal_p (target, ops[0].value)) > + emit_move_insn (target, ops[0].value); > + return; > + } > + } > + gcc_unreachable (); > +} > + > static void > expand_ABNORMAL_DISPATCHER (internal_fn, gcall *) > { > @@ -3976,6 +4014,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types, > #define direct_mask_fold_left_optab_supported_p direct_optab_supported_p > #define direct_check_ptrs_optab_supported_p direct_optab_supported_p > #define direct_vec_set_optab_supported_p direct_optab_supported_p > +#define direct_vec_extract_optab_supported_p direct_optab_supported_p > > /* Return the optab used by internal function FN. */ > > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def > index bc947c0fde7..032624b1547 100644 > --- a/gcc/internal-fn.def > +++ b/gcc/internal-fn.def > @@ -176,6 +176,7 @@ DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_cond) > DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask) > > DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set) > +DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, 0, vec_extract, vec_extract) > > DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store) > DEF_INTERNAL_OPTAB_FN (LEN_MASK_STORE, 0, len_maskstore, len_maskstore) > diff --git a/gcc/optabs.cc b/gcc/optabs.cc > index 89ab55b72ae..4440bc505b1 100644 > --- a/gcc/optabs.cc > +++ b/gcc/optabs.cc > @@ -4417,6 +4417,30 @@ can_vec_set_var_idx_p (machine_mode vec_mode) > && insn_operand_matches (icode, 2, reg3); > } > > +/* Return whether the backend can emit a vec_extract instruction with > + a non-constant index. */ > +bool > +can_vec_extract_var_idx_p (machine_mode vec_mode, machine_mode extr_mode) > +{ > + if (!VECTOR_MODE_P (vec_mode)) > + return false; > + > + rtx reg1 = alloca_raw_REG (extr_mode, LAST_VIRTUAL_REGISTER + 1); > + rtx reg2 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 2); > + > + enum insn_code icode = convert_optab_handler (vec_extract_optab, > + vec_mode, extr_mode); > + > + const struct insn_data_d *data = &insn_data[icode]; > + machine_mode idx_mode = data->operand[2].mode; > + > + rtx reg3 = alloca_raw_REG (idx_mode, LAST_VIRTUAL_REGISTER + 3); > + > + return icode != CODE_FOR_nothing && insn_operand_matches (icode, 0, reg1) > + && insn_operand_matches (icode, 1, reg2) > + && insn_operand_matches (icode, 2, reg3); > +} > + > /* This function is called when we are going to emit a compare instruction that > compares the values found in X and Y, using the rtl operator COMPARISON. > > diff --git a/gcc/optabs.h b/gcc/optabs.h > index 29ccbe9235e..f14d012d92f 100644 > --- a/gcc/optabs.h > +++ b/gcc/optabs.h > @@ -259,6 +259,7 @@ extern bool can_vcond_compare_p (enum rtx_code, machine_mode, machine_mode); > /* Return whether the backend can emit vector set instructions for inserting > element into vector at variable index position. */ > extern bool can_vec_set_var_idx_p (machine_mode); > +extern bool can_vec_extract_var_idx_p (machine_mode, machine_mode); > > extern rtx prepare_operand (enum insn_code, rtx, int, machine_mode, > machine_mode, int); >
Hi Richard, changed the patch according to your comments and I agree that it is more readable that way. I hope using lhs as target for the extract directly is possible the way I did it. Richard's patch for aarch64 is already, therefore testsuites on aarch64 and i386 are unchanged. Regards Robin Subject: [PATCH v2] gimple-isel: Recognize vec_extract pattern. In gimple-isel we already deduce a vec_set pattern from an ARRAY_REF(VIEW_CONVERT_EXPR). This patch does the same for a vec_extract. The code is largely similar to the vec_set one including the addition of a can_vec_extract_var_idx_p function in optabs.cc to check if the backend can handle a register operand as index. We already have can_vec_extract in optabs-query but that one checks whether we can extract specific modes. With the introduction of an internal function for vec_extract the expander must not FAIL. For vec_set this has already been the case so adjust the documentation accordingly. Additionally, clarify the wording of the vector-vector case for vec_extract. gcc/ChangeLog: * doc/md.texi: Document that vec_set and vec_extract must not fail. * gimple-isel.cc (gimple_expand_vec_set_expr): Rename this... (gimple_expand_vec_set_extract_expr): ...to this. (gimple_expand_vec_exprs): Call renamed function. * internal-fn.cc (vec_extract_direct): Add. (expand_vec_extract_optab_fn): New function to expand vec_extract optab. (direct_vec_extract_optab_supported_p): Add. * internal-fn.def (VEC_EXTRACT): Add. * optabs.cc (can_vec_extract_var_idx_p): New function. * optabs.h (can_vec_extract_var_idx_p): Declare. --- gcc/doc/md.texi | 7 +++- gcc/gimple-isel.cc | 100 ++++++++++++++++++++++++++++++++------------ gcc/internal-fn.cc | 39 +++++++++++++++++ gcc/internal-fn.def | 1 + gcc/optabs.cc | 24 +++++++++++ gcc/optabs.h | 1 + 6 files changed, 144 insertions(+), 28 deletions(-) diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index f14dd32b2dc..b30a824488b 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -5091,6 +5091,8 @@ Mask elements @var{i} with @var{i} > (operand 5 + operand 6) are ignored. Set given field in the vector value. Operand 0 is the vector to modify, operand 1 is new value of field and operand 2 specify the field index. +This pattern is not allowed to @code{FAIL}. + @cindex @code{vec_extract@var{m}@var{n}} instruction pattern @item @samp{vec_extract@var{m}@var{n}} Extract given field from the vector value. Operand 1 is the vector, operand 2 @@ -5098,7 +5100,10 @@ specify field index and operand 0 place to store value into. The @var{n} mode is the mode of the field or vector of fields that should be extracted, should be either element mode of the vector mode @var{m}, or a vector mode with the same element mode and smaller number of elements. -If @var{n} is a vector mode, the index is counted in units of that mode. +If @var{n} is a vector mode the index is counted in multiples of +mode @var{n}. + +This pattern is not allowed to @code{FAIL}. @cindex @code{vec_init@var{m}@var{n}} instruction pattern @item @samp{vec_init@var{m}@var{n}} diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc index ef688ddb57f..a18b26dec7b 100644 --- a/gcc/gimple-isel.cc +++ b/gcc/gimple-isel.cc @@ -42,17 +42,27 @@ along with GCC; see the file COPYING3. If not see /* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to internal function based on vector type of selected expansion. - i.e.: + + For vec_set: + VIEW_CONVERT_EXPR<int[4]>(u)[_1] = i_4(D); => _7 = u; _8 = .VEC_SET (_7, i_4(D), _1); - u = _8; */ + u = _8; + + For vec_extract: + + _3 = VIEW_CONVERT_EXPR<intD.1[4]>(vD.2208)[idx_2(D)]; + => + _4 = vD.2208; + _5 = .VEC_EXTRACT (_4, idx_2(D)); + _3 = _5; */ static bool -gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi) +gimple_expand_vec_set_extract_expr (struct function *fun, + gimple_stmt_iterator *gsi) { - enum tree_code code; gcall *new_stmt = NULL; gassign *ass_stmt = NULL; bool cfg_changed = false; @@ -62,49 +72,84 @@ gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi) if (!stmt) return false; + bool is_extract = false; + tree lhs = gimple_assign_lhs (stmt); - code = TREE_CODE (lhs); - if (code != ARRAY_REF) + tree rhs = gimple_assign_rhs1 (stmt); + tree val, ref; + if (TREE_CODE (lhs) == ARRAY_REF) + { + /* Assume it is a vec_set. */ + val = rhs; + ref = lhs; + } + else if (TREE_CODE (rhs) == ARRAY_REF) + { + /* vec_extract. */ + is_extract = true; + val = lhs; + ref = rhs; + } + else return false; - tree val = gimple_assign_rhs1 (stmt); - tree op0 = TREE_OPERAND (lhs, 0); + tree op0 = TREE_OPERAND (ref, 0); if (TREE_CODE (op0) == VIEW_CONVERT_EXPR && DECL_P (TREE_OPERAND (op0, 0)) && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0))) - && TYPE_MODE (TREE_TYPE (lhs)) + && TYPE_MODE (TREE_TYPE (ref)) == TYPE_MODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0))))) { - tree pos = TREE_OPERAND (lhs, 1); + tree pos = TREE_OPERAND (ref, 1); + tree view_op0 = TREE_OPERAND (op0, 0); machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0)); + machine_mode extract_mode = TYPE_MODE (TREE_TYPE (ref)); + if (auto_var_in_fn_p (view_op0, fun->decl) - && !TREE_ADDRESSABLE (view_op0) && can_vec_set_var_idx_p (outermode)) + && !TREE_ADDRESSABLE (view_op0) + && ((!is_extract && can_vec_set_var_idx_p (outermode)) + || (is_extract + && can_vec_extract_var_idx_p (outermode, extract_mode)))) { location_t loc = gimple_location (stmt); tree var_src = make_ssa_name (TREE_TYPE (view_op0)); - tree var_dst = make_ssa_name (TREE_TYPE (view_op0)); ass_stmt = gimple_build_assign (var_src, view_op0); gimple_set_vuse (ass_stmt, gimple_vuse (stmt)); gimple_set_location (ass_stmt, loc); gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT); - new_stmt - = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos); - gimple_call_set_lhs (new_stmt, var_dst); - gimple_set_location (new_stmt, loc); - gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); + if (!is_extract) + { + tree var_dst = make_ssa_name (TREE_TYPE (view_op0)); - ass_stmt = gimple_build_assign (view_op0, var_dst); - gimple_set_location (ass_stmt, loc); - gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT); + new_stmt = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, + val, pos); + + gimple_call_set_lhs (new_stmt, var_dst); + gimple_set_location (new_stmt, loc); + gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); + + ass_stmt = gimple_build_assign (view_op0, var_dst); + gimple_set_location (ass_stmt, loc); + gimple_move_vops (ass_stmt, stmt); + gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT); - basic_block bb = gimple_bb (stmt); - gimple_move_vops (ass_stmt, stmt); - if (gsi_remove (gsi, true) - && gimple_purge_dead_eh_edges (bb)) - cfg_changed = true; - *gsi = gsi_for_stmt (ass_stmt); + basic_block bb = gimple_bb (stmt); + if (gsi_remove (gsi, true) + && gimple_purge_dead_eh_edges (bb)) + cfg_changed = true; + *gsi = gsi_for_stmt (ass_stmt); + } + else + { + new_stmt + = gimple_build_call_internal (IFN_VEC_EXTRACT, 2, var_src, pos); + gimple_call_set_lhs (new_stmt, lhs); + + gsi_replace (gsi, new_stmt, true); + cfg_changed = true; + } } } @@ -317,7 +362,8 @@ gimple_expand_vec_exprs (struct function *fun) gsi_replace (&gsi, g, false); } - cfg_changed |= gimple_expand_vec_set_expr (fun, &gsi); + cfg_changed |= gimple_expand_vec_set_extract_expr (fun, &gsi); + if (gsi_end_p (gsi)) break; } diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 303df102d81..23f9309572b 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -175,6 +175,7 @@ init_internal_fns () #define len_store_direct { 3, 3, false } #define len_maskstore_direct { 4, 5, false } #define vec_set_direct { 3, 3, false } +#define vec_extract_direct { 3, 3, false } #define unary_direct { 0, 0, true } #define unary_convert_direct { -1, 0, true } #define binary_direct { 0, 0, true } @@ -3107,6 +3108,43 @@ expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab) gcc_unreachable (); } +/* Expand VEC_EXTRACT optab internal function. */ + +static void +expand_vec_extract_optab_fn (internal_fn, gcall *stmt, convert_optab optab) +{ + tree lhs = gimple_call_lhs (stmt); + tree op0 = gimple_call_arg (stmt, 0); + tree op1 = gimple_call_arg (stmt, 1); + + rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); + + machine_mode outermode = TYPE_MODE (TREE_TYPE (op0)); + machine_mode extract_mode = TYPE_MODE (TREE_TYPE (lhs)); + + rtx src = expand_normal (op0); + rtx pos = expand_normal (op1); + + class expand_operand ops[3]; + enum insn_code icode = convert_optab_handler (optab, outermode, + extract_mode); + + if (icode != CODE_FOR_nothing) + { + create_output_operand (&ops[0], target, extract_mode); + create_input_operand (&ops[1], src, outermode); + create_convert_operand_from (&ops[2], pos, + TYPE_MODE (TREE_TYPE (op1)), true); + if (maybe_expand_insn (icode, 3, ops)) + { + if (!rtx_equal_p (target, ops[0].value)) + emit_move_insn (target, ops[0].value); + return; + } + } + gcc_unreachable (); +} + static void expand_ABNORMAL_DISPATCHER (internal_fn, gcall *) { @@ -3946,6 +3984,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types, #define direct_mask_fold_left_optab_supported_p direct_optab_supported_p #define direct_check_ptrs_optab_supported_p direct_optab_supported_p #define direct_vec_set_optab_supported_p direct_optab_supported_p +#define direct_vec_extract_optab_supported_p direct_optab_supported_p /* Return the optab used by internal function FN. */ diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 9b73e540d55..238b7ee0bc9 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -180,6 +180,7 @@ DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_cond) DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask) DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set) +DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, 0, vec_extract, vec_extract) DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store) DEF_INTERNAL_OPTAB_FN (LEN_MASK_STORE, 0, len_maskstore, len_maskstore) diff --git a/gcc/optabs.cc b/gcc/optabs.cc index 0a2bcf3620c..4e9f58f8060 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -4417,6 +4417,30 @@ can_vec_set_var_idx_p (machine_mode vec_mode) && insn_operand_matches (icode, 2, reg3); } +/* Return whether the backend can emit a vec_extract instruction with + a non-constant index. */ +bool +can_vec_extract_var_idx_p (machine_mode vec_mode, machine_mode extr_mode) +{ + if (!VECTOR_MODE_P (vec_mode)) + return false; + + rtx reg1 = alloca_raw_REG (extr_mode, LAST_VIRTUAL_REGISTER + 1); + rtx reg2 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 2); + + enum insn_code icode = convert_optab_handler (vec_extract_optab, + vec_mode, extr_mode); + + const struct insn_data_d *data = &insn_data[icode]; + machine_mode idx_mode = data->operand[2].mode; + + rtx reg3 = alloca_raw_REG (idx_mode, LAST_VIRTUAL_REGISTER + 3); + + return icode != CODE_FOR_nothing && insn_operand_matches (icode, 0, reg1) + && insn_operand_matches (icode, 1, reg2) + && insn_operand_matches (icode, 2, reg3); +} + /* This function is called when we are going to emit a compare instruction that compares the values found in X and Y, using the rtl operator COMPARISON. diff --git a/gcc/optabs.h b/gcc/optabs.h index 781d5548be9..c80b7f4dc1b 100644 --- a/gcc/optabs.h +++ b/gcc/optabs.h @@ -259,6 +259,7 @@ extern bool can_vcond_compare_p (enum rtx_code, machine_mode, machine_mode); /* Return whether the backend can emit vector set instructions for inserting element into vector at variable index position. */ extern bool can_vec_set_var_idx_p (machine_mode); +extern bool can_vec_extract_var_idx_p (machine_mode, machine_mode); extern rtx prepare_operand (enum insn_code, rtx, int, machine_mode, machine_mode, int);
On Tue, 4 Jul 2023, Robin Dapp wrote: > Hi Richard, > > changed the patch according to your comments and I agree that > it is more readable that way. I hope using lhs as target for > the extract directly is possible the way I did it. Richard's > patch for aarch64 is already, therefore testsuites on aarch64 and > i386 are unchanged. > > Regards > Robin > > Subject: [PATCH v2] gimple-isel: Recognize vec_extract pattern. > > In gimple-isel we already deduce a vec_set pattern from an > ARRAY_REF(VIEW_CONVERT_EXPR). This patch does the same for a > vec_extract. > > The code is largely similar to the vec_set one > including the addition of a can_vec_extract_var_idx_p function > in optabs.cc to check if the backend can handle a register > operand as index. We already have can_vec_extract in > optabs-query but that one checks whether we can extract > specific modes. > > With the introduction of an internal function for vec_extract > the expander must not FAIL. For vec_set this has already been > the case so adjust the documentation accordingly. > > Additionally, clarify the wording of the vector-vector case for > vec_extract. > > gcc/ChangeLog: > > * doc/md.texi: Document that vec_set and vec_extract must not > fail. > * gimple-isel.cc (gimple_expand_vec_set_expr): Rename this... > (gimple_expand_vec_set_extract_expr): ...to this. > (gimple_expand_vec_exprs): Call renamed function. > * internal-fn.cc (vec_extract_direct): Add. > (expand_vec_extract_optab_fn): New function to expand > vec_extract optab. > (direct_vec_extract_optab_supported_p): Add. > * internal-fn.def (VEC_EXTRACT): Add. > * optabs.cc (can_vec_extract_var_idx_p): New function. > * optabs.h (can_vec_extract_var_idx_p): Declare. > --- > gcc/doc/md.texi | 7 +++- > gcc/gimple-isel.cc | 100 ++++++++++++++++++++++++++++++++------------ > gcc/internal-fn.cc | 39 +++++++++++++++++ > gcc/internal-fn.def | 1 + > gcc/optabs.cc | 24 +++++++++++ > gcc/optabs.h | 1 + > 6 files changed, 144 insertions(+), 28 deletions(-) > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > index f14dd32b2dc..b30a824488b 100644 > --- a/gcc/doc/md.texi > +++ b/gcc/doc/md.texi > @@ -5091,6 +5091,8 @@ Mask elements @var{i} with @var{i} > (operand 5 + operand 6) are ignored. > Set given field in the vector value. Operand 0 is the vector to modify, > operand 1 is new value of field and operand 2 specify the field index. > > +This pattern is not allowed to @code{FAIL}. > + > @cindex @code{vec_extract@var{m}@var{n}} instruction pattern > @item @samp{vec_extract@var{m}@var{n}} > Extract given field from the vector value. Operand 1 is the vector, operand 2 > @@ -5098,7 +5100,10 @@ specify field index and operand 0 place to store value into. The > @var{n} mode is the mode of the field or vector of fields that should be > extracted, should be either element mode of the vector mode @var{m}, or > a vector mode with the same element mode and smaller number of elements. > -If @var{n} is a vector mode, the index is counted in units of that mode. > +If @var{n} is a vector mode the index is counted in multiples of > +mode @var{n}. > + > +This pattern is not allowed to @code{FAIL}. > > @cindex @code{vec_init@var{m}@var{n}} instruction pattern > @item @samp{vec_init@var{m}@var{n}} > diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc > index ef688ddb57f..a18b26dec7b 100644 > --- a/gcc/gimple-isel.cc > +++ b/gcc/gimple-isel.cc > @@ -42,17 +42,27 @@ along with GCC; see the file COPYING3. If not see > > /* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to > internal function based on vector type of selected expansion. > - i.e.: > + > + For vec_set: > + > VIEW_CONVERT_EXPR<int[4]>(u)[_1] = i_4(D); > => > _7 = u; > _8 = .VEC_SET (_7, i_4(D), _1); > - u = _8; */ > + u = _8; > + > + For vec_extract: > + > + _3 = VIEW_CONVERT_EXPR<intD.1[4]>(vD.2208)[idx_2(D)]; > + => > + _4 = vD.2208; > + _5 = .VEC_EXTRACT (_4, idx_2(D)); > + _3 = _5; */ I think you are doing _3 = .VEC_EXTRACT (_4, idx_2(D)); and avoiding the SSA name copy correctly. Can you double-check? OK with the comment adjusted. Thanks, Richard. > > static bool > -gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi) > +gimple_expand_vec_set_extract_expr (struct function *fun, > + gimple_stmt_iterator *gsi) > { > - enum tree_code code; > gcall *new_stmt = NULL; > gassign *ass_stmt = NULL; > bool cfg_changed = false; > @@ -62,49 +72,84 @@ gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi) > if (!stmt) > return false; > > + bool is_extract = false; > + > tree lhs = gimple_assign_lhs (stmt); > - code = TREE_CODE (lhs); > - if (code != ARRAY_REF) > + tree rhs = gimple_assign_rhs1 (stmt); > + tree val, ref; > + if (TREE_CODE (lhs) == ARRAY_REF) > + { > + /* Assume it is a vec_set. */ > + val = rhs; > + ref = lhs; > + } > + else if (TREE_CODE (rhs) == ARRAY_REF) > + { > + /* vec_extract. */ > + is_extract = true; > + val = lhs; > + ref = rhs; > + } > + else > return false; > > - tree val = gimple_assign_rhs1 (stmt); > - tree op0 = TREE_OPERAND (lhs, 0); > + tree op0 = TREE_OPERAND (ref, 0); > if (TREE_CODE (op0) == VIEW_CONVERT_EXPR && DECL_P (TREE_OPERAND (op0, 0)) > && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0))) > - && TYPE_MODE (TREE_TYPE (lhs)) > + && TYPE_MODE (TREE_TYPE (ref)) > == TYPE_MODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0))))) > { > - tree pos = TREE_OPERAND (lhs, 1); > + tree pos = TREE_OPERAND (ref, 1); > + > tree view_op0 = TREE_OPERAND (op0, 0); > machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0)); > + machine_mode extract_mode = TYPE_MODE (TREE_TYPE (ref)); > + > if (auto_var_in_fn_p (view_op0, fun->decl) > - && !TREE_ADDRESSABLE (view_op0) && can_vec_set_var_idx_p (outermode)) > + && !TREE_ADDRESSABLE (view_op0) > + && ((!is_extract && can_vec_set_var_idx_p (outermode)) > + || (is_extract > + && can_vec_extract_var_idx_p (outermode, extract_mode)))) > { > location_t loc = gimple_location (stmt); > tree var_src = make_ssa_name (TREE_TYPE (view_op0)); > - tree var_dst = make_ssa_name (TREE_TYPE (view_op0)); > > ass_stmt = gimple_build_assign (var_src, view_op0); > gimple_set_vuse (ass_stmt, gimple_vuse (stmt)); > gimple_set_location (ass_stmt, loc); > gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT); > > - new_stmt > - = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos); > - gimple_call_set_lhs (new_stmt, var_dst); > - gimple_set_location (new_stmt, loc); > - gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); > + if (!is_extract) > + { > + tree var_dst = make_ssa_name (TREE_TYPE (view_op0)); > > - ass_stmt = gimple_build_assign (view_op0, var_dst); > - gimple_set_location (ass_stmt, loc); > - gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT); > + new_stmt = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, > + val, pos); > + > + gimple_call_set_lhs (new_stmt, var_dst); > + gimple_set_location (new_stmt, loc); > + gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); > + > + ass_stmt = gimple_build_assign (view_op0, var_dst); > + gimple_set_location (ass_stmt, loc); > + gimple_move_vops (ass_stmt, stmt); > + gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT); > > - basic_block bb = gimple_bb (stmt); > - gimple_move_vops (ass_stmt, stmt); > - if (gsi_remove (gsi, true) > - && gimple_purge_dead_eh_edges (bb)) > - cfg_changed = true; > - *gsi = gsi_for_stmt (ass_stmt); > + basic_block bb = gimple_bb (stmt); > + if (gsi_remove (gsi, true) > + && gimple_purge_dead_eh_edges (bb)) > + cfg_changed = true; > + *gsi = gsi_for_stmt (ass_stmt); > + } > + else > + { > + new_stmt > + = gimple_build_call_internal (IFN_VEC_EXTRACT, 2, var_src, pos); > + gimple_call_set_lhs (new_stmt, lhs); > + > + gsi_replace (gsi, new_stmt, true); > + cfg_changed = true; > + } > } > } > > @@ -317,7 +362,8 @@ gimple_expand_vec_exprs (struct function *fun) > gsi_replace (&gsi, g, false); > } > > - cfg_changed |= gimple_expand_vec_set_expr (fun, &gsi); > + cfg_changed |= gimple_expand_vec_set_extract_expr (fun, &gsi); > + > if (gsi_end_p (gsi)) > break; > } > diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc > index 303df102d81..23f9309572b 100644 > --- a/gcc/internal-fn.cc > +++ b/gcc/internal-fn.cc > @@ -175,6 +175,7 @@ init_internal_fns () > #define len_store_direct { 3, 3, false } > #define len_maskstore_direct { 4, 5, false } > #define vec_set_direct { 3, 3, false } > +#define vec_extract_direct { 3, 3, false } > #define unary_direct { 0, 0, true } > #define unary_convert_direct { -1, 0, true } > #define binary_direct { 0, 0, true } > @@ -3107,6 +3108,43 @@ expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab) > gcc_unreachable (); > } > > +/* Expand VEC_EXTRACT optab internal function. */ > + > +static void > +expand_vec_extract_optab_fn (internal_fn, gcall *stmt, convert_optab optab) > +{ > + tree lhs = gimple_call_lhs (stmt); > + tree op0 = gimple_call_arg (stmt, 0); > + tree op1 = gimple_call_arg (stmt, 1); > + > + rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); > + > + machine_mode outermode = TYPE_MODE (TREE_TYPE (op0)); > + machine_mode extract_mode = TYPE_MODE (TREE_TYPE (lhs)); > + > + rtx src = expand_normal (op0); > + rtx pos = expand_normal (op1); > + > + class expand_operand ops[3]; > + enum insn_code icode = convert_optab_handler (optab, outermode, > + extract_mode); > + > + if (icode != CODE_FOR_nothing) > + { > + create_output_operand (&ops[0], target, extract_mode); > + create_input_operand (&ops[1], src, outermode); > + create_convert_operand_from (&ops[2], pos, > + TYPE_MODE (TREE_TYPE (op1)), true); > + if (maybe_expand_insn (icode, 3, ops)) > + { > + if (!rtx_equal_p (target, ops[0].value)) > + emit_move_insn (target, ops[0].value); > + return; > + } > + } > + gcc_unreachable (); > +} > + > static void > expand_ABNORMAL_DISPATCHER (internal_fn, gcall *) > { > @@ -3946,6 +3984,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types, > #define direct_mask_fold_left_optab_supported_p direct_optab_supported_p > #define direct_check_ptrs_optab_supported_p direct_optab_supported_p > #define direct_vec_set_optab_supported_p direct_optab_supported_p > +#define direct_vec_extract_optab_supported_p direct_optab_supported_p > > /* Return the optab used by internal function FN. */ > > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def > index 9b73e540d55..238b7ee0bc9 100644 > --- a/gcc/internal-fn.def > +++ b/gcc/internal-fn.def > @@ -180,6 +180,7 @@ DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_cond) > DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask) > > DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set) > +DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, 0, vec_extract, vec_extract) > > DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store) > DEF_INTERNAL_OPTAB_FN (LEN_MASK_STORE, 0, len_maskstore, len_maskstore) > diff --git a/gcc/optabs.cc b/gcc/optabs.cc > index 0a2bcf3620c..4e9f58f8060 100644 > --- a/gcc/optabs.cc > +++ b/gcc/optabs.cc > @@ -4417,6 +4417,30 @@ can_vec_set_var_idx_p (machine_mode vec_mode) > && insn_operand_matches (icode, 2, reg3); > } > > +/* Return whether the backend can emit a vec_extract instruction with > + a non-constant index. */ > +bool > +can_vec_extract_var_idx_p (machine_mode vec_mode, machine_mode extr_mode) > +{ > + if (!VECTOR_MODE_P (vec_mode)) > + return false; > + > + rtx reg1 = alloca_raw_REG (extr_mode, LAST_VIRTUAL_REGISTER + 1); > + rtx reg2 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 2); > + > + enum insn_code icode = convert_optab_handler (vec_extract_optab, > + vec_mode, extr_mode); > + > + const struct insn_data_d *data = &insn_data[icode]; > + machine_mode idx_mode = data->operand[2].mode; > + > + rtx reg3 = alloca_raw_REG (idx_mode, LAST_VIRTUAL_REGISTER + 3); > + > + return icode != CODE_FOR_nothing && insn_operand_matches (icode, 0, reg1) > + && insn_operand_matches (icode, 1, reg2) > + && insn_operand_matches (icode, 2, reg3); > +} > + > /* This function is called when we are going to emit a compare instruction that > compares the values found in X and Y, using the rtl operator COMPARISON. > > diff --git a/gcc/optabs.h b/gcc/optabs.h > index 781d5548be9..c80b7f4dc1b 100644 > --- a/gcc/optabs.h > +++ b/gcc/optabs.h > @@ -259,6 +259,7 @@ extern bool can_vcond_compare_p (enum rtx_code, machine_mode, machine_mode); > /* Return whether the backend can emit vector set instructions for inserting > element into vector at variable index position. */ > extern bool can_vec_set_var_idx_p (machine_mode); > +extern bool can_vec_extract_var_idx_p (machine_mode, machine_mode); > > extern rtx prepare_operand (enum insn_code, rtx, int, machine_mode, > machine_mode, int); >
>> + _4 = vD.2208; >> + _5 = .VEC_EXTRACT (_4, idx_2(D)); >> + _3 = _5; */ > > I think you are doing > > _3 = .VEC_EXTRACT (_4, idx_2(D)); > > and avoiding the SSA name copy correctly. Can you double-check? > > OK with the comment adjusted. Argh, yes, thanks. Regards Robin
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 9648fdc846a..c61602fb04d 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -5074,6 +5074,8 @@ of the result should be stored to memory. Set given field in the vector value. Operand 0 is the vector to modify, operand 1 is new value of field and operand 2 specify the field index. +This pattern is not allowed to @code{FAIL}. + @cindex @code{vec_extract@var{m}@var{n}} instruction pattern @item @samp{vec_extract@var{m}@var{n}} Extract given field from the vector value. Operand 1 is the vector, operand 2 @@ -5081,7 +5083,10 @@ specify field index and operand 0 place to store value into. The @var{n} mode is the mode of the field or vector of fields that should be extracted, should be either element mode of the vector mode @var{m}, or a vector mode with the same element mode and smaller number of elements. -If @var{n} is a vector mode, the index is counted in units of that mode. +If @var{n} is a vector mode the index is counted in multiples of +mode @var{n}. + +This pattern is not allowed to @code{FAIL}. @cindex @code{vec_init@var{m}@var{n}} instruction pattern @item @samp{vec_init@var{m}@var{n}} diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc index ef688ddb57f..d6c560b63dd 100644 --- a/gcc/gimple-isel.cc +++ b/gcc/gimple-isel.cc @@ -42,15 +42,26 @@ along with GCC; see the file COPYING3. If not see /* Expand all ARRAY_REF(VIEW_CONVERT_EXPR) gimple assignments into calls to internal function based on vector type of selected expansion. - i.e.: + + For vec_set: + VIEW_CONVERT_EXPR<int[4]>(u)[_1] = i_4(D); => _7 = u; _8 = .VEC_SET (_7, i_4(D), _1); - u = _8; */ + u = _8; + + For vec_extract: + + _3 = VIEW_CONVERT_EXPR<intD.1[4]>(vD.2208)[idx_2(D)]; + => + _4 = vD.2208; + _5 = .VEC_EXTRACT (_4, idx_2(D)); + _3 = _5; */ static bool -gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi) +gimple_expand_vec_set_extract_expr (struct function *fun, + gimple_stmt_iterator *gsi) { enum tree_code code; gcall *new_stmt = NULL; @@ -62,45 +73,88 @@ gimple_expand_vec_set_expr (struct function *fun, gimple_stmt_iterator *gsi) if (!stmt) return false; + bool is_extract = false; + tree lhs = gimple_assign_lhs (stmt); + tree rhs = gimple_assign_rhs1 (stmt); + tree val, op0; code = TREE_CODE (lhs); - if (code != ARRAY_REF) - return false; + if (code == ARRAY_REF) + { + /* Assume it is a vec_set. */ + val = rhs; + op0 = TREE_OPERAND (lhs, 0); + } + else + { + /* It can still be a vec_extract. */ + code = TREE_CODE (rhs); + if (code != ARRAY_REF) + return false; + + /* Sides are swapped for vec_extract. */ + is_extract = true; + val = lhs; + op0 = TREE_OPERAND (rhs, 0); + } - tree val = gimple_assign_rhs1 (stmt); - tree op0 = TREE_OPERAND (lhs, 0); if (TREE_CODE (op0) == VIEW_CONVERT_EXPR && DECL_P (TREE_OPERAND (op0, 0)) && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0))) && TYPE_MODE (TREE_TYPE (lhs)) == TYPE_MODE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 0))))) { - tree pos = TREE_OPERAND (lhs, 1); + tree pos; + if (!is_extract) + pos = TREE_OPERAND (lhs, 1); + else + pos = TREE_OPERAND (rhs, 1); + tree view_op0 = TREE_OPERAND (op0, 0); machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0)); + machine_mode extract_mode; + if (is_extract) + extract_mode = TYPE_MODE (TREE_TYPE (lhs)); + if (auto_var_in_fn_p (view_op0, fun->decl) - && !TREE_ADDRESSABLE (view_op0) && can_vec_set_var_idx_p (outermode)) + && !TREE_ADDRESSABLE (view_op0) + && ((!is_extract && can_vec_set_var_idx_p (outermode)) + || (is_extract && + can_vec_extract_var_idx_p (outermode, extract_mode)))) { location_t loc = gimple_location (stmt); tree var_src = make_ssa_name (TREE_TYPE (view_op0)); - tree var_dst = make_ssa_name (TREE_TYPE (view_op0)); + tree var_dst; + if (!is_extract) + var_dst = make_ssa_name (TREE_TYPE (view_op0)); + else + var_dst = make_ssa_name (TREE_TYPE (lhs)); ass_stmt = gimple_build_assign (var_src, view_op0); gimple_set_vuse (ass_stmt, gimple_vuse (stmt)); gimple_set_location (ass_stmt, loc); gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT); - new_stmt - = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos); + if (!is_extract) + new_stmt + = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos); + else + new_stmt + = gimple_build_call_internal (IFN_VEC_EXTRACT, 2, var_src, pos); + gimple_call_set_lhs (new_stmt, var_dst); gimple_set_location (new_stmt, loc); gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); - ass_stmt = gimple_build_assign (view_op0, var_dst); + if (!is_extract) + ass_stmt = gimple_build_assign (view_op0, var_dst); + else + ass_stmt = gimple_build_assign (lhs, var_dst); gimple_set_location (ass_stmt, loc); + if (!is_extract) + gimple_move_vops (ass_stmt, stmt); gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT); basic_block bb = gimple_bb (stmt); - gimple_move_vops (ass_stmt, stmt); if (gsi_remove (gsi, true) && gimple_purge_dead_eh_edges (bb)) cfg_changed = true; @@ -317,7 +371,8 @@ gimple_expand_vec_exprs (struct function *fun) gsi_replace (&gsi, g, false); } - cfg_changed |= gimple_expand_vec_set_expr (fun, &gsi); + cfg_changed |= gimple_expand_vec_set_extract_expr (fun, &gsi); + if (gsi_end_p (gsi)) break; } diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 9017176dc7a..cbdb7e64d7c 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -175,6 +175,7 @@ init_internal_fns () #define len_store_direct { 3, 3, false } #define len_maskstore_direct { 4, 3, false } #define vec_set_direct { 3, 3, false } +#define vec_extract_direct { 3, 3, false } #define unary_direct { 0, 0, true } #define unary_convert_direct { -1, 0, true } #define binary_direct { 0, 0, true } @@ -3126,6 +3127,43 @@ expand_vec_set_optab_fn (internal_fn, gcall *stmt, convert_optab optab) gcc_unreachable (); } +/* Expand VEC_EXTRACT optab internal function. */ + +static void +expand_vec_extract_optab_fn (internal_fn, gcall *stmt, convert_optab optab) +{ + tree lhs = gimple_call_lhs (stmt); + tree op0 = gimple_call_arg (stmt, 0); + tree op1 = gimple_call_arg (stmt, 1); + + rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); + + machine_mode outermode = TYPE_MODE (TREE_TYPE (op0)); + machine_mode extract_mode = TYPE_MODE (TREE_TYPE (lhs)); + + rtx src = expand_normal (op0); + rtx pos = expand_normal (op1); + + class expand_operand ops[3]; + enum insn_code icode = convert_optab_handler (optab, outermode, + extract_mode); + + if (icode != CODE_FOR_nothing) + { + create_output_operand (&ops[0], target, extract_mode); + create_input_operand (&ops[1], src, outermode); + create_convert_operand_from (&ops[2], pos, + TYPE_MODE (TREE_TYPE (op1)), true); + if (maybe_expand_insn (icode, 3, ops)) + { + if (!rtx_equal_p (target, ops[0].value)) + emit_move_insn (target, ops[0].value); + return; + } + } + gcc_unreachable (); +} + static void expand_ABNORMAL_DISPATCHER (internal_fn, gcall *) { @@ -3976,6 +4014,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types, #define direct_mask_fold_left_optab_supported_p direct_optab_supported_p #define direct_check_ptrs_optab_supported_p direct_optab_supported_p #define direct_vec_set_optab_supported_p direct_optab_supported_p +#define direct_vec_extract_optab_supported_p direct_optab_supported_p /* Return the optab used by internal function FN. */ diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index bc947c0fde7..032624b1547 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -176,6 +176,7 @@ DEF_INTERNAL_OPTAB_FN (VCONDEQ, 0, vcondeq, vec_cond) DEF_INTERNAL_OPTAB_FN (VCOND_MASK, 0, vcond_mask, vec_cond_mask) DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set) +DEF_INTERNAL_OPTAB_FN (VEC_EXTRACT, 0, vec_extract, vec_extract) DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store) DEF_INTERNAL_OPTAB_FN (LEN_MASK_STORE, 0, len_maskstore, len_maskstore) diff --git a/gcc/optabs.cc b/gcc/optabs.cc index 89ab55b72ae..4440bc505b1 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -4417,6 +4417,30 @@ can_vec_set_var_idx_p (machine_mode vec_mode) && insn_operand_matches (icode, 2, reg3); } +/* Return whether the backend can emit a vec_extract instruction with + a non-constant index. */ +bool +can_vec_extract_var_idx_p (machine_mode vec_mode, machine_mode extr_mode) +{ + if (!VECTOR_MODE_P (vec_mode)) + return false; + + rtx reg1 = alloca_raw_REG (extr_mode, LAST_VIRTUAL_REGISTER + 1); + rtx reg2 = alloca_raw_REG (vec_mode, LAST_VIRTUAL_REGISTER + 2); + + enum insn_code icode = convert_optab_handler (vec_extract_optab, + vec_mode, extr_mode); + + const struct insn_data_d *data = &insn_data[icode]; + machine_mode idx_mode = data->operand[2].mode; + + rtx reg3 = alloca_raw_REG (idx_mode, LAST_VIRTUAL_REGISTER + 3); + + return icode != CODE_FOR_nothing && insn_operand_matches (icode, 0, reg1) + && insn_operand_matches (icode, 1, reg2) + && insn_operand_matches (icode, 2, reg3); +} + /* This function is called when we are going to emit a compare instruction that compares the values found in X and Y, using the rtl operator COMPARISON. diff --git a/gcc/optabs.h b/gcc/optabs.h index 29ccbe9235e..f14d012d92f 100644 --- a/gcc/optabs.h +++ b/gcc/optabs.h @@ -259,6 +259,7 @@ extern bool can_vcond_compare_p (enum rtx_code, machine_mode, machine_mode); /* Return whether the backend can emit vector set instructions for inserting element into vector at variable index position. */ extern bool can_vec_set_var_idx_p (machine_mode); +extern bool can_vec_extract_var_idx_p (machine_mode, machine_mode); extern rtx prepare_operand (enum insn_code, rtx, int, machine_mode, machine_mode, int);