
vect: Fix inconsistency in fully-masked lane-reducing op generation [PR116985]

Message ID LV2PR01MB7839DC176FDEAAACDCA31924F77A2@LV2PR01MB7839.prod.exchangelabs.com
State New

Commit Message

Feng Xue OS Oct. 12, 2024, 7:11 a.m. UTC
To align vectorized defs and uses when a lane-reducing op is present in a loop
reduction, we may need to insert extra trivial pass-through copies. The index of
a lane-reducing vector copy then no longer matches the loop mask index. Fix this
by computing the mask index from a new counter that is advanced only for
effective lane-reducing vector copies.
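
The indexing issue can be illustrated with a small standalone model. This is
only a sketch, not GCC code: mask_use, masks_by_copy_index and
masks_by_counter are hypothetical names, and the vector of flags stands in for
"copy i is an effective lane-reducing statement" versus "copy i is a trivial
pass-through".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only; none of these names exist in GCC.
// Records which mask index an effective vector copy would request.
struct mask_use
{
  std::size_t copy;  // position of the vector copy in the loop over copies
  std::size_t mask;  // loop mask index requested for that copy
};

// Scheme before the patch: the copy index doubles as the mask index.
std::vector<mask_use>
masks_by_copy_index (const std::vector<bool> &effective)
{
  std::vector<mask_use> uses;
  for (std::size_t i = 0; i < effective.size (); ++i)
    if (effective[i])
      uses.push_back ({i, i});
  return uses;
}

// Scheme after the patch: a separate counter that is advanced only when an
// effective copy actually requests a mask.
std::vector<mask_use>
masks_by_counter (const std::vector<bool> &effective)
{
  std::vector<mask_use> uses;
  std::size_t mask_index = 0;
  for (std::size_t i = 0; i < effective.size (); ++i)
    if (effective[i])
      {
        // The counter can never run ahead of the copy index.
        assert (mask_index <= i);
        uses.push_back ({i, mask_index++});
      }
  return uses;
}
```

In this model, with the flags {false, true, false, true}, the first function
requests mask indices 1 and 3, while the second requests 0 and 1, which stays
within the number of effective copies.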

Thanks,
Feng
---
gcc/
        PR tree-optimization/116985
        * tree-vect-loop.cc (vect_transform_reduction): Compute loop mask
        index based on effective vector copies for reduction op.
---
 gcc/tree-vect-loop.cc | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Richard Biener Oct. 12, 2024, 12:12 p.m. UTC | #1
On Sat, Oct 12, 2024 at 9:12 AM Feng Xue OS <fxue@os.amperecomputing.com> wrote:
>
> To align vectorized def/use when lane-reducing op is present in loop reduction,
> we may need to insert extra trivial pass-through copies, which would cause
> mismatch between lane-reducing vector copy and loop mask index. This could be
> fixed by computing the right index around a new counter on effective lane-
> reducing vector copies.

OK, but can you add the reduced testcase from the PR in a way that it ICEs
before and not after the patch?

Thanks,
Richard.

> Thanks,
> Feng
> ---
> gcc/
>         PR tree-optimization/116985
>         * tree-vect-loop.cc (vect_transform_reduction): Compute loop mask
>         index based on effective vector copies for reduction op.
> ---
>  gcc/tree-vect-loop.cc | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index ade72a5124f..025442aabc3 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -8916,6 +8916,7 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
>
>    bool emulated_mixed_dot_prod = vect_is_emulated_mixed_dot_prod (stmt_info);
>    unsigned num = vec_oprnds[reduc_index == 0 ? 1 : 0].length ();
> +  unsigned mask_index = 0;
>
>    for (unsigned i = 0; i < num; ++i)
>      {
> @@ -8954,7 +8955,8 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
>               std::swap (vop[0], vop[1]);
>             }
>           tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks,
> -                                         vec_num * ncopies, vectype_in, i);
> +                                         vec_num * ncopies, vectype_in,
> +                                         mask_index++);
>           gcall *call = gimple_build_call_internal (cond_fn, 4, mask,
>                                                     vop[0], vop[1], vop[0]);
>           new_temp = make_ssa_name (vec_dest, call);
> @@ -8971,7 +8973,8 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
>           if (masked_loop_p && mask_by_cond_expr)
>             {
>               tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks,
> -                                             vec_num * ncopies, vectype_in, i);
> +                                             vec_num * ncopies, vectype_in,
> +                                             mask_index++);
>               build_vect_cond_expr (code, vop, mask, gsi);
>             }
>
> --
> 2.17.1
>
>
>
Feng Xue OS Oct. 12, 2024, 3:07 p.m. UTC | #2
Added.

Thanks,
Feng

Patch

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index ade72a5124f..025442aabc3 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -8916,6 +8916,7 @@  vect_transform_reduction (loop_vec_info loop_vinfo,

   bool emulated_mixed_dot_prod = vect_is_emulated_mixed_dot_prod (stmt_info);
   unsigned num = vec_oprnds[reduc_index == 0 ? 1 : 0].length ();
+  unsigned mask_index = 0;

   for (unsigned i = 0; i < num; ++i)
     {
@@ -8954,7 +8955,8 @@  vect_transform_reduction (loop_vec_info loop_vinfo,
              std::swap (vop[0], vop[1]);
            }
          tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks,
-                                         vec_num * ncopies, vectype_in, i);
+                                         vec_num * ncopies, vectype_in,
+                                         mask_index++);
          gcall *call = gimple_build_call_internal (cond_fn, 4, mask,
                                                    vop[0], vop[1], vop[0]);
          new_temp = make_ssa_name (vec_dest, call);
@@ -8971,7 +8973,8 @@  vect_transform_reduction (loop_vec_info loop_vinfo,
          if (masked_loop_p && mask_by_cond_expr)
            {
              tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks,
-                                             vec_num * ncopies, vectype_in, i);
+                                             vec_num * ncopies, vectype_in,
+                                             mask_index++);
              build_vect_cond_expr (code, vop, mask, gsi);
            }