Message ID | LV2PR01MB7839DC176FDEAAACDCA31924F77A2@LV2PR01MB7839.prod.exchangelabs.com |
---|---|
State | New |
Series | vect: Fix inconsistency in fully-masked lane-reducing op generation [PR116985] |
On Sat, Oct 12, 2024 at 9:12 AM Feng Xue OS <fxue@os.amperecomputing.com> wrote:
>
> To align vectorized def/use when lane-reducing op is present in loop reduction,
> we may need to insert extra trivial pass-through copies, which would cause
> mismatch between lane-reducing vector copy and loop mask index. This could be
> fixed by computing the right index around a new counter on effective lane-
> reducing vector copies.

OK, but can you add the reduced testcase from the PR in a way that it ICEs
before and not after the patch?

Thanks,
Richard.

> Thanks,
> Feng
> ---
> gcc/
>         PR tree-optimization/116985
>         * tree-vect-loop.cc (vect_transform_reduction): Compute loop mask
>         index based on effective vector copies for reduction op.
> ---
>  gcc/tree-vect-loop.cc | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index ade72a5124f..025442aabc3 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -8916,6 +8916,7 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
>
>    bool emulated_mixed_dot_prod = vect_is_emulated_mixed_dot_prod (stmt_info);
>    unsigned num = vec_oprnds[reduc_index == 0 ? 1 : 0].length ();
> +  unsigned mask_index = 0;
>
>    for (unsigned i = 0; i < num; ++i)
>      {
> @@ -8954,7 +8955,8 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
>                std::swap (vop[0], vop[1]);
>              }
>            tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks,
> -                                          vec_num * ncopies, vectype_in, i);
> +                                          vec_num * ncopies, vectype_in,
> +                                          mask_index++);
>            gcall *call = gimple_build_call_internal (cond_fn, 4, mask,
>                                                      vop[0], vop[1], vop[0]);
>            new_temp = make_ssa_name (vec_dest, call);
> @@ -8971,7 +8973,8 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
>            if (masked_loop_p && mask_by_cond_expr)
>              {
>                tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks,
> -                                              vec_num * ncopies, vectype_in, i);
> +                                              vec_num * ncopies, vectype_in,
> +                                              mask_index++);
>                build_vect_cond_expr (code, vop, mask, gsi);
>              }
>
> --
> 2.17.1
>
>
>
Added.

Thanks,
Feng
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index ade72a5124f..025442aabc3 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -8916,6 +8916,7 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
 
   bool emulated_mixed_dot_prod = vect_is_emulated_mixed_dot_prod (stmt_info);
   unsigned num = vec_oprnds[reduc_index == 0 ? 1 : 0].length ();
+  unsigned mask_index = 0;
 
   for (unsigned i = 0; i < num; ++i)
     {
@@ -8954,7 +8955,8 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
               std::swap (vop[0], vop[1]);
             }
           tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks,
-                                          vec_num * ncopies, vectype_in, i);
+                                          vec_num * ncopies, vectype_in,
+                                          mask_index++);
           gcall *call = gimple_build_call_internal (cond_fn, 4, mask,
                                                     vop[0], vop[1], vop[0]);
           new_temp = make_ssa_name (vec_dest, call);
@@ -8971,7 +8973,8 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
           if (masked_loop_p && mask_by_cond_expr)
             {
               tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks,
-                                              vec_num * ncopies, vectype_in, i);
+                                              vec_num * ncopies, vectype_in,
+                                              mask_index++);
               build_vect_cond_expr (code, vop, mask, gsi);
             }
 
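
For readers unfamiliar with the code, the indexing change can be pictured with a small standalone model. This is only a sketch and not GCC code: `VecCopy`, `mask_indices_by_loop_counter`, `mask_indices_by_separate_counter` and `all_in_range` are hypothetical names, and the layout of pass-through copies in the example is assumed purely for illustration. It contrasts indexing the loop masks with the loop counter `i` against indexing them with a counter that advances only on effective lane-reducing copies.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one vector copy visited by the transform loop:
// either an effective lane-reducing op or a trivial pass-through copy.
struct VecCopy {
  bool is_pass_through;
};

// Mask indices requested when the loop counter itself is the mask index.
std::vector<unsigned> mask_indices_by_loop_counter(
    const std::vector<VecCopy>& copies) {
  std::vector<unsigned> used;
  for (unsigned i = 0; i < copies.size(); ++i) {
    if (copies[i].is_pass_through)
      continue;  // no masked statement is generated for a pass-through copy
    used.push_back(i);
  }
  return used;
}

// Mask indices requested when a separate counter is advanced only for
// effective copies, in the spirit of `mask_index++` in the patch.
std::vector<unsigned> mask_indices_by_separate_counter(
    const std::vector<VecCopy>& copies) {
  std::vector<unsigned> used;
  unsigned mask_index = 0;
  for (unsigned i = 0; i < copies.size(); ++i) {
    if (copies[i].is_pass_through)
      continue;
    assert(mask_index <= i);  // the counter never runs ahead of the loop
    used.push_back(mask_index++);
  }
  return used;
}

// True when every requested index refers to one of `num_masks` loop masks.
bool all_in_range(const std::vector<unsigned>& indices, unsigned num_masks) {
  for (unsigned idx : indices)
    if (idx >= num_masks)
      return false;
  return true;
}
```

In this model, four copies of which two are pass-through and only two loop masks exist would request indices 0 and 2 under the loop-counter scheme, but 0 and 1 under the separate-counter scheme, so only the latter stays within the available masks.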