Message ID | mptjzzzlalx.fsf@arm.com |
---|---|
State | New |
Series | vect: Fix voluntarily-masked negative conditionals [PR108430] |
On Thu, Mar 2, 2023 at 11:19 AM Richard Sandiford via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> vectorizable_condition checks whether a COND_EXPR condition is used
> elsewhere with a loop mask.  If so, it applies the loop mask to the
> COND_EXPR too, to reduce the number of live masks and to increase the
> chance of combining the AND with the comparison.
>
> There is also code to do this for inverted conditions.  E.g. if
> we have a < b ? c : d and something else is conditional on !(a < b)
> (such as a load in d), we use !(a < b) ? d : c and apply the loop
> mask to !(a < b).
>
> This inversion relied on the function's bitop1/bitop2 mechanism.
> However, that mechanism is skipped if the condition is split out of
> the COND_EXPR as a separate statement.  This meant that we could end
> up using the inverse of the intended condition.
>
> There is a separate way of negating the condition when a mask
> is being applied (which is also used for EXTRACT_LAST reductions).
> This patch uses that instead.
>
> As well as the testcase, this fixes aarch64/sve/vcond_{4,17}_run.c.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

OK.

> Richard
>
>
> gcc/
>         PR tree-optimization/108430
>         * tree-vect-stmts.cc (vectorizable_condition): Fix handling
>         of inverted condition.
>
> gcc/testsuite/
>         PR tree-optimization/108430
>         * gcc.target/aarch64/sve/pr108430.c: New test.
> ---
>  .../gcc.target/aarch64/sve/pr108430.c         | 21 +++++++++++++++++++
>  gcc/tree-vect-stmts.cc                        |  3 +--
>  2 files changed, 22 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr108430.c
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr108430.c b/gcc/testsuite/gcc.target/aarch64/sve/pr108430.c
> new file mode 100644
> index 00000000000..e7ce0f6d793
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr108430.c
> @@ -0,0 +1,21 @@
> +/* { dg-do run { target aarch64_sve512_hw } } */
> +/* { dg-options "-O3 -msve-vector-bits=512" } */
> +
> +long d = 1;
> +static int i = 37;
> +static unsigned long a[22];
> +static unsigned short c[22];
> +static unsigned g[80];
> +static unsigned short *h = c;
> +static unsigned long *j = a;
> +int main() {
> +  for (long m = 0; m < 8; ++m)
> +    d = 1;
> +  for (unsigned char p = 0; p < 17; p += 2)
> +    {
> +      long t = h[p] ? i : j[p];
> +      g[p] = t;
> +    }
> +  if (g[0])
> +    __builtin_abort ();
> +}
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 9e5ffbe252e..77ad8b78506 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -10756,11 +10756,10 @@ vectorizable_condition (vec_info *vinfo,
>               cond.code = orig_code;
>               if (loop_vinfo->scalar_cond_masked_set.contains (cond))
>                 {
> -                 bitop1 = orig_code;
> -                 bitop2 = BIT_NOT_EXPR;
>                   masks = &LOOP_VINFO_MASKS (loop_vinfo);
>                   cond_code = cond.code;
>                   swap_cond_operands = true;
> +                 must_invert_cmp_result = true;
>                 }
>             }
>         }
> --
> 2.25.1
>
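For readers following the thread, the inverted-condition transform described in the patch can be modelled in scalar C. This is a minimal sketch under stated assumptions, not GCC code: the names select_plain, select_swapped, demo_mismatches and LANES are hypothetical, and the `invert` flag only stands in for whether the comparison result really gets negated before the loop mask is applied.

```c
#include <stdbool.h>
#include <stddef.h>

#define LANES 8 /* hypothetical vector length for this model */

/* Reference semantics of the scalar statement: a < b ? c : d.  */
static void
select_plain (const int *a, const int *b, const int *c, const int *d,
              int *out)
{
  for (size_t i = 0; i < LANES; ++i)
    out[i] = a[i] < b[i] ? c[i] : d[i];
}

/* Swapped form: the select operands are exchanged and the loop mask is
   ANDed with the comparison.  INVERT models whether the comparison is
   actually negated first (an assumption of this sketch, standing in for
   the negation step discussed in the thread).  */
static void
select_swapped (const int *a, const int *b, const int *c, const int *d,
                const bool *loop_mask, bool invert, int *out)
{
  for (size_t i = 0; i < LANES; ++i)
    {
      bool cmp = a[i] < b[i];
      if (invert)
        cmp = !cmp;
      bool masked = loop_mask[i] && cmp;
      out[i] = masked ? d[i] : c[i];
    }
}

/* Count the active lanes in which the swapped form disagrees with the
   reference.  Inactive lanes are ignored, as their results are unused.  */
int
demo_mismatches (bool invert)
{
  static const int a[LANES] = { 0, 5, 2, 9, 4, 4, 7, 1 };
  static const int b[LANES] = { 3, 3, 3, 3, 3, 3, 3, 3 };
  static const bool loop_mask[LANES]
    = { true, true, true, true, true, true, false, false };
  int c[LANES], d[LANES], want[LANES], got[LANES];

  for (size_t i = 0; i < LANES; ++i)
    {
      c[i] = 100 + (int) i;
      d[i] = 200 + (int) i;
    }

  select_plain (a, b, c, d, want);
  select_swapped (a, b, c, d, loop_mask, invert, got);

  int mismatches = 0;
  for (size_t i = 0; i < LANES; ++i)
    if (loop_mask[i] && want[i] != got[i])
      ++mismatches;
  return mismatches;
}
```

With these inputs, demo_mismatches (true) returns 0, while demo_mismatches (false) returns 6: swapping the select operands without negating the comparison selects the wrong arm in every active lane, which is the kind of wrong-code result the patch description attributes to the skipped inversion.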