Message ID | 226fc0cc-1068-fea9-b45c-b79dc022b5d6@arm.com |
---|---|
State | New |
Headers | show |
Series | PR98974: Fix vectorizable_condition after STMT_VINFO_VEC_STMTS | expand |
"Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com> writes: > Hi, > > As mentioned in the PR, this patch fixes up the nvectors parameter passed to vect_get_loop_mask in vectorizable_condition. > Before the STMT_VINFO_VEC_STMTS rework we used to handle each ncopy separately, now we gather them all at the same time and don't need to multiply vec_num with ncopies. > > The reduced testcase I used to illustrate the issue in the PR gives a warning, if someone knows how to get rid of that (it's Fortran) I'd include it as a testcase for this. Looks like Richi's since posted one. > Bootstrapped and regression tested on aarch64-none-linux-gnu. I don't believe that code triggers for other targets, so not sure it makes sense to test on others? > > Is this OK for trunk? Would you rather wait for the testcase? > > gcc/ChangeLog: > 2021-02-05 Andre Vieira <andre.simoesdiasvieira@arm.com> > > PR middle-end/98974 > * tree-vect-stmts.c (vectorizable_condition): Fix nvectors parameter > for vect_get_loop_mask call. > > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c > index 0bc1cb1c5b4f6c1f0447241b4d31434bf7cca1a4..d07602f6d38f9c51936ac09942599fc5a14f46ab 100644 > --- a/gcc/tree-vect-stmts.c > +++ b/gcc/tree-vect-stmts.c > @@ -10237,8 +10237,7 @@ vectorizable_condition (vec_info *vinfo, > { > unsigned vec_num = vec_oprnds0.length (); > tree loop_mask > - = vect_get_loop_mask (gsi, masks, vec_num * ncopies, > - vectype, i); > + = vect_get_loop_mask (gsi, masks, vec_num, vectype, i); > tree tmp2 = make_ssa_name (vec_cmp_type); > gassign *g > = gimple_build_assign (tmp2, BIT_AND_EXPR, vec_compare, Does removing the shadowed vec_num work? I think that would be less confusing, and means that the calculation stays in sync with the vect_record_loop_mask call. Thanks, Richard
On 05/02/2021 12:47, Richard Sandiford wrote: > "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com> writes: >> Hi, >> >> As mentioned in the PR, this patch fixes up the nvectors parameter passed to vect_get_loop_mask in vectorizable_condition. >> Before the STMT_VINFO_VEC_STMTS rework we used to handle each ncopy separately, now we gather them all at the same time and don't need to multiply vec_num with ncopies. >> >> The reduced testcase I used to illustrate the issue in the PR gives a warning, if someone knows how to get rid of that (it's Fortran) I'd include it as a testcase for this. > Looks like Richi's since posted one. Included it. >> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c >> index 0bc1cb1c5b4f6c1f0447241b4d31434bf7cca1a4..d07602f6d38f9c51936ac09942599fc5a14f46ab 100644 >> --- a/gcc/tree-vect-stmts.c >> +++ b/gcc/tree-vect-stmts.c >> @@ -10237,8 +10237,7 @@ vectorizable_condition (vec_info *vinfo, >> { >> unsigned vec_num = vec_oprnds0.length (); >> tree loop_mask >> - = vect_get_loop_mask (gsi, masks, vec_num * ncopies, >> - vectype, i); >> + = vect_get_loop_mask (gsi, masks, vec_num, vectype, i); >> tree tmp2 = make_ssa_name (vec_cmp_type); >> gassign *g >> = gimple_build_assign (tmp2, BIT_AND_EXPR, vec_compare, > Does removing the shadowed vec_num work? I think that would be less > confusing, and means that the calculation stays in sync with the Yeah that works too. Here's a reworked patch. gcc/ChangeLog: 2021-02-05 Andre Vieira <andre.simoesdiasvieira@arm.com> PR middle-end/98974 * tree-vect-stmts.c (vectorizable_condition): Fix nvectors parameter for vect_get_loop_mask call. gcc/testsuite/ChangeLog: 2021-02-05 Andre Vieira <andre.simoesdiasvieira@arm.com> * gfortran.dg/pr98974.F90: New test. diff --git a/gcc/testsuite/gfortran.dg/pr98974.F90 b/gcc/testsuite/gfortran.dg/pr98974.F90 new file mode 100644 index 0000000000000000000000000000000000000000..b3db6a6654a0b36bc567405c70429a5dbe168d1e --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr98974.F90 @@ -0,0 +1,20 @@ +! PR middle-end/98974 +! { dg-do compile { target { aarch64*-*-* } } } +! { dg-options "-Ofast -mcpu=neoverse-v1" } + +module module_foobar + integer,parameter :: fp_kind = selected_real_kind(15) +contains + subroutine foobar( foo, ix ,jx ,kx,iy,ky) + real, dimension( ix, kx, jx ) :: foo + real(fp_kind), dimension( iy, ky, 3 ) :: bar, baz + do k=1,ky + do i=1,iy + if ( baz(i,k,1) > 0. ) then + bar(i,k,1) = 0 + endif + foo(i,nk,j) = baz0 * bar(i,k,1) + enddo + enddo + end +end diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index 0bc1cb1c5b4f6c1f0447241b4d31434bf7cca1a4..064e5d138ce9a151287662a0caefd9925b0a2920 100644 --- a/gcc/tree-vect-stmts.c +++ b/gcc/tree-vect-stmts.c @@ -10235,7 +10235,6 @@ vectorizable_condition (vec_info *vinfo, if (masks) { - unsigned vec_num = vec_oprnds0.length (); tree loop_mask = vect_get_loop_mask (gsi, masks, vec_num * ncopies, vectype, i);
"Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com> writes: > On 05/02/2021 12:47, Richard Sandiford wrote: >> "Andre Vieira (lists)" <andre.simoesdiasvieira@arm.com> writes: >>> Hi, >>> >>> As mentioned in the PR, this patch fixes up the nvectors parameter passed to vect_get_loop_mask in vectorizable_condition. >>> Before the STMT_VINFO_VEC_STMTS rework we used to handle each ncopy separately, now we gather them all at the same time and don't need to multiply vec_num with ncopies. >>> >>> The reduced testcase I used to illustrate the issue in the PR gives a warning, if someone knows how to get rid of that (it's Fortran) I'd include it as a testcase for this. >> Looks like Richi's since posted one. > Included it. >>> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c >>> index 0bc1cb1c5b4f6c1f0447241b4d31434bf7cca1a4..d07602f6d38f9c51936ac09942599fc5a14f46ab 100644 >>> --- a/gcc/tree-vect-stmts.c >>> +++ b/gcc/tree-vect-stmts.c >>> @@ -10237,8 +10237,7 @@ vectorizable_condition (vec_info *vinfo, >>> { >>> unsigned vec_num = vec_oprnds0.length (); >>> tree loop_mask >>> - = vect_get_loop_mask (gsi, masks, vec_num * ncopies, >>> - vectype, i); >>> + = vect_get_loop_mask (gsi, masks, vec_num, vectype, i); >>> tree tmp2 = make_ssa_name (vec_cmp_type); >>> gassign *g >>> = gimple_build_assign (tmp2, BIT_AND_EXPR, vec_compare, >> Does removing the shadowed vec_num work? I think that would be less >> confusing, and means that the calculation stays in sync with the > Yeah that works too. > > Here's a reworked patch. > > > gcc/ChangeLog: > 2021-02-05 Andre Vieira <andre.simoesdiasvieira@arm.com> > > PR middle-end/98974 > * tree-vect-stmts.c (vectorizable_condition): Fix nvectors > parameter > for vect_get_loop_mask call. > > gcc/testsuite/ChangeLog: > 2021-02-05 Andre Vieira <andre.simoesdiasvieira@arm.com> > > * gfortran.dg/pr98974.F90: New test. > > diff --git a/gcc/testsuite/gfortran.dg/pr98974.F90 b/gcc/testsuite/gfortran.dg/pr98974.F90 > new file mode 100644 > index 0000000000000000000000000000000000000000..b3db6a6654a0b36bc567405c70429a5dbe168d1e > --- /dev/null > +++ b/gcc/testsuite/gfortran.dg/pr98974.F90 > @@ -0,0 +1,20 @@ > +! PR middle-end/98974 > +! { dg-do compile { target { aarch64*-*-* } } } > +! { dg-options "-Ofast -mcpu=neoverse-v1" } Only -mcpu=neoverse-v1 is specific to aarch64*, so I think this should be: ! { dg-do compile } ! { dg-options "-Ofast" } ! { dg-additional-options "-mcpu=neoverse-v1" { target aarch64*-*-* } } OK with that change, thanks. Richard > + > +module module_foobar > + integer,parameter :: fp_kind = selected_real_kind(15) > +contains > + subroutine foobar( foo, ix ,jx ,kx,iy,ky) > + real, dimension( ix, kx, jx ) :: foo > + real(fp_kind), dimension( iy, ky, 3 ) :: bar, baz > + do k=1,ky > + do i=1,iy > + if ( baz(i,k,1) > 0. ) then > + bar(i,k,1) = 0 > + endif > + foo(i,nk,j) = baz0 * bar(i,k,1) > + enddo > + enddo > + end > +end > diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c > index 0bc1cb1c5b4f6c1f0447241b4d31434bf7cca1a4..064e5d138ce9a151287662a0caefd9925b0a2920 100644 > --- a/gcc/tree-vect-stmts.c > +++ b/gcc/tree-vect-stmts.c > @@ -10235,7 +10235,6 @@ vectorizable_condition (vec_info *vinfo, > > if (masks) > { > - unsigned vec_num = vec_oprnds0.length (); > tree loop_mask > = vect_get_loop_mask (gsi, masks, vec_num * ncopies, > vectype, i);
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index 0bc1cb1c5b4f6c1f0447241b4d31434bf7cca1a4..d07602f6d38f9c51936ac09942599fc5a14f46ab 100644 --- a/gcc/tree-vect-stmts.c +++ b/gcc/tree-vect-stmts.c @@ -10237,8 +10237,7 @@ vectorizable_condition (vec_info *vinfo, { unsigned vec_num = vec_oprnds0.length (); tree loop_mask - = vect_get_loop_mask (gsi, masks, vec_num * ncopies, - vectype, i); + = vect_get_loop_mask (gsi, masks, vec_num, vectype, i); tree tmp2 = make_ssa_name (vec_cmp_type); gassign *g = gimple_build_assign (tmp2, BIT_AND_EXPR, vec_compare,