Message ID | 508683d0-ab11-c1dd-7a27-1f734328e0c4@linux.ibm.com |
---|---|
State | New |
Series | testsuite: Adjust possibly fragile slp-perm-9.c [PR104015] |
on 2022/1/18 11:06 AM, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> As discussed in PR104015, the test case slp-perm-9.c can be
> fragile when vectorizer tries to use different vectorisation
> strategies.
>
> As Richard suggested, this patch tries to make the check not
> sensitive on the re-trying times by removing the times checking.
> To still retain the test coverage on unnecessary re-trying, for
> example this exposed PR104015 on Power9, I added two test cases
> to powerpc test bucket.
>
> Tested on x86_64-redhat-linux, aarch64-linux-gnu and
> powerpc64-linux-gnu Power8 and powerpc64le-linux-gnu
> Power9/Power10.
>
> Is it ok for trunk?
>
> BR,
> Kewen
> -----
> gcc/testsuite/ChangeLog:
>
> 	PR tree-optimization/104015
> 	* gcc.dg/vect/slp-perm-9.c: Adjust.
> 	* gcc.target/powerpc/pr104015-1.c: New test.
> 	* gcc.target/powerpc/pr104015-2.c: New test.

One updated version is attached to modify pr104015-2.c slightly by
using more clear required effective target lp64.

Tested as before.

BR,
Kewen

 gcc/testsuite/gcc.dg/vect/slp-perm-9.c        |  4 +--
 gcc/testsuite/gcc.target/powerpc/pr104015-1.c | 28 ++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr104015-2.c | 29 +++++++++++++++++++
 3 files changed, 58 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr104015-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr104015-2.c

diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-9.c b/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
index 873eddf223e..154c00af598 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
@@ -61,9 +61,7 @@ int main (int argc, const char* argv[])
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { { vect_perm_short || vect32 } || vect_load_lanes } } } } */
 /* We don't try permutes with a group size of 3 for variable-length
    vectors. */
-/* { dg-final { scan-tree-dump-times "permutation requires at least three vectors" 1 "vect" { target { vect_perm_short && { { ! vect_perm3_short } && { ! vect_partial_vectors_usage_1 } } } xfail vect_variable_length } } } */
-/* Try to vectorize the epilogue using partial vectors. */
-/* { dg-final { scan-tree-dump-times "permutation requires at least three vectors" 2 "vect" { target { vect_perm_short && { { ! vect_perm3_short } && vect_partial_vectors_usage_1 } } xfail vect_variable_length } } } */
+/* { dg-final { scan-tree-dump "permutation requires at least three vectors" "vect" { target { vect_perm_short && { ! vect_perm3_short } } xfail vect_variable_length } } } */
 /* { dg-final { scan-tree-dump-not "permutation requires at least three vectors" "vect" { target vect_perm3_short } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { { ! { vect_perm3_short || vect32 } } || vect_load_lanes } } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { { vect_perm3_short || vect32 } && { ! vect_load_lanes } } } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr104015-1.c b/gcc/testsuite/gcc.target/powerpc/pr104015-1.c
new file mode 100644
index 00000000000..895c243aaf8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr104015-1.c
@@ -0,0 +1,28 @@
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power9 -O2 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+/* As PR104015, we don't expect vectorizer will re-try some vector modes
+   for epilogues on Power9, since Power9 doesn't support partial vector
+   by defaut. */
+
+#include <stdarg.h>
+#define N 200
+
+void __attribute__((noinline))
+foo (unsigned short *__restrict__ pInput, unsigned short *__restrict__ pOutput)
+{
+  unsigned short i, a, b, c;
+
+  for (i = 0; i < N / 3; i++)
+    {
+      a = *pInput++;
+      b = *pInput++;
+      c = *pInput++;
+
+      *pOutput++ = a + b + c + 3;
+      *pOutput++ = a + b + c + 12;
+      *pOutput++ = a + b + c + 1;
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "Re-trying epilogue analysis with vector mode" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr104015-2.c b/gcc/testsuite/gcc.target/powerpc/pr104015-2.c
new file mode 100644
index 00000000000..ab482b11629
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr104015-2.c
@@ -0,0 +1,29 @@
+/* { dg-require-effective-target power10_ok } */
+/* Vector with length instructions lxvl/stxvl are only enabled for 64 bit. */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+/* Power10 support partial vector for epilogue by default, it's expected
+   vectorizer would re-try for it once. */
+
+#include <stdarg.h>
+#define N 200
+
+void __attribute__((noinline))
+foo (unsigned short *__restrict__ pInput, unsigned short *__restrict__ pOutput)
+{
+  unsigned short i, a, b, c;
+
+  for (i = 0; i < N / 3; i++)
+    {
+      a = *pInput++;
+      b = *pInput++;
+      c = *pInput++;
+
+      *pOutput++ = a + b + c + 3;
+      *pOutput++ = a + b + c + 12;
+      *pOutput++ = a + b + c + 1;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "Re-trying epilogue analysis with vector mode" 1 "vect" } } */

"Kewen.Lin" <linkw@linux.ibm.com> writes:
> on 2022/1/18 11:06 AM, Kewen.Lin via Gcc-patches wrote:
>> Hi,
>>
>> As discussed in PR104015, the test case slp-perm-9.c can be
>> fragile when vectorizer tries to use different vectorisation
>> strategies.
>>
>> As Richard suggested, this patch tries to make the check not
>> sensitive on the re-trying times by removing the times checking.
>> To still retain the test coverage on unnecessary re-trying, for
>> example this exposed PR104015 on Power9, I added two test cases
>> to powerpc test bucket.
>>
>> Tested on x86_64-redhat-linux, aarch64-linux-gnu and
>> powerpc64-linux-gnu Power8 and powerpc64le-linux-gnu
>> Power9/Power10.
>>
>> Is it ok for trunk?
>>
>> BR,
>> Kewen
>> -----
>> gcc/testsuite/ChangeLog:
>>
>> 	PR tree-optimization/104015
>> 	* gcc.dg/vect/slp-perm-9.c: Adjust.
>> 	* gcc.target/powerpc/pr104015-1.c: New test.
>> 	* gcc.target/powerpc/pr104015-2.c: New test.
>
> One updated version is attached to modify pr104015-2.c slightly by
> using more clear required effective target lp64.
>
> Tested as before.
>
> BR,
> Kewen

OK for the target-independent part, thanks. IMO it's OK independently
of the rs6000 tests.

Richard
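To illustrate why dropping the count makes the slp-perm-9.c check less fragile, here is a minimal sketch in C. It is not how the testsuite implements scan-tree-dump or scan-tree-dump-times (those match regular expressions against the dump file); the helper names count_occurrences, check_present and check_times, and the use of plain substring matching, are assumptions made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count non-overlapping occurrences of NEEDLE in DUMP.  */
size_t
count_occurrences (const char *dump, const char *needle)
{
  /* An empty pattern would never advance the search position.  */
  assert (needle[0] != '\0');

  size_t count = 0;
  size_t len = strlen (needle);

  for (const char *p = strstr (dump, needle); p != NULL;
       p = strstr (p + len, needle))
    count++;

  return count;
}

/* Presence-only check, similar in spirit to scan-tree-dump:
   succeeds no matter how often the message is printed.  */
int
check_present (const char *dump, const char *needle)
{
  return count_occurrences (dump, needle) > 0;
}

/* Exact-count check, similar in spirit to scan-tree-dump-times:
   succeeds only if the message is printed exactly TIMES times.  */
int
check_times (const char *dump, const char *needle, size_t times)
{
  return count_occurrences (dump, needle) == times;
}
```

With a dump in which the message appears twice (for example, once more when another analysis attempt is made), check_present still succeeds while check_times (dump, msg, 1) fails. That dependence on the number of attempts is what the adjusted directive no longer has.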
On Tue, Jan 18, 2022 at 11:57:32AM +0000, Richard Sandiford wrote:
> "Kewen.Lin" <linkw@linux.ibm.com> writes:
> >> 	PR tree-optimization/104015
> >> 	* gcc.dg/vect/slp-perm-9.c: Adjust.
> >> 	* gcc.target/powerpc/pr104015-1.c: New test.
> >> 	* gcc.target/powerpc/pr104015-2.c: New test.

> OK for the target-independent part, thanks. IMO it's OK independently
> of the rs6000 tests.

The rs6000 parts are fine as well. Thanks!

I see you got rid of the ilp32 tests, I was going to holler about that,
there is no reason this should only work (or only be tested) on 64-bit
systems :-)

Segher
on 2022/1/19 5:34 AM, Segher Boessenkool wrote:
> On Tue, Jan 18, 2022 at 11:57:32AM +0000, Richard Sandiford wrote:
>> "Kewen.Lin" <linkw@linux.ibm.com> writes:
>>>> 	PR tree-optimization/104015
>>>> 	* gcc.dg/vect/slp-perm-9.c: Adjust.
>>>> 	* gcc.target/powerpc/pr104015-1.c: New test.
>>>> 	* gcc.target/powerpc/pr104015-2.c: New test.
>
>> OK for the target-independent part, thanks. IMO it's OK independently
>> of the rs6000 tests.
>
> The rs6000 parts are fine as well. Thanks!
>
> I see you got rid of the ilp32 tests, I was going to holler about that,
> there is no reason this should only work (or only be tested) on 64-bit
> systems :-)
>

Thanks Richard and Segher, committed as r12-6717.

BR,
Kewen
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-9.c b/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
index 873eddf223e..154c00af598 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
@@ -61,9 +61,7 @@ int main (int argc, const char* argv[])
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { { vect_perm_short || vect32 } || vect_load_lanes } } } } */
 /* We don't try permutes with a group size of 3 for variable-length
    vectors. */
-/* { dg-final { scan-tree-dump-times "permutation requires at least three vectors" 1 "vect" { target { vect_perm_short && { { ! vect_perm3_short } && { ! vect_partial_vectors_usage_1 } } } xfail vect_variable_length } } } */
-/* Try to vectorize the epilogue using partial vectors. */
-/* { dg-final { scan-tree-dump-times "permutation requires at least three vectors" 2 "vect" { target { vect_perm_short && { { ! vect_perm3_short } && vect_partial_vectors_usage_1 } } xfail vect_variable_length } } } */
+/* { dg-final { scan-tree-dump "permutation requires at least three vectors" "vect" { target { vect_perm_short && { ! vect_perm3_short } } xfail vect_variable_length } } } */
 /* { dg-final { scan-tree-dump-not "permutation requires at least three vectors" "vect" { target vect_perm3_short } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { { ! { vect_perm3_short || vect32 } } || vect_load_lanes } } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { { vect_perm3_short || vect32 } && { ! vect_load_lanes } } } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr104015-1.c b/gcc/testsuite/gcc.target/powerpc/pr104015-1.c
new file mode 100644
index 00000000000..895c243aaf8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr104015-1.c
@@ -0,0 +1,28 @@
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power9 -O2 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+/* As PR104015, we don't expect vectorizer will re-try some vector modes
+   for epilogues on Power9, since Power9 doesn't support partial vector
+   by defaut. */
+
+#include <stdarg.h>
+#define N 200
+
+void __attribute__((noinline))
+foo (unsigned short *__restrict__ pInput, unsigned short *__restrict__ pOutput)
+{
+  unsigned short i, a, b, c;
+
+  for (i = 0; i < N / 3; i++)
+    {
+      a = *pInput++;
+      b = *pInput++;
+      c = *pInput++;
+
+      *pOutput++ = a + b + c + 3;
+      *pOutput++ = a + b + c + 12;
+      *pOutput++ = a + b + c + 1;
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "Re-trying epilogue analysis with vector mode" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr104015-2.c b/gcc/testsuite/gcc.target/powerpc/pr104015-2.c
new file mode 100644
index 00000000000..1b66a64f47c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr104015-2.c
@@ -0,0 +1,28 @@
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+/* Power10 support partial vector for epilogue by default, it's expected
+   vectorizer would re-try for it once. */
+
+#include <stdarg.h>
+#define N 200
+
+void __attribute__((noinline))
+foo (unsigned short *__restrict__ pInput, unsigned short *__restrict__ pOutput)
+{
+  unsigned short i, a, b, c;
+
+  for (i = 0; i < N / 3; i++)
+    {
+      a = *pInput++;
+      b = *pInput++;
+      c = *pInput++;
+
+      *pOutput++ = a + b + c + 3;
+      *pOutput++ = a + b + c + 12;
+      *pOutput++ = a + b + c + 1;
+    }
+}
+
+/* Vector with length instructions lxvl/stxvl are only enabled for 64 bit. */
+/* { dg-final { scan-tree-dump-times "Re-trying epilogue analysis with vector mode" 1 "vect" {target { ! ilp32 } } } } */
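The two powerpc tests above only scan the vectorizer dump and contain no main, so they do not check what foo computes. For readers who want to see the kernel's behaviour, the following is a minimal sketch of a scalar reference check. The kernel is copied from the patch; the helper check_foo, the input pattern and the 0xffff sentinel are hypothetical and not part of the committed tests.

```c
#include <stddef.h>

#define N 200

/* Kernel copied from the tests in the patch so that the sketch is
   self-contained.  */
void __attribute__((noinline))
foo (unsigned short *__restrict__ pInput, unsigned short *__restrict__ pOutput)
{
  unsigned short i, a, b, c;

  for (i = 0; i < N / 3; i++)
    {
      a = *pInput++;
      b = *pInput++;
      c = *pInput++;

      *pOutput++ = a + b + c + 3;
      *pOutput++ = a + b + c + 12;
      *pOutput++ = a + b + c + 1;
    }
}

/* Hypothetical helper: run foo on a fixed pattern and return the number
   of output elements that differ from a scalar reference.  */
int
check_foo (void)
{
  unsigned short input[N], output[N];
  int mismatches = 0;

  for (size_t i = 0; i < N; i++)
    {
      input[i] = (unsigned short) (i * 7 + 1); /* arbitrary test pattern */
      output[i] = 0xffff;                      /* sentinel for untouched slots */
    }

  foo (input, output);

  /* N / 3 == 66 groups of three, so elements 0..197 are written.  */
  for (size_t g = 0; g < N / 3; g++)
    {
      unsigned int sum = input[3 * g] + input[3 * g + 1] + input[3 * g + 2];

      mismatches += output[3 * g] != (unsigned short) (sum + 3);
      mismatches += output[3 * g + 1] != (unsigned short) (sum + 12);
      mismatches += output[3 * g + 2] != (unsigned short) (sum + 1);
    }

  /* The last N % 3 == 2 elements must still hold the sentinel.  */
  for (size_t i = (N / 3) * 3; i < N; i++)
    mismatches += output[i] != 0xffff;

  return mismatches;
}
```

A caller would expect check_foo () to return 0 whether or not the loop was vectorized, since vectorization should not change the results.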