Message ID | 2ea418e8-7e7d-3477-9026-f850a046d463@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | testsuite: Adjust pr96789.c to exclude vect_load_lanes | expand |
"Kewen.Lin" <linkw@linux.ibm.com> writes: > Hi, > > As Lyon pointed out, the newly introduced test case > gcc.dg/tree-ssa/pr96789.c fails on arm-none-linux-gnueabihf. > Loop vectorizer is able to vectorize the two loops which > operate on array tmp with load_lanes feature support. It > makes dse3 get unexpected inputs and do nothing. > > This patch is to teach the case to respect vect_load_lanes, > meanwhile to guard the check only under vect_int. I'm not sure this is the right check. The test passes on aarch64, which also has load lanes, but apparently doesn't use them for this test. I think the way the loop vectoriser handles the loops will depend a lot on target costs, which can vary in unpredictable ways. Does it work if you instead change -ftree-vectorize to -ftree-slp-vectorize? Or does that defeat the purpose of the test? Thanks, Richard > > Is it ok for trunk? > > BR, > Kewen > ----- > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/pr96789.c: Adjusted by excluding vect_load_lanes. > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c > index d6139a014d8..1b89f8b7a6a 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c > @@ -55,4 +55,7 @@ bar (int16_t res[16], uint8_t *val1, uint8_t *val2) > } > } > > -/* { dg-final { scan-tree-dump {Deleted dead store:.*tmp} "dse3" } } */ > +/* Exclude targets which support load_lanes since loop vectorizer > + can vectorize those two loops that operate tmp array so that > + subsequent dse3 will not eliminate tmp stores. */ > +/* { dg-final { scan-tree-dump {Deleted dead store:.*tmp} "dse3" { target { vect_int && { ! vect_load_lanes } } } } } */
Hi Richard, Thanks for the review! on 2020/11/10 下午7:31, Richard Sandiford wrote: > "Kewen.Lin" <linkw@linux.ibm.com> writes: >> Hi, >> >> As Lyon pointed out, the newly introduced test case >> gcc.dg/tree-ssa/pr96789.c fails on arm-none-linux-gnueabihf. >> Loop vectorizer is able to vectorize the two loops which >> operate on array tmp with load_lanes feature support. It >> makes dse3 get unexpected inputs and do nothing. >> >> This patch is to teach the case to respect vect_load_lanes, >> meanwhile to guard the check only under vect_int. > > I'm not sure this is the right check. The test passes on aarch64, > which also has load lanes, but apparently doesn't use them for this > test. I think the way the loop vectoriser handles the loops will > depend a lot on target costs, which can vary in unpredictable ways. > You are right, although aarch64 doesn't have this failure, it can fail with explicit -march=armv8-a+sve. It can vary as target features/costs change. The check is still fragile. Your suggestion with -ftree-slp-vectorize below is better! > Does it work if you instead change -ftree-vectorize to -ftree-slp-vectorize? > Or does that defeat the purpose of the test? It works, nice, thanks for the suggestion! I appended one explicit -fno-tree-loop-vectorize to avoid it to fail in case someone kicks off the testing with explicit -ftree-loop-vectorize. The updated version is pasted below, is it ok for trunk? BR, Kewen ----- gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr96789.c: Adjusted by disabling loop vectorization. diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c index d6139a014d8..5704952309b 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c @@ -1,5 +1,8 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -funroll-loops -ftree-vectorize -fdump-tree-dse-details" } */ +/* Disable loop vectorization to avoid that loop vectorizer + optimizes those two loops that operate tmp array so that + subsequent dse3 won't eliminate expected tmp stores. */ +/* { dg-options "-O2 -funroll-loops -ftree-slp-vectorize -fno-tree-loop-vectorize -fdump-tree-dse-details" } */ /* Test if scalar cleanup pass takes effects, mainly check its secondary pass DSE can remove dead stores on array
"Kewen.Lin" <linkw@linux.ibm.com> writes: > Hi Richard, > > Thanks for the review! > > on 2020/11/10 锟斤拷锟斤拷7:31, Richard Sandiford wrote: >> "Kewen.Lin" <linkw@linux.ibm.com> writes: >>> Hi, >>> >>> As Lyon pointed out, the newly introduced test case >>> gcc.dg/tree-ssa/pr96789.c fails on arm-none-linux-gnueabihf. >>> Loop vectorizer is able to vectorize the two loops which >>> operate on array tmp with load_lanes feature support. It >>> makes dse3 get unexpected inputs and do nothing. >>> >>> This patch is to teach the case to respect vect_load_lanes, >>> meanwhile to guard the check only under vect_int. >> >> I'm not sure this is the right check. The test passes on aarch64, >> which also has load lanes, but apparently doesn't use them for this >> test. I think the way the loop vectoriser handles the loops will >> depend a lot on target costs, which can vary in unpredictable ways. >> > > You are right, although aarch64 doesn't have this failure, it can fail > with explicit -march=armv8-a+sve. It can vary as target features/costs > change. The check is still fragile. > > Your suggestion with -ftree-slp-vectorize below is better! > >> Does it work if you instead change -ftree-vectorize to -ftree-slp-vectorize? >> Or does that defeat the purpose of the test? > > It works, nice, thanks for the suggestion! > > I appended one explicit -fno-tree-loop-vectorize to avoid it to fail > in case someone kicks off the testing with explicit -ftree-loop-vectorize. > > The updated version is pasted below, is it ok for trunk? OK, thanks. Richard > > BR, > Kewen > ----- > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/pr96789.c: Adjusted by disabling loop vectorization. > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c > index d6139a014d8..5704952309b 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c > @@ -1,5 +1,8 @@ > /* { dg-do compile } */ > -/* { dg-options "-O2 -funroll-loops -ftree-vectorize -fdump-tree-dse-details" } */ > +/* Disable loop vectorization to avoid that loop vectorizer > + optimizes those two loops that operate tmp array so that > + subsequent dse3 won't eliminate expected tmp stores. */ > +/* { dg-options "-O2 -funroll-loops -ftree-slp-vectorize -fno-tree-loop-vectorize -fdump-tree-dse-details" } */ > > /* Test if scalar cleanup pass takes effects, mainly check > its secondary pass DSE can remove dead stores on array
On 11/10/20 7:42 PM, Kewen.Lin via Gcc-patches wrote: > Hi Richard, > > Thanks for the review! > > on 2020/11/10 涓嬪崍7:31, Richard Sandiford wrote: >> "Kewen.Lin" <linkw@linux.ibm.com> writes: >>> Hi, >>> >>> As Lyon pointed out, the newly introduced test case >>> gcc.dg/tree-ssa/pr96789.c fails on arm-none-linux-gnueabihf. >>> Loop vectorizer is able to vectorize the two loops which >>> operate on array tmp with load_lanes feature support. It >>> makes dse3 get unexpected inputs and do nothing. >>> >>> This patch is to teach the case to respect vect_load_lanes, >>> meanwhile to guard the check only under vect_int. >> I'm not sure this is the right check. The test passes on aarch64, >> which also has load lanes, but apparently doesn't use them for this >> test. I think the way the loop vectoriser handles the loops will >> depend a lot on target costs, which can vary in unpredictable ways. >> > You are right, although aarch64 doesn't have this failure, it can fail > with explicit -march=armv8-a+sve. It can vary as target features/costs > change. The check is still fragile. > > Your suggestion with -ftree-slp-vectorize below is better! > >> Does it work if you instead change -ftree-vectorize to -ftree-slp-vectorize? >> Or does that defeat the purpose of the test? > It works, nice, thanks for the suggestion! > > I appended one explicit -fno-tree-loop-vectorize to avoid it to fail > in case someone kicks off the testing with explicit -ftree-loop-vectorize. > > The updated version is pasted below, is it ok for trunk? > > BR, > Kewen > ----- > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/pr96789.c: Adjusted by disabling loop vectorization. OK jeff
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c index d6139a014d8..1b89f8b7a6a 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c @@ -55,4 +55,7 @@ bar (int16_t res[16], uint8_t *val1, uint8_t *val2) } } -/* { dg-final { scan-tree-dump {Deleted dead store:.*tmp} "dse3" } } */ +/* Exclude targets which support load_lanes since loop vectorizer + can vectorize those two loops that operate tmp array so that + subsequent dse3 will not eliminate tmp stores. */ +/* { dg-final { scan-tree-dump {Deleted dead store:.*tmp} "dse3" { target { vect_int && { ! vect_load_lanes } } } } } */