Message ID | ZMNiIpP4cwLGlRSN@kam.mff.cuni.cz |
---|---|
State | New |
Headers | show |
Series | loop-split improvements, part 1 | expand |
On Fri, Jul 28, 2023 at 8:38 AM Jan Hubicka via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > Hi, > while looking on profile misupdate on hmmer I noticed that loop splitting pass is not > able to handle the loop it has as an example it should apply on: > > One transformation of loops like: > > for (i = 0; i < 100; i++) > { > if (i < 50) > A; > else > B; > } > > into: > > for (i = 0; i < 50; i++) > { > A; > } > for (; i < 100; i++) > { > B; > } > > The problem is that ivcanon turns the test into i != 100 and the pass > explicitly gives up on any loops ending with != test. It needs to know > the directoin of the induction variable in order to derive right conditions, > but that can be done also from step. > > It turns out that there are no testcases for basic loop splitting. I will add > some with the profile update fix. > > There are other issues, like VRP will turn i < 99 into i == 99 based on > value range which also makes the pass to give up. > > Bootstrapped/regtested x86_64-linux, OK? OK. > gcc/ChangeLog: > > * tree-ssa-loop-split.cc (split_loop): Also support NE driven > loops when IV test is not overflowing. > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/ifc-12.c: Disable loop splitting. > * gcc.target/i386/avx2-gather-6.c: Likewise. > * gcc.target/i386/avx2-vect-aggressive.c: Likewise. > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c > index 9468c070489..bedf29c7dbc 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-Ofast -fdump-tree-ifcvt-stats-blocks-details" } */ > +/* { dg-options "-Ofast -fdump-tree-ifcvt-stats-blocks-details -fno-split-loops" } */ > /* { dg-require-visibility "" } */ > > struct st > diff --git a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c > index b9119581ae2..47a95dbe989 100644 > --- a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c > +++ b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O3 -mavx2 -fno-common -fdump-tree-vect-details -mtune=skylake" } */ > +/* { dg-options "-O3 -mavx2 -fno-common -fdump-tree-vect-details -mtune=skylake -fno-split-loops" } */ > > #include "avx2-gather-5.c" > > diff --git a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c > index 57192791857..fa336e70e84 100644 > --- a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c > +++ b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c > @@ -1,6 +1,6 @@ > /* { dg-do run } */ > /* { dg-require-effective-target avx2 } */ > -/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1" } */ > +/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1 -fno-split-loops" } */ > > #include "avx2-check.h" > #define N 64 > diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc > index b41b5e614c2..27780370d85 100644 > --- a/gcc/tree-ssa-loop-split.cc > +++ b/gcc/tree-ssa-loop-split.cc > @@ -540,10 +545,17 @@ split_loop (class loop *loop1) > || !empty_block_p (loop1->latch) > || !easy_exit_values (loop1) > || !number_of_iterations_exit (loop1, exit1, &niter, false, true) > - || niter.cmp == ERROR_MARK > - /* We can't yet handle loops controlled by a != predicate. */ > - || niter.cmp == NE_EXPR) > + || niter.cmp == ERROR_MARK) > return false; > + if (niter.cmp == NE_EXPR) > + { > + if (!niter.control.no_overflow) > + return false; > + if (tree_int_cst_sign_bit (niter.control.step) > 0) > + niter.cmp = GT_EXPR; > + else > + niter.cmp = LT_EXPR; > + } > > bbs = get_loop_body (loop1); >
On Thu, Jul 27, 2023 at 11:38 PM Jan Hubicka via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > Hi, > while looking on profile misupdate on hmmer I noticed that loop splitting pass is not > able to handle the loop it has as an example it should apply on: > > One transformation of loops like: > > for (i = 0; i < 100; i++) > { > if (i < 50) > A; > else > B; > } > > into: > > for (i = 0; i < 50; i++) > { > A; > } > for (; i < 100; i++) > { > B; > } > > The problem is that ivcanon turns the test into i != 100 and the pass > explicitly gives up on any loops ending with != test. It needs to know > the directoin of the induction variable in order to derive right conditions, > but that can be done also from step. > > It turns out that there are no testcases for basic loop splitting. I will add > some with the profile update fix. Thanks for doing this. Here are bug reports you should look at after all of your loop splitting patches are done: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37239 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77689 Thanks, Andrew > > There are other issues, like VRP will turn i < 99 into i == 99 based on > value range which also makes the pass to give up. > > Bootstrapped/regtested x86_64-linux, OK? > > gcc/ChangeLog: > > * tree-ssa-loop-split.cc (split_loop): Also support NE driven > loops when IV test is not overflowing. > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/ifc-12.c: Disable loop splitting. > * gcc.target/i386/avx2-gather-6.c: Likewise. > * gcc.target/i386/avx2-vect-aggressive.c: Likewise. > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c > index 9468c070489..bedf29c7dbc 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-Ofast -fdump-tree-ifcvt-stats-blocks-details" } */ > +/* { dg-options "-Ofast -fdump-tree-ifcvt-stats-blocks-details -fno-split-loops" } */ > /* { dg-require-visibility "" } */ > > struct st > diff --git a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c > index b9119581ae2..47a95dbe989 100644 > --- a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c > +++ b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O3 -mavx2 -fno-common -fdump-tree-vect-details -mtune=skylake" } */ > +/* { dg-options "-O3 -mavx2 -fno-common -fdump-tree-vect-details -mtune=skylake -fno-split-loops" } */ > > #include "avx2-gather-5.c" > > diff --git a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c > index 57192791857..fa336e70e84 100644 > --- a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c > +++ b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c > @@ -1,6 +1,6 @@ > /* { dg-do run } */ > /* { dg-require-effective-target avx2 } */ > -/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1" } */ > +/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1 -fno-split-loops" } */ > > #include "avx2-check.h" > #define N 64 > diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc > index b41b5e614c2..27780370d85 100644 > --- a/gcc/tree-ssa-loop-split.cc > +++ b/gcc/tree-ssa-loop-split.cc > @@ -540,10 +545,17 @@ split_loop (class loop *loop1) > || !empty_block_p (loop1->latch) > || !easy_exit_values (loop1) > || !number_of_iterations_exit (loop1, exit1, &niter, false, true) > - || niter.cmp == ERROR_MARK > - /* We can't yet handle loops controlled by a != predicate. */ > - || niter.cmp == NE_EXPR) > + || niter.cmp == ERROR_MARK) > return false; > + if (niter.cmp == NE_EXPR) > + { > + if (!niter.control.no_overflow) > + return false; > + if (tree_int_cst_sign_bit (niter.control.step) > 0) > + niter.cmp = GT_EXPR; > + else > + niter.cmp = LT_EXPR; > + } > > bbs = get_loop_body (loop1); >
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c index 9468c070489..bedf29c7dbc 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-Ofast -fdump-tree-ifcvt-stats-blocks-details" } */ +/* { dg-options "-Ofast -fdump-tree-ifcvt-stats-blocks-details -fno-split-loops" } */ /* { dg-require-visibility "" } */ struct st diff --git a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c index b9119581ae2..47a95dbe989 100644 --- a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c +++ b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O3 -mavx2 -fno-common -fdump-tree-vect-details -mtune=skylake" } */ +/* { dg-options "-O3 -mavx2 -fno-common -fdump-tree-vect-details -mtune=skylake -fno-split-loops" } */ #include "avx2-gather-5.c" diff --git a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c index 57192791857..fa336e70e84 100644 --- a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c +++ b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c @@ -1,6 +1,6 @@ /* { dg-do run } */ /* { dg-require-effective-target avx2 } */ -/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1" } */ +/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1 -fno-split-loops" } */ #include "avx2-check.h" #define N 64 diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc index b41b5e614c2..27780370d85 100644 --- a/gcc/tree-ssa-loop-split.cc +++ b/gcc/tree-ssa-loop-split.cc @@ -540,10 +545,17 @@ split_loop (class loop *loop1) || !empty_block_p (loop1->latch) || !easy_exit_values (loop1) || !number_of_iterations_exit (loop1, exit1, &niter, false, true) - || niter.cmp == ERROR_MARK - /* We can't yet handle loops controlled by a != predicate. */ - || niter.cmp == NE_EXPR) + || niter.cmp == ERROR_MARK) return false; + if (niter.cmp == NE_EXPR) + { + if (!niter.control.no_overflow) + return false; + if (tree_int_cst_sign_bit (niter.control.step) > 0) + niter.cmp = GT_EXPR; + else + niter.cmp = LT_EXPR; + } bbs = get_loop_body (loop1);