Message ID | 31e99da8-0b2b-d19d-aeb2-253e7f57ca80@ispras.ru |
---|---|
State | New |
Headers | show |
Series | [modulo-sched] Fix PR92591 | expand |
Hello. > As pointed out in the PR > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92591#c1, the test can be > fixed by DFA-checking more adjacent row sequences in the partial > schedule. > I've found that on powerpc64 gcc.c-torture/execute/pr61682.c test > catches same issue with -Os -fmodulo-sched-allow-regmoves with some > non-zero sms-dfa-history parameter values, so I added that test using > #include as second test into the patch. > > Minor separate patch about modulo-sched parameters is also attached. > If no objection, I'll commit this two patches into trunk tomorrow > together with my PR90001 fix. > > Trunk and 8/9 branches succesfully regstrapped on x64, and > cross-compiler check-gcc tested on ppc, ppc64, arm, aarch64, ia64 and > s390. Certainly a lot of testing were also done with changing default > sms-dfa-history value to some other than zero. I think this should be backported into 9 and 8 branches, because second example gives an ICE there. But I'm not sure about backporting sms-dfa-history upper bound limitation (<=16) into params.def in branches. Compile-time may grow dramatically for huge values like 1000, so we have to limit it. Is it ok to limit the parameter, or maybe it's better to implement some "history=MIN(history, 16)" logic in modulo-sched.c ? I see that sometimes parameter limitation is backported, examples are: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80663 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79576 While at it, maybe you have some thoughts about selected value of 16. Maximum reasonable value for sms-dfa-history param seems to be max latency between two insns on target platform (calculated by dep_cost function in haifa-sched.c). I'm posting full backport patch here, it suits 8/9 branches. Jakub and Richard, is it OK ? Roman Backport from mainline gcc/ChangeLog: 2019-12-17 Roman Zhuykov <zhroma@ispras.ru> * modulo-sched.c (ps_add_node_check_conflicts): Improve checking for history > 0 case. * params.def (sms-dfa-history): Limit to 16. gcc/testsuite/ChangeLog: 2019-12-17 Roman Zhuykov <zhroma@ispras.ru> * gcc.dg/pr92951-1.c: New test. * gcc.dg/pr92951-2.c: New test. diff --git a/gcc/modulo-sched.c b/gcc/modulo-sched.c --- a/gcc/modulo-sched.c +++ b/gcc/modulo-sched.c @@ -3209,7 +3209,7 @@ ps_add_node_check_conflicts (partial_schedule_ptr ps, int n, int c, sbitmap must_precede, sbitmap must_follow) { - int has_conflicts = 0; + int i, first, amount, has_conflicts = 0; ps_insn_ptr ps_i; /* First add the node to the PS, if this succeeds check for @@ -3217,23 +3217,32 @@ ps_add_node_check_conflicts (partial_schedule_ptr ps, int n, if (! (ps_i = add_node_to_ps (ps, n, c, must_precede, must_follow))) return NULL; /* Failed to insert the node at the given cycle. */ - has_conflicts = ps_has_conflicts (ps, c, c) - || (ps->history > 0 - && ps_has_conflicts (ps, - c - ps->history, - c + ps->history)); - - /* Try different issue slots to find one that the given node can be - scheduled in without conflicts. */ - while (has_conflicts) + while (1) { + has_conflicts = ps_has_conflicts (ps, c, c); + if (ps->history > 0 && !has_conflicts) + { + /* Check all 2h+1 intervals, starting from c-2h..c up to c..2h, + but not more than ii intervals. */ + first = c - ps->history; + amount = 2 * ps->history + 1; + if (amount > ps->ii) + amount = ps->ii; + for (i = first; i < first + amount; i++) + { + has_conflicts = ps_has_conflicts (ps, + i - ps->history, + i + ps->history); + if (has_conflicts) + break; + } + } + if (!has_conflicts) + break; + /* Try different issue slots to find one that the given node can be + scheduled in without conflicts. */ if (! ps_insn_advance_column (ps, ps_i, must_follow)) break; - has_conflicts = ps_has_conflicts (ps, c, c) - || (ps->history > 0 - && ps_has_conflicts (ps, - c - ps->history, - c + ps->history)); } if (has_conflicts) diff --git a/gcc/testsuite/gcc.dg/pr92951-1.c b/gcc/testsuite/gcc.dg/pr92951-1.c --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr92951-1.c @@ -0,0 +1,11 @@ +/* PR rtl-optimization/92591 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fmodulo-sched -fweb -fno-dce -fno-ivopts -fno-sched-pressure -fno-tree-loop-distribute-patterns --param sms-dfa-history=1" } */ +/* { dg-additional-options "-mcpu=e500mc" { target { powerpc-*-* } } } */ + +void +wf (char *mr, int tc) +{ + while (tc-- > 0) + *mr++ = 0; +} diff --git a/gcc/testsuite/gcc.dg/pr92951-2.c b/gcc/testsuite/gcc.dg/pr92951-2.c --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr92951-2.c @@ -0,0 +1,5 @@ +/* PR rtl-optimization/92591 */ +/* { dg-do compile } */ +/* { dg-options "-Os -fmodulo-sched -fmodulo-sched-allow-regmoves --param sms-dfa-history=8" } */ + +#include "../gcc.c-torture/execute/pr61682.c" diff --git a/gcc/params.def b/gcc/params.def index e3ad05f..74215f2 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -387,7 +387,7 @@ DEFPARAM(PARAM_SMS_MIN_SC, DEFPARAM(PARAM_SMS_DFA_HISTORY, "sms-dfa-history", "The number of cycles the swing modulo scheduler considers when checking conflicts using DFA.", - 0, 0, 0) + 0, 0, 16) DEFPARAM(PARAM_SMS_LOOP_AVERAGE_COUNT_THRESHOLD, "sms-loop-average-count-threshold", "A threshold on the average loop count considered by the swing modulo scheduler.",
On December 17, 2019 8:40:59 PM GMT+01:00, Roman Zhuykov <zhroma@ispras.ru> wrote: >Hello. > >> As pointed out in the PR >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92591#c1, the test can >be >> fixed by DFA-checking more adjacent row sequences in the partial >> schedule. >> I've found that on powerpc64 gcc.c-torture/execute/pr61682.c test >> catches same issue with -Os -fmodulo-sched-allow-regmoves with some >> non-zero sms-dfa-history parameter values, so I added that test using >> #include as second test into the patch. >> >> Minor separate patch about modulo-sched parameters is also attached. >> If no objection, I'll commit this two patches into trunk tomorrow >> together with my PR90001 fix. >> >> Trunk and 8/9 branches succesfully regstrapped on x64, and >> cross-compiler check-gcc tested on ppc, ppc64, arm, aarch64, ia64 and >> s390. Certainly a lot of testing were also done with changing default >> sms-dfa-history value to some other than zero. > >I think this should be backported into 9 and 8 branches, because second > >example gives an ICE there. But I'm not sure about backporting >sms-dfa-history upper bound limitation (<=16) into params.def in >branches. Compile-time may grow dramatically for huge values like >1000, >so we have to limit it. Is it ok to limit the parameter, or maybe it's > >better to implement some "history=MIN(history, 16)" logic in >modulo-sched.c ? > >I see that sometimes parameter limitation is backported, examples are: >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80663 >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79576 > >While at it, maybe you have some thoughts about selected value of 16. >Maximum reasonable value for sms-dfa-history param seems to be max >latency between two insns on target platform (calculated by dep_cost >function in haifa-sched.c). > >I'm posting full backport patch here, it suits 8/9 branches. Jakub and >Richard, is it OK ? Ok with me. Richard. >Roman > >Backport from mainline >gcc/ChangeLog: > >2019-12-17 Roman Zhuykov <zhroma@ispras.ru> > > * modulo-sched.c (ps_add_node_check_conflicts): Improve checking > for history > 0 case. > * params.def (sms-dfa-history): Limit to 16. > >gcc/testsuite/ChangeLog: > >2019-12-17 Roman Zhuykov <zhroma@ispras.ru> > > * gcc.dg/pr92951-1.c: New test. > * gcc.dg/pr92951-2.c: New test. > > >diff --git a/gcc/modulo-sched.c b/gcc/modulo-sched.c >--- a/gcc/modulo-sched.c >+++ b/gcc/modulo-sched.c >@@ -3209,7 +3209,7 @@ ps_add_node_check_conflicts (partial_schedule_ptr > >ps, int n, > int c, sbitmap must_precede, > sbitmap must_follow) > { >- int has_conflicts = 0; >+ int i, first, amount, has_conflicts = 0; > ps_insn_ptr ps_i; > > /* First add the node to the PS, if this succeeds check for >@@ -3217,23 +3217,32 @@ ps_add_node_check_conflicts >(partial_schedule_ptr ps, int n, > if (! (ps_i = add_node_to_ps (ps, n, c, must_precede, must_follow))) > return NULL; /* Failed to insert the node at the given cycle. */ > >- has_conflicts = ps_has_conflicts (ps, c, c) >- || (ps->history > 0 >- && ps_has_conflicts (ps, >- c - ps->history, >- c + ps->history)); >- >- /* Try different issue slots to find one that the given node can be >- scheduled in without conflicts. */ >- while (has_conflicts) >+ while (1) > { >+ has_conflicts = ps_has_conflicts (ps, c, c); >+ if (ps->history > 0 && !has_conflicts) >+ { >+ /* Check all 2h+1 intervals, starting from c-2h..c up to c..2h, >+ but not more than ii intervals. */ >+ first = c - ps->history; >+ amount = 2 * ps->history + 1; >+ if (amount > ps->ii) >+ amount = ps->ii; >+ for (i = first; i < first + amount; i++) >+ { >+ has_conflicts = ps_has_conflicts (ps, >+ i - ps->history, >+ i + ps->history); >+ if (has_conflicts) >+ break; >+ } >+ } >+ if (!has_conflicts) >+ break; >+ /* Try different issue slots to find one that the given node can > >be >+ scheduled in without conflicts. */ > if (! ps_insn_advance_column (ps, ps_i, must_follow)) > break; >- has_conflicts = ps_has_conflicts (ps, c, c) >- || (ps->history > 0 >- && ps_has_conflicts (ps, >- c - ps->history, >- c + ps->history)); > } > > if (has_conflicts) >diff --git a/gcc/testsuite/gcc.dg/pr92951-1.c >b/gcc/testsuite/gcc.dg/pr92951-1.c >--- /dev/null >+++ b/gcc/testsuite/gcc.dg/pr92951-1.c >@@ -0,0 +1,11 @@ >+/* PR rtl-optimization/92591 */ >+/* { dg-do compile } */ >+/* { dg-options "-O2 -fmodulo-sched -fweb -fno-dce -fno-ivopts >-fno-sched-pressure -fno-tree-loop-distribute-patterns --param >sms-dfa-history=1" } */ >+/* { dg-additional-options "-mcpu=e500mc" { target { powerpc-*-* } } } > >*/ >+ >+void >+wf (char *mr, int tc) >+{ >+ while (tc-- > 0) >+ *mr++ = 0; >+} >diff --git a/gcc/testsuite/gcc.dg/pr92951-2.c >b/gcc/testsuite/gcc.dg/pr92951-2.c >--- /dev/null >+++ b/gcc/testsuite/gcc.dg/pr92951-2.c >@@ -0,0 +1,5 @@ >+/* PR rtl-optimization/92591 */ >+/* { dg-do compile } */ >+/* { dg-options "-Os -fmodulo-sched -fmodulo-sched-allow-regmoves >--param sms-dfa-history=8" } */ >+ >+#include "../gcc.c-torture/execute/pr61682.c" >diff --git a/gcc/params.def b/gcc/params.def >index e3ad05f..74215f2 100644 >--- a/gcc/params.def >+++ b/gcc/params.def >@@ -387,7 +387,7 @@ DEFPARAM(PARAM_SMS_MIN_SC, > DEFPARAM(PARAM_SMS_DFA_HISTORY, > "sms-dfa-history", > "The number of cycles the swing modulo scheduler considers when >checking conflicts using DFA.", >- 0, 0, 0) >+ 0, 0, 16) > DEFPARAM(PARAM_SMS_LOOP_AVERAGE_COUNT_THRESHOLD, > "sms-loop-average-count-threshold", > "A threshold on the average loop count considered by the swing modulo > >scheduler.",
diff --git a/gcc/modulo-sched.c b/gcc/modulo-sched.c --- a/gcc/modulo-sched.c +++ b/gcc/modulo-sched.c @@ -3200,7 +3200,7 @@ ps_add_node_check_conflicts (partial_schedule_ptr ps, int n, int c, sbitmap must_precede, sbitmap must_follow) { - int has_conflicts = 0; + int i, first, amount, has_conflicts = 0; ps_insn_ptr ps_i; /* First add the node to the PS, if this succeeds check for @@ -3208,23 +3208,32 @@ ps_add_node_check_conflicts (partial_schedule_ptr ps, int n, if (! (ps_i = add_node_to_ps (ps, n, c, must_precede, must_follow))) return NULL; /* Failed to insert the node at the given cycle. */ - has_conflicts = ps_has_conflicts (ps, c, c) - || (ps->history > 0 - && ps_has_conflicts (ps, - c - ps->history, - c + ps->history)); - - /* Try different issue slots to find one that the given node can be - scheduled in without conflicts. */ - while (has_conflicts) + while (1) { + has_conflicts = ps_has_conflicts (ps, c, c); + if (ps->history > 0 && !has_conflicts) + { + /* Check all 2h+1 intervals, starting from c-2h..c up to c..2h, + but not more than ii intervals. */ + first = c - ps->history; + amount = 2 * ps->history + 1; + if (amount > ps->ii) + amount = ps->ii; + for (i = first; i < first + amount; i++) + { + has_conflicts = ps_has_conflicts (ps, + i - ps->history, + i + ps->history); + if (has_conflicts) + break; + } + } + if (!has_conflicts) + break; + /* Try different issue slots to find one that the given node can be + scheduled in without conflicts. */ if (! ps_insn_advance_column (ps, ps_i, must_follow)) break; - has_conflicts = ps_has_conflicts (ps, c, c) - || (ps->history > 0 - && ps_has_conflicts (ps, - c - ps->history, - c + ps->history)); } if (has_conflicts) diff --git a/gcc/testsuite/gcc.dg/pr92951-1.c b/gcc/testsuite/gcc.dg/pr92951-1.c --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr92951-1.c @@ -0,0 +1,11 @@ +/* PR rtl-optimization/92591 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fmodulo-sched -fweb -fno-dce -fno-ivopts -fno-sched-pressure -fno-tree-loop-distribute-patterns --param sms-dfa-history=1" } */ +/* { dg-additional-options "-mcpu=e500mc" { target { powerpc-*-* } } } */ + +void +wf (char *mr, int tc) +{ + while (tc-- > 0) + *mr++ = 0; +} diff --git a/gcc/testsuite/gcc.dg/pr92951-2.c b/gcc/testsuite/gcc.dg/pr92951-2.c --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr92951-2.c @@ -0,0 +1,5 @@ +/* PR rtl-optimization/92591 */ +/* { dg-do compile } */ +/* { dg-options "-Os -fmodulo-sched -fmodulo-sched-allow-regmoves --param sms-dfa-history=8" } */ + +#include "../gcc.c-torture/execute/pr61682.c"