Message ID | 4C18F491.2000406@codesourcery.com |
---|---|
State | New |
Headers | show |
On Wed, Jun 16, 2010 at 5:58 PM, Maxim Kuvyrkov <maxim@codesourcery.com> wrote: > Currently, code hoisting only checks immediately-dominated blocks for > expressions to hoist. I wonder if limiting the search for expressions is > intentional. > > This patch makes code hoisting search through all dominated blocks for > expressions to hoist. And makes the algorithm quadratic in the size of the CFG. You should limit the depth not only to avoid excessive live range lengths but also for corner cases of strangely-formed CFGs. Ciao! Steven
On 06/16/10 10:27, Steven Bosscher wrote: > On Wed, Jun 16, 2010 at 5:58 PM, Maxim Kuvyrkov<maxim@codesourcery.com> wrote: > >> Currently, code hoisting only checks immediately-dominated blocks for >> expressions to hoist. I wonder if limiting the search for expressions is >> intentional. >> >> This patch makes code hoisting search through all dominated blocks for >> expressions to hoist. >> > And makes the algorithm quadratic in the size of the CFG. You should > limit the depth not only to avoid excessive live range lengths but > also for corner cases of strangely-formed CFGs. > Technically true, but we only care about the dominance tree here, not the entire CFG. The change which made code hoisting only look at the immediate dominators was a mistake. It's unfortunate that get_dominated_by only returns immediate dominators -- based on the name, one could reasonably expect to get the full set of dominators which I suspect happened back in 2002 when that change was made. Maxim -- can you test f90-intrinsic-bit.f with and without this change and report back on how the compilation time changes? Thanks, Jeff
On Mon, Jun 21, 2010 at 8:46 PM, Jeff Law <law@redhat.com> wrote: > On 06/16/10 10:27, Steven Bosscher wrote: >> >> On Wed, Jun 16, 2010 at 5:58 PM, Maxim Kuvyrkov<maxim@codesourcery.com> >> wrote: >> >>> >>> Currently, code hoisting only checks immediately-dominated blocks for >>> expressions to hoist. I wonder if limiting the search for expressions is >>> intentional. >>> >>> This patch makes code hoisting search through all dominated blocks for >>> expressions to hoist. >>> >> >> And makes the algorithm quadratic in the size of the CFG. You should >> limit the depth not only to avoid excessive live range lengths but >> also for corner cases of strangely-formed CFGs. >> > > Technically true, but we only care about the dominance tree here, not the > entire CFG. The change which made code hoisting only look at the immediate > dominators was a mistake. It's unfortunate that get_dominated_by only > returns immediate dominators -- based on the name, one could reasonably > expect to get the full set of dominators which I suspect happened back in > 2002 when that change was made. I experimented with a patch similar to Maxim's already 2.5 years ago (and offered to work on it for CS, but there was no interest in this work at the time :-/) See these three Bugzilla comments: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33828#c2 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33828#c8 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33828#c13 Note especially the pessimization in comment #13 of PR33828. Therefore I maintain my objection to this patch. I hope Maxim will address this and the the other issues of PR33828. But of course there is much more benefit to be obtained from finishing the GIMPLE hoisting pass (which I also sent to CS, FWIW). It showed ~1% code size improvement (!) on CSiBE arm-elf when I last toyed with it, even though it is not quite perfect and hoists things it shouldn't... Ciao! Steven
On 06/21/10 13:28, Steven Bosscher wrote: > > I experimented with a patch similar to Maxim's already 2.5 years ago > (and offered to work on it for CS, but there was no interest in this > work at the time :-/) See these three Bugzilla comments: > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33828#c2 > Right. This is precisely the problem with using immediate dominators. This doesn't argue that Maxim's approach is wrong or bad for compile time performance or anything like that. It merely raises the same issue. > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33828#c8 > Clearly something should be done about this. If you have a testcase for Maxim that would be a help. One could argue that adding the REG_EQUAL note created by hoisting to the hash table is a waste of time, fixing that would eliminate this problem. > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33828#c13 > > Note especially the pessimization in comment #13 of PR33828. > Therefore I maintain my objection to this patch. > Clearly you don't want to hoist any higher than the lowest common dominator. Otherwise you unreasonably lengthen lifetimes. Maxim will need to address this problem as well. Jeff
On Mon, Jun 21, 2010 at 10:18 PM, Jeff Law <law@redhat.com> wrote: > Clearly you don't want to hoist any higher than the lowest common dominator. > Otherwise you unreasonably lengthen lifetimes. Maxim will need to address > this problem as well. Yes, and I need to fix this for my GIMPLE hoisting pass also. Do you know of an algorithm for this? Ciao! Steven
On 06/21/10 14:38, Steven Bosscher wrote: > On Mon, Jun 21, 2010 at 10:18 PM, Jeff Law<law@redhat.com> wrote: > >> Clearly you don't want to hoist any higher than the lowest common dominator. >> Otherwise you unreasonably lengthen lifetimes. Maxim will need to address >> this problem as well. >> > Yes, and I need to fix this for my GIMPLE hoisting pass also. Do you > know of an algorithm for this? > I recall seeing a lowest common ancestor algorithm in literature somewhere, so you could run that on the dominator tree I think. Jeff
On Mon, Jun 21, 2010 at 11:35 PM, Jeff Law <law@redhat.com> wrote: > I recall seeing a lowest common ancestor algorithm in literature somewhere, > so you could run that on the dominator tree I think. Yay. Even wikipedia knows about lowest common ancestor algorithms. So there you go: even if one knows a language fairly well, you can still look for something seemingly trivial for two years and not succeed, simply because you just don't know the right terminology. I'm going to play with some of those algorithms right away, so that hopefully GIMPLE hoisting will be ready before the end of stage 1 after all (this was the major blocker for me to continue working on it...). Thanks! Ciao! Steven
On 6/22/10 12:18 AM, Jeff Law wrote: > On 06/21/10 13:28, Steven Bosscher wrote: >> >> I experimented with a patch similar to Maxim's already 2.5 years ago >> (and offered to work on it for CS, but there was no interest in this >> work at the time :-/) See these three Bugzilla comments: >> >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33828#c2 > Right. This is precisely the problem with using immediate dominators. > This doesn't argue that Maxim's approach is wrong or bad for compile > time performance or anything like that. It merely raises the same issue. I agree with Steven that the search is better be constrained, possibly, with a large enough constant. I've added a new parameter and a dominance.c function to return dominated blocks up to depth N in the dominator tree (with N==1 being immediate dominators and N==0 being all dominators). Does this sound OK? ... >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33828#c13 >> >> Note especially the pessimization in comment #13 of PR33828. >> Therefore I maintain my objection to this patch. > Clearly you don't want to hoist any higher than the lowest common > dominator. Otherwise you unreasonably lengthen lifetimes. Maxim will > need to address this problem as well. This can be addressed with a walk over the dominator tree after we compute VBEout. Start with the root and descend in the tree keeping a bitset of expressions that should be alive up the tree. If current node 1. has a single successor, 2. has i'th expression set in VBEout, 3. the successor has i'th expression set in VBEout, 4. current node doesn't generate i'th expression, 5. i'th expression is not marked in the bitset as required up the tree, than we can hoist i'th expression in the successor with the same result as in the current node and not unnecessarily extend live ranges. There maybe a couple more details to the above, but the problem should be easily fixable. I will post second version of the patch in a couple of days. Thank you,
On 06/21/2010 11:50 PM, Steven Bosscher wrote: > On Mon, Jun 21, 2010 at 11:35 PM, Jeff Law<law@redhat.com> wrote: >> I recall seeing a lowest common ancestor algorithm in literature somewhere, >> so you could run that on the dominator tree I think. > > Yay. Even wikipedia knows about lowest common ancestor algorithms. Even dominance.c knows, nearest_common_dominator. :) Paolo
On 6/21/10 10:46 PM, Jeff Law wrote: ... > Technically true, but we only care about the dominance tree here, not > the entire CFG. The change which made code hoisting only look at the > immediate dominators was a mistake. It's unfortunate that > get_dominated_by only returns immediate dominators -- based on the name, > one could reasonably expect to get the full set of dominators which I > suspect happened back in 2002 when that change was made. > > Maxim -- can you test f90-intrinsic-bit.f with and without this change > and report back on how the compilation time changes? Compile time of f90-intrinsic-bit.f doesn't change in any meaningful way. Regards,
diff --git a/gcc/gcse.c b/gcc/gcse.c index 45cab70..a7c7237 100644 --- a/gcc/gcse.c +++ b/gcc/gcse.c @@ -4327,7 +4327,8 @@ hoist_code (void) int found = 0; int insn_inserted_p; - domby = get_dominated_by (CDI_DOMINATORS, bb); + domby = get_all_dominated_blocks (CDI_DOMINATORS, bb); + /* Examine each expression that is very busy at the exit of this block. These are the potentially hoistable expressions. */ for (i = 0; i < hoist_vbeout[bb->index]->n_bits; i++) @@ -4418,7 +4419,11 @@ hoist_code (void) it would be safe to compute it at the start of the dominated block. Now we have to determine if the expression would reach the dominated block if it was - placed at the end of BB. */ + placed at the end of BB. + Note: the fact that hoist_exprs has i-th bit set means + that /some/, not necesserilly all, occurences from + the dominated blocks can be hoisted to BB. Here we check + if a specific occurence can be hoisted to BB. */ if (hoist_expr_reaches_here_p (bb, i, dominated, NULL)) { struct expr *expr = index_map[i]; @@ -4431,6 +4436,12 @@ hoist_code (void) occr = occr->next; gcc_assert (occr); + + /* An occurence might've been already deleted + while processing a dominator of BB. */ + if (occr->deleted_p) + continue; + insn = occr->insn; set = single_set (insn); gcc_assert (set);