Message ID | 20150319195402.GA21421@f1.c.bardezibar.internal |
---|---|
State | New |
Headers | show |
On Thu, Mar 19, 2015 at 8:54 PM, Sebastian Pop <sebpop@gmail.com> wrote: > Richard Biener wrote: >> please instead fixup after copy_bbs in duplicate_seme_region. >> > > Thanks for the review. > Attached patch that does not modify copy_bbs. > Fixes make check in hmmer and make check RUNTESTFLAGS=tree-ssa.exp > > Full bootstrap and regtest in progress on x86_64-linux. Ok for trunk? Looks good to me. Please also add a testcase. Thanks, Richard.
On 03/19/2015 01:54 PM, Sebastian Pop wrote: > Richard Biener wrote: >> please instead fixup after copy_bbs in duplicate_seme_region. >> > > Thanks for the review. > Attached patch that does not modify copy_bbs. > Fixes make check in hmmer and make check RUNTESTFLAGS=tree-ssa.exp > > Full bootstrap and regtest in progress on x86_64-linux. Ok for trunk? I'm going to have to find time tomorrow to wrap this up. I've got a morning full of meetings, then I on a plane for the east coast for the rest of the day. I'm in an exit row seat for the 2nd leg, so there ought to be enough room to open my laptop and poke at this. jeff
On 03/19/15 13:54, Sebastian Pop wrote: > Richard Biener wrote: >> >please instead fixup after copy_bbs in duplicate_seme_region. >> > > Thanks for the review. > Attached patch that does not modify copy_bbs. > Fixes make check in hmmer and make check RUNTESTFLAGS=tree-ssa.exp > > Full bootstrap and regtest in progress on x86_64-linux. Ok for trunk? > > > 0001-diamonds-are-not-valid-execution-threads-for-jump-th.patch > > > From 8f1516235bce3e1c4f359149dcc546d813ed7817 Mon Sep 17 00:00:00 2001 > From: Sebastian Pop<sebpop@gmail.com> > Date: Tue, 17 Mar 2015 20:28:19 +0100 > Subject: [PATCH] diamonds are not valid execution threads for jump threading > > PR tree-optimization/65177 > * tree-ssa-threadupdate.c (verify_seme): Renamed verify_jump_thread. > (bb_in_bbs): New. > (duplicate_seme_region): Renamed duplicate_thread_path. Redirect all > edges not adjacent on the path to the original code. OK for the trunk. Though I think there's some stage1 refactoring that we're going to want to do. Specifically, it seems to me that copy_bbs should be refactored into copy_bbs and copy_bbs_for_threading or somesuch. Where those routines call into refactored common subroutines, but obviously handle wiring up the outgoing edges from the copied blocks differently. The goal would be to eliminate the overly complex block copy/CFG update scheme in tree-ssa-threadupdate.c as part of a larger project to convert to a backward threader that can run independently of DOM. Jeff
Jeff Law wrote: > > PR tree-optimization/65177 > > * tree-ssa-threadupdate.c (verify_seme): Renamed verify_jump_thread. > > (bb_in_bbs): New. > > (duplicate_seme_region): Renamed duplicate_thread_path. Redirect all > > edges not adjacent on the path to the original code. > OK for the trunk. Committed r221675. > Though I think there's some stage1 refactoring that we're going to want to do. Agreed. > Specifically, it seems to me that copy_bbs should be refactored into > copy_bbs and copy_bbs_for_threading or somesuch. Where those > routines call into refactored common subroutines, but obviously > handle wiring up the outgoing edges from the copied blocks > differently. > That would be a good cleanup: I don't like to arbitrarily redirect edges in copy_bbs just to redirect them back to their initial place in the caller. > The goal would be to eliminate the overly complex block copy/CFG > update scheme in tree-ssa-threadupdate.c as part of a larger project > to convert to a backward threader that can run independently of DOM. I have a start of a patch for that cleanup, it currently runs wild as it would replace the existing threadupdate code generator with a call to the new duplicate_thread_path. I think we should take smaller more manageable steps to ease the review and to not destabilize the jump-threader. In particular I think we should have both code generators for a while and turn one on/off with an option. Sebastian
On 03/25/2015 05:09 PM, Sebastian Pop wrote: >> Specifically, it seems to me that copy_bbs should be refactored into >> copy_bbs and copy_bbs_for_threading or somesuch. Where those >> routines call into refactored common subroutines, but obviously >> handle wiring up the outgoing edges from the copied blocks >> differently. >> > > That would be a good cleanup: I don't like to arbitrarily redirect edges in > copy_bbs just to redirect them back to their initial place in the caller. Exactly. > >> The goal would be to eliminate the overly complex block copy/CFG >> update scheme in tree-ssa-threadupdate.c as part of a larger project >> to convert to a backward threader that can run independently of DOM. > > I have a start of a patch for that cleanup, it currently runs wild as it would > replace the existing threadupdate code generator with a call to the new > duplicate_thread_path. I think we should take smaller more manageable steps to > ease the review and to not destabilize the jump-threader. In particular I think > we should have both code generators for a while and turn one on/off with an option. I hadn't planned on supporting both with an option, I'd rather make the switch and not look back :-) An option just adds maintenance burdens (supporting both approaches) and doesn't actually help the end users (though it would help us as developers). Regardless, probably the first step is a common API for the two path duplication approaches. Jeff
On Thu, Mar 26, 2015 at 4:50 PM, Jeff Law <law@redhat.com> wrote: > On 03/25/2015 05:09 PM, Sebastian Pop wrote: >>> >>> Specifically, it seems to me that copy_bbs should be refactored into >>> copy_bbs and copy_bbs_for_threading or somesuch. Where those >>> routines call into refactored common subroutines, but obviously >>> handle wiring up the outgoing edges from the copied blocks >>> differently. >>> >> >> That would be a good cleanup: I don't like to arbitrarily redirect edges >> in >> copy_bbs just to redirect them back to their initial place in the caller. > > Exactly. > >> >>> The goal would be to eliminate the overly complex block copy/CFG >>> update scheme in tree-ssa-threadupdate.c as part of a larger project >>> to convert to a backward threader that can run independently of DOM. >> >> >> I have a start of a patch for that cleanup, it currently runs wild as it >> would >> replace the existing threadupdate code generator with a call to the new >> duplicate_thread_path. I think we should take smaller more manageable >> steps to >> ease the review and to not destabilize the jump-threader. In particular I >> think >> we should have both code generators for a while and turn one on/off with >> an option. > > I hadn't planned on supporting both with an option, I'd rather make the > switch and not look back :-) An option just adds maintenance burdens > (supporting both approaches) and doesn't actually help the end users (though > it would help us as developers). > > Regardless, probably the first step is a common API for the two path > duplication approaches. Yeah, and refactoring copy_bbs so that the actual edge duplication happens in another function (thus we can have two of them). Note we have way too many copy-XYZ APIs/workers on GIMPLE already. I was also playing with the idea to support value-numbering the stmts on-the-fly as we copy them and use this for cost estimation of copies (ok, well - basically do the copy and then either throw it away again or accept it). I thought about applying this to loop unrolling, but obviously this also applies to any block duplication we do during jump threading. Richard. > Jeff
On 03/27/2015 02:53 AM, Richard Biener wrote: > > Yeah, and refactoring copy_bbs so that the actual edge duplication happens > in another function (thus we can have two of them). Exactly. > > I was also playing with the idea to support value-numbering the stmts > on-the-fly as we copy them and use this for cost estimation of copies > (ok, well - basically do the copy and then either throw it away again > or accept it). > I thought about applying this to loop unrolling, but obviously this also applies > to any block duplication we do during jump threading. It'd definitely be useful in threading. It's often the case that PHIs in the duplicates are degenerates and we can/should cprop them away and perform resultant simplifications that propagation enables. If we could do that integrated with the actual duplication, then we'd be able to drop the phi-only cprop pass which exists solely to clean up that stuff. jeff
From 8f1516235bce3e1c4f359149dcc546d813ed7817 Mon Sep 17 00:00:00 2001 From: Sebastian Pop <sebpop@gmail.com> Date: Tue, 17 Mar 2015 20:28:19 +0100 Subject: [PATCH] diamonds are not valid execution threads for jump threading PR tree-optimization/65177 * tree-ssa-threadupdate.c (verify_seme): Renamed verify_jump_thread. (bb_in_bbs): New. (duplicate_seme_region): Renamed duplicate_thread_path. Redirect all edges not adjacent on the path to the original code. --- gcc/tree-ssa-threadupdate.c | 93 +++++++++++++++++++++++++++++++------------ 1 files changed, 67 insertions(+), 26 deletions(-) diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c index 7a159bb..610e807 100644 --- a/gcc/tree-ssa-threadupdate.c +++ b/gcc/tree-ssa-threadupdate.c @@ -2328,36 +2328,32 @@ bb_ends_with_multiway_branch (basic_block bb ATTRIBUTE_UNUSED) return false; } -/* Verify that the REGION is a Single Entry Multiple Exits region: make sure no - edge other than ENTRY is entering the REGION. */ +/* Verify that the REGION is a valid jump thread. A jump thread is a special + case of SEME Single Entry Multiple Exits region in which all nodes in the + REGION have exactly one incoming edge. The only exception is the first block + that may not have been connected to the rest of the cfg yet. */ DEBUG_FUNCTION void -verify_seme (edge entry, basic_block *region, unsigned n_region) +verify_jump_thread (basic_block *region, unsigned n_region) { - bitmap bbs = BITMAP_ALLOC (NULL); - for (unsigned i = 0; i < n_region; i++) - bitmap_set_bit (bbs, region[i]->index); + gcc_assert (EDGE_COUNT (region[i]->preds) <= 1); +} - for (unsigned i = 0; i < n_region; i++) - { - edge e; - edge_iterator ei; - basic_block bb = region[i]; +/* Return true when BB is one of the first N items in BBS. */ - /* All predecessors other than ENTRY->src should be in the region. */ - for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); ei_next (&ei)) - if (e != entry) - gcc_assert (bitmap_bit_p (bbs, e->src->index)); - } +static inline bool +bb_in_bbs (basic_block bb, basic_block *bbs, int n) +{ + for (int i = 0; i < n; i++) + if (bb == bbs[i]) + return true; - BITMAP_FREE (bbs); + return false; } -/* Duplicates a Single Entry Multiple Exit REGION (set of N_REGION basic - blocks). The ENTRY edge is redirected to the duplicate of the region. If - REGION is not a Single Entry region, ignore any incoming edges other than - ENTRY: this makes the copied region a Single Entry region. +/* Duplicates a jump-thread path of N_REGION basic blocks. + The ENTRY edge is redirected to the duplicate of the region. Remove the last conditional statement in the last basic block in the REGION, and create a single fallthru edge pointing to the same destination as the @@ -2369,7 +2365,7 @@ verify_seme (edge entry, basic_block *region, unsigned n_region) Returns false if it is unable to copy the region, true otherwise. */ static bool -duplicate_seme_region (edge entry, edge exit, +duplicate_thread_path (edge entry, edge exit, basic_block *region, unsigned n_region, basic_block *region_copy) { @@ -2428,7 +2424,53 @@ duplicate_seme_region (edge entry, edge exit, } copy_bbs (region, n_region, region_copy, &exit, 1, &exit_copy, loop, - split_edge_bb_loc (entry), 0); + split_edge_bb_loc (entry), false); + + /* Fix up: copy_bbs redirects all edges pointing to copied blocks. The + following code ensures that all the edges exiting the jump-thread path are + redirected back to the original code: these edges are exceptions + invalidating the property that is propagated by executing all the blocks of + the jump-thread path in order. */ + + for (i = 0; i < n_region; i++) + { + edge e; + edge_iterator ei; + basic_block bb = region_copy[i]; + + if (single_succ_p (bb)) + { + /* Make sure the successor is the next node in the path. */ + gcc_assert (i + 1 == n_region + || region_copy[i + 1] == single_succ_edge (bb)->dest); + continue; + } + + /* Special case the last block on the path: make sure that it does not + jump back on the copied path. */ + if (i + 1 == n_region) + { + FOR_EACH_EDGE (e, ei, bb->succs) + if (bb_in_bbs (e->dest, region_copy, n_region - 1)) + { + basic_block orig = get_bb_original (e->dest); + if (orig) + redirect_edge_and_branch_force (e, orig); + } + continue; + } + + /* Redirect all other edges jumping to non-adjacent blocks back to the + original code. */ + FOR_EACH_EDGE (e, ei, bb->succs) + if (region_copy[i + 1] != e->dest) + { + basic_block orig = get_bb_original (e->dest); + if (orig) + redirect_edge_and_branch_force (e, orig); + } + } + if (total_count) { scale_bbs_frequencies_gcov_type (region, n_region, @@ -2445,8 +2487,7 @@ duplicate_seme_region (edge entry, edge exit, } #ifdef ENABLE_CHECKING - /* Make sure no edge other than ENTRY is entering the copied region. */ - verify_seme (entry, region_copy, n_region); + verify_jump_thread (region_copy, n_region); #endif /* Remove the last branch in the jump thread path. */ @@ -2550,7 +2591,7 @@ thread_through_all_blocks (bool may_peel_loop_headers) for (unsigned int j = 0; j < len - 1; j++) region[j] = (*path)[j]->e->dest; - if (duplicate_seme_region (entry, exit, region, len - 1, NULL)) + if (duplicate_thread_path (entry, exit, region, len - 1, NULL)) { /* We do not update dominance info. */ free_dominance_info (CDI_DOMINATORS); -- 1.7.2.5