From patchwork Mon Nov 14 13:02:00 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tom de Vries X-Patchwork-Id: 125522 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id E7D4FB71FB for ; Tue, 15 Nov 2011 00:05:03 +1100 (EST) Received: (qmail 1636 invoked by alias); 14 Nov 2011 13:05:00 -0000 Received: (qmail 1626 invoked by uid 22791); 14 Nov 2011 13:04:58 -0000 X-SWARE-Spam-Status: No, hits=-1.7 required=5.0 tests=AWL, BAYES_00, TW_CF, TW_TM X-Spam-Check-By: sourceware.org Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 14 Nov 2011 13:04:41 +0000 Received: from svr-orw-exc-10.mgc.mentorg.com ([147.34.98.58]) by relay1.mentorg.com with esmtp id 1RPwDI-0002Ie-V5 from Tom_deVries@mentor.com ; Mon, 14 Nov 2011 05:04:40 -0800 Received: from SVR-IES-FEM-01.mgc.mentorg.com ([137.202.0.104]) by SVR-ORW-EXC-10.mgc.mentorg.com with Microsoft SMTPSVC(6.0.3790.4675); Mon, 14 Nov 2011 05:02:25 -0800 Received: from [127.0.0.1] (137.202.0.76) by SVR-IES-FEM-01.mgc.mentorg.com (137.202.0.104) with Microsoft SMTP Server id 14.1.289.1; Mon, 14 Nov 2011 13:04:38 +0000 Message-ID: <4EC11148.60009@mentor.com> Date: Mon, 14 Nov 2011 14:02:00 +0100 From: Tom de Vries User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110922 Lightning/1.0b2 Thunderbird/3.1.15 MIME-Version: 1.0 To: Richard Guenther , "gcc-patches@gcc.gnu.org" Subject: [PATCH, PR51005] Fix slow down of compilation of 20001226-1.c by -ftree-tail-merge Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Richard, This patch fixes the slow down in the compilation of 20001226-1.c caused by -ftree-tail-merge. -ftree-tree-tail-merge removes half of the 16000 basic blocks, but increases compilation time with a factor 2.4 from 34 seconds to 80 seconds (not counting the further increase of 46 seconds by verify_dominators ifdef ENABLE_CHECKING). The compilation time is due to recalculating the dominator information after each basic block removal. That dominator information is needed to update the vops after each basic block removal, if this will not be done by TODO_update_ssa_only_virtuals. To update the dominator info, iterate_fix_dominators is called, which takes a lot of time. The patch fixes this by delegating the updating of the vops to TODO_update_ssa_only_virtuals, and recalculating dominator info once per iteration. I have attached another patch to the PR: http://gcc.gnu.org/bugzilla/attachment.cgi?id=25816 , which also delegates the updating of the vops to TODO_update_ssa_only_virtuals, but updates dominator info after each cluster. That patch (25816) has a better best-case time than this patch, but also a worse worst-case time. I think we can make a hybrid solution that chooses which approach to take dynamically, and take advantage of both approaches. But with the maximum number of iterations defaulting to 2, the amount of run-time won by a hybrid approach will be limited, and IMHO does not justify the extra complexity. If we try to dynamically limit the amount of iterations, based on the amount of time spent in computing dominator info, the hybrid solution starts to make sense. But getting this right will require quite some tuning. So for now I propose this simple solution. bootstrapped on and reg-tested on x86_64 and i686. Build and reg-tested on MIPS and ARM. ok for trunk? Thanks, - Tom 2011-11-08 Tom de Vries PR tree-optimization/51005 * tree-ssa-tail-merge (delete_basic_block_same_succ): Rename to mark_basic_block_deleted. (update_worklist): Inline purge_bbs. (purge_bbs, unlink_virtual_phi, update_vuses, vop_at_entry) (delete_block_update_dominator_info): Remove. (replace_block_by): Remove update_vops parameter. Partially evaluate for update_vops == false. (apply_clusters): Remove update_vops parameter. Remove update_vops argument in replace_block_by call. (update_debug_stmts): Remove MAY_HAVE_DEBUG_STMTS test. (tail_merge_optimize): Remove update_vops argument to apply_clusters. Remove call to purge_bbs. Add calls to calculate_dominance_info and free_dominance_info. Add MAY_HAVE_DEBUG_STMTS before calling update_debug_stmts. Mark vop var for renaming, if necessary. Index: gcc/tree-ssa-tail-merge.c =================================================================== --- gcc/tree-ssa-tail-merge.c (revision 181081) +++ gcc/tree-ssa-tail-merge.c (working copy) @@ -742,7 +742,7 @@ delete_worklist (void) /* Mark BB as deleted, and mark its predecessors. */ static void -delete_basic_block_same_succ (basic_block bb) +mark_basic_block_deleted (basic_block bb) { edge e; edge_iterator ei; @@ -809,15 +809,6 @@ release_last_vdef (basic_block bb) } -/* Delete all deleted_bbs. */ - -static void -purge_bbs (void) -{ - bitmap_and_compl_into (deleted_bb_preds, deleted_bbs); - bitmap_clear (deleted_bbs); -} - /* For deleted_bb_preds, find bbs with same successors. */ static void @@ -828,6 +819,9 @@ update_worklist (void) basic_block bb; same_succ same; + bitmap_and_compl_into (deleted_bb_preds, deleted_bbs); + bitmap_clear (deleted_bbs); + bitmap_clear_bit (deleted_bb_preds, ENTRY_BLOCK); same_succ_flush_bbs (deleted_bb_preds); @@ -1353,125 +1347,6 @@ find_clusters (void) } } -/* Replace uses of the result of PHI with NAME. */ - -static void -unlink_virtual_phi (gimple phi, tree name) -{ - use_operand_p use_p; - imm_use_iterator iter; - gimple use_stmt; - tree vdef = gimple_phi_result (phi); - - if (!vdef - || TREE_CODE (vdef) != SSA_NAME) - return; - - FOR_EACH_IMM_USE_STMT (use_stmt, iter, vdef) - { - FOR_EACH_IMM_USE_ON_STMT (use_p, iter) - SET_USE (use_p, name); - } - - if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (vdef)) - SSA_NAME_OCCURS_IN_ABNORMAL_PHI (name) = 1; -} - -/* Create or update a vop phi in BB2. Use VUSE1 arguments for all the - REDIRECTED_EDGES, or if VUSE1 is NULL_TREE, use BB_VOP_AT_EXIT. If a new - phis is created, use the phi instead of VUSE2 in BB2. */ - -static void -update_vuses (bool vuse1_phi_args, tree vuse1, tree vuse2, basic_block bb2, - VEC (edge,heap) *redirected_edges) -{ - gimple stmt, phi = NULL; - tree lhs = NULL_TREE, arg, var; - unsigned int i; - gimple def_stmt2 = NULL; - imm_use_iterator iter; - use_operand_p use_p; - edge_iterator ei; - edge e; - - if (vuse2 != NULL_TREE) - { - var = SSA_NAME_VAR (vuse2); - def_stmt2 = SSA_NAME_DEF_STMT (vuse2); - } - else - var = SSA_NAME_VAR (vuse1); - - if (def_stmt2 && gimple_bb (def_stmt2) == bb2) - /* Update existing phi. */ - phi = def_stmt2; - else - { - /* No need to create a phi with 2 equal arguments. */ - if (vuse1 == vuse2) - return; - - /* Create a phi. */ - lhs = make_ssa_name (var, NULL); - VN_INFO_GET (lhs); - phi = create_phi_node (lhs, bb2); - SSA_NAME_DEF_STMT (lhs) = phi; - - /* Set default argument vuse2 for all preds. */ - arg = vuse2 == NULL_TREE ? gimple_phi_result (phi): vuse2; - FOR_EACH_EDGE (e, ei, bb2->preds) - add_phi_arg (phi, arg, e, UNKNOWN_LOCATION); - } - - /* Update phi. */ - for (i = 0; i < EDGE_COUNT (redirected_edges); ++i) - { - e = VEC_index (edge, redirected_edges, i); - if (vuse1_phi_args) - arg = BB_VOP_AT_EXIT (e->src); - else - arg = vuse1 == NULL_TREE ? gimple_phi_result (phi): vuse1; - - add_phi_arg (phi, arg, e, UNKNOWN_LOCATION); - } - - /* Return if we updated an existing phi. */ - if (def_stmt2 && gimple_bb (def_stmt2) == bb2) - return; - - /* Replace relevant uses with the newly created phi. */ - FOR_EACH_IMM_USE_STMT (stmt, iter, vuse2 == NULL_TREE ? vuse1 : vuse2) - { - if (stmt == phi) - continue; - - if (gimple_code (stmt) != GIMPLE_PHI - && !dominated_by_p (CDI_DOMINATORS, gimple_bb (stmt), bb2)) - continue; - - FOR_EACH_IMM_USE_ON_STMT (use_p, iter) - { - if (gimple_code (stmt) == GIMPLE_PHI) - { - unsigned int pred_index = PHI_ARG_INDEX_FROM_USE (use_p); - basic_block pred = EDGE_PRED (gimple_bb (stmt), pred_index)->src; - if (!dominated_by_p (CDI_DOMINATORS, pred, bb2)) - continue; - - if (pred == bb2 && EDGE_COUNT (gimple_bb (stmt)->preds) == 1) - { - gimple_stmt_iterator gsi = gsi_for_stmt (stmt); - unlink_virtual_phi (stmt, lhs); - remove_phi_node (&gsi, true); - break; - } - } - SET_USE (use_p, lhs); - update_stmt (stmt); - } - } -} - /* Returns the vop phi of BB, if any. */ static gimple @@ -1489,176 +1364,19 @@ vop_phi (basic_block bb) return NULL; } -/* Returns the vop state at the entry of BB, if found in BB or a successor - bb. */ - -static tree -vop_at_entry (basic_block bb) -{ - gimple bb_phi, succ_phi; - gimple_stmt_iterator gsi; - gimple stmt; - tree vuse, vdef; - basic_block succ; - - bb_phi = vop_phi (bb); - if (bb_phi != NULL) - return gimple_phi_result (bb_phi); - - for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) - { - stmt = gsi_stmt (gsi); - vuse = gimple_vuse (stmt); - vdef = gimple_vdef (stmt); - if (vuse != NULL_TREE) - return vuse; - if (vdef != NULL_TREE) - return NULL_TREE; - } - - if (EDGE_COUNT (bb->succs) == 0) - return NULL_TREE; - - succ = EDGE_SUCC (bb, 0)->dest; - succ_phi = vop_phi (succ); - return (succ_phi != NULL - ? PHI_ARG_DEF_FROM_EDGE (succ_phi, find_edge (bb, succ)) - : NULL_TREE); -} - -/* Given that all incoming edges of BB1 have been redirected to BB2, delete BB1 - and recompute dominator info. */ - -static void -delete_block_update_dominator_info (basic_block bb1, basic_block bb2) -{ - VEC (basic_block,heap) *fix_dom_bb; - unsigned int i; - basic_block bb, dom; - edge e; - edge_iterator ei; - - /* Consider the following cfg, where A is the direct dominator of I: - - A - / \ - B \ - / \ \ - C D - /| |\ - E F - |\ /| - | x | - |/ \| - G H - \ / - I - - Say E and F are duplicates, and F is removed. The cfg then looks like - this: - - A - / \ - B \ - / \ \ - C D - / \ / \ - E - / \ - G H - \ / - I - - E is now the new direct dominator of I. - - In order to calculate the new dominator info, we take the nearest common - dominator (A) of bb1 (F) and bb2 (E), and get the set of bbs immediately - dominated by it. Some of this set may now be directly dominated by bb2. - - Ideally we would have a means to determine which bbs in the set are now - dominated by bb2, and call set_immediate_dominator for those bbs, but we - don't, so instead we let iterate_fix_dominators figure it out. */ - - /* Add bbs immediately dominated by the most common dominator. */ - dom = nearest_common_dominator (CDI_DOMINATORS, bb1, bb2); - fix_dom_bb = get_dominated_by (CDI_DOMINATORS, dom); - - if (get_immediate_dominator (CDI_DOMINATORS, bb1) == dom) - for (i = 0; VEC_iterate (basic_block, fix_dom_bb, i, bb); ++i) - { - if (bb != bb1) - continue; - VEC_unordered_remove (basic_block, fix_dom_bb, i); - break; - } - - /* Add bb2, but not twice. */ - if (get_immediate_dominator (CDI_DOMINATORS, bb2) != dom) - VEC_safe_push (basic_block, heap, fix_dom_bb, bb2); - /* Add succs of bb2, but not twice. */ - FOR_EACH_EDGE (e, ei, bb2->succs) - if (get_immediate_dominator (CDI_DOMINATORS, e->dest) != dom) - VEC_safe_push (basic_block, heap, fix_dom_bb, e->dest); - - delete_basic_block (bb1); - iterate_fix_dominators (CDI_DOMINATORS, fix_dom_bb, false); -#if defined (ENABLE_CHECKING) - verify_dominators (CDI_DOMINATORS); -#endif - VEC_free (basic_block, heap, fix_dom_bb); -} - -/* Redirect all edges from BB1 to BB2, marks BB1 for removal, and if - UPDATE_VOPS, inserts vop phis. */ +/* Redirect all edges from BB1 to BB2, removes BB1 and marks it as removed. */ static void -replace_block_by (basic_block bb1, basic_block bb2, bool update_vops) +replace_block_by (basic_block bb1, basic_block bb2) { edge pred_edge; unsigned int i; - tree phi_vuse1 = NULL_TREE, phi_vuse2 = NULL_TREE, arg; - VEC (edge,heap) *redirected_edges = NULL; - edge e; - edge_iterator ei; - bool vuse1_phi_args = false; - - phi_vuse2 = vop_at_entry (bb2); - if (phi_vuse2 != NULL_TREE && TREE_CODE (phi_vuse2) != SSA_NAME) - phi_vuse2 = NULL_TREE; - - if (update_vops) - { - /* Find the vops at entry of bb1 and bb2. */ - phi_vuse1 = vop_at_entry (bb1); - - /* If both are not found, it means there's no need to update. Uses old - dominator info. */ - if (phi_vuse1 == NULL_TREE && phi_vuse2 == NULL_TREE) - update_vops = false; - else if (phi_vuse1 == NULL_TREE) - update_vops = dominated_by_p (CDI_DOMINATORS, bb1, bb2); - else if (phi_vuse2 == NULL_TREE) - update_vops = dominated_by_p (CDI_DOMINATORS, bb2, bb1); - } - - if (phi_vuse1 && gimple_bb (SSA_NAME_DEF_STMT (phi_vuse1)) == bb1) - { - /* If the vop at entry of bb1 is a phi, save the phi alternatives in - BB_VOP_AT_EXIT, before we lose that information by redirecting the - edges. */ - FOR_EACH_EDGE (e, ei, bb1->preds) - { - arg = PHI_ARG_DEF_FROM_EDGE (SSA_NAME_DEF_STMT (phi_vuse1), e); - BB_VOP_AT_EXIT (e->src) = arg; - } - vuse1_phi_args = true; - } + gimple bb2_phi; - /* Mark the basic block for later deletion. */ - delete_basic_block_same_succ (bb1); + bb2_phi = vop_phi (bb2); - if (update_vops) - redirected_edges = VEC_alloc (edge, heap, 10); + /* Mark the basic block as deleted. */ + mark_basic_block_deleted (bb1); /* Redirect the incoming edges of bb1 to bb2. */ for (i = EDGE_COUNT (bb1->preds); i > 0 ; --i) @@ -1666,27 +1384,23 @@ replace_block_by (basic_block bb1, basic pred_edge = EDGE_PRED (bb1, i - 1); pred_edge = redirect_edge_and_branch (pred_edge, bb2); gcc_assert (pred_edge != NULL); - if (update_vops) - VEC_safe_push (edge, heap, redirected_edges, pred_edge); - else if (phi_vuse2 && gimple_bb (SSA_NAME_DEF_STMT (phi_vuse2)) == bb2) - add_phi_arg (SSA_NAME_DEF_STMT (phi_vuse2), SSA_NAME_VAR (phi_vuse2), - pred_edge, UNKNOWN_LOCATION); + + if (bb2_phi == NULL) + continue; + + /* The phi might have run out of capacity when the redirect added an + argument, which means it could have been replaced. Refresh it. */ + bb2_phi = vop_phi (bb2); + + add_phi_arg (bb2_phi, SSA_NAME_VAR (gimple_phi_result (bb2_phi)), + pred_edge, UNKNOWN_LOCATION); } /* Do updates that use bb1, before deleting bb1. */ - if (!update_vops) - release_last_vdef (bb1); + release_last_vdef (bb1); same_succ_flush_bb (bb1); - delete_block_update_dominator_info (bb1, bb2); - - /* Update the vops. Uses new dominator info. */ - if (update_vops) - { - update_vuses (vuse1_phi_args, phi_vuse1, phi_vuse2, bb2, - redirected_edges); - VEC_free (edge, heap, redirected_edges); - } + delete_basic_block (bb1); } /* Bbs for which update_debug_stmt need to be called. */ @@ -1694,10 +1409,10 @@ replace_block_by (basic_block bb1, basic static bitmap update_bbs; /* For each cluster in all_clusters, merge all cluster->bbs. Returns - number of bbs removed. Insert vop phis if UPDATE_VOPS. */ + number of bbs removed. */ static int -apply_clusters (bool update_vops) +apply_clusters (void) { basic_block bb1, bb2; bb_cluster c; @@ -1720,7 +1435,7 @@ apply_clusters (bool update_vops) bb1 = BASIC_BLOCK (j); bitmap_clear_bit (update_bbs, bb1->index); - replace_block_by (bb1, bb2, update_vops); + replace_block_by (bb1, bb2); nr_bbs_removed++; } } @@ -1772,9 +1487,6 @@ update_debug_stmts (void) bitmap_iterator bi; unsigned int i; - if (!MAY_HAVE_DEBUG_STMTS) - return; - EXECUTE_IF_SET_IN_BITMAP (update_bbs, 0, i, bi) { gimple stmt; @@ -1800,7 +1512,6 @@ tail_merge_optimize (unsigned int todo) int nr_bbs_removed; bool loop_entered = false; int iteration_nr = 0; - bool update_vops = !symbol_marked_for_renaming (gimple_vop (cfun)); int max_iterations = PARAM_VALUE (PARAM_MAX_TAIL_MERGE_ITERATIONS); if (!flag_tree_tail_merge || max_iterations == 0) @@ -1831,16 +1542,17 @@ tail_merge_optimize (unsigned int todo) if (VEC_empty (bb_cluster, all_clusters)) break; - nr_bbs_removed = apply_clusters (update_vops); + nr_bbs_removed = apply_clusters (); nr_bbs_removed_total += nr_bbs_removed; if (nr_bbs_removed == 0) break; - purge_bbs (); + free_dominance_info (CDI_DOMINATORS); if (iteration_nr == max_iterations) break; + calculate_dominance_info (CDI_DOMINATORS); update_worklist (); } @@ -1850,7 +1562,11 @@ tail_merge_optimize (unsigned int todo) if (nr_bbs_removed_total > 0) { - update_debug_stmts (); + if (MAY_HAVE_DEBUG_STMTS) + { + calculate_dominance_info (CDI_DOMINATORS); + update_debug_stmts (); + } if (dump_file && (dump_flags & TDF_DETAILS)) { @@ -1860,6 +1576,7 @@ tail_merge_optimize (unsigned int todo) todo |= (TODO_verify_ssa | TODO_verify_stmts | TODO_verify_flow | TODO_dump_func); + mark_sym_for_renaming (gimple_vop (cfun)); } delete_worklist ();