From patchwork Fri Nov 22 10:09:30 2013
X-Patchwork-Submitter: Jeff Law
X-Patchwork-Id: 293384
Message-ID: <528F2D5A.7030902@redhat.com>
Date: Fri, 22 Nov 2013 03:09:30 -0700
From: Jeff Law
To: gcc-patches
Subject: [PATCH] Remove many restrictions for identifying jump threads across backedges

It's amazing what 10 years away from a hunk of code will do...

We have long had an arbitrary restriction in the jump thread identification
code that made it very conservative when searching for threading
opportunities once it encountered a back edge.  I've always known the problem
was pollution of the tables with equivalences that were no longer valid once
we cross the backedge, but I'd never *really* looked at the problem.  I'd
just had a sneaking suspicion the problems were related to the equivalences
we record for the true/false arms of conditionals, PHIs and maybe other
lurkers.
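To make the setting concrete, a loop of roughly the following shape is enough
to produce the situation (an illustrative source fragment, not taken from the
patch or the testsuite):

  int
  f (int *a, int n)
  {
    int x = 0;                    /* x0 */
    for (int i = 0; i < n; i++)
      if (a[i] > 0)
        x = a[i];                 /* x1, defined on the path that loops back */
    return x;
  }

The loop header's GIMPLE then contains a PHI whose backedge argument is
defined later in the loop body.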
So for example, if we have a PHI

  x2 = PHI <x0 (bb2), x1 (bb3)>;
  [ ... ]
  x1 = ...

where the incoming edge from bb3 is the backedge, we will record a temporary
equivalence x2 = x1.  Once x1 gets reassigned, we can no longer use that
equivalence.

So, once we have traversed a backedge, we record in one bitmap the PHI
arguments used to create equivalences and in another bitmap the LHS of each
such PHI.  If we later see an assignment to anything recorded in the
PHI-argument bitmap, we iterate over the recorded LHS names to see if any of
them is currently equivalent to the name that just changed; if so, we
eliminate the equivalence.  This is probably somewhat imprecise, but it's
good enough.  (For illustration, a small standalone model of this
bookkeeping, separate from the patch itself, appears after the diff below.)

We also have problems with expressions in the hash tables.  These are pretty
trivial to solve -- the hash tables are only used by the SIMPLIFY callback,
so we can just use a dummy SIMPLIFY callback once we've passed a backedge in
the CFG.

Finally, for assignments that set an SSA_NAME to a value that isn't
particularly useful, we need to record *something* in the equivalence tables
(a NULL_TREE is good).  That effectively invalidates any equivalences with
that name.  This is useful if we had used a conditional to derive a
single-point range for an SSA_NAME but the SSA_NAME's assignment statement
doesn't provide such range precision.

This patch has been bootstrapped and tested in various forms over the last
couple of weeks.  However, I made some simplifications late tonight, so it
needs to go through another spin; a bootstrap has completed and the
regression test is in progress.  I will commit in the AM if everything is
clean and a final once-over doesn't cause something to jump out as wrong.

Onwards to stage3 bugfixing :-)

Jeff

	* tree-ssa-threadedge.c (record_temporary_equivalence): Handle
	a NULL RHS, which we use to invalidate equivalences.
	(record_temporary_equivalences_from_phis): New bitmap arguments
	and a boolean indicating if we have passed a backedge.  If we
	have passed a backedge, then set the appropriate bit in the
	bitmaps for the SRC & DEST of PHIs creating equivalences.
	(invalidate_equivalences, dummy_simplify): New functions.
	(cond_arg_set_in_bb): Remove.
	(record_temporary_equivalences_from_stmts_at_dest): New bitmap
	arguments and a boolean indicating if we have passed a backedge.
	If we have passed a backedge, then perform invalidations as
	needed.
	(thread_around_empty_blocks): If we have seen a backedge, then
	use the dummy simplify routine.
	(thread_through_normal_block): Likewise.  Pass bitmaps and
	backedge status to children.  Do not pessimize so much when
	traversing backedges in the CFG.
	(thread_across_edge): Manage the SRC_MAP/DST_MAP bitmaps.  If we
	have seen a backedge, then use the dummy simplify routine.  Do
	not pessimize so much when traversing backedges.

diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 7600d7b..ce161a8 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -170,7 +170,8 @@ record_temporary_equivalence (tree x, tree y, vec<tree> *stack)
 {
   tree prev_x = SSA_NAME_VALUE (x);
 
-  if (TREE_CODE (y) == SSA_NAME)
+  /* Y may be NULL if we are invalidating entries in the table.  */
+  if (y && TREE_CODE (y) == SSA_NAME)
     {
       tree tmp = SSA_NAME_VALUE (y);
       y = tmp ? tmp : y;
@@ -186,10 +187,16 @@ record_temporary_equivalence (tree x, tree y, vec<tree> *stack)
    edge E.  Record unwind information for the equivalences onto STACK.
 
    If a PHI which prevents threading is encountered, then return FALSE
-   indicating we should not thread this edge, else return TRUE.  */
+   indicating we should not thread this edge, else return TRUE.
+
+   If SRC_MAP/DST_MAP exist, then mark the source and destination SSA_NAMEs
+   of any equivalences recorded.  We use this to make invalidation after
+   traversing back edges less painful.  */
 
 static bool
-record_temporary_equivalences_from_phis (edge e, vec<tree> *stack)
+record_temporary_equivalences_from_phis (edge e, vec<tree> *stack,
+                                         bool backedge_seen,
+                                         bitmap src_map, bitmap dst_map)
 {
   gimple_stmt_iterator gsi;
 
@@ -217,6 +224,14 @@ record_temporary_equivalences_from_phis (edge e, vec<tree> *stack)
       stmt_count++;
 
       record_temporary_equivalence (dst, src, stack);
+
+      /* If we have crossed a backedge, then start recording equivalences
+         we might need to invalidate.  */
+      if (backedge_seen && TREE_CODE (src) == SSA_NAME)
+        {
+          bitmap_set_bit (src_map, SSA_NAME_VERSION (src));
+          bitmap_set_bit (dst_map, SSA_NAME_VERSION (dst));
+        }
     }
   return true;
 }
@@ -271,6 +286,34 @@ fold_assignment_stmt (gimple stmt)
     }
 }
 
+/* A new value has been assigned to LHS.  If necessary, invalidate any
+   equivalences that are no longer valid.  */
+static void
+invalidate_equivalences (tree lhs, vec<tree> *stack,
+                         bitmap src_map, bitmap dst_map)
+{
+  /* SRC_MAP contains the source SSA_NAMEs for equivalences created by PHI
+     nodes.  If an entry in SRC_MAP changes, there's some destination that
+     has been recorded as equivalent to the source and that equivalency
+     needs to be eliminated.  */
+  if (bitmap_bit_p (src_map, SSA_NAME_VERSION (lhs)))
+    {
+      unsigned int i;
+      bitmap_iterator bi;
+
+      /* We know that the LHS of STMT was used as the RHS in an equivalency
+         created by a PHI.  All the LHS of such PHIs were recorded into DST_MAP.
+         So we can iterate over them to see if any have the LHS of STMT as
+         an equivalence, and if so, remove the equivalence as it is no longer
+         valid.  */
+      EXECUTE_IF_SET_IN_BITMAP (dst_map, 0, i, bi)
+        {
+          if (SSA_NAME_VALUE (ssa_name (i)) == lhs)
+            record_temporary_equivalence (ssa_name (i), NULL_TREE, stack);
+        }
+    }
+}
+
 /* Try to simplify each statement in E->dest, ultimately leading to
    a simplification of the COND_EXPR at the end of E->dest.
 
@@ -292,7 +335,10 @@ static gimple
 record_temporary_equivalences_from_stmts_at_dest (edge e,
                                                   vec<tree> *stack,
                                                   tree (*simplify) (gimple,
-                                                                    gimple))
+                                                                    gimple),
+                                                  bool backedge_seen,
+                                                  bitmap src_map,
+                                                  bitmap dst_map)
 {
   gimple stmt = NULL;
   gimple_stmt_iterator gsi;
@@ -369,7 +415,15 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
           if (fndecl
               && (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_OBJECT_SIZE
                   || DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CONSTANT_P))
-            continue;
+            {
+              if (backedge_seen)
+                {
+                  tree lhs = gimple_get_lhs (stmt);
+                  record_temporary_equivalence (lhs, NULL_TREE, stack);
+                  invalidate_equivalences (lhs, stack, src_map, dst_map);
+                }
+              continue;
+            }
         }
 
       /* At this point we have a statement which assigns an RHS to an
@@ -433,15 +487,36 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
         }
 
       /* Record the context sensitive equivalence if we were able
-         to simplify this statement.  */
+         to simplify this statement.
+
+         If we have traversed a backedge at some point during threading,
+         then always enter something here.  Either a real equivalence,
+         or a NULL_TREE equivalence which is effectively invalidation of
+         prior equivalences.  */
       if (cached_lhs
           && (TREE_CODE (cached_lhs) == SSA_NAME
               || is_gimple_min_invariant (cached_lhs)))
         record_temporary_equivalence (gimple_get_lhs (stmt), cached_lhs, stack);
+      else if (backedge_seen)
+        record_temporary_equivalence (gimple_get_lhs (stmt), NULL_TREE, stack);
+
+      if (backedge_seen)
+        invalidate_equivalences (gimple_get_lhs (stmt), stack,
+                                 src_map, dst_map);
     }
   return stmt;
 }
 
+/* Once we have passed a backedge in the CFG when threading, we do not want to
+   utilize edge equivalences for simplification purpose.  They are no longer
+   necessarily valid.  We use this callback rather than the ones provided by
+   DOM/VRP to achieve that effect.  */
+static tree
+dummy_simplify (gimple stmt1 ATTRIBUTE_UNUSED, gimple stmt2 ATTRIBUTE_UNUSED)
+{
+  return NULL_TREE;
+}
+
 /* Simplify the control statement at the end of the block E->dest.
 
    To avoid allocating memory unnecessarily, a scratch GIMPLE_COND
@@ -581,44 +656,6 @@ simplify_control_stmt_condition (edge e,
   return cached_lhs;
 }
 
-/* Return TRUE if the statement at the end of e->dest depends on
-   the output of any statement in BB.  Otherwise return FALSE.
-
-   This is used when we are threading a backedge and need to ensure
-   that temporary equivalences from BB do not affect the condition
-   in e->dest.  */
-
-static bool
-cond_arg_set_in_bb (edge e, basic_block bb)
-{
-  ssa_op_iter iter;
-  use_operand_p use_p;
-  gimple last = last_stmt (e->dest);
-
-  /* E->dest does not have to end with a control transferring
-     instruction.  This can occur when we try to extend a jump
-     threading opportunity deeper into the CFG.  In that case
-     it is safe for this check to return false.  */
-  if (!last)
-    return false;
-
-  if (gimple_code (last) != GIMPLE_COND
-      && gimple_code (last) != GIMPLE_GOTO
-      && gimple_code (last) != GIMPLE_SWITCH)
-    return false;
-
-  FOR_EACH_SSA_USE_OPERAND (use_p, last, iter, SSA_OP_USE | SSA_OP_VUSE)
-    {
-      tree use = USE_FROM_PTR (use_p);
-
-      if (TREE_CODE (use) == SSA_NAME
-          && gimple_code (SSA_NAME_DEF_STMT (use)) != GIMPLE_PHI
-          && gimple_bb (SSA_NAME_DEF_STMT (use)) == bb)
-        return true;
-    }
-  return false;
-}
-
 /* Copy debug stmts from DEST's chain of single predecessors up to
    SRC, so that we don't lose the bindings as PHI nodes are
    introduced when DEST gains new predecessors.  */
@@ -805,6 +842,8 @@ thread_around_empty_blocks (edge taken_edge,
           path->safe_push (x);
           bitmap_set_bit (visited, taken_edge->dest->index);
           *backedge_seen_p |= ((taken_edge->flags & EDGE_DFS_BACK) != 0);
+          if (*backedge_seen_p)
+            simplify = dummy_simplify;
           return thread_around_empty_blocks (taken_edge,
                                              dummy_cond,
                                              handle_dominating_asserts,
@@ -827,6 +866,13 @@ thread_around_empty_blocks (edge taken_edge,
       && gimple_code (stmt) != GIMPLE_SWITCH)
     return false;
 
+  /* If we have traversed a backedge, then we do not want to look
+     at certain expressions in the table that can not be relied upon.
+     Luckily the only code that looked at those expressions is the
+     SIMPLIFY callback, which we replace if we can no longer use it.  */
+  if (*backedge_seen_p)
+    simplify = dummy_simplify;
+
   /* Extract and simplify the condition.  */
   cond = simplify_control_stmt_condition (taken_edge, stmt, dummy_cond,
                                           simplify, handle_dominating_asserts);
@@ -846,6 +892,8 @@ thread_around_empty_blocks (edge taken_edge,
         = new jump_thread_edge (taken_edge, EDGE_NO_COPY_SRC_BLOCK);
       path->safe_push (x);
       *backedge_seen_p |= ((taken_edge->flags & EDGE_DFS_BACK) != 0);
+      if (*backedge_seen_p)
+        simplify = dummy_simplify;
 
       thread_around_empty_blocks (taken_edge,
                                   dummy_cond,
@@ -896,24 +944,28 @@ thread_through_normal_block (edge e,
                              tree (*simplify) (gimple, gimple),
                              vec<jump_thread_edge *> *path,
                              bitmap visited,
-                             bool *backedge_seen_p)
+                             bool *backedge_seen_p,
+                             bitmap src_map,
+                             bitmap dst_map)
 {
-  /* If we have crossed a backedge, then we want to verify that the COND_EXPR,
-     SWITCH_EXPR or GOTO_EXPR at the end of e->dest is not affected
-     by any statements in e->dest.  If it is affected, then it is not
-     safe to thread this edge.  */
-  if (*backedge_seen_p
-      && cond_arg_set_in_bb (e, e->dest))
-    return false;
+  /* If we have traversed a backedge, then we do not want to look
+     at certain expressions in the table that can not be relied upon.
+     Luckily the only code that looked at those expressions is the
+     SIMPLIFY callback, which we replace if we can no longer use it.  */
+  if (*backedge_seen_p)
+    simplify = dummy_simplify;
 
   /* PHIs create temporary equivalences.  */
-  if (!record_temporary_equivalences_from_phis (e, stack))
+  if (!record_temporary_equivalences_from_phis (e, stack, *backedge_seen_p,
+                                                src_map, dst_map))
     return false;
 
   /* Now walk each statement recording any context sensitive
     temporary equivalences we can detect.  */
   gimple stmt
-    = record_temporary_equivalences_from_stmts_at_dest (e, stack, simplify);
+    = record_temporary_equivalences_from_stmts_at_dest (e, stack, simplify,
+                                                        *backedge_seen_p,
+                                                        src_map, dst_map);
   if (!stmt)
     return false;
 
@@ -955,25 +1007,24 @@ thread_through_normal_block (edge e,
         = new jump_thread_edge (taken_edge, EDGE_COPY_SRC_BLOCK);
       path->safe_push (x);
       *backedge_seen_p |= ((taken_edge->flags & EDGE_DFS_BACK) != 0);
+      if (*backedge_seen_p)
+        simplify = dummy_simplify;
 
       /* See if we can thread through DEST as well, this helps capture
          secondary effects of threading without having to re-run DOM or
-         VRP.  */
-      if (!*backedge_seen_p
-          || ! cond_arg_set_in_bb (taken_edge, e->dest))
-        {
-          /* We don't want to thread back to a block we have already
-             visited.  This may be overly conservative.  */
-          bitmap_set_bit (visited, dest->index);
-          bitmap_set_bit (visited, e->dest->index);
-          thread_around_empty_blocks (taken_edge,
-                                      dummy_cond,
-                                      handle_dominating_asserts,
-                                      simplify,
-                                      visited,
-                                      path,
-                                      backedge_seen_p);
-        }
+         VRP.
+
+         We don't want to thread back to a block we have already
+         visited.  This may be overly conservative.  */
+      bitmap_set_bit (visited, dest->index);
+      bitmap_set_bit (visited, e->dest->index);
+      thread_around_empty_blocks (taken_edge,
+                                  dummy_cond,
+                                  handle_dominating_asserts,
+                                  simplify,
+                                  visited,
+                                  path,
+                                  backedge_seen_p);
       return true;
     }
 }
@@ -1015,6 +1066,8 @@ thread_across_edge (gimple dummy_cond,
                     tree (*simplify) (gimple, gimple))
 {
   bitmap visited = BITMAP_ALLOC (NULL);
+  bitmap src_map = BITMAP_ALLOC (NULL);
+  bitmap dst_map = BITMAP_ALLOC (NULL);
   bool backedge_seen;
 
   stmt_count = 0;
@@ -1024,14 +1077,19 @@ thread_across_edge (gimple dummy_cond,
   bitmap_set_bit (visited, e->src->index);
   bitmap_set_bit (visited, e->dest->index);
   backedge_seen = ((e->flags & EDGE_DFS_BACK) != 0);
+  if (backedge_seen)
+    simplify = dummy_simplify;
+
   if (thread_through_normal_block (e, dummy_cond, handle_dominating_asserts,
                                    stack, simplify, path, visited,
-                                   &backedge_seen))
+                                   &backedge_seen, src_map, dst_map))
     {
       propagate_threaded_block_debug_into (path->last ()->e->dest, e->dest);
       remove_temporary_equivalences (stack);
       BITMAP_FREE (visited);
+      BITMAP_FREE (src_map);
+      BITMAP_FREE (dst_map);
       register_jump_thread (path);
       return;
     }
@@ -1066,15 +1124,26 @@ thread_across_edge (gimple dummy_cond,
       {
         remove_temporary_equivalences (stack);
         BITMAP_FREE (visited);
+        BITMAP_FREE (src_map);
+        BITMAP_FREE (dst_map);
         return;
       }
 
+    /* We need to restore the state of the maps to this point each loop
+       iteration.  */
+    bitmap src_map_copy = BITMAP_ALLOC (NULL);
+    bitmap dst_map_copy = BITMAP_ALLOC (NULL);
+    bitmap_copy (src_map_copy, src_map);
+    bitmap_copy (dst_map_copy, dst_map);
+
    /* Look at each successor of E->dest to see if we can thread through it.  */
    FOR_EACH_EDGE (taken_edge, ei, e->dest->succs)
      {
        /* Push a fresh marker so we can unwind the equivalences created
           for each of E->dest's successors.  */
        stack->safe_push (NULL_TREE);
+       bitmap_copy (src_map_copy, src_map);
+       bitmap_copy (dst_map_copy, dst_map);
 
        /* Avoid threading to any block we have already visited.  */
        bitmap_clear (visited);
@@ -1093,23 +1162,25 @@ thread_across_edge (gimple dummy_cond,
        found = false;
        backedge_seen = ((e->flags & EDGE_DFS_BACK) != 0);
        backedge_seen |= ((taken_edge->flags & EDGE_DFS_BACK) != 0);
-       if (!backedge_seen
-           || ! cond_arg_set_in_bb (path->last ()->e, e->dest))
-         found = thread_around_empty_blocks (taken_edge,
-                                             dummy_cond,
-                                             handle_dominating_asserts,
-                                             simplify,
-                                             visited,
-                                             path,
-                                             &backedge_seen);
-
-       if (!found
-           && (!backedge_seen
-               || ! cond_arg_set_in_bb (path->last ()->e, e->dest)))
+       if (backedge_seen)
+         simplify = dummy_simplify;
+       found = thread_around_empty_blocks (taken_edge,
+                                           dummy_cond,
+                                           handle_dominating_asserts,
+                                           simplify,
+                                           visited,
+                                           path,
+                                           &backedge_seen);
+
+       if (backedge_seen)
+         simplify = dummy_simplify;
+
+       if (!found)
          found = thread_through_normal_block (path->last ()->e, dummy_cond,
                                               handle_dominating_asserts,
                                               stack, simplify, path, visited,
-                                              &backedge_seen);
+                                              &backedge_seen,
+                                              src_map, dst_map);
 
        /* If we were able to thread through a successor of E->dest, then
           record the jump threading opportunity.  */
@@ -1128,6 +1199,10 @@ thread_across_edge (gimple dummy_cond,
            remove_temporary_equivalences (stack);
          }
        BITMAP_FREE (visited);
+       BITMAP_FREE (src_map);
+       BITMAP_FREE (dst_map);
+       BITMAP_FREE (src_map_copy);
+       BITMAP_FREE (dst_map_copy);
      }
 
   remove_temporary_equivalences (stack);
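
For anyone who wants to poke at the invalidation idea outside of GCC, here is
the small standalone model of the bookkeeping mentioned above.  It is purely
illustrative: the names, the plain arrays standing in for GCC's sparse
bitmaps, and the integer "SSA versions" are all made up for the example; the
real code works on SSA_NAME_VALUE and the unwinding stack.

  #include <stdio.h>

  #define MAX_VERSIONS 16

  /* equiv_to[d] == s means "x_d is currently equivalent to x_s"; -1 = none.  */
  static int equiv_to[MAX_VERSIONS];
  /* Versions recorded as the RHS (src_map) or LHS (dst_map) of an equivalence
     created by a PHI after a backedge was crossed.  */
  static unsigned char src_map[MAX_VERSIONS];
  static unsigned char dst_map[MAX_VERSIONS];

  static void
  record_equivalence (int dst, int src, int backedge_seen)
  {
    equiv_to[dst] = src;
    if (backedge_seen)
      {
        src_map[src] = 1;
        dst_map[dst] = 1;
      }
  }

  /* Version LHS has been given a new value; any equivalence that still
     points at it can no longer be trusted, so drop it.  */
  static void
  invalidate_equivalences (int lhs)
  {
    if (!src_map[lhs])
      return;
    for (int d = 0; d < MAX_VERSIONS; d++)
      if (dst_map[d] && equiv_to[d] == lhs)
        equiv_to[d] = -1;
  }

  int
  main (void)
  {
    for (int i = 0; i < MAX_VERSIONS; i++)
      equiv_to[i] = -1;

    /* x2 = PHI <x0 (bb2), x1 (bb3)>, reached via the backedge from bb3:
       record x2 == x1 and remember both names.  */
    record_equivalence (2, 1, 1);
    printf ("x2 == x%d\n", equiv_to[2]);

    /* Later in the block x1 is reassigned; x2 == x1 must be dropped.  */
    invalidate_equivalences (1);
    printf ("x2 equivalence after x1 changes: %d (-1 = none)\n", equiv_to[2]);
    return 0;
  }

Running it shows the equivalence present right after the PHI is processed and
gone once x1 is reassigned, which mirrors what the src_map/dst_map bitmaps
are for in the patch.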