Fix code generation problem with jump threading

Message ID 52804973.9010903@redhat.com
State New

Commit Message

Jeff Law Nov. 11, 2013, 3:05 a.m. UTC
Suppose we have a jump threading request through a joiner block J which has
two successors S1 and S2.  If the threading request goes through S1 and
ultimately reaches S2, then we have to ensure that any PHI nodes in S2 have
the same arguments for the edges J->S2 and J->S1->...->S2.
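
For reference, that equality test just walks the PHIs in the common
destination and compares the arguments coming in on the two edges.  A
minimal sketch of what it might look like (the actual
phi_args_equal_on_edges in tree-ssa-threadupdate.c may differ in details;
this assumes the usual GIMPLE PHI iterator and operand helpers):

/* Sketch only -- not part of this patch.  Return true if every PHI node
   in E1->dest has equal arguments on edges E1 and E2.  */
static bool
phi_args_equal_on_edges (edge e1, edge e2)
{
  gimple_stmt_iterator gsi;
  int indx1 = e1->dest_idx;
  int indx2 = e2->dest_idx;

  for (gsi = gsi_start_phis (e1->dest); !gsi_end_p (gsi); gsi_next (&gsi))
    {
      gimple phi = gsi_stmt (gsi);

      if (!operand_equal_p (gimple_phi_arg_def (phi, indx1),
                            gimple_phi_arg_def (phi, indx2), 0))
        return false;
    }

  return true;
}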

The SSA/CFG updating code (of course) has code to detect this; however, it
runs *before* we prune the tails of jump threading paths to avoid threading
through multiple loop headers.

I.e., we might have a jump threading request like

(224, 225) (225, 226)J (226, 227) (227, 229)

where there also exists an edge (225, 227).


So if we have to truncate the last edge off the jump threading path, we'd be
left with a thread path like:

(224, 225) (225, 226)J (226, 227)

And since there's already an edge (225, 227), we have to check the PHI node
arguments in block 227 and reject the thread path if the check fails.
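
In terms of the code this patch moves (see the hunk below), the rejection
for this example boils down to something like the following sketch, where
e, path and joiner are as in mark_threaded_blocks:

  /* Illustrative only.  For the example above, FINAL_EDGE is (226, 227)
     and E2 is the pre-existing edge (225, 227).  */
  edge final_edge = path->last ()->e;
  basic_block final_dest = final_edge->dest;
  edge e2 = find_edge (joiner, final_dest);

  if (e2 && !phi_args_equal_on_edges (e2, final_edge))
    {
      /* The PHI arguments in block 227 differ depending on which route
         reaches it, so threading the truncated path would select the
         wrong PHI values -- drop the request.  */
      delete_jump_thread_path (path);
      e->aux = NULL;
    }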

So the obvious fix is to truncate the paths first, *then* do the PHI node
check if it's needed.

I don't have a testcase since I tripped over this with other changes.
However, there is some chance this fixes an SH problem from Oleg.  If so,
we may be able to extract a usable testcase from Oleg's code.

Bootstrapped and regression tested on x86_64-unknown-linux-gnu. 
Installed on the trunk.

	* tree-ssa-threadupdate.c (mark_threaded_blocks): Truncate jump
	threading paths first, then perform PHI node checks if applicable.

Patch

diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 24e7767..346d532 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -1269,46 +1395,6 @@  mark_threaded_blocks (bitmap threaded_blocks)
       bitmap_set_bit (tmp, e->dest->index);
     }
 
-  /* If we have a joiner block (J) which has two successors S1 and S2 and
-     we are threading though S1 and the final destination of the thread
-     is S2, then we must verify that any PHI nodes in S2 have the same
-     PHI arguments for the edge J->S2 and J->S1->...->S2.
-
-     We used to detect this prior to registering the jump thread, but
-     that prohibits propagation of edge equivalences into non-dominated
-     PHI nodes as the equivalency test might occur before propagation.
-
-     This works for now, but will need improvement as part of the FSA
-     optimization.
-
-     Note since we've moved the thread request data to the edges,
-     we have to iterate on those rather than the threaded_edges vector.  */
-  EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
-    {
-      bb = BASIC_BLOCK (i);
-      FOR_EACH_EDGE (e, ei, bb->preds)
-	{
-	  if (e->aux)
-	    {
-	      vec<jump_thread_edge *> *path = THREAD_PATH (e);
-	      bool have_joiner = ((*path)[1]->type == EDGE_COPY_SRC_JOINER_BLOCK);
-
-	      if (have_joiner)
-		{
-		  basic_block joiner = e->dest;
-		  edge final_edge = path->last ()->e;
-		  basic_block final_dest = final_edge->dest;
-		  edge e2 = find_edge (joiner, final_dest);
-
-		  if (e2 && !phi_args_equal_on_edges (e2, final_edge))
-		    {
-		      delete_jump_thread_path (path);
-		      e->aux = NULL;
-		    }
-		}
-	    }
-	}
-    }
 
 
   /* If optimizing for size, only thread through block if we don't have
@@ -1398,6 +1484,50 @@  mark_threaded_blocks (bitmap threaded_blocks)
 	}
     }
 
+  /* If we have a joiner block (J) which has two successors S1 and S2 and
+     we are threading though S1 and the final destination of the thread
+     is S2, then we must verify that any PHI nodes in S2 have the same
+     PHI arguments for the edge J->S2 and J->S1->...->S2.
+
+     We used to detect this prior to registering the jump thread, but
+     that prohibits propagation of edge equivalences into non-dominated
+     PHI nodes as the equivalency test might occur before propagation.
+
+     This must also occur after we truncate any jump threading paths
+     as this scenario may only show up after truncation.
+
+     This works for now, but will need improvement as part of the FSA
+     optimization.
+
+     Note since we've moved the thread request data to the edges,
+     we have to iterate on those rather than the threaded_edges vector.  */
+  EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
+    {
+      bb = BASIC_BLOCK (i);
+      FOR_EACH_EDGE (e, ei, bb->preds)
+	{
+	  if (e->aux)
+	    {
+	      vec<jump_thread_edge *> *path = THREAD_PATH (e);
+	      bool have_joiner = ((*path)[1]->type == EDGE_COPY_SRC_JOINER_BLOCK);
+
+	      if (have_joiner)
+		{
+		  basic_block joiner = e->dest;
+		  edge final_edge = path->last ()->e;
+		  basic_block final_dest = final_edge->dest;
+		  edge e2 = find_edge (joiner, final_dest);
+
+		  if (e2 && !phi_args_equal_on_edges (e2, final_edge))
+		    {
+		      delete_jump_thread_path (path);
+		      e->aux = NULL;
+		    }
+		}
+	    }
+	}
+    }
+
   BITMAP_FREE (tmp);
 }