
fab: Factor out the main folding part of pass_fold_builtins::execute [PR116601]

Message ID 20240906072120.4082229-1-quic_apinski@quicinc.com
State New
Series fab: Factor out the main folding part of pass_fold_builtins::execute [PR116601]

Commit Message

Andrew Pinski Sept. 6, 2024, 7:21 a.m. UTC
This is an alternative patch to fix PR tree-optimization/116601 by factoring
out the main part of pass_fold_builtins::execute into its own function so
that we don't need to repeat the code for the EH cleanup.  It also fixes the
problem I saw with the atomics, where a statement could be skipped over,
though I don't have a testcase for that.

A note on the return value of fold_all_builtin_stmt: it does not report
whether something was folded; rather, it tells the caller whether the
iterator should stay on the current statement (true) or advance to the next
one (false).  This was the bug with the atomics: in some cases the atomic
builtins could remove the statement being processed, and an extra gsi_next
would then skip the statement after it.
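
As a simplified illustration of that contract, here is a sketch of the new
caller loop (not the full pass code; it assumes bb and cfg_changed are in
scope as they are in the pass): the iterator is only advanced when the
helper returns false.

  for (gimple_stmt_iterator i = gsi_start_bb (bb); !gsi_end_p (i); )
    {
      gimple *stmt = gsi_stmt (i);
      if (!fold_all_builtin_stmt (i, stmt, cfg_changed))
        gsi_next (&i);
      /* Otherwise the statement was changed or removed and the iterator
         already points at what should be processed next.  */
    }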

Bootstrapped and tested on x86_64-linux-gnu.

	PR tree-optimization/116601

gcc/ChangeLog:

	* tree-ssa-ccp.cc (optimize_memcpy): Return true if the statement
	was updated.
	(pass_fold_builtins::execute): Factor out folding code into ...
	(fold_all_builtin_stmt): ...this new function.

gcc/testsuite/ChangeLog:

	* g++.dg/torture/except-2.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
---
 gcc/testsuite/g++.dg/torture/except-2.C |  18 +
 gcc/tree-ssa-ccp.cc                     | 534 ++++++++++++------------
 2 files changed, 276 insertions(+), 276 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/except-2.C

Comments

Jakub Jelinek Sept. 6, 2024, 7:30 a.m. UTC | #1
On Fri, Sep 06, 2024 at 12:21:20AM -0700, Andrew Pinski wrote:
> This is an alternative patch to fix PR tree-optimization/116601 by factoring
> out the main part of pass_fold_builtins::execute into its own function so that
> we don't need to repeat the code for doing the eh cleanup. It also fixes the
> problem I saw with the atomics which might skip over a statement; though I don't
> have a testcase for that.

I'm worried about using this elsewhere, various fab foldings are meant to be
done only in that pass and not earlier.
E.g. the __builtin_constant_p folding, __builtin_assume_aligned, stack
restore, unreachable, va_{start,end,copy}.
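
A hypothetical illustration of the __builtin_assume_aligned case (example
code assumed, not from the thread): the call is what carries the alignment
promise, so stripping it before the passes that consume it would lose
information; fab only removes it once nothing later needs it.

  void
  scale (float *p, int n)
  {
    float *q = (float *) __builtin_assume_aligned (p, 64);
    for (int i = 0; i < n; i++)
      q[i] *= 2.0f;
    /* Passes that run before fab learn from the call above that q is
       64-byte aligned (e.g. when vectorizing the loop); replacing the
       call with plain p earlier would throw that knowledge away.  */
  }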

	Jakub
Richard Biener Sept. 6, 2024, 7:51 a.m. UTC | #2
On Fri, Sep 6, 2024 at 9:31 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Fri, Sep 06, 2024 at 12:21:20AM -0700, Andrew Pinski wrote:
> > This is an alternative patch to fix PR tree-optimization/116601 by factoring
> > out the main part of pass_fold_builtins::execute into its own function so that
> > we don't need to repeat the code for doing the eh cleanup. It also fixes the
> > problem I saw with the atomics which might skip over a statement; though I don't
> > have a testcase for that.
>
> I'm worried about using this elsewhere, various fab foldings are meant to be
> done only in that pass and not earlier.
> E.g. the __builtin_constant_p folding, __builtin_assume_aligned, stack
> restore, unreachable, va_{start,end,copy}.

Maybe we can document this fact better or name the function differently?

>
>         Jakub
>
Jakub Jelinek Sept. 6, 2024, 7:56 a.m. UTC | #3
On Fri, Sep 06, 2024 at 09:51:38AM +0200, Richard Biener wrote:
> On Fri, Sep 6, 2024 at 9:31 AM Jakub Jelinek <jakub@redhat.com> wrote:
> >
> > On Fri, Sep 06, 2024 at 12:21:20AM -0700, Andrew Pinski wrote:
> > > This is an alternative patch to fix PR tree-optimization/116601 by factoring
> > > out the main part of pass_fold_builtins::execute into its own function so that
> > > we don't need to repeat the code for doing the eh cleanup. It also fixes the
> > > problem I saw with the atomics which might skip over a statement; though I don't
> > > have a testcase for that.
> >
> > I'm worried about using this elsewhere, various fab foldings are meant to be
> > done only in that pass and not earlier.
> > E.g. the __builtin_constant_p folding, __builtin_assume_aligned, stack
> > restore, unreachable, va_{start,end,copy}.
> 
> Maybe we can document this fact better or name the function differently?

Some of it is documented already in the source.
                case BUILT_IN_CONSTANT_P:
                  /* Resolve __builtin_constant_p.  If it hasn't been
                     folded to integer_one_node by now, it's fairly
                     certain that the value simply isn't constant.  */
                  result = integer_zero_node;
or
                case BUILT_IN_VA_START:
                case BUILT_IN_VA_END:
                case BUILT_IN_VA_COPY:
                  /* These shouldn't be folded before pass_stdarg.  */
                  result = optimize_stdarg_builtin (stmt);
Obviously, if either is done much earlier, the former can still fold to 1
(e.g. if this runs before IPA, or shortly after IPA when the usual
propagation after inlining has not all been done yet), or for the latter
pass_stdarg hasn't run yet, etc.
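
A concrete (hypothetical) example of the __builtin_constant_p point: whether
it can still become 1 depends on how much inlining and propagation has
already run, so resolving it to 0 any earlier than fab would be premature.

  static inline int is_cst (int x) { return __builtin_constant_p (x); }

  int
  f (void)
  {
    /* Can only fold to 1 after is_cst is inlined and 42 is propagated;
       folding the builtin to 0 before that point would give the wrong
       answer.  */
    return is_cst (42);
  }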

	Jakub

Patch

diff --git a/gcc/testsuite/g++.dg/torture/except-2.C b/gcc/testsuite/g++.dg/torture/except-2.C
new file mode 100644
index 00000000000..d896937a118
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/except-2.C
@@ -0,0 +1,18 @@ 
+// { dg-do compile }
+// { dg-additional-options "-fexceptions -fnon-call-exceptions" }
+// PR tree-optimization/116601
+
+struct RefitOption {
+  char subtype;
+  int string;
+} n;
+void h(RefitOption);
+void k(RefitOption *__val)
+{
+  try {
+    *__val = RefitOption{};
+    RefitOption __trans_tmp_2 = *__val;
+    h(__trans_tmp_2);
+  }
+  catch(...){}
+}
diff --git a/gcc/tree-ssa-ccp.cc b/gcc/tree-ssa-ccp.cc
index 44711018e0e..930432e3244 100644
--- a/gcc/tree-ssa-ccp.cc
+++ b/gcc/tree-ssa-ccp.cc
@@ -4166,18 +4166,19 @@  optimize_atomic_op_fetch_cmp_0 (gimple_stmt_iterator *gsip,
    a = {};
    b = {};
    Similarly for memset (&a, ..., sizeof (a)); instead of a = {};
-   and/or memcpy (&b, &a, sizeof (a)); instead of b = a;  */
+   and/or memcpy (&b, &a, sizeof (a)); instead of b = a;
+   Returns true if the statement was changed.  */
 
-static void
+static bool
 optimize_memcpy (gimple_stmt_iterator *gsip, tree dest, tree src, tree len)
 {
   gimple *stmt = gsi_stmt (*gsip);
   if (gimple_has_volatile_ops (stmt))
-    return;
+    return false;
 
   tree vuse = gimple_vuse (stmt);
   if (vuse == NULL)
-    return;
+    return false;
 
   gimple *defstmt = SSA_NAME_DEF_STMT (vuse);
   tree src2 = NULL_TREE, len2 = NULL_TREE;
@@ -4202,7 +4203,7 @@  optimize_memcpy (gimple_stmt_iterator *gsip, tree dest, tree src, tree len)
     }
 
   if (src2 == NULL_TREE)
-    return;
+    return false;
 
   if (len == NULL_TREE)
     len = (TREE_CODE (src) == COMPONENT_REF
@@ -4216,24 +4217,24 @@  optimize_memcpy (gimple_stmt_iterator *gsip, tree dest, tree src, tree len)
       || !poly_int_tree_p (len)
       || len2 == NULL_TREE
       || !poly_int_tree_p (len2))
-    return;
+    return false;
 
   src = get_addr_base_and_unit_offset (src, &offset);
   src2 = get_addr_base_and_unit_offset (src2, &offset2);
   if (src == NULL_TREE
       || src2 == NULL_TREE
       || maybe_lt (offset, offset2))
-    return;
+    return false;
 
   if (!operand_equal_p (src, src2, 0))
-    return;
+    return false;
 
   /* [ src + offset2, src + offset2 + len2 - 1 ] is set to val.
      Make sure that
      [ src + offset, src + offset + len - 1 ] is a subset of that.  */
   if (maybe_gt (wi::to_poly_offset (len) + (offset - offset2),
 		wi::to_poly_offset (len2)))
-    return;
+    return false;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -4271,6 +4272,237 @@  optimize_memcpy (gimple_stmt_iterator *gsip, tree dest, tree src, tree len)
       fprintf (dump_file, "into\n  ");
       print_gimple_stmt (dump_file, stmt, 0, dump_flags);
     }
+  return true;
+}
+
+/* Fold the statement STMT located at iterator I.  Sets CFG_CHANGED to true
+   if the CFG was changed and cleanup_cfg needs to be run.  Returns true
+   if the iterator I already points at the statement to handle next (STMT
+   was changed or removed); false means the caller should advance it.  */
+static bool
+fold_all_builtin_stmt (gimple_stmt_iterator &i, gimple *stmt,
+		       bool &cfg_changed)
+{
+  /* Remove assume internal function calls. */
+  if (gimple_call_internal_p (stmt, IFN_ASSUME))
+    {
+      gsi_remove (&i, true);
+      return true;
+    }
+
+  if (gimple_code (stmt) != GIMPLE_CALL)
+    {
+      if (gimple_assign_load_p (stmt) && gimple_store_p (stmt))
+	return optimize_memcpy (&i, gimple_assign_lhs (stmt),
+				gimple_assign_rhs1 (stmt), NULL_TREE);
+      return false;
+    }
+
+  tree callee = gimple_call_fndecl (stmt);
+
+  if (!callee || !fndecl_built_in_p (callee, BUILT_IN_NORMAL))
+    return false;
+
+  if (fold_stmt (&i))
+    return false;
+
+  tree result = NULL_TREE;
+  switch (DECL_FUNCTION_CODE (callee))
+    {
+    case BUILT_IN_CONSTANT_P:
+      /* Resolve __builtin_constant_p.  If it hasn't been
+	 folded to integer_one_node by now, it's fairly
+	 certain that the value simply isn't constant.  */
+      result = integer_zero_node;
+      break;
+
+    case BUILT_IN_ASSUME_ALIGNED:
+      /* Remove __builtin_assume_aligned.  */
+      result = gimple_call_arg (stmt, 0);
+      break;
+
+    case BUILT_IN_STACK_RESTORE:
+      result = optimize_stack_restore (i);
+      break;
+
+    case BUILT_IN_UNREACHABLE:
+      if (optimize_unreachable (i))
+	 cfg_changed = true;
+      /* Skip on to the next statement. */
+      return false;
+
+    case BUILT_IN_MEMCPY:
+      if (gimple_call_builtin_p (stmt, BUILT_IN_NORMAL)
+	  && TREE_CODE (gimple_call_arg (stmt, 0)) == ADDR_EXPR
+	  && TREE_CODE (gimple_call_arg (stmt, 1)) == ADDR_EXPR
+	  && TREE_CODE (gimple_call_arg (stmt, 2)) == INTEGER_CST)
+	{
+	  tree dest = TREE_OPERAND (gimple_call_arg (stmt, 0), 0);
+	  tree src = TREE_OPERAND (gimple_call_arg (stmt, 1), 0);
+	  tree len = gimple_call_arg (stmt, 2);
+	  return optimize_memcpy (&i, dest, src, len);
+	}
+      return false;
+
+    case BUILT_IN_VA_START:
+    case BUILT_IN_VA_END:
+    case BUILT_IN_VA_COPY:
+      /* These shouldn't be folded before pass_stdarg.  */
+      result = optimize_stdarg_builtin (stmt);
+      break;
+
+    case BUILT_IN_ATOMIC_ADD_FETCH_1:
+    case BUILT_IN_ATOMIC_ADD_FETCH_2:
+    case BUILT_IN_ATOMIC_ADD_FETCH_4:
+    case BUILT_IN_ATOMIC_ADD_FETCH_8:
+    case BUILT_IN_ATOMIC_ADD_FETCH_16:
+      return optimize_atomic_op_fetch_cmp_0 (&i, IFN_ATOMIC_ADD_FETCH_CMP_0,
+					     true);
+    case BUILT_IN_SYNC_ADD_AND_FETCH_1:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_2:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_4:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_8:
+    case BUILT_IN_SYNC_ADD_AND_FETCH_16:
+      return optimize_atomic_op_fetch_cmp_0 (&i, IFN_ATOMIC_ADD_FETCH_CMP_0,
+					     false);
+
+    case BUILT_IN_ATOMIC_SUB_FETCH_1:
+    case BUILT_IN_ATOMIC_SUB_FETCH_2:
+    case BUILT_IN_ATOMIC_SUB_FETCH_4:
+    case BUILT_IN_ATOMIC_SUB_FETCH_8:
+    case BUILT_IN_ATOMIC_SUB_FETCH_16:
+      return optimize_atomic_op_fetch_cmp_0 (&i, IFN_ATOMIC_SUB_FETCH_CMP_0,
+					     true);
+    case BUILT_IN_SYNC_SUB_AND_FETCH_1:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_2:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_4:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_8:
+    case BUILT_IN_SYNC_SUB_AND_FETCH_16:
+      return optimize_atomic_op_fetch_cmp_0 (&i, IFN_ATOMIC_SUB_FETCH_CMP_0,
+					     false);
+
+    case BUILT_IN_ATOMIC_FETCH_OR_1:
+    case BUILT_IN_ATOMIC_FETCH_OR_2:
+    case BUILT_IN_ATOMIC_FETCH_OR_4:
+    case BUILT_IN_ATOMIC_FETCH_OR_8:
+    case BUILT_IN_ATOMIC_FETCH_OR_16:
+      return optimize_atomic_bit_test_and (&i, IFN_ATOMIC_BIT_TEST_AND_SET,
+					   true, false);
+    case BUILT_IN_SYNC_FETCH_AND_OR_1:
+    case BUILT_IN_SYNC_FETCH_AND_OR_2:
+    case BUILT_IN_SYNC_FETCH_AND_OR_4:
+    case BUILT_IN_SYNC_FETCH_AND_OR_8:
+    case BUILT_IN_SYNC_FETCH_AND_OR_16:
+      return optimize_atomic_bit_test_and (&i, IFN_ATOMIC_BIT_TEST_AND_SET,
+					   false, false);
+
+    case BUILT_IN_ATOMIC_FETCH_XOR_1:
+    case BUILT_IN_ATOMIC_FETCH_XOR_2:
+    case BUILT_IN_ATOMIC_FETCH_XOR_4:
+    case BUILT_IN_ATOMIC_FETCH_XOR_8:
+    case BUILT_IN_ATOMIC_FETCH_XOR_16:
+      return optimize_atomic_bit_test_and (&i,
+					   IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT,
+					   true, false);
+    case BUILT_IN_SYNC_FETCH_AND_XOR_1:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_2:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_4:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_8:
+    case BUILT_IN_SYNC_FETCH_AND_XOR_16:
+       return optimize_atomic_bit_test_and (&i,
+					    IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT,
+					    false, false);
+
+    case BUILT_IN_ATOMIC_XOR_FETCH_1:
+    case BUILT_IN_ATOMIC_XOR_FETCH_2:
+    case BUILT_IN_ATOMIC_XOR_FETCH_4:
+    case BUILT_IN_ATOMIC_XOR_FETCH_8:
+    case BUILT_IN_ATOMIC_XOR_FETCH_16:
+      if (optimize_atomic_bit_test_and (&i,
+					IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT,
+					true, true))
+	return true;
+      return optimize_atomic_op_fetch_cmp_0 (&i, IFN_ATOMIC_XOR_FETCH_CMP_0,
+					     true);
+    case BUILT_IN_SYNC_XOR_AND_FETCH_1:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_2:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_4:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_8:
+    case BUILT_IN_SYNC_XOR_AND_FETCH_16:
+      if (optimize_atomic_bit_test_and (&i,
+					IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT,
+					false, true))
+	return true;
+      return optimize_atomic_op_fetch_cmp_0 (&i, IFN_ATOMIC_XOR_FETCH_CMP_0,
+					     false);
+
+    case BUILT_IN_ATOMIC_FETCH_AND_1:
+    case BUILT_IN_ATOMIC_FETCH_AND_2:
+    case BUILT_IN_ATOMIC_FETCH_AND_4:
+    case BUILT_IN_ATOMIC_FETCH_AND_8:
+    case BUILT_IN_ATOMIC_FETCH_AND_16:
+      return optimize_atomic_bit_test_and (&i, IFN_ATOMIC_BIT_TEST_AND_RESET,
+					   true, false);
+    case BUILT_IN_SYNC_FETCH_AND_AND_1:
+    case BUILT_IN_SYNC_FETCH_AND_AND_2:
+    case BUILT_IN_SYNC_FETCH_AND_AND_4:
+    case BUILT_IN_SYNC_FETCH_AND_AND_8:
+    case BUILT_IN_SYNC_FETCH_AND_AND_16:
+      return optimize_atomic_bit_test_and (&i, IFN_ATOMIC_BIT_TEST_AND_RESET,
+					   false, false);
+    case BUILT_IN_ATOMIC_AND_FETCH_1:
+    case BUILT_IN_ATOMIC_AND_FETCH_2:
+    case BUILT_IN_ATOMIC_AND_FETCH_4:
+    case BUILT_IN_ATOMIC_AND_FETCH_8:
+    case BUILT_IN_ATOMIC_AND_FETCH_16:
+      return optimize_atomic_op_fetch_cmp_0 (&i, IFN_ATOMIC_AND_FETCH_CMP_0,
+					     true);
+    case BUILT_IN_SYNC_AND_AND_FETCH_1:
+    case BUILT_IN_SYNC_AND_AND_FETCH_2:
+    case BUILT_IN_SYNC_AND_AND_FETCH_4:
+    case BUILT_IN_SYNC_AND_AND_FETCH_8:
+    case BUILT_IN_SYNC_AND_AND_FETCH_16:
+     return optimize_atomic_op_fetch_cmp_0 (&i, IFN_ATOMIC_AND_FETCH_CMP_0,
+					    false);
+    case BUILT_IN_ATOMIC_OR_FETCH_1:
+    case BUILT_IN_ATOMIC_OR_FETCH_2:
+    case BUILT_IN_ATOMIC_OR_FETCH_4:
+    case BUILT_IN_ATOMIC_OR_FETCH_8:
+    case BUILT_IN_ATOMIC_OR_FETCH_16:
+      return optimize_atomic_op_fetch_cmp_0 (&i, IFN_ATOMIC_OR_FETCH_CMP_0,
+					     true);
+    case BUILT_IN_SYNC_OR_AND_FETCH_1:
+    case BUILT_IN_SYNC_OR_AND_FETCH_2:
+    case BUILT_IN_SYNC_OR_AND_FETCH_4:
+    case BUILT_IN_SYNC_OR_AND_FETCH_8:
+    case BUILT_IN_SYNC_OR_AND_FETCH_16:
+      return optimize_atomic_op_fetch_cmp_0 (&i, IFN_ATOMIC_OR_FETCH_CMP_0,
+					     false);
+
+    default:
+      return false;
+    }
+  if (!result)
+    return false;
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "Simplified\n  ");
+      print_gimple_stmt (dump_file, stmt, 0, dump_flags);
+    }
+
+  gimplify_and_update_call_from_tree (&i, result);
+
+  stmt = gsi_stmt (i);
+  update_stmt (stmt);
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "to\n  ");
+      print_gimple_stmt (dump_file, stmt, 0, dump_flags);
+      fprintf (dump_file, "\n");
+    }
+
+  return true;
 }
 
 /* A simple pass that attempts to fold all builtin functions.  This pass
@@ -4308,8 +4540,8 @@  unsigned int
 pass_fold_builtins::execute (function *fun)
 {
   bool cfg_changed = false;
+  bool update_addr = false;
   basic_block bb;
-  unsigned int todoflags = 0;
 
   FOR_EACH_BB_FN (bb, fun)
     {
@@ -4317,294 +4549,44 @@  pass_fold_builtins::execute (function *fun)
       for (i = gsi_start_bb (bb); !gsi_end_p (i); )
 	{
 	  gimple *stmt, *old_stmt;
-	  tree callee;
-	  enum built_in_function fcode;
-
-	  stmt = gsi_stmt (i);
 
-          if (gimple_code (stmt) != GIMPLE_CALL)
-	    {
-	      if (gimple_assign_load_p (stmt) && gimple_store_p (stmt))
-		optimize_memcpy (&i, gimple_assign_lhs (stmt),
-				 gimple_assign_rhs1 (stmt), NULL_TREE);
-	      gsi_next (&i);
-	      continue;
-	    }
+	  old_stmt = stmt = gsi_stmt (i);
 
-	  callee = gimple_call_fndecl (stmt);
-	  if (!callee
-	      && gimple_call_internal_p (stmt, IFN_ASSUME))
-	    {
-	      gsi_remove (&i, true);
-	      continue;
-	    }
-	  if (!callee || !fndecl_built_in_p (callee, BUILT_IN_NORMAL))
+	  if (!fold_all_builtin_stmt (i, stmt, cfg_changed))
 	    {
 	      gsi_next (&i);
 	      continue;
 	    }
 
-	  fcode = DECL_FUNCTION_CODE (callee);
-	  if (fold_stmt (&i))
-	    ;
-	  else
-	    {
-	      tree result = NULL_TREE;
-	      switch (DECL_FUNCTION_CODE (callee))
-		{
-		case BUILT_IN_CONSTANT_P:
-		  /* Resolve __builtin_constant_p.  If it hasn't been
-		     folded to integer_one_node by now, it's fairly
-		     certain that the value simply isn't constant.  */
-		  result = integer_zero_node;
-		  break;
-
-		case BUILT_IN_ASSUME_ALIGNED:
-		  /* Remove __builtin_assume_aligned.  */
-		  result = gimple_call_arg (stmt, 0);
-		  break;
-
-		case BUILT_IN_STACK_RESTORE:
-		  result = optimize_stack_restore (i);
-		  if (result)
-		    break;
-		  gsi_next (&i);
-		  continue;
+	  /* The statement was folded, so the set of addresses
+	     taken may have changed.  */
+	  update_addr = true;
 
-		case BUILT_IN_UNREACHABLE:
-		  if (optimize_unreachable (i))
-		    cfg_changed = true;
-		  break;
-
-		case BUILT_IN_ATOMIC_ADD_FETCH_1:
-		case BUILT_IN_ATOMIC_ADD_FETCH_2:
-		case BUILT_IN_ATOMIC_ADD_FETCH_4:
-		case BUILT_IN_ATOMIC_ADD_FETCH_8:
-		case BUILT_IN_ATOMIC_ADD_FETCH_16:
-		  optimize_atomic_op_fetch_cmp_0 (&i,
-						  IFN_ATOMIC_ADD_FETCH_CMP_0,
-						  true);
-		  break;
-		case BUILT_IN_SYNC_ADD_AND_FETCH_1:
-		case BUILT_IN_SYNC_ADD_AND_FETCH_2:
-		case BUILT_IN_SYNC_ADD_AND_FETCH_4:
-		case BUILT_IN_SYNC_ADD_AND_FETCH_8:
-		case BUILT_IN_SYNC_ADD_AND_FETCH_16:
-		  optimize_atomic_op_fetch_cmp_0 (&i,
-						  IFN_ATOMIC_ADD_FETCH_CMP_0,
-						  false);
-		  break;
-
-		case BUILT_IN_ATOMIC_SUB_FETCH_1:
-		case BUILT_IN_ATOMIC_SUB_FETCH_2:
-		case BUILT_IN_ATOMIC_SUB_FETCH_4:
-		case BUILT_IN_ATOMIC_SUB_FETCH_8:
-		case BUILT_IN_ATOMIC_SUB_FETCH_16:
-		  optimize_atomic_op_fetch_cmp_0 (&i,
-						  IFN_ATOMIC_SUB_FETCH_CMP_0,
-						  true);
-		  break;
-		case BUILT_IN_SYNC_SUB_AND_FETCH_1:
-		case BUILT_IN_SYNC_SUB_AND_FETCH_2:
-		case BUILT_IN_SYNC_SUB_AND_FETCH_4:
-		case BUILT_IN_SYNC_SUB_AND_FETCH_8:
-		case BUILT_IN_SYNC_SUB_AND_FETCH_16:
-		  optimize_atomic_op_fetch_cmp_0 (&i,
-						  IFN_ATOMIC_SUB_FETCH_CMP_0,
-						  false);
-		  break;
-
-		case BUILT_IN_ATOMIC_FETCH_OR_1:
-		case BUILT_IN_ATOMIC_FETCH_OR_2:
-		case BUILT_IN_ATOMIC_FETCH_OR_4:
-		case BUILT_IN_ATOMIC_FETCH_OR_8:
-		case BUILT_IN_ATOMIC_FETCH_OR_16:
-		  optimize_atomic_bit_test_and (&i,
-						IFN_ATOMIC_BIT_TEST_AND_SET,
-						true, false);
-		  break;
-		case BUILT_IN_SYNC_FETCH_AND_OR_1:
-		case BUILT_IN_SYNC_FETCH_AND_OR_2:
-		case BUILT_IN_SYNC_FETCH_AND_OR_4:
-		case BUILT_IN_SYNC_FETCH_AND_OR_8:
-		case BUILT_IN_SYNC_FETCH_AND_OR_16:
-		  optimize_atomic_bit_test_and (&i,
-						IFN_ATOMIC_BIT_TEST_AND_SET,
-						false, false);
-		  break;
-
-		case BUILT_IN_ATOMIC_FETCH_XOR_1:
-		case BUILT_IN_ATOMIC_FETCH_XOR_2:
-		case BUILT_IN_ATOMIC_FETCH_XOR_4:
-		case BUILT_IN_ATOMIC_FETCH_XOR_8:
-		case BUILT_IN_ATOMIC_FETCH_XOR_16:
-		  optimize_atomic_bit_test_and
-			(&i, IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT, true, false);
-		  break;
-		case BUILT_IN_SYNC_FETCH_AND_XOR_1:
-		case BUILT_IN_SYNC_FETCH_AND_XOR_2:
-		case BUILT_IN_SYNC_FETCH_AND_XOR_4:
-		case BUILT_IN_SYNC_FETCH_AND_XOR_8:
-		case BUILT_IN_SYNC_FETCH_AND_XOR_16:
-		  optimize_atomic_bit_test_and
-			(&i, IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT, false, false);
-		  break;
-
-		case BUILT_IN_ATOMIC_XOR_FETCH_1:
-		case BUILT_IN_ATOMIC_XOR_FETCH_2:
-		case BUILT_IN_ATOMIC_XOR_FETCH_4:
-		case BUILT_IN_ATOMIC_XOR_FETCH_8:
-		case BUILT_IN_ATOMIC_XOR_FETCH_16:
-		  if (optimize_atomic_bit_test_and
-			(&i, IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT, true, true))
-		    break;
-		  optimize_atomic_op_fetch_cmp_0 (&i,
-						  IFN_ATOMIC_XOR_FETCH_CMP_0,
-						  true);
-		  break;
-		case BUILT_IN_SYNC_XOR_AND_FETCH_1:
-		case BUILT_IN_SYNC_XOR_AND_FETCH_2:
-		case BUILT_IN_SYNC_XOR_AND_FETCH_4:
-		case BUILT_IN_SYNC_XOR_AND_FETCH_8:
-		case BUILT_IN_SYNC_XOR_AND_FETCH_16:
-		  if (optimize_atomic_bit_test_and
-			(&i, IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT, false, true))
-		    break;
-		  optimize_atomic_op_fetch_cmp_0 (&i,
-						  IFN_ATOMIC_XOR_FETCH_CMP_0,
-						  false);
-		  break;
-
-		case BUILT_IN_ATOMIC_FETCH_AND_1:
-		case BUILT_IN_ATOMIC_FETCH_AND_2:
-		case BUILT_IN_ATOMIC_FETCH_AND_4:
-		case BUILT_IN_ATOMIC_FETCH_AND_8:
-		case BUILT_IN_ATOMIC_FETCH_AND_16:
-		  optimize_atomic_bit_test_and (&i,
-						IFN_ATOMIC_BIT_TEST_AND_RESET,
-						true, false);
-		  break;
-		case BUILT_IN_SYNC_FETCH_AND_AND_1:
-		case BUILT_IN_SYNC_FETCH_AND_AND_2:
-		case BUILT_IN_SYNC_FETCH_AND_AND_4:
-		case BUILT_IN_SYNC_FETCH_AND_AND_8:
-		case BUILT_IN_SYNC_FETCH_AND_AND_16:
-		  optimize_atomic_bit_test_and (&i,
-						IFN_ATOMIC_BIT_TEST_AND_RESET,
-						false, false);
-		  break;
-
-		case BUILT_IN_ATOMIC_AND_FETCH_1:
-		case BUILT_IN_ATOMIC_AND_FETCH_2:
-		case BUILT_IN_ATOMIC_AND_FETCH_4:
-		case BUILT_IN_ATOMIC_AND_FETCH_8:
-		case BUILT_IN_ATOMIC_AND_FETCH_16:
-		  optimize_atomic_op_fetch_cmp_0 (&i,
-						  IFN_ATOMIC_AND_FETCH_CMP_0,
-						  true);
-		  break;
-		case BUILT_IN_SYNC_AND_AND_FETCH_1:
-		case BUILT_IN_SYNC_AND_AND_FETCH_2:
-		case BUILT_IN_SYNC_AND_AND_FETCH_4:
-		case BUILT_IN_SYNC_AND_AND_FETCH_8:
-		case BUILT_IN_SYNC_AND_AND_FETCH_16:
-		  optimize_atomic_op_fetch_cmp_0 (&i,
-						  IFN_ATOMIC_AND_FETCH_CMP_0,
-						  false);
-		  break;
-
-		case BUILT_IN_ATOMIC_OR_FETCH_1:
-		case BUILT_IN_ATOMIC_OR_FETCH_2:
-		case BUILT_IN_ATOMIC_OR_FETCH_4:
-		case BUILT_IN_ATOMIC_OR_FETCH_8:
-		case BUILT_IN_ATOMIC_OR_FETCH_16:
-		  optimize_atomic_op_fetch_cmp_0 (&i,
-						  IFN_ATOMIC_OR_FETCH_CMP_0,
-						  true);
-		  break;
-		case BUILT_IN_SYNC_OR_AND_FETCH_1:
-		case BUILT_IN_SYNC_OR_AND_FETCH_2:
-		case BUILT_IN_SYNC_OR_AND_FETCH_4:
-		case BUILT_IN_SYNC_OR_AND_FETCH_8:
-		case BUILT_IN_SYNC_OR_AND_FETCH_16:
-		  optimize_atomic_op_fetch_cmp_0 (&i,
-						  IFN_ATOMIC_OR_FETCH_CMP_0,
-						  false);
-		  break;
-
-		case BUILT_IN_MEMCPY:
-		  if (gimple_call_builtin_p (stmt, BUILT_IN_NORMAL)
-		      && TREE_CODE (gimple_call_arg (stmt, 0)) == ADDR_EXPR
-		      && TREE_CODE (gimple_call_arg (stmt, 1)) == ADDR_EXPR
-		      && TREE_CODE (gimple_call_arg (stmt, 2)) == INTEGER_CST)
-		    {
-		      tree dest = TREE_OPERAND (gimple_call_arg (stmt, 0), 0);
-		      tree src = TREE_OPERAND (gimple_call_arg (stmt, 1), 0);
-		      tree len = gimple_call_arg (stmt, 2);
-		      optimize_memcpy (&i, dest, src, len);
-		    }
-		  break;
-
-		case BUILT_IN_VA_START:
-		case BUILT_IN_VA_END:
-		case BUILT_IN_VA_COPY:
-		  /* These shouldn't be folded before pass_stdarg.  */
-		  result = optimize_stdarg_builtin (stmt);
-		  break;
-
-		default:;
-		}
-
-	      if (!result)
-		{
-		  gsi_next (&i);
-		  continue;
-		}
-
-	      gimplify_and_update_call_from_tree (&i, result);
-	    }
-
-	  todoflags |= TODO_update_address_taken;
-
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    {
-	      fprintf (dump_file, "Simplified\n  ");
-	      print_gimple_stmt (dump_file, stmt, 0, dump_flags);
-	    }
+	  /* The iterator may now be at the end of the block
+	     if the statement was removed.  */
+	  if (gsi_end_p (i))
+	    break;
 
-          old_stmt = stmt;
 	  stmt = gsi_stmt (i);
-	  update_stmt (stmt);
 
 	  if (maybe_clean_or_replace_eh_stmt (old_stmt, stmt)
 	      && gimple_purge_dead_eh_edges (bb))
 	    cfg_changed = true;
 
-	  if (dump_file && (dump_flags & TDF_DETAILS))
-	    {
-	      fprintf (dump_file, "to\n  ");
-	      print_gimple_stmt (dump_file, stmt, 0, dump_flags);
-	      fprintf (dump_file, "\n");
-	    }
-
-	  /* Retry the same statement if it changed into another
-	     builtin, there might be new opportunities now.  */
-          if (gimple_code (stmt) != GIMPLE_CALL)
-	    {
-	      gsi_next (&i);
-	      continue;
-	    }
-	  callee = gimple_call_fndecl (stmt);
-	  if (!callee
-	      || !fndecl_built_in_p (callee, fcode))
-	    gsi_next (&i);
+	  /* Retry the changed statement; there might be
+	     new opportunities now.  */
 	}
     }
 
+  unsigned int todoflags = 0;
+
   /* Delete unreachable blocks.  */
   if (cfg_changed)
     todoflags |= TODO_cleanup_cfg;
 
+  if (update_addr)
+    todoflags |= TODO_update_address_taken;
+
   return todoflags;
 }