diff mbox series

c++, dyninit, v3: Optimize C++ dynamic initialization by constants into DECL_INITIAL adjustment [PR102876]

Message ID ZzHDx8DBwdYmoPLe@tucnak
State New
Series c++, dyninit, v3: Optimize C++ dynamic initialization by constants into DECL_INITIAL adjustment [PR102876]

Commit Message

Jakub Jelinek Nov. 11, 2024, 8:43 a.m. UTC
Hi!

I'd like to ping the
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588539.html
patch.
Previous mails on this topic
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/thread.html#583289
https://gcc.gnu.org/pipermail/gcc-patches/2021-December/585994.html
As it has been a while, the patch doesn't apply cleanly anymore, so here
is an updated patch (minor changes), bootstrapped/regtested on x86_64-linux
and i686-linux:

2024-11-11  Jakub Jelinek  <jakub@redhat.com>

	PR c++/102876
gcc/
	* internal-fn.def (DYNAMIC_INIT_START, DYNAMIC_INIT_END): New internal
	functions.
	* internal-fn.cc (expand_DYNAMIC_INIT_START, expand_DYNAMIC_INIT_END):
	New functions.
	* tree-pass.h (make_pass_dyninit): Declare.
	* passes.def (pass_dyninit): Add after dce4.
	* function.h (struct function): Add has_dynamic_init bitfield.
	* tree-inline.cc (initialize_cfun): Copy over has_dynamic_init.
	(expand_call_inline): Or in has_dynamic_init from inlined fn into
	caller.
	* params.opt (--param=dynamic-init-max-size=): New param.
	* gimple-ssa-store-merging.cc (pass_data_dyninit): New variable.
	(class pass_dyninit): New type.
	(pass_dyninit::execute): New method.
	(make_pass_dyninit): New function.
gcc/cp/
	* decl2.cc (one_static_initialization_or_destruction): Emit
	.DYNAMIC_INIT_START and .DYNAMIC_INIT_END internal calls around
	dynamic initialization of variables that don't need a guard.
gcc/testsuite/
	* g++.dg/opt/init3.C: New test.


	Jakub

Comments

Jason Merrill Nov. 12, 2024, 3:18 p.m. UTC | #1
On 11/11/24 3:43 AM, Jakub Jelinek wrote:
> Hi!
> 
> I'd like to ping the
> https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588539.html
> patch.
> Previous mails on this topic
> https://gcc.gnu.org/pipermail/gcc-patches/2021-November/thread.html#583289
> https://gcc.gnu.org/pipermail/gcc-patches/2021-December/585994.html
> As it has been a while, the patch doesn't apply cleanly anymore, so here
> is an updated patch (minor changes), bootstrapped/regtested on x86_64-linux
> and i686-linux:
> 
> 2024-11-11  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR c++/102876
> gcc/
> 	* internal-fn.def (DYNAMIC_INIT_START, DYNAMIC_INIT_END): New internal
> 	functions.
> 	* internal-fn.cc (expand_DYNAMIC_INIT_START, expand_DYNAMIC_INIT_END):
> 	New functions.
> 	* tree-pass.h (make_pass_dyninit): Declare.
> 	* passes.def (pass_dyninit): Add after dce4.
> 	* function.h (struct function): Add has_dynamic_init bitfield.
> 	* tree-inline.cc (initialize_cfun): Copy over has_dynamic_init.
> 	(expand_call_inline): Or in has_dynamic_init from inlined fn into
> 	caller.
> 	* params.opt (--param=dynamic-init-max-size=): New param.
> 	* gimple-ssa-store-merging.cc (pass_data_dyninit): New variable.
> 	(class pass_dyninit): New type.
> 	(pass_dyninit::execute): New method.
> 	(make_pass_dyninit): New function.
> gcc/cp/
> 	* decl2.cc (one_static_initialization_or_destruction): Emit
> 	.DYNAMIC_INIT_START and .DYNAMIC_INIT_END internal calls around
> 	dynamic initialization of variables that don't need a guard.
> gcc/testsuite/
> 	* g++.dg/opt/init3.C: New test.

Why the v3 tag in the subject line?  I don't see any libstdc++ changes.

> +	  if (optimize && !optimize_debug && !guard_if_stmt && !sanitize)

Why suppress this optimization in these three conditions?

Well, I guess in the guard_if_stmt case we need to consider some TUs 
doing this optimization and others not, so a dynamic init in another TU 
could start with the constant initialization from this TU instead of the 
zero-init it expects.  If the guard is in the same COMDAT group as the 
variable we should be able to statically set the guard to 1 along with 
setting the variable value, and avoid the problem?

Why would we not want this with -Og or asan?

Jason
Jakub Jelinek Nov. 12, 2024, 3:32 p.m. UTC | #2
On Tue, Nov 12, 2024 at 10:18:49AM -0500, Jason Merrill wrote:
> Why the v3 tag in the subject line?  I don't see any libstdc++ changes.

3rd version of the patch.  That has nothing to do with libstdc++.

> > +	  if (optimize && !optimize_debug && !guard_if_stmt && !sanitize)
> 
> Why suppress this optimization in these three conditions?
> 
> Well, I guess in the guard_if_stmt case we need to consider some TUs doing
> this optimization and others not, so a dynamic init in another TU could
> start with the constant initialization from this TU instead of the zero-init
> it expects.  If the guard is in the same COMDAT group as the variable we
> should be able to statically set the guard to 1 along with setting the
> variable value, and avoid the problem?
> 
> Why would we not want this with -Og or asan?

I wanted to avoid -O0 or -Og in case the user wants to step through the global
constructor or functions called from there; the optimization (and it is an
optimization, so I think optimize is appropriate in any case) changes the
debugging experience.
For guard_if_stmt, either the guard-related code is in between the magic
ifns, and then it will never be optimized because it isn't turned into a
merged constant store, or the magic ifns are placed inside of it, and then it
would need more work: tell the post-IPA pass there is a guard around this and
let it deal with that.  So, in the latter case, not unsolvable, but something
that could be handled incrementally.
And for asan, the thing is that asan checks the order of the global ctors
at runtime, and the optimization could result in not detecting
some misuses.
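
As a minimal illustration of the kind of misuse meant here (a sketch only,
not part of the patch; peek, one, early and late are made-up names): early
reads late before late's dynamic initializer has run.  Without the
optimization early ends up 0; if late's initializer is folded into a static
initializer, early may end up 1 instead, which [basic.start.static]/3
permits.  As far as I know the asan init-order check is about accesses
across translation units, so this single-file version only shows the
change in the observable value, not an asan report.

```cpp
// Hypothetical example; the names are illustrative and do not come from
// the patch or from its testcase.
extern int late;                 // defined further down in this TU

int peek (int &x) { return x; }  // reads whatever value x holds right now
static int one () { return 1; }

// Initialized first (earlier definition in the same TU), so it reads late
// before late's own dynamic initializer has run.
int early = peek (late);

// If the compiler turns this into a static initializer, the read above
// sees 1 rather than the zero-initialized value.
int late = one ();
```

After initialization late is always 1, while early is 0 or 1 depending on
whether the compiler initialized late statically.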

	Jakub
Jakub Jelinek Nov. 12, 2024, 3:47 p.m. UTC | #3
On Tue, Nov 12, 2024 at 04:32:29PM +0100, Jakub Jelinek wrote:
> I wanted to avoid -O0 or -Og in case the user wants to step through the global
> constructor or functions called from there; the optimization (and it is an
> optimization, so I think optimize is appropriate in any case) changes the
> debugging experience.

Not to mention that the pass isn't run at all with -Og or -O0 (it is among the
passes which are only done for non-O0/-Og; -Og has its own post-IPA pass
queue and -O0 skips it).

	Jakub

Patch

--- gcc/internal-fn.def.jj	2024-11-06 18:53:15.015784686 +0100
+++ gcc/internal-fn.def	2024-11-08 12:41:43.324476834 +0100
@@ -527,6 +527,10 @@  DEF_INTERNAL_FN (DEFERRED_INIT, ECF_CONS
    2nd argument.  */
 DEF_INTERNAL_FN (ACCESS_WITH_SIZE, ECF_PURE | ECF_LEAF | ECF_NOTHROW, NULL)
 
+/* Mark start and end of dynamic initialization of a variable.  */
+DEF_INTERNAL_FN (DYNAMIC_INIT_START, ECF_LEAF | ECF_NOTHROW, ".cr ")
+DEF_INTERNAL_FN (DYNAMIC_INIT_END, ECF_LEAF | ECF_NOTHROW, ".cr ")
+
 /* DIM_SIZE and DIM_POS return the size of a particular compute
    dimension and the executing thread's position within that
    dimension.  DIM_POS is pure (and not const) so that it isn't
--- gcc/internal-fn.cc.jj	2024-11-06 18:53:14.993784998 +0100
+++ gcc/internal-fn.cc	2024-11-08 12:41:43.348476494 +0100
@@ -3962,6 +3962,16 @@  expand_CO_ACTOR (internal_fn, gcall *)
   gcc_unreachable ();
 }
 
+static void
+expand_DYNAMIC_INIT_START (internal_fn, gcall *)
+{
+}
+
+static void
+expand_DYNAMIC_INIT_END (internal_fn, gcall *)
+{
+}
+
 /* Expand a call to FN using the operands in STMT.  FN has a single
    output operand and NARGS input operands.  */
 
--- gcc/tree-pass.h.jj	2024-09-03 16:48:01.829843825 +0200
+++ gcc/tree-pass.h	2024-11-08 12:41:43.348476494 +0100
@@ -455,6 +455,7 @@  extern gimple_opt_pass *make_pass_cse_si
 extern gimple_opt_pass *make_pass_expand_pow (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_optimize_bswap (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_store_merging (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_dyninit (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_optimize_widening_mul (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_warn_function_return (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_warn_function_noreturn (gcc::context *ctxt);
--- gcc/passes.def.jj	2024-11-08 12:41:43.364476266 +0100
+++ gcc/passes.def	2024-11-08 12:49:41.101693712 +0100
@@ -271,6 +271,7 @@  along with GCC; see the file COPYING3.
       NEXT_PASS (pass_tsan);
       NEXT_PASS (pass_dse, true /* use DR analysis */);
       NEXT_PASS (pass_dce, false /* update_address_taken_p */, false /* remove_unused_locals */);
+      NEXT_PASS (pass_dyninit);
       /* Pass group that runs when 1) enabled, 2) there are loops
 	 in the function.  Make sure to run pass_fix_loops before
 	 to discover/remove loops before running the gate function
--- gcc/function.h.jj	2024-11-08 12:41:43.365476252 +0100
+++ gcc/function.h	2024-11-08 12:50:25.420064516 +0100
@@ -449,6 +449,10 @@  struct GTY(()) function {
   /* Set for artificial function created for [[assume (cond)]].
      These should be GIMPLE optimized, but not expanded to RTL.  */
   unsigned int assume_function : 1;
+
+  /* Set if there are any .DYNAMIC_INIT_{START,END} calls in the
+     function.  */
+  unsigned int has_dynamic_init : 1;
 };
 
 /* Add the decl D to the local_decls list of FUN.  */
--- gcc/tree-inline.cc.jj	2024-10-25 10:00:29.531766956 +0200
+++ gcc/tree-inline.cc	2024-11-08 12:41:43.377476081 +0100
@@ -2854,6 +2854,7 @@  initialize_cfun (tree new_fndecl, tree c
   cfun->can_delete_dead_exceptions = src_cfun->can_delete_dead_exceptions;
   cfun->returns_struct = src_cfun->returns_struct;
   cfun->returns_pcc_struct = src_cfun->returns_pcc_struct;
+  cfun->has_dynamic_init = src_cfun->has_dynamic_init;
 
   init_empty_tree_cfg ();
 
@@ -5037,6 +5038,7 @@  expand_call_inline (basic_block bb, gimp
   dst_cfun->calls_eh_return |= id->src_cfun->calls_eh_return;
   id->dst_node->calls_declare_variant_alt
     |= id->src_node->calls_declare_variant_alt;
+  dst_cfun->has_dynamic_init |= id->src_cfun->has_dynamic_init;
 
   gcc_assert (!id->src_cfun->after_inlining);
 
--- gcc/params.opt.jj	2024-10-31 08:45:38.257823859 +0100
+++ gcc/params.opt	2024-11-08 12:41:43.393475854 +0100
@@ -1226,4 +1226,8 @@  Maximum number of outgoing edges in a sw
 Common Joined UInteger Var(param_vrp_vector_threshold) Init(250) Optimization Param
 Maximum number of basic blocks for VRP to use a basic cache vector.
 
+-param=dynamic-init-max-size=
+Common Joined UInteger Var(param_dynamic_init_max_size) Init(1024) Param Optimization
+Maximum size of a dynamically initialized namespace scope C++ variable for dynamic into constant initialization optimization.
+
 ; This comment is to ensure we retain the blank line above.
--- gcc/gimple-ssa-store-merging.cc.jj	2024-11-06 10:22:07.353772630 +0100
+++ gcc/gimple-ssa-store-merging.cc	2024-11-08 12:41:43.412475585 +0100
@@ -171,6 +171,8 @@ 
 #include "optabs-tree.h"
 #include "dbgcnt.h"
 #include "selftest.h"
+#include "cgraph.h"
+#include "varasm.h"
 
 /* The maximum size (in bits) of the stores this pass should generate.  */
 #define MAX_STORE_BITSIZE (BITS_PER_WORD)
@@ -5612,6 +5614,334 @@  pass_store_merging::execute (function *f
   return 0;
 }
 
+/* Pass to optimize C++ dynamic initialization.  */
+
+const pass_data pass_data_dyninit = {
+  GIMPLE_PASS,     /* type */
+  "dyninit",	   /* name */
+  OPTGROUP_NONE,   /* optinfo_flags */
+  TV_GIMPLE_STORE_MERGING,	 /* tv_id */
+  PROP_ssa,	/* properties_required */
+  0,		   /* properties_provided */
+  0,		   /* properties_destroyed */
+  0,		   /* todo_flags_start */
+  0,		/* todo_flags_finish */
+};
+
+class pass_dyninit : public gimple_opt_pass
+{
+public:
+  pass_dyninit (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_dyninit, ctxt)
+  {
+  }
+
+  virtual bool
+  gate (function *fun)
+  {
+    return (DECL_ARTIFICIAL (fun->decl)
+	    && DECL_STATIC_CONSTRUCTOR (fun->decl)
+	    && optimize
+	    && !optimize_debug
+	    && fun->has_dynamic_init);
+  }
+
+  virtual unsigned int execute (function *);
+}; // class pass_dyninit
+
+unsigned int
+pass_dyninit::execute (function *fun)
+{
+  basic_block bb;
+  auto_vec<gimple *, 32> ifns;
+  hash_map<tree, gimple *> *map = NULL;
+  auto_vec<tree, 32> vars;
+  gimple **cur = NULL;
+  bool ssdf_calls = false;
+
+  FOR_EACH_BB_FN (bb, fun)
+    {
+      for (gimple_stmt_iterator gsi = gsi_after_labels (bb);
+	   !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple *stmt = gsi_stmt (gsi);
+	  if (is_gimple_debug (stmt))
+	    continue;
+
+	  /* The C++ FE can wrap dynamic initialization of certain
+	     variables with a pair of internal function calls, like:
+	     .DYNAMIC_INIT_START (&b, 0);
+	     b = 1;
+	     .DYNAMIC_INIT_END (&b);
+
+	     or
+	     .DYNAMIC_INIT_START (&e, 1);
+	     # DEBUG this => &e.f
+	     MEM[(struct S *)&e + 4B] ={v} {CLOBBER};
+	     MEM[(struct S *)&e + 4B].a = 1;
+	     MEM[(struct S *)&e + 4B].b = 2;
+	     MEM[(struct S *)&e + 4B].c = 3;
+	     # DEBUG BEGIN_STMT
+	     MEM[(struct S *)&e + 4B].d = 6;
+	     # DEBUG this => NULL
+	     .DYNAMIC_INIT_END (&e);
+
+	     Verify if there are only stores of constants to the corresponding
+	     variable or parts of that variable and if so, try to reconstruct
+	     a static initializer from the static initializer if any and
+	     the constant stores into the variable.  This is permitted by
+	     [basic.start.static]/3.  */
+	  if (is_gimple_call (stmt))
+	    {
+	      if (gimple_call_internal_p (stmt, IFN_DYNAMIC_INIT_START))
+		{
+		  ifns.safe_push (stmt);
+		  if (cur)
+		    *cur = NULL;
+		  tree arg = gimple_call_arg (stmt, 0);
+		  gcc_assert (TREE_CODE (arg) == ADDR_EXPR
+			      && DECL_P (TREE_OPERAND (arg, 0)));
+		  tree var = TREE_OPERAND (arg, 0);
+		  gcc_checking_assert (is_global_var (var));
+		  varpool_node *node = varpool_node::get (var);
+		  if (node == NULL
+		      || node->in_other_partition
+		      || TREE_ASM_WRITTEN (var)
+		      || DECL_SIZE_UNIT (var) == NULL_TREE
+		      || !tree_fits_uhwi_p (DECL_SIZE_UNIT (var))
+		      || (tree_to_uhwi (DECL_SIZE_UNIT (var))
+			  > (unsigned) param_dynamic_init_max_size)
+		      || TYPE_SIZE_UNIT (TREE_TYPE (var)) == NULL_TREE
+		      || !tree_int_cst_equal (TYPE_SIZE_UNIT (TREE_TYPE (var)),
+					      DECL_SIZE_UNIT (var)))
+		    continue;
+		  if (map == NULL)
+		    map = new hash_map<tree, gimple *> (61);
+		  bool existed_p;
+		  cur = &map->get_or_insert (var, &existed_p);
+		  if (existed_p)
+		    {
+		      /* Punt if we see more than one .DYNAMIC_INIT_START
+			 internal call for the same variable.  */
+		      *cur = NULL;
+		      cur = NULL;
+		    }
+		  else
+		    {
+		      *cur = stmt;
+		      vars.safe_push (var);
+		    }
+		  continue;
+		}
+	      else if (gimple_call_internal_p (stmt, IFN_DYNAMIC_INIT_END))
+		{
+		  ifns.safe_push (stmt);
+		  tree arg = gimple_call_arg (stmt, 0);
+		  gcc_assert (TREE_CODE (arg) == ADDR_EXPR
+			      && DECL_P (TREE_OPERAND (arg, 0)));
+		  tree var = TREE_OPERAND (arg, 0);
+		  gcc_checking_assert (is_global_var (var));
+		  if (cur)
+		    {
+		      /* Punt if .DYNAMIC_INIT_END call argument doesn't
+			 pair with .DYNAMIC_INIT_START.  */
+		      if (vars.last () != var)
+			*cur = NULL;
+		      cur = NULL;
+		    }
+		  continue;
+		}
+
+	      /* Punt if we see any artificial
+		 __static_initialization_and_destruction_* calls, e.g. if
+		 it would be partially inlined, because we wouldn't then see
+		 all .DYNAMIC_INIT_* calls.  */
+	      tree fndecl = gimple_call_fndecl (stmt);
+	      if (fndecl
+		  && DECL_ARTIFICIAL (fndecl)
+		  && DECL_STRUCT_FUNCTION (fndecl)
+		  && DECL_STRUCT_FUNCTION (fndecl)->has_dynamic_init)
+		ssdf_calls = true;
+	    }
+	  if (cur)
+	    {
+	      if (store_valid_for_store_merging_p (stmt))
+		{
+		  tree lhs = gimple_assign_lhs (stmt);
+		  tree rhs = gimple_assign_rhs1 (stmt);
+		  poly_int64 bitsize, bitpos;
+		  HOST_WIDE_INT ibitsize, ibitpos;
+		  machine_mode mode;
+		  int unsignedp, reversep, volatilep = 0;
+		  tree offset;
+		  tree var = vars.last ();
+		  if (rhs_valid_for_store_merging_p (rhs)
+		      && get_inner_reference (lhs, &bitsize, &bitpos, &offset,
+					      &mode, &unsignedp, &reversep,
+					      &volatilep) == var
+		      && !reversep
+		      && !volatilep
+		      && (offset == NULL_TREE || integer_zerop (offset))
+		      && bitsize.is_constant (&ibitsize)
+		      && bitpos.is_constant (&ibitpos)
+		      && ibitpos >= 0
+		      && ibitsize <= tree_to_shwi (DECL_SIZE (var))
+		      && ibitsize + ibitpos <= tree_to_shwi (DECL_SIZE (var)))
+		    continue;
+		}
+	      *cur = NULL;
+	      cur = NULL;
+	    }
+	}
+      if (cur)
+	{
+	  *cur = NULL;
+	  cur = NULL;
+	}
+    }
+  if (map && !ssdf_calls)
+    {
+      for (tree var : vars)
+	{
+	  gimple *g = *map->get (var);
+	  if (g == NULL)
+	    continue;
+	  varpool_node *node = varpool_node::get (var);
+	  node->get_constructor ();
+	  tree init = DECL_INITIAL (var);
+	  if (init == NULL)
+	    init = build_zero_cst (TREE_TYPE (var));
+	  gimple_stmt_iterator gsi = gsi_for_stmt (g);
+	  unsigned char *buf = NULL;
+	  unsigned int buf_size = tree_to_uhwi (DECL_SIZE_UNIT (var));
+	  bool buf_valid = false;
+	  do
+	    {
+	      gsi_next (&gsi);
+	      gimple *stmt = gsi_stmt (gsi);
+	      if (is_gimple_debug (stmt))
+		continue;
+	      if (is_gimple_call (stmt))
+		break;
+	      if (gimple_clobber_p (stmt))
+		continue;
+	      tree lhs = gimple_assign_lhs (stmt);
+	      tree rhs = gimple_assign_rhs1 (stmt);
+	      if (lhs == var)
+		{
+		  /* Simple assignment to the whole variable.
+		     rhs is the initializer.  */
+		  buf_valid = false;
+		  init = rhs;
+		  continue;
+		}
+	      poly_int64 bitsize, bitpos;
+	      machine_mode mode;
+	      int unsignedp, reversep, volatilep = 0;
+	      tree offset;
+	      get_inner_reference (lhs, &bitsize, &bitpos, &offset,
+				   &mode, &unsignedp, &reversep, &volatilep);
+	      HOST_WIDE_INT ibitsize = bitsize.to_constant ();
+	      HOST_WIDE_INT ibitpos = bitpos.to_constant ();
+	      if (BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN
+		  || CHAR_BIT != 8
+		  || BITS_PER_UNIT != 8)
+		{
+		  g = NULL;
+		  break;
+		}
+	      if (!buf_valid)
+		{
+		  if (buf == NULL)
+		    buf = XNEWVEC (unsigned char, buf_size * 2);
+		  memset (buf, 0, buf_size);
+		  if (native_encode_initializer (init, buf, buf_size)
+		      != (int) buf_size)
+		    {
+		      g = NULL;
+		      break;
+		    }
+		  buf_valid = true;
+		}
+	      /* Otherwise go through byte representation.  */
+	      if (!encode_tree_to_bitpos (rhs, buf, ibitsize,
+					  ibitpos, buf_size))
+		{
+		  g = NULL;
+		  break;
+		}
+	    }
+	  while (1);
+	  if (g == NULL)
+	    {
+	      XDELETE (buf);
+	      continue;
+	    }
+	  if (buf_valid)
+	    {
+	      init = native_interpret_aggregate (TREE_TYPE (var), buf, 0,
+						 buf_size);
+	      if (init)
+		{
+		  /* Verify the dynamic initialization doesn't e.g. set
+		     some padding bits to non-zero by trying to encode
+		     it again and comparing.  */
+		  memset (buf + buf_size, 0, buf_size);
+		  if (native_encode_initializer (init, buf + buf_size,
+						 buf_size) != (int) buf_size
+		      || memcmp (buf, buf + buf_size, buf_size) != 0)
+		    init = NULL_TREE;
+		}
+	    }
+	  XDELETE (buf);
+	  if (!init || !initializer_constant_valid_p (init, TREE_TYPE (var)))
+	    continue;
+	  if (integer_nonzerop (gimple_call_arg (g, 1)))
+	    TREE_READONLY (var) = 1;
+	  if (dump_file)
+	    {
+	      fprintf (dump_file, "dynamic initialization of ");
+	      print_generic_stmt (dump_file, var, TDF_SLIM);
+	      fprintf (dump_file, " optimized into: ");
+	      print_generic_stmt (dump_file, init, TDF_SLIM);
+	      if (TREE_READONLY (var))
+		fprintf (dump_file, " and making it read-only\n");
+	      fprintf (dump_file, "\n");
+	    }
+	  if (initializer_zerop (init))
+	    DECL_INITIAL (var) = NULL_TREE;
+	  else
+	    DECL_INITIAL (var) = init;
+	  gsi = gsi_for_stmt (g);
+	  gsi_next (&gsi);
+	  do
+	    {
+	      gimple *stmt = gsi_stmt (gsi);
+	      if (is_gimple_debug (stmt))
+		{
+		  gsi_next (&gsi);
+		  continue;
+		}
+	      if (is_gimple_call (stmt))
+		break;
+	      /* Remove now all the stores for the dynamic initialization.  */
+	      unlink_stmt_vdef (stmt);
+	      gsi_remove (&gsi, true);
+	      release_defs (stmt);
+	    }
+	  while (1);
+	}
+    }
+  delete map;
+  for (gimple *g : ifns)
+    {
+      gimple_stmt_iterator gsi = gsi_for_stmt (g);
+      unlink_stmt_vdef (g);
+      gsi_remove (&gsi, true);
+      release_defs (g);
+    }
+  return 0;
+}
 } // anon namespace
 
 /* Construct and return a store merging pass object.  */
@@ -5622,6 +5952,14 @@  make_pass_store_merging (gcc::context *c
   return new pass_store_merging (ctxt);
 }
 
+/* Construct and return a dyninit pass object.  */
+
+gimple_opt_pass *
+make_pass_dyninit (gcc::context *ctxt)
+{
+  return new pass_dyninit (ctxt);
+}
+
 #if CHECKING_P
 
 namespace selftest {
--- gcc/cp/decl2.cc.jj	2024-11-08 12:41:43.414475556 +0100
+++ gcc/cp/decl2.cc	2024-11-08 13:01:45.531408825 +0100
@@ -4398,10 +4398,36 @@  one_static_initialization_or_destruction
     {
       if (init)
 	{
-	  finish_expr_stmt (init);
-	  if (sanitize_flags_p (SANITIZE_ADDRESS, decl))
-	    if (varpool_node *vnode = varpool_node::get (decl))
-	      vnode->dynamically_initialized = 1;
+	  bool sanitize = sanitize_flags_p (SANITIZE_ADDRESS, decl);
+	  if (optimize && !optimize_debug && !guard_if_stmt && !sanitize)
+	    {
+	      tree t = build_fold_addr_expr (decl);
+	      tree type = TREE_TYPE (decl);
+	      tree is_const
+		= constant_boolean_node (TYPE_READONLY (type)
+					 && !cp_has_mutable_p (type),
+					 boolean_type_node);
+	      t = build_call_expr_internal_loc (DECL_SOURCE_LOCATION (decl),
+						IFN_DYNAMIC_INIT_START,
+						void_type_node, 2, t,
+						is_const);
+	      cfun->has_dynamic_init = true;
+	      finish_expr_stmt (t);
+	    }
+	  finish_expr_stmt (init);
+	  if (sanitize)
+	    {
+	      if (varpool_node *vnode = varpool_node::get (decl))
+		vnode->dynamically_initialized = 1;
+	    }
+	  else if (optimize && !optimize_debug && !guard_if_stmt)
+	    {
+	      tree t = build_fold_addr_expr (decl);
+	      t = build_call_expr_internal_loc (DECL_SOURCE_LOCATION (decl),
+						IFN_DYNAMIC_INIT_END,
+						void_type_node, 1, t);
+	      finish_expr_stmt (t);
+	    }
 	}
 
       /* If we're using __cxa_atexit, register a function that calls the
--- gcc/testsuite/g++.dg/opt/init3.C.jj	2024-11-08 12:41:43.415475542 +0100
+++ gcc/testsuite/g++.dg/opt/init3.C	2024-11-08 12:41:43.415475542 +0100
@@ -0,0 +1,31 @@ 
+// PR c++/102876
+// { dg-do compile }
+// { dg-options "-O2 -fdump-tree-dyninit" }
+// { dg-final { scan-tree-dump "dynamic initialization of b\[\n\r]* optimized into: 1" "dyninit" } }
+// { dg-final { scan-tree-dump "dynamic initialization of e\[\n\r]* optimized into: {.e=5, .f={.a=1, .b=2, .c=3, .d=6}, .g=6}\[\n\r]* and making it read-only" "dyninit" } }
+// { dg-final { scan-tree-dump "dynamic initialization of f\[\n\r]* optimized into: {.e=7, .f={.a=1, .b=2, .c=3, .d=6}, .g=1}" "dyninit" } }
+// { dg-final { scan-tree-dump "dynamic initialization of h\[\n\r]* optimized into: {.h=8, .i={.a=1, .b=2, .c=3, .d=6}, .j=9}" "dyninit" } }
+// { dg-final { scan-tree-dump-times "dynamic initialization of " 4 "dyninit" } }
+// { dg-final { scan-tree-dump-times "and making it read-only" 1 "dyninit" } }
+
+struct S { S () : a(1), b(2), c(3), d(4) { d += 2; } int a, b, c, d; };
+struct T { int e; S f; int g; };
+struct U { int h; mutable S i; int j; };
+extern int b;
+int foo (int &);
+int bar (int &);
+int baz () { return 1; }
+int qux () { return b = 2; }
+// Dynamic initialization of a shouldn't be optimized, foo can't be inlined.
+int a = foo (b);
+int b = baz ();
+// Likewise for c.
+int c = bar (b);
+// While qux is inlined, the dynamic initialization modifies another
+// variable, so punt for d as well.
+int d = qux ();
+const T e = { 5, S (), 6 };
+T f = { 7, S (), baz () };
+const T &g = e;
+const U h = { 8, S (), 9 };
+const U &i = h;
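
The byte-buffer part of pass_dyninit::execute above can be pictured with a
small standalone sketch.  This is an analogy under stated assumptions, not
the GCC code: the pass works on trees via native_encode_initializer,
encode_tree_to_bitpos and native_interpret_aggregate, while the sketch uses
memcpy on a plain struct.  P, Store, encode and fold_stores are hypothetical
names.  It encodes the initial value into a zeroed buffer, overlays the
constant stores, reads the fields back, and then re-encodes and compares so
that a store which dirtied padding makes it punt.

```cpp
#include <cstddef>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical stand-in for the variable's type; on common ABIs there are
// padding bytes between c and i.
struct P { char c; int i; };

// One constant store into the variable: raw bytes at a byte offset.
struct Store { std::size_t offset; std::vector<unsigned char> bytes; };

// Encode only the named fields into a zeroed buffer, so padding stays zero.
static void encode (const P &p, unsigned char *buf)
{
  std::memset (buf, 0, sizeof (P));
  std::memcpy (buf + offsetof (P, c), &p.c, sizeof p.c);
  std::memcpy (buf + offsetof (P, i), &p.i, sizeof p.i);
}

// Overlay STORES on the encoding of INITIAL.  Returns the folded value, or
// nothing if a store is out of range or touches padding.
std::optional<P> fold_stores (const P &initial,
			      const std::vector<Store> &stores)
{
  unsigned char buf[sizeof (P)], check[sizeof (P)];
  encode (initial, buf);
  for (const Store &s : stores)
    {
      if (s.offset > sizeof (P) || s.bytes.size () > sizeof (P) - s.offset)
	return std::nullopt;	// store does not fit in the variable
      if (s.bytes.empty ())
	continue;
      std::memcpy (buf + s.offset, s.bytes.data (), s.bytes.size ());
    }
  // Interpret the buffer back as a value of the type.
  P result{};
  std::memcpy (&result.c, buf + offsetof (P, c), sizeof result.c);
  std::memcpy (&result.i, buf + offsetof (P, i), sizeof result.i);
  // Re-encode and compare: a mismatch means some store set padding bytes.
  encode (result, check);
  if (std::memcmp (buf, check, sizeof (P)) != 0)
    return std::nullopt;
  return result;
}
```

A store of the bytes of an int at offsetof (P, i) folds into the result,
whereas a store that lands in the padding after c is rejected by the final
comparison.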