diff mbox series

c++/modules: Fix recursive dependencies [PR116317]

Message ID 67235749.170a0220.e81fe.267a@mx.google.com
State New
Headers show
Series c++/modules: Fix recursive dependencies [PR116317] | expand

Commit Message

Nathaniel Shead Oct. 31, 2024, 10:09 a.m. UTC
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

In cases like the linked PR we sometimes get mutually recursive
dependencies that both rely on the other to have been streamed as part
of their merge key information.  In the linked PR, this causes an ICE.

The root cause is that 'sort_cluster' is not correctly ordering the
dependencies; both the element_t specialisation and the
reverse_adaptor::first function decl depend on each other, but by
streaming element_t first it ends up trying to stream itself recursively
as part of calculating its own merge key, which apart from the checking
ICE will also cause issues on stream-in, as the merge key will not
properly stream.

There is a comment already in 'sort_cluster' describing this issue, but
it says:

     Finding the single cluster entry dep is very tricky and
     expensive.  Let's just not do that.  It's harmless in this case
     anyway.

However in this case it was not harmless: it's just somewhat luck that
the sorting happened to work for the existing cases in the testsuite.

This patch solves the issue by noting any declarations that rely on deps
first seen within their own merge key.  This declaration gets marked as
an "entry" dep; any of these deps that end up recursively referring back
to that entry dep as part of their own merge key do not.

Then within sort_cluster we can ensure that the entry dep is written to
be streamed first of its cluster; this will ensure that any other deps
are just emitted as back-references, and the mergeable dep itself will
structurally decompose.

	PR c++/116317

gcc/cp/ChangeLog:

	* module.cc (depset::DB_MAYBE_RECURSIVE_BIT): New flag.
	(depset::DB_ENTRY_BIT): New flag.
	(depset::is_maybe_recursive): New accessor.
	(depset::is_entry): New accessor.
	(depset::hash::writing_merge_key): New field.
	(trees_out::decl_value): Inform dep_hash while we're writing the
	merge key information for a decl.
	(depset::hash::add_dependency): Find recursive deps and mark the
	entry point.
	(sort_cluster): Ensure that the entry dep is streamed first.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/late-ret-4_a.H: New test.
	* g++.dg/modules/late-ret-4_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
---
 gcc/cp/module.cc                            | 88 +++++++++++++++------
 gcc/testsuite/g++.dg/modules/late-ret-4_a.H | 22 ++++++
 gcc/testsuite/g++.dg/modules/late-ret-4_b.C |  4 +
 3 files changed, 89 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/late-ret-4_a.H
 create mode 100644 gcc/testsuite/g++.dg/modules/late-ret-4_b.C

Comments

Jason Merrill Nov. 1, 2024, 4:08 p.m. UTC | #1
On 10/31/24 6:09 AM, Nathaniel Shead wrote:
> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> 
> -- >8 --
> 
> In cases like the linked PR we sometimes get mutually recursive
> dependencies that both rely on the other to have been streamed as part
> of their merge key information.  In the linked PR, this causes an ICE.
> 
> The root cause is that 'sort_cluster' is not correctly ordering the
> dependencies; both the element_t specialisation and the
> reverse_adaptor::first function decl depend on each other, but by
> streaming element_t first it ends up trying to stream itself recursively
> as part of calculating its own merge key, which apart from the checking
> ICE will also cause issues on stream-in, as the merge key will not
> properly stream.
> 
> There is a comment already in 'sort_cluster' describing this issue, but
> it says:
> 
>       Finding the single cluster entry dep is very tricky and
>       expensive.  Let's just not do that.  It's harmless in this case
>       anyway.
> 
> However in this case it was not harmless: it's just somewhat luck that
> the sorting happened to work for the existing cases in the testsuite.
> 
> This patch solves the issue by noting any declarations that rely on deps
> first seen within their own merge key.  This declaration gets marked as
> an "entry" dep; any of these deps that end up recursively referring back
> to that entry dep as part of their own merge key do not.
> 
> Then within sort_cluster we can ensure that the entry dep is written to
> be streamed first of its cluster; this will ensure that any other deps
> are just emitted as back-references, and the mergeable dep itself will
> structurally decompose.

Do you also want to adjust the comment before sort_cluster?

> // FIXME: I am not convinced this is needed and, if needed,                                                                         
> // sufficient.  We emit the decls in this order but that emission                                                                   
> // could walk into later decls (from the body of the decl, or default                                                               
> // arg-like things).  Why doesn't that walk do the right thing?  And                                                                
> // if it DTRT why do we need to sort here -- won't things naturally                                                                 
> // work?  I think part of the issue is that when we're going to refer                                                               
> // to an entity by name, and that entity is in the same cluster as us,                                                              
> // we need to actually walk that entity, if we've not already walked                                                                
> // it.       

OK either way.

Jason
Nathaniel Shead Nov. 2, 2024, 11:06 a.m. UTC | #2
On Fri, Nov 01, 2024 at 12:08:45PM -0400, Jason Merrill wrote:
> On 10/31/24 6:09 AM, Nathaniel Shead wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > 
> > -- >8 --
> > 
> > In cases like the linked PR we sometimes get mutually recursive
> > dependencies that both rely on the other to have been streamed as part
> > of their merge key information.  In the linked PR, this causes an ICE.
> > 
> > The root cause is that 'sort_cluster' is not correctly ordering the
> > dependencies; both the element_t specialisation and the
> > reverse_adaptor::first function decl depend on each other, but by
> > streaming element_t first it ends up trying to stream itself recursively
> > as part of calculating its own merge key, which apart from the checking
> > ICE will also cause issues on stream-in, as the merge key will not
> > properly stream.
> > 
> > There is a comment already in 'sort_cluster' describing this issue, but
> > it says:
> > 
> >       Finding the single cluster entry dep is very tricky and
> >       expensive.  Let's just not do that.  It's harmless in this case
> >       anyway.
> > 
> > However in this case it was not harmless: it's just somewhat luck that
> > the sorting happened to work for the existing cases in the testsuite.
> > 
> > This patch solves the issue by noting any declarations that rely on deps
> > first seen within their own merge key.  This declaration gets marked as
> > an "entry" dep; any of these deps that end up recursively referring back
> > to that entry dep as part of their own merge key do not.
> > 
> > Then within sort_cluster we can ensure that the entry dep is written to
> > be streamed first of its cluster; this will ensure that any other deps
> > are just emitted as back-references, and the mergeable dep itself will
> > structurally decompose.
> 
> Do you also want to adjust the comment before sort_cluster?
> 
> > // FIXME: I am not convinced this is needed and, if needed,
> > // sufficient.  We emit the decls in this order but that emission
> > // could walk into later decls (from the body of the decl, or default
> > // arg-like things).  Why doesn't that walk do the right thing?  And
> > // if it DTRT why do we need to sort here -- won't things naturally
> > // work?  I think part of the issue is that when we're going to refer
> > // to an entity by name, and that entity is in the same cluster as us,
> > // we need to actually walk that entity, if we've not already walked
> > // it.
> 
> OK either way.
> 

Thanks; I've elected to keep the comment for now (it was still helpful
for me when I was also wondering about why this function was needed) but
I might reword it from a FIXME later perhaps.

> Jason
>
diff mbox series

Patch

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 297ef85bb1e..17d15878990 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -2336,6 +2336,8 @@  private:
 				   entity. */
     DB_IMPORTED_BIT,		/* An imported entity.  */
     DB_UNREACHED_BIT,		/* A yet-to-be reached entity.  */
+    DB_MAYBE_RECURSIVE_BIT,	/* An entity maybe in a recursive cluster.  */
+    DB_ENTRY_BIT,		/* The first reached recursive dep.  */
     DB_HIDDEN_BIT,		/* A hidden binding.  */
     /* The following bits are not independent, but enumerating them is
        awkward.  */
@@ -2437,6 +2439,14 @@  public:
   {
     return get_flag_bit<DB_HIDDEN_BIT> ();
   }
+  bool is_maybe_recursive () const
+  {
+    return get_flag_bit<DB_MAYBE_RECURSIVE_BIT> ();
+  }
+  bool is_entry () const
+  {
+    return get_flag_bit<DB_ENTRY_BIT> ();
+  }
   bool is_type_spec () const
   {
     return get_flag_bit<DB_TYPE_SPEC_BIT> ();
@@ -2548,11 +2558,12 @@  public:
     depset *current;         /* Current depset being depended.  */
     unsigned section;	     /* When writing out, the section.  */
     bool reached_unreached;  /* We reached an unreached entity.  */
+    bool writing_merge_key;  /* We're writing merge key information.  */
 
   public:
     hash (size_t size, hash *c = NULL)
       : parent (size), chain (c), current (NULL), section (0),
-	reached_unreached (false)
+	reached_unreached (false), writing_merge_key (false)
     {
       worklist.create (size);
     }
@@ -7934,21 +7945,24 @@  trees_out::decl_value (tree decl, depset *dep)
   /* Stream the container, we want it correctly canonicalized before
      we start emitting keys for this decl.  */
   tree container = decl_container (decl);
-
   unsigned tpl_levels = 0;
-  if (decl != inner)
-    tpl_header (decl, &tpl_levels);
-  if (TREE_CODE (inner) == FUNCTION_DECL)
-    fn_parms_init (inner);
 
-  /* Now write out the merging information, and then really
-     install the tag values.  */
-  key_mergeable (tag, mk, decl, inner, container, dep);
+  {
+    auto wmk = make_temp_override (dep_hash->writing_merge_key, true);
+    if (decl != inner)
+      tpl_header (decl, &tpl_levels);
+    if (TREE_CODE (inner) == FUNCTION_DECL)
+      fn_parms_init (inner);
 
-  if (streaming_p ())
-    dump (dumper::MERGE)
-      && dump ("Wrote:%d's %s merge key %C:%N", tag,
-	       merge_kind_name[mk], TREE_CODE (decl), decl);
+    /* Now write out the merging information, and then really
+       install the tag values.  */
+    key_mergeable (tag, mk, decl, inner, container, dep);
+
+    if (streaming_p ())
+      dump (dumper::MERGE)
+	&& dump ("Wrote:%d's %s merge key %C:%N", tag,
+		 merge_kind_name[mk], TREE_CODE (decl), decl);
+  }
 
   if (TREE_CODE (inner) == FUNCTION_DECL)
     fn_parms_fini (inner);
@@ -13106,6 +13120,17 @@  depset::hash::add_dependency (depset *dep)
 	dep->deps.safe_push (current);
     }
 
+  /* If two dependencies recursively depend on each other existing within
+     their own merge keys, we must ensure that the first dep we saw while
+     walking is written first in this cluster.  See sort_cluster for more
+     details.  */
+  if (writing_merge_key)
+    {
+      dep->set_flag_bit<DB_MAYBE_RECURSIVE_BIT> ();
+      if (!current->is_maybe_recursive ())
+	current->set_flag_bit<DB_ENTRY_BIT> ();
+    }
+
   if (dep->is_unreached ())
     {
       /* The dependency is reachable now.  */
@@ -14117,7 +14142,7 @@  sort_cluster (depset::hash *original, depset *scc[], unsigned size)
 
   table.find_dependencies (nullptr);
 
-  vec<depset *> order = table.connect ();
+  auto_vec<depset *> order = table.connect ();
   gcc_checking_assert (order.length () == use_lwm);
 
   /* Now rewrite entries [0,lwm), in the dependency order we
@@ -14134,26 +14159,39 @@  sort_cluster (depset::hash *original, depset *scc[], unsigned size)
      from Foo's declaration, so we only need to treat Foo as mergable
      (We'll do structural comparison of TPL<decltype (arg)>).
 
-     Finding the single cluster entry dep is very tricky and
-     expensive.  Let's just not do that.  It's harmless in this case
-     anyway. */
-  unsigned pos = 0;
+     We approximate finding the single cluster entry dep by checking for
+     entities recursively depending on a dep first seen when streaming
+     its own merge key; the first dep we see in such a cluster should be
+     the first one streamed.  */
+  unsigned entry_pos = ~0u;
   unsigned cluster = ~0u;
   for (unsigned ix = 0; ix != order.length (); ix++)
     {
       gcc_checking_assert (order[ix]->is_special ());
+      bool tight = order[ix]->cluster == cluster;
       depset *dep = order[ix]->deps[0];
-      scc[pos++] = dep;
       dump (dumper::MERGE)
-	&& dump ("Mergeable %u is %N%s", ix, dep->get_entity (),
-		 order[ix]->cluster == cluster ? " (tight)" : "");
+	&& dump ("Mergeable %u is %N%s%s", ix, dep->get_entity (),
+		 tight ? " (tight)" : "", dep->is_entry () ? " (entry)" : "");
+      scc[ix] = dep;
+      if (tight)
+	{
+	  gcc_checking_assert (dep->is_maybe_recursive ());
+	  if (dep->is_entry ())
+	    {
+	      /* There should only be one entry dep in a cluster.  */
+	      gcc_checking_assert (!scc[entry_pos]->is_entry ());
+	      gcc_checking_assert (scc[entry_pos]->is_maybe_recursive ());
+	      scc[ix] = scc[entry_pos];
+	      scc[entry_pos] = dep;
+	    }
+	}
+      else
+	entry_pos = ix;
       cluster = order[ix]->cluster;
     }
 
-  gcc_checking_assert (pos == use_lwm);
-
-  order.release ();
-  dump (dumper::MERGE) && dump ("Ordered %u keys", pos);
+  dump (dumper::MERGE) && dump ("Ordered %u keys", order.length ());
   dump.outdent ();
 }
 
diff --git a/gcc/testsuite/g++.dg/modules/late-ret-4_a.H b/gcc/testsuite/g++.dg/modules/late-ret-4_a.H
new file mode 100644
index 00000000000..80c821407c6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/late-ret-4_a.H
@@ -0,0 +1,22 @@ 
+// PR c++/116317
+// { dg-additional-options "-fmodules-ts" }
+// { dg-module-cmi {} }
+
+template <typename T> using element_t = int;
+
+template <typename B>
+struct reverse_adaptor {
+  void pairwise();
+
+  template <typename T>
+  auto first(T self) -> element_t<decltype(self)>;
+
+  // We can get the same issue with function parameters
+  template <typename T>
+  void second(T self, element_t<decltype(self)> second);
+};
+
+template <typename B>
+void reverse_adaptor<B>::pairwise() {
+  reverse_adaptor<int>{};
+}
diff --git a/gcc/testsuite/g++.dg/modules/late-ret-4_b.C b/gcc/testsuite/g++.dg/modules/late-ret-4_b.C
new file mode 100644
index 00000000000..f9a83ba31f5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/late-ret-4_b.C
@@ -0,0 +1,4 @@ 
+// PR c++/116317
+// { dg-additional-options "-fmodules-ts -fno-module-lazy" }
+
+import "late-ret-4_a.H";