diff mbox series

[C++] Kill CLASSTYPE_SORTED_FIELDS

Message ID 70b2cf9b-fa5e-9572-386a-bd36cf010862@acm.org
State New
Headers show
Series [C++] Kill CLASSTYPE_SORTED_FIELDS | expand

Commit Message

Nathan Sidwell Aug. 28, 2017, 4:29 p.m. UTC
This patch replaces the sorted field vector with a hash map.  Lookup for 
non-function members of a complete class is now O(1), not O(log(n)).  We 
still do linear lookup when the class is incomplete.  Fixing that is 
still on the todo list.

This permits moving a bunch of sorted_field_vec handling from c-common 
to the c FE, and I'll post that in a subsequent patch.

committed to trunk.

nathan

Comments

Michael Matz Aug. 28, 2017, 4:58 p.m. UTC | #1
Hi,

On Mon, 28 Aug 2017, Nathan Sidwell wrote:

> This patch replaces the sorted field vector with a hash map.  Lookup for 
> non-function members of a complete class is now O(1), not O(log(n)).  
> We still do linear lookup when the class is incomplete.  Fixing that is 
> still on the todo list.

Have you any memory measurement on a non-trivial C++ codebase with many 
classes (template instantiations?)?  On LP64 platforms the sorted list 
for n members was 8+8*n bytes, the hash-map is 48 bytes itself plus 16*n 
bytes for the entries, where you have at least 13 entries always, next 31, 
next 61 entries (and so on).  You pay the price for the next larger chunk 
already at about 2/3 fill rate, so e.g a class with 10 members was 88 
bytes before and is 544 bytes now, which doesn't look that attractive 
given that the lookup before took four tries and now takes one, especially 
taking into account cache effects.


Ciao,
Michael.
Nathan Sidwell Aug. 28, 2017, 7:08 p.m. UTC | #2
Some quick tests show that memory use increased by 1.5%  Compilation 
time reduced by 1% to 3%

Your comment on IRC about not needing a identifier->decl map, just a 
decl hash, because the decl already has the identifier is a good one.  I 
recall considering that when converting namespaces, but there the 
anonymous namespace has a NULL name, which is annoying.  I just used the 
same map table here.

However, it's probably worth the effort.

It'd also be nice if the hash table primitive didn't unconditionally 
hold instrumentation counters.  AFAICT they only get folded to global 
counters upon destruction of the hash table.  Which for these things 
never happens.

Remember, when I succeed in folding METHOD_VEC into the same structure, 
we'll have several members in this table for all but the simplest of 
structures.

nathan
Richard Biener Aug. 29, 2017, 9:18 a.m. UTC | #3
On Mon, Aug 28, 2017 at 9:08 PM, Nathan Sidwell <nathan@acm.org> wrote:
> Some quick tests show that memory use increased by 1.5%  Compilation time
> reduced by 1% to 3%
>
> Your comment on IRC about not needing a identifier->decl map, just a decl
> hash, because the decl already has the identifier is a good one.  I recall
> considering that when converting namespaces, but there the anonymous
> namespace has a NULL name, which is annoying.  I just used the same map
> table here.
>
> However, it's probably worth the effort.
>
> It'd also be nice if the hash table primitive didn't unconditionally hold
> instrumentation counters.  AFAICT they only get folded to global counters
> upon destruction of the hash table.  Which for these things never happens.

There's some debug methods like ::collision that use the fields.

I think for embedding hash_table it would be nice to subclass it,
have hash_table_embed not including those fields / methods and have
hash_table derive from it.  Not sure what to do about things like
hash_map and hash_set -- either template them on the hash-table
type or always use the embed variant and see if sth blows up.

Richard.

> Remember, when I succeed in folding METHOD_VEC into the same structure,
> we'll have several members in this table for all but the simplest of
> structures.
>
> nathan
>
> --
> Nathan Sidwell
diff mbox series

Patch

2017-08-28  Nathan Sidwell  <nathan@acm.org>

	* cp-tree.h (lang_type): Replace sorted_fields vector with
	bindings map.
	(CLASSTYPE_SORTED_FIELDS): Delete.
	(CLASSTYPE_BINDINGS): New.
	* decl.c (finish_enum_value_list): Swap args of
	insert_late_enum_def_bindings.
	* name-lookup.c (lookup_field_1): Replace binary search of sorted
	fields with map->get.
	(sorted_fields_type_new, count_fields,
	add_fields_to_record_type, add_enum_fields_to_record_type): Delete.
	(add_class_member, add_class_members): New.
	(set_class_bindings): Create map and insert.
	(insert_late_enum_def_binding): Swap parms.  Use add_clasS_member.
	* ptree.c (cxx_print_type): Delete sorted fields printing.

Index: cp-tree.h
===================================================================
--- cp-tree.h	(revision 251387)
+++ cp-tree.h	(working copy)
@@ -2014,10 +2014,10 @@  struct GTY(()) lang_type {
      as a list of adopted protocols or a pointer to a corresponding
      @interface.  See objc/objc-act.h for details.  */
   tree objc_info;
-  /* sorted_fields is sorted based on a pointer, so we need to be able
-     to resort it if pointers get rearranged.  */
-  struct sorted_fields_type * GTY ((reorder ("resort_sorted_fields")))
-    sorted_fields;
+
+  /* Map from IDENTIFIER nodes to DECLS.  */
+  hash_map<lang_identifier *, tree> *bindings;
+
   /* FIXME reuse another field?  */
   tree lambda_expr;
 };
@@ -3243,10 +3243,9 @@  extern void decl_shadowed_for_var_insert
    && TREE_CODE (TYPE_NAME (NODE)) == TYPE_DECL	\
    && TYPE_DECL_ALIAS_P (TYPE_NAME (NODE)))
 
-/* For a class type: if this structure has many fields, we'll sort them
-   and put them into a TREE_VEC.  */
-#define CLASSTYPE_SORTED_FIELDS(NODE) \
-  (LANG_TYPE_CLASS_CHECK (NODE)->sorted_fields)
+/* The binding map for a class (not always present).  */
+#define CLASSTYPE_BINDINGS(NODE) \
+  (LANG_TYPE_CLASS_CHECK (NODE)->bindings)
 
 /* If non-NULL for a VAR_DECL, FUNCTION_DECL, TYPE_DECL or
    TEMPLATE_DECL, the entity is either a template specialization (if
Index: decl.c
===================================================================
--- decl.c	(revision 251387)
+++ decl.c	(working copy)
@@ -14316,7 +14316,8 @@  finish_enum_value_list (tree enumtype)
       && COMPLETE_TYPE_P (current_class_type)
       && UNSCOPED_ENUM_P (enumtype))
     {
-      insert_late_enum_def_bindings (enumtype, current_class_type);
+      insert_late_enum_def_bindings (current_class_type, enumtype);
+      /* TYPE_FIELDS needs fixup.  */
       fixup_type_variants (current_class_type);
     }
 
Index: name-lookup.c
===================================================================
--- name-lookup.c	(revision 251387)
+++ name-lookup.c	(working copy)
@@ -1183,58 +1183,33 @@  lookup_fnfields_slot_nolazy (tree type,
 tree
 lookup_field_1 (tree type, tree name, bool want_type)
 {
-  tree field;
+  tree field = NULL_TREE;
 
   gcc_assert (identifier_p (name) && RECORD_OR_UNION_TYPE_P (type));
 
-  if (CLASSTYPE_SORTED_FIELDS (type))
+  if (CLASSTYPE_BINDINGS (type))
     {
-      tree *fields = &CLASSTYPE_SORTED_FIELDS (type)->elts[0];
-      int lo = 0, hi = CLASSTYPE_SORTED_FIELDS (type)->len;
-      int i;
-
-      while (lo < hi)
-	{
-	  i = (lo + hi) / 2;
-
-	  if (DECL_NAME (fields[i]) > name)
-	    hi = i;
-	  else if (DECL_NAME (fields[i]) < name)
-	    lo = i + 1;
-	  else
-	    {
-	      field = NULL_TREE;
+      tree *slot = CLASSTYPE_BINDINGS (type)->get (name);
+
+      if (slot)
+	{
+	  field = *slot;
 
-	      /* We might have a nested class and a field with the
-		 same name; we sorted them appropriately via
-		 field_decl_cmp, so just look for the first or last
-		 field with this name.  */
+	  if (STAT_HACK_P (field))
+	    {
 	      if (want_type)
-		{
-		  do
-		    field = fields[i--];
-		  while (i >= lo && DECL_NAME (fields[i]) == name);
-		  if (!DECL_DECLARES_TYPE_P (field))
-		    field = NULL_TREE;
-		}
+		field = STAT_TYPE (field);
 	      else
-		{
-		  do
-		    field = fields[i++];
-		  while (i < hi && DECL_NAME (fields[i]) == name);
-		}
-
-	      if (field)
-	      	{
-	      	  field = strip_using_decl (field);
-	      	  if (is_overloaded_fn (field))
-	      	    field = NULL_TREE;
-	      	}
-
-	      return field;
+		field = STAT_DECL (field);
 	    }
+
+	  field = strip_using_decl (field);
+	  if (OVL_P (field))
+	    field = NULL_TREE;
+	  else if (want_type && !DECL_DECLARES_TYPE_P (field))
+	    field = NULL_TREE;
 	}
-      return NULL_TREE;
+      return field;
     }
 
   field = TYPE_FIELDS (type);
@@ -1312,113 +1287,62 @@  lookup_fnfields_slot (tree type, tree na
   return lookup_fnfields_slot_nolazy (type, name);
 }
 
-/* Allocate and return an instance of struct sorted_fields_type with
-   N fields.  */
+/* Add DECL into MAP under NAME.  Collisions fail silently.  Doesn't
+   do sophisticated collision checking.  Deals with STAT_HACK.  */
 
-static struct sorted_fields_type *
-sorted_fields_type_new (int n)
+static void
+add_class_member (hash_map<lang_identifier *, tree> *map, tree name, tree decl)
 {
-  struct sorted_fields_type *sft;
-  sft = (sorted_fields_type *) ggc_internal_alloc (sizeof (sorted_fields_type)
-				      + n * sizeof (tree));
-  sft->len = n;
+  bool existed;
+  tree *slot = &map->get_or_insert (name, &existed);
+  if (!existed)
+    *slot = decl;
+  else if (TREE_CODE (*slot) == TYPE_DECL && DECL_ARTIFICIAL (*slot))
+    *slot = stat_hack (decl, *slot);
+  else if (TREE_CODE (decl) == TYPE_DECL && DECL_ARTIFICIAL (decl))
+    *slot = stat_hack (*slot, decl);
 
-  return sft;
-}
-
-/* Subroutine of insert_into_classtype_sorted_fields.  Recursively
-   count the number of fields in TYPE, including anonymous union
-   members.  */
-
-static int
-count_fields (tree fields)
-{
-  tree x;
-  int n_fields = 0;
-  for (x = fields; x; x = DECL_CHAIN (x))
-    {
-      if (DECL_DECLARES_FUNCTION_P (x))
-	/* Functions are dealt with separately.  */;
-      else if (TREE_CODE (x) == FIELD_DECL && ANON_AGGR_TYPE_P (TREE_TYPE (x)))
-	n_fields += count_fields (TYPE_FIELDS (TREE_TYPE (x)));
-      else
-	n_fields += 1;
-    }
-  return n_fields;
+  /* Else ignore collision.  */
 }
 
-/* Subroutine of insert_into_classtype_sorted_fields.  Recursively add
-   all the fields in the TREE_LIST FIELDS to the SORTED_FIELDS_TYPE
-   elts, starting at offset IDX.  */
+/* Insert the chain FIELDS into MAP.  */
 
-static int
-add_fields_to_record_type (tree fields, struct sorted_fields_type *field_vec,
-			   int idx)
+static void
+add_class_members (hash_map<lang_identifier *, tree> *map, tree fields)
 {
-  tree x;
-  for (x = fields; x; x = DECL_CHAIN (x))
+  for (tree field = fields; field; field = DECL_CHAIN (field))
     {
-      if (DECL_DECLARES_FUNCTION_P (x))
-	/* Functions are handled separately.  */;
-      else if (TREE_CODE (x) == FIELD_DECL && ANON_AGGR_TYPE_P (TREE_TYPE (x)))
-	idx = add_fields_to_record_type (TYPE_FIELDS (TREE_TYPE (x)), field_vec, idx);
-      else
-	field_vec->elts[idx++] = x;
+      if (TREE_CODE (field) == FIELD_DECL
+	  && ANON_AGGR_TYPE_P (TREE_TYPE (field)))
+	add_class_members (map, TYPE_FIELDS (TREE_TYPE (field)));
+      else if (DECL_NAME (field))
+	add_class_member (map, DECL_NAME (field), field);
     }
-  return idx;
-}
-
-/* Add all of the enum values of ENUMTYPE, to the FIELD_VEC elts,
-   starting at offset IDX.  */
-
-static int
-add_enum_fields_to_record_type (tree enumtype,
-				struct sorted_fields_type *field_vec,
-				int idx)
-{
-  tree values;
-  for (values = TYPE_VALUES (enumtype); values; values = TREE_CHAIN (values))
-      field_vec->elts[idx++] = TREE_VALUE (values);
-  return idx;
 }
 
-/* Insert FIELDS into T for the sorted case if the FIELDS count is
-   equal to THRESHOLD or greater than THRESHOLD.  */
+/* Create the binding map of KLASS and insert FIELDS.  */
 
 void 
 set_class_bindings (tree klass, tree fields)
 {
-  int n_fields = count_fields (fields);
-  if (n_fields >= 8)
-    {
-      struct sorted_fields_type *field_vec = sorted_fields_type_new (n_fields);
-      add_fields_to_record_type (fields, field_vec, 0);
-      qsort (field_vec->elts, n_fields, sizeof (tree), field_decl_cmp);
-      CLASSTYPE_SORTED_FIELDS (klass) = field_vec;
-    }
+  gcc_assert (!CLASSTYPE_BINDINGS (klass));
+
+  CLASSTYPE_BINDINGS (klass)
+    = hash_map<lang_identifier *, tree>::create_ggc (8);
+  add_class_members (CLASSTYPE_BINDINGS (klass), fields);
 }
 
 /* Insert lately defined enum ENUMTYPE into T for the sorted case.  */
 
 void
-insert_late_enum_def_bindings (tree enumtype, tree t)
+insert_late_enum_def_bindings (tree klass, tree enumtype)
 {
-  struct sorted_fields_type *sorted_fields = CLASSTYPE_SORTED_FIELDS (t);
-  if (sorted_fields)
-    {
-      int i;
-      int n_fields
-	= list_length (TYPE_VALUES (enumtype)) + sorted_fields->len;
-      struct sorted_fields_type *field_vec = sorted_fields_type_new (n_fields);
-      
-      for (i = 0; i < sorted_fields->len; ++i)
-	field_vec->elts[i] = sorted_fields->elts[i];
-
-      add_enum_fields_to_record_type (enumtype, field_vec,
-				      sorted_fields->len);
-      qsort (field_vec->elts, n_fields, sizeof (tree), field_decl_cmp);
-      CLASSTYPE_SORTED_FIELDS (t) = field_vec;
-    }
+  hash_map<lang_identifier *, tree> *map = CLASSTYPE_BINDINGS (klass);
+
+  for (tree values = TYPE_VALUES (enumtype);
+       values; values = TREE_CHAIN (values))
+    add_class_member (map, DECL_NAME (TREE_VALUE (values)),
+		      TREE_VALUE (values));
 }
 
 /* Compute the chain index of a binding_entry given the HASH value of its
Index: ptree.c
===================================================================
--- ptree.c	(revision 251385)
+++ ptree.c	(working copy)
@@ -151,9 +151,6 @@  cxx_print_type (FILE *file, tree node, i
     fputs (" delete[]", file);
   if (TYPE_HAS_COPY_ASSIGN (node))
     fputs (" this=(X&)", file);
-  if (CLASSTYPE_SORTED_FIELDS (node))
-    fprintf (file, " sorted-fields %p",
-	     (void *) CLASSTYPE_SORTED_FIELDS (node));
 
   if (TREE_CODE (node) == RECORD_TYPE)
     {