diff mbox

[Ping,0/3] New macro PREFERRED_RENAME_CLASS

Message ID 4CF67C38.4020304@codesourcery.com
State New
Headers show

Commit Message

Yao Qi Dec. 1, 2010, 4:47 p.m. UTC
On 11/12/2010 01:34 AM, Eric Botcazou wrote:
> 
> It's a progress, but the first approach of using register classes instead of 
> individual registers seemed to be the right granularity here.
> 
> For each element in the chain, we have the class.  Can we compute a superset 
> of all the classes in the chain?  If we cannot or if this is ALL_REGS, then 
> we iterate as before.  If we can, we call the hook on the result and we first 
> iterate on the contents of this result (a class); if we find a new reg, we 
> stop, otherwise we iterate on the remaining registers.
> 

In this patch, most of part is same as the previous one, except three
changes,

1. compute superunion set of all the classes in the chain,

  reg_class_superunion_in_chain
    =
reg_class_superunion[(int)reg_class_superunion_in_chain][(int)tmp->cl];

2. compute the preferred class of reg_class_superunion_in_chain by hook,

   preferred_class
     = targetm.preferred_rename_class(reg_class_superunion_in_chain);

3. Iterate registers first in preferred_class, and stop once better
register is found.  Then, iterate the rest of registers as usual.

Tested on mainline r167293 x86_64-linux with option {, -O2
-frename-registers} respectively.  No regression.

Macro PREFERRED_RENAME_CLASS is defined in backend.  For example, on
ARM, it can be,

#define PREFERRED_RENAME_CLASS(CLASS)          \
  (TARGET_THUMB2 ? ((CLASS) == GENERAL_REGS    \
   ? LO_REGS : (NO_REGS))                                \
  : (NO_REGS))

Given this definition of macro, code size of bash-3.2 is reduced by 0.2%
roughly, with option "-mthumb -march=armv7 -O2 -frename-registers".  The
result and code is posted here to prove this middle-end patch is useful.
 I'll submit another patch for ARM part changes, once this middle-end
part is approved.

OK for mainline?

Comments

Eric Botcazou Dec. 3, 2010, 10:57 a.m. UTC | #1
> In this patch, most of part is same as the previous one, except three
> changes,
>
> 1. compute superunion set of all the classes in the chain,
>
>   reg_class_superunion_in_chain
>     =
> reg_class_superunion[(int)reg_class_superunion_in_chain][(int)tmp->cl];
>
> 2. compute the preferred class of reg_class_superunion_in_chain by hook,
>
>    preferred_class
>      = targetm.preferred_rename_class(reg_class_superunion_in_chain);
>
> 3. Iterate registers first in preferred_class, and stop once better
> register is found.  Then, iterate the rest of registers as usual.

Almost OK, but:

+	  enum reg_class reg_class_superunion_in_chain = NO_REGS;

The name is too long, leading to

+	      reg_class_superunion_in_chain
+		= reg_class_superunion[(int)reg_class_superunion_in_chain][(int)tmp->cl];

Just call it "superunion_class".


+	  /* The register iteration order here is "preferred-register-first".
+	     Firstly(i == 0), we iterate registers belong to class
+	     PREFERRED_CLASS, if we find a new register, we stop immeidately.

"... we iterate over registers that belong to PREFERRED_CLASS; if we find a 
new register, we stop immediately."

+	     Otherwise, we iterate once more (i == 1) registers, which
+	     doesn't belong class PREFERRED_CLASS.

"...over registers that don't belong to PREFERRED_CLASS".

+	     If preferred class is not found by target hook, PREFERRED_CLASS
+	     is NO_REGS, and registers are iterated in an ascending order
+	     without any preference.  */

"If PREFERRED_CLASS is NO_REGS, we iterate over all registers in ascending 
order without any preference".

But the loop is still run twice, isn't it?  Can we skip the first run if 
PREFERRED_CLASS is NO_REGS, for example by starting at 1 instead of 0?

The usual convention for this kind of scheme (mutiple runs) in the RTL 
optimizers is to use PASS as iteration variable:

  for (pass = 0; pass < 2; pass++)

and just test (pass == 0) or (pass == 1), i.e. don't use PREFER at all:

  /* In the first pass, iterate over registers first in preferred class.  */
  if (pass == 0
      && !TEST_HARD_REG_BIT (reg_class_contents[preferred_class], new_reg))
    continue;

[...]

  /* If we find a new reg in our preferred class, stop immediately.  */
  if (pass == 0 && best_new_reg != reg)
    {
      found = true;
      break;
    }


As for the hook itself: drop the PREFERRED_RENAME_CLASS macro entirely, i.e. 
the default hook is just:

reg_class_t
default_preferred_rename_class (reg_class_t rclass)
{
  return NO_REGS;
}

and back-ends must implement the function (or use the default).


Finally, I have a question for native speakers: does preferred_rename_class 
sound good or would preferred_renaming_class be better?
Daniel Jacobowitz Dec. 4, 2010, 4:06 p.m. UTC | #2
On Fri, Dec 03, 2010 at 11:57:20AM +0100, Eric Botcazou wrote:
> Finally, I have a question for native speakers: does preferred_rename_class 
> sound good or would preferred_renaming_class be better?

I'd go for preferred_rename_class, personally.
Eric Botcazou Dec. 6, 2010, 11:09 a.m. UTC | #3
> I'd go for preferred_rename_class, personally.

OK, thanks for your feedback.
diff mbox

Patch

gcc/
        * regrename.c (struct du_head): Add new element length.
        (sort_du_head, get_element, merge, merge_sort_comparison):
        New functions of merge sort implementation to du_head list.
        (regrename_optimize): Sort du_head linked list by length.
	Move some code to ...
	(check_new_reg_p): here.  New function.
        (create_new_chain):  Initialize length.
        (scan_rtx_reg): Increase length for non-debug insns.
        * target.def: New hook preferred_rename_class.
        * targhook.c (default_preferred_rename_class): New.
        * targhook.h: Declare it.
        * doc/tm.texi.in: New hook TARGET_PREFERRED_RENAME_CLASS.
	* doc/tm.texi: Regenerate.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 5446501..7105c72 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -2506,6 +2506,10 @@  looking for one that is valid, and will reload one or both registers
 only if neither labeling works.
 @end defmac

+@deftypefn {Target Hook} reg_class_t TARGET_PREFERRED_RENAME_CLASS (reg_class_t @var{rclass})
+A target hook that places additional preference on the register class to use when it is necessary to rename a register in class @var{class} to another class, or perhaps @var{NO_REGS}, if no prefered register class is found or macro @code{PREFERRED_RENAME_CLASS} is not define. Sometimes returning a more restrictive class makes better code.  For example, on ARM, thumb-2 instructions using @code{LO_REGS} may be smaller than instructions using @code{GENERIC_REGS}.  By returning @code{LO_REGS} from @code{PREFERRED_RENAME_CLASS}, code size can be reduced.
+@end deftypefn
+
 @deftypefn {Target Hook} reg_class_t TARGET_PREFERRED_RELOAD_CLASS (rtx @var{x}, reg_class_t @var{rclass})
 A target hook that places additional restrictions on the register class
 to use when it is necessary to copy value @var{x} into a register in class
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 4b21c92..e9bf4a8 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2496,6 +2496,8 @@  looking for one that is valid, and will reload one or both registers
 only if neither labeling works.
 @end defmac
 
+@hook TARGET_PREFERRED_RENAME_CLASS
+
 @hook TARGET_PREFERRED_RELOAD_CLASS
 A target hook that places additional restrictions on the register class
 to use when it is necessary to copy value @var{x} into a register in class
diff --git a/gcc/regrename.c b/gcc/regrename.c
index 2535ab7..f085e78 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -38,6 +38,7 @@ 
 #include "timevar.h"
 #include "tree-pass.h"
 #include "df.h"
+#include "target.h"
 
 #if HOST_BITS_PER_WIDE_INT <= MAX_RECOG_OPERANDS
 #error "Use a different bitmap implementation for untracked_operands."
@@ -51,6 +52,11 @@  struct du_head
   struct du_head *next_chain;
   /* The first and last elements of this chain.  */
   struct du_chain *first, *last;
+  /* The number of elements of this chain, excluding those corresponding
+     to references of the register in debug insns.  The du_head linked
+     list can be sorted by this, and register-rename can prefer
+     register classes according to this order.  */
+  int length;
   /* Describes the register being tracked.  */
   unsigned regno, nregs;
 
@@ -154,6 +160,199 @@  merge_overlapping_regs (HARD_REG_SET *pset, struct du_head *head)
     }
 }
 
+/* Return the Nth element in LIST.  If LIST contains less than N
+   elements, return the last one.  */
+static struct du_head *
+get_element (struct du_head *list, int n)
+{
+  while (n-- && list->next_chain != NULL)
+    list = list->next_chain;
+
+  return list;
+}
+
+/* Comparison function of merge sort.  Return true if A is less than
+   B, otherwise return false.  */
+static inline int
+merge_sort_comparison(const struct du_head *a,
+		    const struct du_head *b)
+{
+  return a->length < b->length;
+}
+
+/* Merge the first 2 sub-lists of LENGTH nodes contained in the
+   linked list pointed to by START_NODE.  Update START_NODE to point
+   to the merged nodes, and return a pointer to the last merged
+   node.  Return NULL if START_NODE doesn't contain enough
+   elements, or this pass of merge is done.  */
+
+static struct du_head *
+merge(struct du_head **start_node, int length)
+{
+  int i, left_count, right_count;
+  struct du_head *left, *right;
+  /* Current node of sort result.  */
+  struct du_head *current_sorted_node;
+  /* Tail node of sort, used to connect with next piece of list.  */
+  struct du_head *current_tail_node;
+
+  if (*start_node == NULL)
+    return NULL;
+
+  left = right = *start_node;
+  right_count = left_count = 0;
+
+  /* Step RIGHT along the list by LENGTH places.  */
+  for (i = 0; i < length; i++)
+    {
+      right = right->next_chain;
+      if (right == NULL)
+	{
+	  return NULL;
+	}
+    }
+
+  /* Initialize current_sorted_node.  */
+  if (merge_sort_comparison (left, right))
+    {
+      ++right_count;
+      current_sorted_node = right;
+      *start_node = right;
+      right = right->next_chain;
+    }
+  else
+    {
+      ++left_count;
+      current_sorted_node = left;
+      left = left->next_chain;
+    }
+
+  while (1)
+    {
+      /* Choose LEFT or RIGHT to take the next element from.  If
+	 either is empty, choose from the other one.  */
+      if (left_count == length || left == NULL)
+	{
+	  current_sorted_node->next_chain = right;
+	  current_tail_node = get_element (current_sorted_node,
+					   length - right_count);
+
+	  break;
+	}
+      else if (right_count == length || right == NULL)
+	{
+	  /* Save the head node of next piece of linked list.  */
+	  struct du_head *tmp = current_sorted_node->next_chain;
+
+	  current_sorted_node->next_chain = left;
+	  current_tail_node
+	    = get_element (current_sorted_node,
+			   length - left_count);
+	  /* Connect sorted list to next piece of list.  */
+	  current_tail_node->next_chain = tmp;
+	  break;
+	}
+      else
+	{
+	  /* Normal merge operations.  If both LEFT and RIGHT are
+	     non-empty, compare the first element of each and choose
+	     the lower one.  */
+	  if (merge_sort_comparison (left, right))
+	    {
+	      right_count++;
+	      current_sorted_node->next_chain = right;
+	      right = right->next_chain;
+	    }
+	  else
+	    {
+	      left_count++;
+	      current_sorted_node->next_chain = left;
+	      left = left->next_chain;
+	    }
+	  current_sorted_node = current_sorted_node->next_chain;
+	}
+    }
+  /* Return NULL if this pass of merge is done.  */
+  return (current_tail_node->next_chain ? current_tail_node : NULL);
+}
+
+/* Sort the linked list pointed to by HEAD.  The algorithm is a
+   non-recursive merge sort to linked list.  */
+
+static void
+sort_du_head (struct du_head **head)
+{
+  int current_length = 1;
+  struct du_head *last_tail;
+
+  /* In each pass, lists of size current_length is merged to
+     lists of size 2xcurrent_length (Initially current_length
+     is 1).  */
+  while (1)
+    {
+      last_tail = merge(head, current_length);
+      if (last_tail != NULL)
+	{
+	  do
+	    last_tail = merge (&last_tail->next_chain, current_length);
+	  while (last_tail != NULL);
+
+	  current_length *= 2;
+	}
+      else
+	break;
+    }
+}
+
+/* Check if NEW_REG can be the candidate register to rename for
+   REG in THIS_HEAD chain.  THIS_UNAVAILABLE is a set of unavailable hard
+   registers.  */
+
+static bool
+check_new_reg_p (int reg, int new_reg,  struct du_head *this_head,
+		 HARD_REG_SET this_unavailable)
+{
+  enum machine_mode mode = GET_MODE (*this_head->first->loc);
+  int nregs = hard_regno_nregs[new_reg][mode];
+  int i;
+  struct du_chain *tmp;
+
+  for (i = nregs - 1; i >= 0; --i)
+    if (TEST_HARD_REG_BIT (this_unavailable, new_reg + i)
+	|| fixed_regs[new_reg + i]
+	|| global_regs[new_reg + i]
+	/* Can't use regs which aren't saved by the prologue.  */
+	|| (! df_regs_ever_live_p (new_reg + i)
+	    && ! call_used_regs[new_reg + i])
+#ifdef LEAF_REGISTERS
+	/* We can't use a non-leaf register if we're in a
+	   leaf function.  */
+	|| (current_function_is_leaf
+	    && !LEAF_REGISTERS[new_reg + i])
+#endif
+#ifdef HARD_REGNO_RENAME_OK
+	|| ! HARD_REGNO_RENAME_OK (reg + i, new_reg + i)
+#endif
+	)
+      break;
+  if (i >= 0)
+    return false;
+
+  /* See whether it accepts all modes that occur in
+     definition and uses.  */
+  for (tmp = this_head->first; tmp; tmp = tmp->next_use)
+    if ((! HARD_REGNO_MODE_OK (new_reg, GET_MODE (*tmp->loc))
+	 && ! DEBUG_INSN_P (tmp->insn))
+	|| (this_head->need_caller_save_reg
+	    && ! (HARD_REGNO_CALL_PART_CLOBBERED
+		  (reg, GET_MODE (*tmp->loc)))
+	    && (HARD_REGNO_CALL_PART_CLOBBERED
+		(new_reg, GET_MODE (*tmp->loc)))))
+      break;
+
+  return true;
+}
+
 /* Perform register renaming on the current function.  */
 
 static unsigned int
@@ -195,6 +394,8 @@  regrename_optimize (void)
       if (dump_file)
 	dump_def_use_chain (all_chains);
 
+      sort_du_head (&all_chains);
+
       CLEAR_HARD_REG_SET (unavailable);
       /* Don't clobber traceback for noreturn functions.  */
       if (frame_pointer_needed)
@@ -214,6 +415,8 @@  regrename_optimize (void)
 	  HARD_REG_SET this_unavailable;
 	  int reg = this_head->regno;
 	  int i;
+	  enum reg_class reg_class_superunion_in_chain = NO_REGS;
+	  enum reg_class preferred_class;
 
 	  all_chains = this_head->next_chain;
 
@@ -243,16 +446,23 @@  regrename_optimize (void)
 
 	  COPY_HARD_REG_SET (this_unavailable, unavailable);
 
-	  /* Count number of uses, and narrow the set of registers we can
-	     use for renaming.  */
+	  /* Iterate elements in chain in order to:
+	     1. Count number of uses, and narrow the set of registers we can
+	     use for renaming.
+	     2. Compute the superunion of register classes in this chain.  */
 	  n_uses = 0;
+	  reg_class_superunion_in_chain = NO_REGS;
 	  for (tmp = this_head->first; tmp; tmp = tmp->next_use)
 	    {
 	      if (DEBUG_INSN_P (tmp->insn))
 		continue;
 	      n_uses++;
+
 	      IOR_COMPL_HARD_REG_SET (this_unavailable,
 				      reg_class_contents[tmp->cl]);
+
+	      reg_class_superunion_in_chain
+		= reg_class_superunion[(int)reg_class_superunion_in_chain][(int)tmp->cl];
 	    }
 
 	  if (n_uses < 2)
@@ -262,56 +472,55 @@  regrename_optimize (void)
 	    IOR_HARD_REG_SET (this_unavailable, call_used_reg_set);
 
 	  merge_overlapping_regs (&this_unavailable, this_head);
-
+	  /* Compute preferred rename class of super union of all the classes
+	     on the chain.  */
+	  preferred_class
+	    = targetm.preferred_rename_class(reg_class_superunion_in_chain);
+
+	  /* The register iteration order here is "preferred-register-first".
+	     Firstly(i == 0), we iterate registers belong to class
+	     PREFERRED_CLASS, if we find a new register, we stop immeidately.
+	     Otherwise, we iterate once more (i == 1) registers, which
+	     doesn't belong class PREFERRED_CLASS.
+	     If preferred class is not found by target hook, PREFERRED_CLASS
+	     is NO_REGS, and registers are iterated in an ascending order
+	     without any preference.  */
+	  for (i = 0; i <2; i++)
+	    {
+	      bool prefer[] = {true, false};
+	      bool found = false;
 	  /* Now potential_regs is a reasonable approximation, let's
 	     have a closer look at each register still in there.  */
 	  for (new_reg = 0; new_reg < FIRST_PSEUDO_REGISTER; new_reg++)
 	    {
-	      enum machine_mode mode = GET_MODE (*this_head->first->loc);
-	      int nregs = hard_regno_nregs[new_reg][mode];
-
-	      for (i = nregs - 1; i >= 0; --i)
-	        if (TEST_HARD_REG_BIT (this_unavailable, new_reg + i)
-		    || fixed_regs[new_reg + i]
-		    || global_regs[new_reg + i]
-		    /* Can't use regs which aren't saved by the prologue.  */
-		    || (! df_regs_ever_live_p (new_reg + i)
-			&& ! call_used_regs[new_reg + i])
-#ifdef LEAF_REGISTERS
-		    /* We can't use a non-leaf register if we're in a
-		       leaf function.  */
-		    || (current_function_is_leaf
-			&& !LEAF_REGISTERS[new_reg + i])
-#endif
-#ifdef HARD_REGNO_RENAME_OK
-		    || ! HARD_REGNO_RENAME_OK (reg + i, new_reg + i)
-#endif
-		    )
-		  break;
-	      if (i >= 0)
+		  /* Iterate registers first in prefered class.  */
+		  if (prefer[i]
+		      != TEST_HARD_REG_BIT (reg_class_contents[preferred_class],
+					    new_reg))
 		continue;
 
-	      /* See whether it accepts all modes that occur in
-		 definition and uses.  */
-	      for (tmp = this_head->first; tmp; tmp = tmp->next_use)
-		if ((! HARD_REGNO_MODE_OK (new_reg, GET_MODE (*tmp->loc))
-		     && ! DEBUG_INSN_P (tmp->insn))
-		    || (this_head->need_caller_save_reg
-			&& ! (HARD_REGNO_CALL_PART_CLOBBERED
-			      (reg, GET_MODE (*tmp->loc)))
-			&& (HARD_REGNO_CALL_PART_CLOBBERED
-			    (new_reg, GET_MODE (*tmp->loc)))))
-		  break;
-	      if (! tmp)
+		  if (check_new_reg_p (reg, new_reg, this_head,
+				       this_unavailable))
 		{
 		  if (tick[best_new_reg] > tick[new_reg])
 		    {
+			  enum machine_mode mode
+			    = GET_MODE (*this_head->first->loc);
 		      best_new_reg = new_reg;
-		      best_nregs = nregs;
+			  best_nregs = hard_regno_nregs[new_reg][mode];
+			  /* If we find a new reg in our preferred class,
+			     stop immediately.  */
+			  if (best_new_reg != reg && prefer[i])
+			    {
+			      found = true;
+			      break;
 		    }
 		}
 	    }
-
+		}
+	      if (found)
+		break;
+	    }
 	  if (dump_file)
 	    {
 	      fprintf (dump_file, "Register %s in insn %d",
@@ -527,6 +736,7 @@  create_new_chain (unsigned this_regno, unsigned this_nregs, rtx *loc,
   head->need_caller_save_reg = 0;
   head->cannot_rename = 0;
   head->terminated = 0;
+  head->length = 0;
 
   VEC_safe_push (du_head_p, heap, id_to_chain, head);
   head->id = current_id++;
@@ -572,6 +782,8 @@  create_new_chain (unsigned this_regno, unsigned this_nregs, rtx *loc,
   this_du->loc = loc;
   this_du->insn = insn;
   this_du->cl = cl;
+
+  head->length = 1;
 }
 
 static void
@@ -661,6 +873,8 @@  scan_rtx_reg (rtx insn, rtx *loc, enum reg_class cl, enum scan_actions action,
 		head->last->next_use = this_du;
 	      head->last = this_du;
 
+	      if (!DEBUG_INSN_P (insn))
+		head->length++;
 	    }
 	  /* Avoid adding the same location in a DEBUG_INSN multiple times,
 	     which could happen with non-exact overlap.  */
diff --git a/gcc/target.def b/gcc/target.def
index 66006ae..0f9a9f9 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2204,6 +2204,21 @@  DEFHOOK
  bool, (reg_class_t rclass),
  default_class_likely_spilled_p)
 
+DEFHOOK
+(preferred_rename_class,
+ "A target hook that places additional preference on the register\
+ class to use when it is necessary to rename a register in class\
+ @var{class} to another class, or perhaps @var{NO_REGS}, if no\
+ prefered register class is found or macro @code{PREFERRED_RENAME_CLASS}\
+ is not define.\
+ Sometimes returning a more restrictive class makes better code.  For\
+ example, on ARM, thumb-2 instructions using @code{LO_REGS} may be\
+ smaller than instructions using @code{GENERIC_REGS}.  By returning\
+ @code{LO_REGS} from @code{PREFERRED_RENAME_CLASS}, code size can\
+ be reduced.",
+ reg_class_t, (reg_class_t rclass),
+ default_preferred_rename_class)
+
 /* This target hook allows the backend to perform additional
    processing while initializing for variable expansion.  */
 DEFHOOK
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 3647436..19fc29a 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1270,6 +1270,16 @@  default_preferred_output_reload_class (rtx x ATTRIBUTE_UNUSED,
 #endif
 }
 
+reg_class_t
+default_preferred_rename_class (reg_class_t rclass)
+{
+#ifdef PREFERRED_RENAME_CLASS
+  return PREFERRED_RENAME_CLASS ((enum reg_class)rclass);
+#else
+  return NO_REGS;
+#endif
+}
+
 /* The default implementation of TARGET_CLASS_LIKELY_SPILLED_P.  */
 
 bool
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index eeefe05..6fe8350 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -158,6 +158,7 @@  extern int default_register_move_cost (enum machine_mode, reg_class_t,
 extern bool default_profile_before_prologue (void);
 extern reg_class_t default_preferred_reload_class (rtx, reg_class_t);
 extern reg_class_t default_preferred_output_reload_class (rtx, reg_class_t);
+extern reg_class_t default_preferred_rename_class (reg_class_t rclass);
 extern bool default_class_likely_spilled_p (reg_class_t);
 
 extern enum unwind_info_type default_debug_unwind_info (void);