diff mbox

RFA: MN10300: Add support for SETLB and Lcc instructions

Message ID m3sjvcnu4v.fsf@redhat.com
State New
Headers show

Commit Message

Nick Clifton Feb. 25, 2011, 1:03 p.m. UTC
Hi Richard, Hi Jeff, Hi Alex,

  Below is a patch to add support for the SETLB and Lcc low overhead
  loop instructions to the MN10300 backend.

  The patch makes use of a machine reorg pass to find loops that can use
  these instructions.  I did try to use the doloop_begin and doloop_end
  patterns as well, but I just could not get them to work.  (Either
  because they caused reload failures when the patterns were too
  complex, or else code generation problems because the Lcc instruction
  were being duplicated).

  I plan to add support for the FLcc instruction in a later patch.

  Tested with an mn10300-elf toolchain with no regressions.

  OK to apply ?

Cheers
  Nick

gcc/ChangeLog
2011-02-25  Nick Clifton  <nickc@redhat.com>

	* config/mn10300/mn10300.c (cc_flags_for_mode): Add support for
        CC_LCCmode.
        (mn10300_insert_setlb_lcc): New function: Insert a SETLB
	instruction into the code and replace a conditional branch
	instruction with a Lcc instruction.
        (mn10300_scan_for_setlb): Scan insns for places where SETLB and
	Lcc can be used.
        (mn10300_reorg): Invoke mn10300_scan_for_setlb_lcc if allowed.
        (TARGET_DEFAULT_FLAGS): Enable SETLB transformations by default.
        * config/mn10300/mn10300.md (UNSPEC_SETLB): New constant.
        (cmpsi): Allow the creation of a gen_cmpsi function.
        (setlb): New pattern.
        (Lcc): New pattern.
        * config/mn10300/mn10300-modes.def (CC_LCC): New mode.
        * config/mn10300/predicates.md (int_mode_flags): Do not allow
	CC_LCCmode.
        * config/mn10300/mn10300.opt (msetlb): New command line option.
        * config/mn10300/mn10300.h (TARGET_CPU_CPP_BUILTINS): Add
	definition of __SETLB__ or __NO_SETLB__.
	* doc/invoke.texi: Document -mno-setlb.

Comments

Richard Henderson Feb. 26, 2011, 1:42 a.m. UTC | #1
On 02/25/2011 05:03 AM, Nick Clifton wrote:
>> +/* A special mode used to distinguish the Lcc branch instruction
> +   from the Bcc branch instruction.  */
> +CC_MODE (CC_LCC);

Why not just use an unspec on the lcc insn.  I think you want to 
preserve the existing CCmode, so that we DTRT with modes that do
not have the overflow bit.

> +  /* What we're doing here is looking for a conditional branch which
> +     branches backwards without crossing any non-debug labels.  I.e.
> +     the loop has to enter from the top.  */

Ug.  Surely we can re-use the existing loop detection code to find the
inner-most loops.  The existence of normal branches within the loop 
ought not make any sort of a difference.

You'll have to re-establish the CFG, which just got broken down, but
that shouldn't be any different from several other targets.  Then use
flow_loops_find and the rest of the bits in cfgloop.h.


r~
Nick Clifton March 8, 2011, 8:17 a.m. UTC | #2
Hi Richard,

   Thanks for the review.  Attached is a revised patch which addresses 
the issues that you raised:

>>> +/* A special mode used to distinguish the Lcc branch instruction
>> +   from the Bcc branch instruction.  */
>> +CC_MODE (CC_LCC);
>
> Why not just use an unspec on the lcc insn.

I should have done.  The CC_LCC mode was a holdover from when I was 
trying to get the doloop patterns to work.  The doloop pattern 
recognizer does not accept UNSPECs, so I had to use a unique CC mode 
instead.  The revised patch uses an UNSPEC.

>> +  /* What we're doing here is looking for a conditional branch which
>> +     branches backwards without crossing any non-debug labels.  I.e.
>> +     the loop has to enter from the top.  */
>
> Ug.  Surely we can re-use the existing loop detection code to find the
> inner-most loops.

We can, but it is a lot more complicated than just performing a simple 
scan.  Still I have tried my best to follow up on this.  One thing that 
does worry me however is that I encountered a segmentation fault whilst 
trying to free the loop data.  (See a comment in the patch for more 
details).  I tried to follow the example of other passes that use the 
loop information, but I must have gotten something wrong.  Any pointers 
on how to fix this are greatly appreciated.

Cheers
   Nick

gcc/ChangeLog
2011-03-08  Nick Clifton  <nickc@redhat.com>

	* config/mn10300/mn10300.c: Include cfgloop.h.
	(DUMP): New macro.
	(mn10300_insert_setlb_lcc): New function.  Inserts a SETLB and a
	Lcc or a FLcc insn into the instruction stream.
	(mn10300_block_contains_call): New function.  Returns true if the
	given basic block contains a CALL insn.
	(mn10300_loop_contains_call_insn): New function.  Returns true if
	the given loop contains a CALL insn.
	(mn10300_scan_for_setlb_lcc): New function.  Finds opportunities
	to use the SETLB and Lcc or FLcc insns.
	(mn10300_reorg): Invoke mn10300_scan_for_setlb_lcc.
	(TARGET_FLAGS): Add MASK_ALLOW_SETLB.
	* config/mn10300/mn10300.opt (msetlb): New option.  Used to
	disable the SETLB optimization.
	* config/mn10300/mn10300.h (TARGET_CPU_CPP_BUILTINS): Add
	__SETLB__ or __NO_SETLB__.
	* config/mn10300/mn10300.md (UNSPEC_SETLB): New constant.
	(cmpsi): Make visible.
	(setlb): New pattern.
	(Lcc): New pattern.
	(FLcc): New pattern.
	* doc/invoke.texi: Document -mno-setlb option.
Richard Henderson March 9, 2011, 3:58 a.m. UTC | #3
> +  /* If the comparison has not already been split out of the branch
> +     then do so now.  */
> +  if (REGNO (cmp_reg) != CC_REG)
> +    {
> +      rtx cmp;
> +
> +      cmp = emit_insn_before (gen_cmpsi (cmp_reg, XEXP (comparison, 1)), branch);
> +
> +      DUMP ("Extracted comparison from branch:", cmp);
> +    }

This should never ever fire.  These should have been split already,
at least with optimization, on which this pass is dependant.  Just
assert here, if you like.

> +mn10300_block_contains_call (struct basic_block_def * block)

  basic_block block

> +  for (insn = block->il.rtl->head_;
> +       insn != NULL_RTX;
> +       insn = NEXT_INSN (insn))
> +    {
> +      if (CALL_P (insn))
> +	return true;
> +
> +      if (insn == block->il.rtl->end_)
> +	break;
> +    }

  FOR_BB_INSNS (block, insn)
    if (CALL_P (insn))
      return true;

> +  df_analyze ();
> +  if (flow_loops_find (& loops) > 0)
> +    {
> +      unsigned int i;
> +      loop_p loop;
> +
> +      FOR_EACH_VEC_ELT (loop_p, loops.larray, i, loop)

You need compute_bb_for_insn, and to set up current_loops.
Then you can use 

  FOR_EACH_LOOP(liter, loop, LI_ONLY_INNERMOST)

instead of iterating over the array by hand.  This will eliminate
at least the fake loop and non-innermost loop check that you do
by hand.  Of course, current_loops must be cleared at the end.

> +	  else if (loop->header != loop->latch)
> +	    reason = "it is not a simple do-while loop";

Why?  A loop with multiple blocks should be fine with Lcc.
I do see that you've got loop->header and loop->latch backward
further below; perhaps that was your reason for this?

> +	      rtx branch = loop->header->il.rtl->end_;

  BB_END (loop->latch)

> +	      if (single_set (branch) == NULL_RTX
> +		  || GET_CODE (SET_SRC (single_set (branch))) != IF_THEN_ELSE)

  !any_condjump_p (branch)

> +		  label = loop->latch->il.rtl->head_;

  BB_HEAD (loop->header)

> +		  /* If necessary, extract the label from the branch insn.  */
> +		  if (! LABEL_P (label))

This should never ever fire -- how else can the loop have been created?
Again, you can just assert here.

> +      /* FIXME: Calling flow_loops_free() appears to be the correct thing to do,
> +	 but it results in a seg-fault when building regex.c in the target libiberty
> +	 library.  I have no idea why; so I have disabled the call for now.  */
> +      /* flow_loops_free (& loops); */

I'm having a look into this.  It's definitely non-obvious...


r~
Nick Clifton May 3, 2011, 12:13 p.m. UTC | #4
Hi Richard,

   Thanks for the patch review.  Sorry for not responding earlier, but 
some other work got in the way.

   I have attached a revised patch which addresses all of the points you 
raised except one:

>> +	  else if (loop->header != loop->latch
>> +	    reason = "it is not a simple do-while loop";
>
> Why?  A loop with multiple blocks should be fine with Lcc.

   The problem I found here was that I did not know how to find all of 
the test-and-loop-back insns if the loop spanned multiple blocks.  The 
revised patch includes a comment saying that loops spanning multiple 
blocks could be supported, but I would like to leave that for a future 
patch rather than trying to get it all working now.

Cheers
   Nick
Richard Henderson May 3, 2011, 7:18 p.m. UTC | #5
On 05/03/2011 05:13 AM, Nick Clifton wrote:
> The problem I found here was that I did not know how to find all of
> the test-and-loop-back insns if the loop spanned multiple blocks.
> The revised patch includes a comment saying that loops spanning
> multiple blocks could be supported, but I would like to leave that
> for a future patch rather than trying to get it all working now.

The patch is ok.

As for the finding the branches, I'm pretty sure that 

  /* Head of the cyclic list of the exits of the loop.  */
  struct loop_exit *exits;

means what it says and contains all of the edges that exit
the block.  But doing that in a separate patch is fine.


r~
diff mbox

Patch

Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 170489)
+++ gcc/doc/invoke.texi	(working copy)
@@ -741,7 +741,7 @@ 
 -mno-am33 -mam33 -mam33-2 -mam34 @gol
 -mtune=@var{cpu-type} @gol
 -mreturn-pointer-on-d0 @gol
--mno-crt0  -mrelax -mliw}
+-mno-crt0  -mrelax -mliw -msetlb}
 
 @emph{PDP-11 Options}
 @gccoptlist{-mfpu  -msoft-float  -mac0  -mno-ac0  -m40  -m45  -m10 @gol
@@ -15071,6 +15071,18 @@ 
 instructions.  This option defines the preprocessor macro
 @samp{__NO_LIW__}.
 
+@item -msetlb
+@opindex msetlb
+Allow the compiler to generate the @emph{SETLB} and @emph{Lcc}
+instructions if the target is the @samp{AM33} or later.  This is the
+default.  This option defines the preprocessor macro @samp{__SETLB__}.
+
+@item -mnosetlb
+@opindex mnosetlb
+Do not allow the compiler to generate @emph{SETLB} or @emph{Lcc}
+instructions.  This option defines the preprocessor macro
+@samp{__NO_SETLB__}.
+
 @end table
 
 @node PDP-11 Options
Index: gcc/config/mn10300/mn10300.opt
===================================================================
--- gcc/config/mn10300/mn10300.opt	(revision 170489)
+++ gcc/config/mn10300/mn10300.opt	(working copy)
@@ -54,3 +54,7 @@ 
 mliw
 Target Report Mask(ALLOW_LIW)
 Allow gcc to generate LIW instructions
+
+msetlb
+Target Report Mask(ALLOW_SETLB)
+Allow gcc to generate the SETLB and Lcc instructions
Index: gcc/config/mn10300/mn10300-modes.def
===================================================================
--- gcc/config/mn10300/mn10300-modes.def	(revision 170489)
+++ gcc/config/mn10300/mn10300-modes.def	(working copy)
@@ -22,3 +22,7 @@ 
 CC_MODE (CCZN);
 CC_MODE (CCZNC);
 CC_MODE (CC_FLOAT);
+
+/* A special mode used to distinguish the Lcc branch instruction
+   from the Bcc branch instruction.  */
+CC_MODE (CC_LCC);
Index: gcc/config/mn10300/predicates.md
===================================================================
--- gcc/config/mn10300/predicates.md	(revision 170489)
+++ gcc/config/mn10300/predicates.md	(working copy)
@@ -56,6 +56,8 @@ 
 {
   if (REGNO (op) != CC_REG)
     return false;
+  if (GET_MODE (op) == CC_LCCmode)
+    return false;
   if (GET_MODE (op) == CC_FLOATmode)
     return false;
   return GET_MODE_CLASS (GET_MODE (op)) == MODE_CC;
Index: gcc/config/mn10300/mn10300.h
===================================================================
--- gcc/config/mn10300/mn10300.h	(revision 170489)
+++ gcc/config/mn10300/mn10300.h	(working copy)
@@ -54,6 +55,8 @@ 
       builtin_define (TARGET_ALLOW_LIW ?	\
 		      "__LIW__" : "__NO_LIW__");\
 						\
+      builtin_define (TARGET_ALLOW_SETLB  ?	\
+		      "__SETLB__" : "__NO_SETLB__");\
     }						\
   while (0)
 
Index: gcc/config/mn10300/mn10300.md
===================================================================
--- gcc/config/mn10300/mn10300.md	(revision 170489)
+++ gcc/config/mn10300/mn10300.md	(working copy)
@@ -42,6 +42,8 @@ 
 
   ;; This is used to encode LIW patterns.
   (UNSPEC_LIW		8)
+  ;; This is for the low overhead loop instructions.
+  (UNSPEC_SETLB         9)
 ])
 
 (include "predicates.md")
@@ -1375,7 +1377,7 @@ 
   DONE;
 })
 
-(define_insn "*cmpsi"
+(define_insn "cmpsi"
   [(set (reg CC_REG)
 	(compare (match_operand:SI 0 "register_operand"  "r,r,r")
 		 (match_operand:SI 1 "nonmemory_operand" "r,O,i")))]
@@ -2152,3 +2154,37 @@ 
   [(set (attr "timings") (if_then_else (eq_attr "cpu" "am34")
 				       (const_int 13) (const_int 12)))]
 )
+
+;; Note - in theory the doloop patterns could be used here to express
+;; the SETLB and Lcc instructions.  In practice this does not work because
+;; the acceptable forms of the doloop patterns do not include UNSPECs
+;; and without them gcc's basic block reordering code can duplicate the
+;; doloop_end pattern, leading to bogus multiple decrements of the loop
+;; counter. 
+
+(define_insn "setlb"
+  [(unspec [(const_int 0)] UNSPEC_SETLB)]
+  "TARGET_AM33 && TARGET_ALLOW_SETLB"
+  "setlb"
+)
+
+(define_insn "Lcc"
+  [(set (pc)
+	(if_then_else (match_operator 0 "comparison_operator"
+		      [(reg:CC_LCC CC_REG) (const_int 0)])
+		      (label_ref (match_operand 1 "" ""))
+		      (pc)))]
+  "TARGET_AM33 && TARGET_ALLOW_SETLB"
+  "L%b0 # loop back to: %1"
+)
+
+;; (define_insn "FLcc"
+;;   [(set (pc)
+;; 	(if_then_else (match_operator 0 "comparison_operator"
+;; 		      [(reg:CC_FLCC CC_REG) (const_int 0)])
+;; 		      (label_ref (match_operand 1 "" ""))
+;; 		      (pc)))]
+;;   "TARGET_AM33 && TARGET_ALLOW_SETLB"
+;;   "FL%b0 # loop back to: %1"
+;;   [(set (attr "timings") (if_then_else (eq_attr "cpu" "am34") (const_int 44) (const_int 11)))]
+;; )
Index: gcc/config/mn10300/mn10300.c
===================================================================
--- gcc/config/mn10300/mn10300.c	(revision 170489)
+++ gcc/config/mn10300/mn10300.c	(working copy)
@@ -2674,6 +2674,7 @@ 
       return CC_FLAG_Z | CC_FLAG_N | CC_FLAG_C;
     case CCZNmode:
       return CC_FLAG_Z | CC_FLAG_N;
+    case CC_LCCmode:
     case CC_FLOATmode:
       return -1;
     default:
@@ -3134,11 +3135,82 @@ 
     }
 }
 
+/* Replace the JUMP insn with a Lcc insn that goes to LABEL.
+   Insert a SETLB insn before LABEL.  */
+
 static void
+mn10300_insert_setlb_lcc (rtx label, rtx jump)
+{
+  rtx lcc, comparison, cmp_reg;
+
+  emit_insn_before (gen_setlb (), label);
+
+  comparison = XEXP (SET_SRC (PATTERN (jump)), 0);
+  cmp_reg = XEXP (comparison, 0);
+  gcc_assert (REG_P (cmp_reg));
+  /* FIXME: We do not currently support the FLcc instruction.  */
+  gcc_assert (GET_MODE (cmp_reg) != CC_FLOATmode);
+
+  /* If the comparison has not already been split out of the jump
+     then do so now.  */
+  if (REGNO (cmp_reg) != CC_REG)
+    emit_insn_before (gen_cmpsi (cmp_reg, XEXP (comparison, 1)), jump);
+  
+  lcc = gen_Lcc (comparison, label);
+  lcc = emit_jump_insn_before (lcc, jump);
+  mark_jump_label (PATTERN (lcc), lcc, 0);
+  delete_insn (jump);
+}
+
+static void
+mn10300_scan_for_setlb_lcc (void)
+{
+  rtx last_code_label = NULL_RTX, r;
+
+  /* What we're doing here is looking for a conditional branch which
+     branches backwards without crossing any non-debug labels.  I.e.
+     the loop has to enter from the top.  */
+
+  for (r = get_insns (); r != NULL_RTX; r = next_nonnote_nondebug_insn (r))
+    {
+      switch (GET_CODE (r))
+	{
+	case CODE_LABEL:
+	  last_code_label = r;
+	  break;
+	case NOTE:
+	  /* This can happen if the first insn is a NOTE.  */
+	case INSN:
+	  /* These are safe to ignore.  */
+	  break;
+	case JUMP_INSN:
+	  /* Check these.  */
+	  if (GET_CODE (PATTERN (r)) == SET
+	      && GET_CODE (SET_SRC (PATTERN (r))) == IF_THEN_ELSE)
+	    {
+	      if (last_code_label != NULL_RTX
+		  && JUMP_LABEL (r) == last_code_label)
+		{
+		  if (LABEL_NUSES (last_code_label) == 1)
+		    mn10300_insert_setlb_lcc (last_code_label, r);
+		}
+	    }
+	  break;
+	default:
+	  last_code_label = NULL_RTX;
+	  break;
+	}
+    }
+}
+
+static void
 mn10300_reorg (void)
 {
   if (TARGET_AM33)
     {
+      if (TARGET_ALLOW_SETLB)
+	mn10300_scan_for_setlb_lcc ();
+
       if (TARGET_ALLOW_LIW)
 	mn10300_bundle_liw ();
     }
@@ -3176,7 +3248,7 @@ 
 #define TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA mn10300_asm_output_addr_const_extra
 
 #undef  TARGET_DEFAULT_TARGET_FLAGS
-#define TARGET_DEFAULT_TARGET_FLAGS MASK_MULT_BUG | MASK_PTR_A0D0 | MASK_ALLOW_LIW
+#define TARGET_DEFAULT_TARGET_FLAGS MASK_MULT_BUG | MASK_PTR_A0D0 | MASK_ALLOW_LIW | MASK_ALLOW_SETLB
 #undef  TARGET_HANDLE_OPTION
 #define TARGET_HANDLE_OPTION mn10300_handle_option
 #undef  TARGET_OPTION_OVERRIDE