
[4/6] Shrink-wrapping

Message ID: 4E37B7A8.8010901@codesourcery.com
State: New

Commit Message

Bernd Schmidt Aug. 2, 2011, 8:39 a.m. UTC
On 07/07/11 16:34, Richard Sandiford wrote:
> I didn't review much after this, because it was hard to sort the
> simple_return stuff out from the "JUMP_LABEL can be a return rtx" change.

So, here's a second preliminary patch. Now that we have returns in
JUMP_LABELs, we can introduce SIMPLE_RETURN and distinguish between the two.
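To illustrate the distinction (a sketch using the macros and globals this
patch introduces, not code from the patch itself):

  rtx lab = JUMP_LABEL (insn);
  if (lab == ret_rtx)
    /* A return that may also perform other epilogue work.  */;
  else if (lab == simple_return_rtx)
    /* A bare return, used on paths that do not pass through the
       prologue.  */;
  else if (lab != NULL_RTX)
    /* An ordinary code label.  */;

ANY_RETURN_P accepts either of the first two.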

Admittedly this patch is somewhat poorly motivated when separated from
the rest of the shrink-wrapping stuff, but it is self-contained. In
order to have one user of simple_return, I've modified the mips epilogue
to generate it.

Bootstrapped and tested on i686-linux; I've also verified that I don't
see code generation changes with mips64-elf, sh-elf and sparc-linux
cross-compilers (with SIMPLE_RETURN placed last in rtl.def since
otherwise there are hashing differences).


Bernd
* doc/rtl.texi (simple_return): Document.
	(parallel, PATTERN): Here too.
	* gengenrtl.c (special_rtx): SIMPLE_RETURN is special.
	* final.c (final_scan_insn): Use ANY_RETURN_P on body.
	* reorg.c (function_return_label, function_simple_return_label):
	New static variables, replacing...
	(end_of_function_label): ... this.
	(simplejump_or_return_p): New static function.
	(optimize_skip, steal_delay_list_from_fallthrough,
	fill_slots_from_thread): Use it.
	(relax_delay_slots): Likewise.  Use ANY_RETURN_P on body.
	(rare_destination, follow_jumps): Use ANY_RETURN_P on body.
	(find_end_label): Take a new arg which is one of the two return
	rtxs.  Depending on which, set either function_return_label or
	function_simple_return_label.  All callers changed.
	(make_return_insns): Make both kinds.
	(dbr_schedule): Adjust for two kinds of end labels.
	* genemit.c (gen_exp): Handle SIMPLE_RETURN.
	(gen_expand, gen_split): Use ANY_RETURN_P.
	* df-scan.c (df_uses_record): Handle SIMPLE_RETURN.
	* rtl.def (SIMPLE_RETURN): New code.
	* ifcvt.c (find_if_case_1): Be more careful about
	redirecting jumps to the EXIT_BLOCK.
	* jump.c (condjump_p, condjump_in_parallel_p, any_condjump_p,
	returnjump_p_1): Handle SIMPLE_RETURNs.
	* print-rtl.c (print_rtx): Likewise.
	* sched-vis.c (print_pattern): Likewise.
	* rtl.c (copy_rtx): Likewise.
	* bt-load.c (compute_defs_uses_and_gen): Use ANY_RETURN_P.
	* combine.c (simplify_set): Likewise.
	* resource.c (find_dead_or_set_registers, mark_set_resources):
	Likewise.
	* emit-rtl.c (verify_rtx_sharing, classify_insn): Handle
	SIMPLE_RETURNs.
	(init_emit_regs): Initialize simple_return_rtx.
	* cfglayout.c (fixup_reorder_chain): Pass a JUMP_LABEL to
	force_nonfallthru_and_redirect.
	* rtl.h (ANY_RETURN_P): Allow SIMPLE_RETURN.
	(GR_SIMPLE_RETURN): New enum value.
	(simple_return_rtx): New macro.
	* basic-block.h (force_nonfallthru_and_redirect): Adjust
	declaration.
	* cfgrtl.c (force_nonfallthru_and_redirect): Take a new jump_label
	argument.  All callers changed.  Be careful about what kinds of
	returnjumps to generate.
	* config/i386/i386.c (ix86_pad_returns, ix86_count_insn_bb,
	ix86_pad_short_function): Use ANY_RETURN_P.
	* config/arm/arm.c (arm_final_prescan_insn): Handle both kinds
	of return.
	* config/mips/mips.md (simple_return, *simple_return,
	simple_return_internal): New patterns.
	* config/mips/mips.c (mips_expand_epilogue): Make the last insn
	a simple_return_internal.

Comments

Richard Sandiford Aug. 3, 2011, 3:38 p.m. UTC | #1
Bernd Schmidt <bernds@codesourcery.com> writes:
> +@findex simple_return
> +@item (simple_return)
> +Like @code{(return)}, but truly represents only a function return, while
> +@code{(return)} may represent an insn that also performs other functions
> +of the function epilogue.  Like @code{(return)}, this may also occur in
> +conditional jumps.

Sorry, I've forgotten the outcome of the discussion about what happens
on targets whose return expands to the same code as their simple_return.
Do the targets still need both "return" and "simple_return" rtxes?
Do they need both md patterns (but potentially using the same rtx
underneath)?

I ask because the rtl.def comment implies that those targets still
need both expanders and both rtxes.  If that's so, I think it needs
to be mentioned here too.  E.g. something like:

  Like @code{(return)}, but truly represents only a function return, while
  @code{(return)} may represent an insn that also performs other functions
  of the function epilogue.  @code{(return)} only occurs on paths that
  pass through the function prologue, while @code{(simple_return)}
  only occurs on paths that do not pass through the prologue.

  Like @code{(return)}, @code{(simple_return)} may also occur in
  conditional jumps.

You need to document the simple_return pattern in md.texi too.

> @@ -231,6 +234,15 @@ first_active_target_insn (rtx insn)
>    return next_active_insn (insn);
>  }
>  
> +/* Return true iff INSN is a simplejump, or any kind of return insn.  */
> +
> +static bool
> +simplejump_or_return_p (rtx insn)
> +{
> +  return (JUMP_P (insn)
> +	  && (simplejump_p (insn) || ANY_RETURN_P (PATTERN (insn))));
> +}

Maybe better in jump.c?  I'll leave it up to you though.

> @@ -346,23 +358,29 @@ insn_sets_resource_p (rtx insn, struct r
>  
>     ??? There may be a problem with the current implementation.  Suppose
>     we start with a bare RETURN insn and call find_end_label.  It may set
> -   end_of_function_label just before the RETURN.  Suppose the machinery
> +   function_return_label just before the RETURN.  Suppose the machinery
>     is able to fill the delay slot of the RETURN insn afterwards.  Then
> -   end_of_function_label is no longer valid according to the property
> +   function_return_label is no longer valid according to the property
>     described above and find_end_label will still return it unmodified.
>     Note that this is probably mitigated by the following observation:
> -   once end_of_function_label is made, it is very likely the target of
> +   once function_return_label is made, it is very likely the target of
>     a jump, so filling the delay slot of the RETURN will be much more
>     difficult.  */
>  
>  static rtx
> -find_end_label (void)
> +find_end_label (rtx kind)

Need to document the new parameter.
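E.g. something like (suggested wording only):

  /* KIND is either ret_rtx or simple_return_rtx, indicating which kind
     of return insn we are looking for; the label found or created is
     cached in function_return_label or function_simple_return_label
     respectively.  */

  static rtx
  find_end_label (rtx kind)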

>  {
>    rtx insn;
> +  rtx *plabel;
> +
> +  if (kind == ret_rtx)
> +    plabel = &function_return_label;
> +  else
> +    plabel = &function_simple_return_label;

I think it'd be worth a gcc_checking_assert that kind == simple_return_rtx
in the other case.
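I.e.:

  if (kind == ret_rtx)
    plabel = &function_return_label;
  else
    {
      gcc_checking_assert (kind == simple_return_rtx);
      plabel = &function_simple_return_label;
    }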

> -	  /* Put the label before an USE insns that may proceed the
> +	  /* Put the label before an USE insns that may precede the
>  	     RETURN insn.  */

Might as well fix s/an USE/any USE/ too while you're there.

> @@ -3498,6 +3506,8 @@ relax_delay_slots (rtx first)
>  	continue;
>  
>        target_label = JUMP_LABEL (delay_insn);
> +      if (target_label && ANY_RETURN_P (target_label))
> +	continue;
>  
>        if (!ANY_RETURN_P (target_label))
>  	{

This doesn't look like a pure "handle return as well as simple return"
change.  Is the idea that every following test only makes sense for
labels, and that things like:

	  && prev_active_insn (target_label) == insn

(to pick just one example) are actively dangerous for returns?
If so, I think you should remove the immediately-following
"if (!ANY_RETURN_P (target_label))" condition and reindent the body.

> @@ -3737,13 +3753,27 @@ make_return_insns (rtx first)
>    for (insn = first; insn; insn = NEXT_INSN (insn))
>      {
>        int flags;
> +      rtx kind, real_label;
>  
>        /* Only look at filled JUMP_INSNs that go to the end of function
>  	 label.  */
>        if (!NONJUMP_INSN_P (insn)
>  	  || GET_CODE (PATTERN (insn)) != SEQUENCE
> -	  || !JUMP_P (XVECEXP (PATTERN (insn), 0, 0))
> -	  || JUMP_LABEL (XVECEXP (PATTERN (insn), 0, 0)) != end_of_function_label)
> +	  || !JUMP_P (XVECEXP (PATTERN (insn), 0, 0)))
> +	continue;
> +
> +      if (JUMP_LABEL (XVECEXP (PATTERN (insn), 0, 0)) == function_return_label)
> +	{
> +	  kind = ret_rtx;
> +	  real_label = real_return_label;
> +	}
> +      else if (JUMP_LABEL (XVECEXP (PATTERN (insn), 0, 0))
> +	       == function_simple_return_label)
> +	{
> +	  kind = simple_return_rtx;
> +	  real_label = real_simple_return_label;
> +	}
> +      else
>  	continue;
>  
>        pat = PATTERN (insn);

Given what you said about JUMP_LABEL sometimes being null,
I think we need either (a) to check whether each *_return_label
is null before comparing it with JUMP_LABEL, or (b) to ensure that
we're dealing with a jump to a label.  (b) seems neater IMO
(as a call to jump_to_label_p).
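For (b), something like this at the top of the loop body (DELAY_JUMP is
just a local name for the XVECEXP the patch already extracts):

      rtx delay_jump = XVECEXP (PATTERN (insn), 0, 0);

      /* Insist on a jump to a label, so that a null JUMP_LABEL can
	 never compare equal to a null *_return_label.  */
      if (!jump_to_label_p (delay_jump))
	continue;

with the existing function_return_label / function_simple_return_label
comparisons following unchanged.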

> +#if defined HAVE_return || defined HAVE_simple_return
> +  if (
>  #ifdef HAVE_return
> -  if (HAVE_return && end_of_function_label != 0)
> +      (HAVE_return && function_return_label != 0)
> +#else
> +      0
> +#endif
> +#ifdef HAVE_simple_return
> +      || (HAVE_simple_return && function_simple_return_label != 0)
> +#endif
> +      )
>      make_return_insns (first);
>  #endif

Eww.  Given that make_return_insns clears the *return_labels,
it's probably more readable just to have two conditional calls:

#ifdef HAVE_return
  if (HAVE_return && function_return_label != 0)
    make_return_insns (first);
#endif
#ifdef HAVE_simple_return
  if (HAVE_simple_return && function_simple_return_label != 0)
    make_return_insns (first);
#endif

I'll leave it up to you though.

> Index: gcc/emit-rtl.c
> ===================================================================
> --- gcc/emit-rtl.c	(revision 176881)
> +++ gcc/emit-rtl.c	(working copy)
> @@ -2518,6 +2518,7 @@ verify_rtx_sharing (rtx orig, rtx insn)
>      case PC:
>      case CC0:
>      case RETURN:
> +    case SIMPLE_RETURN:
>      case SCRATCH:
>        return;
>        /* SCRATCH must be shared because they represent distinct values.  */

Given Alan's patch, I suppose you also need cases for copy_rtx_if_shared_1,
copy_insn_1 and mark_used_flags.  (Sorry about being wise after the fact here.)
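I.e. SIMPLE_RETURN joins the same shared-singleton case list in each of
them; for copy_insn_1, say, that would be (sketch):

    case PC:
    case CC0:
    case RETURN:
    case SIMPLE_RETURN:
      /* Shared singletons are never copied.  */
      return orig;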

> Index: gcc/config/mips/mips.md
> ===================================================================
> --- gcc/config/mips/mips.md	(revision 176879)
> +++ gcc/config/mips/mips.md	(working copy)
> @@ -5724,6 +5724,18 @@ (define_insn "*return"
>    [(set_attr "type"	"jump")
>     (set_attr "mode"	"none")])
>  
> +(define_expand "simple_return"
> +  [(simple_return)]
> +  "!mips_can_use_return_insn ()"
> +  { mips_expand_before_return (); })
> +
> +(define_insn "*simple_return"
> +  [(simple_return)]
> +  "!mips_can_use_return_insn ()"
> +  "%*j\t$31%/"
> +  [(set_attr "type"	"jump")
> +   (set_attr "mode"	"none")])
> +
>  ;; Normal return.
>  
>  (define_insn "return_internal"
> @@ -5731,6 +5743,14 @@ (define_insn "return_internal"
>     (use (match_operand 0 "pmode_register_operand" ""))]
>    ""
>    "%*j\t%0%/"
> +  [(set_attr "type"	"jump")
> +   (set_attr "mode"	"none")])
> +
> +(define_insn "simple_return_internal"
> +  [(simple_return)
> +   (use (match_operand 0 "pmode_register_operand" ""))]
> +  ""
> +  "%*j\t%0%/"
>    [(set_attr "type"	"jump")
>     (set_attr "mode"	"none")])

Please add:

(define_code_iterator any_return [return simple_return])

and just change the appropriate returns to any_returns.
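E.g. (the code attribute below is only illustrative -- use whatever
naming the port prefers):

  (define_code_iterator any_return [return simple_return])

  ;; Illustrative mapping from code to pattern-name prefix.
  (define_code_attr return_str [(return "return")
				(simple_return "simple_return")])

  (define_insn "<return_str>_internal"
    [(any_return)
     (use (match_operand 0 "pmode_register_operand" ""))]
    ""
    "%*j\t%0%/"
    [(set_attr "type" "jump")
     (set_attr "mode" "none")])

which replaces both return_internal and simple_return_internal.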

The rtl and MIPS bits look good to me otherwise.

Richard

Patch

Index: gcc/doc/rtl.texi
===================================================================
--- gcc/doc/rtl.texi	(revision 176879)
+++ gcc/doc/rtl.texi	(working copy)
@@ -2915,6 +2915,13 @@  placed in @code{pc} to return to the cal
 Note that an insn pattern of @code{(return)} is logically equivalent to
 @code{(set (pc) (return))}, but the latter form is never used.
 
+@findex simple_return
+@item (simple_return)
+Like @code{(return)}, but truly represents only a function return, while
+@code{(return)} may represent an insn that also performs other functions
+of the function epilogue.  Like @code{(return)}, this may also occur in
+conditional jumps.
+
 @findex call
 @item (call @var{function} @var{nargs})
 Represents a function call.  @var{function} is a @code{mem} expression
@@ -3044,7 +3051,7 @@  Represents several side effects performe
 brackets stand for a vector; the operand of @code{parallel} is a
 vector of expressions.  @var{x0}, @var{x1} and so on are individual
 side effect expressions---expressions of code @code{set}, @code{call},
-@code{return}, @code{clobber} or @code{use}.
+@code{return}, @code{simple_return}, @code{clobber} or @code{use}.
 
 ``In parallel'' means that first all the values used in the individual
 side-effects are computed, and second all the actual side-effects are
@@ -3683,14 +3690,16 @@  and @code{call_insn} insns:
 @table @code
 @findex PATTERN
 @item PATTERN (@var{i})
-An expression for the side effect performed by this insn.  This must be
-one of the following codes: @code{set}, @code{call}, @code{use},
-@code{clobber}, @code{return}, @code{asm_input}, @code{asm_output},
-@code{addr_vec}, @code{addr_diff_vec}, @code{trap_if}, @code{unspec},
-@code{unspec_volatile}, @code{parallel}, @code{cond_exec}, or @code{sequence}.  If it is a @code{parallel},
-each element of the @code{parallel} must be one these codes, except that
-@code{parallel} expressions cannot be nested and @code{addr_vec} and
-@code{addr_diff_vec} are not permitted inside a @code{parallel} expression.
+An expression for the side effect performed by this insn.  This must
+be one of the following codes: @code{set}, @code{call}, @code{use},
+@code{clobber}, @code{return}, @code{simple_return}, @code{asm_input},
+@code{asm_output}, @code{addr_vec}, @code{addr_diff_vec},
+@code{trap_if}, @code{unspec}, @code{unspec_volatile},
+@code{parallel}, @code{cond_exec}, or @code{sequence}.  If it is a
+@code{parallel}, each element of the @code{parallel} must be one these
+codes, except that @code{parallel} expressions cannot be nested and
+@code{addr_vec} and @code{addr_diff_vec} are not permitted inside a
+@code{parallel} expression.
 
 @findex INSN_CODE
 @item INSN_CODE (@var{i})
Index: gcc/gengenrtl.c
===================================================================
--- gcc/gengenrtl.c	(revision 176879)
+++ gcc/gengenrtl.c	(working copy)
@@ -131,6 +131,7 @@  special_rtx (int idx)
 	  || strcmp (defs[idx].enumname, "PC") == 0
 	  || strcmp (defs[idx].enumname, "CC0") == 0
 	  || strcmp (defs[idx].enumname, "RETURN") == 0
+	  || strcmp (defs[idx].enumname, "SIMPLE_RETURN") == 0
 	  || strcmp (defs[idx].enumname, "CONST_VECTOR") == 0);
 }
 
Index: gcc/final.c
===================================================================
--- gcc/final.c	(revision 176879)
+++ gcc/final.c	(working copy)
@@ -2492,7 +2492,7 @@  final_scan_insn (rtx insn, FILE *file, i
 	        delete_insn (insn);
 		break;
 	      }
-	    else if (GET_CODE (SET_SRC (body)) == RETURN)
+	    else if (ANY_RETURN_P (SET_SRC (body)))
 	      /* Replace (set (pc) (return)) with (return).  */
 	      PATTERN (insn) = body = SET_SRC (body);
 
Index: gcc/reorg.c
===================================================================
--- gcc/reorg.c	(revision 176881)
+++ gcc/reorg.c	(working copy)
@@ -161,8 +161,11 @@  static rtx *unfilled_firstobj;
 #define unfilled_slots_next	\
   ((rtx *) obstack_next_free (&unfilled_slots_obstack))
 
-/* Points to the label before the end of the function.  */
-static rtx end_of_function_label;
+/* Points to the label before the end of the function, or before a
+   return insn.  */
+static rtx function_return_label;
+/* Likewise for a simple_return.  */
+static rtx function_simple_return_label;
 
 /* Mapping between INSN_UID's and position in the code since INSN_UID's do
    not always monotonically increase.  */
@@ -175,7 +178,7 @@  static int stop_search_p (rtx, int);
 static int resource_conflicts_p (struct resources *, struct resources *);
 static int insn_references_resource_p (rtx, struct resources *, bool);
 static int insn_sets_resource_p (rtx, struct resources *, bool);
-static rtx find_end_label (void);
+static rtx find_end_label (rtx);
 static rtx emit_delay_sequence (rtx, rtx, int);
 static rtx add_to_delay_list (rtx, rtx);
 static rtx delete_from_delay_slot (rtx);
@@ -231,6 +234,15 @@  first_active_target_insn (rtx insn)
   return next_active_insn (insn);
 }
 
+/* Return true iff INSN is a simplejump, or any kind of return insn.  */
+
+static bool
+simplejump_or_return_p (rtx insn)
+{
+  return (JUMP_P (insn)
+	  && (simplejump_p (insn) || ANY_RETURN_P (PATTERN (insn))));
+}
+
 /* Return TRUE if this insn should stop the search for insn to fill delay
    slots.  LABELS_P indicates that labels should terminate the search.
    In all cases, jumps terminate the search.  */
@@ -346,23 +358,29 @@  insn_sets_resource_p (rtx insn, struct r
 
    ??? There may be a problem with the current implementation.  Suppose
    we start with a bare RETURN insn and call find_end_label.  It may set
-   end_of_function_label just before the RETURN.  Suppose the machinery
+   function_return_label just before the RETURN.  Suppose the machinery
    is able to fill the delay slot of the RETURN insn afterwards.  Then
-   end_of_function_label is no longer valid according to the property
+   function_return_label is no longer valid according to the property
    described above and find_end_label will still return it unmodified.
    Note that this is probably mitigated by the following observation:
-   once end_of_function_label is made, it is very likely the target of
+   once function_return_label is made, it is very likely the target of
    a jump, so filling the delay slot of the RETURN will be much more
    difficult.  */
 
 static rtx
-find_end_label (void)
+find_end_label (rtx kind)
 {
   rtx insn;
+  rtx *plabel;
+
+  if (kind == ret_rtx)
+    plabel = &function_return_label;
+  else
+    plabel = &function_simple_return_label;
 
   /* If we found one previously, return it.  */
-  if (end_of_function_label)
-    return end_of_function_label;
+  if (*plabel)
+    return *plabel;
 
   /* Otherwise, see if there is a label at the end of the function.  If there
      is, it must be that RETURN insns aren't needed, so that is our return
@@ -377,44 +395,44 @@  find_end_label (void)
 
   /* When a target threads its epilogue we might already have a
      suitable return insn.  If so put a label before it for the
-     end_of_function_label.  */
+     function_return_label.  */
   if (BARRIER_P (insn)
       && JUMP_P (PREV_INSN (insn))
-      && GET_CODE (PATTERN (PREV_INSN (insn))) == RETURN)
+      && PATTERN (PREV_INSN (insn)) == kind)
     {
       rtx temp = PREV_INSN (PREV_INSN (insn));
-      end_of_function_label = gen_label_rtx ();
-      LABEL_NUSES (end_of_function_label) = 0;
+      rtx label = gen_label_rtx ();
+      LABEL_NUSES (label) = 0;
 
       /* Put the label before an USE insns that may precede the RETURN insn.  */
       while (GET_CODE (temp) == USE)
 	temp = PREV_INSN (temp);
 
-      emit_label_after (end_of_function_label, temp);
+      emit_label_after (label, temp);
+      *plabel = label;
     }
 
   else if (LABEL_P (insn))
-    end_of_function_label = insn;
+    *plabel = insn;
   else
     {
-      end_of_function_label = gen_label_rtx ();
-      LABEL_NUSES (end_of_function_label) = 0;
+      rtx label = gen_label_rtx ();
+      LABEL_NUSES (label) = 0;
       /* If the basic block reorder pass moves the return insn to
 	 some other place try to locate it again and put our
-	 end_of_function_label there.  */
-      while (insn && ! (JUMP_P (insn)
-		        && (GET_CODE (PATTERN (insn)) == RETURN)))
+	 function_return_label there.  */
+      while (insn && ! (JUMP_P (insn) && (PATTERN (insn) == kind)))
 	insn = PREV_INSN (insn);
       if (insn)
 	{
 	  insn = PREV_INSN (insn);
 
-	  /* Put the label before an USE insns that may proceed the
+	  /* Put the label before an USE insns that may precede the
 	     RETURN insn.  */
 	  while (GET_CODE (insn) == USE)
 	    insn = PREV_INSN (insn);
 
-	  emit_label_after (end_of_function_label, insn);
+	  emit_label_after (label, insn);
 	}
       else
 	{
@@ -424,19 +442,16 @@  find_end_label (void)
 	      && ! HAVE_return
 #endif
 	      )
-	    {
-	      /* The RETURN insn has its delay slot filled so we cannot
-		 emit the label just before it.  Since we already have
-		 an epilogue and cannot emit a new RETURN, we cannot
-		 emit the label at all.  */
-	      end_of_function_label = NULL_RTX;
-	      return end_of_function_label;
-	    }
+	    /* The RETURN insn has its delay slot filled so we cannot
+	       emit the label just before it.  Since we already have
+	       an epilogue and cannot emit a new RETURN, we cannot
+	       emit the label at all.  */
+	    return NULL_RTX;
 #endif /* HAVE_epilogue */
 
 	  /* Otherwise, make a new label and emit a RETURN and BARRIER,
 	     if needed.  */
-	  emit_label (end_of_function_label);
+	  emit_label (label);
 #ifdef HAVE_return
 	  /* We don't bother trying to create a return insn if the
 	     epilogue has filled delay-slots; we would have to try and
@@ -455,13 +470,14 @@  find_end_label (void)
 	    }
 #endif
 	}
+      *plabel = label;
     }
 
   /* Show one additional use for this label so it won't go away until
      we are done.  */
-  ++LABEL_NUSES (end_of_function_label);
+  ++LABEL_NUSES (*plabel);
 
-  return end_of_function_label;
+  return *plabel;
 }
 
 /* Put INSN and LIST together in a SEQUENCE rtx of LENGTH, and replace
@@ -809,10 +825,8 @@  optimize_skip (rtx insn)
   if ((next_trial == next_active_insn (JUMP_LABEL (insn))
        && ! (next_trial == 0 && crtl->epilogue_delay_list != 0))
       || (next_trial != 0
-	  && JUMP_P (next_trial)
-	  && JUMP_LABEL (insn) == JUMP_LABEL (next_trial)
-	  && (simplejump_p (next_trial)
-	      || GET_CODE (PATTERN (next_trial)) == RETURN)))
+	  && simplejump_or_return_p (next_trial)
+	  && JUMP_LABEL (insn) == JUMP_LABEL (next_trial)))
     {
       if (eligible_for_annul_false (insn, 0, trial, flags))
 	{
@@ -831,13 +845,11 @@  optimize_skip (rtx insn)
 	 branch, thread our jump to the target of that branch.  Don't
 	 change this into a RETURN here, because it may not accept what
 	 we have in the delay slot.  We'll fix this up later.  */
-      if (next_trial && JUMP_P (next_trial)
-	  && (simplejump_p (next_trial)
-	      || GET_CODE (PATTERN (next_trial)) == RETURN))
+      if (next_trial && simplejump_or_return_p (next_trial))
 	{
 	  rtx target_label = JUMP_LABEL (next_trial);
 	  if (ANY_RETURN_P (target_label))
-	    target_label = find_end_label ();
+	    target_label = find_end_label (target_label);
 
 	  if (target_label)
 	    {
@@ -951,7 +963,7 @@  rare_destination (rtx insn)
 	     return.  */
 	  return 2;
 	case JUMP_INSN:
-	  if (GET_CODE (PATTERN (insn)) == RETURN)
+	  if (ANY_RETURN_P (PATTERN (insn)))
 	    return 1;
 	  else if (simplejump_p (insn)
 		   && jump_count++ < 10)
@@ -1366,8 +1378,7 @@  steal_delay_list_from_fallthrough (rtx i
   /* We can't do anything if SEQ's delay insn isn't an
      unconditional branch.  */
 
-  if (! simplejump_p (XVECEXP (seq, 0, 0))
-      && GET_CODE (PATTERN (XVECEXP (seq, 0, 0))) != RETURN)
+  if (! simplejump_or_return_p (XVECEXP (seq, 0, 0)))
     return delay_list;
 
   for (i = 1; i < XVECLEN (seq, 0); i++)
@@ -2376,7 +2387,7 @@  fill_simple_delay_slots (int non_jumps_p
 	      if (new_label != 0)
 		new_label = get_label_before (new_label);
 	      else
-		new_label = find_end_label ();
+		new_label = find_end_label (simple_return_rtx);
 
 	      if (new_label)
 	        {
@@ -2508,7 +2519,8 @@  fill_simple_delay_slots (int non_jumps_p
 
 /* Follow any unconditional jump at LABEL;
    return the ultimate label reached by any such chain of jumps.
-   Return ret_rtx if the chain ultimately leads to a return instruction.
+   Return a suitable return rtx if the chain ultimately leads to a
+   return instruction.
    If LABEL is not followed by a jump, return LABEL.
    If the chain loops or we can't find end, return LABEL,
    since that tells caller to avoid changing the insn.  */
@@ -2529,7 +2541,7 @@  follow_jumps (rtx label)
 	&& JUMP_P (insn)
 	&& JUMP_LABEL (insn) != NULL_RTX
 	&& ((any_uncondjump_p (insn) && onlyjump_p (insn))
-	    || GET_CODE (PATTERN (insn)) == RETURN)
+	    || ANY_RETURN_P (PATTERN (insn)))
 	&& (next = NEXT_INSN (insn))
 	&& BARRIER_P (next));
        depth++)
@@ -2996,16 +3008,14 @@  fill_slots_from_thread (rtx insn, rtx co
 
       gcc_assert (thread_if_true);
 
-      if (new_thread && JUMP_P (new_thread)
-	  && (simplejump_p (new_thread)
-	      || GET_CODE (PATTERN (new_thread)) == RETURN)
+      if (new_thread && simplejump_or_return_p (new_thread)
 	  && redirect_with_delay_list_safe_p (insn,
 					      JUMP_LABEL (new_thread),
 					      delay_list))
 	new_thread = follow_jumps (JUMP_LABEL (new_thread));
 
       if (ANY_RETURN_P (new_thread))
-	label = find_end_label ();
+	label = find_end_label (new_thread);
       else if (LABEL_P (new_thread))
 	label = new_thread;
       else
@@ -3355,7 +3365,7 @@  relax_delay_slots (rtx first)
 	{
 	  target_label = skip_consecutive_labels (follow_jumps (target_label));
 	  if (ANY_RETURN_P (target_label))
-	    target_label = find_end_label ();
+	    target_label = find_end_label (target_label);
 
 	  if (target_label && next_active_insn (target_label) == next
 	      && ! condjump_in_parallel_p (insn))
@@ -3370,9 +3380,8 @@  relax_delay_slots (rtx first)
 	  /* See if this jump conditionally branches around an unconditional
 	     jump.  If so, invert this jump and point it to the target of the
 	     second jump.  */
-	  if (next && JUMP_P (next)
+	  if (next && simplejump_or_return_p (next)
 	      && any_condjump_p (insn)
-	      && (simplejump_p (next) || GET_CODE (PATTERN (next)) == RETURN)
 	      && target_label
 	      && next_active_insn (target_label) == next_active_insn (next)
 	      && no_labels_between_p (insn, next))
@@ -3414,8 +3423,7 @@  relax_delay_slots (rtx first)
 	 Don't do this if we expect the conditional branch to be true, because
 	 we would then be making the more common case longer.  */
 
-      if (JUMP_P (insn)
-	  && (simplejump_p (insn) || GET_CODE (PATTERN (insn)) == RETURN)
+      if (simplejump_or_return_p (insn)
 	  && (other = prev_active_insn (insn)) != 0
 	  && any_condjump_p (other)
 	  && no_labels_between_p (other, insn)
@@ -3456,10 +3464,10 @@  relax_delay_slots (rtx first)
 	 Only do so if optimizing for size since this results in slower, but
 	 smaller code.  */
       if (optimize_function_for_size_p (cfun)
-	  && GET_CODE (PATTERN (delay_insn)) == RETURN
+	  && ANY_RETURN_P (PATTERN (delay_insn))
 	  && next
 	  && JUMP_P (next)
-	  && GET_CODE (PATTERN (next)) == RETURN)
+	  && PATTERN (next) == PATTERN (delay_insn))
 	{
 	  rtx after;
 	  int i;
@@ -3498,6 +3506,8 @@  relax_delay_slots (rtx first)
 	continue;
 
       target_label = JUMP_LABEL (delay_insn);
+      if (target_label && ANY_RETURN_P (target_label))
+	continue;
 
       if (!ANY_RETURN_P (target_label))
 	{
@@ -3505,7 +3515,7 @@  relax_delay_slots (rtx first)
 	     don't convert a jump into a RETURN here.  */
 	  trial = skip_consecutive_labels (follow_jumps (target_label));
 	  if (ANY_RETURN_P (trial))
-	    trial = find_end_label ();
+	    trial = find_end_label (trial);
 
 	  if (trial && trial != target_label
 	      && redirect_with_delay_slots_safe_p (delay_insn, trial, insn))
@@ -3528,7 +3538,7 @@  relax_delay_slots (rtx first)
 		 later incorrectly compute register live/death info.  */
 	      rtx tmp = next_active_insn (trial);
 	      if (tmp == 0)
-		tmp = find_end_label ();
+		tmp = find_end_label (simple_return_rtx);
 
 	      if (tmp)
 	        {
@@ -3549,13 +3559,12 @@  relax_delay_slots (rtx first)
 	  if (trial && GET_CODE (PATTERN (trial)) == SEQUENCE
 	      && XVECLEN (PATTERN (trial), 0) == 2
 	      && JUMP_P (XVECEXP (PATTERN (trial), 0, 0))
-	      && (simplejump_p (XVECEXP (PATTERN (trial), 0, 0))
-		  || GET_CODE (PATTERN (XVECEXP (PATTERN (trial), 0, 0))) == RETURN)
+	      && simplejump_or_return_p (XVECEXP (PATTERN (trial), 0, 0))
 	      && redundant_insn (XVECEXP (PATTERN (trial), 0, 1), insn, 0))
 	    {
 	      target_label = JUMP_LABEL (XVECEXP (PATTERN (trial), 0, 0));
 	      if (ANY_RETURN_P (target_label))
-		target_label = find_end_label ();
+		target_label = find_end_label (target_label);
 
 	      if (target_label
 	          && redirect_with_delay_slots_safe_p (delay_insn, target_label,
@@ -3633,8 +3642,7 @@  relax_delay_slots (rtx first)
 	 a RETURN here.  */
       if (! INSN_ANNULLED_BRANCH_P (delay_insn)
 	  && any_condjump_p (delay_insn)
-	  && next && JUMP_P (next)
-	  && (simplejump_p (next) || GET_CODE (PATTERN (next)) == RETURN)
+	  && next && simplejump_or_return_p (next)
 	  && next_active_insn (target_label) == next_active_insn (next)
 	  && no_labels_between_p (insn, next))
 	{
@@ -3642,7 +3650,7 @@  relax_delay_slots (rtx first)
 	  rtx old_label = JUMP_LABEL (delay_insn);
 
 	  if (ANY_RETURN_P (label))
-	    label = find_end_label ();
+	    label = find_end_label (label);
 
 	  /* find_end_label can generate a new label. Check this first.  */
 	  if (label
@@ -3703,7 +3711,8 @@  static void
 make_return_insns (rtx first)
 {
   rtx insn, jump_insn, pat;
-  rtx real_return_label = end_of_function_label;
+  rtx real_return_label = function_return_label;
+  rtx real_simple_return_label = function_simple_return_label;
   int slots, i;
 
 #ifdef DELAY_SLOTS_FOR_EPILOGUE
@@ -3721,15 +3730,22 @@  make_return_insns (rtx first)
      made for END_OF_FUNCTION_LABEL.  If so, set up anything we can't change
      into a RETURN to jump to it.  */
   for (insn = first; insn; insn = NEXT_INSN (insn))
-    if (JUMP_P (insn) && GET_CODE (PATTERN (insn)) == RETURN)
+    if (JUMP_P (insn) && ANY_RETURN_P (PATTERN (insn)))
       {
-	real_return_label = get_label_before (insn);
+	rtx t = get_label_before (insn);
+	if (PATTERN (insn) == ret_rtx)
+	  real_return_label = t;
+	else
+	  real_simple_return_label = t;
 	break;
       }
 
   /* Show an extra usage of REAL_RETURN_LABEL so it won't go away if it
      was equal to END_OF_FUNCTION_LABEL.  */
-  LABEL_NUSES (real_return_label)++;
+  if (real_return_label)
+    LABEL_NUSES (real_return_label)++;
+  if (real_simple_return_label)
+    LABEL_NUSES (real_simple_return_label)++;
 
   /* Clear the list of insns to fill so we can use it.  */
   obstack_free (&unfilled_slots_obstack, unfilled_firstobj);
@@ -3737,13 +3753,27 @@  make_return_insns (rtx first)
   for (insn = first; insn; insn = NEXT_INSN (insn))
     {
       int flags;
+      rtx kind, real_label;
 
       /* Only look at filled JUMP_INSNs that go to the end of function
 	 label.  */
       if (!NONJUMP_INSN_P (insn)
 	  || GET_CODE (PATTERN (insn)) != SEQUENCE
-	  || !JUMP_P (XVECEXP (PATTERN (insn), 0, 0))
-	  || JUMP_LABEL (XVECEXP (PATTERN (insn), 0, 0)) != end_of_function_label)
+	  || !JUMP_P (XVECEXP (PATTERN (insn), 0, 0)))
+	continue;
+
+      if (JUMP_LABEL (XVECEXP (PATTERN (insn), 0, 0)) == function_return_label)
+	{
+	  kind = ret_rtx;
+	  real_label = real_return_label;
+	}
+      else if (JUMP_LABEL (XVECEXP (PATTERN (insn), 0, 0))
+	       == function_simple_return_label)
+	{
+	  kind = simple_return_rtx;
+	  real_label = real_simple_return_label;
+	}
+      else
 	continue;
 
       pat = PATTERN (insn);
@@ -3751,14 +3781,12 @@  make_return_insns (rtx first)
 
       /* If we can't make the jump into a RETURN, try to redirect it to the best
 	 RETURN and go on to the next insn.  */
-      if (! reorg_redirect_jump (jump_insn, ret_rtx))
+      if (!reorg_redirect_jump (jump_insn, kind))
 	{
 	  /* Make sure redirecting the jump will not invalidate the delay
 	     slot insns.  */
-	  if (redirect_with_delay_slots_safe_p (jump_insn,
-						real_return_label,
-						insn))
-	    reorg_redirect_jump (jump_insn, real_return_label);
+	  if (redirect_with_delay_slots_safe_p (jump_insn, real_label, insn))
+	    reorg_redirect_jump (jump_insn, real_label);
 	  continue;
 	}
 
@@ -3798,7 +3826,7 @@  make_return_insns (rtx first)
 	 RETURN, delete the SEQUENCE and output the individual insns,
 	 followed by the RETURN.  Then set things up so we try to find
 	 insns for its delay slots, if it needs some.  */
-      if (GET_CODE (PATTERN (jump_insn)) == RETURN)
+      if (ANY_RETURN_P (PATTERN (jump_insn)))
 	{
 	  rtx prev = PREV_INSN (insn);
 
@@ -3815,13 +3843,16 @@  make_return_insns (rtx first)
       else
 	/* It is probably more efficient to keep this with its current
 	   delay slot as a branch to a RETURN.  */
-	reorg_redirect_jump (jump_insn, real_return_label);
+	reorg_redirect_jump (jump_insn, real_label);
     }
 
   /* Now delete REAL_RETURN_LABEL if we never used it.  Then try to fill any
      new delay slots we have created.  */
-  if (--LABEL_NUSES (real_return_label) == 0)
+  if (real_return_label != NULL_RTX && --LABEL_NUSES (real_return_label) == 0)
     delete_related_insns (real_return_label);
+  if (real_simple_return_label != NULL_RTX
+      && --LABEL_NUSES (real_simple_return_label) == 0)
+    delete_related_insns (real_simple_return_label);
 
   fill_simple_delay_slots (1);
   fill_simple_delay_slots (0);
@@ -3889,7 +3920,7 @@  dbr_schedule (rtx first)
   init_resource_info (epilogue_insn);
 
   /* Show we haven't computed an end-of-function label yet.  */
-  end_of_function_label = 0;
+  function_return_label = function_simple_return_label = NULL_RTX;
 
   /* Initialize the statistics for this function.  */
   memset (num_insns_needing_delays, 0, sizeof num_insns_needing_delays);
@@ -3911,11 +3942,23 @@  dbr_schedule (rtx first)
   /* If we made an end of function label, indicate that it is now
      safe to delete it by undoing our prior adjustment to LABEL_NUSES.
      If it is now unused, delete it.  */
-  if (end_of_function_label && --LABEL_NUSES (end_of_function_label) == 0)
-    delete_related_insns (end_of_function_label);
+  if (function_return_label && --LABEL_NUSES (function_return_label) == 0)
+    delete_related_insns (function_return_label);
+  if (function_simple_return_label
+      && --LABEL_NUSES (function_simple_return_label) == 0)
+    delete_related_insns (function_simple_return_label);
 
+#if defined HAVE_return || defined HAVE_simple_return
+  if (
 #ifdef HAVE_return
-  if (HAVE_return && end_of_function_label != 0)
+      (HAVE_return && function_return_label != 0)
+#else
+      0
+#endif
+#ifdef HAVE_simple_return
+      || (HAVE_simple_return && function_simple_return_label != 0)
+#endif
+      )
     make_return_insns (first);
 #endif
 
Index: gcc/genemit.c
===================================================================
--- gcc/genemit.c	(revision 176879)
+++ gcc/genemit.c	(working copy)
@@ -169,6 +169,9 @@  gen_exp (rtx x, enum rtx_code subroutine
     case RETURN:
       printf ("ret_rtx");
       return;
+    case SIMPLE_RETURN:
+      printf ("simple_return_rtx");
+      return;
     case CLOBBER:
       if (REG_P (XEXP (x, 0)))
 	{
@@ -489,8 +492,8 @@  gen_expand (rtx expand)
 	  || (GET_CODE (next) == PARALLEL
 	      && ((GET_CODE (XVECEXP (next, 0, 0)) == SET
 		   && GET_CODE (SET_DEST (XVECEXP (next, 0, 0))) == PC)
-		  || GET_CODE (XVECEXP (next, 0, 0)) == RETURN))
-	  || GET_CODE (next) == RETURN)
+		  || ANY_RETURN_P (XVECEXP (next, 0, 0))))
+	  || ANY_RETURN_P (next))
 	printf ("  emit_jump_insn (");
       else if ((GET_CODE (next) == SET && GET_CODE (SET_SRC (next)) == CALL)
 	       || GET_CODE (next) == CALL
@@ -607,7 +610,7 @@  gen_split (rtx split)
 	  || (GET_CODE (next) == PARALLEL
 	      && GET_CODE (XVECEXP (next, 0, 0)) == SET
 	      && GET_CODE (SET_DEST (XVECEXP (next, 0, 0))) == PC)
-	  || GET_CODE (next) == RETURN)
+	  || ANY_RETURN_P (next))
 	printf ("  emit_jump_insn (");
       else if ((GET_CODE (next) == SET && GET_CODE (SET_SRC (next)) == CALL)
 	       || GET_CODE (next) == CALL
Index: gcc/df-scan.c
===================================================================
--- gcc/df-scan.c	(revision 176879)
+++ gcc/df-scan.c	(working copy)
@@ -3181,6 +3181,7 @@  df_uses_record (struct df_collection_rec
       }
 
     case RETURN:
+    case SIMPLE_RETURN:
       break;
 
     case ASM_OPERANDS:
Index: gcc/rtl.def
===================================================================
--- gcc/rtl.def	(revision 176879)
+++ gcc/rtl.def	(working copy)
@@ -731,6 +731,10 @@  DEF_RTL_EXPR(ENTRY_VALUE, "entry_value",
    been optimized away completely.  */
 DEF_RTL_EXPR(DEBUG_PARAMETER_REF, "debug_parameter_ref", "t", RTX_OBJ)
 
+/* A plain return, to be used on paths that are reached without going
+   through the function prologue.  */
+DEF_RTL_EXPR(SIMPLE_RETURN, "simple_return", "", RTX_EXTRA)
+
 /* All expressions from this point forward appear only in machine
    descriptions.  */
 #ifdef GENERATOR_FILE
Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c	(revision 176881)
+++ gcc/ifcvt.c	(working copy)
@@ -3796,6 +3796,7 @@  find_if_case_1 (basic_block test_bb, edg
   basic_block then_bb = then_edge->dest;
   basic_block else_bb = else_edge->dest;
   basic_block new_bb;
+  rtx else_target = NULL_RTX;
   int then_bb_index;
 
   /* If we are partitioning hot/cold basic blocks, we don't want to
@@ -3845,6 +3846,13 @@  find_if_case_1 (basic_block test_bb, edg
 				    predictable_edge_p (then_edge)))))
     return FALSE;
 
+  if (else_bb == EXIT_BLOCK_PTR)
+    {
+      rtx jump = BB_END (else_edge->src);
+      gcc_assert (JUMP_P (jump));
+      else_target = JUMP_LABEL (jump);
+    }
+
   /* Registers set are dead, or are predicable.  */
   if (! dead_or_predicable (test_bb, then_bb, else_bb,
 			    single_succ_edge (then_bb), 1))
@@ -3864,6 +3872,9 @@  find_if_case_1 (basic_block test_bb, edg
       redirect_edge_succ (FALLTHRU_EDGE (test_bb), else_bb);
       new_bb = 0;
     }
+  else if (else_bb == EXIT_BLOCK_PTR)
+    new_bb = force_nonfallthru_and_redirect (FALLTHRU_EDGE (test_bb),
+					     else_bb, else_target);
   else
     new_bb = redirect_edge_and_branch_force (FALLTHRU_EDGE (test_bb),
 					     else_bb);
Index: gcc/jump.c
===================================================================
--- gcc/jump.c	(revision 176881)
+++ gcc/jump.c	(working copy)
@@ -29,7 +29,8 @@  along with GCC; see the file COPYING3.
    JUMP_LABEL internal field.  With this we can detect labels that
    become unused because of the deletion of all the jumps that
    formerly used them.  The JUMP_LABEL info is sometimes looked
-   at by later passes.
+   at by later passes.  For return insns, it contains either a
+   RETURN or a SIMPLE_RETURN rtx.
 
    The subroutines redirect_jump and invert_jump are used
    from other passes as well.  */
@@ -775,10 +776,10 @@  condjump_p (const_rtx insn)
     return (GET_CODE (x) == IF_THEN_ELSE
 	    && ((GET_CODE (XEXP (x, 2)) == PC
 		 && (GET_CODE (XEXP (x, 1)) == LABEL_REF
-		     || GET_CODE (XEXP (x, 1)) == RETURN))
+		     || ANY_RETURN_P (XEXP (x, 1))))
 		|| (GET_CODE (XEXP (x, 1)) == PC
 		    && (GET_CODE (XEXP (x, 2)) == LABEL_REF
-			|| GET_CODE (XEXP (x, 2)) == RETURN))));
+			|| ANY_RETURN_P (XEXP (x, 2))))));
 }
 
 /* Return nonzero if INSN is a (possibly) conditional jump inside a
@@ -807,11 +808,11 @@  condjump_in_parallel_p (const_rtx insn)
     return 0;
   if (XEXP (SET_SRC (x), 2) == pc_rtx
       && (GET_CODE (XEXP (SET_SRC (x), 1)) == LABEL_REF
-	  || GET_CODE (XEXP (SET_SRC (x), 1)) == RETURN))
+	  || ANY_RETURN_P (XEXP (SET_SRC (x), 1))))
     return 1;
   if (XEXP (SET_SRC (x), 1) == pc_rtx
       && (GET_CODE (XEXP (SET_SRC (x), 2)) == LABEL_REF
-	  || GET_CODE (XEXP (SET_SRC (x), 2)) == RETURN))
+	  || ANY_RETURN_P (XEXP (SET_SRC (x), 2))))
     return 1;
   return 0;
 }
@@ -873,8 +874,9 @@  any_condjump_p (const_rtx insn)
   a = GET_CODE (XEXP (SET_SRC (x), 1));
   b = GET_CODE (XEXP (SET_SRC (x), 2));
 
-  return ((b == PC && (a == LABEL_REF || a == RETURN))
-	  || (a == PC && (b == LABEL_REF || b == RETURN)));
+  return ((b == PC && (a == LABEL_REF || a == RETURN || a == SIMPLE_RETURN))
+	  || (a == PC
+	      && (b == LABEL_REF || b == RETURN || b == SIMPLE_RETURN)));
 }
 
 /* Return the label of a conditional jump.  */
@@ -911,6 +913,7 @@  returnjump_p_1 (rtx *loc, void *data ATT
   switch (GET_CODE (x))
     {
     case RETURN:
+    case SIMPLE_RETURN:
     case EH_RETURN:
       return true;
 
Index: gcc/print-rtl.c
===================================================================
--- gcc/print-rtl.c	(revision 176881)
+++ gcc/print-rtl.c	(working copy)
@@ -328,6 +328,8 @@  print_rtx (const_rtx in_rtx)
 	    fprintf (outfile, "\n%s%*s -> ", print_rtx_head, indent * 2, "");
 	    if (GET_CODE (JUMP_LABEL (in_rtx)) == RETURN)
 	      fprintf (outfile, "return");
+	    else if (GET_CODE (JUMP_LABEL (in_rtx)) == SIMPLE_RETURN)
+	      fprintf (outfile, "simple_return");
 	    else
 	      fprintf (outfile, "%d", INSN_UID (JUMP_LABEL (in_rtx)));
 	  }
Index: gcc/bt-load.c
===================================================================
--- gcc/bt-load.c	(revision 176879)
+++ gcc/bt-load.c	(working copy)
@@ -558,7 +558,7 @@  compute_defs_uses_and_gen (fibheap_t all
 		      /* Check for sibcall.  */
 		      if (GET_CODE (pat) == PARALLEL)
 			for (i = XVECLEN (pat, 0) - 1; i >= 0; i--)
-			  if (GET_CODE (XVECEXP (pat, 0, i)) == RETURN)
+			  if (ANY_RETURN_P (XVECEXP (pat, 0, i)))
 			    {
 			      COMPL_HARD_REG_SET (call_saved,
 						  call_used_reg_set);
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c	(revision 176881)
+++ gcc/emit-rtl.c	(working copy)
@@ -2518,6 +2518,7 @@  verify_rtx_sharing (rtx orig, rtx insn)
     case PC:
     case CC0:
     case RETURN:
+    case SIMPLE_RETURN:
     case SCRATCH:
       return;
       /* SCRATCH must be shared because they represent distinct values.  */
@@ -5002,7 +5003,7 @@  classify_insn (rtx x)
     return CODE_LABEL;
   if (GET_CODE (x) == CALL)
     return CALL_INSN;
-  if (GET_CODE (x) == RETURN)
+  if (ANY_RETURN_P (x))
     return JUMP_INSN;
   if (GET_CODE (x) == SET)
     {
@@ -5514,6 +5515,7 @@  init_emit_regs (void)
   /* Assign register numbers to the globally defined register rtx.  */
   pc_rtx = gen_rtx_fmt_ (PC, VOIDmode);
   ret_rtx = gen_rtx_fmt_ (RETURN, VOIDmode);
+  simple_return_rtx = gen_rtx_fmt_ (SIMPLE_RETURN, VOIDmode);
   cc0_rtx = gen_rtx_fmt_ (CC0, VOIDmode);
   stack_pointer_rtx = gen_raw_REG (Pmode, STACK_POINTER_REGNUM);
   frame_pointer_rtx = gen_raw_REG (Pmode, FRAME_POINTER_REGNUM);
Index: gcc/cfglayout.c
===================================================================
--- gcc/cfglayout.c	(revision 176881)
+++ gcc/cfglayout.c	(working copy)
@@ -767,6 +767,7 @@  fixup_reorder_chain (void)
     {
       edge e_fall, e_taken, e;
       rtx bb_end_insn;
+      rtx ret_label = NULL_RTX;
       basic_block nb, src_bb;
       edge_iterator ei;
 
@@ -786,6 +787,7 @@  fixup_reorder_chain (void)
       bb_end_insn = BB_END (bb);
       if (JUMP_P (bb_end_insn))
 	{
+	  ret_label = JUMP_LABEL (bb_end_insn);
 	  if (any_condjump_p (bb_end_insn))
 	    {
 	      /* This might happen if the conditional jump has side
@@ -899,7 +901,7 @@  fixup_reorder_chain (void)
 	 Note force_nonfallthru can delete E_FALL and thus we have to
 	 save E_FALL->src prior to the call to force_nonfallthru.  */
       src_bb = e_fall->src;
-      nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest);
+      nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest, ret_label);
       if (nb)
 	{
 	  nb->il.rtl->visited = 1;
Index: gcc/rtl.c
===================================================================
--- gcc/rtl.c	(revision 176879)
+++ gcc/rtl.c	(working copy)
@@ -256,6 +256,7 @@  copy_rtx (rtx orig)
     case PC:
     case CC0:
     case RETURN:
+    case SIMPLE_RETURN:
     case SCRATCH:
       /* SCRATCH must be shared because they represent distinct values.  */
       return orig;
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h	(revision 176881)
+++ gcc/rtl.h	(working copy)
@@ -432,8 +432,9 @@  struct GTY((variable_size)) rtvec_def {
   (JUMP_P (INSN) && (GET_CODE (PATTERN (INSN)) == ADDR_VEC || \
 		     GET_CODE (PATTERN (INSN)) == ADDR_DIFF_VEC))
 
-/* Predicate yielding nonzero iff X is a return.  */
-#define ANY_RETURN_P(X) ((X) == ret_rtx)
+/* Predicate yielding nonzero iff X is a return or simple_return.  */
+#define ANY_RETURN_P(X) \
+  (GET_CODE (X) == RETURN || GET_CODE (X) == SIMPLE_RETURN)
 
 /* 1 if X is a unary operator.  */
 
@@ -2074,6 +2075,7 @@  enum global_rtl_index
   GR_PC,
   GR_CC0,
   GR_RETURN,
+  GR_SIMPLE_RETURN,
   GR_STACK_POINTER,
   GR_FRAME_POINTER,
 /* For register elimination to work properly these hard_frame_pointer_rtx,
@@ -2169,6 +2171,7 @@  extern struct target_rtl *this_target_rt
 /* Standard pieces of rtx, to be substituted directly into things.  */
 #define pc_rtx                  (global_rtl[GR_PC])
 #define ret_rtx                 (global_rtl[GR_RETURN])
+#define simple_return_rtx       (global_rtl[GR_SIMPLE_RETURN])
 #define cc0_rtx                 (global_rtl[GR_CC0])
 
 /* All references to certain hard regs, except those created
Index: gcc/combine.c
===================================================================
--- gcc/combine.c	(revision 176879)
+++ gcc/combine.c	(working copy)
@@ -6303,7 +6303,7 @@  simplify_set (rtx x)
   rtx *cc_use;
 
   /* (set (pc) (return)) gets written as (return).  */
-  if (GET_CODE (dest) == PC && GET_CODE (src) == RETURN)
+  if (GET_CODE (dest) == PC && ANY_RETURN_P (src))
     return src;
 
   /* Now that we know for sure which bits of SRC we are using, see if we can
Index: gcc/resource.c
===================================================================
--- gcc/resource.c	(revision 176881)
+++ gcc/resource.c	(working copy)
@@ -492,7 +492,7 @@  find_dead_or_set_registers (rtx target,
 	  if (jump_count++ < 10)
 	    {
 	      if (any_uncondjump_p (this_jump_insn)
-		  || GET_CODE (PATTERN (this_jump_insn)) == RETURN)
+		  || ANY_RETURN_P (PATTERN (this_jump_insn)))
 		{
 		  next = JUMP_LABEL (this_jump_insn);
 		  if (ANY_RETURN_P (next))
@@ -821,7 +821,7 @@  mark_set_resources (rtx x, struct resour
 static bool
 return_insn_p (const_rtx insn)
 {
-  if (JUMP_P (insn) && GET_CODE (PATTERN (insn)) == RETURN)
+  if (JUMP_P (insn) && ANY_RETURN_P (PATTERN (insn)))
     return true;
 
   if (NONJUMP_INSN_P (insn) && GET_CODE (PATTERN (insn)) == SEQUENCE)
Index: gcc/basic-block.h
===================================================================
--- gcc/basic-block.h	(revision 176881)
+++ gcc/basic-block.h	(working copy)
@@ -804,7 +804,7 @@  extern rtx block_label (basic_block);
 extern bool purge_all_dead_edges (void);
 extern bool purge_dead_edges (basic_block);
 extern bool fixup_abnormal_edges (void);
-extern basic_block force_nonfallthru_and_redirect (edge, basic_block);
+extern basic_block force_nonfallthru_and_redirect (edge, basic_block, rtx);
 
 /* In cfgbuild.c.  */
 extern void find_many_sub_basic_blocks (sbitmap);
Index: gcc/sched-vis.c
===================================================================
--- gcc/sched-vis.c	(revision 176879)
+++ gcc/sched-vis.c	(working copy)
@@ -554,6 +554,9 @@  print_pattern (char *buf, const_rtx x, i
     case RETURN:
       sprintf (buf, "return");
       break;
+    case SIMPLE_RETURN:
+      sprintf (buf, "simple_return");
+      break;
     case CALL:
       print_exp (buf, x, verbose);
       break;
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 176879)
+++ gcc/config/i386/i386.c	(working copy)
@@ -29890,7 +29890,7 @@  ix86_pad_returns (void)
       rtx prev;
       bool replace = false;
 
-      if (!JUMP_P (ret) || GET_CODE (PATTERN (ret)) != RETURN
+      if (!JUMP_P (ret) || !ANY_RETURN_P (PATTERN (ret))
 	  || optimize_bb_for_size_p (bb))
 	continue;
       for (prev = PREV_INSN (ret); prev; prev = PREV_INSN (prev))
@@ -29941,7 +29941,7 @@  ix86_count_insn_bb (basic_block bb)
     {
       /* Only happen in exit blocks.  */
       if (JUMP_P (insn)
-	  && GET_CODE (PATTERN (insn)) == RETURN)
+	  && ANY_RETURN_P (PATTERN (insn)))
 	break;
 
       if (NONDEBUG_INSN_P (insn)
@@ -30014,7 +30014,7 @@  ix86_pad_short_function (void)
   FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR->preds)
     {
       rtx ret = BB_END (e->src);
-      if (JUMP_P (ret) && GET_CODE (PATTERN (ret)) == RETURN)
+      if (JUMP_P (ret) && ANY_RETURN_P (PATTERN (ret)))
 	{
 	  int insn_count = ix86_count_insn (e->src);
 
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 176881)
+++ gcc/config/arm/arm.c	(working copy)
@@ -17382,6 +17382,7 @@  arm_final_prescan_insn (rtx insn)
 
   /* If we start with a return insn, we only succeed if we find another one.  */
   int seeking_return = 0;
+  enum rtx_code return_code = UNKNOWN;
 
   /* START_INSN will hold the insn from where we start looking.  This is the
      first insn after the following code_label if REVERSE is true.  */
@@ -17420,7 +17421,7 @@  arm_final_prescan_insn (rtx insn)
 	  else
 	    return;
 	}
-      else if (GET_CODE (body) == RETURN)
+      else if (ANY_RETURN_P (body))
         {
 	  start_insn = next_nonnote_insn (start_insn);
 	  if (GET_CODE (start_insn) == BARRIER)
@@ -17431,6 +17432,7 @@  arm_final_prescan_insn (rtx insn)
 	    {
 	      reverse = TRUE;
 	      seeking_return = 1;
+	      return_code = GET_CODE (body);
 	    }
 	  else
 	    return;
@@ -17471,11 +17473,15 @@  arm_final_prescan_insn (rtx insn)
 	  label = XEXP (XEXP (SET_SRC (body), 2), 0);
 	  then_not_else = FALSE;
 	}
-      else if (GET_CODE (XEXP (SET_SRC (body), 1)) == RETURN)
-	seeking_return = 1;
-      else if (GET_CODE (XEXP (SET_SRC (body), 2)) == RETURN)
+      else if (ANY_RETURN_P (XEXP (SET_SRC (body), 1)))
+	{
+	  seeking_return = 1;
+	  return_code = GET_CODE (XEXP (SET_SRC (body), 1));
+	}
+      else if (ANY_RETURN_P (XEXP (SET_SRC (body), 2)))
         {
 	  seeking_return = 1;
+	  return_code = GET_CODE (XEXP (SET_SRC (body), 2));
 	  then_not_else = FALSE;
         }
       else
@@ -17572,12 +17578,11 @@  arm_final_prescan_insn (rtx insn)
 		}
 	      /* Fail if a conditional return is undesirable (e.g. on a
 		 StrongARM), but still allow this if optimizing for size.  */
-	      else if (GET_CODE (scanbody) == RETURN
+	      else if (GET_CODE (scanbody) == return_code
 		       && !use_return_insn (TRUE, NULL)
 		       && !optimize_size)
 		fail = TRUE;
-	      else if (GET_CODE (scanbody) == RETURN
-		       && seeking_return)
+	      else if (GET_CODE (scanbody) == return_code)
 	        {
 		  arm_ccfsm_state = 2;
 		  succeed = TRUE;
Index: gcc/config/mips/mips.md
===================================================================
--- gcc/config/mips/mips.md	(revision 176879)
+++ gcc/config/mips/mips.md	(working copy)
@@ -5724,6 +5724,18 @@  (define_insn "*return"
   [(set_attr "type"	"jump")
    (set_attr "mode"	"none")])
 
+(define_expand "simple_return"
+  [(simple_return)]
+  "!mips_can_use_return_insn ()"
+  { mips_expand_before_return (); })
+
+(define_insn "*simple_return"
+  [(simple_return)]
+  "!mips_can_use_return_insn ()"
+  "%*j\t$31%/"
+  [(set_attr "type"	"jump")
+   (set_attr "mode"	"none")])
+
 ;; Normal return.
 
 (define_insn "return_internal"
@@ -5731,6 +5743,14 @@  (define_insn "return_internal"
    (use (match_operand 0 "pmode_register_operand" ""))]
   ""
   "%*j\t%0%/"
+  [(set_attr "type"	"jump")
+   (set_attr "mode"	"none")])
+
+(define_insn "simple_return_internal"
+  [(simple_return)
+   (use (match_operand 0 "pmode_register_operand" ""))]
+  ""
+  "%*j\t%0%/"
   [(set_attr "type"	"jump")
    (set_attr "mode"	"none")])
 
Index: gcc/config/mips/mips.c
===================================================================
--- gcc/config/mips/mips.c	(revision 176879)
+++ gcc/config/mips/mips.c	(working copy)
@@ -10452,7 +10452,8 @@  mips_expand_epilogue (bool sibcall_p)
 	    regno = GP_REG_FIRST + 7;
 	  else
 	    regno = RETURN_ADDR_REGNUM;
-	  emit_jump_insn (gen_return_internal (gen_rtx_REG (Pmode, regno)));
+	  emit_jump_insn (gen_simple_return_internal (gen_rtx_REG (Pmode,
+								   regno)));
 	}
     }
 
Index: gcc/cfgrtl.c
===================================================================
--- gcc/cfgrtl.c	(revision 176905)
+++ gcc/cfgrtl.c	(working copy)
@@ -1117,10 +1117,13 @@  rtl_redirect_edge_and_branch (edge e, ba
 }
 
 /* Like force_nonfallthru below, but additionally performs redirection
-   Used by redirect_edge_and_branch_force.  */
+   Used by redirect_edge_and_branch_force.  JUMP_LABEL is used only
+   when redirecting to the EXIT_BLOCK, it is either ret_rtx or
+   simple_return_rtx, indicating which kind of returnjump to create.
+   It should be NULL otherwise.  */
 
 basic_block
-force_nonfallthru_and_redirect (edge e, basic_block target)
+force_nonfallthru_and_redirect (edge e, basic_block target, rtx jump_label)
 {
   basic_block jump_block, new_bb = NULL, src = e->src;
   rtx note;
@@ -1252,12 +1255,25 @@  force_nonfallthru_and_redirect (edge e,
   e->flags &= ~EDGE_FALLTHRU;
   if (target == EXIT_BLOCK_PTR)
     {
+      if (jump_label == ret_rtx)
+	{
 #ifdef HAVE_return
-	emit_jump_insn_after_setloc (gen_return (), BB_END (jump_block), loc);
-	JUMP_LABEL (BB_END (jump_block)) = ret_rtx;
+	  emit_jump_insn_after_setloc (gen_return (), BB_END (jump_block), loc);
 #else
-	gcc_unreachable ();
+	  gcc_unreachable ();
+#endif
+	}
+      else
+	{
+	  gcc_assert (jump_label == simple_return_rtx);
+#ifdef HAVE_simple_return
+	  emit_jump_insn_after_setloc (gen_simple_return (),
+				       BB_END (jump_block), loc);
+#else
+	  gcc_unreachable ();
 #endif
+	}
+      JUMP_LABEL (BB_END (jump_block)) = jump_label;
     }
   else
     {
@@ -1284,7 +1300,7 @@  force_nonfallthru_and_redirect (edge e,
 static basic_block
 rtl_force_nonfallthru (edge e)
 {
-  return force_nonfallthru_and_redirect (e, e->dest);
+  return force_nonfallthru_and_redirect (e, e->dest, NULL_RTX);
 }
 
 /* Redirect edge even at the expense of creating new jump insn or
@@ -1301,7 +1317,7 @@  rtl_redirect_edge_and_branch_force (edge
   /* In case the edge redirection failed, try to force it to be non-fallthru
      and redirect newly created simplejump.  */
   df_set_bb_dirty (e->src);
-  return force_nonfallthru_and_redirect (e, target);
+  return force_nonfallthru_and_redirect (e, target, NULL_RTX);
 }
 
 /* The given edge should potentially be a fallthru edge.  If that is in