From patchwork Wed Aug 24 16:46:03 2011
X-Patchwork-Submitter: Bernd Schmidt
X-Patchwork-Id: 111386
Message-ID: <4E552ACB.8050702@codesourcery.com>
Date: Wed, 24 Aug 2011 18:46:03 +0200
From: Bernd Schmidt
To: GCC Patches, richard.sandiford@linaro.org
Subject: Re: [PATCH 4/6] Shrink-wrapping
References: <4D8A0703.9090306@codesourcery.com> <4D8A095C.8050809@codesourcery.com> <4E37B7A8.8010901@codesourcery.com>

On 08/03/11 17:38, Richard Sandiford wrote:
> Bernd Schmidt writes:
>> +@findex simple_return
>> +@item (simple_return)
>> +Like @code{(return)}, but truly represents only a function return, while
>> +@code{(return)} may represent an insn that also performs other functions
>> +of the function epilogue.  Like @code{(return)}, this may also occur in
>> +conditional jumps.
>
> Sorry, I've forgotten the outcome of the discussion about what happens
> on targets whose return expands to the same code as their simple_return.
> Do the targets still need both "return" and "simple_return" rtxes?

It's important to distinguish between these names as rtxes that can
occur in instruction patterns, and their use as standard pattern names.
When a "return" pattern is generated, it should either fail or expand to
something that performs both the epilogue and the return. A
"simple_return" expands to something that performs only the return.

Most targets allow "return" patterns only if the epilogue is empty. In
that case, "return" and "simple_return" can expand to the same insn; it
does not matter if that insn uses "simple_return" or "return", as they
are equivalent in the absence of an epilogue.
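
As a rough illustration of the distinction described above, here is a toy model in C. It is not GCC code: the enum, the functions and the needs_epilogue / port_is_arm_like parameters are all invented for this sketch, and the "ARM-like" branch only assumes that such a port can fold epilogue work into its return insn.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in GCC.  */
enum toy_expansion
{
  TOY_FAIL,                /* The pattern refuses to expand.  */
  TOY_RETURN_ONLY,         /* A bare return instruction.  */
  TOY_EPILOGUE_AND_RETURN  /* Epilogue work followed by the return.  */
};

/* Model of the "return" standard pattern.  On most ports it is only
   available when the epilogue is empty; a hypothetical ARM-like port
   can fold the epilogue into the return insn instead of failing.  */
static enum toy_expansion
toy_expand_return (bool needs_epilogue, bool port_is_arm_like)
{
  if (!needs_epilogue)
    return TOY_RETURN_ONLY;
  return port_is_arm_like ? TOY_EPILOGUE_AND_RETURN : TOY_FAIL;
}

/* Model of the "simple_return" standard pattern: it never performs
   epilogue work, so it is usable whether or not other parts of the
   function need an epilogue.  */
static enum toy_expansion
toy_expand_simple_return (void)
{
  return TOY_RETURN_ONLY;
}

/* With an empty epilogue the two patterns are interchangeable in this
   model; with a non-empty one they differ.  */
static void
toy_self_check (void)
{
  assert (toy_expand_return (false, false) == toy_expand_simple_return ());
  assert (toy_expand_return (true, false) == TOY_FAIL);
  assert (toy_expand_return (true, true) == TOY_EPILOGUE_AND_RETURN);
}
```

Calling toy_self_check () from a small driver exercises the three cases; the point of the sketch is only that the two patterns coincide exactly when there is no epilogue.
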
It would be slightly nicer to use "simple_return" in the patterns
everywhere except ARM, but ports don't need to be changed.

> Do they need both md patterns (but potentially using the same rtx
> underneath)?

The "return" standard pattern is needed for the existing optimizations
(inserting returns in-line rather than jumping to the end of the
function). Typically, it always fails if the function needs an epilogue,
except in the ARM case.
For shrink-wrapping to work, a port needs a "simple_return" pattern,
which the compiler can use even if parts of the function need an
epilogue. So yes, they have different purposes.

> I ask because the rtl.def comment implies that those targets still
> need both expanders and both rtxes.  If that's so, I think it needs
> to be mentioned here too.  E.g. something like:
>
>   Like @code{(return)}, but truly represents only a function return, while
>   @code{(return)} may represent an insn that also performs other functions
>   of the function epilogue.  @code{(return)} only occurs on paths that
>   pass through the function prologue, while @code{(simple_return)}
>   only occurs on paths that do not pass through the prologue.

This is not accurate for the rtx code. It is mostly accurate for the
standard pattern name. A simple_return rtx may occur just after an
epilogue, i.e. on a path that has passed through the prologue.

Even for the simple_return pattern, I'm not sure reorg.c couldn't
introduce new expansions in a location after both prologue and epilogue.

>   Like @code{(return)}, @code{(simple_return)} may also occur in
>   conditional jumps.
>
> You need to document the simple_return pattern in md.texi too.

I was trying to update the documentation to only the current state after
the patch. The thinking was that without shrink-wrapping, nothing
generates this pattern, so documenting it would be misleading.
However, with the mips changes in this version of the patch, reorg.c
does make use of this pattern, so I've added documentation.

>> @@ -3498,6 +3506,8 @@ relax_delay_slots (rtx first)
>>          continue;
>>
>>        target_label = JUMP_LABEL (delay_insn);
>> +      if (target_label && ANY_RETURN_P (target_label))
>> +        continue;
>>
>>        if (!ANY_RETURN_P (target_label))
>>          {
>
> This doesn't look like a pure "handle return as well as simple return"
> change.  Is the idea that every following test only makes sense for
> labels, and that things like:
>
>           && prev_active_insn (target_label) == insn
>
> (to pick just one example) are actively dangerous for returns?

That probably was the idea. Looking at it again, there's one case at the
bottom of the loop which may be safe, but given that there were no code
generation differences with the patch on three targets with
define_delay, I've done:

> If so, I think you should remove the immediately-following
> "if (!ANY_RETURN_P (target_label))" condition and reindent the body.

this.

> Given what you said about JUMP_LABEL sometimes being null,
> I think we need either (a) to check whether each *_return_label
> is null before comparing it with JUMP_LABEL, or (b) to ensure that
> we're dealing with a jump to a label.  (b) seems neater IMO
> (as a call to jump_to_label_p).

Done.

>
>> +#if defined HAVE_return || defined HAVE_simple_return
>> +  if (
>>  #ifdef HAVE_return
>> -  if (HAVE_return && end_of_function_label != 0)
>> +      (HAVE_return && function_return_label != 0)
>> +#else
>> +      0
>> +#endif
>> +#ifdef HAVE_simple_return
>> +      || (HAVE_simple_return && function_simple_return_label != 0)
>> +#endif
>> +      )
>>      make_return_insns (first);
>>  #endif
>
> Eww.

Restructured.

> (define_code_iterator any_return [return simple_return])
>
> and just change the appropriate returns to any_returns.

I've done this a bit differently - to show that it can be done, I've
changed mips to always emit simple_return rtxs, even for "return"
patterns (no code generation changes observed again).

This version regression tested on mips64-elf, c/c++/objc.


Bernd

	* doc/rtl.texi (simple_return): Document.
	(parallel, PATTERN): Here too.
	* doc/md.texi (return): Mention it's allowed to expand to
	simple_return in some cases.
	(simple_return): Document standard pattern.
	* gengenrtl.c (special_rtx): SIMPLE_RETURN is special.
	* final.c (final_scan_insn): Use ANY_RETURN_P on body.
	* reorg.c (function_return_label, function_simple_return_label):
	New static variables, replacing...
	(end_of_function_label): ... this.
	(simplejump_or_return_p): New static function.
	(optimize_skip, steal_delay_list_from_fallthrough,
	fill_slots_from_thread): Use it.
	(relax_delay_slots): Likewise.  Use ANY_RETURN_P on body.
	(rare_destination, follow_jumps): Use ANY_RETURN_P on body.
	(find_end_label): Take a new arg which is one of the two return
	rtxs.  Depending on which, set either function_return_label or
	function_simple_return_label.  All callers changed.
	(make_return_insns): Make both kinds.
	(dbr_schedule): Adjust for two kinds of end labels.
	* genemit.c (gen_exp): Handle SIMPLE_RETURN.
	(gen_expand, gen_split): Use ANY_RETURN_P.
	* df-scan.c (df_uses_record): Handle SIMPLE_RETURN.
	* rtl.def (SIMPLE_RETURN): New code.
	* ifcvt.c (find_if_case_1): Be more careful about
	redirecting jumps to the EXIT_BLOCK.
	* jump.c (condjump_p, condjump_in_parallel_p, any_condjump_p,
	returnjump_p_1): Handle SIMPLE_RETURNs.
	* print-rtl.c (print_rtx): Likewise.
	* rtl.c (copy_rtx): Likewise.
	* bt-load.c (compute_defs_uses_and_gen): Use ANY_RETURN_P.
	* combine.c (simplify_set): Likewise.
	* resource.c (find_dead_or_set_registers, mark_set_resources):
	Likewise.
	* emit-rtl.c (verify_rtx_sharing, classify_insn, copy_insn_1,
	copy_rtx_if_shared_1, mark_used_flags): Handle SIMPLE_RETURNs.
	(init_emit_regs): Initialize simple_return_rtx.
	* cfglayout.c (fixup_reorder_chain): Pass a JUMP_LABEL to
	force_nonfallthru_and_redirect.
	* rtl.h (ANY_RETURN_P): Allow SIMPLE_RETURN.
	(GR_SIMPLE_RETURN): New enum value.
	(simple_return_rtx): New macro.
	* basic-block.h (force_nonfallthru_and_redirect): Adjust
	declaration.
	* cfgrtl.c (force_nonfallthru_and_redirect): Take a new jump_label
	argument.  All callers changed.  Be careful about what kinds of
	returnjumps to generate.
	* config/i386/i386.c (ix86_pad_returns, ix86_count_insn_bb,
	ix86_pad_short_function): Likewise.
	* config/arm/arm.c (arm_final_prescan_insn): Handle both kinds
	of return.
	* config/mips/mips.md (any_return): New code_iterator.
	(optab): Add cases for return and simple_return.
	(return): Expand to a simple_return.
	(simple_return): New pattern.
	(*, *_internal for any_return): New patterns.
	(return_internal): Remove.
	* config/mips/mips.c (mips_expand_epilogue): Make the last insn
	a simple_return_internal.

Index: gcc/doc/rtl.texi
===================================================================
--- gcc/doc/rtl.texi (revision 177999)
+++ gcc/doc/rtl.texi (working copy)
@@ -2915,6 +2915,13 @@ placed in @code{pc} to return to the cal
 Note that an insn pattern of @code{(return)} is logically equivalent to
 @code{(set (pc) (return))}, but the latter form is never used.

+@findex simple_return
+@item (simple_return)
+Like @code{(return)}, but truly represents only a function return, while
+@code{(return)} may represent an insn that also performs other functions
+of the function epilogue.  Like @code{(return)}, this may also occur in
+conditional jumps.
+
 @findex call
 @item (call @var{function} @var{nargs})
 Represents a function call.  @var{function} is a @code{mem} expression
@@ -3044,7 +3051,7 @@ Represents several side effects performe
 brackets stand for a vector; the operand of @code{parallel} is a
 vector of expressions.  @var{x0}, @var{x1} and so on are individual
 side effect expressions---expressions of code @code{set}, @code{call},
-@code{return}, @code{clobber} or @code{use}.
+@code{return}, @code{simple_return}, @code{clobber} or @code{use}.

 ``In parallel'' means that first all the values used in the individual
 side-effects are computed, and second all the actual side-effects are
@@ -3683,14 +3690,16 @@ and @code{call_insn} insns:
 @table @code
 @findex PATTERN
 @item PATTERN (@var{i})
-An expression for the side effect performed by this insn.  This must be
-one of the following codes: @code{set}, @code{call}, @code{use},
-@code{clobber}, @code{return}, @code{asm_input}, @code{asm_output},
-@code{addr_vec}, @code{addr_diff_vec}, @code{trap_if}, @code{unspec},
-@code{unspec_volatile}, @code{parallel}, @code{cond_exec}, or @code{sequence}.  If it is a @code{parallel},
-each element of the @code{parallel} must be one these codes, except that
-@code{parallel} expressions cannot be nested and @code{addr_vec} and
-@code{addr_diff_vec} are not permitted inside a @code{parallel} expression.
+An expression for the side effect performed by this insn.  This must
+be one of the following codes: @code{set}, @code{call}, @code{use},
+@code{clobber}, @code{return}, @code{simple_return}, @code{asm_input},
+@code{asm_output}, @code{addr_vec}, @code{addr_diff_vec},
+@code{trap_if}, @code{unspec}, @code{unspec_volatile},
+@code{parallel}, @code{cond_exec}, or @code{sequence}.  If it is a
+@code{parallel}, each element of the @code{parallel} must be one these
+codes, except that @code{parallel} expressions cannot be nested and
+@code{addr_vec} and @code{addr_diff_vec} are not permitted inside a
+@code{parallel} expression.

 @findex INSN_CODE
 @item INSN_CODE (@var{i})
Index: gcc/gengenrtl.c
===================================================================
--- gcc/gengenrtl.c (revision 177999)
+++ gcc/gengenrtl.c (working copy)
@@ -131,6 +131,7 @@ special_rtx (int idx)
           || strcmp (defs[idx].enumname, "PC") == 0
           || strcmp (defs[idx].enumname, "CC0") == 0
           || strcmp (defs[idx].enumname, "RETURN") == 0
+          || strcmp (defs[idx].enumname, "SIMPLE_RETURN") == 0
           || strcmp (defs[idx].enumname, "CONST_VECTOR") == 0);
 }

Index: gcc/final.c
===================================================================
--- gcc/final.c (revision 177999)
+++ gcc/final.c (working copy)
@@ -2492,7 +2492,7 @@ final_scan_insn (rtx insn, FILE *file, i
                 delete_insn (insn);
                 break;
               }
-            else if (GET_CODE (SET_SRC (body)) == RETURN)
+            else if (ANY_RETURN_P (SET_SRC (body)))
               /* Replace (set (pc) (return)) with (return).  */
               PATTERN (insn) = body = SET_SRC (body);

Index: gcc/reorg.c
===================================================================
--- gcc/reorg.c (revision 177999)
+++ gcc/reorg.c (working copy)
@@ -161,8 +161,11 @@ static rtx *unfilled_firstobj;
 #define unfilled_slots_next \
   ((rtx *) obstack_next_free (&unfilled_slots_obstack))

-/* Points to the label before the end of the function.  */
-static rtx end_of_function_label;
+/* Points to the label before the end of the function, or before a
+   return insn.  */
+static rtx function_return_label;
+/* Likewise for a simple_return.  */
+static rtx function_simple_return_label;

 /* Mapping between INSN_UID's and position in the code since INSN_UID's do
    not always monotonically increase.  */
@@ -175,7 +178,7 @@ static int stop_search_p (rtx, int);
 static int resource_conflicts_p (struct resources *, struct resources *);
 static int insn_references_resource_p (rtx, struct resources *, bool);
 static int insn_sets_resource_p (rtx, struct resources *, bool);
-static rtx find_end_label (void);
+static rtx find_end_label (rtx);
 static rtx emit_delay_sequence (rtx, rtx, int);
 static rtx add_to_delay_list (rtx, rtx);
 static rtx delete_from_delay_slot (rtx);
@@ -231,6 +234,15 @@ first_active_target_insn (rtx insn)
   return next_active_insn (insn);
 }

+/* Return true iff INSN is a simplejump, or any kind of return insn.  */
+
+static bool
+simplejump_or_return_p (rtx insn)
+{
+  return (JUMP_P (insn)
+          && (simplejump_p (insn) || ANY_RETURN_P (PATTERN (insn))));
+}
+
 /* Return TRUE if this insn should stop the search for insn to fill delay
    slots.  LABELS_P indicates that labels should terminate the search.
    In all cases, jumps terminate the search.  */
@@ -346,23 +358,34 @@ insn_sets_resource_p (rtx insn, struct r

    ??? There may be a problem with the current implementation.  Suppose
    we start with a bare RETURN insn and call find_end_label.  It may set
-   end_of_function_label just before the RETURN.  Suppose the machinery
+   function_return_label just before the RETURN.  Suppose the machinery
    is able to fill the delay slot of the RETURN insn afterwards.  Then
-   end_of_function_label is no longer valid according to the property
+   function_return_label is no longer valid according to the property
    described above and find_end_label will still return it unmodified.
    Note that this is probably mitigated by the following observation:
-   once end_of_function_label is made, it is very likely the target of
+   once function_return_label is made, it is very likely the target of
    a jump, so filling the delay slot of the RETURN will be much more
-   difficult.  */
+   difficult.
+   KIND is either simple_return_rtx or ret_rtx, indicating which type of
+   return we're looking for.  */

 static rtx
-find_end_label (void)
+find_end_label (rtx kind)
 {
   rtx insn;
+  rtx *plabel;
+
+  if (kind == ret_rtx)
+    plabel = &function_return_label;
+  else
+    {
+      gcc_assert (kind == simple_return_rtx);
+      plabel = &function_simple_return_label;
+    }

   /* If we found one previously, return it.  */
-  if (end_of_function_label)
-    return end_of_function_label;
+  if (*plabel)
+    return *plabel;

   /* Otherwise, see if there is a label at the end of the function.  If there
      is, it must be that RETURN insns aren't needed, so that is our return
@@ -377,44 +400,45 @@ find_end_label (void)

   /* When a target threads its epilogue we might already have a
      suitable return insn.  If so put a label before it for the
-     end_of_function_label.  */
+     function_return_label.  */
   if (BARRIER_P (insn)
       && JUMP_P (PREV_INSN (insn))
-      && GET_CODE (PATTERN (PREV_INSN (insn))) == RETURN)
+      && PATTERN (PREV_INSN (insn)) == kind)
     {
       rtx temp = PREV_INSN (PREV_INSN (insn));
-      end_of_function_label = gen_label_rtx ();
-      LABEL_NUSES (end_of_function_label) = 0;
+      rtx label = gen_label_rtx ();
+      LABEL_NUSES (label) = 0;

-      /* Put the label before an USE insns that may precede the RETURN insn.  */
+      /* Put the label before any USE insns that may precede the RETURN
+         insn.  */
       while (GET_CODE (temp) == USE)
         temp = PREV_INSN (temp);

-      emit_label_after (end_of_function_label, temp);
+      emit_label_after (label, temp);
+      *plabel = label;
     }

   else if (LABEL_P (insn))
-    end_of_function_label = insn;
+    *plabel = insn;
   else
     {
-      end_of_function_label = gen_label_rtx ();
-      LABEL_NUSES (end_of_function_label) = 0;
+      rtx label = gen_label_rtx ();
+      LABEL_NUSES (label) = 0;
       /* If the basic block reorder pass moves the return insn to
          some other place try to locate it again and put our
-         end_of_function_label there.  */
-      while (insn && ! (JUMP_P (insn)
-                        && (GET_CODE (PATTERN (insn)) == RETURN)))
+         function_return_label there.  */
+      while (insn && ! (JUMP_P (insn) && (PATTERN (insn) == kind)))
         insn = PREV_INSN (insn);
       if (insn)
         {
           insn = PREV_INSN (insn);

-          /* Put the label before an USE insns that may proceed the
+          /* Put the label before any USE insns that may precede the
              RETURN insn.  */
           while (GET_CODE (insn) == USE)
             insn = PREV_INSN (insn);

-          emit_label_after (end_of_function_label, insn);
+          emit_label_after (label, insn);
         }
       else
         {
@@ -424,19 +448,16 @@ find_end_label (void)
               && ! HAVE_return
 #endif
               )
-            {
-              /* The RETURN insn has its delay slot filled so we cannot
-                 emit the label just before it.  Since we already have
-                 an epilogue and cannot emit a new RETURN, we cannot
-                 emit the label at all.  */
-              end_of_function_label = NULL_RTX;
-              return end_of_function_label;
-            }
+            /* The RETURN insn has its delay slot filled so we cannot
+               emit the label just before it.  Since we already have
+               an epilogue and cannot emit a new RETURN, we cannot
+               emit the label at all.  */
+            return NULL_RTX;
 #endif /* HAVE_epilogue */

           /* Otherwise, make a new label and emit a RETURN and BARRIER,
              if needed.  */
-          emit_label (end_of_function_label);
+          emit_label (label);
 #ifdef HAVE_return
           /* We don't bother trying to create a return insn if the
              epilogue has filled delay-slots; we would have to try and
@@ -455,13 +476,14 @@ find_end_label (void)
             }
 #endif
         }
+      *plabel = label;
     }

   /* Show one additional use for this label so it won't go away until
      we are done.  */
-  ++LABEL_NUSES (end_of_function_label);
+  ++LABEL_NUSES (*plabel);

-  return end_of_function_label;
+  return *plabel;
 }

 /* Put INSN and LIST together in a SEQUENCE rtx of LENGTH, and replace
@@ -809,10 +831,8 @@ optimize_skip (rtx insn)
   if ((next_trial == next_active_insn (JUMP_LABEL (insn))
        && ! (next_trial == 0 && crtl->epilogue_delay_list != 0))
       || (next_trial != 0
-          && JUMP_P (next_trial)
-          && JUMP_LABEL (insn) == JUMP_LABEL (next_trial)
-          && (simplejump_p (next_trial)
-              || GET_CODE (PATTERN (next_trial)) == RETURN)))
+          && simplejump_or_return_p (next_trial)
+          && JUMP_LABEL (insn) == JUMP_LABEL (next_trial)))
     {
       if (eligible_for_annul_false (insn, 0, trial, flags))
         {
@@ -831,13 +851,11 @@ optimize_skip (rtx insn)
          branch, thread our jump to the target of that branch.  Don't
          change this into a RETURN here, because it may not accept what
          we have in the delay slot.  We'll fix this up later.  */
-      if (next_trial && JUMP_P (next_trial)
-          && (simplejump_p (next_trial)
-              || GET_CODE (PATTERN (next_trial)) == RETURN))
+      if (next_trial && simplejump_or_return_p (next_trial))
         {
           rtx target_label = JUMP_LABEL (next_trial);
           if (ANY_RETURN_P (target_label))
-            target_label = find_end_label ();
+            target_label = find_end_label (target_label);

           if (target_label)
             {
@@ -951,7 +969,7 @@ rare_destination (rtx insn)
              return.  */
           return 2;
         case JUMP_INSN:
-          if (GET_CODE (PATTERN (insn)) == RETURN)
+          if (ANY_RETURN_P (PATTERN (insn)))
             return 1;
           else if (simplejump_p (insn)
                    && jump_count++ < 10)
@@ -1368,8 +1386,7 @@ steal_delay_list_from_fallthrough (rtx i
   /* We can't do anything if SEQ's delay insn isn't an
      unconditional branch.  */

-  if (! simplejump_p (XVECEXP (seq, 0, 0))
-      && GET_CODE (PATTERN (XVECEXP (seq, 0, 0))) != RETURN)
+  if (! simplejump_or_return_p (XVECEXP (seq, 0, 0)))
     return delay_list;

   for (i = 1; i < XVECLEN (seq, 0); i++)
@@ -2383,7 +2400,7 @@ fill_simple_delay_slots (int non_jumps_p
           if (new_label != 0)
             new_label = get_label_before (new_label);
           else
-            new_label = find_end_label ();
+            new_label = find_end_label (simple_return_rtx);

           if (new_label)
             {
@@ -2515,7 +2532,8 @@ fill_simple_delay_slots (int non_jumps_p

 /* Follow any unconditional jump at LABEL;
    return the ultimate label reached by any such chain of jumps.
- Return ret_rtx if the chain ultimately leads to a return instruction. + Return a suitable return rtx if the chain ultimately leads to a + return instruction. If LABEL is not followed by a jump, return LABEL. If the chain loops or we can't find end, return LABEL, since that tells caller to avoid changing the insn. */ @@ -2536,7 +2554,7 @@ follow_jumps (rtx label) && JUMP_P (insn) && JUMP_LABEL (insn) != NULL_RTX && ((any_uncondjump_p (insn) && onlyjump_p (insn)) - || GET_CODE (PATTERN (insn)) == RETURN) + || ANY_RETURN_P (PATTERN (insn))) && (next = NEXT_INSN (insn)) && BARRIER_P (next)); depth++) @@ -3003,16 +3021,14 @@ fill_slots_from_thread (rtx insn, rtx co gcc_assert (thread_if_true); - if (new_thread && JUMP_P (new_thread) - && (simplejump_p (new_thread) - || GET_CODE (PATTERN (new_thread)) == RETURN) + if (new_thread && simplejump_or_return_p (new_thread) && redirect_with_delay_list_safe_p (insn, JUMP_LABEL (new_thread), delay_list)) new_thread = follow_jumps (JUMP_LABEL (new_thread)); if (ANY_RETURN_P (new_thread)) - label = find_end_label (); + label = find_end_label (new_thread); else if (LABEL_P (new_thread)) label = new_thread; else @@ -3362,7 +3378,7 @@ relax_delay_slots (rtx first) { target_label = skip_consecutive_labels (follow_jumps (target_label)); if (ANY_RETURN_P (target_label)) - target_label = find_end_label (); + target_label = find_end_label (target_label); if (target_label && next_active_insn (target_label) == next && ! condjump_in_parallel_p (insn)) @@ -3377,9 +3393,8 @@ relax_delay_slots (rtx first) /* See if this jump conditionally branches around an unconditional jump. If so, invert this jump and point it to the target of the second jump. 
*/ - if (next && JUMP_P (next) + if (next && simplejump_or_return_p (next) && any_condjump_p (insn) - && (simplejump_p (next) || GET_CODE (PATTERN (next)) == RETURN) && target_label && next_active_insn (target_label) == next_active_insn (next) && no_labels_between_p (insn, next)) @@ -3421,8 +3436,7 @@ relax_delay_slots (rtx first) Don't do this if we expect the conditional branch to be true, because we would then be making the more common case longer. */ - if (JUMP_P (insn) - && (simplejump_p (insn) || GET_CODE (PATTERN (insn)) == RETURN) + if (simplejump_or_return_p (insn) && (other = prev_active_insn (insn)) != 0 && any_condjump_p (other) && no_labels_between_p (other, insn) @@ -3463,10 +3477,10 @@ relax_delay_slots (rtx first) Only do so if optimizing for size since this results in slower, but smaller code. */ if (optimize_function_for_size_p (cfun) - && GET_CODE (PATTERN (delay_insn)) == RETURN + && ANY_RETURN_P (PATTERN (delay_insn)) && next && JUMP_P (next) - && GET_CODE (PATTERN (next)) == RETURN) + && PATTERN (next) == PATTERN (delay_insn)) { rtx after; int i; @@ -3505,73 +3519,71 @@ relax_delay_slots (rtx first) continue; target_label = JUMP_LABEL (delay_insn); + if (target_label && ANY_RETURN_P (target_label)) + continue; - if (!ANY_RETURN_P (target_label)) + /* If this jump goes to another unconditional jump, thread it, but + don't convert a jump into a RETURN here. */ + trial = skip_consecutive_labels (follow_jumps (target_label)); + if (ANY_RETURN_P (trial)) + trial = find_end_label (trial); + + if (trial && trial != target_label + && redirect_with_delay_slots_safe_p (delay_insn, trial, insn)) { - /* If this jump goes to another unconditional jump, thread it, but - don't convert a jump into a RETURN here. 
*/ - trial = skip_consecutive_labels (follow_jumps (target_label)); - if (ANY_RETURN_P (trial)) - trial = find_end_label (); + reorg_redirect_jump (delay_insn, trial); + target_label = trial; + } - if (trial && trial != target_label - && redirect_with_delay_slots_safe_p (delay_insn, trial, insn)) - { - reorg_redirect_jump (delay_insn, trial); - target_label = trial; - } + /* If the first insn at TARGET_LABEL is redundant with a previous + insn, redirect the jump to the following insn and process again. + We use next_real_insn instead of next_active_insn so we + don't skip USE-markers, or we'll end up with incorrect + liveness info. */ + trial = next_real_insn (target_label); + if (trial && GET_CODE (PATTERN (trial)) != SEQUENCE + && redundant_insn (trial, insn, 0) + && ! can_throw_internal (trial)) + { + /* Figure out where to emit the special USE insn so we don't + later incorrectly compute register live/death info. */ + rtx tmp = next_active_insn (trial); + if (tmp == 0) + tmp = find_end_label (simple_return_rtx); - /* If the first insn at TARGET_LABEL is redundant with a previous - insn, redirect the jump to the following insn and process again. - We use next_real_insn instead of next_active_insn so we - don't skip USE-markers, or we'll end up with incorrect - liveness info. */ - trial = next_real_insn (target_label); - if (trial && GET_CODE (PATTERN (trial)) != SEQUENCE - && redundant_insn (trial, insn, 0) - && ! can_throw_internal (trial)) + if (tmp) { - /* Figure out where to emit the special USE insn so we don't - later incorrectly compute register live/death info. */ - rtx tmp = next_active_insn (trial); - if (tmp == 0) - tmp = find_end_label (); - - if (tmp) - { - /* Insert the special USE insn and update dataflow info. */ - update_block (trial, tmp); - - /* Now emit a label before the special USE insn, and - redirect our jump to the new label. 
*/ - target_label = get_label_before (PREV_INSN (tmp)); - reorg_redirect_jump (delay_insn, target_label); - next = insn; - continue; - } + /* Insert the special USE insn and update dataflow info. */ + update_block (trial, tmp); + + /* Now emit a label before the special USE insn, and + redirect our jump to the new label. */ + target_label = get_label_before (PREV_INSN (tmp)); + reorg_redirect_jump (delay_insn, target_label); + next = insn; + continue; } + } - /* Similarly, if it is an unconditional jump with one insn in its - delay list and that insn is redundant, thread the jump. */ - if (trial && GET_CODE (PATTERN (trial)) == SEQUENCE - && XVECLEN (PATTERN (trial), 0) == 2 - && JUMP_P (XVECEXP (PATTERN (trial), 0, 0)) - && (simplejump_p (XVECEXP (PATTERN (trial), 0, 0)) - || GET_CODE (PATTERN (XVECEXP (PATTERN (trial), 0, 0))) == RETURN) - && redundant_insn (XVECEXP (PATTERN (trial), 0, 1), insn, 0)) + /* Similarly, if it is an unconditional jump with one insn in its + delay list and that insn is redundant, thread the jump. 
*/ + if (trial && GET_CODE (PATTERN (trial)) == SEQUENCE + && XVECLEN (PATTERN (trial), 0) == 2 + && JUMP_P (XVECEXP (PATTERN (trial), 0, 0)) + && simplejump_or_return_p (XVECEXP (PATTERN (trial), 0, 0)) + && redundant_insn (XVECEXP (PATTERN (trial), 0, 1), insn, 0)) + { + target_label = JUMP_LABEL (XVECEXP (PATTERN (trial), 0, 0)); + if (ANY_RETURN_P (target_label)) + target_label = find_end_label (target_label); + + if (target_label + && redirect_with_delay_slots_safe_p (delay_insn, target_label, + insn)) { - target_label = JUMP_LABEL (XVECEXP (PATTERN (trial), 0, 0)); - if (ANY_RETURN_P (target_label)) - target_label = find_end_label (); - - if (target_label - && redirect_with_delay_slots_safe_p (delay_insn, target_label, - insn)) - { - reorg_redirect_jump (delay_insn, target_label); - next = insn; - continue; - } + reorg_redirect_jump (delay_insn, target_label); + next = insn; + continue; } } @@ -3640,8 +3652,7 @@ relax_delay_slots (rtx first) a RETURN here. */ if (! INSN_ANNULLED_BRANCH_P (delay_insn) && any_condjump_p (delay_insn) - && next && JUMP_P (next) - && (simplejump_p (next) || GET_CODE (PATTERN (next)) == RETURN) + && next && simplejump_or_return_p (next) && next_active_insn (target_label) == next_active_insn (next) && no_labels_between_p (insn, next)) { @@ -3649,7 +3660,7 @@ relax_delay_slots (rtx first) rtx old_label = JUMP_LABEL (delay_insn); if (ANY_RETURN_P (label)) - label = find_end_label (); + label = find_end_label (label); /* find_end_label can generate a new label. Check this first. */ if (label @@ -3710,7 +3721,8 @@ static void make_return_insns (rtx first) { rtx insn, jump_insn, pat; - rtx real_return_label = end_of_function_label; + rtx real_return_label = function_return_label; + rtx real_simple_return_label = function_simple_return_label; int slots, i; #ifdef DELAY_SLOTS_FOR_EPILOGUE @@ -3728,15 +3740,22 @@ make_return_insns (rtx first) made for END_OF_FUNCTION_LABEL. If so, set up anything we can't change into a RETURN to jump to it. 
*/ for (insn = first; insn; insn = NEXT_INSN (insn)) - if (JUMP_P (insn) && GET_CODE (PATTERN (insn)) == RETURN) + if (JUMP_P (insn) && ANY_RETURN_P (PATTERN (insn))) { - real_return_label = get_label_before (insn); + rtx t = get_label_before (insn); + if (PATTERN (insn) == ret_rtx) + real_return_label = t; + else + real_simple_return_label = t; break; } /* Show an extra usage of REAL_RETURN_LABEL so it won't go away if it was equal to END_OF_FUNCTION_LABEL. */ - LABEL_NUSES (real_return_label)++; + if (real_return_label) + LABEL_NUSES (real_return_label)++; + if (real_simple_return_label) + LABEL_NUSES (real_simple_return_label)++; /* Clear the list of insns to fill so we can use it. */ obstack_free (&unfilled_slots_obstack, unfilled_firstobj); @@ -3744,13 +3763,27 @@ make_return_insns (rtx first) for (insn = first; insn; insn = NEXT_INSN (insn)) { int flags; + rtx kind, real_label; /* Only look at filled JUMP_INSNs that go to the end of function label. */ if (!NONJUMP_INSN_P (insn) || GET_CODE (PATTERN (insn)) != SEQUENCE - || !JUMP_P (XVECEXP (PATTERN (insn), 0, 0)) - || JUMP_LABEL (XVECEXP (PATTERN (insn), 0, 0)) != end_of_function_label) + || !jump_to_label_p (XVECEXP (PATTERN (insn), 0, 0))) + continue; + + if (JUMP_LABEL (XVECEXP (PATTERN (insn), 0, 0)) == function_return_label) + { + kind = ret_rtx; + real_label = real_return_label; + } + else if (JUMP_LABEL (XVECEXP (PATTERN (insn), 0, 0)) + == function_simple_return_label) + { + kind = simple_return_rtx; + real_label = real_simple_return_label; + } + else continue; pat = PATTERN (insn); @@ -3758,14 +3791,12 @@ make_return_insns (rtx first) /* If we can't make the jump into a RETURN, try to redirect it to the best RETURN and go on to the next insn. */ - if (! reorg_redirect_jump (jump_insn, ret_rtx)) + if (!reorg_redirect_jump (jump_insn, kind)) { /* Make sure redirecting the jump will not invalidate the delay slot insns. 
*/ - if (redirect_with_delay_slots_safe_p (jump_insn, - real_return_label, - insn)) - reorg_redirect_jump (jump_insn, real_return_label); + if (redirect_with_delay_slots_safe_p (jump_insn, real_label, insn)) + reorg_redirect_jump (jump_insn, real_label); continue; } @@ -3805,7 +3836,7 @@ make_return_insns (rtx first) RETURN, delete the SEQUENCE and output the individual insns, followed by the RETURN. Then set things up so we try to find insns for its delay slots, if it needs some. */ - if (GET_CODE (PATTERN (jump_insn)) == RETURN) + if (ANY_RETURN_P (PATTERN (jump_insn))) { rtx prev = PREV_INSN (insn); @@ -3822,13 +3853,16 @@ make_return_insns (rtx first) else /* It is probably more efficient to keep this with its current delay slot as a branch to a RETURN. */ - reorg_redirect_jump (jump_insn, real_return_label); + reorg_redirect_jump (jump_insn, real_label); } /* Now delete REAL_RETURN_LABEL if we never used it. Then try to fill any new delay slots we have created. */ - if (--LABEL_NUSES (real_return_label) == 0) + if (real_return_label != NULL_RTX && --LABEL_NUSES (real_return_label) == 0) delete_related_insns (real_return_label); + if (real_simple_return_label != NULL_RTX + && --LABEL_NUSES (real_simple_return_label) == 0) + delete_related_insns (real_simple_return_label); fill_simple_delay_slots (1); fill_simple_delay_slots (0); @@ -3842,6 +3876,7 @@ dbr_schedule (rtx first) { rtx insn, next, epilogue_insn = 0; int i; + bool need_return_insns; /* If the current function has no insns other than the prologue and epilogue, then do not try to fill any delay slots. */ @@ -3897,7 +3932,7 @@ dbr_schedule (rtx first) init_resource_info (epilogue_insn); /* Show we haven't computed an end-of-function label yet. */ - end_of_function_label = 0; + function_return_label = function_simple_return_label = NULL_RTX; /* Initialize the statistics for this function. 
*/
   memset (num_insns_needing_delays, 0, sizeof num_insns_needing_delays);
@@ -3919,13 +3954,21 @@ dbr_schedule (rtx first)
   /* If we made an end of function label, indicate that it is now
      safe to delete it by undoing our prior adjustment to LABEL_NUSES.
      If it is now unused, delete it. */
-  if (end_of_function_label && --LABEL_NUSES (end_of_function_label) == 0)
-    delete_related_insns (end_of_function_label);
+  if (function_return_label && --LABEL_NUSES (function_return_label) == 0)
+    delete_related_insns (function_return_label);
+  if (function_simple_return_label
+      && --LABEL_NUSES (function_simple_return_label) == 0)
+    delete_related_insns (function_simple_return_label);
+  need_return_insns = false;
 #ifdef HAVE_return
-  if (HAVE_return && end_of_function_label != 0)
-    make_return_insns (first);
+  need_return_insns |= HAVE_return && function_return_label != 0;
 #endif
+#ifdef HAVE_simple_return
+  need_return_insns |= HAVE_simple_return && function_simple_return_label != 0;
+#endif
+  if (need_return_insns)
+    make_return_insns (first);
   /* Delete any USE insns made by update_block; subsequent passes don't need
      them or know how to deal with them.
*/
Index: gcc/genemit.c
===================================================================
--- gcc/genemit.c (revision 177999)
+++ gcc/genemit.c (working copy)
@@ -169,6 +169,9 @@ gen_exp (rtx x, enum rtx_code subroutine
     case RETURN:
       printf ("ret_rtx");
       return;
+    case SIMPLE_RETURN:
+      printf ("simple_return_rtx");
+      return;
     case CLOBBER:
       if (REG_P (XEXP (x, 0)))
         {
@@ -489,8 +492,8 @@ gen_expand (rtx expand)
           || (GET_CODE (next) == PARALLEL
               && ((GET_CODE (XVECEXP (next, 0, 0)) == SET
                    && GET_CODE (SET_DEST (XVECEXP (next, 0, 0))) == PC)
-                  || GET_CODE (XVECEXP (next, 0, 0)) == RETURN))
-          || GET_CODE (next) == RETURN)
+                  || ANY_RETURN_P (XVECEXP (next, 0, 0))))
+          || ANY_RETURN_P (next))
         printf (" emit_jump_insn (");
       else if ((GET_CODE (next) == SET && GET_CODE (SET_SRC (next)) == CALL)
                || GET_CODE (next) == CALL
@@ -607,7 +610,7 @@ gen_split (rtx split)
           || (GET_CODE (next) == PARALLEL
               && GET_CODE (XVECEXP (next, 0, 0)) == SET
               && GET_CODE (SET_DEST (XVECEXP (next, 0, 0))) == PC)
-          || GET_CODE (next) == RETURN)
+          || ANY_RETURN_P (next))
         printf (" emit_jump_insn (");
       else if ((GET_CODE (next) == SET && GET_CODE (SET_SRC (next)) == CALL)
                || GET_CODE (next) == CALL
Index: gcc/df-scan.c
===================================================================
--- gcc/df-scan.c (revision 177999)
+++ gcc/df-scan.c (working copy)
@@ -3181,6 +3181,7 @@ df_uses_record (struct df_collection_rec
       }
     case RETURN:
+    case SIMPLE_RETURN:
       break;
     case ASM_OPERANDS:
Index: gcc/rtl.def
===================================================================
--- gcc/rtl.def (revision 177999)
+++ gcc/rtl.def (working copy)
@@ -731,6 +731,10 @@ DEF_RTL_EXPR(ENTRY_VALUE, "entry_value",
    been optimized away completely. */
 DEF_RTL_EXPR(DEBUG_PARAMETER_REF, "debug_parameter_ref", "t", RTX_OBJ)
+/* A plain return, to be used on paths that are reached without going
+   through the function prologue.
*/
+DEF_RTL_EXPR(SIMPLE_RETURN, "simple_return", "", RTX_EXTRA)
+
 /* All expressions from this point forward appear only in machine
    descriptions. */
 #ifdef GENERATOR_FILE
Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c (revision 177999)
+++ gcc/ifcvt.c (working copy)
@@ -3796,6 +3796,7 @@ find_if_case_1 (basic_block test_bb, edg
   basic_block then_bb = then_edge->dest;
   basic_block else_bb = else_edge->dest;
   basic_block new_bb;
+  rtx else_target = NULL_RTX;
   int then_bb_index;
   /* If we are partitioning hot/cold basic blocks, we don't want to
@@ -3845,6 +3846,13 @@ find_if_case_1 (basic_block test_bb, edg
                                     predictable_edge_p (then_edge)))))
     return FALSE;
+  if (else_bb == EXIT_BLOCK_PTR)
+    {
+      rtx jump = BB_END (else_edge->src);
+      gcc_assert (JUMP_P (jump));
+      else_target = JUMP_LABEL (jump);
+    }
+
   /* Registers set are dead, or are predicable. */
   if (! dead_or_predicable (test_bb, then_bb, else_bb,
                             single_succ_edge (then_bb), 1))
@@ -3864,6 +3872,9 @@ find_if_case_1 (basic_block test_bb, edg
       redirect_edge_succ (FALLTHRU_EDGE (test_bb), else_bb);
       new_bb = 0;
     }
+  else if (else_bb == EXIT_BLOCK_PTR)
+    new_bb = force_nonfallthru_and_redirect (FALLTHRU_EDGE (test_bb),
+                                             else_bb, else_target);
   else
     new_bb = redirect_edge_and_branch_force (FALLTHRU_EDGE (test_bb),
                                              else_bb);
Index: gcc/jump.c
===================================================================
--- gcc/jump.c (revision 177999)
+++ gcc/jump.c (working copy)
@@ -29,7 +29,8 @@ along with GCC; see the file COPYING3.
    JUMP_LABEL internal field. With this we can detect labels that
    become unused because of the deletion of all the jumps that
    formerly used them. The JUMP_LABEL info is sometimes looked
-   at by later passes.
+   at by later passes. For return insns, it contains either a
+   RETURN or a SIMPLE_RETURN rtx.
    The subroutines redirect_jump and invert_jump are used
    from other passes as well.
*/
@@ -775,10 +776,10 @@ condjump_p (const_rtx insn)
   return (GET_CODE (x) == IF_THEN_ELSE
           && ((GET_CODE (XEXP (x, 2)) == PC
                && (GET_CODE (XEXP (x, 1)) == LABEL_REF
-                   || GET_CODE (XEXP (x, 1)) == RETURN))
+                   || ANY_RETURN_P (XEXP (x, 1))))
               || (GET_CODE (XEXP (x, 1)) == PC
                   && (GET_CODE (XEXP (x, 2)) == LABEL_REF
-                      || GET_CODE (XEXP (x, 2)) == RETURN))));
+                      || ANY_RETURN_P (XEXP (x, 2))))));
 }
 /* Return nonzero if INSN is a (possibly) conditional jump inside a
@@ -807,11 +808,11 @@ condjump_in_parallel_p (const_rtx insn)
     return 0;
   if (XEXP (SET_SRC (x), 2) == pc_rtx
       && (GET_CODE (XEXP (SET_SRC (x), 1)) == LABEL_REF
-          || GET_CODE (XEXP (SET_SRC (x), 1)) == RETURN))
+          || ANY_RETURN_P (XEXP (SET_SRC (x), 1))))
     return 1;
   if (XEXP (SET_SRC (x), 1) == pc_rtx
       && (GET_CODE (XEXP (SET_SRC (x), 2)) == LABEL_REF
-          || GET_CODE (XEXP (SET_SRC (x), 2)) == RETURN))
+          || ANY_RETURN_P (XEXP (SET_SRC (x), 2))))
     return 1;
   return 0;
 }
@@ -873,8 +874,9 @@ any_condjump_p (const_rtx insn)
   a = GET_CODE (XEXP (SET_SRC (x), 1));
   b = GET_CODE (XEXP (SET_SRC (x), 2));
-  return ((b == PC && (a == LABEL_REF || a == RETURN))
-          || (a == PC && (b == LABEL_REF || b == RETURN)));
+  return ((b == PC && (a == LABEL_REF || a == RETURN || a == SIMPLE_RETURN))
+          || (a == PC
+              && (b == LABEL_REF || b == RETURN || b == SIMPLE_RETURN)));
 }
 /* Return the label of a conditional jump.
*/
@@ -911,6 +913,7 @@ returnjump_p_1 (rtx *loc, void *data ATT
   switch (GET_CODE (x))
     {
     case RETURN:
+    case SIMPLE_RETURN:
     case EH_RETURN:
       return true;
Index: gcc/function.c
===================================================================
--- gcc/function.c (revision 177999)
+++ gcc/function.c (working copy)
@@ -5306,7 +5306,11 @@ static void
 emit_return_into_block (basic_block bb)
 {
   rtx jump = emit_jump_insn_after (gen_return (), BB_END (bb));
-  JUMP_LABEL (jump) = ret_rtx;
+  rtx pat = PATTERN (jump);
+  if (GET_CODE (pat) == PARALLEL)
+    pat = XVECEXP (pat, 0, 0);
+  gcc_assert (ANY_RETURN_P (pat));
+  JUMP_LABEL (jump) = pat;
 }
 #endif /* HAVE_return */
Index: gcc/print-rtl.c
===================================================================
--- gcc/print-rtl.c (revision 177999)
+++ gcc/print-rtl.c (working copy)
@@ -328,6 +328,8 @@ print_rtx (const_rtx in_rtx)
             fprintf (outfile, "\n%s%*s -> ", print_rtx_head, indent * 2, "");
             if (GET_CODE (JUMP_LABEL (in_rtx)) == RETURN)
               fprintf (outfile, "return");
+            else if (GET_CODE (JUMP_LABEL (in_rtx)) == SIMPLE_RETURN)
+              fprintf (outfile, "simple_return");
             else
               fprintf (outfile, "%d", INSN_UID (JUMP_LABEL (in_rtx)));
           }
Index: gcc/bt-load.c
===================================================================
--- gcc/bt-load.c (revision 177999)
+++ gcc/bt-load.c (working copy)
@@ -558,7 +558,7 @@ compute_defs_uses_and_gen (fibheap_t all
               /* Check for sibcall.
*/
               if (GET_CODE (pat) == PARALLEL)
                 for (i = XVECLEN (pat, 0) - 1; i >= 0; i--)
-                  if (GET_CODE (XVECEXP (pat, 0, i)) == RETURN)
+                  if (ANY_RETURN_P (XVECEXP (pat, 0, i)))
                     {
                       COMPL_HARD_REG_SET (call_saved,
                                           call_used_reg_set);
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c (revision 177999)
+++ gcc/emit-rtl.c (working copy)
@@ -2518,6 +2518,7 @@ verify_rtx_sharing (rtx orig, rtx insn)
     case PC:
     case CC0:
     case RETURN:
+    case SIMPLE_RETURN:
     case SCRATCH:
       return;
       /* SCRATCH must be shared because they represent distinct values. */
@@ -2725,6 +2726,7 @@ repeat:
     case PC:
     case CC0:
     case RETURN:
+    case SIMPLE_RETURN:
     case SCRATCH:
       /* SCRATCH must be shared because they represent distinct values. */
       return;
@@ -2845,6 +2847,7 @@ repeat:
     case PC:
     case CC0:
     case RETURN:
+    case SIMPLE_RETURN:
       return;
     case DEBUG_INSN:
@@ -5008,7 +5011,7 @@ classify_insn (rtx x)
     return CODE_LABEL;
   if (GET_CODE (x) == CALL)
     return CALL_INSN;
-  if (GET_CODE (x) == RETURN)
+  if (ANY_RETURN_P (x))
     return JUMP_INSN;
   if (GET_CODE (x) == SET)
     {
@@ -5264,6 +5267,7 @@ copy_insn_1 (rtx orig)
     case PC:
     case CC0:
     case RETURN:
+    case SIMPLE_RETURN:
       return orig;
     case CLOBBER:
       if (REG_P (XEXP (orig, 0)) && REGNO (XEXP (orig, 0)) < FIRST_PSEUDO_REGISTER)
@@ -5521,6 +5525,7 @@ init_emit_regs (void)
   /* Assign register numbers to the globally defined register rtx.
*/
   pc_rtx = gen_rtx_fmt_ (PC, VOIDmode);
   ret_rtx = gen_rtx_fmt_ (RETURN, VOIDmode);
+  simple_return_rtx = gen_rtx_fmt_ (SIMPLE_RETURN, VOIDmode);
   cc0_rtx = gen_rtx_fmt_ (CC0, VOIDmode);
   stack_pointer_rtx = gen_raw_REG (Pmode, STACK_POINTER_REGNUM);
   frame_pointer_rtx = gen_raw_REG (Pmode, FRAME_POINTER_REGNUM);
Index: gcc/cfglayout.c
===================================================================
--- gcc/cfglayout.c (revision 177999)
+++ gcc/cfglayout.c (working copy)
@@ -767,6 +767,7 @@ fixup_reorder_chain (void)
     {
       edge e_fall, e_taken, e;
       rtx bb_end_insn;
+      rtx ret_label = NULL_RTX;
       basic_block nb, src_bb;
       edge_iterator ei;
@@ -786,6 +787,7 @@ fixup_reorder_chain (void)
       bb_end_insn = BB_END (bb);
       if (JUMP_P (bb_end_insn))
         {
+          ret_label = JUMP_LABEL (bb_end_insn);
           if (any_condjump_p (bb_end_insn))
             {
               /* This might happen if the conditional jump has side
@@ -899,7 +901,7 @@ fixup_reorder_chain (void)
          Note force_nonfallthru can delete E_FALL and thus we have to
          save E_FALL->src prior to the call to force_nonfallthru. */
       src_bb = e_fall->src;
-      nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest);
+      nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest, ret_label);
       if (nb)
         {
           nb->il.rtl->visited = 1;
Index: gcc/rtl.c
===================================================================
--- gcc/rtl.c (revision 177999)
+++ gcc/rtl.c (working copy)
@@ -256,6 +256,7 @@ copy_rtx (rtx orig)
     case PC:
     case CC0:
     case RETURN:
+    case SIMPLE_RETURN:
     case SCRATCH:
       /* SCRATCH must be shared because they represent distinct values. */
       return orig;
Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h (revision 177999)
+++ gcc/rtl.h (working copy)
@@ -432,8 +432,9 @@ struct GTY((variable_size)) rtvec_def {
   (JUMP_P (INSN) && (GET_CODE (PATTERN (INSN)) == ADDR_VEC || \
                      GET_CODE (PATTERN (INSN)) == ADDR_DIFF_VEC))
-/* Predicate yielding nonzero iff X is a return.
*/
-#define ANY_RETURN_P(X) ((X) == ret_rtx)
+/* Predicate yielding nonzero iff X is a return or simple_return. */
+#define ANY_RETURN_P(X) \
+  (GET_CODE (X) == RETURN || GET_CODE (X) == SIMPLE_RETURN)
 /* 1 if X is a unary operator. */
@@ -2111,6 +2112,7 @@ enum global_rtl_index
   GR_PC,
   GR_CC0,
   GR_RETURN,
+  GR_SIMPLE_RETURN,
   GR_STACK_POINTER,
   GR_FRAME_POINTER,
   /* For register elimination to work properly these hard_frame_pointer_rtx,
@@ -2206,6 +2208,7 @@ extern struct target_rtl *this_target_rt
 /* Standard pieces of rtx, to be substituted directly into things. */
 #define pc_rtx (global_rtl[GR_PC])
 #define ret_rtx (global_rtl[GR_RETURN])
+#define simple_return_rtx (global_rtl[GR_SIMPLE_RETURN])
 #define cc0_rtx (global_rtl[GR_CC0])
 /* All references to certain hard regs, except those created
Index: gcc/combine.c
===================================================================
--- gcc/combine.c (revision 177999)
+++ gcc/combine.c (working copy)
@@ -6303,7 +6303,7 @@ simplify_set (rtx x)
   rtx *cc_use;
   /* (set (pc) (return)) gets written as (return).
*/
-  if (GET_CODE (dest) == PC && GET_CODE (src) == RETURN)
+  if (GET_CODE (dest) == PC && ANY_RETURN_P (src))
     return src;
   /* Now that we know for sure which bits of SRC we are using, see if we can
Index: gcc/resource.c
===================================================================
--- gcc/resource.c (revision 177999)
+++ gcc/resource.c (working copy)
@@ -492,7 +492,7 @@ find_dead_or_set_registers (rtx target,
           if (jump_count++ < 10)
             {
               if (any_uncondjump_p (this_jump_insn)
-                  || GET_CODE (PATTERN (this_jump_insn)) == RETURN)
+                  || ANY_RETURN_P (PATTERN (this_jump_insn)))
                 {
                   next = JUMP_LABEL (this_jump_insn);
                   if (ANY_RETURN_P (next))
@@ -829,7 +829,7 @@ mark_set_resources (rtx x, struct resour
 static bool
 return_insn_p (const_rtx insn)
 {
-  if (JUMP_P (insn) && GET_CODE (PATTERN (insn)) == RETURN)
+  if (JUMP_P (insn) && ANY_RETURN_P (PATTERN (insn)))
     return true;
   if (NONJUMP_INSN_P (insn) && GET_CODE (PATTERN (insn)) == SEQUENCE)
Index: gcc/basic-block.h
===================================================================
--- gcc/basic-block.h (revision 177999)
+++ gcc/basic-block.h (working copy)
@@ -804,7 +804,7 @@ extern rtx block_label (basic_block);
 extern bool purge_all_dead_edges (void);
 extern bool purge_dead_edges (basic_block);
 extern bool fixup_abnormal_edges (void);
-extern basic_block force_nonfallthru_and_redirect (edge, basic_block);
+extern basic_block force_nonfallthru_and_redirect (edge, basic_block, rtx);
 /* In cfgbuild.c.
*/
 extern void find_many_sub_basic_blocks (sbitmap);
Index: gcc/sched-vis.c
===================================================================
--- gcc/sched-vis.c (revision 177999)
+++ gcc/sched-vis.c (working copy)
@@ -554,6 +554,9 @@ print_pattern (char *buf, const_rtx x, i
     case RETURN:
       sprintf (buf, "return");
       break;
+    case SIMPLE_RETURN:
+      sprintf (buf, "simple_return");
+      break;
     case CALL:
       print_exp (buf, x, verbose);
       break;
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c (revision 177999)
+++ gcc/config/i386/i386.c (working copy)
@@ -30545,7 +30545,7 @@ ix86_pad_returns (void)
       rtx prev;
       bool replace = false;
-      if (!JUMP_P (ret) || GET_CODE (PATTERN (ret)) != RETURN
+      if (!JUMP_P (ret) || !ANY_RETURN_P (PATTERN (ret))
           || optimize_bb_for_size_p (bb))
         continue;
       for (prev = PREV_INSN (ret); prev; prev = PREV_INSN (prev))
@@ -30596,7 +30596,7 @@ ix86_count_insn_bb (basic_block bb)
     {
       /* Only happen in exit blocks. */
       if (JUMP_P (insn)
-          && GET_CODE (PATTERN (insn)) == RETURN)
+          && ANY_RETURN_P (PATTERN (insn)))
         break;
       if (NONDEBUG_INSN_P (insn)
@@ -30669,7 +30669,7 @@ ix86_pad_short_function (void)
   FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR->preds)
     {
       rtx ret = BB_END (e->src);
-      if (JUMP_P (ret) && GET_CODE (PATTERN (ret)) == RETURN)
+      if (JUMP_P (ret) && ANY_RETURN_P (PATTERN (ret)))
         {
           int insn_count = ix86_count_insn (e->src);
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c (revision 177999)
+++ gcc/config/arm/arm.c (working copy)
@@ -17723,6 +17723,7 @@ arm_final_prescan_insn (rtx insn)
   /* If we start with a return insn, we only succeed if we find another one. */
   int seeking_return = 0;
+  enum rtx_code return_code = UNKNOWN;
   /* START_INSN will hold the insn from where we start looking. This is the
      first insn after the following code_label if REVERSE is true.
*/
@@ -17761,7 +17762,7 @@ arm_final_prescan_insn (rtx insn)
           else
             return;
         }
-      else if (GET_CODE (body) == RETURN)
+      else if (ANY_RETURN_P (body))
         {
           start_insn = next_nonnote_insn (start_insn);
           if (GET_CODE (start_insn) == BARRIER)
@@ -17772,6 +17773,7 @@ arm_final_prescan_insn (rtx insn)
             {
               reverse = TRUE;
               seeking_return = 1;
+              return_code = GET_CODE (body);
             }
           else
             return;
@@ -17812,11 +17814,15 @@ arm_final_prescan_insn (rtx insn)
           label = XEXP (XEXP (SET_SRC (body), 2), 0);
           then_not_else = FALSE;
         }
-      else if (GET_CODE (XEXP (SET_SRC (body), 1)) == RETURN)
-        seeking_return = 1;
-      else if (GET_CODE (XEXP (SET_SRC (body), 2)) == RETURN)
+      else if (ANY_RETURN_P (XEXP (SET_SRC (body), 1)))
+        {
+          seeking_return = 1;
+          return_code = GET_CODE (XEXP (SET_SRC (body), 1));
+        }
+      else if (ANY_RETURN_P (XEXP (SET_SRC (body), 2)))
         {
           seeking_return = 1;
+          return_code = GET_CODE (XEXP (SET_SRC (body), 2));
           then_not_else = FALSE;
         }
       else
@@ -17913,12 +17919,11 @@ arm_final_prescan_insn (rtx insn)
                 }
               /* Fail if a conditional return is undesirable (e.g. on a
                  StrongARM), but still allow this if optimizing for size. */
-              else if (GET_CODE (scanbody) == RETURN
+              else if (GET_CODE (scanbody) == return_code
                        && !use_return_insn (TRUE, NULL)
                        && !optimize_size)
                 fail = TRUE;
-              else if (GET_CODE (scanbody) == RETURN
-                       && seeking_return)
+              else if (GET_CODE (scanbody) == return_code)
                 {
                   arm_ccfsm_state = 2;
                   succeed = TRUE;
Index: gcc/config/mips/mips.md
===================================================================
--- gcc/config/mips/mips.md (revision 177999)
+++ gcc/config/mips/mips.md (working copy)
@@ -777,6 +777,8 @@ (define_code_iterator any_ge [ge geu])
 (define_code_iterator any_lt [lt ltu])
 (define_code_iterator any_le [le leu])
+(define_code_iterator any_return [return simple_return])
+
 ;; <u> expands to an empty string when doing a signed operation and
 ;; "u" when doing an unsigned operation.
 (define_code_attr u [(sign_extend "") (zero_extend "u")
@@ -798,7 +800,9 @@ (define_code_attr optab [(ashift "ashl")
                          (xor "xor")
                          (and "and")
                          (plus "add")
-                         (minus "sub")])
+                         (minus "sub")
+                         (return "return")
+                         (simple_return "simple_return")])
 ;; <insn> expands to the name of the insn that implements a particular code.
 (define_code_attr insn [(ashift "sll")
@@ -5713,21 +5717,26 @@ (define_expand "sibcall_epilogue"
 ;; allows jump optimizations to work better.
 (define_expand "return"
-  [(return)]
+  [(simple_return)]
   "mips_can_use_return_insn ()"
   { mips_expand_before_return (); })
-(define_insn "*return"
-  [(return)]
-  "mips_can_use_return_insn ()"
+(define_expand "simple_return"
+  [(simple_return)]
+  ""
+  { mips_expand_before_return (); })
+
+(define_insn "*<optab>"
+  [(any_return)]
+  ""
   "%*j\t$31%/"
   [(set_attr "type" "jump")
    (set_attr "mode" "none")])
 ;; Normal return.
-(define_insn "return_internal"
-  [(return)
+(define_insn "<optab>_internal"
+  [(any_return)
    (use (match_operand 0 "pmode_register_operand" ""))]
   ""
   "%*j\t%0%/"
Index: gcc/config/mips/mips.c
===================================================================
--- gcc/config/mips/mips.c (revision 177999)
+++ gcc/config/mips/mips.c (working copy)
@@ -10453,7 +10453,8 @@ mips_expand_epilogue (bool sibcall_p)
             regno = GP_REG_FIRST + 7;
           else
             regno = RETURN_ADDR_REGNUM;
-          emit_jump_insn (gen_return_internal (gen_rtx_REG (Pmode, regno)));
+          emit_jump_insn (gen_simple_return_internal (gen_rtx_REG (Pmode,
+                                                                   regno)));
         }
     }
Index: gcc/cfgrtl.c
===================================================================
--- gcc/cfgrtl.c (revision 177999)
+++ gcc/cfgrtl.c (working copy)
@@ -1117,10 +1117,13 @@ rtl_redirect_edge_and_branch (edge e, ba
 }
 /* Like force_nonfallthru below, but additionally performs redirection
-   Used by redirect_edge_and_branch_force. */
+   Used by redirect_edge_and_branch_force.
JUMP_LABEL is used only
+   when redirecting to the EXIT_BLOCK, it is either ret_rtx or
+   simple_return_rtx, indicating which kind of returnjump to create.
+   It should be NULL otherwise. */
 basic_block
-force_nonfallthru_and_redirect (edge e, basic_block target)
+force_nonfallthru_and_redirect (edge e, basic_block target, rtx jump_label)
 {
   basic_block jump_block, new_bb = NULL, src = e->src;
   rtx note;
@@ -1252,12 +1255,25 @@ force_nonfallthru_and_redirect (edge e,
   e->flags &= ~EDGE_FALLTHRU;
   if (target == EXIT_BLOCK_PTR)
     {
+      if (jump_label == ret_rtx)
+        {
 #ifdef HAVE_return
-      emit_jump_insn_after_setloc (gen_return (), BB_END (jump_block), loc);
-      JUMP_LABEL (BB_END (jump_block)) = ret_rtx;
+          emit_jump_insn_after_setloc (gen_return (), BB_END (jump_block), loc);
 #else
-      gcc_unreachable ();
+          gcc_unreachable ();
+#endif
+        }
+      else
+        {
+          gcc_assert (jump_label == simple_return_rtx);
+#ifdef HAVE_simple_return
+          emit_jump_insn_after_setloc (gen_simple_return (),
+                                       BB_END (jump_block), loc);
+#else
+          gcc_unreachable ();
 #endif
+        }
+      JUMP_LABEL (BB_END (jump_block)) = jump_label;
     }
   else
     {
@@ -1284,7 +1300,7 @@ force_nonfallthru_and_redirect (edge e,
 static basic_block
 rtl_force_nonfallthru (edge e)
 {
-  return force_nonfallthru_and_redirect (e, e->dest);
+  return force_nonfallthru_and_redirect (e, e->dest, NULL_RTX);
 }
 /* Redirect edge even at the expense of creating new jump insn or
@@ -1301,7 +1317,7 @@ rtl_redirect_edge_and_branch_force (edge
   /* In case the edge redirection failed, try to force it to be non-fallthru
      and redirect newly created simplejump. */
   df_set_bb_dirty (e->src);
-  return force_nonfallthru_and_redirect (e, target);
+  return force_nonfallthru_and_redirect (e, target, NULL_RTX);
 }
 /* The given edge should potentially be a fallthru edge.
If that is in
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi (revision 177999)
+++ gcc/doc/md.texi (working copy)
@@ -4992,6 +4992,20 @@ some class of functions only requires on
 return. Normally, the applicable functions are those which do not need
 to save any registers or allocate stack space.
+It is valid for this pattern to expand to an instruction using
+@code{simple_return} if no epilogue is required.
+
+@cindex @code{simple_return} instruction pattern
+@item @samp{simple_return}
+Subroutine return instruction. This instruction pattern name should be
+defined only if a single instruction can do all the work of returning
+from a function on a path where no epilogue is required. This pattern
+is very similar to the @code{return} instruction pattern, but it is emitted
+only by the shrink-wrapping optimization on paths where the function
+prologue has not been executed, and a function return should occur without
+any of the effects of the epilogue. Additional uses may be introduced on
+paths where both the prologue and the epilogue have executed.
+
 @findex reload_completed
 @findex leaf_function_p
 For such machines, the condition specified in this pattern should only