Message ID | CAAe5K+WEPgQLzO_LiKkvsNuL-JUZzJS7G9PCqFT5imVGTa5Smw@mail.gmail.com |
---|---|
State | New |

On Thu, May 23, 2013 at 6:18 AM, Teresa Johnson <tejohnson@google.com> wrote:
> On Wed, May 22, 2013 at 2:05 PM, Teresa Johnson <tejohnson@google.com> wrote:
>> Revised patch included below. The spacing of my pasted in patch text
>> looks funky again, let me know if you want the patch as an attachment
>> instead.
>>
>> I addressed all of Steven's comments, except for the suggestion to use
>> gcc_assert instead of error() in verify_hot_cold_block_grouping() to keep
>> this consistent with the rest of the verify_flow_info subroutines (let me
>> know if this is ok).
>
> I fixed this issue too, which was actually in
> insert_section_boundary_note(), so that it gcc_asserts more
> efficiently as suggested. Retested, latest patch below.
>
> Honza, would you be able to review the patch?

Ping. Still needs a global maintainer to review and approve.

Also, I submitted a PR for the debug range issue:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57451

Thanks!
Teresa

>
> Thanks!
> Teresa
>
>>
>> The other main changes:
>> (1) Added several test cases (cloned from the torture subdirectories,
>> where I manually built/ran with FDO and -freorder-blocks-and-partition
>> with both the current trunk and my fixed trunk compiler, and was able
>> to expose some failures I fixed).
>> (2) Changed existing tree-prof tests that used
>> -freorder-blocks-and-partition to be built with -O2 instead of -O,
>> so that partitioning actually kicks in.
>> (3) Fixed a couple of failures in the new
>> verify_hot_cold_block_grouping() checks exposed by the torture tests
>> I ran manually with splitting (2 of the tests cloned to tree-prof in
>> this patch). One was in computed goto where we were too aggressive
>> about cloning crossing edges, and the other was in rtl_split_edge
>> called from the "stack" pass which was not correctly inserting the
>> new bb in the correct partition since bb layout is complete at that
>> point.
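
To make the rtl_split_edge problem in item (3) concrete, here is a minimal standalone sketch of why the split block has to be placed at the end of the source partition once bb layout is final. This is a toy model and not GCC code: the layout is just an array of partition tags in final block order, and `toy_partition`, `toy_last_in_partition`, `toy_insert_after` and `toy_count_switches` are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for hot/cold partition tags (hypothetical, not the
   real BB_HOT_PARTITION / BB_COLD_PARTITION macros).  */
enum toy_partition { TOY_HOT, TOY_COLD };

/* Return the index of the last block, scanning forward from START,
   that is still in the same partition as START.  LAYOUT holds the
   partition tag of each block in final layout order.  */
size_t
toy_last_in_partition (const enum toy_partition *layout, size_t n,
                       size_t start)
{
  size_t i = start;
  while (i + 1 < n && layout[i + 1] == layout[start])
    i++;
  return i;
}

/* Count the hot/cold transitions along the layout.  A valid final
   layout has at most one.  */
int
toy_count_switches (const enum toy_partition *layout, size_t n)
{
  int switches = 0;
  for (size_t i = 1; i < n; i++)
    if (layout[i] != layout[i - 1])
      switches++;
  return switches;
}

/* Copy LAYOUT into OUT, inserting a new block tagged PART right after
   index AFTER.  OUT must have room for N + 1 entries.  */
void
toy_insert_after (const enum toy_partition *layout, size_t n,
                  size_t after, enum toy_partition part,
                  enum toy_partition *out)
{
  size_t j = 0;
  for (size_t i = 0; i < n; i++)
    {
      out[j++] = layout[i];
      if (i == after)
        out[j++] = part;
    }
}
```

With the layout hot, hot, hot, cold, cold and a crossing edge from block 0 to block 4, putting a hot split block right before the destination (after index 3) yields three section switches, while putting it after `toy_last_in_partition (layout, 5, 0)` keeps the single switch.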
>>
>> Re-tested on x86_64-unknown-linux-gnu with bootstrap and profiledbootstrap
>> builds and regression testing. Re-built/ran cpu2006int with profile
>> feedback and -freorder-blocks-and-partition enabled.
>>
>> Ok for trunk?
>>
>> Thanks!
>> Teresa
>
> 2013-05-23  Teresa Johnson  <tejohnson@google.com>
>
>     * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
>     as this is now done by redirect_edge_and_branch_force.
>     * function.c (thread_prologue_and_epilogue_insns): Insert new bb after
>     barriers, and fix interaction with splitting.
>     * emit-rtl.c (try_split): Copy REG_CROSSING_JUMP notes.
>     * cfgcleanup.c (try_forward_edges): Fix early return value to properly
>     reflect changes made in the routine.
>     * bb-reorder.c (emit_barrier_after_bb): Move to cfgrtl.c.
>     (fix_up_fall_thru_edges): Remove incorrect check for bb layout order
>     since this is called in cfglayout mode, and replace partition fixup
>     with assert as that is now done by force_nonfallthru_and_redirect.
>     (add_reg_crossing_jump_notes): Handle the fact that some jumps may
>     already be marked with region crossing note.
>     (insert_section_boundary_note): Make non-static, gate on flag
>     has_bb_partition, rewrite to also check for multiple partitions.
>     (rest_of_handle_reorder_blocks): Remove call to
>     insert_section_boundary_note, now done later during free_cfg.
>     (duplicate_computed_gotos): Don't duplicate partition crossing edge.
>     * bb-reorder.h (insert_section_boundary_note): Declare.
>     * Makefile.in (cfgrtl.o): Depend on bb-reorder.h
>     * cfgrtl.c (rest_of_pass_free_cfg): If partitions exist
>     invoke insert_section_boundary_note.
>     (try_redirect_by_replacing_jump): Remove unnecessary
>     check for region crossing note.
>     (fixup_partition_crossing): New function.
>     (rtl_redirect_edge_and_branch): Fixup partition boundaries.
>     (emit_barrier_after_bb): Move here from bb-reorder.c, handle insertion
>     in non-cfglayout mode.
>     (force_nonfallthru_and_redirect): Fixup partition boundaries,
>     remove old code that tried to do this.  Emit barrier correctly
>     when we are in cfglayout mode.
>     (last_bb_in_partition): New function.
>     (rtl_split_edge): Correctly fixup partition boundaries.
>     (commit_one_edge_insertion): Remove old code that tried to
>     fixup region crossing edge since this is now handled in
>     split_block, and set up insertion point correctly since
>     block may now end in a jump.
>     (verify_hot_cold_block_grouping): Guard against checking when not in
>     linearized RTL mode.
>     (rtl_verify_edges): Add checks for incorrect/missing REG_CROSSING_JUMP
>     notes.
>     (rtl_verify_flow_info_1): Move verify_hot_cold_block_grouping to
>     rtl_verify_flow_info, so not called in cfglayout mode.
>     (rtl_verify_flow_info): Move verify_hot_cold_block_grouping here.
>     (fixup_reorder_chain): Remove old code that attempted to fixup region
>     crossing note as this is now handled in force_nonfallthru_and_redirect.
>     (duplicate_insn_chain): Don't duplicate switch section notes.
>     (rtl_can_remove_branch_p): Remove unnecessary check for region crossing
>     note.
>     * basic-block.h (emit_barrier_after_bb): Declare.
>     * testsuite/gcc.dg/tree-prof/va-arg-pack-1.c: Cloned from c-torture, made
>     into -freorder-blocks-and-partition test.
>     * testsuite/gcc.dg/tree-prof/comp-goto-1.c: Ditto.
>     * testsuite/gcc.dg/tree-prof/20041218-1.c: Ditto.
>     * testsuite/gcc.dg/tree-prof/pr52027.c: Use -O2.
>     * testsuite/gcc.dg/tree-prof/pr50907.c: Ditto.
>     * testsuite/gcc.dg/tree-prof/pr45354.c: Ditto.
>     * testsuite/g++.dg/tree-prof/partition2.C: Ditto.
>     * testsuite/g++.dg/tree-prof/partition3.C: Ditto.
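
The ChangeLog entries for insert_section_boundary_note and verify_hot_cold_block_grouping rely on the same invariant: walking the blocks in final layout order, the partition may change at most once. Below is a minimal standalone sketch of that walk, written as an illustration only. It is not the GCC implementation: `toy_part` and `toy_section_boundary` are hypothetical names, and an array of partition tags stands in for the basic block chain.

```c
#include <assert.h>
#include <stddef.h>

/* Toy partition tags (hypothetical); zero means "not seen yet",
   loosely mirroring an unpartitioned state.  */
enum toy_part { TOY_UNPARTITIONED = 0, TOY_HOT_PART, TOY_COLD_PART };

/* Walk PARTS in layout order and return the index of the block that
   would get the section-switch note placed in front of it, or -1 if
   every block is in the same partition.  Asserts if the layout
   switches partition more than once.  */
long
toy_section_boundary (const enum toy_part *parts, size_t n)
{
  int switched_sections = 0;
  long boundary = -1;
  enum toy_part current = TOY_UNPARTITIONED;

  for (size_t i = 0; i < n; i++)
    {
      /* The first block seen determines the starting partition.  */
      if (current == TOY_UNPARTITIONED)
        current = parts[i];

      if (parts[i] != current)
        {
          /* A second change means the hot/cold grouping is broken.  */
          assert (!switched_sections);
          switched_sections = 1;
          boundary = (long) i;
          current = parts[i];
        }
    }
  return boundary;
}
```

For a layout of hot, hot, cold, cold the boundary is block 2. A layout such as hot, cold, hot would trip the assertion, which is the kind of broken grouping the new checks are meant to catch.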
> > Index: ifcvt.c > =================================================================== > --- ifcvt.c (revision 199014) > +++ ifcvt.c (working copy) > @@ -3905,10 +3905,9 @@ find_if_case_1 (basic_block test_bb, edge then_edg > if (new_bb) > { > df_bb_replace (then_bb_index, new_bb); > - /* Since the fallthru edge was redirected from test_bb to new_bb, > - we need to ensure that new_bb is in the same partition as > - test bb (you can not fall through across section boundaries). */ > - BB_COPY_PARTITION (new_bb, test_bb); > + /* This should have been done above via force_nonfallthru_and_redirect > + (possibly called from redirect_edge_and_branch_force). */ > + gcc_checking_assert (BB_PARTITION (new_bb) == BB_PARTITION (test_bb)); > } > > num_true_changes++; > Index: function.c > =================================================================== > --- function.c (revision 199014) > +++ function.c (working copy) > @@ -6270,8 +6270,10 @@ thread_prologue_and_epilogue_insns (void) > break; > if (e) > { > - copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)), > - NULL_RTX, e->src); > + /* Make sure we insert after any barriers. */ > + rtx end = get_last_bb_insn (e->src); > + copy_bb = create_basic_block (NEXT_INSN (end), > + NULL_RTX, e->src); > BB_COPY_PARTITION (copy_bb, e->src); > } > else > @@ -6538,7 +6540,7 @@ epilogue_done: > basic_block simple_return_block_cold = NULL; > edge pending_edge_hot = NULL; > edge pending_edge_cold = NULL; > - basic_block exit_pred = EXIT_BLOCK_PTR->prev_bb; > + basic_block exit_pred; > int i; > > gcc_assert (entry_edge != orig_entry_edge); > @@ -6566,6 +6568,12 @@ epilogue_done: > else > pending_edge_cold = e; > } > + > + /* Save a pointer to the exit's predecessor BB for use in > + inserting new BBs at the end of the function. Do this > + after the call to split_block above which may split > + the original exit pred. 
*/ > + exit_pred = EXIT_BLOCK_PTR->prev_bb; > > FOR_EACH_VEC_ELT (unconverted_simple_returns, i, e) > { > Index: emit-rtl.c > =================================================================== > --- emit-rtl.c (revision 199014) > +++ emit-rtl.c (working copy) > @@ -3574,6 +3574,7 @@ try_split (rtx pat, rtx trial, int last) > break; > > case REG_NON_LOCAL_GOTO: > + case REG_CROSSING_JUMP: > for (insn = insn_last; insn != NULL_RTX; insn = PREV_INSN (insn)) > { > if (JUMP_P (insn)) > Index: cfgcleanup.c > =================================================================== > --- cfgcleanup.c (revision 199014) > +++ cfgcleanup.c (working copy) > @@ -456,7 +456,7 @@ try_forward_edges (int mode, basic_block b) > > if (first != EXIT_BLOCK_PTR > && find_reg_note (BB_END (first), REG_CROSSING_JUMP, NULL_RTX)) > - return false; > + return changed; > > while (counter < n_basic_blocks) > { > Index: bb-reorder.c > =================================================================== > --- bb-reorder.c (revision 199014) > +++ bb-reorder.c (working copy) > @@ -1380,15 +1380,6 @@ get_uncond_jump_length (void) > return length; > } > > -/* Emit a barrier into the footer of BB. */ > - > -static void > -emit_barrier_after_bb (basic_block bb) > -{ > - rtx barrier = emit_barrier_after (BB_END (bb)); > - BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); > -} > - > /* The landing pad OLD_LP, in block OLD_BB, has edges from both partitions. > Duplicate the landing pad and split the edges so that no EH edge > crosses partitions. */ > @@ -1720,8 +1711,7 @@ fix_up_fall_thru_edges (void) > (i.e. fix it so the fall through does not cross and > the cond jump does). */ > > - if (!cond_jump_crosses > - && cur_bb->aux == cond_jump->dest) > + if (!cond_jump_crosses) > { > /* Find label in fall_thru block. We've already added > any missing labels, so there must be one. 
*/ > @@ -1765,10 +1755,10 @@ fix_up_fall_thru_edges (void) > new_bb->aux = cur_bb->aux; > cur_bb->aux = new_bb; > > - /* Make sure new fall-through bb is in same > - partition as bb it's falling through from. */ > + /* This is done by force_nonfallthru_and_redirect. */ > + gcc_assert (BB_PARTITION (new_bb) > + == BB_PARTITION (cur_bb)); > > - BB_COPY_PARTITION (new_bb, cur_bb); > single_succ_edge (new_bb)->flags |= EDGE_CROSSING; > } > else > @@ -2064,7 +2054,10 @@ add_reg_crossing_jump_notes (void) > FOR_EACH_BB (bb) > FOR_EACH_EDGE (e, ei, bb->succs) > if ((e->flags & EDGE_CROSSING) > - && JUMP_P (BB_END (e->src))) > + && JUMP_P (BB_END (e->src)) > + /* Some notes were added during fix_up_fall_thru_edges, via > + force_nonfallthru_and_redirect. */ > + && !find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX)) > add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); > } > > @@ -2133,23 +2126,26 @@ reorder_basic_blocks (void) > encountering this note will make the compiler switch between the > hot and cold text sections. 
*/ > > -static void > +void > insert_section_boundary_note (void) > { > basic_block bb; > - int first_partition = 0; > + bool switched_sections = false; > + int current_partition = 0; > > - if (!flag_reorder_blocks_and_partition) > + if (!crtl->has_bb_partition) > return; > > FOR_EACH_BB (bb) > { > - if (!first_partition) > - first_partition = BB_PARTITION (bb); > - if (BB_PARTITION (bb) != first_partition) > + if (!current_partition) > + current_partition = BB_PARTITION (bb); > + if (BB_PARTITION (bb) != current_partition) > { > - emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb)); > - break; > + gcc_assert (!switched_sections); > + switched_sections = true; > + emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb)); > + current_partition = BB_PARTITION (bb); > } > } > } > @@ -2180,8 +2176,6 @@ rest_of_handle_reorder_blocks (void) > bb->aux = bb->next_bb; > cfg_layout_finalize (); > > - /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes. */ > - insert_section_boundary_note (); > return 0; > } > > @@ -2315,6 +2309,11 @@ duplicate_computed_gotos (void) > if (!bitmap_bit_p (candidates, single_succ (bb)->index)) > continue; > > + /* Don't duplicate a partition crossing edge, which requires difficult > + fixup. 
*/ > + if (find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX)) > + continue; > + > new_bb = duplicate_block (single_succ (bb), single_succ_edge (bb), bb); > new_bb->aux = bb->aux; > bb->aux = new_bb; > Index: bb-reorder.h > =================================================================== > --- bb-reorder.h (revision 199014) > +++ bb-reorder.h (working copy) > @@ -35,4 +35,6 @@ extern struct target_bb_reorder *this_target_bb_re > > extern int get_uncond_jump_length (void); > > +extern void insert_section_boundary_note (void); > + > #endif > Index: Makefile.in > =================================================================== > --- Makefile.in (revision 199014) > +++ Makefile.in (working copy) > @@ -3151,7 +3151,7 @@ cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) corety > $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) $(INSN_ATTR_H) \ > insn-config.h $(EXPR_H) \ > $(CFGLOOP_H) $(OBSTACK_H) $(TARGET_H) $(TREE_H) \ > - $(TREE_PASS_H) $(DF_H) $(GGC_H) $(COMMON_TARGET_H) gt-cfgrtl.h > + $(TREE_PASS_H) $(DF_H) $(GGC_H) $(COMMON_TARGET_H) gt-cfgrtl.h bb-reorder.h > cfganal.o : cfganal.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(BASIC_BLOCK_H) \ > $(TIMEVAR_H) sbitmap.h $(BITMAP_H) > cfgbuild.o : cfgbuild.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \ > Index: cfgrtl.c > =================================================================== > --- cfgrtl.c (revision 199014) > +++ cfgrtl.c (working copy) > @@ -44,6 +44,7 @@ along with GCC; see the file COPYING3. If not see > #include "tree.h" > #include "hard-reg-set.h" > #include "basic-block.h" > +#include "bb-reorder.h" > #include "regs.h" > #include "flags.h" > #include "function.h" > @@ -451,6 +452,9 @@ rest_of_pass_free_cfg (void) > } > #endif > > + if (crtl->has_bb_partition) > + insert_section_boundary_note (); > + > free_bb_for_insn (); > return 0; > } > @@ -981,8 +985,7 @@ try_redirect_by_replacing_jump (edge e, basic_bloc > partition boundaries). 
See the comments at the top of > bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ > > - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) > - || BB_PARTITION (src) != BB_PARTITION (target)) > + if (BB_PARTITION (src) != BB_PARTITION (target)) > return NULL; > > /* We can replace or remove a complex jump only when we have exactly > @@ -1291,6 +1294,53 @@ redirect_branch_edge (edge e, basic_block target) > return e; > } > > +/* Called when edge E has been redirected to a new destination, > + in order to update the region crossing flag on the edge and > + jump. */ > + > +static void > +fixup_partition_crossing (edge e) > +{ > + rtx note; > + > + if (e->src == ENTRY_BLOCK_PTR || e->dest == EXIT_BLOCK_PTR) > + return; > + /* If we redirected an existing edge, it may already be marked > + crossing, even though the new src is missing a reg crossing note. > + But make sure reg crossing note doesn't already exist before > + inserting. */ > + if (BB_PARTITION (e->src) != BB_PARTITION (e->dest)) > + { > + e->flags |= EDGE_CROSSING; > + note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); > + if (JUMP_P (BB_END (e->src)) > + && !note) > + add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); > + } > + else if (BB_PARTITION (e->src) == BB_PARTITION (e->dest)) > + { > + e->flags &= ~EDGE_CROSSING; > + /* Remove the section crossing note from jump at end of > + src if it exists, and if no other successors are > + still crossing. */ > + note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); > + if (note) > + { > + bool has_crossing_succ = false; > + edge e2; > + edge_iterator ei; > + FOR_EACH_EDGE (e2, ei, e->src->succs) > + { > + has_crossing_succ |= (e2->flags & EDGE_CROSSING); > + if (has_crossing_succ) > + break; > + } > + if (!has_crossing_succ) > + remove_note (BB_END (e->src), note); > + } > + } > +} > + > /* Attempt to change code to redirect edge E to TARGET. 
Don't do that on > expense of adding new instructions or reordering basic blocks. > > @@ -1307,16 +1357,18 @@ rtl_redirect_edge_and_branch (edge e, basic_block > { > edge ret; > basic_block src = e->src; > + basic_block dest = e->dest; > > if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH)) > return NULL; > > - if (e->dest == target) > + if (dest == target) > return e; > > if ((ret = try_redirect_by_replacing_jump (e, target, false)) != NULL) > { > df_set_bb_dirty (src); > + fixup_partition_crossing (ret); > return ret; > } > > @@ -1325,9 +1377,22 @@ rtl_redirect_edge_and_branch (edge e, basic_block > return NULL; > > df_set_bb_dirty (src); > + fixup_partition_crossing (ret); > return ret; > } > > +/* Emit a barrier after BB, into the footer if we are in CFGLAYOUT mode. */ > + > +void > +emit_barrier_after_bb (basic_block bb) > +{ > + rtx barrier = emit_barrier_after (BB_END (bb)); > + gcc_assert (current_ir_type() == IR_RTL_CFGRTL > + || current_ir_type () == IR_RTL_CFGLAYOUT); > + if (current_ir_type () == IR_RTL_CFGLAYOUT) > + BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); > +} > + > /* Like force_nonfallthru below, but additionally performs redirection > Used by redirect_edge_and_branch_force. JUMP_LABEL is used only > when redirecting to the EXIT_BLOCK, it is either ret_rtx or > @@ -1492,12 +1557,6 @@ force_nonfallthru_and_redirect (edge e, basic_bloc > /* Make sure new block ends up in correct hot/cold section. */ > > BB_COPY_PARTITION (jump_block, e->src); > - if (flag_reorder_blocks_and_partition > - && targetm_common.have_named_sections > - && JUMP_P (BB_END (jump_block)) > - && !any_condjump_p (BB_END (jump_block)) > - && (EDGE_SUCC (jump_block, 0)->flags & EDGE_CROSSING)) > - add_reg_note (BB_END (jump_block), REG_CROSSING_JUMP, NULL_RTX); > > /* Wire edge in. 
*/ > new_edge = make_edge (e->src, jump_block, EDGE_FALLTHRU); > @@ -1508,6 +1567,10 @@ force_nonfallthru_and_redirect (edge e, basic_bloc > redirect_edge_pred (e, jump_block); > e->probability = REG_BR_PROB_BASE; > > + /* If e->src was previously region crossing, it no longer is > + and the reg crossing note should be removed. */ > + fixup_partition_crossing (new_edge); > + > /* If asm goto has any label refs to target's label, > add also edge from asm goto bb to target. */ > if (asm_goto_edge) > @@ -1559,13 +1622,16 @@ force_nonfallthru_and_redirect (edge e, basic_bloc > LABEL_NUSES (label)++; > } > > - emit_barrier_after (BB_END (jump_block)); > + /* We might be in cfg layout mode, and if so, the following routine will > + insert the barrier correctly. */ > + emit_barrier_after_bb (jump_block); > redirect_edge_succ_nodup (e, target); > > if (abnormal_edge_flags) > make_edge (src, target, abnormal_edge_flags); > > df_mark_solutions_dirty (); > + fixup_partition_crossing (e); > return new_bb; > } > > @@ -1654,6 +1720,21 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU > return false; > } > > +/* Locate the last bb in the same partition as START_BB. */ > + > +static basic_block > +last_bb_in_partition (basic_block start_bb) > +{ > + basic_block bb; > + FOR_BB_BETWEEN (bb, start_bb, EXIT_BLOCK_PTR, next_bb) > + { > + if (BB_PARTITION (start_bb) != BB_PARTITION (bb->next_bb)) > + return bb; > + } > + /* Return bb before EXIT_BLOCK_PTR. */ > + return bb->prev_bb; > +} > + > /* Split a (typically critical) edge. Return the new block. > The edge must not be abnormal. > > @@ -1664,7 +1745,7 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU > static basic_block > rtl_split_edge (edge edge_in) > { > - basic_block bb; > + basic_block bb, new_bb; > rtx before; > > /* Abnormal edges cannot be split. */ > @@ -1696,13 +1777,50 @@ rtl_split_edge (edge edge_in) > } > else > { > - bb = create_basic_block (before, NULL, edge_in->dest->prev_bb); > - /* ??? 
Why not edge_in->dest->prev_bb here? */ > - BB_COPY_PARTITION (bb, edge_in->dest); > + if (edge_in->src == ENTRY_BLOCK_PTR) > + { > + bb = create_basic_block (before, NULL, edge_in->dest->prev_bb); > + BB_COPY_PARTITION (bb, edge_in->dest); > + } > + else > + { > + basic_block after = edge_in->dest->prev_bb; > + /* If this is post-bb reordering, and the edge crosses a partition > + boundary, the new block needs to be inserted in the bb chain > + at the end of the src partition (since we put the new bb into > + that partition, see below). Otherwise we may end up creating > + an extra partition crossing in the chain, which is illegal. > + It can't go after the src, because src may have a fall-through > + to a different block. */ > + if (crtl->bb_reorder_complete > + && (edge_in->flags & EDGE_CROSSING)) > + { > + after = last_bb_in_partition (edge_in->src); > + before = NEXT_INSN (BB_END (after)); > + /* The instruction following the last bb in partition should > + be a barrier, since it cannot end in a fall-through. */ > + gcc_checking_assert (BARRIER_P (before)); > + before = NEXT_INSN (before); > + } > + bb = create_basic_block (before, NULL, after); > + /* Put the split bb into the src partition, to avoid creating > + a situation where a cold bb dominates a hot bb, in the case > + where src is cold and dest is hot. The src will dominate > + the new bb (whereas it might not have dominated dest). */ > + BB_COPY_PARTITION (bb, edge_in->src); > + } > } > > make_single_succ_edge (bb, edge_in->dest, EDGE_FALLTHRU); > > + /* Can't allow a region crossing edge to be fallthrough. */ > + if (BB_PARTITION (bb) != BB_PARTITION (edge_in->dest) > + && edge_in->dest != EXIT_BLOCK_PTR) > + { > + new_bb = force_nonfallthru (single_succ_edge (bb)); > + gcc_assert (!new_bb); > + } > + > /* For non-fallthru edges, we must adjust the predecessor's > jump instruction to target our new block. 
*/ > if ((edge_in->flags & EDGE_FALLTHRU) == 0) > @@ -1815,17 +1933,13 @@ commit_one_edge_insertion (edge e) > else > { > bb = split_edge (e); > - after = BB_END (bb); > > - if (flag_reorder_blocks_and_partition > - && targetm_common.have_named_sections > - && e->src != ENTRY_BLOCK_PTR > - && BB_PARTITION (e->src) == BB_COLD_PARTITION > - && !(e->flags & EDGE_CROSSING) > - && JUMP_P (after) > - && !any_condjump_p (after) > - && (single_succ_edge (bb)->flags & EDGE_CROSSING)) > - add_reg_note (after, REG_CROSSING_JUMP, NULL_RTX); > + /* If E crossed a partition boundary, we needed to make bb end in > + a region-crossing jump, even though it was originally fallthru. */ > + if (JUMP_P (BB_END (bb))) > + before = BB_END (bb); > + else > + after = BB_END (bb); > } > > /* Now that we've found the spot, do the insertion. */ > @@ -2071,7 +2185,11 @@ verify_hot_cold_block_grouping (void) > bool switched_sections = false; > int current_partition = BB_UNPARTITIONED; > > - if (!crtl->bb_reorder_complete) > + /* Even after bb reordering is complete, we go into cfglayout mode > + again (in compgoto). Ensure we don't call this before going back > + into linearized RTL when any layout fixes would have been committed. 
*/ > + if (!crtl->bb_reorder_complete > + || current_ir_type() != IR_RTL_CFGRTL) > return err; > > FOR_EACH_BB (bb) > @@ -2116,6 +2234,7 @@ rtl_verify_edges (void) > edge e, fallthru = NULL; > edge_iterator ei; > rtx note; > + bool has_crossing_edge = false; > > if (JUMP_P (BB_END (bb)) > && (note = find_reg_note (BB_END (bb), REG_BR_PROB, NULL_RTX)) > @@ -2141,6 +2260,7 @@ rtl_verify_edges (void) > is_crossing = (BB_PARTITION (e->src) != BB_PARTITION (e->dest) > && e->src != ENTRY_BLOCK_PTR > && e->dest != EXIT_BLOCK_PTR); > + has_crossing_edge |= is_crossing; > if (e->flags & EDGE_CROSSING) > { > if (!is_crossing) > @@ -2160,6 +2280,13 @@ rtl_verify_edges (void) > e->src->index); > err = 1; > } > + if (JUMP_P (BB_END (bb)) > + && !find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX)) > + { > + error ("No region crossing jump at section boundary in bb %i", > + bb->index); > + err = 1; > + } > } > else if (is_crossing) > { > @@ -2188,6 +2315,15 @@ rtl_verify_edges (void) > n_abnormal++; > } > > + if (!has_crossing_edge > + && find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX)) > + { > + print_rtl_with_bb (stderr, get_insns (), TDF_RTL | > TDF_BLOCKS | TDF_DETAILS); > + error ("Region crossing jump across same section in bb %i", > + bb->index); > + err = 1; > + } > + > if (n_eh && !find_reg_note (BB_END (bb), REG_EH_REGION, NULL_RTX)) > { > error ("missing REG_EH_REGION note at the end of bb %i", bb->index); > @@ -2395,8 +2531,6 @@ rtl_verify_flow_info_1 (void) > > err |= rtl_verify_edges (); > > - err |= verify_hot_cold_block_grouping(); > - > return err; > } > > @@ -2642,6 +2776,8 @@ rtl_verify_flow_info (void) > > err |= rtl_verify_bb_layout (); > > + err |= verify_hot_cold_block_grouping (); > + > return err; > } > > @@ -3343,7 +3479,7 @@ fixup_reorder_chain (void) > edge e_fall, e_taken, e; > rtx bb_end_insn; > rtx ret_label = NULL_RTX; > - basic_block nb, src_bb; > + basic_block nb; > edge_iterator ei; > > if (EDGE_COUNT (bb->succs) == 0) > @@ 
-3478,7 +3614,6 @@ fixup_reorder_chain (void) > /* We got here if we need to add a new jump insn. > Note force_nonfallthru can delete E_FALL and thus we have to > save E_FALL->src prior to the call to force_nonfallthru. */ > - src_bb = e_fall->src; > nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest, ret_label); > if (nb) > { > @@ -3486,17 +3621,6 @@ fixup_reorder_chain (void) > bb->aux = nb; > /* Don't process this new block. */ > bb = nb; > - > - /* Make sure new bb is tagged for correct section (same as > - fall-thru source, since you cannot fall-thru across > - section boundaries). */ > - BB_COPY_PARTITION (src_bb, single_pred (bb)); > - if (flag_reorder_blocks_and_partition > - && targetm_common.have_named_sections > - && JUMP_P (BB_END (bb)) > - && !any_condjump_p (BB_END (bb)) > - && (EDGE_SUCC (bb, 0)->flags & EDGE_CROSSING)) > - add_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX); > } > } > > @@ -3796,10 +3920,11 @@ duplicate_insn_chain (rtx from, rtx to) > case NOTE_INSN_FUNCTION_BEG: > /* There is always just single entry to function. */ > case NOTE_INSN_BASIC_BLOCK: > + /* We should only switch text sections once. 
*/ > + case NOTE_INSN_SWITCH_TEXT_SECTIONS: > break; > > case NOTE_INSN_EPILOGUE_BEG: > - case NOTE_INSN_SWITCH_TEXT_SECTIONS: > emit_note_copy (insn); > break; > > @@ -4611,8 +4736,7 @@ rtl_can_remove_branch_p (const_edge e) > if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH)) > return false; > > - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) > - || BB_PARTITION (src) != BB_PARTITION (target)) > + if (BB_PARTITION (src) != BB_PARTITION (target)) > return false; > > if (!onlyjump_p (insn) > Index: basic-block.h > =================================================================== > --- basic-block.h (revision 199014) > +++ basic-block.h (working copy) > @@ -796,6 +796,7 @@ extern basic_block force_nonfallthru_and_redirect > extern bool contains_no_active_insn_p (const_basic_block); > extern bool forwarder_block_p (const_basic_block); > extern bool can_fallthru (basic_block, basic_block); > +extern void emit_barrier_after_bb (basic_block bb); > > /* In cfgbuild.c. */ > extern void find_many_sub_basic_blocks (sbitmap); > Index: testsuite/gcc.dg/tree-prof/va-arg-pack-1.c > =================================================================== > --- testsuite/gcc.dg/tree-prof/va-arg-pack-1.c (revision 0) > +++ testsuite/gcc.dg/tree-prof/va-arg-pack-1.c (revision 0) > @@ -0,0 +1,145 @@ > +/* __builtin_va_arg_pack () builtin tests. */ > +/* { dg-require-effective-target freorder } */ > +/* { dg-options "-O2 -freorder-blocks-and-partition" } */ > + > +#include <stdarg.h> > + > +extern void abort (void); > + > +int v1 = 8; > +long int v2 = 3; > +void *v3 = (void *) &v2; > +struct A { char c[16]; } v4 = { "foo" }; > +long double v5 = 40; > +char seen[20]; > +int cnt; > + > +__attribute__ ((noinline)) int > +foo1 (int x, int y, ...) 
> +{ > + int i; > + long int l; > + void *v; > + struct A a; > + long double ld; > + va_list ap; > + > + va_start (ap, y); > + if (x < 0 || x >= 20 || seen[x]) > + abort (); > + seen[x] = ++cnt; > + if (y != 6) > + abort (); > + i = va_arg (ap, int); > + if (i != 5) > + abort (); > + switch (x) > + { > + case 0: > + i = va_arg (ap, int); > + if (i != 9 || v1 != 9) > + abort (); > + a = va_arg (ap, struct A); > + if (__builtin_memcmp (a.c, v4.c, sizeof (a.c)) != 0) > + abort (); > + v = (void *) va_arg (ap, struct A *); > + if (v != (void *) &v4) > + abort (); > + l = va_arg (ap, long int); > + if (l != 3 || v2 != 4) > + abort (); > + break; > + case 1: > + ld = va_arg (ap, long double); > + if (ld != 41 || v5 != ld) > + abort (); > + i = va_arg (ap, int); > + if (i != 8) > + abort (); > + v = va_arg (ap, void *); > + if (v != &v2) > + abort (); > + break; > + case 2: > + break; > + default: > + abort (); > + } > + va_end (ap); > + return x; > +} > + > +__attribute__ ((noinline)) int > +foo2 (int x, int y, ...) 
> +{ > + long long int ll; > + void *v; > + struct A a, b; > + long double ld; > + va_list ap; > + > + va_start (ap, y); > + if (x < 0 || x >= 20 || seen[x]) > + abort (); > + seen[x] = ++cnt | 64; > + if (y != 10) > + abort (); > + switch (x) > + { > + case 11: > + break; > + case 12: > + ld = va_arg (ap, long double); > + if (ld != 41 || v5 != 40) > + abort (); > + a = va_arg (ap, struct A); > + if (__builtin_memcmp (a.c, v4.c, sizeof (a.c)) != 0) > + abort (); > + b = va_arg (ap, struct A); > + if (__builtin_memcmp (b.c, v4.c, sizeof (b.c)) != 0) > + abort (); > + v = va_arg (ap, void *); > + if (v != &v2) > + abort (); > + ll = va_arg (ap, long long int); > + if (ll != 16LL) > + abort (); > + break; > + case 2: > + break; > + default: > + abort (); > + } > + va_end (ap); > + return x + 8; > +} > + > +__attribute__ ((noinline)) int > +foo3 (void) > +{ > + return 6; > +} > + > +extern inline __attribute__ ((always_inline, gnu_inline)) int > +bar (int x, ...) > +{ > + if (x < 10) > + return foo1 (x, foo3 (), 5, __builtin_va_arg_pack ()); > + return foo2 (x, foo3 () + 4, __builtin_va_arg_pack ()); > +} > + > +int > +main (void) > +{ > + if (bar (0, ++v1, v4, &v4, v2++) != 0) > + abort (); > + if (bar (1, ++v5, 8, v3) != 1) > + abort (); > + if (bar (2) != 2) > + abort (); > + if (bar (v1 + 2) != 19) > + abort (); > + if (bar (v1 + 3, v5--, v4, v4, v3, 16LL) != 20) > + abort (); > + return 0; > +} > Index: testsuite/gcc.dg/tree-prof/comp-goto-1.c > =================================================================== > --- testsuite/gcc.dg/tree-prof/comp-goto-1.c (revision 0) > +++ testsuite/gcc.dg/tree-prof/comp-goto-1.c (revision 0) > @@ -0,0 +1,166 @@ > +/* { dg-require-effective-target freorder } */ > +/* { dg-options "-O2 -freorder-blocks-and-partition" } */ > +#include <stdlib.h> > + > +#if !defined(NO_LABEL_VALUES) && (!defined(STACK_SIZE) || STACK_SIZE >= 4000) && __INT_MAX__ >= 2147483647 > +typedef unsigned int uint32; > +typedef signed int sint32; > + > 
+typedef uint32 reg_t; > + > +typedef unsigned long int host_addr_t; > +typedef uint32 target_addr_t; > +typedef sint32 target_saddr_t; > + > +typedef union > +{ > + struct > + { > + unsigned int offset:18; > + unsigned int ignore:4; > + unsigned int s1:8; > + int :2; > + signed int simm:14; > + unsigned int s3:8; > + unsigned int s2:8; > + int pad2:2; > + } f1; > + long long ll; > + double d; > +} insn_t; > + > +typedef struct > +{ > + target_addr_t vaddr_tag; > + unsigned long int rigged_paddr; > +} tlb_entry_t; > + > +typedef struct > +{ > + insn_t *pc; > + reg_t registers[256]; > + insn_t *program; > + tlb_entry_t tlb_tab[0x100]; > +} environment_t; > + > +enum operations > +{ > + LOAD32_RR, > + METAOP_DONE > +}; > + > +host_addr_t > +f () > +{ > + abort (); > +} > + > +reg_t > +simulator_kernel (int what, environment_t *env) > +{ > + register insn_t *pc = env->pc; > + register reg_t *regs = env->registers; > + register insn_t insn; > + register int s1; > + register reg_t r2; > + register void *base_addr = &&sim_base_addr; > + register tlb_entry_t *tlb = env->tlb_tab; > + > + if (what != 0) > + { > + int i; > + static void *op_map[] = > + { > + &&L_LOAD32_RR, > + &&L_METAOP_DONE, > + }; > + insn_t *program = env->program; > + for (i = 0; i < what; i++) > + program[i].f1.offset = op_map[program[i].f1.offset] - base_addr; > + } > + > + sim_base_addr:; > + > + insn = *pc++; > + r2 = (*(reg_t *) (((char *) regs) + (insn.f1.s2 << 2))); > + s1 = (insn.f1.s1 << 2); > + goto *(base_addr + insn.f1.offset); > + > + L_LOAD32_RR: > + { > + target_addr_t vaddr_page = r2 / 4096; > + unsigned int x = vaddr_page % 0x100; > + insn = *pc++; > + > + for (;;) > + { > + target_addr_t tag = tlb[x].vaddr_tag; > + host_addr_t rigged_paddr = tlb[x].rigged_paddr; > + > + if (tag == vaddr_page) > + { > + *(reg_t *) (((char *) regs) + s1) = *(uint32 *) (rigged_paddr + r2); > + r2 = *(reg_t *) (((char *) regs) + (insn.f1.s2 << 2)); > + s1 = insn.f1.s1 << 2; > + goto *(base_addr + 
insn.f1.offset); > + } > + > + if (((target_saddr_t) tag < 0)) > + { > + *(reg_t *) (((char *) regs) + s1) = *(uint32 *) f (); > + r2 = *(reg_t *) (((char *) regs) + (insn.f1.s2 << 2)); > + s1 = insn.f1.s1 << 2; > + goto *(base_addr + insn.f1.offset); > + } > + > + x = (x - 1) % 0x100; > + } > + > + L_METAOP_DONE: > + return (*(reg_t *) (((char *) regs) + s1)); > + } > +} > + > +insn_t program[2 + 1]; > + > +void *malloc (); > + > +int > +main () > +{ > + environment_t env; > + insn_t insn; > + int i, res; > + host_addr_t a_page = (host_addr_t) malloc (2 * 4096); > + target_addr_t a_vaddr = 0x123450; > + target_addr_t vaddr_page = a_vaddr / 4096; > + a_page = (a_page + 4096 - 1) & -4096; > + > + env.tlb_tab[((vaddr_page) % 0x100)].vaddr_tag = vaddr_page; > + env.tlb_tab[((vaddr_page) % 0x100)].rigged_paddr = a_page - > vaddr_page * 4096; > + insn.f1.offset = LOAD32_RR; > + env.registers[0] = 0; > + env.registers[2] = a_vaddr; > + *(sint32 *) (a_page + a_vaddr % 4096) = 88; > + insn.f1.s1 = 0; > + insn.f1.s2 = 2; > + > + for (i = 0; i < 2; i++) > + program[i] = insn; > + > + insn.f1.offset = METAOP_DONE; > + insn.f1.s1 = 0; > + program[2] = insn; > + > + env.pc = program; > + env.program = program; > + > + res = simulator_kernel (2 + 1, &env); > + > + if (res != 88) > + abort (); > + exit (0); > +} > +#else > +main(){ exit (0); } > +#endif > Index: testsuite/gcc.dg/tree-prof/pr52027.c > =================================================================== > --- testsuite/gcc.dg/tree-prof/pr52027.c (revision 199014) > +++ testsuite/gcc.dg/tree-prof/pr52027.c (working copy) > @@ -1,6 +1,6 @@ > /* PR debug/52027 */ > /* { dg-require-effective-target freorder } */ > -/* { dg-options "-O -freorder-blocks-and-partition -fno-reorder-functions" } */ > +/* { dg-options "-O2 -freorder-blocks-and-partition > -fno-reorder-functions" } */ > > void > foo (int len) > Index: testsuite/gcc.dg/tree-prof/pr50907.c > =================================================================== > 
--- testsuite/gcc.dg/tree-prof/pr50907.c (revision 199014) > +++ testsuite/gcc.dg/tree-prof/pr50907.c (working copy) > @@ -1,5 +1,5 @@ > /* PR middle-end/50907 */ > /* { dg-require-effective-target freorder } */ > -/* { dg-options "-O -freorder-blocks-and-partition -fschedule-insns > -fselective-scheduling -fpic" { target { { powerpc*-*-* ia64-*-* > x86_64-*-* } && fpic } } } */ > +/* { dg-options "-O2 -freorder-blocks-and-partition -fschedule-insns > -fselective-scheduling -fpic" { target { { powerpc*-*-* ia64-*-* > x86_64-*-* } && fpic } } } */ > > #include "pr45354.c" > Index: testsuite/gcc.dg/tree-prof/pr45354.c > =================================================================== > --- testsuite/gcc.dg/tree-prof/pr45354.c (revision 199014) > +++ testsuite/gcc.dg/tree-prof/pr45354.c (working copy) > @@ -1,5 +1,5 @@ > /* { dg-require-effective-target freorder } */ > -/* { dg-options "-O -freorder-blocks-and-partition -fschedule-insns > -fselective-scheduling" { target powerpc*-*-* ia64-*-* x86_64-*-* } } > */ > +/* { dg-options "-O2 -freorder-blocks-and-partition -fschedule-insns > -fselective-scheduling" { target powerpc*-*-* ia64-*-* x86_64-*-* } } > */ > > extern void abort (void); > > Index: testsuite/gcc.dg/tree-prof/20041218-1.c > =================================================================== > --- testsuite/gcc.dg/tree-prof/20041218-1.c (revision 0) > +++ testsuite/gcc.dg/tree-prof/20041218-1.c (revision 0) > @@ -0,0 +1,119 @@ > +/* PR rtl-optimization/16968 */ > +/* Testcase by Jakub Jelinek <jakub@redhat.com> */ > +/* { dg-require-effective-target freorder } */ > +/* { dg-options "-O2 -freorder-blocks-and-partition" } */ > + > +struct T > +{ > + unsigned int b, c, *d; > + unsigned char e; > +}; > +struct S > +{ > + unsigned int a; > + struct T f; > +}; > +struct U > +{ > + struct S g, h; > +}; > +struct V > +{ > + unsigned int i; > + struct U j; > +}; > + > +extern void exit (int); > +extern void abort (void); > + > +void * > +dummy1 (void *x) > +{ 
> + return ""; > +} > + > +void * > +dummy2 (void *x, void *y) > +{ > + exit (0); > +} > + > +struct V * > +baz (unsigned int x) > +{ > + static struct V v; > + __builtin_memset (&v, 0x55, sizeof (v)); > + return &v; > +} > + > +int > +check (void *x, struct S *y) > +{ > + if (y->a || y->f.b || y->f.c || y->f.d || y->f.e) > + abort (); > + return 1; > +} > + > +static struct V * > +bar (unsigned int x, void *y) > +{ > + const struct T t = { 0, 0, (void *) 0, 0 }; > + struct V *u; > + void *v; > + v = dummy1 (y); > + if (!v) > + return (void *) 0; > + > + u = baz (sizeof (struct V)); > + u->i = x; > + u->j.g.a = 0; > + u->j.g.f = t; > + u->j.h.a = 0; > + u->j.h.f = t; > + > + if (!check (v, &u->j.g) || !check (v, &u->j.h)) > + return (void *) 0; > + return u; > +} > + > +int > +foo (unsigned int *x, unsigned int y, void **z) > +{ > + void *v; > + unsigned int i, j; > + > + *z = v = (void *) 0; > + > + for (i = 0; i < y; i++) > + { > + struct V *c; > + > + j = *x; > + > + switch (j) > + { > + case 1: > + c = bar (j, x); > + break; > + default: > + c = 0; > + break; > + } > + if (c) > + v = dummy2 (v, c); > + else > + return 1; > + } > + > + *z = v; > + return 0; > +} > + > +int > +main (void) > +{ > + unsigned int one = 1; > + void *p; > + foo (&one, 1, &p); > + abort (); > +} > Index: testsuite/g++.dg/tree-prof/partition2.C > =================================================================== > --- testsuite/g++.dg/tree-prof/partition2.C (revision 199014) > +++ testsuite/g++.dg/tree-prof/partition2.C (working copy) > @@ -1,6 +1,6 @@ > // PR middle-end/45458 > // { dg-require-effective-target freorder } > -// { dg-options "-fnon-call-exceptions -freorder-blocks-and-partition" } > +// { dg-options "-O2 -fnon-call-exceptions -freorder-blocks-and-partition" } > > int > main () > Index: testsuite/g++.dg/tree-prof/partition3.C > =================================================================== > --- testsuite/g++.dg/tree-prof/partition3.C (revision 199014) > +++ 
testsuite/g++.dg/tree-prof/partition3.C (working copy) > @@ -1,6 +1,6 @@ > // PR middle-end/45566 > // { dg-require-effective-target freorder } > -// { dg-options "-O -fnon-call-exceptions -freorder-blocks-and-partition" } > +// { dg-options "-O2 -fnon-call-exceptions -freorder-blocks-and-partition" } > > int k; > > > > > -- > Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413 -- Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
On Wed, May 29, 2013 at 7:57 AM, Teresa Johnson <tejohnson@google.com> wrote:
> On Thu, May 23, 2013 at 6:18 AM, Teresa Johnson <tejohnson@google.com> wrote:
>> On Wed, May 22, 2013 at 2:05 PM, Teresa Johnson <tejohnson@google.com> wrote:
>>> Revised patch included below. The spacing of my pasted-in patch text
>>> looks funky again; let me know if you want the patch as an attachment
>>> instead.
>>>
>>> I addressed all of Steven's comments, except for the suggestion to use
>>> gcc_assert instead of error() in verify_hot_cold_block_grouping(), to keep
>>> this consistent with the rest of the verify_flow_info subroutines (let me
>>> know if this is ok).
>>
>> I fixed this issue too, which was actually in
>> insert_section_boundary_note(), so that it gcc_asserts more
>> efficiently as suggested. Retested, latest patch below.
>>
>> Honza, would you be able to review the patch?
>
> Ping. Still needs a global maintainer to review and approve.

Ping.

Thanks!
Teresa

>
> Also, I submitted a PR for the debug range issue:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57451
>
> Thanks!
> Teresa
>
>>
>> Thanks!
>> Teresa
>>
>>>
>>> The other main changes:
>>> (1) Added several test cases (cloned from the torture subdirectories,
>>> where I manually built/ran with FDO and -freorder-blocks-and-partition
>>> with both the current trunk and my fixed trunk compiler, and was able to
>>> expose some failures I fixed).
>>> (2) Changed existing tree-prof tests that used
>>> -freorder-blocks-and-partition to be built with -O2 instead of -O, so
>>> that partitioning actually kicks in.
>>> (3) Fixed a couple of failures in the new
>>> verify_hot_cold_block_grouping() checks exposed by the torture tests I
>>> ran manually with splitting (2 of the tests cloned to tree-prof in this
>>> patch).
>>> One was in computed goto where we were too aggressive
>>> about cloning crossing edges, and the other was in rtl_split_edge
>>> called from the "stack" pass which was not correctly inserting the new
>>> bb in the correct partition since bb layout is complete at that point.
>>>
>>> Re-tested on x86_64-unknown-linux-gnu with bootstrap and profiledbootstrap
>>> builds and regression testing. Re-built/ran cpu2006int with profile
>>> feedback and -freorder-blocks-and-partition enabled.
>>>
>>> Ok for trunk?
>>>
>>> Thanks!
>>> Teresa
>>
>> 2013-05-23 Teresa Johnson <tejohnson@google.com>
>>
>> * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
>>   as this is now done by redirect_edge_and_branch_force.
>> * function.c (thread_prologue_and_epilogue_insns): Insert new bb after
>>   barriers, and fix interaction with splitting.
>> * emit-rtl.c (try_split): Copy REG_CROSSING_JUMP notes.
>> * cfgcleanup.c (try_forward_edges): Fix early return value to properly
>>   reflect changes made in the routine.
>> * bb-reorder.c (emit_barrier_after_bb): Move to cfgrtl.c.
>>   (fix_up_fall_thru_edges): Remove incorrect check for bb layout order
>>   since this is called in cfglayout mode, and replace partition fixup
>>   with assert as that is now done by force_nonfallthru_and_redirect.
>>   (add_reg_crossing_jump_notes): Handle the fact that some jumps may
>>   already be marked with region crossing note.
>>   (insert_section_boundary_note): Make non-static, gate on flag
>>   has_bb_partition, rewrite to also check for multiple partitions.
>>   (rest_of_handle_reorder_blocks): Remove call to
>>   insert_section_boundary_note, now done later during free_cfg.
>>   (duplicate_computed_gotos): Don't duplicate partition crossing edge.
>> * bb-reorder.h (insert_section_boundary_note): Declare.
>> * Makefile.in (cfgrtl.o): Depend on bb-reorder.h
>> * cfgrtl.c (rest_of_pass_free_cfg): If partitions exist
>>   invoke insert_section_boundary_note.
>>   (try_redirect_by_replacing_jump): Remove unnecessary
>>   check for region crossing note.
>>   (fixup_partition_crossing): New function.
>>   (rtl_redirect_edge_and_branch): Fixup partition boundaries.
>>   (emit_barrier_after_bb): Move here from bb-reorder.c, handle insertion
>>   in non-cfglayout mode.
>>   (force_nonfallthru_and_redirect): Fixup partition boundaries,
>>   remove old code that tried to do this. Emit barrier correctly
>>   when we are in cfglayout mode.
>>   (last_bb_in_partition): New function.
>>   (rtl_split_edge): Correctly fixup partition boundaries.
>>   (commit_one_edge_insertion): Remove old code that tried to
>>   fixup region crossing edge since this is now handled in
>>   split_block, and set up insertion point correctly since
>>   block may now end in a jump.
>>   (verify_hot_cold_block_grouping): Guard against checking when not in
>>   linearized RTL mode.
>>   (rtl_verify_edges): Add checks for incorrect/missing REG_CROSSING_JUMP
>>   notes.
>>   (rtl_verify_flow_info_1): Move verify_hot_cold_block_grouping to
>>   rtl_verify_flow_info, so not called in cfglayout mode.
>>   (rtl_verify_flow_info): Move verify_hot_cold_block_grouping here.
>>   (fixup_reorder_chain): Remove old code that attempted to fixup region
>>   crossing note as this is now handled in force_nonfallthru_and_redirect.
>>   (duplicate_insn_chain): Don't duplicate switch section notes.
>>   (rtl_can_remove_branch_p): Remove unnecessary check for region crossing
>>   note.
>> * basic-block.h (emit_barrier_after_bb): Declare.
>> * testsuite/gcc.dg/tree-prof/va-arg-pack-1.c: Cloned from c-torture, made
>>   into -freorder-blocks-and-partition test.
>> * testsuite/gcc.dg/tree-prof/comp-goto-1.c: Ditto.
>> * testsuite/gcc.dg/tree-prof/20041218-1.c: Ditto.
>> * testsuite/gcc.dg/tree-prof/pr52027.c: Use -O2.
>> * testsuite/gcc.dg/tree-prof/pr50907.c: Ditto.
>> * testsuite/gcc.dg/tree-prof/pr45354.c: Ditto.
>> * testsuite/g++.dg/tree-prof/partition2.C: Ditto.
>> * testsuite/g++.dg/tree-prof/partition3.C: Ditto.
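For readers skimming the ChangeLog: the "check for multiple partitions" item for insert_section_boundary_note amounts to an invariant on the final block layout, namely that the hot/cold partition may change at most once when walking blocks in order. The following is only a minimal standalone sketch of that idea, not the GCC code. The `enum partition` type and the `count_partition_switches` helper are hypothetical stand-ins for BB_PARTITION and the FOR_EACH_BB walk.

```c
#include <stddef.h>

/* Hypothetical stand-ins for GCC's partition values; PART_NONE plays
   the role of "no partition seen yet".  */
enum partition { PART_NONE = 0, PART_HOT = 1, PART_COLD = 2 };

/* Walk LAYOUT (N blocks in final layout order) and count how many times
   the partition changes between consecutive blocks.  *SWITCH_AT receives
   the index of the first block after the first change, or N if the
   partition never changes.  A well-formed layout yields at most 1.  */
int
count_partition_switches (const enum partition *layout, size_t n,
                          size_t *switch_at)
{
  int switches = 0;
  enum partition current = PART_NONE;

  *switch_at = n;
  for (size_t i = 0; i < n; i++)
    {
      /* The first block seen fixes the starting partition.  */
      if (current == PART_NONE)
        current = layout[i];
      if (layout[i] != current)
        {
          if (switches == 0)
            *switch_at = i;   /* Where a section-switch note would go.  */
          switches++;
          current = layout[i];
        }
    }
  return switches;
}
```

In this sketch a caller would treat a return value greater than 1 as a layout error, which is the role the gcc_assert plays in the patch; the index written to `*switch_at` corresponds to the block before which the single section-switch note is emitted.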
>> >> Index: ifcvt.c >> =================================================================== >> --- ifcvt.c (revision 199014) >> +++ ifcvt.c (working copy) >> @@ -3905,10 +3905,9 @@ find_if_case_1 (basic_block test_bb, edge then_edg >> if (new_bb) >> { >> df_bb_replace (then_bb_index, new_bb); >> - /* Since the fallthru edge was redirected from test_bb to new_bb, >> - we need to ensure that new_bb is in the same partition as >> - test bb (you can not fall through across section boundaries). */ >> - BB_COPY_PARTITION (new_bb, test_bb); >> + /* This should have been done above via force_nonfallthru_and_redirect >> + (possibly called from redirect_edge_and_branch_force). */ >> + gcc_checking_assert (BB_PARTITION (new_bb) == BB_PARTITION (test_bb)); >> } >> >> num_true_changes++; >> Index: function.c >> =================================================================== >> --- function.c (revision 199014) >> +++ function.c (working copy) >> @@ -6270,8 +6270,10 @@ thread_prologue_and_epilogue_insns (void) >> break; >> if (e) >> { >> - copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)), >> - NULL_RTX, e->src); >> + /* Make sure we insert after any barriers. */ >> + rtx end = get_last_bb_insn (e->src); >> + copy_bb = create_basic_block (NEXT_INSN (end), >> + NULL_RTX, e->src); >> BB_COPY_PARTITION (copy_bb, e->src); >> } >> else >> @@ -6538,7 +6540,7 @@ epilogue_done: >> basic_block simple_return_block_cold = NULL; >> edge pending_edge_hot = NULL; >> edge pending_edge_cold = NULL; >> - basic_block exit_pred = EXIT_BLOCK_PTR->prev_bb; >> + basic_block exit_pred; >> int i; >> >> gcc_assert (entry_edge != orig_entry_edge); >> @@ -6566,6 +6568,12 @@ epilogue_done: >> else >> pending_edge_cold = e; >> } >> + >> + /* Save a pointer to the exit's predecessor BB for use in >> + inserting new BBs at the end of the function. Do this >> + after the call to split_block above which may split >> + the original exit pred. 
*/ >> + exit_pred = EXIT_BLOCK_PTR->prev_bb; >> >> FOR_EACH_VEC_ELT (unconverted_simple_returns, i, e) >> { >> Index: emit-rtl.c >> =================================================================== >> --- emit-rtl.c (revision 199014) >> +++ emit-rtl.c (working copy) >> @@ -3574,6 +3574,7 @@ try_split (rtx pat, rtx trial, int last) >> break; >> >> case REG_NON_LOCAL_GOTO: >> + case REG_CROSSING_JUMP: >> for (insn = insn_last; insn != NULL_RTX; insn = PREV_INSN (insn)) >> { >> if (JUMP_P (insn)) >> Index: cfgcleanup.c >> =================================================================== >> --- cfgcleanup.c (revision 199014) >> +++ cfgcleanup.c (working copy) >> @@ -456,7 +456,7 @@ try_forward_edges (int mode, basic_block b) >> >> if (first != EXIT_BLOCK_PTR >> && find_reg_note (BB_END (first), REG_CROSSING_JUMP, NULL_RTX)) >> - return false; >> + return changed; >> >> while (counter < n_basic_blocks) >> { >> Index: bb-reorder.c >> =================================================================== >> --- bb-reorder.c (revision 199014) >> +++ bb-reorder.c (working copy) >> @@ -1380,15 +1380,6 @@ get_uncond_jump_length (void) >> return length; >> } >> >> -/* Emit a barrier into the footer of BB. */ >> - >> -static void >> -emit_barrier_after_bb (basic_block bb) >> -{ >> - rtx barrier = emit_barrier_after (BB_END (bb)); >> - BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); >> -} >> - >> /* The landing pad OLD_LP, in block OLD_BB, has edges from both partitions. >> Duplicate the landing pad and split the edges so that no EH edge >> crosses partitions. */ >> @@ -1720,8 +1711,7 @@ fix_up_fall_thru_edges (void) >> (i.e. fix it so the fall through does not cross and >> the cond jump does). */ >> >> - if (!cond_jump_crosses >> - && cur_bb->aux == cond_jump->dest) >> + if (!cond_jump_crosses) >> { >> /* Find label in fall_thru block. We've already added >> any missing labels, so there must be one. 
*/ >> @@ -1765,10 +1755,10 @@ fix_up_fall_thru_edges (void) >> new_bb->aux = cur_bb->aux; >> cur_bb->aux = new_bb; >> >> - /* Make sure new fall-through bb is in same >> - partition as bb it's falling through from. */ >> + /* This is done by force_nonfallthru_and_redirect. */ >> + gcc_assert (BB_PARTITION (new_bb) >> + == BB_PARTITION (cur_bb)); >> >> - BB_COPY_PARTITION (new_bb, cur_bb); >> single_succ_edge (new_bb)->flags |= EDGE_CROSSING; >> } >> else >> @@ -2064,7 +2054,10 @@ add_reg_crossing_jump_notes (void) >> FOR_EACH_BB (bb) >> FOR_EACH_EDGE (e, ei, bb->succs) >> if ((e->flags & EDGE_CROSSING) >> - && JUMP_P (BB_END (e->src))) >> + && JUMP_P (BB_END (e->src)) >> + /* Some notes were added during fix_up_fall_thru_edges, via >> + force_nonfallthru_and_redirect. */ >> + && !find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX)) >> add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); >> } >> >> @@ -2133,23 +2126,26 @@ reorder_basic_blocks (void) >> encountering this note will make the compiler switch between the >> hot and cold text sections. 
*/ >> >> -static void >> +void >> insert_section_boundary_note (void) >> { >> basic_block bb; >> - int first_partition = 0; >> + bool switched_sections = false; >> + int current_partition = 0; >> >> - if (!flag_reorder_blocks_and_partition) >> + if (!crtl->has_bb_partition) >> return; >> >> FOR_EACH_BB (bb) >> { >> - if (!first_partition) >> - first_partition = BB_PARTITION (bb); >> - if (BB_PARTITION (bb) != first_partition) >> + if (!current_partition) >> + current_partition = BB_PARTITION (bb); >> + if (BB_PARTITION (bb) != current_partition) >> { >> - emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb)); >> - break; >> + gcc_assert (!switched_sections); >> + switched_sections = true; >> + emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb)); >> + current_partition = BB_PARTITION (bb); >> } >> } >> } >> @@ -2180,8 +2176,6 @@ rest_of_handle_reorder_blocks (void) >> bb->aux = bb->next_bb; >> cfg_layout_finalize (); >> >> - /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes. */ >> - insert_section_boundary_note (); >> return 0; >> } >> >> @@ -2315,6 +2309,11 @@ duplicate_computed_gotos (void) >> if (!bitmap_bit_p (candidates, single_succ (bb)->index)) >> continue; >> >> + /* Don't duplicate a partition crossing edge, which requires difficult >> + fixup. 
*/ >> + if (find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX)) >> + continue; >> + >> new_bb = duplicate_block (single_succ (bb), single_succ_edge (bb), bb); >> new_bb->aux = bb->aux; >> bb->aux = new_bb; >> Index: bb-reorder.h >> =================================================================== >> --- bb-reorder.h (revision 199014) >> +++ bb-reorder.h (working copy) >> @@ -35,4 +35,6 @@ extern struct target_bb_reorder *this_target_bb_re >> >> extern int get_uncond_jump_length (void); >> >> +extern void insert_section_boundary_note (void); >> + >> #endif >> Index: Makefile.in >> =================================================================== >> --- Makefile.in (revision 199014) >> +++ Makefile.in (working copy) >> @@ -3151,7 +3151,7 @@ cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) corety >> $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) $(INSN_ATTR_H) \ >> insn-config.h $(EXPR_H) \ >> $(CFGLOOP_H) $(OBSTACK_H) $(TARGET_H) $(TREE_H) \ >> - $(TREE_PASS_H) $(DF_H) $(GGC_H) $(COMMON_TARGET_H) gt-cfgrtl.h >> + $(TREE_PASS_H) $(DF_H) $(GGC_H) $(COMMON_TARGET_H) gt-cfgrtl.h bb-reorder.h >> cfganal.o : cfganal.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(BASIC_BLOCK_H) \ >> $(TIMEVAR_H) sbitmap.h $(BITMAP_H) >> cfgbuild.o : cfgbuild.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \ >> Index: cfgrtl.c >> =================================================================== >> --- cfgrtl.c (revision 199014) >> +++ cfgrtl.c (working copy) >> @@ -44,6 +44,7 @@ along with GCC; see the file COPYING3. If not see >> #include "tree.h" >> #include "hard-reg-set.h" >> #include "basic-block.h" >> +#include "bb-reorder.h" >> #include "regs.h" >> #include "flags.h" >> #include "function.h" >> @@ -451,6 +452,9 @@ rest_of_pass_free_cfg (void) >> } >> #endif >> >> + if (crtl->has_bb_partition) >> + insert_section_boundary_note (); >> + >> free_bb_for_insn (); >> return 0; >> } >> @@ -981,8 +985,7 @@ try_redirect_by_replacing_jump (edge e, basic_bloc >> partition boundaries). 
See the comments at the top of >> bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ >> >> - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) >> - || BB_PARTITION (src) != BB_PARTITION (target)) >> + if (BB_PARTITION (src) != BB_PARTITION (target)) >> return NULL; >> >> /* We can replace or remove a complex jump only when we have exactly >> @@ -1291,6 +1294,53 @@ redirect_branch_edge (edge e, basic_block target) >> return e; >> } >> >> +/* Called when edge E has been redirected to a new destination, >> + in order to update the region crossing flag on the edge and >> + jump. */ >> + >> +static void >> +fixup_partition_crossing (edge e) >> +{ >> + rtx note; >> + >> + if (e->src == ENTRY_BLOCK_PTR || e->dest == EXIT_BLOCK_PTR) >> + return; >> + /* If we redirected an existing edge, it may already be marked >> + crossing, even though the new src is missing a reg crossing note. >> + But make sure reg crossing note doesn't already exist before >> + inserting. */ >> + if (BB_PARTITION (e->src) != BB_PARTITION (e->dest)) >> + { >> + e->flags |= EDGE_CROSSING; >> + note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); >> + if (JUMP_P (BB_END (e->src)) >> + && !note) >> + add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); >> + } >> + else if (BB_PARTITION (e->src) == BB_PARTITION (e->dest)) >> + { >> + e->flags &= ~EDGE_CROSSING; >> + /* Remove the section crossing note from jump at end of >> + src if it exists, and if no other successors are >> + still crossing. 
*/ >> + note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); >> + if (note) >> + { >> + bool has_crossing_succ = false; >> + edge e2; >> + edge_iterator ei; >> + FOR_EACH_EDGE (e2, ei, e->src->succs) >> + { >> + has_crossing_succ |= (e2->flags & EDGE_CROSSING); >> + if (has_crossing_succ) >> + break; >> + } >> + if (!has_crossing_succ) >> + remove_note (BB_END (e->src), note); >> + } >> + } >> +} >> + >> /* Attempt to change code to redirect edge E to TARGET. Don't do that on >> expense of adding new instructions or reordering basic blocks. >> >> @@ -1307,16 +1357,18 @@ rtl_redirect_edge_and_branch (edge e, basic_block >> { >> edge ret; >> basic_block src = e->src; >> + basic_block dest = e->dest; >> >> if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH)) >> return NULL; >> >> - if (e->dest == target) >> + if (dest == target) >> return e; >> >> if ((ret = try_redirect_by_replacing_jump (e, target, false)) != NULL) >> { >> df_set_bb_dirty (src); >> + fixup_partition_crossing (ret); >> return ret; >> } >> >> @@ -1325,9 +1377,22 @@ rtl_redirect_edge_and_branch (edge e, basic_block >> return NULL; >> >> df_set_bb_dirty (src); >> + fixup_partition_crossing (ret); >> return ret; >> } >> >> +/* Emit a barrier after BB, into the footer if we are in CFGLAYOUT mode. */ >> + >> +void >> +emit_barrier_after_bb (basic_block bb) >> +{ >> + rtx barrier = emit_barrier_after (BB_END (bb)); >> + gcc_assert (current_ir_type() == IR_RTL_CFGRTL >> + || current_ir_type () == IR_RTL_CFGLAYOUT); >> + if (current_ir_type () == IR_RTL_CFGLAYOUT) >> + BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); >> +} >> + >> /* Like force_nonfallthru below, but additionally performs redirection >> Used by redirect_edge_and_branch_force. JUMP_LABEL is used only >> when redirecting to the EXIT_BLOCK, it is either ret_rtx or >> @@ -1492,12 +1557,6 @@ force_nonfallthru_and_redirect (edge e, basic_bloc >> /* Make sure new block ends up in correct hot/cold section. 
*/ >> >> BB_COPY_PARTITION (jump_block, e->src); >> - if (flag_reorder_blocks_and_partition >> - && targetm_common.have_named_sections >> - && JUMP_P (BB_END (jump_block)) >> - && !any_condjump_p (BB_END (jump_block)) >> - && (EDGE_SUCC (jump_block, 0)->flags & EDGE_CROSSING)) >> - add_reg_note (BB_END (jump_block), REG_CROSSING_JUMP, NULL_RTX); >> >> /* Wire edge in. */ >> new_edge = make_edge (e->src, jump_block, EDGE_FALLTHRU); >> @@ -1508,6 +1567,10 @@ force_nonfallthru_and_redirect (edge e, basic_bloc >> redirect_edge_pred (e, jump_block); >> e->probability = REG_BR_PROB_BASE; >> >> + /* If e->src was previously region crossing, it no longer is >> + and the reg crossing note should be removed. */ >> + fixup_partition_crossing (new_edge); >> + >> /* If asm goto has any label refs to target's label, >> add also edge from asm goto bb to target. */ >> if (asm_goto_edge) >> @@ -1559,13 +1622,16 @@ force_nonfallthru_and_redirect (edge e, basic_bloc >> LABEL_NUSES (label)++; >> } >> >> - emit_barrier_after (BB_END (jump_block)); >> + /* We might be in cfg layout mode, and if so, the following routine will >> + insert the barrier correctly. */ >> + emit_barrier_after_bb (jump_block); >> redirect_edge_succ_nodup (e, target); >> >> if (abnormal_edge_flags) >> make_edge (src, target, abnormal_edge_flags); >> >> df_mark_solutions_dirty (); >> + fixup_partition_crossing (e); >> return new_bb; >> } >> >> @@ -1654,6 +1720,21 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU >> return false; >> } >> >> +/* Locate the last bb in the same partition as START_BB. */ >> + >> +static basic_block >> +last_bb_in_partition (basic_block start_bb) >> +{ >> + basic_block bb; >> + FOR_BB_BETWEEN (bb, start_bb, EXIT_BLOCK_PTR, next_bb) >> + { >> + if (BB_PARTITION (start_bb) != BB_PARTITION (bb->next_bb)) >> + return bb; >> + } >> + /* Return bb before EXIT_BLOCK_PTR. */ >> + return bb->prev_bb; >> +} >> + >> /* Split a (typically critical) edge. Return the new block. 
>> The edge must not be abnormal. >> >> @@ -1664,7 +1745,7 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU >> static basic_block >> rtl_split_edge (edge edge_in) >> { >> - basic_block bb; >> + basic_block bb, new_bb; >> rtx before; >> >> /* Abnormal edges cannot be split. */ >> @@ -1696,13 +1777,50 @@ rtl_split_edge (edge edge_in) >> } >> else >> { >> - bb = create_basic_block (before, NULL, edge_in->dest->prev_bb); >> - /* ??? Why not edge_in->dest->prev_bb here? */ >> - BB_COPY_PARTITION (bb, edge_in->dest); >> + if (edge_in->src == ENTRY_BLOCK_PTR) >> + { >> + bb = create_basic_block (before, NULL, edge_in->dest->prev_bb); >> + BB_COPY_PARTITION (bb, edge_in->dest); >> + } >> + else >> + { >> + basic_block after = edge_in->dest->prev_bb; >> + /* If this is post-bb reordering, and the edge crosses a partition >> + boundary, the new block needs to be inserted in the bb chain >> + at the end of the src partition (since we put the new bb into >> + that partition, see below). Otherwise we may end up creating >> + an extra partition crossing in the chain, which is illegal. >> + It can't go after the src, because src may have a fall-through >> + to a different block. */ >> + if (crtl->bb_reorder_complete >> + && (edge_in->flags & EDGE_CROSSING)) >> + { >> + after = last_bb_in_partition (edge_in->src); >> + before = NEXT_INSN (BB_END (after)); >> + /* The instruction following the last bb in partition should >> + be a barrier, since it cannot end in a fall-through. */ >> + gcc_checking_assert (BARRIER_P (before)); >> + before = NEXT_INSN (before); >> + } >> + bb = create_basic_block (before, NULL, after); >> + /* Put the split bb into the src partition, to avoid creating >> + a situation where a cold bb dominates a hot bb, in the case >> + where src is cold and dest is hot. The src will dominate >> + the new bb (whereas it might not have dominated dest). 
*/ >> + BB_COPY_PARTITION (bb, edge_in->src); >> + } >> } >> >> make_single_succ_edge (bb, edge_in->dest, EDGE_FALLTHRU); >> >> + /* Can't allow a region crossing edge to be fallthrough. */ >> + if (BB_PARTITION (bb) != BB_PARTITION (edge_in->dest) >> + && edge_in->dest != EXIT_BLOCK_PTR) >> + { >> + new_bb = force_nonfallthru (single_succ_edge (bb)); >> + gcc_assert (!new_bb); >> + } >> + >> /* For non-fallthru edges, we must adjust the predecessor's >> jump instruction to target our new block. */ >> if ((edge_in->flags & EDGE_FALLTHRU) == 0) >> @@ -1815,17 +1933,13 @@ commit_one_edge_insertion (edge e) >> else >> { >> bb = split_edge (e); >> - after = BB_END (bb); >> >> - if (flag_reorder_blocks_and_partition >> - && targetm_common.have_named_sections >> - && e->src != ENTRY_BLOCK_PTR >> - && BB_PARTITION (e->src) == BB_COLD_PARTITION >> - && !(e->flags & EDGE_CROSSING) >> - && JUMP_P (after) >> - && !any_condjump_p (after) >> - && (single_succ_edge (bb)->flags & EDGE_CROSSING)) >> - add_reg_note (after, REG_CROSSING_JUMP, NULL_RTX); >> + /* If E crossed a partition boundary, we needed to make bb end in >> + a region-crossing jump, even though it was originally fallthru. */ >> + if (JUMP_P (BB_END (bb))) >> + before = BB_END (bb); >> + else >> + after = BB_END (bb); >> } >> >> /* Now that we've found the spot, do the insertion. */ >> @@ -2071,7 +2185,11 @@ verify_hot_cold_block_grouping (void) >> bool switched_sections = false; >> int current_partition = BB_UNPARTITIONED; >> >> - if (!crtl->bb_reorder_complete) >> + /* Even after bb reordering is complete, we go into cfglayout mode >> + again (in compgoto). Ensure we don't call this before going back >> + into linearized RTL when any layout fixes would have been committed. 
*/ >> + if (!crtl->bb_reorder_complete >> + || current_ir_type() != IR_RTL_CFGRTL) >> return err; >> >> FOR_EACH_BB (bb) >> @@ -2116,6 +2234,7 @@ rtl_verify_edges (void) >> edge e, fallthru = NULL; >> edge_iterator ei; >> rtx note; >> + bool has_crossing_edge = false; >> >> if (JUMP_P (BB_END (bb)) >> && (note = find_reg_note (BB_END (bb), REG_BR_PROB, NULL_RTX)) >> @@ -2141,6 +2260,7 @@ rtl_verify_edges (void) >> is_crossing = (BB_PARTITION (e->src) != BB_PARTITION (e->dest) >> && e->src != ENTRY_BLOCK_PTR >> && e->dest != EXIT_BLOCK_PTR); >> + has_crossing_edge |= is_crossing; >> if (e->flags & EDGE_CROSSING) >> { >> if (!is_crossing) >> @@ -2160,6 +2280,13 @@ rtl_verify_edges (void) >> e->src->index); >> err = 1; >> } >> + if (JUMP_P (BB_END (bb)) >> + && !find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX)) >> + { >> + error ("No region crossing jump at section boundary in bb %i", >> + bb->index); >> + err = 1; >> + } >> } >> else if (is_crossing) >> { >> @@ -2188,6 +2315,15 @@ rtl_verify_edges (void) >> n_abnormal++; >> } >> >> + if (!has_crossing_edge >> + && find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX)) >> + { >> + print_rtl_with_bb (stderr, get_insns (), TDF_RTL | >> TDF_BLOCKS | TDF_DETAILS); >> + error ("Region crossing jump across same section in bb %i", >> + bb->index); >> + err = 1; >> + } >> + >> if (n_eh && !find_reg_note (BB_END (bb), REG_EH_REGION, NULL_RTX)) >> { >> error ("missing REG_EH_REGION note at the end of bb %i", bb->index); >> @@ -2395,8 +2531,6 @@ rtl_verify_flow_info_1 (void) >> >> err |= rtl_verify_edges (); >> >> - err |= verify_hot_cold_block_grouping(); >> - >> return err; >> } >> >> @@ -2642,6 +2776,8 @@ rtl_verify_flow_info (void) >> >> err |= rtl_verify_bb_layout (); >> >> + err |= verify_hot_cold_block_grouping (); >> + >> return err; >> } >> >> @@ -3343,7 +3479,7 @@ fixup_reorder_chain (void) >> edge e_fall, e_taken, e; >> rtx bb_end_insn; >> rtx ret_label = NULL_RTX; >> - basic_block nb, src_bb; >> + 
basic_block nb; >> edge_iterator ei; >> >> if (EDGE_COUNT (bb->succs) == 0) >> @@ -3478,7 +3614,6 @@ fixup_reorder_chain (void) >> /* We got here if we need to add a new jump insn. >> Note force_nonfallthru can delete E_FALL and thus we have to >> save E_FALL->src prior to the call to force_nonfallthru. */ >> - src_bb = e_fall->src; >> nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest, ret_label); >> if (nb) >> { >> @@ -3486,17 +3621,6 @@ fixup_reorder_chain (void) >> bb->aux = nb; >> /* Don't process this new block. */ >> bb = nb; >> - >> - /* Make sure new bb is tagged for correct section (same as >> - fall-thru source, since you cannot fall-thru across >> - section boundaries). */ >> - BB_COPY_PARTITION (src_bb, single_pred (bb)); >> - if (flag_reorder_blocks_and_partition >> - && targetm_common.have_named_sections >> - && JUMP_P (BB_END (bb)) >> - && !any_condjump_p (BB_END (bb)) >> - && (EDGE_SUCC (bb, 0)->flags & EDGE_CROSSING)) >> - add_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX); >> } >> } >> >> @@ -3796,10 +3920,11 @@ duplicate_insn_chain (rtx from, rtx to) >> case NOTE_INSN_FUNCTION_BEG: >> /* There is always just single entry to function. */ >> case NOTE_INSN_BASIC_BLOCK: >> + /* We should only switch text sections once. 
*/ >> + case NOTE_INSN_SWITCH_TEXT_SECTIONS: >> break; >> >> case NOTE_INSN_EPILOGUE_BEG: >> - case NOTE_INSN_SWITCH_TEXT_SECTIONS: >> emit_note_copy (insn); >> break; >> >> @@ -4611,8 +4736,7 @@ rtl_can_remove_branch_p (const_edge e) >> if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH)) >> return false; >> >> - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) >> - || BB_PARTITION (src) != BB_PARTITION (target)) >> + if (BB_PARTITION (src) != BB_PARTITION (target)) >> return false; >> >> if (!onlyjump_p (insn) >> Index: basic-block.h >> =================================================================== >> --- basic-block.h (revision 199014) >> +++ basic-block.h (working copy) >> @@ -796,6 +796,7 @@ extern basic_block force_nonfallthru_and_redirect >> extern bool contains_no_active_insn_p (const_basic_block); >> extern bool forwarder_block_p (const_basic_block); >> extern bool can_fallthru (basic_block, basic_block); >> +extern void emit_barrier_after_bb (basic_block bb); >> >> /* In cfgbuild.c. */ >> extern void find_many_sub_basic_blocks (sbitmap); >> Index: testsuite/gcc.dg/tree-prof/va-arg-pack-1.c >> =================================================================== >> --- testsuite/gcc.dg/tree-prof/va-arg-pack-1.c (revision 0) >> +++ testsuite/gcc.dg/tree-prof/va-arg-pack-1.c (revision 0) >> @@ -0,0 +1,145 @@ >> +/* __builtin_va_arg_pack () builtin tests. */ >> +/* { dg-require-effective-target freorder } */ >> +/* { dg-options "-O2 -freorder-blocks-and-partition" } */ >> + >> +#include <stdarg.h> >> + >> +extern void abort (void); >> + >> +int v1 = 8; >> +long int v2 = 3; >> +void *v3 = (void *) &v2; >> +struct A { char c[16]; } v4 = { "foo" }; >> +long double v5 = 40; >> +char seen[20]; >> +int cnt; >> + >> +__attribute__ ((noinline)) int >> +foo1 (int x, int y, ...) 
>> +{ >> + int i; >> + long int l; >> + void *v; >> + struct A a; >> + long double ld; >> + va_list ap; >> + >> + va_start (ap, y); >> + if (x < 0 || x >= 20 || seen[x]) >> + abort (); >> + seen[x] = ++cnt; >> + if (y != 6) >> + abort (); >> + i = va_arg (ap, int); >> + if (i != 5) >> + abort (); >> + switch (x) >> + { >> + case 0: >> + i = va_arg (ap, int); >> + if (i != 9 || v1 != 9) >> + abort (); >> + a = va_arg (ap, struct A); >> + if (__builtin_memcmp (a.c, v4.c, sizeof (a.c)) != 0) >> + abort (); >> + v = (void *) va_arg (ap, struct A *); >> + if (v != (void *) &v4) >> + abort (); >> + l = va_arg (ap, long int); >> + if (l != 3 || v2 != 4) >> + abort (); >> + break; >> + case 1: >> + ld = va_arg (ap, long double); >> + if (ld != 41 || v5 != ld) >> + abort (); >> + i = va_arg (ap, int); >> + if (i != 8) >> + abort (); >> + v = va_arg (ap, void *); >> + if (v != &v2) >> + abort (); >> + break; >> + case 2: >> + break; >> + default: >> + abort (); >> + } >> + va_end (ap); >> + return x; >> +} >> + >> +__attribute__ ((noinline)) int >> +foo2 (int x, int y, ...) 
>> +{ >> + long long int ll; >> + void *v; >> + struct A a, b; >> + long double ld; >> + va_list ap; >> + >> + va_start (ap, y); >> + if (x < 0 || x >= 20 || seen[x]) >> + abort (); >> + seen[x] = ++cnt | 64; >> + if (y != 10) >> + abort (); >> + switch (x) >> + { >> + case 11: >> + break; >> + case 12: >> + ld = va_arg (ap, long double); >> + if (ld != 41 || v5 != 40) >> + abort (); >> + a = va_arg (ap, struct A); >> + if (__builtin_memcmp (a.c, v4.c, sizeof (a.c)) != 0) >> + abort (); >> + b = va_arg (ap, struct A); >> + if (__builtin_memcmp (b.c, v4.c, sizeof (b.c)) != 0) >> + abort (); >> + v = va_arg (ap, void *); >> + if (v != &v2) >> + abort (); >> + ll = va_arg (ap, long long int); >> + if (ll != 16LL) >> + abort (); >> + break; >> + case 2: >> + break; >> + default: >> + abort (); >> + } >> + va_end (ap); >> + return x + 8; >> +} >> + >> +__attribute__ ((noinline)) int >> +foo3 (void) >> +{ >> + return 6; >> +} >> + >> +extern inline __attribute__ ((always_inline, gnu_inline)) int >> +bar (int x, ...) 
>> +{ >> + if (x < 10) >> + return foo1 (x, foo3 (), 5, __builtin_va_arg_pack ()); >> + return foo2 (x, foo3 () + 4, __builtin_va_arg_pack ()); >> +} >> + >> +int >> +main (void) >> +{ >> + if (bar (0, ++v1, v4, &v4, v2++) != 0) >> + abort (); >> + if (bar (1, ++v5, 8, v3) != 1) >> + abort (); >> + if (bar (2) != 2) >> + abort (); >> + if (bar (v1 + 2) != 19) >> + abort (); >> + if (bar (v1 + 3, v5--, v4, v4, v3, 16LL) != 20) >> + abort (); >> + return 0; >> +} >> Index: testsuite/gcc.dg/tree-prof/comp-goto-1.c >> =================================================================== >> --- testsuite/gcc.dg/tree-prof/comp-goto-1.c (revision 0) >> +++ testsuite/gcc.dg/tree-prof/comp-goto-1.c (revision 0) >> @@ -0,0 +1,166 @@ >> +/* { dg-require-effective-target freorder } */ >> +/* { dg-options "-O2 -freorder-blocks-and-partition" } */ >> +#include <stdlib.h> >> + >> +#if !defined(NO_LABEL_VALUES) && (!defined(STACK_SIZE) || STACK_SIZE >>>= 4000) && __INT_MAX__ >= 2147483647 >> +typedef unsigned int uint32; >> +typedef signed int sint32; >> + >> +typedef uint32 reg_t; >> + >> +typedef unsigned long int host_addr_t; >> +typedef uint32 target_addr_t; >> +typedef sint32 target_saddr_t; >> + >> +typedef union >> +{ >> + struct >> + { >> + unsigned int offset:18; >> + unsigned int ignore:4; >> + unsigned int s1:8; >> + int :2; >> + signed int simm:14; >> + unsigned int s3:8; >> + unsigned int s2:8; >> + int pad2:2; >> + } f1; >> + long long ll; >> + double d; >> +} insn_t; >> + >> +typedef struct >> +{ >> + target_addr_t vaddr_tag; >> + unsigned long int rigged_paddr; >> +} tlb_entry_t; >> + >> +typedef struct >> +{ >> + insn_t *pc; >> + reg_t registers[256]; >> + insn_t *program; >> + tlb_entry_t tlb_tab[0x100]; >> +} environment_t; >> + >> +enum operations >> +{ >> + LOAD32_RR, >> + METAOP_DONE >> +}; >> + >> +host_addr_t >> +f () >> +{ >> + abort (); >> +} >> + >> +reg_t >> +simulator_kernel (int what, environment_t *env) >> +{ >> + register insn_t *pc = env->pc; >> + 
register reg_t *regs = env->registers; >> + register insn_t insn; >> + register int s1; >> + register reg_t r2; >> + register void *base_addr = &&sim_base_addr; >> + register tlb_entry_t *tlb = env->tlb_tab; >> + >> + if (what != 0) >> + { >> + int i; >> + static void *op_map[] = >> + { >> + &&L_LOAD32_RR, >> + &&L_METAOP_DONE, >> + }; >> + insn_t *program = env->program; >> + for (i = 0; i < what; i++) >> + program[i].f1.offset = op_map[program[i].f1.offset] - base_addr; >> + } >> + >> + sim_base_addr:; >> + >> + insn = *pc++; >> + r2 = (*(reg_t *) (((char *) regs) + (insn.f1.s2 << 2))); >> + s1 = (insn.f1.s1 << 2); >> + goto *(base_addr + insn.f1.offset); >> + >> + L_LOAD32_RR: >> + { >> + target_addr_t vaddr_page = r2 / 4096; >> + unsigned int x = vaddr_page % 0x100; >> + insn = *pc++; >> + >> + for (;;) >> + { >> + target_addr_t tag = tlb[x].vaddr_tag; >> + host_addr_t rigged_paddr = tlb[x].rigged_paddr; >> + >> + if (tag == vaddr_page) >> + { >> + *(reg_t *) (((char *) regs) + s1) = *(uint32 *) (rigged_paddr + r2); >> + r2 = *(reg_t *) (((char *) regs) + (insn.f1.s2 << 2)); >> + s1 = insn.f1.s1 << 2; >> + goto *(base_addr + insn.f1.offset); >> + } >> + >> + if (((target_saddr_t) tag < 0)) >> + { >> + *(reg_t *) (((char *) regs) + s1) = *(uint32 *) f (); >> + r2 = *(reg_t *) (((char *) regs) + (insn.f1.s2 << 2)); >> + s1 = insn.f1.s1 << 2; >> + goto *(base_addr + insn.f1.offset); >> + } >> + >> + x = (x - 1) % 0x100; >> + } >> + >> + L_METAOP_DONE: >> + return (*(reg_t *) (((char *) regs) + s1)); >> + } >> +} >> + >> +insn_t program[2 + 1]; >> + >> +void *malloc (); >> + >> +int >> +main () >> +{ >> + environment_t env; >> + insn_t insn; >> + int i, res; >> + host_addr_t a_page = (host_addr_t) malloc (2 * 4096); >> + target_addr_t a_vaddr = 0x123450; >> + target_addr_t vaddr_page = a_vaddr / 4096; >> + a_page = (a_page + 4096 - 1) & -4096; >> + >> + env.tlb_tab[((vaddr_page) % 0x100)].vaddr_tag = vaddr_page; >> + env.tlb_tab[((vaddr_page) % 0x100)].rigged_paddr 
= a_page - >> vaddr_page * 4096; >> + insn.f1.offset = LOAD32_RR; >> + env.registers[0] = 0; >> + env.registers[2] = a_vaddr; >> + *(sint32 *) (a_page + a_vaddr % 4096) = 88; >> + insn.f1.s1 = 0; >> + insn.f1.s2 = 2; >> + >> + for (i = 0; i < 2; i++) >> + program[i] = insn; >> + >> + insn.f1.offset = METAOP_DONE; >> + insn.f1.s1 = 0; >> + program[2] = insn; >> + >> + env.pc = program; >> + env.program = program; >> + >> + res = simulator_kernel (2 + 1, &env); >> + >> + if (res != 88) >> + abort (); >> + exit (0); >> +} >> +#else >> +main(){ exit (0); } >> +#endif >> Index: testsuite/gcc.dg/tree-prof/pr52027.c >> =================================================================== >> --- testsuite/gcc.dg/tree-prof/pr52027.c (revision 199014) >> +++ testsuite/gcc.dg/tree-prof/pr52027.c (working copy) >> @@ -1,6 +1,6 @@ >> /* PR debug/52027 */ >> /* { dg-require-effective-target freorder } */ >> -/* { dg-options "-O -freorder-blocks-and-partition -fno-reorder-functions" } */ >> +/* { dg-options "-O2 -freorder-blocks-and-partition >> -fno-reorder-functions" } */ >> >> void >> foo (int len) >> Index: testsuite/gcc.dg/tree-prof/pr50907.c >> =================================================================== >> --- testsuite/gcc.dg/tree-prof/pr50907.c (revision 199014) >> +++ testsuite/gcc.dg/tree-prof/pr50907.c (working copy) >> @@ -1,5 +1,5 @@ >> /* PR middle-end/50907 */ >> /* { dg-require-effective-target freorder } */ >> -/* { dg-options "-O -freorder-blocks-and-partition -fschedule-insns >> -fselective-scheduling -fpic" { target { { powerpc*-*-* ia64-*-* >> x86_64-*-* } && fpic } } } */ >> +/* { dg-options "-O2 -freorder-blocks-and-partition -fschedule-insns >> -fselective-scheduling -fpic" { target { { powerpc*-*-* ia64-*-* >> x86_64-*-* } && fpic } } } */ >> >> #include "pr45354.c" >> Index: testsuite/gcc.dg/tree-prof/pr45354.c >> =================================================================== >> --- testsuite/gcc.dg/tree-prof/pr45354.c (revision 199014) >> +++ 
testsuite/gcc.dg/tree-prof/pr45354.c (working copy) >> @@ -1,5 +1,5 @@ >> /* { dg-require-effective-target freorder } */ >> -/* { dg-options "-O -freorder-blocks-and-partition -fschedule-insns >> -fselective-scheduling" { target powerpc*-*-* ia64-*-* x86_64-*-* } } >> */ >> +/* { dg-options "-O2 -freorder-blocks-and-partition -fschedule-insns >> -fselective-scheduling" { target powerpc*-*-* ia64-*-* x86_64-*-* } } >> */ >> >> extern void abort (void); >> >> Index: testsuite/gcc.dg/tree-prof/20041218-1.c >> =================================================================== >> --- testsuite/gcc.dg/tree-prof/20041218-1.c (revision 0) >> +++ testsuite/gcc.dg/tree-prof/20041218-1.c (revision 0) >> @@ -0,0 +1,119 @@ >> +/* PR rtl-optimization/16968 */ >> +/* Testcase by Jakub Jelinek <jakub@redhat.com> */ >> +/* { dg-require-effective-target freorder } */ >> +/* { dg-options "-O2 -freorder-blocks-and-partition" } */ >> + >> +struct T >> +{ >> + unsigned int b, c, *d; >> + unsigned char e; >> +}; >> +struct S >> +{ >> + unsigned int a; >> + struct T f; >> +}; >> +struct U >> +{ >> + struct S g, h; >> +}; >> +struct V >> +{ >> + unsigned int i; >> + struct U j; >> +}; >> + >> +extern void exit (int); >> +extern void abort (void); >> + >> +void * >> +dummy1 (void *x) >> +{ >> + return ""; >> +} >> + >> +void * >> +dummy2 (void *x, void *y) >> +{ >> + exit (0); >> +} >> + >> +struct V * >> +baz (unsigned int x) >> +{ >> + static struct V v; >> + __builtin_memset (&v, 0x55, sizeof (v)); >> + return &v; >> +} >> + >> +int >> +check (void *x, struct S *y) >> +{ >> + if (y->a || y->f.b || y->f.c || y->f.d || y->f.e) >> + abort (); >> + return 1; >> +} >> + >> +static struct V * >> +bar (unsigned int x, void *y) >> +{ >> + const struct T t = { 0, 0, (void *) 0, 0 }; >> + struct V *u; >> + void *v; >> + v = dummy1 (y); >> + if (!v) >> + return (void *) 0; >> + >> + u = baz (sizeof (struct V)); >> + u->i = x; >> + u->j.g.a = 0; >> + u->j.g.f = t; >> + u->j.h.a = 0; >> + u->j.h.f = 
t; >> + >> + if (!check (v, &u->j.g) || !check (v, &u->j.h)) >> + return (void *) 0; >> + return u; >> +} >> + >> +int >> +foo (unsigned int *x, unsigned int y, void **z) >> +{ >> + void *v; >> + unsigned int i, j; >> + >> + *z = v = (void *) 0; >> + >> + for (i = 0; i < y; i++) >> + { >> + struct V *c; >> + >> + j = *x; >> + >> + switch (j) >> + { >> + case 1: >> + c = bar (j, x); >> + break; >> + default: >> + c = 0; >> + break; >> + } >> + if (c) >> + v = dummy2 (v, c); >> + else >> + return 1; >> + } >> + >> + *z = v; >> + return 0; >> +} >> + >> +int >> +main (void) >> +{ >> + unsigned int one = 1; >> + void *p; >> + foo (&one, 1, &p); >> + abort (); >> +} >> Index: testsuite/g++.dg/tree-prof/partition2.C >> =================================================================== >> --- testsuite/g++.dg/tree-prof/partition2.C (revision 199014) >> +++ testsuite/g++.dg/tree-prof/partition2.C (working copy) >> @@ -1,6 +1,6 @@ >> // PR middle-end/45458 >> // { dg-require-effective-target freorder } >> -// { dg-options "-fnon-call-exceptions -freorder-blocks-and-partition" } >> +// { dg-options "-O2 -fnon-call-exceptions -freorder-blocks-and-partition" } >> >> int >> main () >> Index: testsuite/g++.dg/tree-prof/partition3.C >> =================================================================== >> --- testsuite/g++.dg/tree-prof/partition3.C (revision 199014) >> +++ testsuite/g++.dg/tree-prof/partition3.C (working copy) >> @@ -1,6 +1,6 @@ >> // PR middle-end/45566 >> // { dg-require-effective-target freorder } >> -// { dg-options "-O -fnon-call-exceptions -freorder-blocks-and-partition" } >> +// { dg-options "-O2 -fnon-call-exceptions -freorder-blocks-and-partition" } >> >> int k; >> >> >> >> >> -- >> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413 > > > > -- > Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413 -- Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
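For readers following the thread, the invariant behind the revised insert_section_boundary_note() can be modeled outside GCC. The following is a minimal sketch, not code from the patch: `enum partition`, the `PART_*` values and `find_section_switch` are hypothetical stand-ins for the real basic-block machinery, and where the patch calls gcc_assert on a second section switch, the sketch returns -1 instead. It assumes the blocks are given in final layout order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the hot/cold values of BB_PARTITION.  */
enum partition { PART_UNSET = 0, PART_HOT = 1, PART_COLD = 2 };

/* Walk the blocks in layout order.  Return the index of the first block
   of the second section (where a single switch-sections note would go),
   N if every block is in the same section, or -1 if the layout changes
   section more than once.  */
static int
find_section_switch (const enum partition *layout, int n)
{
  bool switched_sections = false;
  enum partition current = PART_UNSET;
  int switch_at = n;
  int i;

  for (i = 0; i < n; i++)
    {
      if (current == PART_UNSET)
        current = layout[i];
      if (layout[i] != current)
        {
          /* A second change of partition means hot and cold blocks
             are interleaved; the patch asserts at this point.  */
          if (switched_sections)
            return -1;
          switched_sections = true;
          switch_at = i;
          current = layout[i];
        }
    }
  return switch_at;
}
```

With the layout hot, hot, cold, cold the function returns 2; with hot, cold, hot it returns -1, which corresponds to the case the new gcc_assert is meant to catch.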
On Wed, Jun 5, 2013 at 4:06 PM, Teresa Johnson <tejohnson@google.com> wrote:
> On Wed, May 29, 2013 at 7:57 AM, Teresa Johnson <tejohnson@google.com> wrote:
>> On Thu, May 23, 2013 at 6:18 AM, Teresa Johnson <tejohnson@google.com> wrote:
>>> On Wed, May 22, 2013 at 2:05 PM, Teresa Johnson <tejohnson@google.com> wrote:
>>>> Revised patch included below. The spacing of my pasted in patch text
>>>> looks funky again, let me know if you want the patch as an attachment
>>>> instead.
>>>>
>>>> I addressed all of Steven's comments, except for the suggestion to use
>>>> gcc_assert
>>>> instead of error() in verify_hot_cold_block_grouping() to keep this consistent
>>>> with the rest of the verify_flow_info subroutines (let me know if this is ok).
>>>
>>> I fixed this issue too, which was actually in
>>> insert_section_boundary_note(), so that it gcc_asserts more
>>> efficiently as suggested. Retested, latest patch below.
>>>
>>> Honza, would you be able to review the patch?
>>
>> Ping. Still needs a global maintainer to review and approve.
>
> Ping.

This is ok. Please watch for fallout!

Thanks,
Richard.

> Thanks!
> Teresa
>
>>
>> Also, I submitted a PR for the debug range issue:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57451
>>
>> Thanks!
>> Teresa
>>
>>>
>>> Thanks!
>>> Teresa
>>>
>>>>
>>>> The other main changes:
>>>> (1) Added several test cases (cloned from the torture subdirectories,
>>>> where I manually
>>>> built/ran with FDO and -freorder-blocks-and-partition with both the
>>>> current trunk and
>>>> my fixed trunk compiler, and was able to expose some failures I fixed.
>>>> (2) Changed existing tree-prof tests that used
>>>> -freorder-blocks-and-partition to be
>>>> built with -O2 instead of -O, so that partitioning actually kicks in.
>>>> (3) Fixed a couple of failures in the new
>>>> verify_hot_cold_block_grouping() checks
>>>> exposed by the torture tests I ran manually with splitting (2 of the
>>>> tests cloned
>>>> to tree-prof in this patch). One was in computed goto where we were
>>>> too aggressive
>>>> about cloning crossing edges, and the other was in rtl_split_edge
>>>> called from the "stack"
>>>> pass which was not correctly inserting the new bb in the correct partition since
>>>> bb layout is complete at that point.
>>>>
>>>> Re-tested on x86_64-unknown-linux-gnu with bootstrap and profiledbootstrap
>>>> builds and regression testing. Re-built/ran cpu2006int with profile
>>>> feedback and -freorder-blocks-and-partition enabled.
>>>>
>>>> Ok for trunk?
>>>>
>>>> Thanks!
>>>> Teresa
>>>
>>> 2013-05-23 Teresa Johnson <tejohnson@google.com>
>>>
>>> * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
>>> as this is now done by redirect_edge_and_branch_force.
>>> * function.c (thread_prologue_and_epilogue_insns): Insert new bb after
>>> barriers, and fix interaction with splitting.
>>> * emit-rtl.c (try_split): Copy REG_CROSSING_JUMP notes.
>>> * cfgcleanup.c (try_forward_edges): Fix early return value to properly
>>> reflect changes made in the routine.
>>> * bb-reorder.c (emit_barrier_after_bb): Move to cfgrtl.c.
>>> (fix_up_fall_thru_edges): Remove incorrect check for bb layout order
>>> since this is called in cfglayout mode, and replace partition fixup
>>> with assert as that is now done by force_nonfallthru_and_redirect.
>>> (add_reg_crossing_jump_notes): Handle the fact that some jumps may
>>> already be marked with region crossing note.
>>> (insert_section_boundary_note): Make non-static, gate on flag
>>> has_bb_partition, rewrite to also check for multiple partitions.
>>> (rest_of_handle_reorder_blocks): Remove call to
>>> insert_section_boundary_note, now done later during free_cfg.
>>> (duplicate_computed_gotos): Don't duplicate partition crossing edge.
>>> * bb-reorder.h (insert_section_boundary_note): Declare.
>>> * Makefile.in (cfgrtl.o): Depend on bb-reorder.h
>>> * cfgrtl.c (rest_of_pass_free_cfg): If partitions exist
>>> invoke insert_section_boundary_note.
>>> (try_redirect_by_replacing_jump): Remove unnecessary
>>> check for region crossing note.
>>> (fixup_partition_crossing): New function.
>>> (rtl_redirect_edge_and_branch): Fixup partition boundaries.
>>> (emit_barrier_after_bb): Move here from bb-reorder.c, handle insertion
>>> in non-cfglayout mode.
>>> (force_nonfallthru_and_redirect): Fixup partition boundaries,
>>> remove old code that tried to do this. Emit barrier correctly
>>> when we are in cfglayout mode.
>>> (last_bb_in_partition): New function.
>>> (rtl_split_edge): Correctly fixup partition boundaries.
>>> (commit_one_edge_insertion): Remove old code that tried to
>>> fixup region crossing edge since this is now handled in
>>> split_block, and set up insertion point correctly since
>>> block may now end in a jump.
>>> (verify_hot_cold_block_grouping): Guard against checking when not in
>>> linearized RTL mode.
>>> (rtl_verify_edges): Add checks for incorrect/missing REG_CROSSING_JUMP
>>> notes.
>>> (rtl_verify_flow_info_1): Move verify_hot_cold_block_grouping to
>>> rtl_verify_flow_info, so not called in cfglayout mode.
>>> (rtl_verify_flow_info): Move verify_hot_cold_block_grouping here.
>>> (fixup_reorder_chain): Remove old code that attempted to fixup region
>>> crossing note as this is now handled in force_nonfallthru_and_redirect.
>>> (duplicate_insn_chain): Don't duplicate switch section notes.
>>> (rtl_can_remove_branch_p): Remove unnecessary check for region crossing
>>> note.
>>> * basic-block.h (emit_barrier_after_bb): Declare.
>>> * testsuite/gcc.dg/tree-prof/va-arg-pack-1.c: Cloned from c-torture, made
>>> into -freorder-blocks-and-partition test.
>>> * testsuite/gcc.dg/tree-prof/comp-goto-1.c: Ditto.
>>> * testsuite/gcc.dg/tree-prof/20041218-1.c: Ditto.
>>> * testsuite/gcc.dg/tree-prof/pr52027.c: Use -O2.
>>> * testsuite/gcc.dg/tree-prof/pr50907.c: Ditto.
>>> * testsuite/gcc.dg/tree-prof/pr45354.c: Ditto.
>>> * testsuite/g++.dg/tree-prof/partition2.C: Ditto.
>>> * testsuite/g++.dg/tree-prof/partition3.C: Ditto. >>> >>> Index: ifcvt.c >>> =================================================================== >>> --- ifcvt.c (revision 199014) >>> +++ ifcvt.c (working copy) >>> @@ -3905,10 +3905,9 @@ find_if_case_1 (basic_block test_bb, edge then_edg >>> if (new_bb) >>> { >>> df_bb_replace (then_bb_index, new_bb); >>> - /* Since the fallthru edge was redirected from test_bb to new_bb, >>> - we need to ensure that new_bb is in the same partition as >>> - test bb (you can not fall through across section boundaries). */ >>> - BB_COPY_PARTITION (new_bb, test_bb); >>> + /* This should have been done above via force_nonfallthru_and_redirect >>> + (possibly called from redirect_edge_and_branch_force). */ >>> + gcc_checking_assert (BB_PARTITION (new_bb) == BB_PARTITION (test_bb)); >>> } >>> >>> num_true_changes++; >>> Index: function.c >>> =================================================================== >>> --- function.c (revision 199014) >>> +++ function.c (working copy) >>> @@ -6270,8 +6270,10 @@ thread_prologue_and_epilogue_insns (void) >>> break; >>> if (e) >>> { >>> - copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)), >>> - NULL_RTX, e->src); >>> + /* Make sure we insert after any barriers. */ >>> + rtx end = get_last_bb_insn (e->src); >>> + copy_bb = create_basic_block (NEXT_INSN (end), >>> + NULL_RTX, e->src); >>> BB_COPY_PARTITION (copy_bb, e->src); >>> } >>> else >>> @@ -6538,7 +6540,7 @@ epilogue_done: >>> basic_block simple_return_block_cold = NULL; >>> edge pending_edge_hot = NULL; >>> edge pending_edge_cold = NULL; >>> - basic_block exit_pred = EXIT_BLOCK_PTR->prev_bb; >>> + basic_block exit_pred; >>> int i; >>> >>> gcc_assert (entry_edge != orig_entry_edge); >>> @@ -6566,6 +6568,12 @@ epilogue_done: >>> else >>> pending_edge_cold = e; >>> } >>> + >>> + /* Save a pointer to the exit's predecessor BB for use in >>> + inserting new BBs at the end of the function. 
Do this >>> + after the call to split_block above which may split >>> + the original exit pred. */ >>> + exit_pred = EXIT_BLOCK_PTR->prev_bb; >>> >>> FOR_EACH_VEC_ELT (unconverted_simple_returns, i, e) >>> { >>> Index: emit-rtl.c >>> =================================================================== >>> --- emit-rtl.c (revision 199014) >>> +++ emit-rtl.c (working copy) >>> @@ -3574,6 +3574,7 @@ try_split (rtx pat, rtx trial, int last) >>> break; >>> >>> case REG_NON_LOCAL_GOTO: >>> + case REG_CROSSING_JUMP: >>> for (insn = insn_last; insn != NULL_RTX; insn = PREV_INSN (insn)) >>> { >>> if (JUMP_P (insn)) >>> Index: cfgcleanup.c >>> =================================================================== >>> --- cfgcleanup.c (revision 199014) >>> +++ cfgcleanup.c (working copy) >>> @@ -456,7 +456,7 @@ try_forward_edges (int mode, basic_block b) >>> >>> if (first != EXIT_BLOCK_PTR >>> && find_reg_note (BB_END (first), REG_CROSSING_JUMP, NULL_RTX)) >>> - return false; >>> + return changed; >>> >>> while (counter < n_basic_blocks) >>> { >>> Index: bb-reorder.c >>> =================================================================== >>> --- bb-reorder.c (revision 199014) >>> +++ bb-reorder.c (working copy) >>> @@ -1380,15 +1380,6 @@ get_uncond_jump_length (void) >>> return length; >>> } >>> >>> -/* Emit a barrier into the footer of BB. */ >>> - >>> -static void >>> -emit_barrier_after_bb (basic_block bb) >>> -{ >>> - rtx barrier = emit_barrier_after (BB_END (bb)); >>> - BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); >>> -} >>> - >>> /* The landing pad OLD_LP, in block OLD_BB, has edges from both partitions. >>> Duplicate the landing pad and split the edges so that no EH edge >>> crosses partitions. */ >>> @@ -1720,8 +1711,7 @@ fix_up_fall_thru_edges (void) >>> (i.e. fix it so the fall through does not cross and >>> the cond jump does). 
*/ >>> >>> - if (!cond_jump_crosses >>> - && cur_bb->aux == cond_jump->dest) >>> + if (!cond_jump_crosses) >>> { >>> /* Find label in fall_thru block. We've already added >>> any missing labels, so there must be one. */ >>> @@ -1765,10 +1755,10 @@ fix_up_fall_thru_edges (void) >>> new_bb->aux = cur_bb->aux; >>> cur_bb->aux = new_bb; >>> >>> - /* Make sure new fall-through bb is in same >>> - partition as bb it's falling through from. */ >>> + /* This is done by force_nonfallthru_and_redirect. */ >>> + gcc_assert (BB_PARTITION (new_bb) >>> + == BB_PARTITION (cur_bb)); >>> >>> - BB_COPY_PARTITION (new_bb, cur_bb); >>> single_succ_edge (new_bb)->flags |= EDGE_CROSSING; >>> } >>> else >>> @@ -2064,7 +2054,10 @@ add_reg_crossing_jump_notes (void) >>> FOR_EACH_BB (bb) >>> FOR_EACH_EDGE (e, ei, bb->succs) >>> if ((e->flags & EDGE_CROSSING) >>> - && JUMP_P (BB_END (e->src))) >>> + && JUMP_P (BB_END (e->src)) >>> + /* Some notes were added during fix_up_fall_thru_edges, via >>> + force_nonfallthru_and_redirect. */ >>> + && !find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX)) >>> add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); >>> } >>> >>> @@ -2133,23 +2126,26 @@ reorder_basic_blocks (void) >>> encountering this note will make the compiler switch between the >>> hot and cold text sections. 
*/ >>> >>> -static void >>> +void >>> insert_section_boundary_note (void) >>> { >>> basic_block bb; >>> - int first_partition = 0; >>> + bool switched_sections = false; >>> + int current_partition = 0; >>> >>> - if (!flag_reorder_blocks_and_partition) >>> + if (!crtl->has_bb_partition) >>> return; >>> >>> FOR_EACH_BB (bb) >>> { >>> - if (!first_partition) >>> - first_partition = BB_PARTITION (bb); >>> - if (BB_PARTITION (bb) != first_partition) >>> + if (!current_partition) >>> + current_partition = BB_PARTITION (bb); >>> + if (BB_PARTITION (bb) != current_partition) >>> { >>> - emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb)); >>> - break; >>> + gcc_assert (!switched_sections); >>> + switched_sections = true; >>> + emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb)); >>> + current_partition = BB_PARTITION (bb); >>> } >>> } >>> } >>> @@ -2180,8 +2176,6 @@ rest_of_handle_reorder_blocks (void) >>> bb->aux = bb->next_bb; >>> cfg_layout_finalize (); >>> >>> - /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes. */ >>> - insert_section_boundary_note (); >>> return 0; >>> } >>> >>> @@ -2315,6 +2309,11 @@ duplicate_computed_gotos (void) >>> if (!bitmap_bit_p (candidates, single_succ (bb)->index)) >>> continue; >>> >>> + /* Don't duplicate a partition crossing edge, which requires difficult >>> + fixup. 
*/ >>> + if (find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX)) >>> + continue; >>> + >>> new_bb = duplicate_block (single_succ (bb), single_succ_edge (bb), bb); >>> new_bb->aux = bb->aux; >>> bb->aux = new_bb; >>> Index: bb-reorder.h >>> =================================================================== >>> --- bb-reorder.h (revision 199014) >>> +++ bb-reorder.h (working copy) >>> @@ -35,4 +35,6 @@ extern struct target_bb_reorder *this_target_bb_re >>> >>> extern int get_uncond_jump_length (void); >>> >>> +extern void insert_section_boundary_note (void); >>> + >>> #endif >>> Index: Makefile.in >>> =================================================================== >>> --- Makefile.in (revision 199014) >>> +++ Makefile.in (working copy) >>> @@ -3151,7 +3151,7 @@ cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) corety >>> $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) $(INSN_ATTR_H) \ >>> insn-config.h $(EXPR_H) \ >>> $(CFGLOOP_H) $(OBSTACK_H) $(TARGET_H) $(TREE_H) \ >>> - $(TREE_PASS_H) $(DF_H) $(GGC_H) $(COMMON_TARGET_H) gt-cfgrtl.h >>> + $(TREE_PASS_H) $(DF_H) $(GGC_H) $(COMMON_TARGET_H) gt-cfgrtl.h bb-reorder.h >>> cfganal.o : cfganal.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(BASIC_BLOCK_H) \ >>> $(TIMEVAR_H) sbitmap.h $(BITMAP_H) >>> cfgbuild.o : cfgbuild.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \ >>> Index: cfgrtl.c >>> =================================================================== >>> --- cfgrtl.c (revision 199014) >>> +++ cfgrtl.c (working copy) >>> @@ -44,6 +44,7 @@ along with GCC; see the file COPYING3. 
If not see >>> #include "tree.h" >>> #include "hard-reg-set.h" >>> #include "basic-block.h" >>> +#include "bb-reorder.h" >>> #include "regs.h" >>> #include "flags.h" >>> #include "function.h" >>> @@ -451,6 +452,9 @@ rest_of_pass_free_cfg (void) >>> } >>> #endif >>> >>> + if (crtl->has_bb_partition) >>> + insert_section_boundary_note (); >>> + >>> free_bb_for_insn (); >>> return 0; >>> } >>> @@ -981,8 +985,7 @@ try_redirect_by_replacing_jump (edge e, basic_bloc >>> partition boundaries). See the comments at the top of >>> bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ >>> >>> - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) >>> - || BB_PARTITION (src) != BB_PARTITION (target)) >>> + if (BB_PARTITION (src) != BB_PARTITION (target)) >>> return NULL; >>> >>> /* We can replace or remove a complex jump only when we have exactly >>> @@ -1291,6 +1294,53 @@ redirect_branch_edge (edge e, basic_block target) >>> return e; >>> } >>> >>> +/* Called when edge E has been redirected to a new destination, >>> + in order to update the region crossing flag on the edge and >>> + jump. */ >>> + >>> +static void >>> +fixup_partition_crossing (edge e) >>> +{ >>> + rtx note; >>> + >>> + if (e->src == ENTRY_BLOCK_PTR || e->dest == EXIT_BLOCK_PTR) >>> + return; >>> + /* If we redirected an existing edge, it may already be marked >>> + crossing, even though the new src is missing a reg crossing note. >>> + But make sure reg crossing note doesn't already exist before >>> + inserting. 
*/ >>> + if (BB_PARTITION (e->src) != BB_PARTITION (e->dest)) >>> + { >>> + e->flags |= EDGE_CROSSING; >>> + note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); >>> + if (JUMP_P (BB_END (e->src)) >>> + && !note) >>> + add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); >>> + } >>> + else if (BB_PARTITION (e->src) == BB_PARTITION (e->dest)) >>> + { >>> + e->flags &= ~EDGE_CROSSING; >>> + /* Remove the section crossing note from jump at end of >>> + src if it exists, and if no other successors are >>> + still crossing. */ >>> + note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); >>> + if (note) >>> + { >>> + bool has_crossing_succ = false; >>> + edge e2; >>> + edge_iterator ei; >>> + FOR_EACH_EDGE (e2, ei, e->src->succs) >>> + { >>> + has_crossing_succ |= (e2->flags & EDGE_CROSSING); >>> + if (has_crossing_succ) >>> + break; >>> + } >>> + if (!has_crossing_succ) >>> + remove_note (BB_END (e->src), note); >>> + } >>> + } >>> +} >>> + >>> /* Attempt to change code to redirect edge E to TARGET. Don't do that on >>> expense of adding new instructions or reordering basic blocks. >>> >>> @@ -1307,16 +1357,18 @@ rtl_redirect_edge_and_branch (edge e, basic_block >>> { >>> edge ret; >>> basic_block src = e->src; >>> + basic_block dest = e->dest; >>> >>> if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH)) >>> return NULL; >>> >>> - if (e->dest == target) >>> + if (dest == target) >>> return e; >>> >>> if ((ret = try_redirect_by_replacing_jump (e, target, false)) != NULL) >>> { >>> df_set_bb_dirty (src); >>> + fixup_partition_crossing (ret); >>> return ret; >>> } >>> >>> @@ -1325,9 +1377,22 @@ rtl_redirect_edge_and_branch (edge e, basic_block >>> return NULL; >>> >>> df_set_bb_dirty (src); >>> + fixup_partition_crossing (ret); >>> return ret; >>> } >>> >>> +/* Emit a barrier after BB, into the footer if we are in CFGLAYOUT mode. 
*/ >>> + >>> +void >>> +emit_barrier_after_bb (basic_block bb) >>> +{ >>> + rtx barrier = emit_barrier_after (BB_END (bb)); >>> + gcc_assert (current_ir_type() == IR_RTL_CFGRTL >>> + || current_ir_type () == IR_RTL_CFGLAYOUT); >>> + if (current_ir_type () == IR_RTL_CFGLAYOUT) >>> + BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier); >>> +} >>> + >>> /* Like force_nonfallthru below, but additionally performs redirection >>> Used by redirect_edge_and_branch_force. JUMP_LABEL is used only >>> when redirecting to the EXIT_BLOCK, it is either ret_rtx or >>> @@ -1492,12 +1557,6 @@ force_nonfallthru_and_redirect (edge e, basic_bloc >>> /* Make sure new block ends up in correct hot/cold section. */ >>> >>> BB_COPY_PARTITION (jump_block, e->src); >>> - if (flag_reorder_blocks_and_partition >>> - && targetm_common.have_named_sections >>> - && JUMP_P (BB_END (jump_block)) >>> - && !any_condjump_p (BB_END (jump_block)) >>> - && (EDGE_SUCC (jump_block, 0)->flags & EDGE_CROSSING)) >>> - add_reg_note (BB_END (jump_block), REG_CROSSING_JUMP, NULL_RTX); >>> >>> /* Wire edge in. */ >>> new_edge = make_edge (e->src, jump_block, EDGE_FALLTHRU); >>> @@ -1508,6 +1567,10 @@ force_nonfallthru_and_redirect (edge e, basic_bloc >>> redirect_edge_pred (e, jump_block); >>> e->probability = REG_BR_PROB_BASE; >>> >>> + /* If e->src was previously region crossing, it no longer is >>> + and the reg crossing note should be removed. */ >>> + fixup_partition_crossing (new_edge); >>> + >>> /* If asm goto has any label refs to target's label, >>> add also edge from asm goto bb to target. */ >>> if (asm_goto_edge) >>> @@ -1559,13 +1622,16 @@ force_nonfallthru_and_redirect (edge e, basic_bloc >>> LABEL_NUSES (label)++; >>> } >>> >>> - emit_barrier_after (BB_END (jump_block)); >>> + /* We might be in cfg layout mode, and if so, the following routine will >>> + insert the barrier correctly. 
*/ >>> + emit_barrier_after_bb (jump_block); >>> redirect_edge_succ_nodup (e, target); >>> >>> if (abnormal_edge_flags) >>> make_edge (src, target, abnormal_edge_flags); >>> >>> df_mark_solutions_dirty (); >>> + fixup_partition_crossing (e); >>> return new_bb; >>> } >>> >>> @@ -1654,6 +1720,21 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU >>> return false; >>> } >>> >>> +/* Locate the last bb in the same partition as START_BB. */ >>> + >>> +static basic_block >>> +last_bb_in_partition (basic_block start_bb) >>> +{ >>> + basic_block bb; >>> + FOR_BB_BETWEEN (bb, start_bb, EXIT_BLOCK_PTR, next_bb) >>> + { >>> + if (BB_PARTITION (start_bb) != BB_PARTITION (bb->next_bb)) >>> + return bb; >>> + } >>> + /* Return bb before EXIT_BLOCK_PTR. */ >>> + return bb->prev_bb; >>> +} >>> + >>> /* Split a (typically critical) edge. Return the new block. >>> The edge must not be abnormal. >>> >>> @@ -1664,7 +1745,7 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU >>> static basic_block >>> rtl_split_edge (edge edge_in) >>> { >>> - basic_block bb; >>> + basic_block bb, new_bb; >>> rtx before; >>> >>> /* Abnormal edges cannot be split. */ >>> @@ -1696,13 +1777,50 @@ rtl_split_edge (edge edge_in) >>> } >>> else >>> { >>> - bb = create_basic_block (before, NULL, edge_in->dest->prev_bb); >>> - /* ??? Why not edge_in->dest->prev_bb here? */ >>> - BB_COPY_PARTITION (bb, edge_in->dest); >>> + if (edge_in->src == ENTRY_BLOCK_PTR) >>> + { >>> + bb = create_basic_block (before, NULL, edge_in->dest->prev_bb); >>> + BB_COPY_PARTITION (bb, edge_in->dest); >>> + } >>> + else >>> + { >>> + basic_block after = edge_in->dest->prev_bb; >>> + /* If this is post-bb reordering, and the edge crosses a partition >>> + boundary, the new block needs to be inserted in the bb chain >>> + at the end of the src partition (since we put the new bb into >>> + that partition, see below). Otherwise we may end up creating >>> + an extra partition crossing in the chain, which is illegal. 
>>> + It can't go after the src, because src may have a fall-through >>> + to a different block. */ >>> + if (crtl->bb_reorder_complete >>> + && (edge_in->flags & EDGE_CROSSING)) >>> + { >>> + after = last_bb_in_partition (edge_in->src); >>> + before = NEXT_INSN (BB_END (after)); >>> + /* The instruction following the last bb in partition should >>> + be a barrier, since it cannot end in a fall-through. */ >>> + gcc_checking_assert (BARRIER_P (before)); >>> + before = NEXT_INSN (before); >>> + } >>> + bb = create_basic_block (before, NULL, after); >>> + /* Put the split bb into the src partition, to avoid creating >>> + a situation where a cold bb dominates a hot bb, in the case >>> + where src is cold and dest is hot. The src will dominate >>> + the new bb (whereas it might not have dominated dest). */ >>> + BB_COPY_PARTITION (bb, edge_in->src); >>> + } >>> } >>> >>> make_single_succ_edge (bb, edge_in->dest, EDGE_FALLTHRU); >>> >>> + /* Can't allow a region crossing edge to be fallthrough. */ >>> + if (BB_PARTITION (bb) != BB_PARTITION (edge_in->dest) >>> + && edge_in->dest != EXIT_BLOCK_PTR) >>> + { >>> + new_bb = force_nonfallthru (single_succ_edge (bb)); >>> + gcc_assert (!new_bb); >>> + } >>> + >>> /* For non-fallthru edges, we must adjust the predecessor's >>> jump instruction to target our new block. 
*/ >>> if ((edge_in->flags & EDGE_FALLTHRU) == 0) >>> @@ -1815,17 +1933,13 @@ commit_one_edge_insertion (edge e) >>> else >>> { >>> bb = split_edge (e); >>> - after = BB_END (bb); >>> >>> - if (flag_reorder_blocks_and_partition >>> - && targetm_common.have_named_sections >>> - && e->src != ENTRY_BLOCK_PTR >>> - && BB_PARTITION (e->src) == BB_COLD_PARTITION >>> - && !(e->flags & EDGE_CROSSING) >>> - && JUMP_P (after) >>> - && !any_condjump_p (after) >>> - && (single_succ_edge (bb)->flags & EDGE_CROSSING)) >>> - add_reg_note (after, REG_CROSSING_JUMP, NULL_RTX); >>> + /* If E crossed a partition boundary, we needed to make bb end in >>> + a region-crossing jump, even though it was originally fallthru. */ >>> + if (JUMP_P (BB_END (bb))) >>> + before = BB_END (bb); >>> + else >>> + after = BB_END (bb); >>> } >>> >>> /* Now that we've found the spot, do the insertion. */ >>> @@ -2071,7 +2185,11 @@ verify_hot_cold_block_grouping (void) >>> bool switched_sections = false; >>> int current_partition = BB_UNPARTITIONED; >>> >>> - if (!crtl->bb_reorder_complete) >>> + /* Even after bb reordering is complete, we go into cfglayout mode >>> + again (in compgoto). Ensure we don't call this before going back >>> + into linearized RTL when any layout fixes would have been committed. 
*/ >>> + if (!crtl->bb_reorder_complete >>> + || current_ir_type() != IR_RTL_CFGRTL) >>> return err; >>> >>> FOR_EACH_BB (bb) >>> @@ -2116,6 +2234,7 @@ rtl_verify_edges (void) >>> edge e, fallthru = NULL; >>> edge_iterator ei; >>> rtx note; >>> + bool has_crossing_edge = false; >>> >>> if (JUMP_P (BB_END (bb)) >>> && (note = find_reg_note (BB_END (bb), REG_BR_PROB, NULL_RTX)) >>> @@ -2141,6 +2260,7 @@ rtl_verify_edges (void) >>> is_crossing = (BB_PARTITION (e->src) != BB_PARTITION (e->dest) >>> && e->src != ENTRY_BLOCK_PTR >>> && e->dest != EXIT_BLOCK_PTR); >>> + has_crossing_edge |= is_crossing; >>> if (e->flags & EDGE_CROSSING) >>> { >>> if (!is_crossing) >>> @@ -2160,6 +2280,13 @@ rtl_verify_edges (void) >>> e->src->index); >>> err = 1; >>> } >>> + if (JUMP_P (BB_END (bb)) >>> + && !find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX)) >>> + { >>> + error ("No region crossing jump at section boundary in bb %i", >>> + bb->index); >>> + err = 1; >>> + } >>> } >>> else if (is_crossing) >>> { >>> @@ -2188,6 +2315,15 @@ rtl_verify_edges (void) >>> n_abnormal++; >>> } >>> >>> + if (!has_crossing_edge >>> + && find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX)) >>> + { >>> + print_rtl_with_bb (stderr, get_insns (), TDF_RTL | >>> TDF_BLOCKS | TDF_DETAILS); >>> + error ("Region crossing jump across same section in bb %i", >>> + bb->index); >>> + err = 1; >>> + } >>> + >>> if (n_eh && !find_reg_note (BB_END (bb), REG_EH_REGION, NULL_RTX)) >>> { >>> error ("missing REG_EH_REGION note at the end of bb %i", bb->index); >>> @@ -2395,8 +2531,6 @@ rtl_verify_flow_info_1 (void) >>> >>> err |= rtl_verify_edges (); >>> >>> - err |= verify_hot_cold_block_grouping(); >>> - >>> return err; >>> } >>> >>> @@ -2642,6 +2776,8 @@ rtl_verify_flow_info (void) >>> >>> err |= rtl_verify_bb_layout (); >>> >>> + err |= verify_hot_cold_block_grouping (); >>> + >>> return err; >>> } >>> >>> @@ -3343,7 +3479,7 @@ fixup_reorder_chain (void) >>> edge e_fall, e_taken, e; >>> rtx 
bb_end_insn; >>> rtx ret_label = NULL_RTX; >>> - basic_block nb, src_bb; >>> + basic_block nb; >>> edge_iterator ei; >>> >>> if (EDGE_COUNT (bb->succs) == 0) >>> @@ -3478,7 +3614,6 @@ fixup_reorder_chain (void) >>> /* We got here if we need to add a new jump insn. >>> Note force_nonfallthru can delete E_FALL and thus we have to >>> save E_FALL->src prior to the call to force_nonfallthru. */ >>> - src_bb = e_fall->src; >>> nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest, ret_label); >>> if (nb) >>> { >>> @@ -3486,17 +3621,6 @@ fixup_reorder_chain (void) >>> bb->aux = nb; >>> /* Don't process this new block. */ >>> bb = nb; >>> - >>> - /* Make sure new bb is tagged for correct section (same as >>> - fall-thru source, since you cannot fall-thru across >>> - section boundaries). */ >>> - BB_COPY_PARTITION (src_bb, single_pred (bb)); >>> - if (flag_reorder_blocks_and_partition >>> - && targetm_common.have_named_sections >>> - && JUMP_P (BB_END (bb)) >>> - && !any_condjump_p (BB_END (bb)) >>> - && (EDGE_SUCC (bb, 0)->flags & EDGE_CROSSING)) >>> - add_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX); >>> } >>> } >>> >>> @@ -3796,10 +3920,11 @@ duplicate_insn_chain (rtx from, rtx to) >>> case NOTE_INSN_FUNCTION_BEG: >>> /* There is always just single entry to function. */ >>> case NOTE_INSN_BASIC_BLOCK: >>> + /* We should only switch text sections once. 
*/ >>> + case NOTE_INSN_SWITCH_TEXT_SECTIONS: >>> break; >>> >>> case NOTE_INSN_EPILOGUE_BEG: >>> - case NOTE_INSN_SWITCH_TEXT_SECTIONS: >>> emit_note_copy (insn); >>> break; >>> >>> @@ -4611,8 +4736,7 @@ rtl_can_remove_branch_p (const_edge e) >>> if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH)) >>> return false; >>> >>> - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) >>> - || BB_PARTITION (src) != BB_PARTITION (target)) >>> + if (BB_PARTITION (src) != BB_PARTITION (target)) >>> return false; >>> >>> if (!onlyjump_p (insn) >>> Index: basic-block.h >>> =================================================================== >>> --- basic-block.h (revision 199014) >>> +++ basic-block.h (working copy) >>> @@ -796,6 +796,7 @@ extern basic_block force_nonfallthru_and_redirect >>> extern bool contains_no_active_insn_p (const_basic_block); >>> extern bool forwarder_block_p (const_basic_block); >>> extern bool can_fallthru (basic_block, basic_block); >>> +extern void emit_barrier_after_bb (basic_block bb); >>> >>> /* In cfgbuild.c. */ >>> extern void find_many_sub_basic_blocks (sbitmap); >>> Index: testsuite/gcc.dg/tree-prof/va-arg-pack-1.c >>> =================================================================== >>> --- testsuite/gcc.dg/tree-prof/va-arg-pack-1.c (revision 0) >>> +++ testsuite/gcc.dg/tree-prof/va-arg-pack-1.c (revision 0) >>> @@ -0,0 +1,145 @@ >>> +/* __builtin_va_arg_pack () builtin tests. */ >>> +/* { dg-require-effective-target freorder } */ >>> +/* { dg-options "-O2 -freorder-blocks-and-partition" } */ >>> + >>> +#include <stdarg.h> >>> + >>> +extern void abort (void); >>> + >>> +int v1 = 8; >>> +long int v2 = 3; >>> +void *v3 = (void *) &v2; >>> +struct A { char c[16]; } v4 = { "foo" }; >>> +long double v5 = 40; >>> +char seen[20]; >>> +int cnt; >>> + >>> +__attribute__ ((noinline)) int >>> +foo1 (int x, int y, ...) 
>>> +{ >>> + int i; >>> + long int l; >>> + void *v; >>> + struct A a; >>> + long double ld; >>> + va_list ap; >>> + >>> + va_start (ap, y); >>> + if (x < 0 || x >= 20 || seen[x]) >>> + abort (); >>> + seen[x] = ++cnt; >>> + if (y != 6) >>> + abort (); >>> + i = va_arg (ap, int); >>> + if (i != 5) >>> + abort (); >>> + switch (x) >>> + { >>> + case 0: >>> + i = va_arg (ap, int); >>> + if (i != 9 || v1 != 9) >>> + abort (); >>> + a = va_arg (ap, struct A); >>> + if (__builtin_memcmp (a.c, v4.c, sizeof (a.c)) != 0) >>> + abort (); >>> + v = (void *) va_arg (ap, struct A *); >>> + if (v != (void *) &v4) >>> + abort (); >>> + l = va_arg (ap, long int); >>> + if (l != 3 || v2 != 4) >>> + abort (); >>> + break; >>> + case 1: >>> + ld = va_arg (ap, long double); >>> + if (ld != 41 || v5 != ld) >>> + abort (); >>> + i = va_arg (ap, int); >>> + if (i != 8) >>> + abort (); >>> + v = va_arg (ap, void *); >>> + if (v != &v2) >>> + abort (); >>> + break; >>> + case 2: >>> + break; >>> + default: >>> + abort (); >>> + } >>> + va_end (ap); >>> + return x; >>> +} >>> + >>> +__attribute__ ((noinline)) int >>> +foo2 (int x, int y, ...) 
>>> +{ >>> + long long int ll; >>> + void *v; >>> + struct A a, b; >>> + long double ld; >>> + va_list ap; >>> + >>> + va_start (ap, y); >>> + if (x < 0 || x >= 20 || seen[x]) >>> + abort (); >>> + seen[x] = ++cnt | 64; >>> + if (y != 10) >>> + abort (); >>> + switch (x) >>> + { >>> + case 11: >>> + break; >>> + case 12: >>> + ld = va_arg (ap, long double); >>> + if (ld != 41 || v5 != 40) >>> + abort (); >>> + a = va_arg (ap, struct A); >>> + if (__builtin_memcmp (a.c, v4.c, sizeof (a.c)) != 0) >>> + abort (); >>> + b = va_arg (ap, struct A); >>> + if (__builtin_memcmp (b.c, v4.c, sizeof (b.c)) != 0) >>> + abort (); >>> + v = va_arg (ap, void *); >>> + if (v != &v2) >>> + abort (); >>> + ll = va_arg (ap, long long int); >>> + if (ll != 16LL) >>> + abort (); >>> + break; >>> + case 2: >>> + break; >>> + default: >>> + abort (); >>> + } >>> + va_end (ap); >>> + return x + 8; >>> +} >>> + >>> +__attribute__ ((noinline)) int >>> +foo3 (void) >>> +{ >>> + return 6; >>> +} >>> + >>> +extern inline __attribute__ ((always_inline, gnu_inline)) int >>> +bar (int x, ...) 
>>> +{ >>> + if (x < 10) >>> + return foo1 (x, foo3 (), 5, __builtin_va_arg_pack ()); >>> + return foo2 (x, foo3 () + 4, __builtin_va_arg_pack ()); >>> +} >>> + >>> +int >>> +main (void) >>> +{ >>> + if (bar (0, ++v1, v4, &v4, v2++) != 0) >>> + abort (); >>> + if (bar (1, ++v5, 8, v3) != 1) >>> + abort (); >>> + if (bar (2) != 2) >>> + abort (); >>> + if (bar (v1 + 2) != 19) >>> + abort (); >>> + if (bar (v1 + 3, v5--, v4, v4, v3, 16LL) != 20) >>> + abort (); >>> + return 0; >>> +} >>> Index: testsuite/gcc.dg/tree-prof/comp-goto-1.c >>> =================================================================== >>> --- testsuite/gcc.dg/tree-prof/comp-goto-1.c (revision 0) >>> +++ testsuite/gcc.dg/tree-prof/comp-goto-1.c (revision 0) >>> @@ -0,0 +1,166 @@ >>> +/* { dg-require-effective-target freorder } */ >>> +/* { dg-options "-O2 -freorder-blocks-and-partition" } */ >>> +#include <stdlib.h> >>> + >>> +#if !defined(NO_LABEL_VALUES) && (!defined(STACK_SIZE) || STACK_SIZE >>>>= 4000) && __INT_MAX__ >= 2147483647 >>> +typedef unsigned int uint32; >>> +typedef signed int sint32; >>> + >>> +typedef uint32 reg_t; >>> + >>> +typedef unsigned long int host_addr_t; >>> +typedef uint32 target_addr_t; >>> +typedef sint32 target_saddr_t; >>> + >>> +typedef union >>> +{ >>> + struct >>> + { >>> + unsigned int offset:18; >>> + unsigned int ignore:4; >>> + unsigned int s1:8; >>> + int :2; >>> + signed int simm:14; >>> + unsigned int s3:8; >>> + unsigned int s2:8; >>> + int pad2:2; >>> + } f1; >>> + long long ll; >>> + double d; >>> +} insn_t; >>> + >>> +typedef struct >>> +{ >>> + target_addr_t vaddr_tag; >>> + unsigned long int rigged_paddr; >>> +} tlb_entry_t; >>> + >>> +typedef struct >>> +{ >>> + insn_t *pc; >>> + reg_t registers[256]; >>> + insn_t *program; >>> + tlb_entry_t tlb_tab[0x100]; >>> +} environment_t; >>> + >>> +enum operations >>> +{ >>> + LOAD32_RR, >>> + METAOP_DONE >>> +}; >>> + >>> +host_addr_t >>> +f () >>> +{ >>> + abort (); >>> +} >>> + >>> +reg_t >>> 
+simulator_kernel (int what, environment_t *env) >>> +{ >>> + register insn_t *pc = env->pc; >>> + register reg_t *regs = env->registers; >>> + register insn_t insn; >>> + register int s1; >>> + register reg_t r2; >>> + register void *base_addr = &&sim_base_addr; >>> + register tlb_entry_t *tlb = env->tlb_tab; >>> + >>> + if (what != 0) >>> + { >>> + int i; >>> + static void *op_map[] = >>> + { >>> + &&L_LOAD32_RR, >>> + &&L_METAOP_DONE, >>> + }; >>> + insn_t *program = env->program; >>> + for (i = 0; i < what; i++) >>> + program[i].f1.offset = op_map[program[i].f1.offset] - base_addr; >>> + } >>> + >>> + sim_base_addr:; >>> + >>> + insn = *pc++; >>> + r2 = (*(reg_t *) (((char *) regs) + (insn.f1.s2 << 2))); >>> + s1 = (insn.f1.s1 << 2); >>> + goto *(base_addr + insn.f1.offset); >>> + >>> + L_LOAD32_RR: >>> + { >>> + target_addr_t vaddr_page = r2 / 4096; >>> + unsigned int x = vaddr_page % 0x100; >>> + insn = *pc++; >>> + >>> + for (;;) >>> + { >>> + target_addr_t tag = tlb[x].vaddr_tag; >>> + host_addr_t rigged_paddr = tlb[x].rigged_paddr; >>> + >>> + if (tag == vaddr_page) >>> + { >>> + *(reg_t *) (((char *) regs) + s1) = *(uint32 *) (rigged_paddr + r2); >>> + r2 = *(reg_t *) (((char *) regs) + (insn.f1.s2 << 2)); >>> + s1 = insn.f1.s1 << 2; >>> + goto *(base_addr + insn.f1.offset); >>> + } >>> + >>> + if (((target_saddr_t) tag < 0)) >>> + { >>> + *(reg_t *) (((char *) regs) + s1) = *(uint32 *) f (); >>> + r2 = *(reg_t *) (((char *) regs) + (insn.f1.s2 << 2)); >>> + s1 = insn.f1.s1 << 2; >>> + goto *(base_addr + insn.f1.offset); >>> + } >>> + >>> + x = (x - 1) % 0x100; >>> + } >>> + >>> + L_METAOP_DONE: >>> + return (*(reg_t *) (((char *) regs) + s1)); >>> + } >>> +} >>> + >>> +insn_t program[2 + 1]; >>> + >>> +void *malloc (); >>> + >>> +int >>> +main () >>> +{ >>> + environment_t env; >>> + insn_t insn; >>> + int i, res; >>> + host_addr_t a_page = (host_addr_t) malloc (2 * 4096); >>> + target_addr_t a_vaddr = 0x123450; >>> + target_addr_t vaddr_page = a_vaddr / 
4096; >>> + a_page = (a_page + 4096 - 1) & -4096; >>> + >>> + env.tlb_tab[((vaddr_page) % 0x100)].vaddr_tag = vaddr_page; >>> + env.tlb_tab[((vaddr_page) % 0x100)].rigged_paddr = a_page - >>> vaddr_page * 4096; >>> + insn.f1.offset = LOAD32_RR; >>> + env.registers[0] = 0; >>> + env.registers[2] = a_vaddr; >>> + *(sint32 *) (a_page + a_vaddr % 4096) = 88; >>> + insn.f1.s1 = 0; >>> + insn.f1.s2 = 2; >>> + >>> + for (i = 0; i < 2; i++) >>> + program[i] = insn; >>> + >>> + insn.f1.offset = METAOP_DONE; >>> + insn.f1.s1 = 0; >>> + program[2] = insn; >>> + >>> + env.pc = program; >>> + env.program = program; >>> + >>> + res = simulator_kernel (2 + 1, &env); >>> + >>> + if (res != 88) >>> + abort (); >>> + exit (0); >>> +} >>> +#else >>> +main(){ exit (0); } >>> +#endif >>> Index: testsuite/gcc.dg/tree-prof/pr52027.c >>> =================================================================== >>> --- testsuite/gcc.dg/tree-prof/pr52027.c (revision 199014) >>> +++ testsuite/gcc.dg/tree-prof/pr52027.c (working copy) >>> @@ -1,6 +1,6 @@ >>> /* PR debug/52027 */ >>> /* { dg-require-effective-target freorder } */ >>> -/* { dg-options "-O -freorder-blocks-and-partition -fno-reorder-functions" } */ >>> +/* { dg-options "-O2 -freorder-blocks-and-partition >>> -fno-reorder-functions" } */ >>> >>> void >>> foo (int len) >>> Index: testsuite/gcc.dg/tree-prof/pr50907.c >>> =================================================================== >>> --- testsuite/gcc.dg/tree-prof/pr50907.c (revision 199014) >>> +++ testsuite/gcc.dg/tree-prof/pr50907.c (working copy) >>> @@ -1,5 +1,5 @@ >>> /* PR middle-end/50907 */ >>> /* { dg-require-effective-target freorder } */ >>> -/* { dg-options "-O -freorder-blocks-and-partition -fschedule-insns >>> -fselective-scheduling -fpic" { target { { powerpc*-*-* ia64-*-* >>> x86_64-*-* } && fpic } } } */ >>> +/* { dg-options "-O2 -freorder-blocks-and-partition -fschedule-insns >>> -fselective-scheduling -fpic" { target { { powerpc*-*-* ia64-*-* >>> x86_64-*-* } 
&& fpic } } } */ >>> >>> #include "pr45354.c" >>> Index: testsuite/gcc.dg/tree-prof/pr45354.c >>> =================================================================== >>> --- testsuite/gcc.dg/tree-prof/pr45354.c (revision 199014) >>> +++ testsuite/gcc.dg/tree-prof/pr45354.c (working copy) >>> @@ -1,5 +1,5 @@ >>> /* { dg-require-effective-target freorder } */ >>> -/* { dg-options "-O -freorder-blocks-and-partition -fschedule-insns >>> -fselective-scheduling" { target powerpc*-*-* ia64-*-* x86_64-*-* } } >>> */ >>> +/* { dg-options "-O2 -freorder-blocks-and-partition -fschedule-insns >>> -fselective-scheduling" { target powerpc*-*-* ia64-*-* x86_64-*-* } } >>> */ >>> >>> extern void abort (void); >>> >>> Index: testsuite/gcc.dg/tree-prof/20041218-1.c >>> =================================================================== >>> --- testsuite/gcc.dg/tree-prof/20041218-1.c (revision 0) >>> +++ testsuite/gcc.dg/tree-prof/20041218-1.c (revision 0) >>> @@ -0,0 +1,119 @@ >>> +/* PR rtl-optimization/16968 */ >>> +/* Testcase by Jakub Jelinek <jakub@redhat.com> */ >>> +/* { dg-require-effective-target freorder } */ >>> +/* { dg-options "-O2 -freorder-blocks-and-partition" } */ >>> + >>> +struct T >>> +{ >>> + unsigned int b, c, *d; >>> + unsigned char e; >>> +}; >>> +struct S >>> +{ >>> + unsigned int a; >>> + struct T f; >>> +}; >>> +struct U >>> +{ >>> + struct S g, h; >>> +}; >>> +struct V >>> +{ >>> + unsigned int i; >>> + struct U j; >>> +}; >>> + >>> +extern void exit (int); >>> +extern void abort (void); >>> + >>> +void * >>> +dummy1 (void *x) >>> +{ >>> + return ""; >>> +} >>> + >>> +void * >>> +dummy2 (void *x, void *y) >>> +{ >>> + exit (0); >>> +} >>> + >>> +struct V * >>> +baz (unsigned int x) >>> +{ >>> + static struct V v; >>> + __builtin_memset (&v, 0x55, sizeof (v)); >>> + return &v; >>> +} >>> + >>> +int >>> +check (void *x, struct S *y) >>> +{ >>> + if (y->a || y->f.b || y->f.c || y->f.d || y->f.e) >>> + abort (); >>> + return 1; >>> +} >>> + >>> +static struct 
V * >>> +bar (unsigned int x, void *y) >>> +{ >>> + const struct T t = { 0, 0, (void *) 0, 0 }; >>> + struct V *u; >>> + void *v; >>> + v = dummy1 (y); >>> + if (!v) >>> + return (void *) 0; >>> + >>> + u = baz (sizeof (struct V)); >>> + u->i = x; >>> + u->j.g.a = 0; >>> + u->j.g.f = t; >>> + u->j.h.a = 0; >>> + u->j.h.f = t; >>> + >>> + if (!check (v, &u->j.g) || !check (v, &u->j.h)) >>> + return (void *) 0; >>> + return u; >>> +} >>> + >>> +int >>> +foo (unsigned int *x, unsigned int y, void **z) >>> +{ >>> + void *v; >>> + unsigned int i, j; >>> + >>> + *z = v = (void *) 0; >>> + >>> + for (i = 0; i < y; i++) >>> + { >>> + struct V *c; >>> + >>> + j = *x; >>> + >>> + switch (j) >>> + { >>> + case 1: >>> + c = bar (j, x); >>> + break; >>> + default: >>> + c = 0; >>> + break; >>> + } >>> + if (c) >>> + v = dummy2 (v, c); >>> + else >>> + return 1; >>> + } >>> + >>> + *z = v; >>> + return 0; >>> +} >>> + >>> +int >>> +main (void) >>> +{ >>> + unsigned int one = 1; >>> + void *p; >>> + foo (&one, 1, &p); >>> + abort (); >>> +} >>> Index: testsuite/g++.dg/tree-prof/partition2.C >>> =================================================================== >>> --- testsuite/g++.dg/tree-prof/partition2.C (revision 199014) >>> +++ testsuite/g++.dg/tree-prof/partition2.C (working copy) >>> @@ -1,6 +1,6 @@ >>> // PR middle-end/45458 >>> // { dg-require-effective-target freorder } >>> -// { dg-options "-fnon-call-exceptions -freorder-blocks-and-partition" } >>> +// { dg-options "-O2 -fnon-call-exceptions -freorder-blocks-and-partition" } >>> >>> int >>> main () >>> Index: testsuite/g++.dg/tree-prof/partition3.C >>> =================================================================== >>> --- testsuite/g++.dg/tree-prof/partition3.C (revision 199014) >>> +++ testsuite/g++.dg/tree-prof/partition3.C (working copy) >>> @@ -1,6 +1,6 @@ >>> // PR middle-end/45566 >>> // { dg-require-effective-target freorder } >>> -// { dg-options "-O -fnon-call-exceptions -freorder-blocks-and-partition" 
}
>>> +// { dg-options "-O2 -fnon-call-exceptions -freorder-blocks-and-partition" }
>>>
>>> int k;
>>>
>>>
>>>
>>>
>>> --
>>> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
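The cfgrtl.c part of the patch adds fixup_partition_crossing, which keeps two
pieces of state in sync after an edge is redirected: the EDGE_CROSSING flag on
the edge, and the REG_CROSSING_JUMP note on the jump that ends the source
block. A minimal, self-contained sketch of that rule follows. It uses a toy
model rather than GCC's real edge, basic_block and rtx types, so every type
and name in it (toy_block, toy_edge, toy_fixup_partition_crossing, ...) is
hypothetical and only for illustration. The real function also returns early
for edges that touch the entry or exit block, which the toy model does not
represent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for GCC's structures; all names here are hypothetical.  */
enum toy_partition { TOY_UNPARTITIONED, TOY_HOT, TOY_COLD };

struct toy_block;

struct toy_edge
{
  struct toy_block *src;
  struct toy_block *dest;
  bool crossing;                /* Models the EDGE_CROSSING flag.  */
};

struct toy_block
{
  enum toy_partition part;      /* Models BB_PARTITION (bb).  */
  bool ends_in_jump;            /* Models JUMP_P (BB_END (bb)).  */
  bool crossing_note;           /* Models a REG_CROSSING_JUMP note.  */
  struct toy_edge *succs[4];    /* Outgoing edges.  */
  size_t n_succs;
};

/* After edge E has been redirected, make its crossing flag and the
   crossing note on the source block's jump agree with the partitions
   of its endpoints.  */
void
toy_fixup_partition_crossing (struct toy_edge *e)
{
  assert (e && e->src && e->dest);

  if (e->src->part != e->dest->part)
    {
      /* The edge now crosses sections: flag it, and note the jump
         (setting a bool is idempotent, so no "already present" check).  */
      e->crossing = true;
      if (e->src->ends_in_jump)
        e->src->crossing_note = true;
    }
  else
    {
      e->crossing = false;
      /* Drop the note only if no other successor still crosses.  */
      if (e->src->crossing_note)
        {
          bool has_crossing_succ = false;
          for (size_t i = 0; i < e->src->n_succs; i++)
            if (e->src->succs[i]->crossing)
              {
                has_crossing_succ = true;
                break;
              }
          if (!has_crossing_succ)
            e->src->crossing_note = false;
        }
    }
}
```

The point of funnelling every redirection through one helper is that callers
no longer need their own copies of the "add a note if the new jump crosses"
logic, which is what the removed hunks in force_nonfallthru_and_redirect,
commit_one_edge_insertion and fixup_reorder_chain used to do.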
Index: ifcvt.c
===================================================================
--- ifcvt.c (revision 199014)
+++ ifcvt.c (working copy)
@@ -3905,10 +3905,9 @@ find_if_case_1 (basic_block test_bb, edge then_edg
   if (new_bb)
     {
       df_bb_replace (then_bb_index, new_bb);
-      /* Since the fallthru edge was redirected from test_bb to new_bb,
-         we need to ensure that new_bb is in the same partition as
-         test bb (you can not fall through across section boundaries). */
-      BB_COPY_PARTITION (new_bb, test_bb);
+      /* This should have been done above via force_nonfallthru_and_redirect
+         (possibly called from redirect_edge_and_branch_force). */
+      gcc_checking_assert (BB_PARTITION (new_bb) == BB_PARTITION (test_bb));
     }

   num_true_changes++;
Index: function.c
===================================================================
--- function.c (revision 199014)
+++ function.c (working copy)
@@ -6270,8 +6270,10 @@ thread_prologue_and_epilogue_insns (void)
             break;
         if (e)
           {
-            copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
-                                          NULL_RTX, e->src);
+            /* Make sure we insert after any barriers. */
+            rtx end = get_last_bb_insn (e->src);
+            copy_bb = create_basic_block (NEXT_INSN (end),
+                                          NULL_RTX, e->src);
             BB_COPY_PARTITION (copy_bb, e->src);
           }
         else
@@ -6538,7 +6540,7 @@ epilogue_done:
       basic_block simple_return_block_cold = NULL;
       edge pending_edge_hot = NULL;
       edge pending_edge_cold = NULL;
-      basic_block exit_pred = EXIT_BLOCK_PTR->prev_bb;
+      basic_block exit_pred;
       int i;

       gcc_assert (entry_edge != orig_entry_edge);
@@ -6566,6 +6568,12 @@ epilogue_done:
             else
               pending_edge_cold = e;
           }
+
+      /* Save a pointer to the exit's predecessor BB for use in
+         inserting new BBs at the end of the function. Do this
+         after the call to split_block above which may split
+         the original exit pred. */
+      exit_pred = EXIT_BLOCK_PTR->prev_bb;

       FOR_EACH_VEC_ELT (unconverted_simple_returns, i, e)
         {
Index: emit-rtl.c
===================================================================
--- emit-rtl.c (revision 199014)
+++ emit-rtl.c (working copy)
@@ -3574,6 +3574,7 @@ try_split (rtx pat, rtx trial, int last)
           break;

         case REG_NON_LOCAL_GOTO:
+        case REG_CROSSING_JUMP:
           for (insn = insn_last; insn != NULL_RTX; insn = PREV_INSN (insn))
             {
               if (JUMP_P (insn))
Index: cfgcleanup.c
===================================================================
--- cfgcleanup.c (revision 199014)
+++ cfgcleanup.c (working copy)
@@ -456,7 +456,7 @@ try_forward_edges (int mode, basic_block b)

   if (first != EXIT_BLOCK_PTR
       && find_reg_note (BB_END (first), REG_CROSSING_JUMP, NULL_RTX))
-    return false;
+    return changed;

   while (counter < n_basic_blocks)
     {
Index: bb-reorder.c
===================================================================
--- bb-reorder.c (revision 199014)
+++ bb-reorder.c (working copy)
@@ -1380,15 +1380,6 @@ get_uncond_jump_length (void)
   return length;
 }

-/* Emit a barrier into the footer of BB. */
-
-static void
-emit_barrier_after_bb (basic_block bb)
-{
-  rtx barrier = emit_barrier_after (BB_END (bb));
-  BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
-}
-
 /* The landing pad OLD_LP, in block OLD_BB, has edges from both partitions.
    Duplicate the landing pad and split the edges so that no EH edge
    crosses partitions. */
@@ -1720,8 +1711,7 @@ fix_up_fall_thru_edges (void)
                  (i.e. fix it so the fall through does not cross and
                  the cond jump does). */

-              if (!cond_jump_crosses
-                  && cur_bb->aux == cond_jump->dest)
+              if (!cond_jump_crosses)
                 {
                   /* Find label in fall_thru block. We've already added
                      any missing labels, so there must be one. */
@@ -1765,10 +1755,10 @@ fix_up_fall_thru_edges (void)
                       new_bb->aux = cur_bb->aux;
                       cur_bb->aux = new_bb;

-                      /* Make sure new fall-through bb is in same
-                         partition as bb it's falling through from. */
+                      /* This is done by force_nonfallthru_and_redirect. */
+                      gcc_assert (BB_PARTITION (new_bb)
+                                  == BB_PARTITION (cur_bb));

-                      BB_COPY_PARTITION (new_bb, cur_bb);
                       single_succ_edge (new_bb)->flags |= EDGE_CROSSING;
                     }
                   else
@@ -2064,7 +2054,10 @@ add_reg_crossing_jump_notes (void)
   FOR_EACH_BB (bb)
     FOR_EACH_EDGE (e, ei, bb->succs)
       if ((e->flags & EDGE_CROSSING)
-          && JUMP_P (BB_END (e->src)))
+          && JUMP_P (BB_END (e->src))
+          /* Some notes were added during fix_up_fall_thru_edges, via
+             force_nonfallthru_and_redirect. */
+          && !find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX))
         add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
 }

@@ -2133,23 +2126,26 @@ reorder_basic_blocks (void)
    encountering this note will make the compiler switch between the
    hot and cold text sections. */

-static void
+void
 insert_section_boundary_note (void)
 {
   basic_block bb;
-  int first_partition = 0;
+  bool switched_sections = false;
+  int current_partition = 0;

-  if (!flag_reorder_blocks_and_partition)
+  if (!crtl->has_bb_partition)
     return;

   FOR_EACH_BB (bb)
     {
-      if (!first_partition)
-        first_partition = BB_PARTITION (bb);
-      if (BB_PARTITION (bb) != first_partition)
+      if (!current_partition)
+        current_partition = BB_PARTITION (bb);
+      if (BB_PARTITION (bb) != current_partition)
         {
-          emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
-          break;
+          gcc_assert (!switched_sections);
+          switched_sections = true;
+          emit_note_before (NOTE_INSN_SWITCH_TEXT_SECTIONS, BB_HEAD (bb));
+          current_partition = BB_PARTITION (bb);
         }
     }
 }
@@ -2180,8 +2176,6 @@ rest_of_handle_reorder_blocks (void)
     bb->aux = bb->next_bb;
   cfg_layout_finalize ();

-  /* Add NOTE_INSN_SWITCH_TEXT_SECTIONS notes. */
-  insert_section_boundary_note ();
   return 0;
 }

@@ -2315,6 +2309,11 @@ duplicate_computed_gotos (void)
       if (!bitmap_bit_p (candidates, single_succ (bb)->index))
         continue;

+      /* Don't duplicate a partition crossing edge, which requires difficult
+         fixup. */
+      if (find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX))
+        continue;
+
       new_bb = duplicate_block (single_succ (bb), single_succ_edge (bb), bb);
       new_bb->aux = bb->aux;
       bb->aux = new_bb;
Index: bb-reorder.h
===================================================================
--- bb-reorder.h (revision 199014)
+++ bb-reorder.h (working copy)
@@ -35,4 +35,6 @@ extern struct target_bb_reorder *this_target_bb_re

 extern int get_uncond_jump_length (void);

+extern void insert_section_boundary_note (void);
+
 #endif
Index: Makefile.in
===================================================================
--- Makefile.in (revision 199014)
+++ Makefile.in (working copy)
@@ -3151,7 +3151,7 @@ cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) corety
    $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) $(INSN_ATTR_H) \
    insn-config.h $(EXPR_H) \
    $(CFGLOOP_H) $(OBSTACK_H) $(TARGET_H) $(TREE_H) \
-   $(TREE_PASS_H) $(DF_H) $(GGC_H) $(COMMON_TARGET_H) gt-cfgrtl.h
+   $(TREE_PASS_H) $(DF_H) $(GGC_H) $(COMMON_TARGET_H) gt-cfgrtl.h bb-reorder.h
 cfganal.o : cfganal.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(BASIC_BLOCK_H) \
    $(TIMEVAR_H) sbitmap.h $(BITMAP_H)
 cfgbuild.o : cfgbuild.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
Index: cfgrtl.c
===================================================================
--- cfgrtl.c (revision 199014)
+++ cfgrtl.c (working copy)
@@ -44,6 +44,7 @@ along with GCC; see the file COPYING3. If not see
 #include "tree.h"
 #include "hard-reg-set.h"
 #include "basic-block.h"
+#include "bb-reorder.h"
 #include "regs.h"
 #include "flags.h"
 #include "function.h"
@@ -451,6 +452,9 @@ rest_of_pass_free_cfg (void)
     }
 #endif

+  if (crtl->has_bb_partition)
+    insert_section_boundary_note ();
+
   free_bb_for_insn ();
   return 0;
 }
@@ -981,8 +985,7 @@ try_redirect_by_replacing_jump (edge e, basic_bloc
      partition boundaries). See the comments at the top of
      bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */

-  if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX)
-      || BB_PARTITION (src) != BB_PARTITION (target))
+  if (BB_PARTITION (src) != BB_PARTITION (target))
     return NULL;

   /* We can replace or remove a complex jump only when we have exactly
@@ -1291,6 +1294,53 @@ redirect_branch_edge (edge e, basic_block target)
   return e;
 }

+/* Called when edge E has been redirected to a new destination,
+   in order to update the region crossing flag on the edge and
+   jump. */
+
+static void
+fixup_partition_crossing (edge e)
+{
+  rtx note;
+
+  if (e->src == ENTRY_BLOCK_PTR || e->dest == EXIT_BLOCK_PTR)
+    return;
+  /* If we redirected an existing edge, it may already be marked
+     crossing, even though the new src is missing a reg crossing note.
+     But make sure reg crossing note doesn't already exist before
+     inserting. */
+  if (BB_PARTITION (e->src) != BB_PARTITION (e->dest))
+    {
+      e->flags |= EDGE_CROSSING;
+      note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
+      if (JUMP_P (BB_END (e->src))
+          && !note)
+        add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
+    }
+  else if (BB_PARTITION (e->src) == BB_PARTITION (e->dest))
+    {
+      e->flags &= ~EDGE_CROSSING;
+      /* Remove the section crossing note from jump at end of
+         src if it exists, and if no other successors are
+         still crossing. */
+      note = find_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX);
+      if (note)
+        {
+          bool has_crossing_succ = false;
+          edge e2;
+          edge_iterator ei;
+          FOR_EACH_EDGE (e2, ei, e->src->succs)
+            {
+              has_crossing_succ |= (e2->flags & EDGE_CROSSING);
+              if (has_crossing_succ)
+                break;
+            }
+          if (!has_crossing_succ)
+            remove_note (BB_END (e->src), note);
+        }
+    }
+}
+
 /* Attempt to change code to redirect edge E to TARGET. Don't do that on
    expense of adding new instructions or reordering basic blocks.

@@ -1307,16 +1357,18 @@ rtl_redirect_edge_and_branch (edge e, basic_block
 {
   edge ret;
   basic_block src = e->src;
+  basic_block dest = e->dest;

   if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH))
     return NULL;

-  if (e->dest == target)
+  if (dest == target)
     return e;

   if ((ret = try_redirect_by_replacing_jump (e, target, false)) != NULL)
     {
       df_set_bb_dirty (src);
+      fixup_partition_crossing (ret);
       return ret;
     }

@@ -1325,9 +1377,22 @@ rtl_redirect_edge_and_branch (edge e, basic_block
     return NULL;

   df_set_bb_dirty (src);
+  fixup_partition_crossing (ret);
   return ret;
 }

+/* Emit a barrier after BB, into the footer if we are in CFGLAYOUT mode. */
+
+void
+emit_barrier_after_bb (basic_block bb)
+{
+  rtx barrier = emit_barrier_after (BB_END (bb));
+  gcc_assert (current_ir_type() == IR_RTL_CFGRTL
+              || current_ir_type () == IR_RTL_CFGLAYOUT);
+  if (current_ir_type () == IR_RTL_CFGLAYOUT)
+    BB_FOOTER (bb) = unlink_insn_chain (barrier, barrier);
+}
+
 /* Like force_nonfallthru below, but additionally performs redirection
    Used by redirect_edge_and_branch_force. JUMP_LABEL is used only
    when redirecting to the EXIT_BLOCK, it is either ret_rtx or
@@ -1492,12 +1557,6 @@ force_nonfallthru_and_redirect (edge e, basic_bloc
       /* Make sure new block ends up in correct hot/cold section. */

       BB_COPY_PARTITION (jump_block, e->src);
-      if (flag_reorder_blocks_and_partition
-          && targetm_common.have_named_sections
-          && JUMP_P (BB_END (jump_block))
-          && !any_condjump_p (BB_END (jump_block))
-          && (EDGE_SUCC (jump_block, 0)->flags & EDGE_CROSSING))
-        add_reg_note (BB_END (jump_block), REG_CROSSING_JUMP, NULL_RTX);

       /* Wire edge in. */
       new_edge = make_edge (e->src, jump_block, EDGE_FALLTHRU);
@@ -1508,6 +1567,10 @@ force_nonfallthru_and_redirect (edge e, basic_bloc
       redirect_edge_pred (e, jump_block);
       e->probability = REG_BR_PROB_BASE;

+      /* If e->src was previously region crossing, it no longer is
+         and the reg crossing note should be removed. */
+      fixup_partition_crossing (new_edge);
+
       /* If asm goto has any label refs to target's label,
          add also edge from asm goto bb to target. */
       if (asm_goto_edge)
@@ -1559,13 +1622,16 @@ force_nonfallthru_and_redirect (edge e, basic_bloc
       LABEL_NUSES (label)++;
     }

-  emit_barrier_after (BB_END (jump_block));
+  /* We might be in cfg layout mode, and if so, the following routine will
+     insert the barrier correctly. */
+  emit_barrier_after_bb (jump_block);
   redirect_edge_succ_nodup (e, target);

   if (abnormal_edge_flags)
     make_edge (src, target, abnormal_edge_flags);

   df_mark_solutions_dirty ();
+  fixup_partition_crossing (e);
   return new_bb;
 }

@@ -1654,6 +1720,21 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU
   return false;
 }

+/* Locate the last bb in the same partition as START_BB. */
+
+static basic_block
+last_bb_in_partition (basic_block start_bb)
+{
+  basic_block bb;
+  FOR_BB_BETWEEN (bb, start_bb, EXIT_BLOCK_PTR, next_bb)
+    {
+      if (BB_PARTITION (start_bb) != BB_PARTITION (bb->next_bb))
+        return bb;
+    }
+  /* Return bb before EXIT_BLOCK_PTR. */
+  return bb->prev_bb;
+}
+
 /* Split a (typically critical) edge. Return the new block.
    The edge must not be abnormal.

@@ -1664,7 +1745,7 @@ rtl_move_block_after (basic_block bb ATTRIBUTE_UNU
 static basic_block
 rtl_split_edge (edge edge_in)
 {
-  basic_block bb;
+  basic_block bb, new_bb;
   rtx before;

   /* Abnormal edges cannot be split. */
@@ -1696,13 +1777,50 @@ rtl_split_edge (edge edge_in)
     }
   else
     {
-      bb = create_basic_block (before, NULL, edge_in->dest->prev_bb);
-      /* ??? Why not edge_in->dest->prev_bb here? */
-      BB_COPY_PARTITION (bb, edge_in->dest);
+      if (edge_in->src == ENTRY_BLOCK_PTR)
+        {
+          bb = create_basic_block (before, NULL, edge_in->dest->prev_bb);
+          BB_COPY_PARTITION (bb, edge_in->dest);
+        }
+      else
+        {
+          basic_block after = edge_in->dest->prev_bb;
+          /* If this is post-bb reordering, and the edge crosses a partition
+             boundary, the new block needs to be inserted in the bb chain
+             at the end of the src partition (since we put the new bb into
+             that partition, see below). Otherwise we may end up creating
+             an extra partition crossing in the chain, which is illegal.
+             It can't go after the src, because src may have a fall-through
+             to a different block. */
+          if (crtl->bb_reorder_complete
+              && (edge_in->flags & EDGE_CROSSING))
+            {
+              after = last_bb_in_partition (edge_in->src);
+              before = NEXT_INSN (BB_END (after));
+              /* The instruction following the last bb in partition should
+                 be a barrier, since it cannot end in a fall-through. */
+              gcc_checking_assert (BARRIER_P (before));
+              before = NEXT_INSN (before);
+            }
+          bb = create_basic_block (before, NULL, after);
+          /* Put the split bb into the src partition, to avoid creating
+             a situation where a cold bb dominates a hot bb, in the case
+             where src is cold and dest is hot. The src will dominate
+             the new bb (whereas it might not have dominated dest). */
+          BB_COPY_PARTITION (bb, edge_in->src);
+        }
     }

   make_single_succ_edge (bb, edge_in->dest, EDGE_FALLTHRU);

+  /* Can't allow a region crossing edge to be fallthrough. */
+  if (BB_PARTITION (bb) != BB_PARTITION (edge_in->dest)
+      && edge_in->dest != EXIT_BLOCK_PTR)
+    {
+      new_bb = force_nonfallthru (single_succ_edge (bb));
+      gcc_assert (!new_bb);
+    }
+
   /* For non-fallthru edges, we must adjust the predecessor's
      jump instruction to target our new block.
*/ if ((edge_in->flags & EDGE_FALLTHRU) == 0) @@ -1815,17 +1933,13 @@ commit_one_edge_insertion (edge e) else { bb = split_edge (e); - after = BB_END (bb); - if (flag_reorder_blocks_and_partition - && targetm_common.have_named_sections - && e->src != ENTRY_BLOCK_PTR - && BB_PARTITION (e->src) == BB_COLD_PARTITION - && !(e->flags & EDGE_CROSSING) - && JUMP_P (after) - && !any_condjump_p (after) - && (single_succ_edge (bb)->flags & EDGE_CROSSING)) - add_reg_note (after, REG_CROSSING_JUMP, NULL_RTX); + /* If E crossed a partition boundary, we needed to make bb end in + a region-crossing jump, even though it was originally fallthru. */ + if (JUMP_P (BB_END (bb))) + before = BB_END (bb); + else + after = BB_END (bb); } /* Now that we've found the spot, do the insertion. */ @@ -2071,7 +2185,11 @@ verify_hot_cold_block_grouping (void) bool switched_sections = false; int current_partition = BB_UNPARTITIONED; - if (!crtl->bb_reorder_complete) + /* Even after bb reordering is complete, we go into cfglayout mode + again (in compgoto). Ensure we don't call this before going back + into linearized RTL when any layout fixes would have been committed. 
*/ + if (!crtl->bb_reorder_complete + || current_ir_type() != IR_RTL_CFGRTL) return err; FOR_EACH_BB (bb) @@ -2116,6 +2234,7 @@ rtl_verify_edges (void) edge e, fallthru = NULL; edge_iterator ei; rtx note; + bool has_crossing_edge = false; if (JUMP_P (BB_END (bb)) && (note = find_reg_note (BB_END (bb), REG_BR_PROB, NULL_RTX)) @@ -2141,6 +2260,7 @@ rtl_verify_edges (void) is_crossing = (BB_PARTITION (e->src) != BB_PARTITION (e->dest) && e->src != ENTRY_BLOCK_PTR && e->dest != EXIT_BLOCK_PTR); + has_crossing_edge |= is_crossing; if (e->flags & EDGE_CROSSING) { if (!is_crossing) @@ -2160,6 +2280,13 @@ rtl_verify_edges (void) e->src->index); err = 1; } + if (JUMP_P (BB_END (bb)) + && !find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX)) + { + error ("No region crossing jump at section boundary in bb %i", + bb->index); + err = 1; + } } else if (is_crossing) { @@ -2188,6 +2315,15 @@ rtl_verify_edges (void) n_abnormal++; } + if (!has_crossing_edge + && find_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX)) + { + print_rtl_with_bb (stderr, get_insns (), TDF_RTL | TDF_BLOCKS | TDF_DETAILS); + error ("Region crossing jump across same section in bb %i", + bb->index); + err = 1; + } + if (n_eh && !find_reg_note (BB_END (bb), REG_EH_REGION, NULL_RTX)) { error ("missing REG_EH_REGION note at the end of bb %i", bb->index); @@ -2395,8 +2531,6 @@ rtl_verify_flow_info_1 (void) err |= rtl_verify_edges (); - err |= verify_hot_cold_block_grouping(); - return err; } @@ -2642,6 +2776,8 @@ rtl_verify_flow_info (void) err |= rtl_verify_bb_layout (); + err |= verify_hot_cold_block_grouping (); + return err; } @@ -3343,7 +3479,7 @@ fixup_reorder_chain (void) edge e_fall, e_taken, e; rtx bb_end_insn; rtx ret_label = NULL_RTX; - basic_block nb, src_bb; + basic_block nb; edge_iterator ei; if (EDGE_COUNT (bb->succs) == 0) @@ -3478,7 +3614,6 @@ fixup_reorder_chain (void) /* We got here if we need to add a new jump insn. 
Note force_nonfallthru can delete E_FALL and thus we have to save E_FALL->src prior to the call to force_nonfallthru. */ - src_bb = e_fall->src; nb = force_nonfallthru_and_redirect (e_fall, e_fall->dest, ret_label); if (nb) { @@ -3486,17 +3621,6 @@ fixup_reorder_chain (void) bb->aux = nb; /* Don't process this new block. */ bb = nb; - - /* Make sure new bb is tagged for correct section (same as - fall-thru source, since you cannot fall-thru across - section boundaries). */ - BB_COPY_PARTITION (src_bb, single_pred (bb)); - if (flag_reorder_blocks_and_partition - && targetm_common.have_named_sections - && JUMP_P (BB_END (bb)) - && !any_condjump_p (BB_END (bb)) - && (EDGE_SUCC (bb, 0)->flags & EDGE_CROSSING)) - add_reg_note (BB_END (bb), REG_CROSSING_JUMP, NULL_RTX); } } @@ -3796,10 +3920,11 @@ duplicate_insn_chain (rtx from, rtx to) case NOTE_INSN_FUNCTION_BEG: /* There is always just single entry to function. */ case NOTE_INSN_BASIC_BLOCK: + /* We should only switch text sections once. */ + case NOTE_INSN_SWITCH_TEXT_SECTIONS: break; case NOTE_INSN_EPILOGUE_BEG: - case NOTE_INSN_SWITCH_TEXT_SECTIONS: emit_note_copy (insn); break; @@ -4611,8 +4736,7 @@ rtl_can_remove_branch_p (const_edge e) if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH)) return false; - if (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) - || BB_PARTITION (src) != BB_PARTITION (target)) + if (BB_PARTITION (src) != BB_PARTITION (target)) return false; if (!onlyjump_p (insn) Index: basic-block.h =================================================================== --- basic-block.h (revision 199014) +++ basic-block.h (working copy) @@ -796,6 +796,7 @@ extern basic_block force_nonfallthru_and_redirect extern bool contains_no_active_insn_p (const_basic_block); extern bool forwarder_block_p (const_basic_block); extern bool can_fallthru (basic_block, basic_block); +extern void emit_barrier_after_bb (basic_block bb); /* In cfgbuild.c. 
*/ extern void find_many_sub_basic_blocks (sbitmap); Index: testsuite/gcc.dg/tree-prof/va-arg-pack-1.c =================================================================== --- testsuite/gcc.dg/tree-prof/va-arg-pack-1.c (revision 0) +++ testsuite/gcc.dg/tree-prof/va-arg-pack-1.c (revision 0) @@ -0,0 +1,145 @@ +/* __builtin_va_arg_pack () builtin tests. */ +/* { dg-require-effective-target freorder } */ +/* { dg-options "-O2 -freorder-blocks-and-partition" } */ + +#include <stdarg.h> + +extern void abort (void); + +int v1 = 8; +long int v2 = 3; +void *v3 = (void *) &v2; +struct A { char c[16]; } v4 = { "foo" }; +long double v5 = 40; +char seen[20]; +int cnt; + +__attribute__ ((noinline)) int +foo1 (int x, int y, ...) +{ + int i; + long int l; + void *v; + struct A a; + long double ld; + va_list ap; + + va_start (ap, y); + if (x < 0 || x >= 20 || seen[x]) + abort (); + seen[x] = ++cnt; + if (y != 6) + abort (); + i = va_arg (ap, int); + if (i != 5) + abort (); + switch (x) + { + case 0: + i = va_arg (ap, int); + if (i != 9 || v1 != 9) + abort (); + a = va_arg (ap, struct A); + if (__builtin_memcmp (a.c, v4.c, sizeof (a.c)) != 0) + abort (); + v = (void *) va_arg (ap, struct A *); + if (v != (void *) &v4) + abort (); + l = va_arg (ap, long int); + if (l != 3 || v2 != 4) + abort (); + break; + case 1: + ld = va_arg (ap, long double); + if (ld != 41 || v5 != ld) + abort (); + i = va_arg (ap, int); + if (i != 8) + abort (); + v = va_arg (ap, void *); + if (v != &v2) + abort (); + break; + case 2: + break; + default: + abort (); + } + va_end (ap); + return x; +} + +__attribute__ ((noinline)) int +foo2 (int x, int y, ...) 
+{ + long long int ll; + void *v; + struct A a, b; + long double ld; + va_list ap; + + va_start (ap, y); + if (x < 0 || x >= 20 || seen[x]) + abort (); + seen[x] = ++cnt | 64; + if (y != 10) + abort (); + switch (x) + { + case 11: + break; + case 12: + ld = va_arg (ap, long double); + if (ld != 41 || v5 != 40) + abort (); + a = va_arg (ap, struct A); + if (__builtin_memcmp (a.c, v4.c, sizeof (a.c)) != 0) + abort (); + b = va_arg (ap, struct A); + if (__builtin_memcmp (b.c, v4.c, sizeof (b.c)) != 0) + abort (); + v = va_arg (ap, void *); + if (v != &v2) + abort (); + ll = va_arg (ap, long long int); + if (ll != 16LL) + abort (); + break; + case 2: + break; + default: + abort (); + } + va_end (ap); + return x + 8; +} + +__attribute__ ((noinline)) int +foo3 (void) +{ + return 6; +} + +extern inline __attribute__ ((always_inline, gnu_inline)) int +bar (int x, ...) +{ + if (x < 10) + return foo1 (x, foo3 (), 5, __builtin_va_arg_pack ()); + return foo2 (x, foo3 () + 4, __builtin_va_arg_pack ()); +} + +int +main (void) +{ + if (bar (0, ++v1, v4, &v4, v2++) != 0) + abort (); + if (bar (1, ++v5, 8, v3) != 1) + abort (); + if (bar (2) != 2) + abort (); + if (bar (v1 + 2) != 19) + abort (); + if (bar (v1 + 3, v5--, v4, v4, v3, 16LL) != 20) + abort (); + return 0; +} Index: testsuite/gcc.dg/tree-prof/comp-goto-1.c =================================================================== --- testsuite/gcc.dg/tree-prof/comp-goto-1.c (revision 0) +++ testsuite/gcc.dg/tree-prof/comp-goto-1.c (revision 0) @@ -0,0 +1,166 @@ +/* { dg-require-effective-target freorder } */ +/* { dg-options "-O2 -freorder-blocks-and-partition" } */ +#include <stdlib.h> + +#if !defined(NO_LABEL_VALUES) && (!defined(STACK_SIZE) || STACK_SIZE >= 4000) && __INT_MAX__ >= 2147483647 +typedef unsigned int uint32; +typedef signed int sint32; + +typedef uint32 reg_t; + +typedef unsigned long int host_addr_t; +typedef uint32 target_addr_t; +typedef sint32 target_saddr_t; + +typedef union +{ + struct + { + unsigned int 
offset:18; + unsigned int ignore:4; + unsigned int s1:8; + int :2; + signed int simm:14; + unsigned int s3:8; + unsigned int s2:8; + int pad2:2; + } f1; + long long ll; + double d; +} insn_t; + +typedef struct +{ + target_addr_t vaddr_tag; + unsigned long int rigged_paddr; +} tlb_entry_t; + +typedef struct +{ + insn_t *pc; + reg_t registers[256]; + insn_t *program; + tlb_entry_t tlb_tab[0x100]; +} environment_t; + +enum operations +{ + LOAD32_RR, + METAOP_DONE +}; + +host_addr_t +f () +{ + abort (); +} + +reg_t +simulator_kernel (int what, environment_t *env) +{ + register insn_t *pc = env->pc; + register reg_t *regs = env->registers; + register insn_t insn; + register int s1; + register reg_t r2; + register void *base_addr = &&sim_base_addr; + register tlb_entry_t *tlb = env->tlb_tab; + + if (what != 0) + { + int i; + static void *op_map[] = + { + &&L_LOAD32_RR, + &&L_METAOP_DONE, + }; + insn_t *program = env->program; + for (i = 0; i < what; i++) + program[i].f1.offset = op_map[program[i].f1.offset] - base_addr; + } + + sim_base_addr:; + + insn = *pc++; + r2 = (*(reg_t *) (((char *) regs) + (insn.f1.s2 << 2))); + s1 = (insn.f1.s1 << 2); + goto *(base_addr + insn.f1.offset); + + L_LOAD32_RR: + { + target_addr_t vaddr_page = r2 / 4096; + unsigned int x = vaddr_page % 0x100; + insn = *pc++; + + for (;;) + { + target_addr_t tag = tlb[x].vaddr_tag; + host_addr_t rigged_paddr = tlb[x].rigged_paddr; + + if (tag == vaddr_page) + { + *(reg_t *) (((char *) regs) + s1) = *(uint32 *) (rigged_paddr + r2); + r2 = *(reg_t *) (((char *) regs) + (insn.f1.s2 << 2)); + s1 = insn.f1.s1 << 2; + goto *(base_addr + insn.f1.offset); + } + + if (((target_saddr_t) tag < 0)) + { + *(reg_t *) (((char *) regs) + s1) = *(uint32 *) f (); + r2 = *(reg_t *) (((char *) regs) + (insn.f1.s2 << 2)); + s1 = insn.f1.s1 << 2; + goto *(base_addr + insn.f1.offset); + } + + x = (x - 1) % 0x100; + } + + L_METAOP_DONE: + return (*(reg_t *) (((char *) regs) + s1)); + } +} + +insn_t program[2 + 1]; + +void 
*malloc (); + +int +main () +{ + environment_t env; + insn_t insn; + int i, res; + host_addr_t a_page = (host_addr_t) malloc (2 * 4096); + target_addr_t a_vaddr = 0x123450; + target_addr_t vaddr_page = a_vaddr / 4096; + a_page = (a_page + 4096 - 1) & -4096; + + env.tlb_tab[((vaddr_page) % 0x100)].vaddr_tag = vaddr_page; + env.tlb_tab[((vaddr_page) % 0x100)].rigged_paddr = a_page - vaddr_page * 4096; + insn.f1.offset = LOAD32_RR; + env.registers[0] = 0; + env.registers[2] = a_vaddr; + *(sint32 *) (a_page + a_vaddr % 4096) = 88; + insn.f1.s1 = 0; + insn.f1.s2 = 2; + + for (i = 0; i < 2; i++) + program[i] = insn; + + insn.f1.offset = METAOP_DONE; + insn.f1.s1 = 0; + program[2] = insn; + + env.pc = program; + env.program = program; + + res = simulator_kernel (2 + 1, &env); + + if (res != 88) + abort (); + exit (0); +} +#else +main(){ exit (0); } +#endif Index: testsuite/gcc.dg/tree-prof/pr52027.c =================================================================== --- testsuite/gcc.dg/tree-prof/pr52027.c (revision 199014) +++ testsuite/gcc.dg/tree-prof/pr52027.c (working copy) @@ -1,6 +1,6 @@ /* PR debug/52027 */ /* { dg-require-effective-target freorder } */ -/* { dg-options "-O -freorder-blocks-and-partition -fno-reorder-functions" } */ +/* { dg-options "-O2 -freorder-blocks-and-partition -fno-reorder-functions" } */ void foo (int len) Index: testsuite/gcc.dg/tree-prof/pr50907.c =================================================================== --- testsuite/gcc.dg/tree-prof/pr50907.c (revision 199014) +++ testsuite/gcc.dg/tree-prof/pr50907.c (working copy) @@ -1,5 +1,5 @@ /* PR middle-end/50907 */ /* { dg-require-effective-target freorder } */ -/* { dg-options "-O -freorder-blocks-and-partition -fschedule-insns -fselective-scheduling -fpic" { target { { powerpc*-*-* ia64-*-* x86_64-*-* } && fpic } } } */ +/* { dg-options "-O2 -freorder-blocks-and-partition -fschedule-insns -fselective-scheduling -fpic" { target { { powerpc*-*-* ia64-*-* x86_64-*-* } && fpic } } } */ 
#include "pr45354.c" Index: testsuite/gcc.dg/tree-prof/pr45354.c =================================================================== --- testsuite/gcc.dg/tree-prof/pr45354.c (revision 199014) +++ testsuite/gcc.dg/tree-prof/pr45354.c (working copy) @@ -1,5 +1,5 @@ /* { dg-require-effective-target freorder } */ -/* { dg-options "-O -freorder-blocks-and-partition -fschedule-insns -fselective-scheduling" { target powerpc*-*-* ia64-*-* x86_64-*-* } } */ +/* { dg-options "-O2 -freorder-blocks-and-partition -fschedule-insns -fselective-scheduling" { target powerpc*-*-* ia64-*-* x86_64-*-* } } */ extern void abort (void); Index: testsuite/gcc.dg/tree-prof/20041218-1.c =================================================================== --- testsuite/gcc.dg/tree-prof/20041218-1.c (revision 0) +++ testsuite/gcc.dg/tree-prof/20041218-1.c (revision 0) @@ -0,0 +1,119 @@ +/* PR rtl-optimization/16968 */ +/* Testcase by Jakub Jelinek <jakub@redhat.com> */ +/* { dg-require-effective-target freorder } */ +/* { dg-options "-O2 -freorder-blocks-and-partition" } */ + +struct T +{ + unsigned int b, c, *d; + unsigned char e; +}; +struct S +{ + unsigned int a; + struct T f; +}; +struct U +{ + struct S g, h; +}; +struct V +{ + unsigned int i; + struct U j; +}; + +extern void exit (int); +extern void abort (void); + +void * +dummy1 (void *x) +{ + return ""; +} + +void * +dummy2 (void *x, void *y) +{ + exit (0); +} + +struct V * +baz (unsigned int x) +{ + static struct V v; + __builtin_memset (&v, 0x55, sizeof (v)); + return &v; +} + +int +check (void *x, struct S *y) +{ + if (y->a || y->f.b || y->f.c || y->f.d || y->f.e) + abort (); + return 1; +} + +static struct V * +bar (unsigned int x, void *y) +{ + const struct T t = { 0, 0, (void *) 0, 0 }; + struct V *u; + void *v; + v = dummy1 (y); + if (!v) + return (void *) 0; + + u = baz (sizeof (struct V)); + u->i = x; + u->j.g.a = 0; + u->j.g.f = t; + u->j.h.a = 0; + u->j.h.f = t; + + if (!check (v, &u->j.g) || !check (v, &u->j.h)) + return 
(void *) 0; + return u; +} + +int +foo (unsigned int *x, unsigned int y, void **z) +{ + void *v; + unsigned int i, j; + + *z = v = (void *) 0; + + for (i = 0; i < y; i++) + { + struct V *c; + + j = *x; + + switch (j) + { + case 1: + c = bar (j, x); + break; + default: + c = 0; + break; + } + if (c) + v = dummy2 (v, c); + else + return 1; + } + + *z = v; + return 0; +} + +int +main (void) +{ + unsigned int one = 1; + void *p; + foo (&one, 1, &p); + abort (); +} Index: testsuite/g++.dg/tree-prof/partition2.C =================================================================== --- testsuite/g++.dg/tree-prof/partition2.C (revision 199014) +++ testsuite/g++.dg/tree-prof/partition2.C (working copy) @@ -1,6 +1,6 @@ // PR middle-end/45458 // { dg-require-effective-target freorder } -// { dg-options "-fnon-call-exceptions -freorder-blocks-and-partition" } +// { dg-options "-O2 -fnon-call-exceptions -freorder-blocks-and-partition" } int main () Index: testsuite/g++.dg/tree-prof/partition3.C =================================================================== --- testsuite/g++.dg/tree-prof/partition3.C (revision 199014) +++ testsuite/g++.dg/tree-prof/partition3.C (working copy) @@ -1,6 +1,6 @@ // PR middle-end/45566 // { dg-require-effective-target freorder } -// { dg-options "-O -fnon-call-exceptions -freorder-blocks-and-partition" } +// { dg-options "-O2 -fnon-call-exceptions -freorder-blocks-and-partition" } int k;
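
For reviewers who want the intent of fixup_partition_crossing () without reading it through the diff markers, here is a minimal standalone sketch of the same bookkeeping. It is not GCC code. Every toy_* name, the struct layout and the fixed-size successor array are assumptions made for illustration, and the ENTRY_BLOCK_PTR/EXIT_BLOCK_PTR early return from the patch is left out. The rule it models: after an edge is redirected, the edge's crossing flag follows the src/dest partitions, a jump that ends the source block gets a crossing note when the edge crosses, and the note is dropped only once no successor edge of that block still crosses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for GCC's CFG structures; all names are hypothetical.  */
enum toy_partition { TOY_HOT, TOY_COLD };

struct toy_block;

struct toy_edge
{
  struct toy_block *src;
  struct toy_block *dest;
  bool crossing;                /* Models EDGE_CROSSING.  */
};

#define TOY_MAX_SUCCS 4

struct toy_block
{
  enum toy_partition part;      /* Models BB_PARTITION.  */
  bool ends_in_jump;            /* Models JUMP_P (BB_END (bb)).  */
  bool crossing_note;           /* Models a REG_CROSSING_JUMP note.  */
  struct toy_edge *succs[TOY_MAX_SUCCS];
  size_t n_succs;
};

/* Record E as a successor edge of BB.  */
static void
toy_add_succ (struct toy_block *bb, struct toy_edge *e)
{
  assert (bb->n_succs < TOY_MAX_SUCCS);
  bb->succs[bb->n_succs++] = e;
}

/* Update the crossing flag on E and the note on the jump ending E->src,
   following the two branches of fixup_partition_crossing in the patch.  */
static void
toy_fixup_partition_crossing (struct toy_edge *e)
{
  if (e->src->part != e->dest->part)
    {
      e->crossing = true;
      /* Only a jump can carry the note; setting it twice is harmless.  */
      if (e->src->ends_in_jump)
        e->src->crossing_note = true;
      return;
    }

  e->crossing = false;
  if (!e->src->crossing_note)
    return;

  /* Keep the note while any other successor edge still crosses.  */
  for (size_t i = 0; i < e->src->n_succs; i++)
    if (e->src->succs[i]->crossing)
      return;
  e->src->crossing_note = false;
}

/* Redirect E to TARGET, then repair the crossing bookkeeping.  */
static void
toy_redirect (struct toy_edge *e, struct toy_block *target)
{
  e->dest = target;
  toy_fixup_partition_crossing (e);
}
```

In this model, redirecting one of two successor edges of a hot block to a cold block sets the note, and the note survives until the last crossing successor has been redirected back into the hot partition. That is the behaviour the has_crossing_succ loop in the patch is there to preserve.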