Message ID | 20130514214201.5DA3980BE7@tjsboxrox.mtv.corp.google.com |
---|---|
State | New |
Headers | show |
On Tue, May 14, 2013 at 11:42 PM, Teresa Johnson wrote: > > * function.h (has_bb_partition): New rtl_data flag. > (bb_reorder_complete): Ditto. > * cfgrtl.c (rtl_verify_flow_info_1): After bbro, verify > that text sections switch at most once in layout order. > * bb-reorder.c (connect_traces): Check for has_bb_partition > instead of flag_reorder_blocks_and_partition. > (verify_hot_cold_block_grouping): Remove. > (reorder_basic_blocks): Remove call to deleted > verify_hot_cold_block_grouping, and set bb_reorder_complete. > (partition_hot_cold_basic_blocks): Set has_bb_partition. > * cfgcleanup.c (try_crossjump_to_edge): Check for has_bb_partition > instead of flag_reorder_blocks_and_partition. > I can't approve this (I'm probably the most experienced GCC contributor without any approval authority :-) but it looks good to me. One nit: Can you keep the verify_hot_cold_block_grouping function separate? rtl_verify_flow_info* is already too big and complex (somewhere down on my TODO list is splitting it up and improving cfglayout mode checking, e.g. to make sure there are no barriers/notes between basic blocks...). Ciao! Steven
On Tue, May 14, 2013 at 2:49 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote: > On Tue, May 14, 2013 at 11:42 PM, Teresa Johnson wrote: >> >> * function.h (has_bb_partition): New rtl_data flag. >> (bb_reorder_complete): Ditto. >> * cfgrtl.c (rtl_verify_flow_info_1): After bbro, verify >> that text sections switch at most once in layout order. >> * bb-reorder.c (connect_traces): Check for has_bb_partition >> instead of flag_reorder_blocks_and_partition. >> (verify_hot_cold_block_grouping): Remove. >> (reorder_basic_blocks): Remove call to deleted >> verify_hot_cold_block_grouping, and set bb_reorder_complete. >> (partition_hot_cold_basic_blocks): Set has_bb_partition. >> * cfgcleanup.c (try_crossjump_to_edge): Check for has_bb_partition >> instead of flag_reorder_blocks_and_partition. >> > > I can't approve this (I'm probably the most experienced GCC > contributor without any approval authority :-) but it looks good to > me. > > One nit: Can you keep the verify_hot_cold_block_grouping function > separate? rtl_verify_flow_info* is already too big and complex > (somewhere down on my TODO list is splitting it up and improving > cfglayout mode checking, e.g. to make sure there are no barriers/notes > between basic blocks...). Initially that's what I did, but then it seemed less efficient because it adds another iteration over the BBs, so I instead merged the check with one of the existing BB iterations in that routine. Do you still prefer that I outline it? As an alternative, perhaps rtl_verify_flow_info_1 could be split in half in a follow-on patch since there are a couple walks over the BBs in there. WDYT? Thanks, Teresa > > Ciao! > Steven -- Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
On 05/14/2013 03:42 PM, Teresa Johnson wrote: > Patch 1 of 3 split out from the patch I sent last week that fixes problems with > -freorder-blocks-and-partition, with some additional verification improvements. > > See http://gcc.gnu.org/ml/gcc-patches/2013-05/threads.html#00388 for context. > > This patch adds a flag to the rtl_data structure to indicate whether any > partitioning was actually performed, so that optimizations which were > conservatively disabled whenever the flag_reorder_blocks_and_partition > is enabled (e.g. try_crossjump_to_edge, part of connect_traces) can be less > conservative for functions where no partitions were formed (e.g. they are > completely hot). > > It also adds another flag to the rtl_data structure to indicate whether bb > reordering is complete, and if so enables sanity checking that there is > at most one transition in the layout order between hot and cold sections. > This was moved from verify_hot_cold_block_grouping, which was only called > once at the end of the bbro pass and is now removed. > > Tested on x86_64-unknown-linux-gnu with bootstrap and profiledbootstrap > builds and regression testing. Additionally built/ran cpu2006 with profile > feedback and -freorder-blocks-and-partition enabled (which currently has > build failures from splitting until my follow-on patches are in, but confirmed > no new failures due to this patch), as well as gcc regression testing with > -freorder-blocks-and-partition enabled. > > Ok for trunk? > > Thanks, > Teresa > > 2013-05-14 Teresa Johnson <tejohnson@google.com> > > * function.h (has_bb_partition): New rtl_data flag. > (bb_reorder_complete): Ditto. > * cfgrtl.c (rtl_verify_flow_info_1): After bbro, verify > that text sections switch at most once in layout order. > * bb-reorder.c (connect_traces): Check for has_bb_partition > instead of flag_reorder_blocks_and_partition. > (verify_hot_cold_block_grouping): Remove. > (reorder_basic_blocks): Remove call to deleted > verify_hot_cold_block_grouping, and set bb_reorder_complete. > (partition_hot_cold_basic_blocks): Set has_bb_partition. > * cfgcleanup.c (try_crossjump_to_edge): Check for has_bb_partition > instead of flag_reorder_blocks_and_partition. This is good. Please install on the trunk. jeff
On Wed, May 15, 2013 at 12:50 AM, Teresa Johnson wrote: > On Tue, May 14, 2013 at 2:49 PM, Steven Bosscher wrote: >> One nit: Can you keep the verify_hot_cold_block_grouping function >> separate? rtl_verify_flow_info* is already too big and complex >> (somewhere down on my TODO list is splitting it up and improving >> cfglayout mode checking, e.g. to make sure there are no barriers/notes >> between basic blocks...). > > Initially that's what I did, but then it seemed less efficient because > it adds another iteration over the BBs, so I instead merged the check > with one of the existing BB iterations in that routine. Do you still > prefer that I outline it? I'm not concerned about efficiency of the checker routines. They only run with checking enabled anyway. These checkers are almost the best "documentation" of the rules of GCC's intermediate representations that we have, so it's more important to me to make them easy to understand, and to make sure these verifiers themselves are complete and correct. So another BB walk wouldn't be a problem IMHO. Some of the other checkers do far worse things (some of them include non-linear algorithms, for example). Ciao! Steven
On 05/15/2013 12:02 AM, Steven Bosscher wrote: > On Wed, May 15, 2013 at 12:50 AM, Teresa Johnson wrote: >> On Tue, May 14, 2013 at 2:49 PM, Steven Bosscher wrote: >>> One nit: Can you keep the verify_hot_cold_block_grouping function >>> separate? rtl_verify_flow_info* is already too big and complex >>> (somewhere down on my TODO list is splitting it up and improving >>> cfglayout mode checking, e.g. to make sure there are no barriers/notes >>> between basic blocks...). >> >> Initially that's what I did, but then it seemed less efficient because >> it adds another iteration over the BBs, so I instead merged the check >> with one of the existing BB iterations in that routine. Do you still >> prefer that I outline it? > > I'm not concerned about efficiency of the checker routines. They only > run with checking enabled anyway. These checkers are almost the best > "documentation" of the rules of GCC's intermediate representations > that we have, so it's more important to me to make them easy to > understand, and to make sure these verifiers themselves are complete > and correct. Right. > > So another BB walk wouldn't be a problem IMHO. Some of the other > checkers do far worse things (some of them include non-linear > algorithms, for example). I've got no problem with another BB walk. I'll pre-approve the existing patch with the checker moved back out. jeff
On Tue, May 14, 2013 at 11:04 PM, Jeff Law <law@redhat.com> wrote: > On 05/15/2013 12:02 AM, Steven Bosscher wrote: >> >> On Wed, May 15, 2013 at 12:50 AM, Teresa Johnson wrote: >>> >>> On Tue, May 14, 2013 at 2:49 PM, Steven Bosscher wrote: >>>> >>>> One nit: Can you keep the verify_hot_cold_block_grouping function >>>> separate? rtl_verify_flow_info* is already too big and complex >>>> (somewhere down on my TODO list is splitting it up and improving >>>> cfglayout mode checking, e.g. to make sure there are no barriers/notes >>>> between basic blocks...). >>> >>> >>> Initially that's what I did, but then it seemed less efficient because >>> it adds another iteration over the BBs, so I instead merged the check >>> with one of the existing BB iterations in that routine. Do you still >>> prefer that I outline it? >> >> >> I'm not concerned about efficiency of the checker routines. They only >> run with checking enabled anyway. These checkers are almost the best >> "documentation" of the rules of GCC's intermediate representations >> that we have, so it's more important to me to make them easy to >> understand, and to make sure these verifiers themselves are complete >> and correct. > > Right. > > >> >> So another BB walk wouldn't be a problem IMHO. Some of the other >> checkers do far worse things (some of them include non-linear >> algorithms, for example). > > I've got no problem with another BB walk. I'll pre-approve the existing > patch with the checker moved back out. Done, retested with bootstrap/regression on x86_64-unknown-linux-gnu, and committed as r198934. Steven - I was looking at rtl_verify_flow_info_1 last night and have some ideas about how to refactor it a bit. I will try to send a patch for that in the next day since I will be adding to that verification code with my other function splitting patches and it would be good to get that refactored first. Thanks, Teresa > > jeff -- Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
Index: bb-reorder.c =================================================================== --- bb-reorder.c (revision 198891) +++ bb-reorder.c (working copy) @@ -1053,7 +1053,7 @@ connect_traces (int n_traces, struct trace *traces current_partition = BB_PARTITION (traces[0].first); two_passes = false; - if (flag_reorder_blocks_and_partition) + if (crtl->has_bb_partition) for (i = 0; i < n_traces && !two_passes; i++) if (BB_PARTITION (traces[0].first) != BB_PARTITION (traces[i].first)) @@ -1262,7 +1262,7 @@ connect_traces (int n_traces, struct trace *traces } } - if (flag_reorder_blocks_and_partition) + if (crtl->has_bb_partition) try_copy = false; /* Copy tiny blocks always; copy larger blocks only when the @@ -2068,43 +2068,6 @@ add_reg_crossing_jump_notes (void) add_reg_note (BB_END (e->src), REG_CROSSING_JUMP, NULL_RTX); } -/* Verify, in the basic block chain, that there is at most one switch - between hot/cold partitions. This is modelled on - rtl_verify_flow_info_1, but it cannot go inside that function - because this condition will not be true until after - reorder_basic_blocks is called. */ - -static void -verify_hot_cold_block_grouping (void) -{ - basic_block bb; - int err = 0; - bool switched_sections = false; - int current_partition = 0; - - FOR_EACH_BB (bb) - { - if (!current_partition) - current_partition = BB_PARTITION (bb); - if (BB_PARTITION (bb) != current_partition) - { - if (switched_sections) - { - error ("multiple hot/cold transitions found (bb %i)", - bb->index); - err = 1; - } - else - { - switched_sections = true; - current_partition = BB_PARTITION (bb); - } - } - } - - gcc_assert(!err); -} - /* Reorder basic blocks. The main entry point to this file. FLAGS is the set of flags to pass to cfg_layout_initialize(). */ @@ -2157,8 +2120,9 @@ reorder_basic_blocks (void) dump_flow_info (dump_file, dump_flags); } - if (flag_reorder_blocks_and_partition) - verify_hot_cold_block_grouping (); + /* Signal that rtl_verify_flow_info_1 can now verify that there + is at most one switch between hot/cold sections. */ + crtl->bb_reorder_complete = true; } /* Determine which partition the first basic block in the function @@ -2503,6 +2467,8 @@ partition_hot_cold_basic_blocks (void) if (!crossing_edges.exists ()) return 0; + crtl->has_bb_partition = true; + /* Make sure the source of any crossing edge ends in a jump and the destination of any crossing edge has a label. */ add_labels_and_missing_jumps (crossing_edges); Index: cfgrtl.c =================================================================== --- cfgrtl.c (revision 198891) +++ cfgrtl.c (working copy) @@ -2082,6 +2082,8 @@ rtl_verify_flow_info_1 (void) rtx x; int err = 0; basic_block bb; + bool switched_sections = false; + int current_partition = BB_UNPARTITIONED; /* Check the general integrity of the basic blocks. */ FOR_EACH_BB_REVERSE (bb) @@ -2299,6 +2301,29 @@ rtl_verify_flow_info_1 (void) err = 1; } + /* Verify, in the basic block chain, that there is at most one switch + between hot/cold partitions. This condition will not be true until + after reorder_basic_blocks is called. */ + if (crtl->bb_reorder_complete) + { + if (current_partition != BB_UNPARTITIONED + && BB_PARTITION (bb) != current_partition) + { + if (switched_sections) + { + error ("multiple hot/cold transitions found (bb %i)", + bb->index); + err = 1; + } + else + switched_sections = true; + + if (!crtl->has_bb_partition) + error ("partition found but function partition flag not set"); + } + current_partition = BB_PARTITION (bb); + } + if (BB_END (bb) == x) /* Do checks for empty blocks here. */ ; Index: function.h =================================================================== --- function.h (revision 198891) +++ function.h (working copy) @@ -446,6 +446,15 @@ struct GTY(()) rtl_data { sched2) and is useful only if the port defines LEAF_REGISTERS. */ bool uses_only_leaf_regs; + /* Nonzero if the function being compiled has undergone hot/cold partitioning + (under flag_reorder_blocks_and_partition) and has at least one cold + block. */ + bool has_bb_partition; + + /* Nonzero if the function being compiled has completed the bb reordering + pass. */ + bool bb_reorder_complete; + /* Like regs_ever_live, but 1 if a reg is set or clobbered from an asm. Unlike regs_ever_live, elements of this array corresponding to eliminable regs (like the frame pointer) are set if an asm Index: cfgcleanup.c =================================================================== --- cfgcleanup.c (revision 198891) +++ cfgcleanup.c (working copy) @@ -1864,7 +1864,7 @@ try_crossjump_to_edge (int mode, edge e1, edge e2, partition boundaries). See the comments at the top of bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ - if (flag_reorder_blocks_and_partition && reload_completed) + if (crtl->has_bb_partition && reload_completed) return false; /* Search backward through forwarder blocks. We don't need to worry