Message ID | 000001cfc5bc$9876f4d0$c964de70$@arm.com |
---|---|
State | New |
On 09/01/14 02:13, Zhenqiang Chen wrote:
>
> To split live-range of register, split_live_ranges_for_shrink_wrap will
> introduce additional register copies. If such copies can not be optimized by
> later optimizations, it will lead to code size and performance regression.
> My tests on ARM THUMB1 code size show lots of regressions due to additional
> register copies. Shrink-wrap is not enabled for ARM THUMB1, so I think
> split_live_ranges_for_shrink_wrap should not be called.

So has anyone looked at why IRA ends up selecting different registers for
the source/dest of these copies? Odds are it's just an artifact of the
heuristics in use, but I'd like to make sure there isn't something
inherently wrong happening in IRA that's causing it to not tie the
source/dest of those copies.

> ChangeLog:
> 2014-09-01  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>
>     * shrink-wrap.h: #define SHRINK_WRAPPING_ENABLED.
>     * ira.c: #include "shrink-wrap.h"
>     (split_live_ranges_for_shrink_wrap): Use SHRINK_WRAPPING_ENABLED.
>     * ifcvt.c: #include "shrink-wrap.h"
>     (dead_or_predicable): Use SHRINK_WRAPPING_ENABLED.
>
> testsuite/ChangeLog:
> 2014-09-01  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>
>     * gcc.target/arm/split-live-ranges-for-shrink-wrap.c: New test.

Thanks. OK for the trunk.

As noted above, it may be worth spending a little time looking at the
regressions without this patch installed to see why IRA isn't doing a good
job of tying the source/dest of these copies together -- perhaps there's
something that's been overlooked and fixing it may be beneficial.

jeff

> -----Original Message-----
> From: Jeff Law [mailto:law@redhat.com]
> Sent: Friday, September 05, 2014 12:45 PM
> To: Zhenqiang Chen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH, ira] Miss checks in split_live_ranges_for_shrink_wrap
>
> On 09/01/14 02:13, Zhenqiang Chen wrote:
> >
> > To split live-range of register, split_live_ranges_for_shrink_wrap
> > will introduce additional register copies. If such copies can not be
> > optimized by later optimizations, it will lead to code size and
> > performance regression.
> > My tests on ARM THUMB1 code size show lots of regressions due to
> > additional register copies. Shrink-wrap is not enabled for ARM THUMB1,
> > so I think split_live_ranges_for_shrink_wrap should not be called.
>
> So has anyone looked at why IRA ends up selecting different registers
> for the source/dest of these copies? Odds are it's just an artifact of
> the heuristics in use, but I'd like to make sure there isn't something
> inherently wrong happening in IRA that's causing it to not tie the
> source/dest of those copies.
>
> > ChangeLog:
> > 2014-09-01  Zhenqiang Chen  <zhenqiang.chen@arm.com>
> >
> >     * shrink-wrap.h: #define SHRINK_WRAPPING_ENABLED.
> >     * ira.c: #include "shrink-wrap.h"
> >     (split_live_ranges_for_shrink_wrap): Use SHRINK_WRAPPING_ENABLED.
> >     * ifcvt.c: #include "shrink-wrap.h"
> >     (dead_or_predicable): Use SHRINK_WRAPPING_ENABLED.
> >
> > testsuite/ChangeLog:
> > 2014-09-01  Zhenqiang Chen  <zhenqiang.chen@arm.com>
> >
> >     * gcc.target/arm/split-live-ranges-for-shrink-wrap.c: New test.
>
> Thanks. OK for the trunk.

Thanks. The patch is installed at r215041.

> As noted above, it may be worth spending a little time looking at the
> regressions without this patch installed to see why IRA isn't doing a good
> job of tying the source/dest of these copies together -- perhaps there's
> something that's been overlooked and fixing it may be beneficial.

I have investigated it. Compared with 4.8, the allocation order and conflict
cost might be the root cause. A bug has been filed: PR63210.

Thanks!
-Zhenqiang

> jeff

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 94b96f3..d2af0f9 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -42,6 +42,7 @@
 #include "df.h"
 #include "vec.h"
 #include "dbgcnt.h"
+#include "shrink-wrap.h"
 
 #ifndef HAVE_conditional_move
 #define HAVE_conditional_move 0
@@ -4287,14 +4288,13 @@ dead_or_predicable (basic_block test_bb, basic_block merge_bb,
         if (NONDEBUG_INSN_P (insn))
           df_simulate_find_defs (insn, merge_set);
 
-#ifdef HAVE_simple_return
       /* If shrink-wrapping, disable this optimization when test_bb is
          the first basic block and merge_bb exits.  The idea is to not
          move code setting up a return register as that may clobber a
          register used to pass function parameters, which then must be
          saved in caller-saved regs.  A caller-saved reg requires the
          prologue, killing a shrink-wrap opportunity.  */
-      if ((flag_shrink_wrap && HAVE_simple_return && !epilogue_completed)
+      if ((SHRINK_WRAPPING_ENABLED && !epilogue_completed)
           && ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb == test_bb
           && single_succ_p (new_dest)
           && single_succ (new_dest) == EXIT_BLOCK_PTR_FOR_FN (cfun)
@@ -4341,7 +4341,6 @@ dead_or_predicable (basic_block test_bb, basic_block merge_bb,
             }
           BITMAP_FREE (return_regs);
         }
-#endif
     }
 
  no_body:
diff --git a/gcc/ira.c b/gcc/ira.c
index 7c18496..f4140e4 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -392,6 +392,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "lra.h"
 #include "dce.h"
 #include "dbgcnt.h"
+#include "shrink-wrap.h"
 
 struct target_ira default_target_ira;
 struct target_ira_int default_target_ira_int;
@@ -4781,7 +4782,7 @@ split_live_ranges_for_shrink_wrap (void)
   bitmap_head need_new, reachable;
   vec<basic_block> queue;
 
-  if (!flag_shrink_wrap)
+  if (!SHRINK_WRAPPING_ENABLED)
     return false;
 
   bitmap_initialize (&need_new, 0);
diff --git a/gcc/shrink-wrap.h b/gcc/shrink-wrap.h
index 66bd26d..afcfec3 100644
--- a/gcc/shrink-wrap.h
+++ b/gcc/shrink-wrap.h
@@ -46,6 +46,9 @@ extern edge get_unconverted_simple_return (edge, bitmap_head,
 extern void convert_to_simple_return (edge entry_edge, edge orig_entry_edge,
                                       bitmap_head bb_flags, rtx returnjump,
                                       vec<edge> unconverted_simple_returns);
+#define SHRINK_WRAPPING_ENABLED (flag_shrink_wrap && HAVE_simple_return)
+#else
+#define SHRINK_WRAPPING_ENABLED false
 #endif
 
 #endif  /* GCC_SHRINK_WRAP_H */
diff --git a/gcc/testsuite/gcc.target/arm/split-live-ranges-for-shrink-wrap.c b/gcc/testsuite/gcc.target/arm/split-live-ranges-for-shrink-wrap.c
new file mode 100644
index 0000000..e36000b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/split-live-ranges-for-shrink-wrap.c
@@ -0,0 +1,14 @@
+/* { dg-do assemble } */
+/* { dg-options "-mthumb -Os -fdump-rtl-ira " } */
+/* { dg-require-effective-target arm_thumb1_ok } */
+
+int foo (char *, char *, int);
+int test (int d, char * out, char *in, int len)
+{
+  if (out != in)
+    foo (out, in, len);
+  return 0;
+}
+/* { dg-final { object-size text <= 20 } } */
+/* { dg-final { scan-rtl-dump-not "Split live-range of register" "ira" } } */
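
For readers who want to see the gating in isolation, here is a minimal standalone sketch of the SHRINK_WRAPPING_ENABLED pattern from the shrink-wrap.h hunk. It is not GCC code. The variable target_has_simple_return and the functions would_split_live_ranges and sketch_self_check are hypothetical names made up for illustration, and the sketch assumes the #else in the hunk pairs with an #ifdef HAVE_simple_return that the diff context does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins.  In GCC, flag_shrink_wrap comes from the option
   machinery and HAVE_simple_return from the target's generated insn flags;
   here both are plain variables so the sketch compiles on its own.  */
static bool flag_shrink_wrap = true;
static bool target_has_simple_return = false;
#define HAVE_simple_return target_has_simple_return

/* Same shape as the macro the patch adds.  The enclosing #ifdef is an
   assumption: the hunk only shows the #else and #endif.  */
#ifdef HAVE_simple_return
#define SHRINK_WRAPPING_ENABLED (flag_shrink_wrap && HAVE_simple_return)
#else
#define SHRINK_WRAPPING_ENABLED false
#endif

/* Models only the early exit at the top of
   split_live_ranges_for_shrink_wrap.  Returns true when the splitting
   work (elided here) would run.  */
static bool
would_split_live_ranges (void)
{
  if (!SHRINK_WRAPPING_ENABLED)
    return false;
  return true;
}

/* Usage example: the splitting runs only when the option is on AND the
   target provides simple_return.  */
static void
sketch_self_check (void)
{
  flag_shrink_wrap = true;
  target_has_simple_return = false;   /* option on, no simple_return */
  assert (!would_split_live_ranges ());

  target_has_simple_return = true;    /* option on, simple_return available */
  assert (would_split_live_ranges ());

  flag_shrink_wrap = false;           /* option off */
  assert (!would_split_live_ranges ());
}
```

Because the macro always expands to something usable as a condition, callers such as dead_or_predicable can test it directly, which is what lets the patch drop the #ifdef/#endif pair around that block in ifcvt.c.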