Message ID | AANLkTinQMTqA02PZOEfqzFj132kKD0NF8O46Wk_4qUZm@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Thu, Nov 18, 2010 at 6:32 PM, H.J. Lu <hjl.tools@gmail.com> wrote: >>> insn != BB_END (bb) && NEXT_INSN (insn) == NEXT_INSN (BB_END (bb)) >>> >>> We should check NEXT_INSN (insn) != NEXT_INSN (BB_END (bb)) in >>> move_or_delete_vzeroupper_2. This patch does it. >> >> Huh? The loop does simple linear scan of all insns in the bb, so it >> can't miss BB_END. IIUC, in your case the bb does not have BB_END >> (bb), but it has NEXT_INSN (BB_END (bb))? >> >> Can you please provide a test case that illustrates this? >> > > ix86_pad_returns forgot to update BB_END when it > replaces it with a new one. OK for trunk? IMO, you should just move the call to vzeroupper optimization to be the first thing in ix86_reorg. This way, ix86_pad_short_function, ix86_pad_returns and ix86_avoid_jump_mispredict will also count emitted vzeroupper. The only possible drawback of this approach would be different position of nops w.r.t to vzeroupper in case of ix86_pad_short_functions: vzeroupper nop nop nop ret Uros.
On Thu, Nov 18, 2010 at 10:38 AM, Uros Bizjak <ubizjak@gmail.com> wrote: > On Thu, Nov 18, 2010 at 6:32 PM, H.J. Lu <hjl.tools@gmail.com> wrote: > >>>> insn != BB_END (bb) && NEXT_INSN (insn) == NEXT_INSN (BB_END (bb)) >>>> >>>> We should check NEXT_INSN (insn) != NEXT_INSN (BB_END (bb)) in >>>> move_or_delete_vzeroupper_2. This patch does it. >>> >>> Huh? The loop does simple linear scan of all insns in the bb, so it >>> can't miss BB_END. IIUC, in your case the bb does not have BB_END >>> (bb), but it has NEXT_INSN (BB_END (bb))? >>> >>> Can you please provide a test case that illustrates this? >>> >> >> ix86_pad_returns forgot to update BB_END when it >> replaces it with a new one. OK for trunk? > > IMO, you should just move the call to vzeroupper optimization to be > the first thing in ix86_reorg. This way, ix86_pad_short_function, > ix86_pad_returns and ix86_avoid_jump_mispredict will also count > emitted vzeroupper. But it will leave bad BB_END in place. Any uses of BB_END later will still be screwed. > The only possible drawback of this approach would be different > position of nops w.r.t to vzeroupper in case of > ix86_pad_short_functions: > > vzeroupper > nop > nop > nop > ret That is one issue.
On Thu, Nov 18, 2010 at 8:37 PM, H.J. Lu <hjl.tools@gmail.com> wrote: >>> ix86_pad_returns forgot to update BB_END when it >>> replaces it with a new one. OK for trunk? >> >> IMO, you should just move the call to vzeroupper optimization to be >> the first thing in ix86_reorg. This way, ix86_pad_short_function, >> ix86_pad_returns and ix86_avoid_jump_mispredict will also count >> emitted vzeroupper. > > But it will leave bad BB_END in place. Any uses of BB_END later > will still be screwed. I think that Jan or Richard (CC'd) can provide better answer on this issue. Uros.
On Thu, Nov 18, 2010 at 11:43 AM, Uros Bizjak <ubizjak@gmail.com> wrote: > On Thu, Nov 18, 2010 at 8:37 PM, H.J. Lu <hjl.tools@gmail.com> wrote: > >>>> ix86_pad_returns forgot to update BB_END when it >>>> replaces it with a new one. OK for trunk? >>> >>> IMO, you should just move the call to vzeroupper optimization to be >>> the first thing in ix86_reorg. This way, ix86_pad_short_function, >>> ix86_pad_returns and ix86_avoid_jump_mispredict will also count >>> emitted vzeroupper. >> >> But it will leave bad BB_END in place. Any uses of BB_END later >> will still be screwed. > > I think that Jan or Richard (CC'd) can provide better answer on this issue. > I opened a bug: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46546
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 7eb4116..3cd066d 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -29749,7 +29746,8 @@ ix86_pad_returns (void) } if (replace) { - emit_jump_insn_before (gen_return_internal_long (), ret); + BB_END (bb) + = emit_jump_insn_before (gen_return_internal_long (), ret); delete_insn (ret); } }