Message ID | mcrmxr977wr.fsf@google.com |
---|---|
State | New |
Headers | show |
On 09/22/2010 03:03 PM, Ian Lance Taylor wrote: > This is the fourth of the -fsplit-stack patches. This patch adds a new > feature to the DWARF frame note. If an insn has a new REG_CFA_TEMPORARY > note, then the CFA is set to the value of the note only for that insn. > The i386 backend uses this new feature to set the unwind information for > a special ret instruction that it uses to permit the processor to do > call/return prediction. Er, I'm not keen on this. In particular, how does > + dwarf2out_frame_debug_remember_state (); > + dwarf2out_frame_debug_def_cfa (XEXP (note, 0), label); > + } > + else > + dwarf2out_frame_debug_restore_state (); interact with the save/restore that we already emit for epilogues? Seems to me that it's more-or-less redundant. An alternative implementation is to simply add a REG_CFA_{ADJUST,DEF}_CFA note to a blockage insn immediately before the special return. r~
On 09/23/2010 10:28 AM, Richard Henderson wrote: > An alternative implementation is to simply add a REG_CFA_{ADJUST,DEF}_CFA > note to a blockage insn immediately before the special return. ... or more properly, now that i think about it, to the insn that does the cfa adjustment. I really can't tell what you're wanting to do with this REG_CFA_TEMP thing... r~
On 09/22/2010 03:03 PM, Ian Lance Taylor wrote: > This is the fourth of the -fsplit-stack patches. This patch adds a new > feature to the DWARF frame note. If an insn has a new REG_CFA_TEMPORARY > note, then the CFA is set to the value of the note only for that insn. > The i386 backend uses this new feature to set the unwind information for > a special ret instruction that it uses to permit the processor to do > call/return prediction. Ok, scratch the previous sub-thread. I've looked through 6/7 to see how you're actually using it. That said, I still don't like it the note. I think this can be entirely handled within the special return insn. { char *lab; lab = dwarf2out_cfi_label (false); dwarf2out_frame_debug_remember_state (lab); dwarf2out_frame_debug_def_cfa (lab, plus_constant (stack_pointer_rtx, UNITS_PER_WORD)); // output ret lab = dwarf2out_cfi_label (false); dwarf2out_frame_debug_restore_state (lab); return ""; } (with the obvious tweeks to the dwarf2out_frame_debug_* functions. I also wonder if it wouldn't be better to merge the call and return insns into the same pattern. This would guarantee that nothing sneeks in, when we're playing games with ret_addr+1. r~
Richard Henderson <rth@redhat.com> writes: > On 09/22/2010 03:03 PM, Ian Lance Taylor wrote: >> This is the fourth of the -fsplit-stack patches. This patch adds a new >> feature to the DWARF frame note. If an insn has a new REG_CFA_TEMPORARY >> note, then the CFA is set to the value of the note only for that insn. >> The i386 backend uses this new feature to set the unwind information for >> a special ret instruction that it uses to permit the processor to do >> call/return prediction. > > Ok, scratch the previous sub-thread. I've looked through 6/7 to > see how you're actually using it. > > That said, I still don't like it the note. Actually, I'm going to withdraw this part of the patch. It was necessary at one time to unwind the stack, but now the assembly code uses slightly tricky unwind info to skip the single return instruction entirely during the unwind process. There are still some issues with the unwind information in 32-bit mode, but they aren't helped by this patch. So please consider this patch withdrawn, with the corresponding changes to patch 5 (x86 backend). > I also wonder if it wouldn't be better to merge the call and return > insns into the same pattern. This would guarantee that nothing > sneeks in, when we're playing games with ret_addr+1. I'm a bit reluctant to do that now, since there seem to be several different variants of the x86 call instruction. It's certainly a possibility for later. There is very little opportunity for anything to slip in, since the return instruction is unspec_volatile and is immediately followed by the label which is the target of the branch of there is enough stack space. Ian
On 09/24/2010 10:33 AM, Ian Lance Taylor wrote: > Actually, I'm going to withdraw this part of the patch. It was > necessary at one time to unwind the stack, but now the assembly code > uses slightly tricky unwind info to skip the single return instruction > entirely during the unwind process. Actually -fasynchronous-unwind-tables needs it. If the interrupt that samples the call-stack happens at that lone return insn, we'll have wrong unwind info. > I'm a bit reluctant to do that now, since there seem to be several > different variants of the x86 call instruction. It's certainly a > possibility for later. There's really no need to consider all the different rtl call variants. There's only one actual direct call insn. Ignoring the unwind info and return stack popping for clarity, the insn pattern could be as simple as (define_insn "morestack" [(unspec_v [(match_operand 0)] UNSPECV_MORESTACK)] "" "call __morestack\;ret" ) That said, I wonder about the relative impact of the call-stack branch (mis)prediction vs the double-branch and extra icache fetches. r~
Richard Henderson <rth@redhat.com> writes: > On 09/24/2010 10:33 AM, Ian Lance Taylor wrote: >> Actually, I'm going to withdraw this part of the patch. It was >> necessary at one time to unwind the stack, but now the assembly code >> uses slightly tricky unwind info to skip the single return instruction >> entirely during the unwind process. > > Actually -fasynchronous-unwind-tables needs it. > > If the interrupt that samples the call-stack happens at > that lone return insn, we'll have wrong unwind info. That can certainly happen and should be fixed at some point, but it is not a significant issue in practice because we are always within the unwind-regime of the start of the function at this point, and this gives us a valid stack unwind. > >> I'm a bit reluctant to do that now, since there seem to be several >> different variants of the x86 call instruction. It's certainly a >> possibility for later. > > There's really no need to consider all the different rtl call > variants. There's only one actual direct call insn. > > Ignoring the unwind info and return stack popping for clarity, > the insn pattern could be as simple as > > (define_insn "morestack" > [(unspec_v [(match_operand 0)] UNSPECV_MORESTACK)] > "" > "call __morestack\;ret" > ) Oh yeah, good point. > That said, I wonder about the relative impact of the call-stack > branch (mis)prediction vs the double-branch and extra icache fetches. It's not just one call-stack misprediction, it means that the entire call stack get mispredicted. I think that will outweigh the extra icache fetch. Ian
On 09/24/2010 12:30 PM, Ian Lance Taylor wrote: > That can certainly happen and should be fixed at some point, but it is > not a significant issue in practice because we are always within the > unwind-regime of the start of the function at this point, and this gives > us a valid stack unwind. Oh yeah, we're still before the prologue, so we should be exactly at the entry point's unwind info. Fantastic. r~
Index: gcc/reg-notes.def =================================================================== --- gcc/reg-notes.def (revision 164490) +++ gcc/reg-notes.def (working copy) @@ -165,6 +165,13 @@ REG_NOTE (CFA_RESTORE) to the argument, if it is a MEM, it is ignored. */ REG_NOTE (CFA_SET_VDRAP) +/* Temporarily set the CFA for just one insn. This saves the old + state, sets the CFA, and then restores the state after the insn. + The pattern for this note is the new CFA value. This is used by + the split stack code to support unwinding through the call into the + split stack routine. */ +REG_NOTE (CFA_TEMPORARY) + /* Indicates that REG holds the exception context for the function. This context is shared by inline functions, so the code to acquire the real exception context is delayed until after inlining. */ Index: gcc/dwarf2out.c =================================================================== --- gcc/dwarf2out.c (revision 164490) +++ gcc/dwarf2out.c (working copy) @@ -473,6 +473,7 @@ static void output_call_frame_info (int) static void dwarf2out_note_section_used (void); static bool clobbers_queued_reg_save (const_rtx); static void dwarf2out_frame_debug_expr (rtx, const char *); +static void dwarf2out_frame_debug_remember_state (void); /* Support for complex CFA locations. */ static void output_cfa_loc (dw_cfi_ref); @@ -2832,6 +2833,17 @@ dwarf2out_frame_debug (rtx insn, bool af handled_one = true; break; + case REG_CFA_TEMPORARY: + if (!after_p) + { + dwarf2out_frame_debug_remember_state (); + dwarf2out_frame_debug_def_cfa (XEXP (note, 0), label); + } + else + dwarf2out_frame_debug_restore_state (); + handled_one = true; + break; + default: break; } @@ -2924,9 +2936,17 @@ dwarf2out_cfi_begin_epilogue (rtx insn) } emit_note_before (NOTE_INSN_CFA_RESTORE_STATE, i); + dwarf2out_frame_debug_remember_state (); +} + +/* Remember the current state. */ + +static void +dwarf2out_frame_debug_remember_state (void) +{ emit_cfa_remember = true; - /* And emulate the state save. */ + /* Emulate the state save. */ gcc_assert (!cfa_remember.in_use); cfa_remember = cfa; cfa_remember.in_use = 1;