Message ID | 20110727053045.GO1081@bubble.grove.modra.org |
---|---|
State | New |
On Wed, Jul 27, 2011 at 1:30 AM, Alan Modra <amodra@gmail.com> wrote:
>        * config/rs6000/linux-unwind.h (frob_update_context <__powerpc64__>):
>        Leave r2 REG_UNSAVED if stopped on the instruction that saves r2
>        in a plt call stub.  Do restore r2 if stopped on bctrl.

Okay.

Thanks, David
On Wed, Jul 27, 2011 at 03:00:45PM +0930, Alan Modra wrote:
> Ideally what I'd like to
> do is have ld and gcc emit accurate r2 tracking unwind info and
> dispense with hacks like frob_update_context.  If ld did emit accurate
> unwind info for .glink, then the justification for frob_update_context
> disappears.

For the record, this statement of mine doesn't make sense.  A .glink
stub doesn't make a frame, so a backtrace won't normally pass through a
stub, thus having accurate unwind info for .glink doesn't help at all.

ld would need to insert unwind info for r2 on the call, but that
involves editing .eh_frame and in any case isn't accurate since the r2
save doesn't happen until one or two instructions after the call, in
the stub.  I think we are stuck with frob_update_context.
On 07/28/2011 12:27 AM, Alan Modra wrote:
> On Wed, Jul 27, 2011 at 03:00:45PM +0930, Alan Modra wrote:
>> Ideally what I'd like to
>> do is have ld and gcc emit accurate r2 tracking unwind info and
>> dispense with hacks like frob_update_context.  If ld did emit accurate
>> unwind info for .glink, then the justification for frob_update_context
>> disappears.
>
> For the record, this statement of mine doesn't make sense.  A .glink
> stub doesn't make a frame, so a backtrace won't normally pass through a
> stub, thus having accurate unwind info for .glink doesn't help at all.

It does, for the duration of the stub.

The whole problem is that toc pointer copy in 40(1) is only valid
during indirect call sequences, and iff ld inserted a stub?  I.e.
direct calls between functions that share toc pointers never save
the copy?

Would it make sense, if a function has any indirect call, to move
the toc pointer save into the prologue?  You'd get to avoid that
store all the time.  Of course you'd not be able to sink the load
after the call, but it might still be a win.  And in that special
case you can annotate the r2 save slot just once, correctly.

For functions that do not contain an indirect function call, I don't
believe that there's any way to use DW_CFA_offset that is always
correct.  One could, however, move the code in frob_update_context
into a (series of) DW_CFA_val_expression's.

  DW_CFA_val_expression
    DW_OP_reg2                  // Default to the value currently in R2
    DW_OP_regx LR               // Test the insn following the call, as per frob_update_context
    DW_OP_deref_size 4
    DW_OP_const4u 0xE8410028
    DW_OP_ne
    DW_OP_bra L1
    DW_OP_drop                  // Could be omitted, given that we only examine top-of-stack at the end
    DW_OP_breg1 40              // Pull the value from *(R1+40)
    DW_OP_deref
  L1:

This version could appear in the CIE.  You'd have to adjust it once
LR gets saved to the stack, and R2 isn't itself being saved as per
above.

There isn't currently a hook in dwarf2cfi to add extra stuff to the
CIE program, but that wouldn't be hard to add.  The version that gets
emitted after LR is saved would need a new note as well.  But it all
seems fairly tractable to actually implement, if we think it'll
actually solve the problem.

r~
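For readers less used to DWARF stack programs, the value that the expression above is meant to produce can be written as a few lines of C. This is only an illustrative sketch: the function name `unwound_r2` and the modelling of the frame as an array of 64-bit words are assumptions made here, not anything taken from libgcc or dwarf2cfi.

```c
#include <stddef.h>
#include <stdint.h>

#define LD_R2_40_R1 0xE8410028u   /* encoding of "ld r2,40(r1)" */

/* Hypothetical model of the DW_CFA_val_expression sketched above.
   r2: value currently in r2.
   lr: address held in LR, i.e. the insn following the call.
   r1: the frame, viewed as 64-bit words, so r1[5] is the slot at 40(r1).  */
static uint64_t
unwound_r2 (uint64_t r2, const uint32_t *lr, const uint64_t *r1)
{
  /* DW_OP_ne / DW_OP_bra L1: keep the live r2 unless the insn after the
     call reloads r2 from the save slot.  */
  if (lr == NULL || *lr != LD_R2_40_R1)
    return r2;

  /* DW_OP_breg1 40; DW_OP_deref: fetch the saved copy.  */
  return r1[40 / sizeof (uint64_t)];
}
```

The NULL check has no counterpart in the DWARF program; it mirrors the `insn &&` guard in frob_update_context.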
On Thu, Jul 28, 2011 at 2:49 PM, Richard Henderson <rth@redhat.com> wrote:
> The whole problem is that toc pointer copy in 40(1) is only valid
> during indirect call sequences, and iff ld inserted a stub?  I.e.
> direct calls between functions that share toc pointers never save
> the copy?
>
> Would it make sense, if a function has any indirect call, to move
> the toc pointer save into the prologue?  You'd get to avoid that
> store all the time.  Of course you'd not be able to sink the load
> after the call, but it might still be a win.  And in that special
> case you can annotate the r2 save slot just once, correctly.

Michael Meissner recently did move R2 save into the prologue, under
certain circumstances.  See TARGET_SAVE_TOC_INDIRECT.  Limitations
include alloca (unless one re-copies the R2).  Mike also encountered
some problems with EH, which may be related to this discussion.

The other problem is hoisting the store into the prologue is not
always profitable for performance.  It should be better once shrink
wrapping is implemented.  Currently the PPC ABI may perform a lot of
stores in the prologue if the function *may* make a call.  R2 adds yet
another store to the common path.

- David
On 07/28/2011 12:02 PM, David Edelsohn wrote:
> The other problem is hoisting the store into the prologue is not
> always profitable for performance.  It should be better once shrink
> wrapping is implemented.  Currently the PPC ABI may perform a lot of
> stores in the prologue if the function *may* make a call.  R2 adds yet
> another store to the common path.

Well, even if we're not able to hoist the R2 store, we may be able
to simply add REG_CFA_OFFSET and REG_CFA_RESTORE notes to the insns
in the stream.

r~
On Thu, Jul 28, 2011 at 12:09:51PM -0700, Richard Henderson wrote:
> Well, even if we're not able to hoist the R2 store, we may be able
> to simply add REG_CFA_OFFSET and REG_CFA_RESTORE notes to the insns
> in the stream.

You'd need to mark every non-local call with something that says
R2 may be saved, effectively duplicating md_frob_update in dwarf.
I guess that is possible even without extending our eh encoding, but
each call would have at least 6 bytes added to eh_frame:

  DW_CFA_expression, 2, 3, DW_OP_skip, offset_to_r2_prog

and you'd need to emit multiple copies of "r2_prog" for functions that
have a lot of calls, since the offset is limited to +/-32k.  I think
that would inflate the size of .eh_frame too much, and slow down
handling of exceptions dramatically.
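The 6-byte figure and the +/-32k limit both follow from the encoding: one opcode byte, a one-byte ULEB128 register number, a one-byte ULEB128 block length, the DW_OP_skip opcode, and its signed 2-byte operand. A minimal sketch of that layout follows. The helper name `emit_r2_skip_note` is hypothetical, the opcode values are the standard DWARF ones (DW_CFA_expression is 0x10, DW_OP_skip is 0x2f), and storing the operand big-endian is an assumption made for a big-endian powerpc64 target.

```c
#include <stddef.h>
#include <stdint.h>

#define DW_CFA_expression 0x10
#define DW_OP_skip        0x2f

/* Hypothetical helper: write the per-call CFA bytes described above into
   buf, which must have room for 6 bytes.  Returns the number of bytes
   written, or 0 if the branch offset does not fit the signed 16-bit
   operand of DW_OP_skip.  */
static size_t
emit_r2_skip_note (uint8_t *buf, long offset_to_r2_prog)
{
  if (offset_to_r2_prog < INT16_MIN || offset_to_r2_prog > INT16_MAX)
    return 0;   /* out of range: another copy of "r2_prog" would be needed */

  uint16_t off = (uint16_t) (int16_t) offset_to_r2_prog;

  buf[0] = DW_CFA_expression;
  buf[1] = 2;                       /* ULEB128 register number: r2 */
  buf[2] = 3;                       /* ULEB128 length of the expression block */
  buf[3] = DW_OP_skip;
  buf[4] = (uint8_t) (off >> 8);    /* operand, big-endian by assumption */
  buf[5] = (uint8_t) (off & 0xff);
  return 6;
}
```

Any location-advance opcode needed to place the note at the call is not counted here, which is consistent with the "at least" in the estimate.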
Index: libgcc/config/rs6000/linux-unwind.h
===================================================================
--- libgcc/config/rs6000/linux-unwind.h (revision 176780)
+++ libgcc/config/rs6000/linux-unwind.h (working copy)
@@ -346,10 +346,28 @@ frob_update_context (struct _Unwind_Cont
          figure out if it was saved.  The big problem here is that the
          code that does the save/restore is generated by the linker, so
          we have no good way to determine at compile time what to do.  */
-      unsigned int *insn
-        = (unsigned int *) _Unwind_GetGR (context, R_LR);
-      if (insn && *insn == 0xE8410028)
-        _Unwind_SetGRPtr (context, 2, context->cfa + 40);
+      if (pc[0] == 0xF8410028
+          || ((pc[0] & 0xFFFF0000) == 0x3D820000
+              && pc[1] == 0xF8410028))
+        {
+          /* We are in a plt call stub or r2 adjusting long branch stub,
+             before r2 has been saved.  Keep REG_UNSAVED.  */
+        }
+      else if (pc[0] == 0x4E800421
+               && pc[1] == 0xE8410028)
+        {
+          /* We are at the bctrl instruction in a call via function
+             pointer.  gcc always emits the load of the new r2 just
+             before the bctrl.  */
+          _Unwind_SetGRPtr (context, 2, context->cfa + 40);
+        }
+      else
+        {
+          unsigned int *insn
+            = (unsigned int *) _Unwind_GetGR (context, R_LR);
+          if (insn && *insn == 0xE8410028)
+            _Unwind_SetGRPtr (context, 2, context->cfa + 40);
+        }
     }
 #endif
 }
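The three-way decision in the patch can be exercised outside the unwinder by pulling it into a pure function. The sketch below is an illustration only: `classify_r2`, `enum r2_action` and the macro names are hypothetical and do not exist in libgcc, and the instruction words are passed in directly rather than read from an `_Unwind_Context`.

```c
#include <stddef.h>
#include <stdint.h>

#define STD_R2_40_R1 0xF8410028u   /* std r2,40(r1): the r2 save in a stub */
#define LD_R2_40_R1  0xE8410028u   /* ld r2,40(r1): the r2 reload after a call */
#define BCTRL        0x4E800421u   /* bctrl */
#define ADDIS_R12_R2 0x3D820000u   /* addis r12,r2,<imm>, immediate masked off */

/* Hypothetical result type for this sketch.  */
enum r2_action { R2_KEEP_UNSAVED, R2_RESTORE_FROM_SLOT };

/* pc: the two instruction words at the stopped pc.
   insn_at_lr: the insn following the call (address in LR), or NULL.  */
static enum r2_action
classify_r2 (const uint32_t pc[2], const uint32_t *insn_at_lr)
{
  /* In a plt call stub or r2 adjusting long branch stub, before r2 has
     been saved: the slot at 40(r1) is not valid yet.  */
  if (pc[0] == STD_R2_40_R1
      || ((pc[0] & 0xFFFF0000u) == ADDIS_R12_R2 && pc[1] == STD_R2_40_R1))
    return R2_KEEP_UNSAVED;

  /* Stopped on the bctrl of a call via function pointer: r2 already
     holds the callee's TOC, so the caller's copy is in the slot.  */
  if (pc[0] == BCTRL && pc[1] == LD_R2_40_R1)
    return R2_RESTORE_FROM_SLOT;

  /* Original heuristic: the insn after the call reloads r2.  */
  if (insn_at_lr != NULL && *insn_at_lr == LD_R2_40_R1)
    return R2_RESTORE_FROM_SLOT;

  return R2_KEEP_UNSAVED;
}
```

Ordering matters here as it does in the patch: the stub checks come first, so a stop on the `std r2,40(r1)` keeps r2 unsaved even when the insn at LR is the reload.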