Message ID | DB6PR0801MB20538770C0B5E53FBBB7D48B83B80@DB6PR0801MB2053.eurprd08.prod.outlook.com |
---|---|
State | New |
Headers | show |
On 07/25/2017 07:58 AM, Wilco Dijkstra wrote: > This patch makes some changes to the frame layout in order to simplify > stack probing. We want to use the save of LR as a probe in any non-leaf > function. With shrinkwrapping we may only save LR before a call, so it > is useful to define a fixed location in the callee-saves. So force LR at > the bottom of the callee-saves even with -fomit-frame-pointer. > > Also remove a rarely used frame layout that saves the callee-saves first > with -fomit-frame-pointer. > > OK for commit (and backport to GCC7)? > > ChangeLog: > 2017-07-25 Wilco Dijkstra <wdijkstr@arm.com> > > * config/aarch64/aarch64.c (aarch64_layout_frame): > Ensure LR is always stored at the bottom of the callee-saves. > Remove frame option which saves callee-saves at top of frame. I'll let the appropriate aarch64 maintainers comment on correctness. But I wanted to give an explicit thanks for simplifying this so that we can rely on it. Jeff
On Tue, Jul 25, 2017 at 02:58:04PM +0100, Wilco Dijkstra wrote: > This patch makes some changes to the frame layout in order to simplify > stack probing. We want to use the save of LR as a probe in any non-leaf > function. With shrinkwrapping we may only save LR before a call, so it > is useful to define a fixed location in the callee-saves. So force LR at > the bottom of the callee-saves even with -fomit-frame-pointer. > > Also remove a rarely used frame layout that saves the callee-saves first > with -fomit-frame-pointer. > > OK for commit (and backport to GCC7)? OK. Leave it a week before backporting. Reviewed by: James Greenhalgh <james.greenhalgh@arm.com> Thanks, James > > ChangeLog: > 2017-07-25 Wilco Dijkstra <wdijkstr@arm.com> > > * config/aarch64/aarch64.c (aarch64_layout_frame): > Ensure LR is always stored at the bottom of the callee-saves. > Remove frame option which saves callee-saves at top of frame. >
On Thu, Oct 26, 2017 at 04:19:35PM +0100, James Greenhalgh wrote: > On Tue, Jul 25, 2017 at 02:58:04PM +0100, Wilco Dijkstra wrote: > > This patch makes some changes to the frame layout in order to simplify > > stack probing. We want to use the save of LR as a probe in any non-leaf > > function. With shrinkwrapping we may only save LR before a call, so it > > is useful to define a fixed location in the callee-saves. So force LR at > > the bottom of the callee-saves even with -fomit-frame-pointer. > > > > Also remove a rarely used frame layout that saves the callee-saves first > > with -fomit-frame-pointer. > > > > OK for commit (and backport to GCC7)? > > OK. Leave it a week before backporting. This caused: Failures: gcc.target/aarch64/test_frame_4.c gcc.target/aarch64/test_frame_2.c gcc.target/aarch64/test_frame_7.c gcc.target/aarch64/test_frame_10.c Bisected to: Author: wilco Date: Thu Oct 26 16:40:25 2017 +0000 Simplify frame layout for stack probing This patch makes some changes to the frame layout in order to simplify stack probing. We want to use the save of LR as a probe in any non-leaf function. With shrinkwrapping we may only save LR before a call, so it is useful to define a fixed location in the callee-saves. So force LR at the bottom of the callee-saves even with -fomit-frame-pointer. Also remove a rarely used frame layout that saves the callee-saves first with -fomit-frame-pointer. Doing so allows the store of LR to be used as a valid stack probe in all frames. gcc/ * config/aarch64/aarch64.c (aarch64_layout_frame): Ensure LR is always stored at the bottom of the callee-saves. Remove rarely used frame layout which saves callee-saves at top of frame, so the store of LR can be used as a valid probe in all cases. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@254112 Please look in to this. This will also block the request to backport the patch until after the failures have been resolved. There's no reason we shouldn't be catching bugs like this (simple scan assembler tests which have been in the port for years, that will obviously never pass after your changes) before the patch makes it to trunk. How was this patch tested? Thanks, James
James Greenhalgh wrote: > > This caused: > > Failures: > gcc.target/aarch64/test_frame_4.c > gcc.target/aarch64/test_frame_2.c > gcc.target/aarch64/test_frame_7.c > gcc.target/aarch64/test_frame_10.c Sorry, I missed that in testing. I've reverted part of the patch that caused this. The tests are definitely too picky but they also uncovered a real code generation inefficiency, so I need to look into that further. I've committed this: 2017-11-03 Wilco Dijkstra <wdijkstr@arm.com> PR target/82786 * config/aarch64/aarch64.c (aarch64_layout_frame): Undo forcing of LR at bottom of frame. -- diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 2fc7db4..949f3cb 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2017-11-03 Wilco Dijkstra <wdijkstr@arm.com> + + PR target/82786 + * config/aarch64/aarch64.c (aarch64_layout_frame): + Undo forcing of LR at bottom of frame. + 2017-11-03 Jeff Law <law@redhat.com> * cfganal.c (single_pred_edge_ignoring_loop_edges): New function diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 1e12645..12f247d 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -2908,8 +2908,7 @@ aarch64_frame_pointer_required (void) /* Mark the registers that need to be saved by the callee and calculate the size of the callee-saved registers area and frame record (both FP - and LR may be omitted). If the function is not a leaf, ensure LR is - saved at the bottom of the callee-save area. */ + and LR may be omitted). */ static void aarch64_layout_frame (void) { @@ -2966,13 +2965,6 @@ aarch64_layout_frame (void) cfun->machine->frame.wb_candidate2 = R30_REGNUM; offset = 2 * UNITS_PER_WORD; } - else if (!crtl->is_leaf) - { - /* Ensure LR is saved at the bottom of the callee-saves. */ - cfun->machine->frame.reg_offset[R30_REGNUM] = 0; - cfun->machine->frame.wb_candidate1 = R30_REGNUM; - offset = UNITS_PER_WORD; - } /* Now assign stack slots for them. */ for (regno = R0_REGNUM; regno <= R30_REGNUM; regno++)
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index b8a4160d9de8e689ccd26cb9f0ce046ee65e0ef4..3fc36ae28d18b9635480fd99f1fa7719267e66e4 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -2875,7 +2875,8 @@ aarch64_frame_pointer_required (void) /* Mark the registers that need to be saved by the callee and calculate the size of the callee-saved registers area and frame record (both FP - and LR may be omitted). */ + and LR may be omitted). If the function is not a leaf, ensure LR is + saved at the bottom of the callee-save area. */ static void aarch64_layout_frame (void) { @@ -2926,7 +2927,14 @@ aarch64_layout_frame (void) cfun->machine->frame.wb_candidate1 = R29_REGNUM; cfun->machine->frame.reg_offset[R30_REGNUM] = UNITS_PER_WORD; cfun->machine->frame.wb_candidate2 = R30_REGNUM; - offset += 2 * UNITS_PER_WORD; + offset = 2 * UNITS_PER_WORD; + } + else if (!crtl->is_leaf) + { + /* Ensure LR is saved at the bottom of the callee-saves. */ + cfun->machine->frame.reg_offset[R30_REGNUM] = 0; + cfun->machine->frame.wb_candidate1 = R30_REGNUM; + offset = UNITS_PER_WORD; } /* Now assign stack slots for them. */ @@ -3025,20 +3033,6 @@ aarch64_layout_frame (void) cfun->machine->frame.final_adjust = cfun->machine->frame.frame_size - cfun->machine->frame.callee_adjust; } - else if (!frame_pointer_needed - && varargs_and_saved_regs_size < max_push_offset) - { - /* Frame with large local area and outgoing arguments (this pushes the - callee-saves first, followed by the locals and outgoing area): - stp reg1, reg2, [sp, -varargs_and_saved_regs_size]! - stp reg3, reg4, [sp, 16] - sub sp, sp, frame_size - varargs_and_saved_regs_size */ - cfun->machine->frame.callee_adjust = varargs_and_saved_regs_size; - cfun->machine->frame.final_adjust - = cfun->machine->frame.frame_size - cfun->machine->frame.callee_adjust; - cfun->machine->frame.hard_fp_offset = cfun->machine->frame.callee_adjust; - cfun->machine->frame.locals_offset = cfun->machine->frame.hard_fp_offset; - } else { /* Frame with large local area and outgoing arguments using frame pointer: