From patchwork Tue Jun 13 13:59:14 2017
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 775204
From: Wilco Dijkstra
To: GCC Patches, James Greenhalgh
CC: nd
Subject: Re: [RFC][PATCH][AArch64] Cleanup frame pointer usage
Date: Tue, 13 Jun 2017 13:59:14 +0000

ping

From: Wilco Dijkstra
Sent: 31 October 2016 18:29
To: GCC Patches
Cc: nd
Subject: [RFC][PATCH][AArch64] Cleanup frame pointer usage

This patch cleans up all code related to the frame pointer.  On AArch64 we
emit a frame chain even in cases where the frame pointer is not required.
So make this explicit by introducing a boolean emit_frame_chain in the
aarch64_frame record.

When the frame pointer is enabled but not strictly required (e.g. no use of
alloca), we emit a frame chain in non-leaf functions, but continue to use the
stack pointer to access locals.  This results in smaller code and unwind info.

Also simplify the complex logic in aarch64_override_options_after_change_1
and compute whether the frame chain is required in aarch64_layout_frame
instead.  As a result aarch64_frame_pointer_required is now redundant and
aarch64_can_eliminate can be greatly simplified.

Finally convert all callee save/restore functions to use gen_frame_mem.

Bootstrap OK. Any comments?

ChangeLog:
2016-10-31  Wilco Dijkstra

    gcc/
        * config/aarch64/aarch64.h (aarch64_frame):
        Add emit_frame_chain boolean.
        * config/aarch64/aarch64.c (aarch64_frame_pointer_required):
        Remove.
        (aarch64_layout_frame): Initialise emit_frame_chain.
        (aarch64_pushwb_single_reg): Use gen_frame_mem.
        (aarch64_pop_regs): Likewise.
        (aarch64_gen_load_pair): Likewise.
        (aarch64_save_callee_saves): Likewise.
        (aarch64_restore_callee_saves): Likewise.
        (aarch64_expand_prologue): Use emit_frame_chain.
        (aarch64_can_eliminate): Simplify. When FP needed or outgoing
        arguments are large, eliminate to FP, otherwise SP.
        (aarch64_override_options_after_change_1): Simplify.
        (TARGET_FRAME_POINTER_REQUIRED): Remove define.

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index fa81e4b853dafcccc08842955288861ec7e7acca..6e32dc9f6f171dde0c182fdd7857230251f71712 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -583,6 +583,9 @@ struct GTY (()) aarch64_frame
   /* The size of the stack adjustment after saving callee-saves.  */
   HOST_WIDE_INT final_adjust;
 
+  /* Store FP,LR and setup a frame pointer.  */
+  bool emit_frame_chain;
+
   unsigned wb_candidate1;
   unsigned wb_candidate2;
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f07d771ea343803e054e03f59c8c1efb698bf474..6c06ac18d16f8afa7ee1cc5e8530e285a60e2b0f 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2728,24 +2728,6 @@ aarch64_output_probe_stack_range (rtx reg1, rtx reg2)
   return "";
 }
 
-static bool
-aarch64_frame_pointer_required (void)
-{
-  /* In aarch64_override_options_after_change
-     flag_omit_leaf_frame_pointer turns off the frame pointer by
-     default.  Turn it back on now if we've not got a leaf
-     function.  */
-  if (flag_omit_leaf_frame_pointer
-      && (!crtl->is_leaf || df_regs_ever_live_p (LR_REGNUM)))
-    return true;
-
-  /* Force a frame pointer for EH returns so the return address is at FP+8.  */
-  if (crtl->calls_eh_return)
-    return true;
-
-  return false;
-}
-
 /* Mark the registers that need to be saved by the callee and calculate
    the size of the callee-saved registers area and frame record (both FP
    and LR may be omitted).  */
@@ -2758,6 +2740,18 @@ aarch64_layout_frame (void)
   if (reload_completed && cfun->machine->frame.laid_out)
     return;
 
+  /* Force a frame chain for EH returns so the return address is at FP+8.  */
+  cfun->machine->frame.emit_frame_chain
+    = frame_pointer_needed || crtl->calls_eh_return;
+
+  /* Emit a frame chain if the frame pointer is enabled.
+     If -momit-leaf-frame-pointer is used, do not use a frame chain
+     in leaf functions which do not use LR.  */
+  if (flag_omit_frame_pointer == 2
+      && !(flag_omit_leaf_frame_pointer && crtl->is_leaf
+          && !df_regs_ever_live_p (LR_REGNUM)))
+    cfun->machine->frame.emit_frame_chain = true;
+
 #define SLOT_NOT_REQUIRED (-2)
 #define SLOT_REQUIRED     (-1)
 
@@ -2789,7 +2783,7 @@ aarch64_layout_frame (void)
         && !call_used_regs[regno])
       cfun->machine->frame.reg_offset[regno] = SLOT_REQUIRED;
 
-  if (frame_pointer_needed)
+  if (cfun->machine->frame.emit_frame_chain)
     {
       /* FP and LR are placed in the linkage record.  */
       cfun->machine->frame.reg_offset[R29_REGNUM] = 0;
@@ -2937,7 +2931,7 @@ aarch64_pushwb_single_reg (machine_mode mode, unsigned regno,
   reg = gen_rtx_REG (mode, regno);
   mem = gen_rtx_PRE_MODIFY (Pmode, base_rtx,
                             plus_constant (Pmode, base_rtx, -adjustment));
-  mem = gen_rtx_MEM (mode, mem);
+  mem = gen_frame_mem (mode, mem);
 
   insn = emit_move_insn (mem, reg);
   RTX_FRAME_RELATED_P (insn) = 1;
@@ -3011,7 +3005,7 @@ aarch64_pop_regs (unsigned regno1, unsigned regno2, HOST_WIDE_INT adjustment,
     {
       rtx mem = plus_constant (Pmode, stack_pointer_rtx, adjustment);
       mem = gen_rtx_POST_MODIFY (Pmode, stack_pointer_rtx, mem);
-      emit_move_insn (reg1, gen_rtx_MEM (mode, mem));
+      emit_move_insn (reg1, gen_frame_mem (mode, mem));
     }
   else
     {
@@ -3062,8 +3056,6 @@ aarch64_save_callee_saves (machine_mode mode, HOST_WIDE_INT start_offset,
                            unsigned start, unsigned limit, bool skip_wb)
 {
   rtx_insn *insn;
-  rtx (*gen_mem_ref) (machine_mode, rtx) = (frame_pointer_needed
-                                                ? gen_frame_mem : gen_rtx_MEM);
   unsigned regno;
   unsigned regno2;
 
@@ -3081,8 +3073,8 @@ aarch64_save_callee_saves (machine_mode mode, HOST_WIDE_INT start_offset,
 
       reg = gen_rtx_REG (mode, regno);
       offset = start_offset + cfun->machine->frame.reg_offset[regno];
-      mem = gen_mem_ref (mode, plus_constant (Pmode, stack_pointer_rtx,
-                                             offset));
+      mem = gen_frame_mem (mode, plus_constant (Pmode, stack_pointer_rtx,
+                                               offset));
 
       regno2 = aarch64_next_callee_save (regno + 1, limit);
 
@@ -3095,8 +3087,8 @@ aarch64_save_callee_saves (machine_mode mode, HOST_WIDE_INT start_offset,
           rtx mem2;
 
           offset = start_offset + cfun->machine->frame.reg_offset[regno2];
-         mem2 = gen_mem_ref (mode, plus_constant (Pmode, stack_pointer_rtx,
-                                                  offset));
+         mem2 = gen_frame_mem (mode, plus_constant (Pmode, stack_pointer_rtx,
+                                                    offset));
           insn = emit_insn (aarch64_gen_store_pair (mode, mem, reg, mem2,
                                                     reg2));
 
@@ -3120,8 +3112,6 @@ aarch64_restore_callee_saves (machine_mode mode,
                               unsigned limit, bool skip_wb, rtx *cfi_ops)
 {
   rtx base_rtx = stack_pointer_rtx;
-  rtx (*gen_mem_ref) (machine_mode, rtx) = (frame_pointer_needed
-                                                ? gen_frame_mem : gen_rtx_MEM);
   unsigned regno;
   unsigned regno2;
   HOST_WIDE_INT offset;
@@ -3139,7 +3129,7 @@ aarch64_restore_callee_saves (machine_mode mode,
 
       reg = gen_rtx_REG (mode, regno);
       offset = start_offset + cfun->machine->frame.reg_offset[regno];
-      mem = gen_mem_ref (mode, plus_constant (Pmode, base_rtx, offset));
+      mem = gen_frame_mem (mode, plus_constant (Pmode, base_rtx, offset));
 
       regno2 = aarch64_next_callee_save (regno + 1, limit);
 
@@ -3151,7 +3141,7 @@ aarch64_restore_callee_saves (machine_mode mode,
           rtx mem2;
 
           offset = start_offset + cfun->machine->frame.reg_offset[regno2];
-         mem2 = gen_mem_ref (mode, plus_constant (Pmode, base_rtx, offset));
+         mem2 = gen_frame_mem (mode, plus_constant (Pmode, base_rtx, offset));
           emit_insn (aarch64_gen_load_pair (mode, reg, mem, reg2, mem2));
 
           *cfi_ops = alloc_reg_note (REG_CFA_RESTORE, reg2, *cfi_ops);
@@ -3217,6 +3207,7 @@ aarch64_expand_prologue (void)
   HOST_WIDE_INT callee_offset = cfun->machine->frame.callee_offset;
   unsigned reg1 = cfun->machine->frame.wb_candidate1;
   unsigned reg2 = cfun->machine->frame.wb_candidate2;
+  bool emit_frame_chain = cfun->machine->frame.emit_frame_chain;
   rtx_insn *insn;
 
   if (flag_stack_usage_info)
@@ -3239,7 +3230,7 @@ aarch64_expand_prologue (void)
   if (callee_adjust != 0)
     aarch64_push_regs (reg1, reg2, callee_adjust);
 
-  if (frame_pointer_needed)
+  if (emit_frame_chain)
     {
       if (callee_adjust == 0)
         aarch64_save_callee_saves (DImode, callee_offset, R29_REGNUM,
@@ -3247,12 +3238,12 @@ aarch64_expand_prologue (void)
       insn = emit_insn (gen_add3_insn (hard_frame_pointer_rtx,
                                        stack_pointer_rtx,
                                        GEN_INT (callee_offset)));
-      RTX_FRAME_RELATED_P (insn) = 1;
+      RTX_FRAME_RELATED_P (insn) = frame_pointer_needed;
       emit_insn (gen_stack_tie (stack_pointer_rtx, hard_frame_pointer_rtx));
     }
 
   aarch64_save_callee_saves (DImode, callee_offset, R0_REGNUM, R30_REGNUM,
-                            callee_adjust != 0 || frame_pointer_needed);
+                            callee_adjust != 0 || emit_frame_chain);
   aarch64_save_callee_saves (DFmode, callee_offset, V0_REGNUM, V31_REGNUM,
                              callee_adjust != 0 || frame_pointer_needed);
   aarch64_sub_sp (IP1_REGNUM, final_adjust, !frame_pointer_needed);
@@ -5157,36 +5148,13 @@ aarch64_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x,
 }
 
 static bool
-aarch64_can_eliminate (const int from, const int to)
+aarch64_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to)
 {
-  /* If we need a frame pointer, we must eliminate FRAME_POINTER_REGNUM into
-     HARD_FRAME_POINTER_REGNUM and not into STACK_POINTER_REGNUM.  */
-
-  if (frame_pointer_needed)
-    {
-      if (from == ARG_POINTER_REGNUM && to == HARD_FRAME_POINTER_REGNUM)
-       return true;
-      if (from == ARG_POINTER_REGNUM && to == STACK_POINTER_REGNUM)
-       return false;
-      if (from == FRAME_POINTER_REGNUM && to == STACK_POINTER_REGNUM
-         && !cfun->calls_alloca)
-       return true;
-      if (from == FRAME_POINTER_REGNUM && to == HARD_FRAME_POINTER_REGNUM)
-       return true;
-
-      return false;
-    }
-  else
-    {
-      /* If we decided that we didn't need a leaf frame pointer but then used
-        LR in the function, then we'll want a frame pointer after all, so
-        prevent this elimination to ensure a frame pointer is used.  */
-      if (to == STACK_POINTER_REGNUM
-         && flag_omit_leaf_frame_pointer
-         && df_regs_ever_live_p (LR_REGNUM))
-       return false;
-    }
-
+  /* If we need a frame pointer, eliminate to HARD_FRAME_POINTER_REGNUM.
+     Use FP as well as with a large number of outgoing arguments so
+     that stack offsets are smaller - this may generate better code.  */
+  if (frame_pointer_needed || crtl->outgoing_args_size > 64)
+    return to == HARD_FRAME_POINTER_REGNUM;
   return true;
 }
 
@@ -8112,24 +8080,13 @@ aarch64_parse_override_string (const char* input_string,
 static void
 aarch64_override_options_after_change_1 (struct gcc_options *opts)
 {
-  /* The logic here is that if we are disabling all frame pointer generation
-     then we do not need to disable leaf frame pointer generation as a
-     separate operation.  But if we are *only* disabling leaf frame pointer
-     generation then we set flag_omit_frame_pointer to true, but in
-     aarch64_frame_pointer_required we return false only for leaf functions.
-
-     PR 70044: We have to be careful about being called multiple times for the
-     same function.  Once we have decided to set flag_omit_frame_pointer just
-     so that we can omit leaf frame pointers, we must then not interpret a
-     second call as meaning that all frame pointer generation should be
-     omitted.  We do this by setting flag_omit_frame_pointer to a special,
-     non-zero value.  */
-  if (opts->x_flag_omit_frame_pointer == 2)
-    opts->x_flag_omit_frame_pointer = 0;
-
-  if (opts->x_flag_omit_frame_pointer)
-    opts->x_flag_omit_leaf_frame_pointer = false;
-  else if (opts->x_flag_omit_leaf_frame_pointer)
+  /* PR 70044: We have to be careful about being called multiple times for the
+     same function.  This means all changes should be repeatable.  */
+
+  /* If the frame pointer is enabled, set the flag to a special value.
+     To implement -momit-leaf-frame-pointer this special value is checked in
+     aarch64_layout_frame.  The frame chain is emitted only when required.  */
+  if (opts->x_flag_omit_frame_pointer == 0)
     opts->x_flag_omit_frame_pointer = 2;
 
   /* If not optimizing for size, set the default
@@ -14126,9 +14083,6 @@ aarch64_optab_supported_p (int op, machine_mode mode1, machine_mode,
 #undef TARGET_FUNCTION_VALUE_REGNO_P
 #define TARGET_FUNCTION_VALUE_REGNO_P aarch64_function_value_regno_p
 
-#undef TARGET_FRAME_POINTER_REQUIRED
-#define TARGET_FRAME_POINTER_REQUIRED aarch64_frame_pointer_required
-
 #undef TARGET_GIMPLE_FOLD_BUILTIN
 #define TARGET_GIMPLE_FOLD_BUILTIN aarch64_gimple_fold_builtin
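
For readers who want to check the new decision logic without building GCC,
the conditions in aarch64_layout_frame and aarch64_can_eliminate can be
modelled as a small standalone sketch.  It is not part of the patch.  The
struct frame_inputs, the enum elim_target and both model_* functions are
hypothetical names introduced here; each field merely stands in for the GCC
state named in its comment, and the conditions are copied from the hunks
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the compiler state the patch consults.
   These are NOT GCC's real data structures.  */
struct frame_inputs
{
  bool frame_pointer_needed;         /* frame_pointer_needed */
  bool calls_eh_return;              /* crtl->calls_eh_return */
  int flag_omit_frame_pointer;       /* 2 marks "frame pointer enabled" */
  bool flag_omit_leaf_frame_pointer; /* -momit-leaf-frame-pointer */
  bool is_leaf;                      /* crtl->is_leaf */
  bool lr_live;                      /* df_regs_ever_live_p (LR_REGNUM) */
  long outgoing_args_size;           /* crtl->outgoing_args_size */
};

/* Hypothetical elimination targets, standing in for STACK_POINTER_REGNUM
   and HARD_FRAME_POINTER_REGNUM.  */
enum elim_target { ELIM_TO_SP, ELIM_TO_HARD_FP };

/* Mirrors the emit_frame_chain computation added to
   aarch64_layout_frame.  */
bool
model_emit_frame_chain (const struct frame_inputs *in)
{
  assert (in != NULL);

  /* A frame chain is forced when the frame pointer is needed or for EH
     returns.  */
  bool emit = in->frame_pointer_needed || in->calls_eh_return;

  /* With the frame pointer enabled, emit a chain unless this is a leaf
     function that does not use LR and leaf frame pointers are omitted.  */
  if (in->flag_omit_frame_pointer == 2
      && !(in->flag_omit_leaf_frame_pointer && in->is_leaf && !in->lr_live))
    emit = true;

  return emit;
}

/* Mirrors the simplified aarch64_can_eliminate: only elimination to the
   hard FP is allowed when the FP is needed or outgoing arguments exceed
   64 bytes; otherwise any elimination is allowed.  */
bool
model_can_eliminate (const struct frame_inputs *in, enum elim_target to)
{
  assert (in != NULL);

  if (in->frame_pointer_needed || in->outgoing_args_size > 64)
    return to == ELIM_TO_HARD_FP;
  return true;
}
```

In this model, a leaf function that leaves LR untouched under
-momit-leaf-frame-pointer gets no frame chain, while the same function with
LR live does.  Setting outgoing_args_size above 64 rejects elimination to SP
even when frame_pointer_needed is false.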