From patchwork Thu Jun 16 14:47:10 2016
X-Patchwork-Submitter: Bernd Edlinger
X-Patchwork-Id: 636496
From: Bernd Edlinger
To: "gcc-patches@gcc.gnu.org"
CC: Richard Biener, Jakub Jelinek, Richard Sandiford, Ramana Radhakrishnan, Jeff Law
Subject: [PATCH] Add a new target hook to compute the frame layout
Date: Thu, 16 Jun 2016 14:47:10 +0000

Hi!

By the design of the target hook INITIAL_ELIMINATION_OFFSET it is necessary to call this function several times with different register combinations. Most targets use a cached data structure that describes the exact frame layout of the current function.

It is safe to skip the computation when reload_completed is true, and most targets do that already.

However, while reload is doing its work, it is not clear when to do the computation and when not. This results in unnecessary work. Computing the frame layout can be a simple function or an arbitrarily complex one that, for instance, walks all instructions of the current function, which is more or less the common case.

This patch adds a new optional target hook that the target can use to factor the INITIAL_ELIMINATION_OFFSET hook into an O(n) computation part and an O(1) result function.

The patch implements a compute_frame_layout target hook just for ARM at the moment, to show the principle.
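To illustrate the idea independently of the ARM code, here is a minimal sketch of the split. It is not part of the patch: the names (toy_frame_layout, toy_compute_frame_layout, toy_initial_elimination_offset) and the way the offsets are derived are hypothetical and only stand in for a real target's logic.

```c
#include <stddef.h>

/* Hypothetical stand-in for a target's cached frame layout.  */
struct toy_frame_layout
{
  long saved_regs;     /* Bytes used by saved registers.  */
  long outgoing_args;  /* Offset of the outgoing argument area.  */
};

static struct toy_frame_layout cached_layout;
static int compute_calls;

/* O(n) part: walk a toy "instruction list" once and cache the result.
   A real target would inspect the current function here.  */
static void
toy_compute_frame_layout (const int *insn_sizes, size_t n)
{
  long total = 0;

  for (size_t i = 0; i < n; i++)
    total += insn_sizes[i];

  cached_layout.saved_regs = 8;
  cached_layout.outgoing_args = cached_layout.saved_regs + total;
  compute_calls++;
}

/* O(1) part: answer an elimination query from the cache only.  */
static long
toy_initial_elimination_offset (int from_arg_pointer)
{
  if (from_arg_pointer)
    return cached_layout.outgoing_args;
  return cached_layout.outgoing_args - cached_layout.saved_regs;
}
```

The caller runs the compute function once and can then ask for any number of register combinations without repeating the walk, which is the role the new hook is meant to play before the INITIAL_ELIMINATION_OFFSET queries.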
Other targets may also implement that hook, if it seems appropriate.


Boot-strapped and reg-tested on arm-linux-gnueabihf.
OK for trunk?


Thanks
Bernd.

2016-06-16  Bernd Edlinger

	* target.def (compute_frame_layout): New optional target hook.
	* doc/tm.texi.in (TARGET_COMPUTE_FRAME_LAYOUT): Add hook.
	* doc/tm.texi (TARGET_COMPUTE_FRAME_LAYOUT): Add documentation.
	* lra-eliminations.c (update_reg_eliminate): Call compute_frame_layout
	target hook.
	* reload1.c (verify_initial_elim_offsets): Likewise.
	* config/arm/arm.c (TARGET_COMPUTE_FRAME_LAYOUT): Define.
	(use_simple_return_p): Call arm_compute_frame_layout if needed.
	(arm_get_frame_offsets): Split up into this ...
	(arm_compute_frame_layout): ... and this function.

Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(Revision 233176)
+++ gcc/config/arm/arm.c	(Arbeitskopie)
@@ -81,6 +81,7 @@ static bool arm_const_not_ok_for_debug_p (rtx);
 static bool arm_needs_doubleword_align (machine_mode, const_tree);
 static int arm_compute_static_chain_stack_bytes (void);
 static arm_stack_offsets *arm_get_frame_offsets (void);
+static void arm_compute_frame_layout (void);
 static void arm_add_gc_roots (void);
 static int arm_gen_constant (enum rtx_code, machine_mode, rtx,
 			     unsigned HOST_WIDE_INT, rtx, rtx, int, int);
@@ -669,6 +670,9 @@ static const struct attribute_spec arm_attribute_t
 #undef TARGET_SCALAR_MODE_SUPPORTED_P
 #define TARGET_SCALAR_MODE_SUPPORTED_P arm_scalar_mode_supported_p
 
+#undef TARGET_COMPUTE_FRAME_LAYOUT
+#define TARGET_COMPUTE_FRAME_LAYOUT arm_compute_frame_layout
+
 #undef TARGET_FRAME_POINTER_REQUIRED
 #define TARGET_FRAME_POINTER_REQUIRED arm_frame_pointer_required
 
@@ -3813,6 +3817,10 @@ use_simple_return_p (void)
 {
   arm_stack_offsets *offsets;
 
+  /* Note this function can be called before or after reload.  */
+  if (!reload_completed)
+    arm_compute_frame_layout ();
+
   offsets = arm_get_frame_offsets ();
   return offsets->outgoing_args != 0;
 }
@@ -19238,7 +19246,7 @@ arm_compute_static_chain_stack_bytes (void)
 
 /* Compute a bit mask of which registers need to be saved on the stack for
    the current function.
-   This is used by arm_get_frame_offsets, which may add extra registers.  */
+   This is used by arm_compute_frame_layout, which may add extra registers.  */
 
 static unsigned long
 arm_compute_save_reg_mask (void)
@@ -20789,12 +20797,25 @@ any_sibcall_could_use_r3 (void)
    alignment.  */
 
 
+/* Return cached stack offsets.  */
+
+static arm_stack_offsets *
+arm_get_frame_offsets (void)
+{
+  struct arm_stack_offsets *offsets;
+
+  offsets = &cfun->machine->stack_offsets;
+
+  return offsets;
+}
+
+
 /* Calculate stack offsets.  These are used to calculate register elimination
    offsets and in prologue/epilogue code.  Also calculates which registers
    should be saved.  */
 
-static arm_stack_offsets *
-arm_get_frame_offsets (void)
+static void
+arm_compute_frame_layout (void)
 {
   struct arm_stack_offsets *offsets;
   unsigned long func_type;
@@ -20806,19 +20827,6 @@ any_sibcall_could_use_r3 (void)
 
   offsets = &cfun->machine->stack_offsets;
 
-  /* We need to know if we are a leaf function.  Unfortunately, it
-     is possible to be called after start_sequence has been called,
-     which causes get_insns to return the insns for the sequence,
-     not the function, which will cause leaf_function_p to return
-     the incorrect result.
-
-     to know about leaf functions once reload has completed, and the
-     frame size cannot be changed after that time, so we can safely
-     use the cached value.  */
-
-  if (reload_completed)
-    return offsets;
-
   /* Initially this is the size of the local variables.  It will translated
      into an offset once we have determined the size of preceding data.  */
   frame_size = ROUND_UP_WORD (get_frame_size ());
@@ -20885,7 +20893,7 @@ any_sibcall_could_use_r3 (void)
     {
       offsets->outgoing_args = offsets->soft_frame;
       offsets->locals_base = offsets->soft_frame;
-      return offsets;
+      return;
     }
 
   /* Ensure SFP has the correct alignment.  */
@@ -20961,8 +20969,6 @@ any_sibcall_could_use_r3 (void)
 	offsets->outgoing_args += 4;
       gcc_assert (!(offsets->outgoing_args & 7));
     }
-
-  return offsets;
 }
 
 
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	(Revision 233176)
+++ gcc/doc/tm.texi	(Arbeitskopie)
@@ -3693,6 +3693,14 @@ registers. This macro must be defined if @code{EL
 defined.
 @end defmac
 
+@deftypefn {Target Hook} void TARGET_COMPUTE_FRAME_LAYOUT (void)
+This target hook is called immediately before reload wants to call
+@code{INITIAL_ELIMINATION_OFFSET} and allows the target to cache the frame
+layout instead of re-computing it on every invocation.  This is particularly
+useful for targets that have an O(n) frame layout function.  Implementing
+this callback is optional.
+@end deftypefn
+
 @node Stack Arguments
 @subsection Passing Function Arguments on the Stack
 @cindex arguments on stack
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in	(Revision 233176)
+++ gcc/doc/tm.texi.in	(Arbeitskopie)
@@ -3227,6 +3227,8 @@ registers. This macro must be defined if @code{EL
 defined.
 @end defmac
 
+@hook TARGET_COMPUTE_FRAME_LAYOUT
+
 @node Stack Arguments
 @subsection Passing Function Arguments on the Stack
 @cindex arguments on stack
Index: gcc/lra-eliminations.c
===================================================================
--- gcc/lra-eliminations.c	(Revision 233176)
+++ gcc/lra-eliminations.c	(Arbeitskopie)
@@ -1202,6 +1202,10 @@ update_reg_eliminate (bitmap insns_with_changed_of
   struct lra_elim_table *ep, *ep1;
   HARD_REG_SET temp_hard_reg_set;
 
+#ifdef ELIMINABLE_REGS
+  targetm.compute_frame_layout ();
+#endif
+
   /* Clear self elimination offsets.  */
   for (ep = reg_eliminate; ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++)
     self_elim_offsets[ep->from] = 0;
Index: gcc/reload1.c
===================================================================
--- gcc/reload1.c	(Revision 233176)
+++ gcc/reload1.c	(Arbeitskopie)
@@ -3856,6 +3856,7 @@ verify_initial_elim_offsets (void)
   {
    struct elim_table *ep;
 
+   targetm.compute_frame_layout ();
    for (ep = reg_eliminate; ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++)
      {
        INITIAL_ELIMINATION_OFFSET (ep->from, ep->to, t);
@@ -3880,6 +3881,7 @@ set_initial_elim_offsets (void)
   struct elim_table *ep = reg_eliminate;
 
 #ifdef ELIMINABLE_REGS
+  targetm.compute_frame_layout ();
   for (; ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++)
     {
       INITIAL_ELIMINATION_OFFSET (ep->from, ep->to, ep->initial_offset);
Index: gcc/target.def
===================================================================
--- gcc/target.def	(Revision 233176)
+++ gcc/target.def	(Arbeitskopie)
@@ -5245,8 +5245,19 @@ five otherwise. This is best for most machines.",
  unsigned int, (void),
  default_case_values_threshold)
 
-/* Retutn true if a function must have and use a frame pointer.  */
+/* Optional callback to advise the target to compute the frame layout.  */
 DEFHOOK
+(compute_frame_layout,
+ "This target hook is called immediately before reload wants to call\n\
+@code{INITIAL_ELIMINATION_OFFSET} and allows the target to cache the frame\n\
+layout instead of re-computing it on every invocation.  This is particularly\n\
+useful for targets that have an O(n) frame layout function.  Implementing\n\
+this callback is optional.",
+ void, (void),
+ hook_void_void)
+
+/* Return true if a function must have and use a frame pointer.  */
+DEFHOOK
 (frame_pointer_required,
  "This target hook should return @code{true} if a function must have and use\n\
 a frame pointer.  This target hook is called in the reload pass.  If its return\n\