From patchwork Tue May 16 19:52:25 2017
X-Patchwork-Submitter: Bernd Edlinger
X-Patchwork-Id: 763147
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: Bernd Edlinger
To: Ian Lance Taylor, Daniel Santos
CC: gcc-patches, Uros Bizjak
Subject: Re: [PATCH] [i386] Recompute the frame layout less often
Date: Tue, 16 May 2017 19:52:25 +0000
Message-ID:
References: <9aa7427e-6cfc-b603-2ec4-1f0e02ae294d@pobox.com>
 <20c96fa0-328b-af1a-c558-dab6652c482e@pobox.com>
 <8d8b4700-98eb-aaa8-7718-d603cae106e8@pobox.com>
In-Reply-To:
On 05/16/17 19:19, Ian Lance Taylor wrote:
> On Mon, May 15, 2017 at 10:00 PM, Daniel
Santos wrote:
>>
>> Ian, would you mind looking at this please?  A combination of my
>> -mcall-ms2sysv-xlogues patch with Bernd's patch is causing problems when
>> ix86_expand_split_stack_prologue() calls ix86_expand_call().
>
> I don't have a lot of context here.  I assume that ms2sysv is going to
> be used on Windows systems, where -fsplit-stack isn't really going to
> work anyhow, so I think it would probably be OK to reject that
> combination if it causes trouble.
>
> Also, it's overkill for ix86_expand_split_stack_prologue to call
> ix86_expand_call.  The call is always to __morestack, and __morestack
> is written in assembler, so we could use a simpler version of
> ix86_expand_call if that helps.  In particular we can decide that
> __morestack doesn't clobber any unusual registers, if that is what is
> causing the problem.
>

I think I solved the problem with -fsplit-stack.  I am not sure whether
ix86_static_chain_on_stack might change after reload, due to final.c
possibly calling targetm.calls.static_chain, but if that is the case it
is a pre-existing problem.

The goal of this patch is to make all decisions regarding the frame
layout before the reload pass, and, to make sure that the frame layout
does not change unexpectedly, it asserts that the data that goes into
the decision does not change after reload_completed.

With the attached patch, -fsplit-stack and the ms_hook_prologue
attribute are handled directly in ix86_expand_call, because that
information is already known before expansion.  The calls_eh_return and
ix86_static_chain_on_stack may become known at a later time, but after
reload they should not change any more.  To be sure, I added an
assertion in ix86_static_chain, which the regression test did not
trigger, neither with -m64 nor with -m32.

I have bootstrapped the patch several times, and a few times I
encountered a segfault in the garbage collection, but it did not happen
every time.  Currently I think that is unrelated to this patch.
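To make the intended discipline concrete, here is a minimal standalone sketch (not GCC code; `FunctionState`, `FrameLayout` and `compute_layout` are invented for illustration) of the pattern the patch follows: the layout is computed from the per-function inputs, cached once, and after reload an assertion verifies that recomputing can no longer change the cached result.

```cpp
#include <cassert>

// Simplified model of a cached frame layout: the layout is a pure
// function of a few inputs recorded in the per-function state.
struct FrameLayout {
  long stack_pointer_offset;
  int nregs;
  bool operator==(const FrameLayout &o) const {
    return stack_pointer_offset == o.stack_pointer_offset
           && nregs == o.nregs;
  }
};

struct FunctionState {
  int saved_regs = 0;            // input to the layout decision
  bool reload_completed = false; // mirrors the reload_completed flag
  FrameLayout frame{};           // cached layout (like cfun->machine->frame)
};

static FrameLayout compute_layout(const FunctionState &f) {
  // Toy layout rule: 8 bytes per saved register plus the return address.
  return FrameLayout{8L * f.saved_regs + 8, f.saved_regs};
}

static void compute_frame_layout(FunctionState &f) {
  FrameLayout fresh = compute_layout(f);
  // After reload the inputs must be frozen, so recomputing must
  // reproduce the cached layout exactly.
  if (f.reload_completed)
    assert(fresh == f.frame);
  f.frame = fresh;
}
```

The point of the assertion is that any late change to an input (the analogue of ix86_static_chain_on_stack flipping after reload) fires immediately instead of silently producing a mismatched prologue and epilogue.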
Bootstrapped and reg-tested on x86_64-pc-linux-gnu with -m64/-m32.
Is it OK for trunk?


Thanks
Bernd.


2017-05-16  Bernd Edlinger

	* config/i386/i386.c (x86_64_ms_sysv_extra_clobbered_registers):
	Make static.
	(xlogue_layout::get_stack_space_used, xlogue_layout::s_instances,
	xlogue_layout::get_instance, xlogue_layout::xlogue_layout,
	xlogue_layout::get_stub_name, xlogue_layout::get_stub_rtx,
	sp_valid_at, fp_valid_at, choose_basereg): Formatting.
	(xlogue_layout::compute_stub_managed_regs): Clear out param first.
	(stub_managed_regs): Remove.
	(ix86_save_reg): Use xlogue_layout::compute_stub_managed_regs.
	(disable_call_ms2sysv_xlogues): Rename to...
	(warn_once_call_ms2sysv_xlogues): ...this, and warn only once.
	(ix86_initial_elimination_offset, ix86_expand_call): Fix
	call_ms2sysv warning logic.
	(ix86_static_chain): Make sure that ix86_static_chain_on_stack
	can't change after reload_completed.
	(ix86_can_use_return_insn_p): Use the ix86_frame data structure
	directly.
	(ix86_expand_prologue): Likewise.
	(ix86_expand_epilogue): Likewise.
	(ix86_expand_split_stack_prologue): Likewise.
	(ix86_compute_frame_layout): Remove frame parameter ...
	(TARGET_COMPUTE_FRAME_LAYOUT): ... and export it as a target hook.
	(ix86_finalize_stack_realign_flags): Call ix86_compute_frame_layout
	only if necessary.
	(ix86_init_machine_status): Don't set
	use_fast_prologue_epilogue_nregs.
	(ix86_frame): Move from here ...
	* config/i386/i386.h (ix86_frame): ... to here.
	(machine_function): Remove use_fast_prologue_epilogue_nregs,
	cache the complete ix86_frame data structure instead.

Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 248031)
+++ gcc/config/i386/i386.c	(working copy)
@@ -2425,7 +2425,9 @@ static int const x86_64_int_return_registers[4] =
 /* Additional registers that are clobbered by SYSV calls.
*/ -unsigned const x86_64_ms_sysv_extra_clobbered_registers[12] = +#define NUM_X86_64_MS_CLOBBERED_REGS 12 +static int const x86_64_ms_sysv_extra_clobbered_registers + [NUM_X86_64_MS_CLOBBERED_REGS] = { SI_REG, DI_REG, XMM6_REG, XMM7_REG, @@ -2484,13 +2486,13 @@ class xlogue_layout { needs to store registers based upon data in the machine_function. */ HOST_WIDE_INT get_stack_space_used () const { - const struct machine_function &m = *cfun->machine; - unsigned last_reg = m.call_ms2sysv_extra_regs + MIN_REGS - 1; + const struct machine_function *m = cfun->machine; + unsigned last_reg = m->call_ms2sysv_extra_regs + MIN_REGS - 1; - gcc_assert (m.call_ms2sysv_extra_regs <= MAX_EXTRA_REGS); + gcc_assert (m->call_ms2sysv_extra_regs <= MAX_EXTRA_REGS); return m_regs[last_reg].offset - + (m.call_ms2sysv_pad_out ? 8 : 0) - + STUB_INDEX_OFFSET; + + (m->call_ms2sysv_pad_out ? 8 : 0) + + STUB_INDEX_OFFSET; } /* Returns the offset for the base pointer used by the stub. */ @@ -2532,7 +2534,7 @@ class xlogue_layout { /* Lazy-inited cache of symbol names for stubs. */ char m_stub_names[XLOGUE_STUB_COUNT][VARIANT_COUNT][STUB_NAME_MAX_LEN]; - static const struct xlogue_layout GTY(()) s_instances[XLOGUE_SET_COUNT]; + static const struct GTY(()) xlogue_layout s_instances[XLOGUE_SET_COUNT]; }; const char * const xlogue_layout::STUB_BASE_NAMES[XLOGUE_STUB_COUNT] = { @@ -2573,7 +2575,7 @@ const unsigned xlogue_layout::REG_ORDER[xlogue_lay }; /* Instantiates all xlogue_layout instances. */ -const struct xlogue_layout GTY(()) +const struct GTY(()) xlogue_layout xlogue_layout::s_instances[XLOGUE_SET_COUNT] = { xlogue_layout (0, false), xlogue_layout (8, false), @@ -2583,7 +2585,8 @@ xlogue_layout::s_instances[XLOGUE_SET_COUNT] = { /* Return an appropriate const instance of xlogue_layout based upon values in cfun->machine and crtl. 
*/ -const struct xlogue_layout &xlogue_layout::get_instance () +const struct xlogue_layout & +xlogue_layout::get_instance () { enum xlogue_stub_sets stub_set; bool aligned_plus_8 = cfun->machine->call_ms2sysv_pad_in; @@ -2607,10 +2610,11 @@ unsigned xlogue_layout::compute_stub_managed_regs (HARD_REG_SET &stub_managed_regs) { bool hfp = frame_pointer_needed || stack_realign_fp; - unsigned i, count; unsigned regno; + CLEAR_HARD_REG_SET (stub_managed_regs); + for (i = 0; i < NUM_X86_64_MS_CLOBBERED_REGS; ++i) { regno = x86_64_ms_sysv_extra_clobbered_registers[i]; @@ -2630,8 +2634,8 @@ xlogue_layout::compute_stub_managed_regs (HARD_REG add_to_hard_reg_set (&stub_managed_regs, DImode, regno); ++count; } - gcc_assert (count >= MIN_REGS && count <= MAX_REGS); - return count; + gcc_assert (count >= MIN_REGS && count <= MAX_REGS); + return count; } /* Constructor for xlogue_layout. */ @@ -2639,11 +2643,11 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT stack_ : m_hfp (hfp) , m_nregs (hfp ? 17 : 18), m_stack_align_off_in (stack_align_off_in) { + HOST_WIDE_INT offset = stack_align_off_in; + unsigned i, j; + memset (m_regs, 0, sizeof (m_regs)); memset (m_stub_names, 0, sizeof (m_stub_names)); - - HOST_WIDE_INT offset = stack_align_off_in; - unsigned i, j; for (i = j = 0; i < MAX_REGS; ++i) { unsigned regno = REG_ORDER[i]; @@ -2662,11 +2666,12 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT stack_ m_regs[j].regno = regno; m_regs[j++].offset = offset - STUB_INDEX_OFFSET; } - gcc_assert (j == m_nregs); + gcc_assert (j == m_nregs); } -const char *xlogue_layout::get_stub_name (enum xlogue_stub stub, - unsigned n_extra_regs) const +const char * +xlogue_layout::get_stub_name (enum xlogue_stub stub, + unsigned n_extra_regs) const { xlogue_layout *writey_this = const_cast(this); char *name = writey_this->m_stub_names[stub][n_extra_regs]; @@ -2676,7 +2681,7 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT stack_ { int res = snprintf (name, STUB_NAME_MAX_LEN, "__%s_%u", 
STUB_BASE_NAMES[stub], MIN_REGS + n_extra_regs); - gcc_checking_assert (res <= (int)STUB_NAME_MAX_LEN); + gcc_checking_assert (res < (int)STUB_NAME_MAX_LEN); } return name; @@ -2684,7 +2689,8 @@ xlogue_layout::xlogue_layout (HOST_WIDE_INT stack_ /* Return rtx of a symbol ref for the entry point (based upon cfun->machine->call_ms2sysv_extra_regs) of the specified stub. */ -rtx xlogue_layout::get_stub_rtx (enum xlogue_stub stub) const +rtx +xlogue_layout::get_stub_rtx (enum xlogue_stub stub) const { const unsigned n_extra_regs = cfun->machine->call_ms2sysv_extra_regs; gcc_checking_assert (n_extra_regs <= MAX_EXTRA_REGS); @@ -2703,73 +2709,6 @@ struct GTY(()) stack_local_entry { struct stack_local_entry *next; }; -/* Structure describing stack frame layout. - Stack grows downward: - - [arguments] - <- ARG_POINTER - saved pc - - saved static chain if ix86_static_chain_on_stack - - saved frame pointer if frame_pointer_needed - <- HARD_FRAME_POINTER - [saved regs] - <- reg_save_offset - [padding0] - <- stack_realign_offset - [saved SSE regs] - OR - [stub-saved registers for ms x64 --> sysv clobbers - <- Start of out-of-line, stub-saved/restored regs - (see libgcc/config/i386/(sav|res)ms64*.S) - [XMM6-15] - [RSI] - [RDI] - [?RBX] only if RBX is clobbered - [?RBP] only if RBP and RBX are clobbered - [?R12] only if R12 and all previous regs are clobbered - [?R13] only if R13 and all previous regs are clobbered - [?R14] only if R14 and all previous regs are clobbered - [?R15] only if R15 and all previous regs are clobbered - <- end of stub-saved/restored regs - [padding1] - ] - <- outlined_save_offset - <- sse_regs_save_offset - [padding2] - | <- FRAME_POINTER - [va_arg registers] | - | - [frame] | - | - [padding2] | = to_allocate - <- STACK_POINTER - */ -struct ix86_frame -{ - int nsseregs; - int nregs; - int va_arg_size; - int red_zone_size; - int outgoing_arguments_size; - - /* The offsets relative to ARG_POINTER. 
*/ - HOST_WIDE_INT frame_pointer_offset; - HOST_WIDE_INT hard_frame_pointer_offset; - HOST_WIDE_INT stack_pointer_offset; - HOST_WIDE_INT hfp_save_offset; - HOST_WIDE_INT reg_save_offset; - HOST_WIDE_INT stack_realign_allocate_offset; - HOST_WIDE_INT stack_realign_offset; - HOST_WIDE_INT outlined_save_offset; - HOST_WIDE_INT sse_reg_save_offset; - - /* When save_regs_using_mov is set, emit prologue using - move instead of push instructions. */ - bool save_regs_using_mov; -}; - /* Which cpu are we scheduling for. */ enum attr_cpu ix86_schedule; @@ -2861,7 +2800,7 @@ static unsigned int ix86_function_arg_boundary (ma const_tree); static rtx ix86_static_chain (const_tree, bool); static int ix86_function_regparm (const_tree, const_tree); -static void ix86_compute_frame_layout (struct ix86_frame *); +static void ix86_compute_frame_layout (void); static bool ix86_expand_vector_init_one_nonzero (bool, machine_mode, rtx, rtx, int); static void ix86_add_new_builtins (HOST_WIDE_INT, HOST_WIDE_INT); @@ -12293,7 +12232,7 @@ ix86_can_use_return_insn_p (void) if (crtl->args.pops_args && crtl->args.size >= 32768) return 0; - ix86_compute_frame_layout (&frame); + frame = cfun->machine->frame; return (frame.stack_pointer_offset == UNITS_PER_WORD && (frame.nregs + frame.nsseregs) == 0); } @@ -12634,10 +12573,6 @@ ix86_hard_regno_scratch_ok (unsigned int regno) && df_regs_ever_live_p (regno))); } -/* Registers who's save & restore will be managed by stubs called from - pro/epilogue. */ -static HARD_REG_SET GTY(()) stub_managed_regs; - /* Return true if register class CL should be an additional allocno class. */ @@ -12718,10 +12653,17 @@ ix86_save_reg (unsigned int regno, bool maybe_eh_r } } - if (ignore_outlined && cfun->machine->call_ms2sysv - && in_hard_reg_set_p (stub_managed_regs, DImode, regno)) - return false; + if (ignore_outlined && cfun->machine->call_ms2sysv) + { + /* Registers who's save & restore will be managed by stubs called from + pro/epilogue. 
*/ + HARD_REG_SET stub_managed_regs; + xlogue_layout::compute_stub_managed_regs (stub_managed_regs); + if (in_hard_reg_set_p (stub_managed_regs, DImode, regno)) + return false; + } + if (crtl->drap_reg && regno == REGNO (crtl->drap_reg) && !cfun->machine->no_drap_save_restore) @@ -12787,8 +12729,7 @@ ix86_can_eliminate (const int from, const int to) HOST_WIDE_INT ix86_initial_elimination_offset (int from, int to) { - struct ix86_frame frame; - ix86_compute_frame_layout (&frame); + struct ix86_frame frame = cfun->machine->frame; if (from == ARG_POINTER_REGNUM && to == HARD_FRAME_POINTER_REGNUM) return frame.hard_frame_pointer_offset; @@ -12818,13 +12759,16 @@ ix86_builtin_setjmp_frame_value (void) return stack_realign_fp ? hard_frame_pointer_rtx : virtual_stack_vars_rtx; } -/* Disables out-of-lined msabi to sysv pro/epilogues and emits a warning if - warn_once is null, or *warn_once is zero. */ -static void disable_call_ms2sysv_xlogues (const char *feature) +/* Emits a warning for unsupported msabi to sysv pro/epilogues. */ +static void warn_once_call_ms2sysv_xlogues (const char *feature) { - cfun->machine->call_ms2sysv = false; - warning (OPT_mcall_ms2sysv_xlogues, "not currently compatible with %s.", - feature); + static bool warned_once = false; + if (!warned_once) + { + warning (0, "-mcall-ms2sysv-xlogues is not compatible with %s", + feature); + warned_once = true; + } } /* When using -fsplit-stack, the allocation routines set a field in @@ -12836,8 +12780,9 @@ ix86_builtin_setjmp_frame_value (void) /* Fill structure ix86_frame about frame of currently computed function. 
*/ static void -ix86_compute_frame_layout (struct ix86_frame *frame) +ix86_compute_frame_layout (void) { + struct ix86_frame *frame = &cfun->machine->frame; struct machine_function *m = cfun->machine; unsigned HOST_WIDE_INT stack_alignment_needed; HOST_WIDE_INT offset; @@ -12845,46 +12790,41 @@ static void HOST_WIDE_INT size = get_frame_size (); HOST_WIDE_INT to_allocate; - CLEAR_HARD_REG_SET (stub_managed_regs); - /* m->call_ms2sysv is initially enabled in ix86_expand_call for all 64-bit * ms_abi functions that call a sysv function. We now need to prune away * cases where it should be disabled. */ if (TARGET_64BIT && m->call_ms2sysv) - { - gcc_assert (TARGET_64BIT_MS_ABI); - gcc_assert (TARGET_CALL_MS2SYSV_XLOGUES); - gcc_assert (!TARGET_SEH); + { + gcc_assert (TARGET_64BIT_MS_ABI); + gcc_assert (TARGET_CALL_MS2SYSV_XLOGUES); + gcc_assert (!TARGET_SEH); + gcc_assert (TARGET_SSE); + gcc_assert (!ix86_using_red_zone ()); - if (!TARGET_SSE) - m->call_ms2sysv = false; + if (crtl->calls_eh_return) + { + gcc_assert (!reload_completed); + m->call_ms2sysv = false; + warn_once_call_ms2sysv_xlogues ("__builtin_eh_return"); + } - /* Don't break hot-patched functions. */ - else if (ix86_function_ms_hook_prologue (current_function_decl)) - m->call_ms2sysv = false; + else if (ix86_static_chain_on_stack) + { + gcc_assert (!reload_completed); + m->call_ms2sysv = false; + warn_once_call_ms2sysv_xlogues ("static call chains"); + } - /* TODO: Cases not yet examined. */ - else if (crtl->calls_eh_return) - disable_call_ms2sysv_xlogues ("__builtin_eh_return"); + /* Finally, compute which registers the stub will manage. 
*/ + else + { + HARD_REG_SET stub_managed_regs; + unsigned count = xlogue_layout + ::compute_stub_managed_regs (stub_managed_regs); + m->call_ms2sysv_extra_regs = count - xlogue_layout::MIN_REGS; + } + } - else if (ix86_static_chain_on_stack) - disable_call_ms2sysv_xlogues ("static call chains"); - - else if (ix86_using_red_zone ()) - disable_call_ms2sysv_xlogues ("red zones"); - - else if (flag_split_stack) - disable_call_ms2sysv_xlogues ("split stack"); - - /* Finally, compute which registers the stub will manage. */ - else - { - unsigned count = xlogue_layout - ::compute_stub_managed_regs (stub_managed_regs); - m->call_ms2sysv_extra_regs = count - xlogue_layout::MIN_REGS; - } - } - frame->nregs = ix86_nsaved_regs (); frame->nsseregs = ix86_nsaved_sseregs (); m->call_ms2sysv_pad_in = 0; @@ -12916,19 +12856,11 @@ static void in doing anything except PUSHs. */ if (TARGET_SEH) m->use_fast_prologue_epilogue = false; - - /* During reload iteration the amount of registers saved can change. - Recompute the value as needed. Do not recompute when amount of registers - didn't change as reload does multiple calls to the function and does not - expect the decision to change within single iteration. */ - else if (!optimize_bb_for_size_p (ENTRY_BLOCK_PTR_FOR_FN (cfun)) - && m->use_fast_prologue_epilogue_nregs != frame->nregs) + else if (!optimize_bb_for_size_p (ENTRY_BLOCK_PTR_FOR_FN (cfun))) { int count = frame->nregs; struct cgraph_node *node = cgraph_node::get (current_function_decl); - m->use_fast_prologue_epilogue_nregs = count; - /* The fast prologue uses move instead of push to save registers. This is significantly longer, but also executes faster as modern hardware can execute the moves in parallel, but can't do that for push/pop. @@ -13145,7 +13077,8 @@ choose_baseaddr_len (unsigned int regno, HOST_WIDE /* Determine if the stack pointer is valid for accessing the cfa_offset. 
*/ -static inline bool sp_valid_at (HOST_WIDE_INT cfa_offset) +static inline bool +sp_valid_at (HOST_WIDE_INT cfa_offset) { const struct machine_frame_state &fs = cfun->machine->fs; return fs.sp_valid && !(fs.sp_realigned @@ -13154,7 +13087,8 @@ choose_baseaddr_len (unsigned int regno, HOST_WIDE /* Determine if the frame pointer is valid for accessing the cfa_offset. */ -static inline bool fp_valid_at (HOST_WIDE_INT cfa_offset) +static inline bool +fp_valid_at (HOST_WIDE_INT cfa_offset) { const struct machine_frame_state &fs = cfun->machine->fs; return fs.fp_valid && !(fs.sp_valid && fs.sp_realigned @@ -13164,9 +13098,10 @@ choose_baseaddr_len (unsigned int regno, HOST_WIDE /* Choose a base register based upon alignment requested, speed and/or size. */ -static void choose_basereg (HOST_WIDE_INT cfa_offset, rtx &base_reg, - HOST_WIDE_INT &base_offset, - unsigned int align_reqested, unsigned int *align) +static void +choose_basereg (HOST_WIDE_INT cfa_offset, rtx &base_reg, + HOST_WIDE_INT &base_offset, + unsigned int align_reqested, unsigned int *align) { const struct machine_function *m = cfun->machine; unsigned int hfp_align; @@ -14159,6 +14094,7 @@ ix86_finalize_stack_realign_flags (void) < (crtl->is_leaf && !ix86_current_function_calls_tls_descriptor ? 
crtl->max_used_stack_slot_alignment : crtl->stack_alignment_needed)); + bool recompute_frame_layout_p = false; if (crtl->stack_realign_finalized) { @@ -14208,8 +14144,12 @@ ix86_finalize_stack_realign_flags (void) && requires_stack_frame_p (insn, prologue_used, set_up_by_prologue)) { + if (crtl->stack_realign_needed != stack_realign) + recompute_frame_layout_p = true; crtl->stack_realign_needed = stack_realign; crtl->stack_realign_finalized = true; + if (recompute_frame_layout_p) + ix86_compute_frame_layout (); return; } } @@ -14240,10 +14180,15 @@ ix86_finalize_stack_realign_flags (void) df_scan_blocks (); df_compute_regs_ever_live (true); df_analyze (); + recompute_frame_layout_p = true; } + if (crtl->stack_realign_needed != stack_realign) + recompute_frame_layout_p = true; crtl->stack_realign_needed = stack_realign; crtl->stack_realign_finalized = true; + if (recompute_frame_layout_p) + ix86_compute_frame_layout (); } /* Delete SET_GOT right after entry block if it is allocated to reg. */ @@ -14372,7 +14317,7 @@ ix86_expand_prologue (void) m->fs.sp_valid = true; m->fs.sp_realigned = false; - ix86_compute_frame_layout (&frame); + frame = m->frame; if (!TARGET_64BIT && ix86_function_ms_hook_prologue (current_function_decl)) { @@ -15212,7 +15157,7 @@ ix86_expand_epilogue (int style) bool restore_stub_is_tail = false; ix86_finalize_stack_realign_flags (); - ix86_compute_frame_layout (&frame); + frame = m->frame; m->fs.sp_realigned = stack_realign_fp; m->fs.sp_valid = stack_realign_fp @@ -15757,7 +15702,7 @@ ix86_expand_split_stack_prologue (void) gcc_assert (flag_split_stack && reload_completed); ix86_finalize_stack_realign_flags (); - ix86_compute_frame_layout (&frame); + frame = cfun->machine->frame; allocate = frame.stack_pointer_offset - INCOMING_FRAME_SP_OFFSET; /* This is the label we will branch to if we have enough stack @@ -29320,7 +29265,24 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx call /* Set here, but it may get cleared later. 
*/ if (TARGET_CALL_MS2SYSV_XLOGUES) - cfun->machine->call_ms2sysv = true; + { + if (!TARGET_SSE) + ; + + /* Don't break hot-patched functions. */ + else if (ix86_function_ms_hook_prologue (current_function_decl)) + ; + + /* TODO: Cases not yet examined. */ + else if (flag_split_stack) + warn_once_call_ms2sysv_xlogues ("-fsplit-stack"); + + else + { + gcc_assert (!reload_completed); + cfun->machine->call_ms2sysv = true; + } + } } if (vec_len > 1) @@ -29455,7 +29417,6 @@ ix86_init_machine_status (void) struct machine_function *f; f = ggc_cleared_alloc (); - f->use_fast_prologue_epilogue_nregs = -1; f->call_abi = ix86_abi; return f; @@ -31516,7 +31477,10 @@ ix86_static_chain (const_tree fndecl_or_type, bool if (incoming_p) { if (fndecl == current_function_decl) - ix86_static_chain_on_stack = true; + { + gcc_assert (!reload_completed); + ix86_static_chain_on_stack = true; + } return gen_frame_mem (SImode, plus_constant (Pmode, arg_pointer_rtx, -8)); @@ -52828,6 +52792,9 @@ ix86_run_selftests (void) #undef TARGET_LEGITIMATE_CONSTANT_P #define TARGET_LEGITIMATE_CONSTANT_P ix86_legitimate_constant_p +#undef TARGET_COMPUTE_FRAME_LAYOUT +#define TARGET_COMPUTE_FRAME_LAYOUT ix86_compute_frame_layout + #undef TARGET_FRAME_POINTER_REQUIRED #define TARGET_FRAME_POINTER_REQUIRED ix86_frame_pointer_required Index: gcc/config/i386/i386.h =================================================================== --- gcc/config/i386/i386.h (revision 248031) +++ gcc/config/i386/i386.h (working copy) @@ -2163,10 +2163,6 @@ extern int const dbx_register_map[FIRST_PSEUDO_REG extern int const dbx64_register_map[FIRST_PSEUDO_REGISTER]; extern int const svr4_dbx_register_map[FIRST_PSEUDO_REGISTER]; -extern unsigned const x86_64_ms_sysv_extra_clobbered_registers[12]; -#define NUM_X86_64_MS_CLOBBERED_REGS \ - (ARRAY_SIZE (x86_64_ms_sysv_extra_clobbered_registers)) - /* Before the prologue, RA is at 0(%esp). 
*/ #define INCOMING_RETURN_ADDR_RTX \ gen_rtx_MEM (Pmode, gen_rtx_REG (Pmode, STACK_POINTER_REGNUM)) @@ -2448,9 +2444,76 @@ enum avx_u128_state #define FASTCALL_PREFIX '@' +#ifndef USED_FOR_TARGET +/* Structure describing stack frame layout. + Stack grows downward: + + [arguments] + <- ARG_POINTER + saved pc + + saved static chain if ix86_static_chain_on_stack + + saved frame pointer if frame_pointer_needed + <- HARD_FRAME_POINTER + [saved regs] + <- reg_save_offset + [padding0] + <- stack_realign_offset + [saved SSE regs] + OR + [stub-saved registers for ms x64 --> sysv clobbers + <- Start of out-of-line, stub-saved/restored regs + (see libgcc/config/i386/(sav|res)ms64*.S) + [XMM6-15] + [RSI] + [RDI] + [?RBX] only if RBX is clobbered + [?RBP] only if RBP and RBX are clobbered + [?R12] only if R12 and all previous regs are clobbered + [?R13] only if R13 and all previous regs are clobbered + [?R14] only if R14 and all previous regs are clobbered + [?R15] only if R15 and all previous regs are clobbered + <- end of stub-saved/restored regs + [padding1] + ] + <- outlined_save_offset + <- sse_regs_save_offset + [padding2] + | <- FRAME_POINTER + [va_arg registers] | + | + [frame] | + | + [padding2] | = to_allocate + <- STACK_POINTER + */ +struct GTY(()) ix86_frame +{ + int nsseregs; + int nregs; + int va_arg_size; + int red_zone_size; + int outgoing_arguments_size; + + /* The offsets relative to ARG_POINTER. */ + HOST_WIDE_INT frame_pointer_offset; + HOST_WIDE_INT hard_frame_pointer_offset; + HOST_WIDE_INT stack_pointer_offset; + HOST_WIDE_INT hfp_save_offset; + HOST_WIDE_INT reg_save_offset; + HOST_WIDE_INT stack_realign_allocate_offset; + HOST_WIDE_INT stack_realign_offset; + HOST_WIDE_INT outlined_save_offset; + HOST_WIDE_INT sse_reg_save_offset; + + /* When save_regs_using_mov is set, emit prologue using + move instead of push instructions. */ + bool save_regs_using_mov; +}; + /* Machine specific frame tracking during prologue/epilogue generation. 
*/ -#ifndef USED_FOR_TARGET struct GTY(()) machine_frame_state { /* This pair tracks the currently active CFA as reg+offset. When reg @@ -2520,9 +2583,8 @@ struct GTY(()) machine_function { int varargs_fpr_size; int optimize_mode_switching[MAX_386_ENTITIES]; - /* Number of saved registers USE_FAST_PROLOGUE_EPILOGUE - has been computed for. */ - int use_fast_prologue_epilogue_nregs; + /* Cached initial frame layout for the current function. */ + struct ix86_frame frame; /* For -fsplit-stack support: A stack local which holds a pointer to the stack arguments for a function with a variable number of