From patchwork Mon Nov 11 14:44:10 2013
X-Patchwork-Submitter: Ulrich Weigand
X-Patchwork-Id: 290333
Message-Id: <201311111444.rABEiA6w013685@d06av02.portsmouth.uk.ibm.com>
Subject: [PATCH, rs6000] ELFv2 ABI 7/8: Eliminate some stack frame fields
To: gcc-patches@gcc.gnu.org
Date: Mon, 11 Nov 2013 15:44:10 +0100 (CET)
From: "Ulrich Weigand"

Hello,

this is the second part of reducing stack
space consumption for the ELFv2 ABI: the old ABI reserved six words in
every stack frame for various purposes, and two of those were basically
unused: one word for "compiler use" and one word for "linker use".  Since
neither the compiler nor the linker ever made any nontrivial use of
these fields, they're now eliminated in the new ABI.

This patch implements this change, which is mostly straightforward
except that the change moves the stack offset of the TOC save area,
which was hard-coded in a number of places ...

The patch also updates a number of test cases that hard-coded the stack
layout.

Bye,
Ulrich


gcc/ChangeLog:

2013-11-11  Ulrich Weigand
            Alan Modra

        * config/rs6000/rs6000.h (RS6000_SAVE_AREA): Handle ABI_ELFv2.
        (RS6000_SAVE_TOC): Remove.
        (RS6000_TOC_SAVE_SLOT): New macro.
        * config/rs6000/rs6000.c (rs6000_parm_offset): New function.
        (rs6000_parm_start): Use it.
        (rs6000_function_arg_advance_1): Likewise.
        (rs6000_emit_prologue): Use RS6000_TOC_SAVE_SLOT.
        (rs6000_emit_epilogue): Likewise.
        (rs6000_call_aix): Likewise.
        (rs6000_output_function_prologue): Do not save/restore r11
        around calling _mcount for ABI_ELFv2.

libgcc/ChangeLog:

2013-11-11  Ulrich Weigand
            Alan Modra

        * config/rs6000/linux-unwind.h (TOC_SAVE_SLOT): Define.
        (frob_update_context): Use it.

gcc/testsuite/ChangeLog:

2013-11-11  Ulrich Weigand

        * gcc.target/powerpc/ppc64-abi-1.c (stack_frame_t): Remove
        compiler and linker field if _CALL_ELF == 2.
        * gcc.target/powerpc/ppc64-abi-2.c (stack_frame_t): Likewise.
        * gcc.target/powerpc/ppc64-abi-dfp-1.c (stack_frame_t): Likewise.
        * gcc.dg/stack-usage-1.c (SIZE): Update value for _CALL_ELF == 2.


Index: gcc/gcc/config/rs6000/rs6000.h
===================================================================
--- gcc.orig/gcc/config/rs6000/rs6000.h
+++ gcc/gcc/config/rs6000/rs6000.h
@@ -1529,12 +1529,12 @@ extern enum reg_class rs6000_constraints
 
 /* Size of the fixed area on the stack */
 #define RS6000_SAVE_AREA \
-  ((DEFAULT_ABI == ABI_V4 ? 8 : 24) << (TARGET_64BIT ? 1 : 0))
+  ((DEFAULT_ABI == ABI_V4 ? 8 : DEFAULT_ABI == ABI_ELFv2 ? 16 : 24) \
+   << (TARGET_64BIT ? 1 : 0))
 
-/* MEM representing address to save the TOC register */
-#define RS6000_SAVE_TOC gen_rtx_MEM (Pmode, \
-                                     plus_constant (Pmode, stack_pointer_rtx, \
-                                                    (TARGET_32BIT ? 20 : 40)))
+/* Stack offset for toc save slot.  */
+#define RS6000_TOC_SAVE_SLOT \
+  ((DEFAULT_ABI == ABI_ELFv2 ? 12 : 20) << (TARGET_64BIT ? 1 : 0))
 
 /* Align an address */
 #define RS6000_ALIGN(n,a) (((n) + (a) - 1) & ~((a) - 1))
Index: gcc/gcc/config/rs6000/rs6000.c
===================================================================
--- gcc.orig/gcc/config/rs6000/rs6000.c
+++ gcc/gcc/config/rs6000/rs6000.c
@@ -9029,6 +9029,16 @@ rs6000_function_arg_boundary (enum machi
   return PARM_BOUNDARY;
 }
 
+/* The offset in words to the start of the parameter save area.  */
+
+static unsigned int
+rs6000_parm_offset (void)
+{
+  return (DEFAULT_ABI == ABI_V4 ? 2
+          : DEFAULT_ABI == ABI_ELFv2 ? 4
+          : 6);
+}
+
 /* For a function parm of MODE and TYPE, return the starting word in
    the parameter area.  NWORDS of the parameter area are already used.  */
 
@@ -9037,11 +9047,9 @@ rs6000_parm_start (enum machine_mode mod
                    unsigned int nwords)
 {
   unsigned int align;
-  unsigned int parm_offset;
 
   align = rs6000_function_arg_boundary (mode, type) / PARM_BOUNDARY - 1;
-  parm_offset = DEFAULT_ABI == ABI_V4 ? 2 : 6;
-  return nwords + (-(parm_offset + nwords) & align);
+  return nwords + (-(rs6000_parm_offset () + nwords) & align);
 }
 
 /* Compute the size (in words) of a function argument.  */
@@ -9281,15 +9289,13 @@ rs6000_function_arg_advance_1 (CUMULATIV
       {
         int align;
 
-        /* Vector parameters must be 16-byte aligned.  This places
-           them at 2 mod 4 in terms of words in 32-bit mode, since
-           the parameter save area starts at offset 24 from the
-           stack.  In 64-bit mode, they just have to start on an
-           even word, since the parameter save area is 16-byte
-           aligned.  Space for GPRs is reserved even if the argument
-           will be passed in memory.  */
+        /* Vector parameters must be 16-byte aligned.  In 32-bit
+           mode this means we need to take into account the offset
+           to the parameter save area.  In 64-bit mode, they just
+           have to start on an even word, since the parameter save
+           area is 16-byte aligned.  */
         if (TARGET_32BIT)
-          align = (2 - cum->words) & 3;
+          align = -(rs6000_parm_offset () + cum->words) & 3;
         else
           align = cum->words & 1;
         cum->words += align + rs6000_arg_size (mode, type);
@@ -9965,13 +9971,13 @@ rs6000_function_arg (cumulative_args_t c
       int align, align_words, n_words;
       enum machine_mode part_mode;
 
-      /* Vector parameters must be 16-byte aligned.  This places them at
-         2 mod 4 in terms of words in 32-bit mode, since the parameter
-         save area starts at offset 24 from the stack.  In 64-bit mode,
-         they just have to start on an even word, since the parameter
-         save area is 16-byte aligned.  */
+      /* Vector parameters must be 16-byte aligned.  In 32-bit
+         mode this means we need to take into account the offset
+         to the parameter save area.  In 64-bit mode, they just
+         have to start on an even word, since the parameter save
+         area is 16-byte aligned.  */
       if (TARGET_32BIT)
-        align = (2 - cum->words) & 3;
+        align = -(rs6000_parm_offset () + cum->words) & 3;
       else
         align = cum->words & 1;
       align_words = cum->words + align;
@@ -20188,6 +20194,34 @@ rs6000_savres_strategy (rs6000_stack_t *
 
    The required alignment for AIX configurations is two words (i.e., 8
    or 16 bytes).
 
+   The ELFv2 ABI is a variant of the AIX ABI.  Stack frames look like:
+
+        SP----> +---------------------------------------+
+                | Back chain to caller                  |  0
+                +---------------------------------------+
+                | Save area for CR                      |  8
+                +---------------------------------------+
+                | Saved LR                              |  16
+                +---------------------------------------+
+                | Saved TOC pointer                     |  24
+                +---------------------------------------+
+                | Parameter save area (P)               |  32
+                +---------------------------------------+
+                | Alloca space (A)                      |  32+P
+                +---------------------------------------+
+                | Local variable space (L)              |  32+P+A
+                +---------------------------------------+
+                | Save area for AltiVec registers (W)   |  32+P+A+L
+                +---------------------------------------+
+                | AltiVec alignment padding (Y)         |  32+P+A+L+W
+                +---------------------------------------+
+                | Save area for GP registers (G)        |  32+P+A+L+W+Y
+                +---------------------------------------+
+                | Save area for FP registers (F)        |  32+P+A+L+W+Y+G
+                +---------------------------------------+
+        old SP->| back chain to caller's caller         |  32+P+A+L+W+Y+G+F
+                +---------------------------------------+
+
    V.4 stack frames look like:
 
@@ -22532,7 +22566,8 @@ rs6000_emit_prologue (void)
          be updated if we arrived at this function via a plt call or
          toc adjusting stub.  */
       emit_move_insn (tmp_reg_si, gen_rtx_MEM (SImode, tmp_reg));
-      toc_restore_insn = TARGET_32BIT ? 0x80410014 : 0xE8410028;
+      toc_restore_insn = ((TARGET_32BIT ? 0x80410000 : 0xE8410000)
+                          + RS6000_TOC_SAVE_SLOT);
       hi = gen_int_mode (toc_restore_insn & ~0xffff, SImode);
       emit_insn (gen_xorsi3 (tmp_reg_si, tmp_reg_si, hi));
       compare_result = gen_rtx_REG (CCUNSmode, CR0_REGNO);
@@ -22551,7 +22586,7 @@ rs6000_emit_prologue (void)
       LABEL_NUSES (toc_save_done) += 1;
 
       save_insn = emit_frame_save (frame_reg_rtx, reg_mode,
-                                   TOC_REGNUM, frame_off + 5 * reg_size,
+                                   TOC_REGNUM, frame_off + RS6000_TOC_SAVE_SLOT,
                                    sp_off - frame_off);
 
       emit_label (toc_save_done);
@@ -22946,7 +22981,7 @@ rs6000_emit_prologue (void)
       if (rs6000_save_toc_in_prologue_p ())
         {
           rtx reg = gen_rtx_REG (reg_mode, TOC_REGNUM);
-          emit_insn (gen_frame_store (reg, sp_reg_rtx, 5 * reg_size));
+          emit_insn (gen_frame_store (reg, sp_reg_rtx, RS6000_TOC_SAVE_SLOT));
         }
     }
 
@@ -23014,7 +23049,11 @@ rs6000_output_function_prologue (FILE *f
           asm_fprintf (file, "\tmflr %s\n", reg_names[0]);
           asm_fprintf (file, "\tstd %s,16(%s)\n",
                        reg_names[0], reg_names[1]);
-          if (cfun->static_chain_decl != NULL)
+          /* In the ELFv2 ABI we have no compiler stack word.  It must be
+             the responsibility of _mcount to preserve the static chain
+             register if required.  */
+          if (DEFAULT_ABI != ABI_ELFv2
+              && cfun->static_chain_decl != NULL)
             {
               asm_fprintf (file, "\tstd %s,24(%s)\n",
                            reg_names[STATIC_CHAIN_REGNUM], reg_names[1]);
@@ -23758,7 +23797,7 @@ rs6000_emit_epilogue (int sibcall)
         {
           rtx reg = gen_rtx_REG (reg_mode, 2);
           emit_insn (gen_frame_load (reg, frame_reg_rtx,
-                                     frame_off + 5 * reg_size));
+                                     frame_off + RS6000_TOC_SAVE_SLOT));
         }
 
       for (i = 0; ; ++i)
@@ -31335,7 +31374,7 @@ rs6000_call_aix (rtx value, rtx func_des
       /* Save the TOC into its reserved slot before the call,
          and prepare to restore it after the call.  */
       rtx stack_ptr = gen_rtx_REG (Pmode, STACK_POINTER_REGNUM);
-      rtx stack_toc_offset = GEN_INT (5 * GET_MODE_SIZE (Pmode));
+      rtx stack_toc_offset = GEN_INT (RS6000_TOC_SAVE_SLOT);
       rtx stack_toc_mem = gen_frame_mem (Pmode,
                                          gen_rtx_PLUS (Pmode, stack_ptr,
                                                        stack_toc_offset));
Index: gcc/libgcc/config/rs6000/linux-unwind.h
===================================================================
--- gcc.orig/libgcc/config/rs6000/linux-unwind.h
+++ gcc/libgcc/config/rs6000/linux-unwind.h
@@ -29,6 +29,14 @@
 #define R_VR0 77
 #define R_VRSAVE 109
 
+#ifdef __powerpc64__
+#if _CALL_ELF == 2
+#define TOC_SAVE_SLOT 24
+#else
+#define TOC_SAVE_SLOT 40
+#endif
+#endif
+
 struct gcc_vregs
 {
   __attribute__ ((vector_size (16))) int vr[32];
@@ -310,11 +318,11 @@ frob_update_context (struct _Unwind_Cont
      figure out if it was saved.  The big problem here is that the
      code that does the save/restore is generated by the linker, so
      we have no good way to determine at compile time what to do.  */
-  if (pc[0] == 0xF8410028
+  if (pc[0] == 0xF8410000 + TOC_SAVE_SLOT
 #if _CALL_ELF != 2
       /* The ELFv2 linker never generates the old PLT stub form.  */
       || ((pc[0] & 0xFFFF0000) == 0x3D820000
-          && pc[1] == 0xF8410028)
+          && pc[1] == 0xF8410000 + TOC_SAVE_SLOT)
 #endif
       )
     {
@@ -325,19 +333,19 @@ frob_update_context (struct _Unwind_Cont
     {
       unsigned int *insn
         = (unsigned int *) _Unwind_GetGR (context, R_LR);
-      if (insn && *insn == 0xE8410028)
-        _Unwind_SetGRPtr (context, 2, context->cfa + 40);
+      if (insn && *insn == 0xE8410000 + TOC_SAVE_SLOT)
+        _Unwind_SetGRPtr (context, 2, context->cfa + TOC_SAVE_SLOT);
 #if _CALL_ELF != 2
       /* ELFv2 does not use this function pointer call sequence.  */
       else if (pc[0] == 0x4E800421
-               && pc[1] == 0xE8410028)
+               && pc[1] == 0xE8410000 + TOC_SAVE_SLOT)
         {
           /* We are at the bctrl instruction in a call via function
              pointer.  gcc always emits the load of the new R2 just
              before the bctrl so this is the first and only place
              we need to use the stored R2.  */
           _Unwind_Word sp = _Unwind_GetGR (context, 1);
-          _Unwind_SetGRPtr (context, 2, (void *)(sp + 40));
+          _Unwind_SetGRPtr (context, 2, (void *)(sp + TOC_SAVE_SLOT));
         }
 #endif
     }
Index: gcc/gcc/testsuite/gcc.target/powerpc/ppc64-abi-1.c
===================================================================
--- gcc.orig/gcc/testsuite/gcc.target/powerpc/ppc64-abi-1.c
+++ gcc/gcc/testsuite/gcc.target/powerpc/ppc64-abi-1.c
@@ -89,8 +89,10 @@ typedef struct sf
   long a1;
   long a2;
   long a3;
+#if _CALL_ELF != 2
   long a4;
   long a5;
+#endif
   parm_t slot[100];
 } stack_frame_t;
 
Index: gcc/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c
===================================================================
--- gcc.orig/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c
+++ gcc/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c
@@ -107,8 +107,10 @@ typedef struct sf
   long a1;
   long a2;
   long a3;
+#if _CALL_ELF != 2
   long a4;
   long a5;
+#endif
   parm_t slot[100];
 } stack_frame_t;
 
Index: gcc/gcc/testsuite/gcc.target/powerpc/ppc64-abi-dfp-1.c
===================================================================
--- gcc.orig/gcc/testsuite/gcc.target/powerpc/ppc64-abi-dfp-1.c
+++ gcc/gcc/testsuite/gcc.target/powerpc/ppc64-abi-dfp-1.c
@@ -87,8 +87,10 @@ typedef struct sf
   long a1;
   long a2;
   long a3;
+#if _CALL_ELF != 2
   long a4;
   long a5;
+#endif
   unsigned long slot[100];
 } stack_frame_t;
 
Index: gcc/gcc/testsuite/gcc.dg/stack-usage-1.c
===================================================================
--- gcc.orig/gcc/testsuite/gcc.dg/stack-usage-1.c
+++ gcc/gcc/testsuite/gcc.dg/stack-usage-1.c
@@ -40,7 +40,11 @@
 #  endif
 #elif defined (__powerpc64__) || defined (__ppc64__) || defined (__POWERPC64__) \
    || defined (__PPC64__)
-#  define SIZE 180
+#  if _CALL_ELF == 2
+#    define SIZE 208
+#  else
+#    define SIZE 180
+#  endif
 #elif defined (__powerpc__) || defined (__PPC__) || defined (__ppc__) \
    || defined (__POWERPC__) || defined (PPC) || defined (_IBMR2)
 #  if defined (__ALTIVEC__)