From patchwork Fri May 29 21:37:51 2015
X-Patchwork-Submitter: Sriraman Tallam
X-Patchwork-Id: 478057
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
In-Reply-To: <20150529193552.GA52215@kam.mff.cuni.cz>
References: <20150529193552.GA52215@kam.mff.cuni.cz>
Date: Fri, 29 May 2015 14:37:51 -0700
Subject: Re: [RFC][PATCH][X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=
From: Sriraman Tallam
To: Jan Hubicka
Cc: "H.J. Lu", Pedro Alves, Michael Matz, David Li, GCC Patches

On Fri, May 29, 2015 at 12:35 PM, Jan Hubicka wrote:
>> * config/i386/i386.c (avoid_plt_to_call): New function.
>> (ix86_output_call_insn): Generate indirect call for functions
>> marked with "noplt" attribute.
>> (attribute_spec ix86_attribute_): Define new attribute "noplt".
>> * doc/extend.texi: Document new attribute "noplt".
>> * gcc.target/i386/noplt-1.c: New testcase.
>> * gcc.target/i386/noplt-2.c: New testcase.
>>
>> Index: config/i386/i386.c
>> ===================================================================
>> --- config/i386/i386.c (revision 223720)
>> +++ config/i386/i386.c (working copy)
>> @@ -25599,6 +25599,24 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx call
>>    return call;
>>  }
>>
>> +/* Return true if the function being called was marked with attribute
>> +   "noplt".  If this function is defined, this should return false.  */
>> +static bool
>> +avoid_plt_to_call (rtx call_op)
>> +{
>> +  if (SYMBOL_REF_LOCAL_P (call_op))
>> +    return false;
>> +
>> +  tree symbol_decl = SYMBOL_REF_DECL (call_op);
>> +
>> +  if (symbol_decl != NULL_TREE
>> +      && TREE_CODE (symbol_decl) == FUNCTION_DECL
>> +      && lookup_attribute ("noplt", DECL_ATTRIBUTES (symbol_decl)))
>> +    return true;
>> +
>> +  return false;
>> +}
>
> OK, now we have __attribute__ (optimize("noplt")) which binds to the caller and makes
> all calls in the function to skip PLT and __attribute__ ("noplt") which binds to callee
> and makes all calls to function to not use PLT.
>
> That sort of makes sense to me, but why "noplt" attribute is not implemented at generic level
> just like -fplt? Is it only because every target supporting PLT would need update in its
> call expansion patterns?

Yes, that is what I had in mind.

>
> Also I think the PLT calls have EBX in call fusage which is added by ix86_expand_call.
>   else
>     {
>       /* Static functions and indirect calls don't need the pic register.  */
>       if (flag_pic
>           && (!TARGET_64BIT
>               || (ix86_cmodel == CM_LARGE_PIC
>                   && DEFAULT_ABI != MS_ABI))
>           && GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF
>           && ! SYMBOL_REF_LOCAL_P (XEXP (fnaddr, 0)))
>         {
>           use_reg (&use, gen_rtx_REG (Pmode, REAL_PIC_OFFSET_TABLE_REGNUM));
>           if (ix86_use_pseudo_pic_reg ())
>             emit_move_insn (gen_rtx_REG (Pmode, REAL_PIC_OFFSET_TABLE_REGNUM),
>                             pic_offset_table_rtx);
>         }
>
> I think you want to take that away from FUSAGE there just like we do for local calls
> (and in fact the code should already check flag_pic && flag_plt I suppose).

Done that now and patch attached.

Thanks
Sri

>
> Honza

	* config/i386/i386.c (avoid_plt_to_call): New function.
	(ix86_expand_call): Don't use the PIC register when external function
	calls are not made via PLT.
	(ix86_output_call_insn): Generate indirect call for functions
	marked with "noplt" attribute.
	(attribute_spec ix86_attribute_): Define new attribute "noplt".
	* doc/extend.texi: Document new attribute "noplt".
	* gcc.target/i386/noplt-1.c: New testcase.
	* gcc.target/i386/noplt-2.c: New testcase.

Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 223720)
+++ config/i386/i386.c (working copy)
@@ -25475,6 +25475,28 @@ construct_plt_address (rtx symbol)
   return tmp;
 }
 
+/* Return true if the function being called was marked with attribute
+   "noplt".  If this function is defined, this should return false.  This
+   is currently used only with 64-bit ELF targets.  */
+static bool
+avoid_plt_to_call (rtx call_op)
+{
+  if (!TARGET_64BIT || TARGET_MACHO || TARGET_SEH || TARGET_PECOFF)
+    return false;
+
+  if (SYMBOL_REF_LOCAL_P (call_op))
+    return false;
+
+  tree symbol_decl = SYMBOL_REF_DECL (call_op);
+
+  if (symbol_decl != NULL_TREE
+      && TREE_CODE (symbol_decl) == FUNCTION_DECL
+      && lookup_attribute ("noplt", DECL_ATTRIBUTES (symbol_decl)))
+    return true;
+
+  return false;
+}
+
 rtx
 ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
                   rtx callarg2,
@@ -25497,13 +25519,16 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx call
     }
   else
     {
-      /* Static functions and indirect calls don't need the pic register.  */
+      /* Static functions and indirect calls don't need the pic register.  Also,
+         check if PLT was explicitly avoided via -fno-plt or "noplt" attribute, making
+         it an indirect call.  */
       if (flag_pic
           && (!TARGET_64BIT
               || (ix86_cmodel == CM_LARGE_PIC
                   && DEFAULT_ABI != MS_ABI))
           && GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF
-          && ! SYMBOL_REF_LOCAL_P (XEXP (fnaddr, 0)))
+          && ! SYMBOL_REF_LOCAL_P (XEXP (fnaddr, 0))
+          && flag_plt && !avoid_plt_to_call (XEXP (fnaddr, 0)))
         {
           use_reg (&use, gen_rtx_REG (Pmode, REAL_PIC_OFFSET_TABLE_REGNUM));
           if (ix86_use_pseudo_pic_reg ())
@@ -25611,7 +25636,13 @@ ix86_output_call_insn (rtx_insn *insn, rtx call_op
   if (SIBLING_CALL_P (insn))
     {
       if (direct_p)
-        xasm = "%!jmp\t%P0";
+        {
+          if (!TARGET_MACHO && !TARGET_SEH && !TARGET_PECOFF
+              && TARGET_64BIT && avoid_plt_to_call (call_op))
+            xasm = "%!jmp\t*%p0@GOTPCREL(%%rip)";
+          else
+            xasm = "%!jmp\t%P0";
+        }
       /* SEH epilogue detection requires the indirect branch case
          to include REX.W.  */
       else if (TARGET_SEH)
@@ -25654,7 +25685,13 @@ ix86_output_call_insn (rtx_insn *insn, rtx call_op
     }
 
   if (direct_p)
-    xasm = "%!call\t%P0";
+    {
+      if (!TARGET_MACHO && !TARGET_SEH && !TARGET_PECOFF
+          && TARGET_64BIT && avoid_plt_to_call (call_op))
+        xasm = "%!call\t*%p0@GOTPCREL(%%rip)";
+      else
+        xasm = "%!call\t%P0";
+    }
   else
     xasm = "%!call\t%A0";
 
@@ -46628,6 +46665,9 @@ static const struct attribute_spec ix86_attribute_
     false },
   { "callee_pop_aggregate_return", 1, 1, false, true, true,
     ix86_handle_callee_pop_aggregate_return, true },
+  /* Attribute to avoid calling function via PLT.  */
+  { "noplt", 0, 0, true, false, false, ix86_handle_fndecl_attribute,
+    false },
   /* End element.  */
   { NULL, 0, 0, false, false, false, NULL, false }
 };
Index: doc/extend.texi
===================================================================
--- doc/extend.texi (revision 223720)
+++ doc/extend.texi (working copy)
@@ -4858,6 +4858,13 @@ On x86-32 targets, the @code{stdcall} attribute ca
 assume that the called function pops off the stack space used to
 pass arguments, unless it takes a variable number of arguments.
 
+@item noplt
+@cindex @code{noplt} function attribute, x86-64
+@cindex functions whose calls do not go via PLT
+On x86-64 targets, the @code{noplt} attribute causes the compiler to
+call this external function indirectly using a GOT entry and avoid the
+PLT.
+
 @item target (@var{options})
 @cindex @code{target} function attribute
 As discussed in @ref{Common Function Attributes}, this attribute
Index: testsuite/gcc.target/i386/noplt-1.c
===================================================================
--- testsuite/gcc.target/i386/noplt-1.c (revision 0)
+++ testsuite/gcc.target/i386/noplt-1.c (working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target x86_64-*-linux* } } */
+
+
+__attribute__ ((noplt))
+void foo();
+
+int main()
+{
+  foo();
+  return 0;
+}
+
+/* { dg-final { scan-assembler "call\[ \t\]\\*.*foo.*@GOTPCREL\\(%rip\\)" } } */
Index: testsuite/gcc.target/i386/noplt-2.c
===================================================================
--- testsuite/gcc.target/i386/noplt-2.c (revision 0)
+++ testsuite/gcc.target/i386/noplt-2.c (working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target x86_64-*-linux* } } */
+/* { dg-options "-O2" } */
+
+
+__attribute__ ((noplt))
+int foo();
+
+int main()
+{
+  return foo();
+}
+
+/* { dg-final { scan-assembler "jmp\[ \t\]\\*.*foo.*@GOTPCREL\\(%rip\\)" } } */
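
For reviewers who want to check the cases without building GCC, here is a rough standalone model of the decision made by avoid_plt_to_call and of the template selection in ix86_output_call_insn. It is only a sketch: the names target_model, callee_model, model_avoid_plt and model_direct_call_template are hypothetical and do not exist in GCC, and the boolean fields merely stand in for the macros and tree queries used in the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the target macros the patch tests.  */
struct target_model
{
  bool is_64bit;   /* models TARGET_64BIT */
  bool is_macho;   /* models TARGET_MACHO */
  bool is_seh;     /* models TARGET_SEH */
  bool is_pecoff;  /* models TARGET_PECOFF */
};

/* Hypothetical stand-ins for what the patch reads from the SYMBOL_REF.  */
struct callee_model
{
  bool binds_locally;  /* models SYMBOL_REF_LOCAL_P */
  bool is_function;    /* models a non-null FUNCTION_DECL */
  bool has_noplt;      /* models lookup_attribute ("noplt", ...) */
};

/* Mirrors the order of checks in avoid_plt_to_call: target first,
   then local binding, then the attribute on a function decl.  */
bool
model_avoid_plt (const struct target_model *t, const struct callee_model *c)
{
  if (!t->is_64bit || t->is_macho || t->is_seh || t->is_pecoff)
    return false;

  if (c->binds_locally)
    return false;

  return c->is_function && c->has_noplt;
}

/* Mirrors the direct_p branches of ix86_output_call_insn: a GOT-indirect
   template when the PLT is avoided, the usual direct template otherwise.
   SIBCALL selects jmp instead of call.  */
const char *
model_direct_call_template (const struct target_model *t,
                            const struct callee_model *c, bool sibcall)
{
  if (model_avoid_plt (t, c))
    return sibcall ? "%!jmp\t*%p0@GOTPCREL(%%rip)"
                   : "%!call\t*%p0@GOTPCREL(%%rip)";

  return sibcall ? "%!jmp\t%P0" : "%!call\t%P0";
}
```

With a 64-bit ELF target and an external function carrying the attribute, the model picks the @GOTPCREL template, which is what noplt-1.c and noplt-2.c scan for. A locally bound function or a 32-bit target falls back to the plain direct template.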