From patchwork Fri May 29 06:03:59 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sriraman Tallam
X-Patchwork-Id: 477578
References: <555E5376.3060706@redhat.com> <555EF018.2050309@redhat.com>
Date: Thu, 28 May 2015 23:03:59 -0700
Subject: Re: [RFC][PATCH][X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=
From: Sriraman Tallam
To: "H.J. Lu"
Cc: Pedro Alves, Michael Matz, David Li, GCC Patches, Jan Hubicka

On Thu, May 28, 2015 at 5:05 PM, H.J. Lu wrote:
> On Thu, May 28, 2015 at 4:54 PM, Sriraman Tallam wrote:
>> On Thu, May 28, 2015 at 2:52 PM, H.J. Lu wrote:
>>> On Thu, May 28, 2015 at 2:27 PM, Sriraman Tallam wrote:
>>>> On Thu, May 28, 2015 at 2:01 PM, H.J. Lu wrote:
>>>>> On Thu, May 28, 2015 at 1:54 PM, Sriraman Tallam wrote:
>>>>>> On Thu, May 28, 2015 at 12:05 PM, H.J. Lu wrote:
>>>>>>> On Thu, May 28, 2015 at 11:50 AM, Sriraman Tallam wrote:
>>>>>>>> On Thu, May 28, 2015 at 11:42 AM, H.J. Lu wrote:
>>>>>>>>> On Thu, May 28, 2015 at 11:34 AM, Sriraman Tallam wrote:
>>>>>>>>>> I have attached a patch that adds the new attribute "noplt". Please review.
>>>>>>>>>>
>>>>>>>>>> * config/i386/i386.c (avoid_plt_to_call): New function.
>>>>>>>>>> (ix86_output_call_insn): Generate indirect call for functions
>>>>>>>>>> marked with "noplt" attribute.
>>>>>>>>>> (attribute_spec ix86_attribute_): Define new attribute "noplt".
>>>>>>>>>> * doc/extend.texi: Document new attribute "noplt".
>>>>>>>>>> * gcc.target/i386/noplt-1.c: New testcase.
>>>>>>>>>> * gcc.target/i386/noplt-2.c: New testcase.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2 comments:
>>>>>>>>>
>>>>>>>>> 1. Don't remove "%!" prefix before call/jmp. It is needed for MPX.
>>>>>>>>> 2. Don't you need to check
>>>>>>>>>
>>>>>>>>> && !TARGET_MACHO
>>>>>>>>> && !TARGET_SEH
>>>>>>>>> && !TARGET_PECOFF
>>>>>>>>>
>>>>>>>>> since it only works for ELF.
>>>>>>>>
>>>>>>>> Ok, I will make this change. OTOH, is it just better to piggy-back on
>>>>>>>> existing -fno-plt change by Alex in calls.c
>>>>>>>> and do this:
>>>>>>>>
>>>>>>>> Index: calls.c
>>>>>>>> ===================================================================
>>>>>>>> --- calls.c (revision 223720)
>>>>>>>> +++ calls.c (working copy)
>>>>>>>> @@ -226,9 +226,11 @@ prepare_call_address (tree fndecl_or_type, rtx fun
>>>>>>>>      && targetm.small_register_classes_for_mode_p (FUNCTION_MODE))
>>>>>>>>     ? force_not_mem (memory_address (FUNCTION_MODE, funexp))
>>>>>>>>     : memory_address (FUNCTION_MODE, funexp));
>>>>>>>> - else if (flag_pic && !flag_plt && fndecl_or_type
>>>>>>>> + else if (fndecl_or_type
>>>>>>>>     && TREE_CODE (fndecl_or_type) == FUNCTION_DECL
>>>>>>>> -   && !targetm.binds_local_p (fndecl_or_type))
>>>>>>>> +   && !targetm.binds_local_p (fndecl_or_type)
>>>>>>>> +   && ((flag_pic && !flag_plt)
>>>>>>>> +       || (lookup_attribute ("noplt", DECL_ATTRIBUTES(fndecl_or_type)))))
>>>>>>>>   {
>>>>>>>>     funexp = force_reg (Pmode, funexp);
>>>>>>>>   }
>>>>>>>>
>>>>>>>
>>>>>>> Does it work on non-PIC calls?
>>>>>>
>>>>>> You are right, it doesn't work. I have attached the patch with the
>>>>>> changes you mentioned.
>>>>>>
>>>>>
>>>>> Since direct_p is true, do we need
>>>>>
>>>>> + if (GET_CODE (call_op) != SYMBOL_REF
>>>>> +     || SYMBOL_REF_LOCAL_P (call_op))
>>>>> +   return false;
>>>>
>>>> We do need it right because for this case below, I do not want an
>>>> indirect call:
>>>>
>>>> __attribute__((noplt))
>>>> int foo() {
>>>>   return 0;
>>>> }
>>>>
>>>> int main()
>>>> {
>>>>   return foo();
>>>> }
>>>>
>>>> Assuming foo is not inlined, if I remove the lines you mentioned, I
>>>> will get an indirect call which is unnecessary.
>>>>
>>>
>>> I meant the "GET_CODE (call_op) != SYMBOL_REF" part isn't
>>> needed.
>>
>> I should have realized that :), sorry. Patch fixed.
>>
>
> --- testsuite/gcc.target/i386/noplt-1.c (revision 0)
> +++ testsuite/gcc.target/i386/noplt-1.c (working copy)
> @@ -0,0 +1,13 @@
> +/* { dg-do compile { target x86_64-*-* } } */
> ...
> +/* { dg-final { scan-assembler "call\[ \t\]\\*.*foo.*@GOTPCREL\\(%rip\\)" } } */
>
> The test will fail on Windows and Darwin.

Changed to use x86_64-*-linux* target.

>
>
> --
> H.J.

	* config/i386/i386.c (avoid_plt_to_call): New function.
	(ix86_output_call_insn): Generate indirect call for functions
	marked with "noplt" attribute.
	(attribute_spec ix86_attribute_): Define new attribute "noplt".
	* doc/extend.texi: Document new attribute "noplt".
	* gcc.target/i386/noplt-1.c: New testcase.
	* gcc.target/i386/noplt-2.c: New testcase.

Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 223720)
+++ config/i386/i386.c	(working copy)
@@ -25599,6 +25599,24 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx call
   return call;
 }
 
+/* Return true if the function being called was marked with attribute
+   "noplt".  If this function is defined, this should return false.  */
+static bool
+avoid_plt_to_call (rtx call_op)
+{
+  if (SYMBOL_REF_LOCAL_P (call_op))
+    return false;
+
+  tree symbol_decl = SYMBOL_REF_DECL (call_op);
+
+  if (symbol_decl != NULL_TREE
+      && TREE_CODE (symbol_decl) == FUNCTION_DECL
+      && lookup_attribute ("noplt", DECL_ATTRIBUTES (symbol_decl)))
+    return true;
+
+  return false;
+}
+
 /* Output the assembly for a call instruction.  */
 
 const char *
@@ -25611,7 +25629,13 @@ ix86_output_call_insn (rtx_insn *insn, rtx call_op
   if (SIBLING_CALL_P (insn))
     {
       if (direct_p)
-        xasm = "%!jmp\t%P0";
+        {
+          if (!TARGET_MACHO && !TARGET_SEH && !TARGET_PECOFF
+              && TARGET_64BIT && avoid_plt_to_call (call_op))
+            xasm = "%!jmp\t*%p0@GOTPCREL(%%rip)";
+          else
+            xasm = "%!jmp\t%P0";
+        }
       /* SEH epilogue detection requires the indirect branch case
          to include REX.W.  */
       else if (TARGET_SEH)
@@ -25654,7 +25678,13 @@ ix86_output_call_insn (rtx_insn *insn, rtx call_op
     }
 
   if (direct_p)
-    xasm = "%!call\t%P0";
+    {
+      if (!TARGET_MACHO && !TARGET_SEH && !TARGET_PECOFF
+          && TARGET_64BIT && avoid_plt_to_call (call_op))
+        xasm = "%!call\t*%p0@GOTPCREL(%%rip)";
+      else
+        xasm = "%!call\t%P0";
+    }
   else
     xasm = "%!call\t%A0";
 
@@ -46628,6 +46658,9 @@ static const struct attribute_spec ix86_attribute_
     false },
   { "callee_pop_aggregate_return", 1, 1, false, true, true,
     ix86_handle_callee_pop_aggregate_return, true },
+  /* Attribute to avoid calling function via PLT.  */
+  { "noplt", 0, 0, true, false, false, ix86_handle_fndecl_attribute,
+    false },
   /* End element.  */
   { NULL, 0, 0, false, false, false, NULL, false }
 };
Index: doc/extend.texi
===================================================================
--- doc/extend.texi	(revision 223720)
+++ doc/extend.texi	(working copy)
@@ -4858,6 +4858,13 @@ On x86-32 targets, the @code{stdcall} attribute ca
 assume that the called function pops off the stack space used to
 pass arguments, unless it takes a variable number of arguments.
 
+@item noplt
+@cindex @code{noplt} function attribute, x86-64
+@cindex functions whose calls do not go via PLT
+On x86-64 targets, the @code{noplt} attribute causes the compiler to
+call this external function indirectly using a GOT entry and avoid the
+PLT.
+
 @item target (@var{options})
 @cindex @code{target} function attribute
 As discussed in @ref{Common Function Attributes}, this attribute
Index: testsuite/gcc.target/i386/noplt-1.c
===================================================================
--- testsuite/gcc.target/i386/noplt-1.c	(revision 0)
+++ testsuite/gcc.target/i386/noplt-1.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target x86_64-*-linux* } } */
+
+
+__attribute__ ((noplt))
+void foo();
+
+int main()
+{
+  foo();
+  return 0;
+}
+
+/* { dg-final { scan-assembler "call\[ \t\]\\*.*foo.*@GOTPCREL\\(%rip\\)" } } */
Index: testsuite/gcc.target/i386/noplt-2.c
===================================================================
--- testsuite/gcc.target/i386/noplt-2.c	(revision 0)
+++ testsuite/gcc.target/i386/noplt-2.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile { target x86_64-*-linux* } } */
+/* { dg-options "-O2" } */
+
+
+__attribute__ ((noplt))
+int foo();
+
+int main()
+{
+  return foo();
+}
+
+/* { dg-final { scan-assembler "jmp\[ \t\]\\*.*foo.*@GOTPCREL\\(%rip\\)" } } */
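
For anyone who wants to try the attribute on something that links, here is a minimal sketch against a real external function. It is not part of the patch. The helper name parse_sum and the NOPLT macro are made up for illustration, and redeclaring atoi with the attribute is only an assumption about how a user might apply it. The __has_attribute guard lets the file still build with a compiler that does not know "noplt".

```c
/* Illustration only: NOPLT and parse_sum are hypothetical names,
   not part of the patch.  */
#include <assert.h>
#include <stdlib.h>

/* Apply the attribute only when the compiler recognizes it, so the
   file still builds elsewhere (the attribute is then simply absent).  */
#if defined(__has_attribute)
# if __has_attribute(noplt)
#  define NOPLT __attribute__ ((noplt))
# endif
#endif
#ifndef NOPLT
# define NOPLT
#endif

/* Redeclare an external libc function with the attribute.  atoi is
   defined in the C library, not in this file, so it is not a local
   symbol and a call to it is a candidate for the GOT-based call.  */
NOPLT extern int atoi (const char *);

/* Sum two decimal strings.  Each call to atoi is the kind of direct
   call the attribute is meant to turn into an indirect one.  */
int
parse_sum (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);
  return atoi (a) + atoi (b);
}
```

With a compiler carrying this patch, on x86_64 Linux, compiling the file with -S should show the atoi calls in the same form the new tests scan for, i.e. an indirect call through atoi@GOTPCREL(%rip) rather than a call that resolves via the PLT. I have not checked how this interacts with C library headers that provide an inline atoi at higher optimization levels.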