From patchwork Mon Nov 19 23:25:44 2012
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 200216
Date: Mon, 19 Nov 2012 18:25:44 -0500
From: Michael Meissner
To: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
Subject: [Patch] Slightly improve powerpc floating point handling
Message-ID: <20121119232544.GA24478@ibm-tiger.the-meissners.org>

While working on support for a future processor, I noticed that when I did the
initial power7 work in 2009, I ordered the DF moves so that the VSX moves came
before the traditional floating point moves.
If reload needs to reload a floating point register, it matches the VSX
instructions first and generates LXSDX or STXSDX instead of the traditional
LFD/LFDX and STFD/STFDX instructions. Because LXSDX/STXSDX only support
REG+REG addressing, reload has to load the stack offset into a GPR and use
that as the index.

Note that for normal loads/stores, the register allocator will see if there
are other options, and eventually match against the traditional floating point
load and store. Reload, however, seems to stop as soon as it finds an
appropriate instruction.

The following patch reorders the movdf patterns so that the traditional
floating point registers are considered first, then the VSX registers, and
finally the general purpose registers.

I have bootstrapped the compiler with these changes, and had no regressions in
the testsuite. I also ran the SPEC 2006 benchmark suite with and without these
patches (using subversion id 193503 as the base). There were no slowdowns
outside the range that I consider to be noise (2%). The 447.dealII benchmark
sped up by 14% (456.hmmer and 471.omnetpp sped up by 2%).

Are these patches ok to apply?

2012-11-19  Michael Meissner

	* config/rs6000/rs6000.md (movdf_hardfloat32): Reorder move
	constraints so that the traditional floating point loads, stores,
	and moves are done first, then the VSX loads, stores, and moves,
	and finally the GPR loads, stores, and moves so that reload
	chooses FPRs over GPRs, and uses the traditional load/store
	instructions which provide an offset.
	(movdf_hardfloat64): Likewise.

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 193635)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -8019,46 +8019,30 @@ (define_split
 ;; less efficient than loading the constant into an FP register, since
 ;; it will probably be used there.
 (define_insn "*movdf_hardfloat32"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=Y,r,!r,ws,?wa,ws,?wa,Z,?Z,m,d,d,wa,!r,!r,!r")
-	(match_operand:DF 1 "input_operand" "r,Y,r,ws,wa,Z,Z,ws,wa,d,m,d,j,G,H,F"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=m,d,d,ws,?wa,Z,?Z,ws,?wa,wa,Y,r,!r,!r,!r,!r")
+	(match_operand:DF 1 "input_operand" "d,m,d,Z,Z,ws,wa,ws,wa,j,r,Y,r,G,H,F"))]
   "! TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
    && (gpc_reg_operand (operands[0], DFmode)
        || gpc_reg_operand (operands[1], DFmode))"
-  "*
-{
-  switch (which_alternative)
-    {
-    default:
-      gcc_unreachable ();
-    case 0:
-    case 1:
-    case 2:
-      return \"#\";
-    case 3:
-    case 4:
-      return \"xxlor %x0,%x1,%x1\";
-    case 5:
-    case 6:
-      return \"lxsd%U1x %x0,%y1\";
-    case 7:
-    case 8:
-      return \"stxsd%U0x %x1,%y0\";
-    case 9:
-      return \"stfd%U0%X0 %1,%0\";
-    case 10:
-      return \"lfd%U1%X1 %0,%1\";
-    case 11:
-      return \"fmr %0,%1\";
-    case 12:
-      return \"xxlxor %x0,%x0,%x0\";
-    case 13:
-    case 14:
-    case 15:
-      return \"#\";
-    }
-}"
-  [(set_attr "type" "store,load,two,fp,fp,fpload,fpload,fpstore,fpstore,fpstore,fpload,fp,vecsimple,*,*,*")
-   (set_attr "length" "8,8,8,4,4,4,4,4,4,4,4,4,4,8,12,16")])
+  "@
+   stfd%U0%X0 %1,%0
+   lfd%U1%X1 %0,%1
+   fmr %0,%1
+   lxsd%U1x %x0,%y1
+   lxsd%U1x %x0,%y1
+   stxsd%U0x %x1,%y0
+   stxsd%U0x %x1,%y0
+   xxlor %x0,%x1,%x1
+   xxlor %x0,%x1,%x1
+   xxlxor %x0,%x0,%x0
+   #
+   #
+   #
+   #
+   #
+   #"
+  [(set_attr "type" "fpstore,fpload,fp,fpload,fpload,fpstore,fpstore,vecsimple,vecsimple,vecsimple,store,load,two,fp,fp,*")
+   (set_attr "length" "4,4,4,4,4,4,4,4,4,4,8,8,8,8,12,16")])
 
 (define_insn "*movdf_softfloat32"
   [(set (match_operand:DF 0 "nonimmediate_operand" "=Y,r,r,r,r,r")
@@ -8131,25 +8115,25 @@ (define_insn "*movdf_hardfloat64_mfpgpr"
 ; ld/std require word-aligned displacements -> 'Y' constraint.
 ; List Y->r and r->Y before r->r for reload.
 (define_insn "*movdf_hardfloat64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=Y,r,!r,ws,?wa,ws,?wa,Z,?Z,m,d,d,wa,*c*l,!r,*h,!r,!r,!r")
-	(match_operand:DF 1 "input_operand" "r,Y,r,ws,wa,Z,Z,ws,wa,d,m,d,j,r,h,0,G,H,F"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=m,d,d,Y,r,!r,ws,?wa,Z,?Z,ws,?wa,wa,*c*l,!r,*h,!r,!r,!r")
+	(match_operand:DF 1 "input_operand" "d,m,d,r,Y,r,Z,Z,ws,wa,ws,wa,j,r,h,0,G,H,F"))]
   "TARGET_POWERPC64 && !TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS
    && TARGET_DOUBLE_FLOAT
    && (gpc_reg_operand (operands[0], DFmode)
        || gpc_reg_operand (operands[1], DFmode))"
   "@
+   stfd%U0%X0 %1,%0
+   lfd%U1%X1 %0,%1
+   fmr %0,%1
    std%U0%X0 %1,%0
    ld%U1%X1 %0,%1
    mr %0,%1
-   xxlor %x0,%x1,%x1
-   xxlor %x0,%x1,%x1
    lxsd%U1x %x0,%y1
    lxsd%U1x %x0,%y1
    stxsd%U0x %x1,%y0
    stxsd%U0x %x1,%y0
-   stfd%U0%X0 %1,%0
-   lfd%U1%X1 %0,%1
-   fmr %0,%1
+   xxlor %x0,%x1,%x1
+   xxlor %x0,%x1,%x1
    xxlxor %x0,%x0,%x0
    mt%0 %1
    mf%1 %0
@@ -8157,7 +8141,7 @@ (define_insn "*movdf_hardfloat64"
    #
    #
    #"
-  [(set_attr "type" "store,load,*,fp,fp,fpload,fpload,fpstore,fpstore,fpstore,fpload,fp,vecsimple,mtjmpr,mfjmpr,*,*,*,*")
+  [(set_attr "type" "fpstore,fpload,fp,store,load,*,fpload,fpload,fpstore,fpstore,vecsimple,vecsimple,vecsimple,mtjmpr,mfjmpr,*,*,*,*")
    (set_attr "length" "4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,8,12,16")])
 
 (define_insn "*movdf_softfloat64"
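
As an aside on the reload behaviour described in the mail, the effect of
alternative ordering can be sketched with a toy model. This is only an
illustration and not how GCC's reload pass is implemented: the Alternative
type, the "fpr"/"vsx"/"gpr" labels, the "gpr load" placeholder and the
first_match helper are all hypothetical names, and the alternative lists are
simplified to the load case of movdf_hardfloat32. Python is used only because
the machine description is not runnable on its own.

```python
# Toy model only: this is NOT GCC's reload pass, just "first match wins".
from typing import NamedTuple


class Alternative(NamedTuple):
    insn: str              # mnemonic of a DF load (illustrative)
    reg_class: str         # hypothetical label: "fpr", "vsx" or "gpr"
    reg_plus_offset: bool  # True if the insn accepts a REG+OFFSET address


# Simplified load alternatives, in the order used before and after the patch
# (GPR first, then VSX, then FPR versus FPR first, then VSX, then GPR).
OLD_ORDER = [
    Alternative("gpr load", "gpr", True),
    Alternative("lxsdx", "vsx", False),
    Alternative("lfd", "fpr", True),
]
NEW_ORDER = [
    Alternative("lfd", "fpr", True),
    Alternative("lxsdx", "vsx", False),
    Alternative("gpr load", "gpr", True),
]

# A value living in a traditional FP register fits both the FPR and the VSX
# alternatives, because the FPRs overlap the first 32 VSX registers.
FP_VALUE_CLASSES = {"fpr", "vsx"}


def first_match(alternatives, allowed_classes):
    """Return the first alternative whose register class is allowed."""
    for alt in alternatives:
        if alt.reg_class in allowed_classes:
            return alt
    raise LookupError("no alternative matches")


def describe(alternatives):
    """Summarize which load is picked and whether it needs an index GPR."""
    alt = first_match(alternatives, FP_VALUE_CLASSES)
    extra = "no" if alt.reg_plus_offset else "yes"
    return f"{alt.insn} (offset must go in a GPR first: {extra})"


if __name__ == "__main__":
    print("old order:", describe(OLD_ORDER))
    print("new order:", describe(NEW_ORDER))
```

With the old ordering the model stops at the REG+REG VSX load, while with the
new ordering it stops at the REG+OFFSET load, which is the behaviour the patch
is aiming for.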