From patchwork Mon May 20 21:34:09 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Meissner X-Patchwork-Id: 245121 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "localhost", Issuer "www.qmailtoaster.com" (not verified)) by ozlabs.org (Postfix) with ESMTPS id C70692C00D6 for ; Tue, 21 May 2013 07:34:31 +1000 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:references:mime-version:content-type :in-reply-to; q=dns; s=default; b=WEakXeQefRTqyZMq4n2VRMLiU/OXiQ d+ZXE6M+xkUUVIauB/xcNkC/nHMEbriVKU1LCpf3ZNNCympK2Y03vfpAzgV9N3eT bd+Rxgb4LAvrUsTa9nDZDX1EwoQht8WDRV4/WR7yIK/wCr9MEdzmsiQpe5YFMBK5 3p+JjZG5sd+/w= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:references:mime-version:content-type :in-reply-to; s=default; bh=9vfH+5qbw/1QYyR06T0X9Xwc6hI=; b=Jor6 jWuFG4Xno6EkqEJ+4peER1ctgBUFuvp+tcPYHLlsDBRUHo7QcCiHO1UwiJOG3kPq yWC3rubPeK3XlW/wD2Rvo9MWRIU0vudPGg3Sww2TuythyYM9hZtHpz1CtiE0pJHU YmlfSSIGa4ae/wAaaDdXXeMYDEfwqSyKB85K8rg= Received: (qmail 28314 invoked by alias); 20 May 2013 21:34:24 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 28287 invoked by uid 89); 20 May 2013 21:34:18 -0000 X-Spam-SWARE-Status: No, score=-3.1 required=5.0 tests=AWL, BAYES_50, KHOP_RCVD_UNTRUST, RCVD_IN_HOSTKARMA_W, RCVD_IN_HOSTKARMA_WL, TW_GF, TW_MF, TW_MZ, TW_VS autolearn=no version=3.3.1 Received: from e9.ny.us.ibm.com (HELO e9.ny.us.ibm.com) (32.97.182.139) by sourceware.org (qpsmtpd/0.84/v0.84-167-ge50287c) with ESMTP; Mon, 20 May 2013 21:34:15 +0000 Received: from /spool/local by e9.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 20 May 2013 17:34:13 -0400 Received: from d01dlp01.pok.ibm.com (9.56.250.166) by e9.ny.us.ibm.com (192.168.1.109) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Mon, 20 May 2013 17:34:11 -0400 Received: from d01relay03.pok.ibm.com (d01relay03.pok.ibm.com [9.56.227.235]) by d01dlp01.pok.ibm.com (Postfix) with ESMTP id 276F938C804F for ; Mon, 20 May 2013 17:34:10 -0400 (EDT) Received: from d01av05.pok.ibm.com (d01av05.pok.ibm.com [9.56.224.195]) by d01relay03.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id r4KLYAFZ305040 for ; Mon, 20 May 2013 17:34:10 -0400 Received: from d01av05.pok.ibm.com (loopback [127.0.0.1]) by d01av05.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id r4KLYA1P028770 for ; Mon, 20 May 2013 17:34:10 -0400 Received: from ibm-tiger.the-meissners.org (dhcp-9-32-77-206.usma.ibm.com [9.32.77.206]) by d01av05.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id r4KLY9X8028759; Mon, 20 May 2013 17:34:09 -0400 Received: by ibm-tiger.the-meissners.org (Postfix, from userid 500) id 38B53421F3; Mon, 20 May 2013 17:34:09 -0400 (EDT) Date: Mon, 20 May 2013 17:34:09 -0400 From: Michael Meissner To: Michael Meissner , gcc-patches@gcc.gnu.org, dje.gcc@gmail.com, pthaugen@us.ibm.com, bergner@vnet.ibm.com Subject: Re: [PATCH, rs6000] power8 patch #1, infrastructure changes (revised patch) Message-ID: <20130520213408.GA30353@ibm-tiger.the-meissners.org> Mail-Followup-To: Michael Meissner , gcc-patches@gcc.gnu.org, dje.gcc@gmail.com, pthaugen@us.ibm.com, bergner@vnet.ibm.com References: <20130520204053.GA21090@ibm-tiger.the-meissners.org> <20130520204923.GA25144@ibm-tiger.the-meissners.org> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20130520204923.GA25144@ibm-tiger.the-meissners.org> User-Agent: Mutt/1.5.20 (2009-12-10) X-TM-AS-MML: No X-Content-Scanned: Fidelis XPS MAILER x-cbid: 13052021-7182-0000-0000-000006CC0CCA X-Virus-Found: No After submitting the patch, I realized I had submitted a previous version of the patch, that had the wq constraint that was initially for the quad memory operations, and also had the changes for ChangeLog.ibm, that I keep on the branch. However, the wq constraint was always equal to the r constraint, do I have removed it, and used the 'r' constraint once again. I have also done bootstraps and make check with the patches submitted, with no regressions found. Can I check in the revised patch? Index: gcc/doc/invoke.texi =================================================================== --- gcc/doc/invoke.texi (revision 199121) +++ gcc/doc/invoke.texi (revision 199122) @@ -860,7 +860,10 @@ See RS/6000 and PowerPC Options. -mno-recip-precision @gol -mveclibabi=@var{type} -mfriz -mno-friz @gol -mpointers-to-nested-functions -mno-pointers-to-nested-functions @gol --msave-toc-indirect -mno-save-toc-indirect} +-msave-toc-indirect -mno-save-toc-indirect @gol +-mpower8-fusion -mno-mpower8-fusion -mpower8-vector -mno-power8-vector @gol +-mcrypto -mno-crypto -mdirect-move -mno-direct-move @gol +-mquad-memory -mno-quad-memory} @emph{RX Options} @gccoptlist{-m64bit-doubles -m32bit-doubles -fpu -nofpu@gol @@ -17341,7 +17344,8 @@ following options: @gccoptlist{-maltivec -mfprnd -mhard-float -mmfcrf -mmultiple @gol -mpopcntb -mpopcntd -mpowerpc64 @gol -mpowerpc-gpopt -mpowerpc-gfxopt -msingle-float -mdouble-float @gol --msimple-fpu -mstring -mmulhw -mdlmzb -mmfpgpr -mvsx} +-msimple-fpu -mstring -mmulhw -mdlmzb -mmfpgpr -mvsx @gol +-mcrypto -mdirect-move -mpower8-fusion -mpower8-vector -mquad-memory} The particular options set for any particular CPU varies between compiler versions, depending on what setting seems to produce optimal @@ -17459,6 +17463,47 @@ Generate code that uses (does not use) v instructions, and also enable the use of built-in functions that allow more direct access to the VSX instruction set. +@item -mcrypto +@itemx -mno-crypto +@opindex mcrypto +@opindex mno-crypto +Enable the use (disable) of the built-in functions that allow direct +access to the cryptographic instructions that were added in version +2.07 of the PowerPC ISA. + +@item -mdirect-move +@itemx -mno-direct-move +@opindex mdirect-move +@opindex mno-direct-move +Generate code that uses (does not use) the instructions to move data +between the general purpose registers and the vector/scalar (VSX) +registers that were added in version 2.07 of the PowerPC ISA. + +@item -mpower8-fusion +@itemx -mno-power8-fusion +@opindex mpower8-fusion +@opindex mno-power8-fusion +Generate code that keeps (does not keeps) some integer operations +adjacent so that the instructions can be fused together on power8 and +later processors. + +@item -mpower8-vector +@itemx -mno-power8-vector +@opindex mpower8-vector +@opindex mno-power8-vector +Generate code that uses (does not use) the vector and scalar +instructions that were added in version 2.07 of the PowerPC ISA. Also +enable the use of built-in functions that allow more direct access to +the vector instructions. + +@item -mquad-memory +@itemx -mno-quad-memory +@opindex mquad-memory +@opindex mno-quad-memory +Generate code that uses (does not use) the quad word memory +instructions. The @option{-mquad-memory} option requires use of +64-bit mode. + @item -mfloat-gprs=@var{yes/single/double/no} @itemx -mfloat-gprs @opindex mfloat-gprs Index: gcc/doc/md.texi =================================================================== --- gcc/doc/md.texi (revision 199121) +++ gcc/doc/md.texi (revision 199122) @@ -2055,7 +2055,7 @@ Any constant whose absolute value is no @end table -@item PowerPC and IBM RS6000---@file{config/rs6000/rs6000.h} +@item PowerPC and IBM RS6000---@file{config/rs6000/constraints.md} @table @code @item b Address base register @@ -2069,6 +2069,9 @@ Floating point register (containing 32-b @item v Altivec vector register +@item wa +Any VSX register + @item wd VSX vector register to hold vector double data @@ -2081,6 +2084,15 @@ If @option{-mmfpgpr} was used, a floatin @item wl If the LFIWAX instruction is enabled, a floating point register +@item wm +If direct moves are enabled, a VSX register. + +@item wn +No register. + +@item wr +General purpose register if 64-bit mode is used + @item ws VSX vector register to hold scalar float data @@ -2093,8 +2105,9 @@ If the STFIWX instruction is enabled, a @item wz If the LFIWZX instruction is enabled, a floating point register -@item wa -Any VSX register +@item wQ +A memory address that will work with the @code{lq} and @code{stq} +instructions. @item h @samp{MQ}, @samp{CTR}, or @samp{LINK} register Index: gcc/config/rs6000/rs6000.opt =================================================================== --- gcc/config/rs6000/rs6000.opt (revision 199121) +++ gcc/config/rs6000/rs6000.opt (revision 199122) @@ -517,4 +517,28 @@ Control whether we save the TOC in the p mvsx-timode Target Undocumented Mask(VSX_TIMODE) Var(rs6000_isa_flags) -; Allow/disallow TImode in VSX registers +Allow 128-bit integers in VSX registers + +mpower8-fusion +Target Report Mask(P8_FUSION) Var(rs6000_isa_flags) +Fuse certain integer operations together for better performance on power8 + +mpower8-fusion-sign +Target Undocumented Mask(P8_FUSION_SIGN) Var(rs6000_isa_flags) +Allow sign extension in fusion operations + +mpower8-vector +Target Report Mask(P8_VECTOR) Var(rs6000_isa_flags) +Use/do not use vector and scalar instructions added in ISA 2.07. + +mcrypto +Target Report Mask(CRYPTO) Var(rs6000_isa_flags) +Use ISA 2.07 crypto instructions + +mdirect-move +Target Report Mask(DIRECT_MOVE) Var(rs6000_isa_flags) +Use ISA 2.07 direct move between GPR & VSX register instructions + +mquad-memory +Target Report Mask(QUAD_MEMORY) Var(rs6000_isa_flags) +Generate the quad word memory instructions (lq/stq/lqarx/stqcx). Index: gcc/config/rs6000/rs6000-c.c =================================================================== --- gcc/config/rs6000/rs6000-c.c (revision 199121) +++ gcc/config/rs6000/rs6000-c.c (revision 199122) @@ -315,6 +315,8 @@ rs6000_target_modify_macros (bool define rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6X"); if ((flags & OPTION_MASK_POPCNTD) != 0) rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7"); + if ((flags & OPTION_MASK_DIRECT_MOVE) != 0) + rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8"); if ((flags & OPTION_MASK_SOFT_FLOAT) != 0) rs6000_define_or_undefine_macro (define_p, "_SOFT_FLOAT"); if ((flags & OPTION_MASK_RECIP_PRECISION) != 0) @@ -331,6 +333,8 @@ rs6000_target_modify_macros (bool define } if ((flags & OPTION_MASK_VSX) != 0) rs6000_define_or_undefine_macro (define_p, "__VSX__"); + if ((flags & OPTION_MASK_P8_VECTOR) != 0) + rs6000_define_or_undefine_macro (define_p, "__POWER8_VECTOR__"); /* options from the builtin masks. */ if ((bu_mask & RS6000_BTM_SPE) != 0) Index: gcc/config/rs6000/constraints.md =================================================================== --- gcc/config/rs6000/constraints.md (revision 199121) +++ gcc/config/rs6000/constraints.md (revision 199122) @@ -79,12 +79,31 @@ (define_register_constraint "wg" "rs6000 (define_register_constraint "wl" "rs6000_constraints[RS6000_CONSTRAINT_wl]" "Floating point register if the LFIWAX instruction is enabled or NO_REGS.") +(define_register_constraint "wm" "rs6000_constraints[RS6000_CONSTRAINT_wm]" + "VSX register if direct move instructions are enabled, or NO_REGS.") + +(define_register_constraint "wr" "rs6000_constraints[RS6000_CONSTRAINT_wr]" + "General purpose register if 64-bit instructions are enabled or NO_REGS.") + +(define_register_constraint "wv" "rs6000_constraints[RS6000_CONSTRAINT_wv]" + "Altivec register if -mpower8-vector is used or NO_REGS.") + (define_register_constraint "wx" "rs6000_constraints[RS6000_CONSTRAINT_wx]" "Floating point register if the STFIWX instruction is enabled or NO_REGS.") (define_register_constraint "wz" "rs6000_constraints[RS6000_CONSTRAINT_wz]" "Floating point register if the LFIWZX instruction is enabled or NO_REGS.") +;; NO_REGs register constraint, used to merge mov{sd,sf}, since movsd can use +;; direct move directly, and movsf can't to move between the register sets. +;; There is a mode_attr that resolves to wm for SDmode and wn for SFmode +(define_register_constraint "wn" "NO_REGS") + +;; Lq/stq validates the address for load/store quad +(define_memory_constraint "wQ" + "Memory operand suitable for the load/store quad instructions" + (match_operand 0 "quad_memory_operand")) + ;; Altivec style load/store that ignores the bottom bits of the address (define_memory_constraint "wZ" "Indexed or indirect memory operand, ignoring the bottom 4 bits" Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 199121) +++ gcc/config/rs6000/rs6000.c (revision 199122) @@ -831,6 +831,25 @@ struct processor_costs power7_cost = { 12, /* prefetch streams */ }; +/* Instruction costs on POWER8 processors. */ +static const +struct processor_costs power8_cost = { + COSTS_N_INSNS (3), /* mulsi */ + COSTS_N_INSNS (3), /* mulsi_const */ + COSTS_N_INSNS (3), /* mulsi_const9 */ + COSTS_N_INSNS (3), /* muldi */ + COSTS_N_INSNS (19), /* divsi */ + COSTS_N_INSNS (35), /* divdi */ + COSTS_N_INSNS (3), /* fp */ + COSTS_N_INSNS (3), /* dmul */ + COSTS_N_INSNS (14), /* sdiv */ + COSTS_N_INSNS (17), /* ddiv */ + 128, /* cache line size */ + 32, /* l1 cache */ + 256, /* l2 cache */ + 12, /* prefetch streams */ +}; + /* Instruction costs on POWER A2 processors. */ static const struct processor_costs ppca2_cost = { @@ -1547,6 +1566,15 @@ rs6000_hard_regno_mode_ok (int regno, en { int last_regno = regno + rs6000_hard_regno_nregs[mode][regno] - 1; + /* PTImode can only go in GPRs. Quad word memory operations require even/odd + register combinations, and use PTImode where we need to deal with quad + word memory operations. Don't allow quad words in the argument or frame + pointer registers, just registers 0..31. */ + if (mode == PTImode) + return (IN_RANGE (regno, FIRST_GPR_REGNO, LAST_GPR_REGNO) + && IN_RANGE (last_regno, FIRST_GPR_REGNO, LAST_GPR_REGNO) + && ((regno & 1) == 0)); + /* VSX registers that overlap the FPR registers are larger than for non-VSX implementations. Don't allow an item to be split between a FP register and an Altivec register. */ @@ -1678,6 +1706,16 @@ rs6000_debug_reg_print (int first_regno, comma = ""; } + len += fprintf (stderr, "%sreg-class = %s", comma, + reg_class_names[(int)rs6000_regno_regclass[r]]); + comma = ", "; + + if (len > 70) + { + fprintf (stderr, ",\n\t"); + comma = ""; + } + fprintf (stderr, "%sregno = %d\n", comma, r); } } @@ -1710,6 +1748,7 @@ rs6000_debug_reg_global (void) "none", "altivec", "vsx", + "p8_vector", "paired", "spe", "other" @@ -1802,8 +1841,11 @@ rs6000_debug_reg_global (void) "wf reg_class = %s\n" "wg reg_class = %s\n" "wl reg_class = %s\n" + "wm reg_class = %s\n" + "wr reg_class = %s\n" "ws reg_class = %s\n" "wt reg_class = %s\n" + "wv reg_class = %s\n" "wx reg_class = %s\n" "wz reg_class = %s\n" "\n", @@ -1815,8 +1857,11 @@ rs6000_debug_reg_global (void) reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wf]], reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wg]], reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wl]], + reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wm]], + reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wr]], reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_ws]], reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wt]], + reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wv]], reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wx]], reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wz]]); @@ -2050,6 +2095,10 @@ rs6000_debug_reg_global (void) if (targetm.lra_p ()) fprintf (stderr, DEBUG_FMT_S, "lra", "true"); + if (TARGET_P8_FUSION) + fprintf (stderr, DEBUG_FMT_S, "p8 fusion", + (TARGET_P8_FUSION_SIGN) ? "zero+sign" : "zero"); + fprintf (stderr, DEBUG_FMT_S, "plt-format", TARGET_SECURE_PLT ? "secure" : "bss"); fprintf (stderr, DEBUG_FMT_S, "struct-return", @@ -2240,6 +2289,15 @@ rs6000_init_hard_regno_mode_ok (bool glo if (TARGET_LFIWAX) rs6000_constraints[RS6000_CONSTRAINT_wl] = FLOAT_REGS; + if (TARGET_DIRECT_MOVE) + rs6000_constraints[RS6000_CONSTRAINT_wm] = VSX_REGS; + + if (TARGET_POWERPC64) + rs6000_constraints[RS6000_CONSTRAINT_wr] = GENERAL_REGS; + + if (TARGET_P8_VECTOR) + rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS; + if (TARGET_STFIWX) rs6000_constraints[RS6000_CONSTRAINT_wx] = FLOAT_REGS; @@ -2520,16 +2578,18 @@ darwin_rs6000_override_options (void) HOST_WIDE_INT rs6000_builtin_mask_calculate (void) { - return (((TARGET_ALTIVEC) ? RS6000_BTM_ALTIVEC : 0) - | ((TARGET_VSX) ? RS6000_BTM_VSX : 0) - | ((TARGET_SPE) ? RS6000_BTM_SPE : 0) - | ((TARGET_PAIRED_FLOAT) ? RS6000_BTM_PAIRED : 0) - | ((TARGET_FRE) ? RS6000_BTM_FRE : 0) - | ((TARGET_FRES) ? RS6000_BTM_FRES : 0) - | ((TARGET_FRSQRTE) ? RS6000_BTM_FRSQRTE : 0) - | ((TARGET_FRSQRTES) ? RS6000_BTM_FRSQRTES : 0) - | ((TARGET_POPCNTD) ? RS6000_BTM_POPCNTD : 0) - | ((rs6000_cpu == PROCESSOR_CELL) ? RS6000_BTM_CELL : 0)); + return (((TARGET_ALTIVEC) ? RS6000_BTM_ALTIVEC : 0) + | ((TARGET_VSX) ? RS6000_BTM_VSX : 0) + | ((TARGET_SPE) ? RS6000_BTM_SPE : 0) + | ((TARGET_PAIRED_FLOAT) ? RS6000_BTM_PAIRED : 0) + | ((TARGET_FRE) ? RS6000_BTM_FRE : 0) + | ((TARGET_FRES) ? RS6000_BTM_FRES : 0) + | ((TARGET_FRSQRTE) ? RS6000_BTM_FRSQRTE : 0) + | ((TARGET_FRSQRTES) ? RS6000_BTM_FRSQRTES : 0) + | ((TARGET_POPCNTD) ? RS6000_BTM_POPCNTD : 0) + | ((rs6000_cpu == PROCESSOR_CELL) ? RS6000_BTM_CELL : 0) + | ((TARGET_P8_VECTOR) ? RS6000_BTM_P8_VECTOR : 0) + | ((TARGET_CRYPTO) ? RS6000_BTM_CRYPTO : 0)); } /* Override command line options. Mostly we process the processor type and @@ -2803,7 +2863,9 @@ rs6000_option_override_internal (bool gl /* For the newer switches (vsx, dfp, etc.) set some of the older options, unless the user explicitly used the -mno-