From patchwork Thu May 5 18:05:19 2016
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 619011
Date: Thu, 5 May 2016 14:05:19 -0400
From: Michael Meissner
To: Segher Boessenkool
Cc: Michael Meissner, gcc-patches@gcc.gnu.org, David Edelsohn,
    Bill Schmidt
Subject: Re: [PATCH], Add PowerPC ISA 3.0 vector d-form addressing
Message-ID: <20160505180519.GA26712@ibm-tiger.the-meissners.org>
References: <20160503223955.GA12329@ibm-tiger.the-meissners.org>
    <20160504161651.GA32139@gate.crashing.org>
In-Reply-To: <20160504161651.GA32139@gate.crashing.org>

On Wed, May 04, 2016 at 11:16:52AM -0500, Segher Boessenkool wrote:
> Hi Mike,
>
> On Tue, May 03, 2016 at 06:39:55PM -0400, Michael Meissner wrote:
> > With this patch, I enable -mlra if the user did not specify either -mlra or
> > -mno-lra on the command line, and -mcpu=power9 or -mpower9-dform-vector were
> > used.  I also enabled -mvsx-timode if LRA was used, which also is a RELOAD
> > issue, that works with LRA.
>
> I don't like enabling LRA if the user didn't ask for it; it is a bit too
> surprising.  What do you do if there is -mno-lra explicitly?  You can just
> do the same if no-lra is implicit?

Ok.

> > * doc/md.texi (wO constraint): Likewise.
>
> Everything is "likewise", that isn't very helpful.  Writing big changelogs
> is annoying, I totally agree, but please try a bit harder.
>
> > --- gcc/config/rs6000/rs6000.opt (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 235831)
> > +++ gcc/config/rs6000/rs6000.opt (.../gcc/config/rs6000) (working copy)
> > @@ -470,8 +470,8 @@ Target RejectNegative Joined UInteger Va
> >  -mlong-double- Specify size of long double (64 or 128 bits).
> >
> >  mlra
> > -Target Report Var(rs6000_lra_flag) Init(0) Save
> > -Use LRA instead of reload.
> > +Target Undocumented Mask(LRA) Var(rs6000_isa_flags)
> > +Use the LRA register allocator instead of the reload register allocator.
>
> It wasn't "undocumented" before?  Why the change to a mask bit btw?

It was always meant to be undocumented, but I changed it to be similar to what
was there before.  I am trying to change all of the random switches that set a
separate word to be option mask bits, so I made that part of the change in
these next patches.  I did remove setting it for -mcpu=power9.

> > +mpower9-dform-scalar
> > +Target Report Mask(P9_DFORM_SCALAR) Var(rs6000_isa_flags)
> > +Use/do not use scalar register+offset memory instructions added in ISA 3.0.
> > +
> > +mpower9-dform-vector
> > +Target Report Mask(P9_DFORM_VECTOR) Var(rs6000_isa_flags)
> > +Use/do not use vector register+offset memory instructions added in ISA 3.0.
> > +
> >  mpower9-dform
> > -Target Undocumented Mask(P9_DFORM) Var(rs6000_isa_flags)
> > -Use/do not use vector and scalar instructions added in ISA 3.0.
> > +Target Report Var(TARGET_P9_DFORM_BOTH) Init(-1) Save
> > +Use/do not use register+offset memory instructions added in ISA 3.0.
>
> These should probably all be undocumented, though (they're not something
> users should use).

I will make -mpower9-dform public (which I thought it was, but evidently I
missed adding the documentation for GCC 6), but I will make the -scalar and
-vector forms private.

> > +/* Return true if the ADDR is an acceptiable address for a quad memory
>                                           ^ spelling

Ok.

> > +  if (((addr_mask & RELOAD_REG_QUAD_OFFSET) == 0)
> > +      || !quad_address_p (addr, mode, false))
>

Here is the latest version of the patch.  Like before, it bootstraps and has no
regressions.  Is it ok to apply to the trunk?

[gcc]
2016-05-05  Michael Meissner

	* config/rs6000/rs6000-cpus.def (ISA_3_0_MASKS_SERVER): Use
	-mpower9-dform-scalar instead of -mpower9-dform.  Add note not to
	include -mpower9-dform-vector until we switch over to LRA.
	(POWERPC_MASKS): Add -mlra.
	Split -mpower9-dform into two switches, -mpower9-dform-scalar and
	-mpower9-dform-vector.
	* config/rs6000/rs6000.opt (-mlra): Switch to being an option mask
	bit instead of being a separate word.  Split -mpower9-dform into
	two switches, -mpower9-dform-scalar and -mpower9-dform-vector.
	* config/rs6000/rs6000.c (RELOAD_REG_QUAD_OFFSET): New addr_mask
	for the register class supporting 128-bit quad word memory offsets.
	(mode_supports_vsx_dform_quad): Helper function to return if the
	register class uses quad word memory offsets.
	(rs6000_debug_addr_mask): Add support for quad word memory offsets.
	(rs6000_debug_reg_global): Use TARGET_LRA instead of calling the
	lra_p target hook.
	(rs6000_setup_reg_addr_masks): If ISA 3.0 vector d-form
	instructions are enabled, set up the appropriate addr_masks for
	128-bit types.
	(rs6000_init_hard_regno_mode_ok): wb constraint is now based on
	-mpower9-dform-scalar, instead of -mpower9-dform.
	(rs6000_option_override_internal): Split -mpower9-dform into two
	switches, -mpower9-dform-scalar and -mpower9-dform-vector.  The
	-mpower9-dform switch sets or clears both.  If we are not using
	the LRA register allocator, do not enable -mpower9-dform-vector by
	default.  If we are using LRA, enable -mpower9-dform-vector and
	-mvsx-timode if it is appropriate.  Issue a warning if either
	-mpower9-dform-vector or -mvsx-timode is explicitly used without
	enabling LRA.
	(quad_address_offset_p): New helper function to return if the
	offset is legal for quad word memory instructions.
	(quad_address_p): New function to determine if GPR or vector
	register quad word memory addresses are legal.
	(mem_operand_gpr): Validate quad word address offsets.
	(reg_offset_addressing_ok_p): Add support for ISA 3.0 vector d-form
	(register + offset) instructions.
	(offsettable_ok_by_alignment): Likewise.
	(rs6000_legitimate_offset_address_p): Likewise.
	(legitimate_lo_sum_address_p): Likewise.
	(rs6000_legitimize_address): Likewise.
	(rs6000_legitimize_reload_address): Add more debug statements for
	-mdebug=addr.
	(rs6000_legitimate_address_p): Add support for ISA 3.0 vector
	d-form instructions.
	(rs6000_secondary_reload_memory): Add support for ISA 3.0 vector
	d-form instructions.  Distinguish different cases in debug output.
	(rs6000_secondary_reload_inner): Add support for ISA 3.0 vector
	d-form instructions.
	(rs6000_preferred_reload_class): Likewise.
	(rs6000_output_move_128bit): Add support for ISA 3.0 d-form
	instructions.  If ISA 3.0 is available, generate lxvx/stxvx instead
	of the ISA 2.06 indexed memory instructions.
	(rs6000_emit_prologue): If we have ISA 3.0 d-form instructions, use
	them to save/restore the saved vector registers instead of using
	Altivec instructions.
	(rs6000_emit_epilogue): Likewise.
	(rs6000_lra_p): Use TARGET_LRA instead of the old option word.
	(rs6000_opt_masks): Split -mpower9-dform into -mpower9-dform-scalar
	and -mpower9-dform-vector.
	(rs6000_print_options_internal): Print -mno- if was not selected.
	* config/rs6000/constraints.md (wO constraint): New constraint for
	ISA 3.0 vector d-form support.
	* config/rs6000/predicates.md (quad_memory_operand): Move most of
	the code into quad_address_p and call it to share code with
	vsx_quad_dform_memory_operand.
	(vsx_quad_dform_memory_operand): New predicate for ISA 3.0 vector
	d-form support.
	* config/rs6000/rs6000-protos.h (quad_address_p): Add declaration.
	* config/rs6000/rs6000.md (p9_vecload_): Delete hack to emit ISA
	3.0 vector indexed memory instructions, and fold the code into the
	normal mov patterns.
	(p9_vecstore_): Likewise.
	(vsx_mov): Add support for ISA 3.0 vector d-form instructions.
	(vsx_movti_64bit): Likewise.
	(vsx_movti_32bit): Likewise.
	* doc/invoke.texi (RS/6000 and PowerPC Options): Add documentation
	for -mpower9-dform.
	* doc/md.texi (wO constraint): Document wO constraint.

[gcc/testsuite]
2016-05-05  Michael Meissner

	* gcc.target/powerpc/p8vector-int128-1.c: Add -mlra to silence
	warning when using -mvsx-timode.
	* gcc.target/powerpc/pr68805.c: Likewise.
	* gcc.target/powerpc/dform-3.c: New test for ISA 3.0 vector d-form
	support.
	* gcc.target/powerpc/dform-1.c: Add -mlra option.
	* gcc.target/powerpc/dform-2.c: Likewise.

Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 235831)
+++ gcc/config/rs6000/rs6000-cpus.def (.../gcc/config/rs6000) (working copy)
@@ -60,13 +60,14 @@
                                  | OPTION_MASK_UPPER_REGS_SF)
 
 /* Add ISEL back into ISA 3.0, since it is supposed to be a win.  Do not add
-   P9_DFORM or P9_MINMAX until they are fully debugged.  */
+   P9_MINMAX until the hardware that supports it is available.  Do not add
+   P9_DFORM_VECTOR until LRA is the default register allocator.  */
 #define ISA_3_0_MASKS_SERVER    (ISA_2_7_MASKS_SERVER                   \
                                  | OPTION_MASK_FLOAT128_HW              \
                                  | OPTION_MASK_ISEL                     \
                                  | OPTION_MASK_MODULO                   \
                                  | OPTION_MASK_P9_FUSION                \
-                                 | OPTION_MASK_P9_DFORM                 \
+                                 | OPTION_MASK_P9_DFORM_SCALAR          \
                                  | OPTION_MASK_P9_VECTOR)
 
 #define POWERPC_7400_MASK       (OPTION_MASK_PPC_GFXOPT | OPTION_MASK_ALTIVEC)
@@ -94,6 +95,7 @@
                                  | OPTION_MASK_FPRND                    \
                                  | OPTION_MASK_HTM                      \
                                  | OPTION_MASK_ISEL                     \
+                                 | OPTION_MASK_LRA                      \
                                  | OPTION_MASK_MFCRF                    \
                                  | OPTION_MASK_MFPGPR                   \
                                  | OPTION_MASK_MODULO                   \
@@ -101,7 +103,8 @@
                                  | OPTION_MASK_NO_UPDATE                \
                                  | OPTION_MASK_P8_FUSION                \
                                  | OPTION_MASK_P8_VECTOR                \
-                                 | OPTION_MASK_P9_DFORM                 \
+                                 | OPTION_MASK_P9_DFORM_SCALAR          \
+                                 | OPTION_MASK_P9_DFORM_VECTOR          \
                                  | OPTION_MASK_P9_FUSION                \
                                  | OPTION_MASK_P9_MINMAX                \
                                  | OPTION_MASK_P9_VECTOR                \
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 235831)
+++ gcc/config/rs6000/rs6000.opt (.../gcc/config/rs6000) (working copy)
@@ -470,7 +470,7 @@ Target RejectNegative Joined UInteger Va
 -mlong-double- Specify size of long double (64 or 128 bits).
 
 mlra
-Target Report Var(rs6000_lra_flag) Init(0) Save
+Target Report Mask(LRA) Var(rs6000_isa_flags)
 Use LRA instead of reload.
 
 msched-costly-dep=
@@ -609,9 +609,17 @@ mpower9-vector
 Target Report Mask(P9_VECTOR) Var(rs6000_isa_flags)
 Use/do not use vector and scalar instructions added in ISA 3.0.
 
+mpower9-dform-scalar
+Target Undocumented Mask(P9_DFORM_SCALAR) Var(rs6000_isa_flags)
+Use/do not use scalar register+offset memory instructions added in ISA 3.0.
+
+mpower9-dform-vector
+Target Undocumented Mask(P9_DFORM_VECTOR) Var(rs6000_isa_flags)
+Use/do not use vector register+offset memory instructions added in ISA 3.0.
+
 mpower9-dform
-Target Undocumented Mask(P9_DFORM) Var(rs6000_isa_flags)
-Use/do not use vector and scalar instructions added in ISA 3.0.
+Target Report Var(TARGET_P9_DFORM_BOTH) Init(-1) Save
+Use/do not use register+offset memory instructions added in ISA 3.0.
 
 mpower9-minmax
 Target Undocumented Mask(P9_MINMAX) Var(rs6000_isa_flags)
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 235831)
+++ gcc/config/rs6000/rs6000.c (.../gcc/config/rs6000) (working copy)
@@ -452,6 +452,7 @@ typedef unsigned char addr_mask_type;
 #define RELOAD_REG_PRE_INCDEC   0x10    /* PRE_INC/PRE_DEC valid.  */
 #define RELOAD_REG_PRE_MODIFY   0x20    /* PRE_MODIFY valid.  */
 #define RELOAD_REG_AND_M16      0x40    /* AND -16 addressing.  */
+#define RELOAD_REG_QUAD_OFFSET  0x80    /* quad offset is limited.  */
 
 /* Register type masks based on the type, of valid addressing modes.  */
 struct rs6000_reg_addr {
@@ -499,6 +500,16 @@ mode_supports_vmx_dform (machine_mode mo
   return ((reg_addr[mode].addr_mask[RELOAD_REG_VMX] & RELOAD_REG_OFFSET) != 0);
 }
 
+/* Return true if we have D-form addressing in VSX registers.  This addressing
+   is more limited than normal d-form addressing in that the offset must be
+   aligned on a 16-byte boundary.  */
+static inline bool
+mode_supports_vsx_dform_quad (machine_mode mode)
+{
+  return ((reg_addr[mode].addr_mask[RELOAD_REG_ANY] & RELOAD_REG_QUAD_OFFSET)
+          != 0);
+}
+
 
 /* Target cpu costs.  */
 
@@ -2108,7 +2119,9 @@ rs6000_debug_addr_mask (addr_mask_type m
   else if (keep_spaces)
     *p++ = ' ';
 
-  if ((mask & RELOAD_REG_OFFSET) != 0)
+  if ((mask & RELOAD_REG_QUAD_OFFSET) != 0)
+    *p++ = 'O';
+  else if ((mask & RELOAD_REG_OFFSET) != 0)
     *p++ = 'o';
   else if (keep_spaces)
     *p++ = ' ';
@@ -2645,7 +2658,7 @@ rs6000_debug_reg_global (void)
   if (TARGET_LINK_STACK)
     fprintf (stderr, DEBUG_FMT_S, "link_stack", "true");
 
-  if (targetm.lra_p ())
+  if (TARGET_LRA)
     fprintf (stderr, DEBUG_FMT_S, "lra", "true");
 
   if (TARGET_P8_FUSION)
@@ -2781,17 +2794,31 @@ rs6000_setup_reg_addr_masks (void)
                 }
 
               /* GPR and FPR registers can do REG+OFFSET addressing, except
-                 possibly for SDmode.  ISA 3.0 (i.e. power9) adds D-form
-                 addressing for scalars to altivec registers.  */
+                 possibly for SDmode.  ISA 3.0 (i.e. power9) adds D-form addressing
+                 for 64-bit scalars and 32-bit SFmode to altivec registers.  */
               if ((addr_mask != 0) && !indexed_only_p
                   && msize <= 8
                   && (rc == RELOAD_REG_GPR
-                      || rc == RELOAD_REG_FPR
-                      || (rc == RELOAD_REG_VMX
-                          && TARGET_P9_DFORM
-                          && (m2 == DFmode || m2 == SFmode))))
+                      || ((msize == 8 || m2 == SFmode)
+                          && (rc == RELOAD_REG_FPR
+                              || (rc == RELOAD_REG_VMX
+                                  && TARGET_P9_DFORM_SCALAR)))))
                 addr_mask |= RELOAD_REG_OFFSET;
 
+              /* VSX registers can do REG+OFFSET addressing if ISA 3.0
+                 instructions are enabled.  The offset for 128-bit VSX registers is
+                 only 12-bits.  While GPRs can handle the full offset range, VSX
+                 registers can only handle the restricted range.  */
+              else if ((addr_mask != 0) && !indexed_only_p
+                       && msize == 16 && TARGET_P9_DFORM_VECTOR
+                       && (ALTIVEC_OR_VSX_VECTOR_MODE (m2)
+                           || (m2 == TImode && TARGET_VSX_TIMODE)))
+                {
+                  addr_mask |= RELOAD_REG_OFFSET;
+                  if (rc == RELOAD_REG_FPR || rc == RELOAD_REG_VMX)
+                    addr_mask |= RELOAD_REG_QUAD_OFFSET;
+                }
+
               /* VMX registers can do (REG & -16) and ((REG+REG) & -16)
                  addressing on 128-bit types.  */
               if (rc == RELOAD_REG_VMX && msize == 16
@@ -3114,7 +3141,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
     }
 
   /* Support for new D-form instructions.  */
-  if (TARGET_P9_DFORM)
+  if (TARGET_P9_DFORM_SCALAR)
     rs6000_constraints[RS6000_CONSTRAINT_wb] = ALTIVEC_REGS;
 
   /* Support for ISA 3.0 (power9) vectors.  */
@@ -3987,7 +4014,8 @@ rs6000_option_override_internal (bool gl
   /* For the newer switches (vsx, dfp, etc.) set some of the older options,
      unless the user explicitly used the -mno-