From patchwork Tue May 3 22:39:55 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 618167
Date: Tue, 3 May 2016 18:39:55 -0400
From: Michael Meissner
To: gcc-patches@gcc.gnu.org, Segher Boessenkool, David Edelsohn, Bill Schmidt
Subject: [PATCH], Add PowerPC ISA 3.0 vector d-form addressing
Message-ID: <20160503223955.GA12329@ibm-tiger.the-meissners.org>

This patch implements the new instructions added in ISA 3.0 (power9) to allow
d-form (register + offset) memory loads and stores to/from vector registers. I
split the previous -mpower9-dform switch into -mpower9-dform-vector and
-mpower9-dform-scalar in case you need to disable one or both forms.

The vector d-form instructions are more restricted than the scalar
instructions in that they use a 12-bit offset (like the lq/stq instructions
that operate on GPR registers).

Note, -mpower9-dform-vector is not compatible with RELOAD. I believe the
problem is that we are missing some push_reloads, so reload winds up with an
invalid memory address that includes another memory address. Given that we
plan to move from RELOAD to LRA, I stopped trying to debug it, and enabled the
option only if LRA is used.

Right now, we cannot make LRA the default until the performance degradations
that LRA causes in the 403.gcc SPEC benchmark are fixed:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69847

With this patch, I enable -mlra if the user did not specify either -mlra or
-mno-lra on the command line, and either -mcpu=power9 or -mpower9-dform-vector
was used. I also enable -mvsx-timode if LRA is used; -mvsx-timode is another
option that has problems under RELOAD but works with LRA.

I have built the SPEC 2006 CPU benchmarks with this option, and all benchmarks
generate vector d-form instructions (mcf only generates stores, not loads).

This patch bootstraps fine on a little endian Power8 compiler and has no
regressions. Is it ok to install in the trunk? How about in the gcc 6.2 branch
after a burn-in period?
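As a rough illustration of the two points in the description above (the restricted vector d-form offset, and the -mlra / -mvsx-timode defaulting), here is a minimal standalone C sketch. It is not code from the patch: the names quad_offset_ok, apply_option_defaults, dform_sketch_selfcheck, and the OPT_* bits are hypothetical. The offset check assumes the 12-bit field holds a signed displacement counted in 16-byte units, so the byte offset must be a multiple of 16 that fits in a signed 16-bit value. The sketch also assumes that an explicit -mvsx-timode or -mno-vsx-timode from the user is left alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical option bits for this sketch only; the patch itself uses
   OPTION_MASK_* bits in rs6000_isa_flags.  */
enum
{
  OPT_LRA = 1u << 0,
  OPT_P9_DFORM_VECTOR = 1u << 1,
  OPT_VSX_TIMODE = 1u << 2
};

/* Return true if OFFSET is usable as a vector d-form displacement.
   Assumption: the 12-bit field is a signed count of 16-byte units, so the
   byte offset must be 16-byte aligned and fit in a signed 16-bit value.  */
bool
quad_offset_ok (int64_t offset)
{
  return offset >= -32768 && offset <= 32767 && offset % 16 == 0;
}

/* Apply the defaulting rules from the description.  FLAGS holds the current
   option bits, EXPLICIT_FLAGS the bits the user set or cleared on the
   command line, and CPU_POWER9 is true for -mcpu=power9.  */
unsigned
apply_option_defaults (unsigned flags, unsigned explicit_flags,
                       bool cpu_power9)
{
  /* Default to LRA only when the user gave neither -mlra nor -mno-lra.  */
  if ((explicit_flags & OPT_LRA) == 0
      && (cpu_power9 || (flags & OPT_P9_DFORM_VECTOR) != 0))
    flags |= OPT_LRA;

  /* Vector d-form is only kept when LRA is in use.  */
  if ((flags & OPT_LRA) == 0)
    flags &= ~(unsigned) OPT_P9_DFORM_VECTOR;

  /* Assumption: an explicit -mvsx-timode/-mno-vsx-timode is left alone.  */
  if ((flags & OPT_LRA) != 0 && (explicit_flags & OPT_VSX_TIMODE) == 0)
    flags |= OPT_VSX_TIMODE;

  return flags;
}

/* Small self-check showing how the two helpers are meant to be used.  */
void
dform_sketch_selfcheck (void)
{
  assert (quad_offset_ok (32752));
  assert (!quad_offset_ok (8));
  assert (apply_option_defaults (0, 0, true)
          == (unsigned) (OPT_LRA | OPT_VSX_TIMODE));
}
```

Under these assumptions an offset such as 32752 passes the check, while 8 (misaligned) and 32768 (out of range) do not. Passing OPT_LRA in explicit_flags with the bit clear in flags models -mno-lra, in which case the sketch drops vector d-form instead of turning LRA on.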

[gcc]
2016-05-03  Michael Meissner

	* config/rs6000/rs6000-cpus.def (ISA_3_0_MASKS_SERVER): Make -mlra
	an option mask instead of setting a separate word. Add -mlra and
	-mvsx-timode as defaults for power9. Split -mpower9-dform into
	-mpower9-dform-scalar and -mpower9-dform-vector. Add support for
	ISA 3.0 vector d-form instructions. Set -mlra by default if
	-mpower9-dform-vector. Set -mvsx-timode if -mlra. Add more debug
	printouts. If we have ISA 3.0 d-form vector instructions use them
	for the epilog and prolog. Add wO constraint for ISA 3.0 vector
	d-form instructions. Rewrite quad memory support to support both
	lq/stq for GPRs and ISA 3.0 vector d-forms for vector registers.
	Delete p9_vecload_ and p9_vecstore_ in favor of folding the ISA 3.0
	endian load/store into the general mov insns.
	(POWERPC_MASKS): Likewise.
	* config/rs6000/rs6000.opt (-mlra): Likewise.
	(-mpower9-dform): Likewise.
	(-mpower9-dform-scalar): Likewise.
	(-mpower9-dform-vector): Likewise.
	* config/rs6000/rs6000.c (RELOAD_REG_QUAD_OFFSET): Likewise.
	(mode_supports_vsx_dform_quad): Likewise.
	(rs6000_debug_addr_mask): Likewise.
	(rs6000_setup_reg_addr_masks): Likewise.
	(rs6000_option_override_internal): Likewise.
	(quad_address_offset_p): Likewise.
	(mem_operand_gpr): Likewise.
	(reg_offset_addressing_ok_p): Likewise.
	(offsettable_ok_by_alignment): Likewise.
	(rs6000_legitimate_offset_address_p): Likewise.
	(legitimate_lo_sum_address_p): Likewise.
	(rs6000_legitimize_address): Likewise.
	(rs6000_legitimize_reload_address): Likewise.
	(rs6000_legitimate_address_p): Likewise.
	(rs6000_secondary_reload_memory): Likewise.
	(rs6000_secondary_reload_inner): Likewise.
	(rs6000_preferred_reload_class): Likewise.
	(rs6000_output_move_128bit): Likewise.
	(rs6000_emit_prologue): Likewise.
	(rs6000_emit_epilogue): Likewise.
	(rs6000_lra_p): Likewise.
	(rs6000_opt_masks): Likewise.
	(rs6000_print_options_internal): Likewise.
	* config/rs6000/constraints.md (wO constraint): Likewise.
	* config/rs6000/predicates.md (quad_memory_operand): Likewise.
	(vsx_quad_dform_memory_operand): Likewise.
	* config/rs6000/rs6000-protos.h (quad_address_p): Likewise.
	* config/rs6000/vsx.md (p9_vecload_): Likewise.
	(p9_vecstore_): Likewise.
	(vsx_mov

	* gcc.target/powerpc/dform-1.c: Add -mlra to options.
	* gcc.target/powerpc/dform-2.c: Likewise.
	* gcc.target/powerpc/dform-3.c: New test for ISA 3.0 vector d-form
	instructions.

Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 235831)
+++ gcc/config/rs6000/rs6000-cpus.def (.../gcc/config/rs6000) (working copy)
@@ -60,14 +60,17 @@
                                  | OPTION_MASK_UPPER_REGS_SF)
 
 /* Add ISEL back into ISA 3.0, since it is supposed to be a win. Do not add
-   P9_DFORM or P9_MINMAX until they are fully debugged. */
+   P9_MINMAX until the hardware that supports it is available. */
 #define ISA_3_0_MASKS_SERVER (ISA_2_7_MASKS_SERVER \
                                  | OPTION_MASK_FLOAT128_HW \
                                  | OPTION_MASK_ISEL \
+                                 | OPTION_MASK_LRA \
                                  | OPTION_MASK_MODULO \
                                  | OPTION_MASK_P9_FUSION \
-                                 | OPTION_MASK_P9_DFORM \
-                                 | OPTION_MASK_P9_VECTOR)
+                                 | OPTION_MASK_P9_DFORM_SCALAR \
+                                 | OPTION_MASK_P9_DFORM_VECTOR \
+                                 | OPTION_MASK_P9_VECTOR \
+                                 | OPTION_MASK_VSX_TIMODE)
 
 #define POWERPC_7400_MASK (OPTION_MASK_PPC_GFXOPT | OPTION_MASK_ALTIVEC)
 
@@ -94,6 +97,7 @@
                                  | OPTION_MASK_FPRND \
                                  | OPTION_MASK_HTM \
                                  | OPTION_MASK_ISEL \
+                                 | OPTION_MASK_LRA \
                                  | OPTION_MASK_MFCRF \
                                  | OPTION_MASK_MFPGPR \
                                  | OPTION_MASK_MODULO \
@@ -101,7 +105,8 @@
                                  | OPTION_MASK_NO_UPDATE \
                                  | OPTION_MASK_P8_FUSION \
                                  | OPTION_MASK_P8_VECTOR \
-                                 | OPTION_MASK_P9_DFORM \
+                                 | OPTION_MASK_P9_DFORM_SCALAR \
+                                 | OPTION_MASK_P9_DFORM_VECTOR \
                                  | OPTION_MASK_P9_FUSION \
                                  | OPTION_MASK_P9_MINMAX \
                                  | OPTION_MASK_P9_VECTOR \
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 235831)
+++ gcc/config/rs6000/rs6000.opt (.../gcc/config/rs6000) (working copy)
@@ -470,8 +470,8 @@ Target RejectNegative Joined UInteger Va
 -mlong-double- Specify size of long double (64 or 128 bits).
 
 mlra
-Target Report Var(rs6000_lra_flag) Init(0) Save
-Use LRA instead of reload.
+Target Undocumented Mask(LRA) Var(rs6000_isa_flags)
+Use the LRA register allocator instead of the reload register allocator.
 
 msched-costly-dep=
 Target RejectNegative Joined Var(rs6000_sched_costly_dep_str)
@@ -609,9 +609,17 @@ mpower9-vector
 Target Report Mask(P9_VECTOR) Var(rs6000_isa_flags)
 Use/do not use vector and scalar instructions added in ISA 3.0.
 
+mpower9-dform-scalar
+Target Report Mask(P9_DFORM_SCALAR) Var(rs6000_isa_flags)
+Use/do not use scalar register+offset memory instructions added in ISA 3.0.
+
+mpower9-dform-vector
+Target Report Mask(P9_DFORM_VECTOR) Var(rs6000_isa_flags)
+Use/do not use vector register+offset memory instructions added in ISA 3.0.
+
 mpower9-dform
-Target Undocumented Mask(P9_DFORM) Var(rs6000_isa_flags)
-Use/do not use vector and scalar instructions added in ISA 3.0.
+Target Report Var(TARGET_P9_DFORM_BOTH) Init(-1) Save
+Use/do not use register+offset memory instructions added in ISA 3.0.
 
 mpower9-minmax
 Target Undocumented Mask(P9_MINMAX) Var(rs6000_isa_flags)
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 235831)
+++ gcc/config/rs6000/rs6000.c (.../gcc/config/rs6000) (working copy)
@@ -452,6 +452,7 @@ typedef unsigned char addr_mask_type;
 #define RELOAD_REG_PRE_INCDEC 0x10 /* PRE_INC/PRE_DEC valid. */
 #define RELOAD_REG_PRE_MODIFY 0x20 /* PRE_MODIFY valid. */
 #define RELOAD_REG_AND_M16 0x40 /* AND -16 addressing. */
+#define RELOAD_REG_QUAD_OFFSET 0x80 /* quad offset is limited. */
 
 /* Register type masks based on the type, of valid addressing modes. */
 struct rs6000_reg_addr {
@@ -499,6 +500,16 @@ mode_supports_vmx_dform (machine_mode mo
   return ((reg_addr[mode].addr_mask[RELOAD_REG_VMX] & RELOAD_REG_OFFSET) != 0);
 }
 
+/* Return true if we have D-form addressing in VSX registers. This addressing
+   is more limited than normal d-form addressing in that the offset must be
+   aligned on a 16-byte boundary. */
+static inline bool
+mode_supports_vsx_dform_quad (machine_mode mode)
+{
+  return ((reg_addr[mode].addr_mask[RELOAD_REG_ANY] & RELOAD_REG_QUAD_OFFSET)
+          != 0);
+}
+
 
 /* Target cpu costs. */
 
@@ -2108,7 +2119,9 @@ rs6000_debug_addr_mask (addr_mask_type m
   else if (keep_spaces)
     *p++ = ' ';
 
-  if ((mask & RELOAD_REG_OFFSET) != 0)
+  if ((mask & RELOAD_REG_QUAD_OFFSET) != 0)
+    *p++ = 'O';
+  else if ((mask & RELOAD_REG_OFFSET) != 0)
     *p++ = 'o';
   else if (keep_spaces)
     *p++ = ' ';
@@ -2645,9 +2658,6 @@ rs6000_debug_reg_global (void)
   if (TARGET_LINK_STACK)
     fprintf (stderr, DEBUG_FMT_S, "link_stack", "true");
 
-  if (targetm.lra_p ())
-    fprintf (stderr, DEBUG_FMT_S, "lra", "true");
-
   if (TARGET_P8_FUSION)
     {
       char options[80];
@@ -2781,17 +2791,31 @@ rs6000_setup_reg_addr_masks (void)
             }
 
           /* GPR and FPR registers can do REG+OFFSET addressing, except
-             possibly for SDmode. ISA 3.0 (i.e. power9) adds D-form
-             addressing for scalars to altivec registers. */
+             possibly for SDmode. ISA 3.0 (i.e. power9) adds D-form addressing
+             for 64-bit scalars and 32-bit SFmode to altivec registers. */
           if ((addr_mask != 0) && !indexed_only_p
               && msize <= 8
               && (rc == RELOAD_REG_GPR
-                  || rc == RELOAD_REG_FPR
-                  || (rc == RELOAD_REG_VMX
-                      && TARGET_P9_DFORM
-                      && (m2 == DFmode || m2 == SFmode))))
+                  || ((msize == 8 || m2 == SFmode)
+                      && (rc == RELOAD_REG_FPR
+                          || (rc == RELOAD_REG_VMX
+                              && TARGET_P9_DFORM_SCALAR)))))
             addr_mask |= RELOAD_REG_OFFSET;
 
+          /* VSX registers can do REG+OFFSET addressing if ISA 3.0
+             instructions are enabled. The offset for 128-bit VSX registers is
+             only 12-bits. While GPRs can handle the full offset range, VSX
+             registers can only handle the restricted range. */
+          else if ((addr_mask != 0) && !indexed_only_p
+                   && msize == 16 && TARGET_P9_DFORM_VECTOR
+                   && (ALTIVEC_OR_VSX_VECTOR_MODE (m2)
+                       || (m2 == TImode && TARGET_VSX_TIMODE)))
+            {
+              addr_mask |= RELOAD_REG_OFFSET;
+              if (rc == RELOAD_REG_FPR || rc == RELOAD_REG_VMX)
+                addr_mask |= RELOAD_REG_QUAD_OFFSET;
+            }
+
           /* VMX registers can do (REG & -16) and ((REG+REG) & -16)
              addressing on 128-bit types. */
           if (rc == RELOAD_REG_VMX && msize == 16
@@ -3114,7 +3138,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
     }
 
   /* Support for new D-form instructions. */
-  if (TARGET_P9_DFORM)
+  if (TARGET_P9_DFORM_SCALAR)
     rs6000_constraints[RS6000_CONSTRAINT_wb] = ALTIVEC_REGS;
 
   /* Support for ISA 3.0 (power9) vectors. */
@@ -3987,7 +4011,8 @@ rs6000_option_override_internal (bool gl
   /* For the newer switches (vsx, dfp, etc.) set some of the older options,
      unless the user explicitly used the -mno-