From patchwork Wed Dec 14 14:45:12 2016
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 705685
Date: Wed, 14 Dec 2016 09:45:12 -0500
From: Michael Meissner
To: Segher Boessenkool
Cc: Michael Meissner, gcc-patches@gcc.gnu.org, David Edelsohn, Bill Schmidt
Subject: Re: [PATCH] Add ISA 3.0 PowerPC support for VEXTU{B, H, W}{L, R}X instructions
References: <20161209204810.GA9578@ibm-tiger.the-meissners.org>
 <20161212120510.GE30845@gate.crashing.org>
 <20161213151502.GA26733@ibm-tiger.the-meissners.org>
 <20161213222935.GL30845@gate.crashing.org>
 <20161213231717.GA6350@ibm-tiger.the-meissners.org>
 <20161214010245.GO30845@gate.crashing.org>
In-Reply-To: <20161214010245.GO30845@gate.crashing.org>
Message-Id: <20161214144512.GA4439@ibm-tiger.the-meissners.org>

On Tue, Dec 13, 2016 at 07:02:48PM -0600, Segher Boessenkool wrote:
> On Tue, Dec 13, 2016 at 06:17:17PM -0500, Michael Meissner wrote:
> > > > +      else if (mode == V8HImode)
> > > > +        {
> > > > +          rtx tmp_gpr_si = (GET_CODE (tmp_gpr) == SCRATCH
> > > > +                            ? dest_si
> > > > +                            : gen_rtx_REG (SImode, REGNO (tmp_gpr)));
> > >
> > > I think you have these the wrong way around?
> >
> > The rs6000_split_vec_extract_var function is called from several places in
> > vsx.md to do a variable vector extract. In looking at each of the cases, there
> > is a GPR tmp register for each of the calls, so I could modify it to move:
> >
> >     gcc_assert (REG_P (tmp_gpr));
> >
> > before the support for VEXTU{B,H,W}{L,R}X instructions, and leave the
> >
> >     gcc_assert (REG_P (tmp_altivec));
> >
> > and remove the test for SCRATCH. In the original version of the code, the
> > non-variable case also called rs6000_split_vec_extract_var, and it did not have
> > a scratch register.
>
> What I am asking is: in your code, if there is a scratch you don't use it,
> while if you get a reg you generate a new reg. It looks like you have the
> ? and : the wrong way around.
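
For readers following the exchange above, the quoted conditional can be modelled outside of GCC. This is only a stand-alone C sketch, not GCC code: `struct reg_model`, `pick_tmp_regno` and `vextu_byte_offset` are hypothetical names invented for illustration. It assumes only what the thread and patch state: a SCRATCH has no register assigned, `gen_rtx_REG (SImode, REGNO (tmp_gpr))` refers to the temporary's own register number in SImode, and the element index is shifted left by 0, 1 or 2 for byte, halfword and word elements.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an RTL operand: either an unallocated
   SCRATCH or a register with a number.  Not a GCC type.  */
struct reg_model
{
  bool is_scratch;
  int regno;
};

/* Models the quoted conditional:
     GET_CODE (tmp_gpr) == SCRATCH ? dest_si
                                   : gen_rtx_REG (SImode, REGNO (tmp_gpr))
   With a SCRATCH there is no register to use, so the destination is
   reused; otherwise the temporary's own register number is kept.  */
static int
pick_tmp_regno (struct reg_model tmp_gpr, int dest_regno)
{
  return tmp_gpr.is_scratch ? dest_regno : tmp_gpr.regno;
}

/* Byte offset passed in the GPR operand of VEXTU{B,H,W}{L,R}X for a
   given element index: the index shifted left by 0, 1 or 2, matching
   element sizes of 1, 2 or 4 bytes.  */
static unsigned
vextu_byte_offset (unsigned element, unsigned unit_size)
{
  unsigned shift = (unit_size == 4) ? 2 : (unit_size == 2) ? 1 : 0;
  return element << shift;
}
```

Under this model, a caller that always supplies a real temporary never takes the `dest_regno` branch, so the SCRATCH test is only needed if some caller can pass an unallocated scratch.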

I looked at the code once again, and in the cases that call the function, they
always create a GPR temporary register, so I didn't need to reuse the
destination register.

Thanks for the comments. Are these patches ok to check in? They bootstrap and
have no regressions on power8 little endian:

[gcc]
2016-12-14  Michael Meissner

        * config/rs6000/rs6000.c (rs6000_split_vec_extract_var): On ISA
        3.0/power9, add support to use the VEXTU{B,H,W}{L,R}X extract
        instructions.
        * config/rs6000/vsx.md (VSr2): Add IEEE 128-bit floating point type
        constraint registers.
        (VSr3): Likewise.
        (FL_CONV): New mode iterator for binary floating types that have a
        direct conversion from 64-bit integer to floating point.
        (vsx_extract__p9): Add support for the ISA 3.0/power9
        VEXTU{B,H,W}{L,R}X extract instructions.
        (vsx_extract__p9 splitter): Add splitter to load up the extract
        byte position into the GPR if we are using the VEXTU{B,H,W}{L,R}X
        extract instructions.
        (vsx_extract__di_p9): Support extracts to GPRs.
        (vsx_extract__store_p9): Support extracting to GPRs so that we can
        use reg+offset address instructions.
        (vsx_extract__var): Support extracts to GPRs.
        (vsx_extract___var): New combiner insn to combine vector extracts
        with zero_extend.
        (vsx_ext__fl_): Optimize extracting a small integer vector element
        and converting it to a floating point type.
        (vsx_ext__ufl_): Likewise.

[gcc/testsuite]
2016-12-14  Michael Meissner

        * gcc/testsuite/gcc.target/powerpc/vec-extract.h: If DO_TRACE is
        defined, add tracing of the various extracts to stderr. Add
        support for tests that convert the result to another type.
        * gcc/testsuite/gcc.target/powerpc/vec-extract-v2df.c: Likewise.
        * gcc/testsuite/gcc.target/powerpc/vec-extract-v4sf.c: Likewise.
        * gcc/testsuite/gcc.target/powerpc/vec-extract-v4si-df.c: Add new
        tests that do an extract and then convert the values to double.
        * gcc/testsuite/gcc.target/powerpc/vec-extract-v4siu-df.c: Likewise.
        * gcc/testsuite/gcc.target/powerpc/vec-extract-v16qiu-df.c: Likewise.
        * gcc/testsuite/gcc.target/powerpc/vec-extract-v16qi-df.c: Likewise.
        * gcc/testsuite/gcc.target/powerpc/vec-extract-v8hiu-df.c: Likewise.
        * gcc/testsuite/gcc.target/powerpc/vec-extract-v8hi-df.c: Likewise.
        * gcc.target/powerpc/p9-extract-1.c: Update test to check for
        VEXTU{B,H,W}{L,R}X instructions being generated by default instead
        of VEXTRACTU{B,H} and XXEXTRACTUW.
        * gcc.target/powerpc/p9-extract-3.c: New test for combination of
        vec_extract and convert to floating point.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 243590)
+++ gcc/config/rs6000/rs6000.c (.../gcc/config/rs6000) (working copy)
@@ -7519,8 +7519,52 @@ rs6000_split_vec_extract_var (rtx dest,
     {
       int bit_shift = byte_shift + 3;
       rtx element2;
+      int dest_regno = regno_or_subregno (dest);
+      int src_regno = regno_or_subregno (src);
+      int element_regno = regno_or_subregno (element);
 
-      gcc_assert (REG_P (tmp_gpr) && REG_P (tmp_altivec));
+      gcc_assert (REG_P (tmp_gpr));
+
+      /* See if we want to generate VEXTU{B,H,W}{L,R}X if the destination is in
+         a general purpose register. */
+      if (TARGET_P9_VECTOR
+          && (mode == V16QImode || mode == V8HImode || mode == V4SImode)
+          && INT_REGNO_P (dest_regno)
+          && ALTIVEC_REGNO_P (src_regno)
+          && INT_REGNO_P (element_regno))
+        {
+          rtx dest_si = gen_rtx_REG (SImode, dest_regno);
+          rtx element_si = gen_rtx_REG (SImode, element_regno);
+
+          if (mode == V16QImode)
+            emit_insn (VECTOR_ELT_ORDER_BIG
+                       ? gen_vextublx (dest_si, element_si, src)
+                       : gen_vextubrx (dest_si, element_si, src));
+
+          else if (mode == V8HImode)
+            {
+              rtx tmp_gpr_si = gen_rtx_REG (SImode, REGNO (tmp_gpr));
+              emit_insn (gen_ashlsi3 (tmp_gpr_si, element_si, const1_rtx));
+              emit_insn (VECTOR_ELT_ORDER_BIG
+                         ? gen_vextuhlx (dest_si, tmp_gpr_si, src)
+                         : gen_vextuhrx (dest_si, tmp_gpr_si, src));
+            }
+
+
+          else
+            {
+              rtx tmp_gpr_si = gen_rtx_REG (SImode, REGNO (tmp_gpr));
+              emit_insn (gen_ashlsi3 (tmp_gpr_si, element_si, const2_rtx));
+              emit_insn (VECTOR_ELT_ORDER_BIG
+                         ? gen_vextuwlx (dest_si, tmp_gpr_si, src)
+                         : gen_vextuwrx (dest_si, tmp_gpr_si, src));
+            }
+
+          return;
+        }
+
+
+      gcc_assert (REG_P (tmp_altivec));
 
       /* For little endian, adjust element ordering. For V2DI/V2DF, we can use
          an XOR, otherwise we need to subtract. The shift amount is so VSLO
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 243590)
+++ gcc/config/rs6000/vsx.md (.../gcc/config/rs6000) (working copy)
@@ -119,13 +119,17 @@ (define_mode_attr VSr2 [(V2DF "wd")
                          (V4SF "wf")
                          (DF "ws")
                          (SF "ww")
-                         (DI "wi")])
+                         (DI "wi")
+                         (KF "wq")
+                         (TF "wp")])
 
 (define_mode_attr VSr3 [(V2DF "wa")
                          (V4SF "wa")
                          (DF "ws")
                          (SF "ww")
-                         (DI "wi")])
+                         (DI "wi")
+                         (KF "wq")
+                         (TF "wp")])
 
 ;; Map the register class for sp<->dp float conversions, destination
 (define_mode_attr VSr4 [(SF "ws")
@@ -298,6 +302,14 @@ (define_mode_iterator VSX_EXTRACT_FL [SF
                                       || (FLOAT128_IEEE_P (TFmode)
                                           && TARGET_FLOAT128_HW)")])
 
+;; Mode iterator for binary floating types that have a direct conversion
+;; from 64-bit integer to floating point
+(define_mode_iterator FL_CONV [SF
+                               DF
+                               (KF "TARGET_FLOAT128_HW")
+                               (TF "TARGET_FLOAT128_HW
+                                    && FLOAT128_IEEE_P (TFmode)")])
+
 ;; Iterator for the 2 short vector types to do a splat from an integer
 (define_mode_iterator VSX_SPLAT_I [V16QI V8HI])
 
@@ -2535,63 +2547,98 @@ (define_expand "vsx_extract_"
 })
 
 (define_insn "vsx_extract__p9"
-  [(set (match_operand: 0 "gpc_reg_operand" "=")
+  [(set (match_operand: 0 "gpc_reg_operand" "=r,")
        (vec_select:
-         (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "")
-         (parallel [(match_operand:QI 2 "" "n")])))]
+         (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "wK,")
+         (parallel [(match_operand:QI 2 "" "n,n")])))
+   (clobber (match_scratch:SI 3 "=r,X"))]
   "VECTOR_MEM_VSX_P (mode) && TARGET_VEXTRACTUB
    && TARGET_VSX_SMALL_INTEGER"
 {
-  HOST_WIDE_INT elt = INTVAL (operands[2]);
-  HOST_WIDE_INT elt_adj = (!VECTOR_ELT_ORDER_BIG
-                           ? GET_MODE_NUNITS (mode) - 1 - elt
-                           : elt);
-
-  HOST_WIDE_INT unit_size = GET_MODE_UNIT_SIZE (mode);
-  HOST_WIDE_INT offset = unit_size * elt_adj;
-
-  operands[2] = GEN_INT (offset);
-  if (unit_size == 4)
-    return "xxextractuw %x0,%x1,%2";
+  if (which_alternative == 0)
+    return "#";
+
   else
-    return "vextractu %0,%1,%2";
+    {
+      HOST_WIDE_INT elt = INTVAL (operands[2]);
+      HOST_WIDE_INT elt_adj = (!VECTOR_ELT_ORDER_BIG
+                               ? GET_MODE_NUNITS (mode) - 1 - elt
+                               : elt);
+
+      HOST_WIDE_INT unit_size = GET_MODE_UNIT_SIZE (mode);
+      HOST_WIDE_INT offset = unit_size * elt_adj;
+
+      operands[2] = GEN_INT (offset);
+      if (unit_size == 4)
+        return "xxextractuw %x0,%x1,%2";
+      else
+        return "vextractu %0,%1,%2";
+    }
 }
   [(set_attr "type" "vecsimple")])
 
+(define_split
+  [(set (match_operand: 0 "int_reg_operand")
+        (vec_select:
+         (match_operand:VSX_EXTRACT_I 1 "altivec_register_operand")
+         (parallel [(match_operand:QI 2 "const_int_operand")])))
+   (clobber (match_operand:SI 3 "int_reg_operand"))]
+  "VECTOR_MEM_VSX_P (mode) && TARGET_VEXTRACTUB
+   && TARGET_VSX_SMALL_INTEGER && reload_completed"
+  [(const_int 0)]
+{
+  rtx op0_si = gen_rtx_REG (SImode, REGNO (operands[0]));
+  rtx op1 = operands[1];
+  rtx op2 = operands[2];
+  rtx op3 = operands[3];
+  HOST_WIDE_INT offset = INTVAL (op2) * GET_MODE_UNIT_SIZE (mode);
+
+  emit_move_insn (op3, GEN_INT (offset));
+  if (VECTOR_ELT_ORDER_BIG)
+    emit_insn (gen_vextulx (op0_si, op3, op1));
+  else
+    emit_insn (gen_vexturx (op0_si, op3, op1));
+  DONE;
+})
+
 ;; Optimize zero extracts to eliminate the AND after the extract.
 (define_insn_and_split "*vsx_extract__di_p9"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r,")
        (zero_extend:DI
         (vec_select:
-          (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "")
-          (parallel [(match_operand:QI 2 "const_int_operand" "n")]))))]
+          (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "wK,")
+          (parallel [(match_operand:QI 2 "const_int_operand" "n,n")]))))
+   (clobber (match_scratch:SI 3 "=r,X"))]
   "VECTOR_MEM_VSX_P (mode) && TARGET_VEXTRACTUB
    && TARGET_VSX_SMALL_INTEGER"
   "#"
   "&& reload_completed"
-  [(set (match_dup 3)
-        (vec_select:
-         (match_dup 1)
-         (parallel [(match_dup 2)])))]
+  [(parallel [(set (match_dup 4)
+                   (vec_select:
+                    (match_dup 1)
+                    (parallel [(match_dup 2)])))
+              (clobber (match_dup 3))])]
 {
-  operands[3] = gen_rtx_REG (mode, REGNO (operands[0]));
+  operands[4] = gen_rtx_REG (mode, REGNO (operands[0]));
 })
 
 ;; Optimize stores to use the ISA 3.0 scalar store instructions
 (define_insn_and_split "*vsx_extract__store_p9"
-  [(set (match_operand: 0 "memory_operand" "=Z")
+  [(set (match_operand: 0 "memory_operand" "=Z,m")
        (vec_select:
-         (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "")
-         (parallel [(match_operand:QI 2 "const_int_operand" "n")])))
-   (clobber (match_scratch: 3 "="))]
+         (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" ",")
+         (parallel [(match_operand:QI 2 "const_int_operand" "n,n")])))
+   (clobber (match_scratch: 3 "=,&r"))
+   (clobber (match_scratch:SI 4 "=X,&r"))]
   "VECTOR_MEM_VSX_P (mode) && TARGET_VEXTRACTUB
    && TARGET_VSX_SMALL_INTEGER"
   "#"
   "&& reload_completed"
-  [(set (match_dup 3)
-        (vec_select:
-         (match_dup 1)
-         (parallel [(match_dup 2)])))
+  [(parallel [(set (match_dup 3)
+                   (vec_select:
+                    (match_dup 1)
+                    (parallel [(match_dup 2)])))
+              (clobber (match_dup 4))])
    (set (match_dup 0)
        (match_dup 3))])
 
@@ -2721,13 +2768,13 @@ (define_insn_and_split "*vsx_extract__var"
-  [(set (match_operand: 0 "gpc_reg_operand" "=r,r")
+  [(set (match_operand: 0 "gpc_reg_operand" "=r,r,r")
        (unspec:
-         [(match_operand:VSX_EXTRACT_I 1 "input_operand" "v,m")
-          (match_operand:DI 2 "gpc_reg_operand" "r,r")]
+         [(match_operand:VSX_EXTRACT_I 1 "input_operand" "wK,v,m")
+          (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
         UNSPEC_VSX_EXTRACT))
-   (clobber (match_scratch:DI 3 "=r,&b"))
-   (clobber (match_scratch:V2DI 4 "=&v,X"))]
+   (clobber (match_scratch:DI 3 "=r,r,&b"))
+   (clobber (match_scratch:V2DI 4 "=X,&v,X"))]
   "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT"
   "#"
   "&& reload_completed"
@@ -2738,6 +2785,27 @@ (define_insn_and_split "vsx_extract___var"
+  [(set (match_operand:SDI 0 "gpc_reg_operand" "=r,r,r")
+        (zero_extend:SDI
+         (unspec:
+          [(match_operand:VSX_EXTRACT_I 1 "input_operand" "wK,v,m")
+           (match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
+          UNSPEC_VSX_EXTRACT)))
+   (clobber (match_scratch:DI 3 "=r,r,&b"))
+   (clobber (match_scratch:V2DI 4 "=X,&v,X"))]
+  "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  machine_mode smode = mode;
+  rs6000_split_vec_extract_var (gen_rtx_REG (smode, REGNO (operands[0])),
+                                operands[1], operands[2],
+                                operands[3], operands[4]);
+  DONE;
+})
+
 ;; VSX_EXTRACT optimizations
 ;; Optimize double d = (double) vec_extract (vi, )
 ;; Get the element into the top position and use XVCVSWDP/XVCVUWDP
@@ -2839,6 +2907,56 @@ (define_insn_and_split "*vsx_extract_si_
   DONE;
 })
 
+;; Optimize f = () vec_extract (, )
+;; Where is SFmode, DFmode (and KFmode/TFmode if those types are IEEE
+;; 128-bit hardware types) and is vector char, vector unsigned char,
+;; vector short or vector unsigned short.
+(define_insn_and_split "*vsx_ext__fl_"
+  [(set (match_operand:FL_CONV 0 "gpc_reg_operand" "=")
+        (float:FL_CONV
+         (vec_select:
+          (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "v")
+          (parallel [(match_operand:QI 2 "const_int_operand" "n")]))))
+   (clobber (match_scratch: 3 "=v"))]
+  "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT
+   && TARGET_P9_VECTOR && TARGET_VSX_SMALL_INTEGER"
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (match_dup 3)
+                   (vec_select:
+                    (match_dup 1)
+                    (parallel [(match_dup 2)])))
+              (clobber (scratch:SI))])
+   (set (match_dup 4)
+        (sign_extend:DI (match_dup 3)))
+   (set (match_dup 0)
+        (float: (match_dup 4)))]
+{
+  operands[4] = gen_rtx_REG (DImode, REGNO (operands[3]));
+})
+
+(define_insn_and_split "*vsx_ext__ufl_"
+  [(set (match_operand:FL_CONV 0 "gpc_reg_operand" "=")
+        (unsigned_float:FL_CONV
+         (vec_select:
+          (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "v")
+          (parallel [(match_operand:QI 2 "const_int_operand" "n")]))))
+   (clobber (match_scratch: 3 "=v"))]
+  "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT
+   && TARGET_P9_VECTOR && TARGET_VSX_SMALL_INTEGER"
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (match_dup 3)
+                   (vec_select:
+                    (match_dup 1)
+                    (parallel [(match_dup 2)])))
+              (clobber (scratch:SI))])
+   (set (match_dup 0)
+        (float: (match_dup 4)))]
+{
+  operands[4] = gen_rtx_REG (DImode, REGNO (operands[3]));
+})
+
 ;; V4SI/V8HI/V16QI set operation on ISA 3.0
 (define_insn "vsx_set__p9"
   [(set (match_operand:VSX_EXTRACT_I 0 "gpc_reg_operand" "=")
Index: gcc/testsuite/gcc.target/powerpc/vec-extract.h
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract.h (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 243590)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract.h (.../gcc/testsuite/gcc.target/powerpc) (working copy)
@@ -2,16 +2,53 @@
 #include
 #include
 
+#ifndef RTYPE
+#define RTYPE TYPE
+#endif
+
+#ifdef DO_TRACE
+#include
+
+#define TRACE(STRING, NUM) \
+do \
+  { \
+    fprintf (stderr, "%s%s: %2d\n", (NUM == 0) ? "\n" : "", \
+             STRING, (int)NUM); \
+    fflush (stderr); \
+  } \
+while (0)
+
+#ifndef FAIL_FORMAT
+#define FAIL_FORMAT "%ld"
+#define FAIL_CAST(X) ((long)(X))
+#endif
+
+#define FAIL(EXP, GOT) \
+do \
+  { \
+    fprintf (stderr, "Expected: " FAIL_FORMAT ", got " FAIL_FORMAT "\n", \
+             FAIL_CAST (EXP), FAIL_CAST (GOT)); \
+    fflush (stderr); \
+    abort (); \
+  } \
+while (0)
+
+#else
+#define TRACE(STRING, NUM)
+#define FAIL(EXP, GOT) abort ()
+#endif
+
+static void check (RTYPE, RTYPE) __attribute__((__noinline__));
+static vector TYPE deoptimize (vector TYPE) __attribute__((__noinline__));
+static vector TYPE *deoptimize_ptr (vector TYPE *) __attribute__((__noinline__));
+
 static void
-check (TYPE expected, TYPE got)
+check (RTYPE expected, RTYPE got)
 {
   if (expected != got)
-    abort ();
+    FAIL (expected, got);
 }
 
-static vector TYPE deoptimize (vector TYPE) __attribute__((__noinline__));
-static vector TYPE *deoptimize_ptr (vector TYPE *) __attribute__((__noinline__));
-
 static vector TYPE
 deoptimize (vector TYPE a)
 {
@@ -29,116 +66,116 @@ deoptimize_ptr (vector TYPE *p)
 /* Tests for the normal case of vec_extract where the vector is in a
    register and returning the result in a register as a return value. */
 
-TYPE
+RTYPE
 get_auto_n (vector TYPE a, ssize_t n)
 {
-  return vec_extract (a, n);
+  return (RTYPE) vec_extract (a, n);
 }
 
-TYPE
+RTYPE
 get_auto_0 (vector TYPE a)
 {
-  return vec_extract (a, 0);
+  return (RTYPE) vec_extract (a, 0);
 }
 
-TYPE
+RTYPE
 get_auto_1 (vector TYPE a)
 {
-  return vec_extract (a, 1);
+  return (RTYPE) vec_extract (a, 1);
 }
 
 #if ELEMENTS >= 4
-TYPE
+RTYPE
 get_auto_2 (vector TYPE a)
 {
-  return vec_extract (a, 2);
+  return (RTYPE) vec_extract (a, 2);
 }
 
-TYPE
+RTYPE
 get_auto_3 (vector TYPE a)
 {
-  return vec_extract (a, 3);
+  return (RTYPE) vec_extract (a, 3);
 }
 
 #if ELEMENTS >= 8
-TYPE
+RTYPE
 get_auto_4 (vector TYPE a)
 {
-  return vec_extract (a, 4);
+  return (RTYPE) vec_extract (a, 4);
 }
 
-TYPE
+RTYPE
 get_auto_5 (vector TYPE a)
 {
-  return vec_extract (a, 5);
+  return (RTYPE) vec_extract (a, 5);
 }
 
-TYPE
+RTYPE
 get_auto_6 (vector TYPE a)
 {
-  return vec_extract (a, 6);
+  return (RTYPE) vec_extract (a, 6);
 }
 
-TYPE
+RTYPE
 get_auto_7 (vector TYPE a)
 {
-  return vec_extract (a, 7);
+  return (RTYPE) vec_extract (a, 7);
 }
 
 #if ELEMENTS >= 16
-TYPE
+RTYPE
 get_auto_8 (vector TYPE a)
 {
-  return vec_extract (a, 8);
+  return (RTYPE) vec_extract (a, 8);
 }
 
-TYPE
+RTYPE
 get_auto_9 (vector TYPE a)
 {
-  return vec_extract (a, 9);
+  return (RTYPE) vec_extract (a, 9);
 }
 
-TYPE
+RTYPE
 get_auto_10 (vector TYPE a)
 {
-  return vec_extract (a, 10);
+  return (RTYPE) vec_extract (a, 10);
 }
 
-TYPE
+RTYPE
 get_auto_11 (vector TYPE a)
 {
-  return vec_extract (a, 11);
+  return (RTYPE) vec_extract (a, 11);
 }
 
-TYPE
+RTYPE
 get_auto_12 (vector TYPE a)
 {
-  return vec_extract (a, 12);
+  return (RTYPE) vec_extract (a, 12);
 }
 
-TYPE
+RTYPE
 get_auto_13 (vector TYPE a)
 {
-  return vec_extract (a, 13);
+  return (RTYPE) vec_extract (a, 13);
 }
 
-TYPE
+RTYPE
 get_auto_14 (vector TYPE a)
 {
-  return vec_extract (a, 14);
+  return (RTYPE) vec_extract (a, 14);
 }
 
-TYPE
+RTYPE
 get_auto_15 (vector TYPE a)
 {
-  return vec_extract (a, 15);
+  return (RTYPE) vec_extract (a, 15);
 }
 
 #endif
 #endif
 #endif
 
-typedef TYPE (*auto_func_type) (vector TYPE);
+typedef RTYPE (*auto_func_type) (vector TYPE);
 
 static auto_func_type get_auto_const[] = {
   get_auto_0,
@@ -173,7 +210,10 @@ do_auto (vector TYPE a)
   size_t i;
 
   for (i = 0; i < sizeof (get_auto_const) / sizeof (get_auto_const[0]); i++)
-    check (get_auto_n (a, i), (get_auto_const[i]) (a));
+    {
+      TRACE ("auto", i);
+      check (get_auto_n (a, i), (get_auto_const[i]) (a));
+    }
 }
 
 
@@ -182,115 +222,115 @@ do_auto (vector TYPE a)
    in the right position to use a scalar store). */
 
 void
-get_store_n (TYPE *p, vector TYPE a, ssize_t n)
+get_store_n (RTYPE *p, vector TYPE a, ssize_t n)
 {
-  *p = vec_extract (a, n);
+  *p = (RTYPE) vec_extract (a, n);
 }
 
 void
-get_store_0 (TYPE *p, vector TYPE a)
+get_store_0 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 0);
+  *p = (RTYPE) vec_extract (a, 0);
 }
 
 void
-get_store_1 (TYPE *p, vector TYPE a)
+get_store_1 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 1);
+  *p = (RTYPE) vec_extract (a, 1);
 }
 
 #if ELEMENTS >= 4
 void
-get_store_2 (TYPE *p, vector TYPE a)
+get_store_2 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 2);
+  *p = (RTYPE) vec_extract (a, 2);
 }
 
 void
-get_store_3 (TYPE *p, vector TYPE a)
+get_store_3 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 3);
+  *p = (RTYPE) vec_extract (a, 3);
 }
 
 #if ELEMENTS >= 8
 void
-get_store_4 (TYPE *p, vector TYPE a)
+get_store_4 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 4);
+  *p = (RTYPE) vec_extract (a, 4);
 }
 
 void
-get_store_5 (TYPE *p, vector TYPE a)
+get_store_5 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 5);
+  *p = (RTYPE) vec_extract (a, 5);
 }
 
 void
-get_store_6 (TYPE *p, vector TYPE a)
+get_store_6 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 6);
+  *p = (RTYPE) vec_extract (a, 6);
 }
 
 void
-get_store_7 (TYPE *p, vector TYPE a)
+get_store_7 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 7);
+  *p = (RTYPE) vec_extract (a, 7);
 }
 
 #if ELEMENTS >= 16
 void
-get_store_8 (TYPE *p, vector TYPE a)
+get_store_8 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 8);
+  *p = (RTYPE) vec_extract (a, 8);
 }
 
 void
-get_store_9 (TYPE *p, vector TYPE a)
+get_store_9 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 9);
+  *p = (RTYPE) vec_extract (a, 9);
 }
 
 void
-get_store_10 (TYPE *p, vector TYPE a)
+get_store_10 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 10);
+  *p = (RTYPE) vec_extract (a, 10);
 }
 
 void
-get_store_11 (TYPE *p, vector TYPE a)
+get_store_11 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 11);
+  *p = (RTYPE) vec_extract (a, 11);
 }
 
 void
-get_store_12 (TYPE *p, vector TYPE a)
+get_store_12 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 12);
+  *p = (RTYPE) vec_extract (a, 12);
 }
 
 void
-get_store_13 (TYPE *p, vector TYPE a)
+get_store_13 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 13);
+  *p = (RTYPE) vec_extract (a, 13);
 }
 
 void
-get_store_14 (TYPE *p, vector TYPE a)
+get_store_14 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 14);
+  *p = (RTYPE) vec_extract (a, 14);
 }
 
 void
-get_store_15 (TYPE *p, vector TYPE a)
+get_store_15 (RTYPE *p, vector TYPE a)
 {
-  *p = vec_extract (a, 15);
+  *p = (RTYPE) vec_extract (a, 15);
 }
 
 #endif
 #endif
 #endif
 
-typedef void (*store_func_type) (TYPE *, vector TYPE);
+typedef void (*store_func_type) (RTYPE *, vector TYPE);
 
 static store_func_type get_store_const[] = {
   get_store_0,
@@ -323,10 +363,11 @@ void
 do_store (vector TYPE a)
 {
   size_t i;
-  TYPE result_var, result_const;
+  RTYPE result_var, result_const;
 
   for (i = 0; i < sizeof (get_store_const) / sizeof (get_store_const[0]); i++)
     {
+      TRACE ("store", i);
       get_store_n (&result_var, a, i);
       (get_store_const[i]) (&result_const, a);
       check (result_var, result_const);
@@ -337,116 +378,116 @@ do_store (vector TYPE a)
 /* Tests for vec_extract where the vector comes from memory (the compiler can
    optimize this by doing a scalar load without having to load the whole
    vector). */
-TYPE
+RTYPE
 get_pointer_n (vector TYPE *p, ssize_t n)
 {
-  return vec_extract (*p, n);
+  return (RTYPE) vec_extract (*p, n);
 }
 
-TYPE
+RTYPE
 get_pointer_0 (vector TYPE *p)
 {
-  return vec_extract (*p, 0);
+  return (RTYPE) vec_extract (*p, 0);
 }
 
-TYPE
+RTYPE
 get_pointer_1 (vector TYPE *p)
 {
-  return vec_extract (*p, 1);
+  return (RTYPE) vec_extract (*p, 1);
 }
 
 #if ELEMENTS >= 4
-TYPE
+RTYPE
 get_pointer_2 (vector TYPE *p)
 {
-  return vec_extract (*p, 2);
+  return (RTYPE) vec_extract (*p, 2);
 }
 
-TYPE
+RTYPE
 get_pointer_3 (vector TYPE *p)
 {
-  return vec_extract (*p, 3);
+  return (RTYPE) vec_extract (*p, 3);
 }
 
 #if ELEMENTS >= 8
-TYPE
+RTYPE
 get_pointer_4 (vector TYPE *p)
 {
-  return vec_extract (*p, 4);
+  return (RTYPE) vec_extract (*p, 4);
 }
 
-static TYPE
+RTYPE
 get_pointer_5 (vector TYPE *p)
 {
-  return vec_extract (*p, 5);
+  return (RTYPE) vec_extract (*p, 5);
 }
 
-TYPE
+RTYPE
 get_pointer_6 (vector TYPE *p)
 {
-  return vec_extract (*p, 6);
+  return (RTYPE) vec_extract (*p, 6);
 }
 
-TYPE
+RTYPE
 get_pointer_7 (vector TYPE *p)
 {
-  return vec_extract (*p, 7);
+  return (RTYPE) vec_extract (*p, 7);
 }
 
 #if ELEMENTS >= 16
-TYPE
+RTYPE
 get_pointer_8 (vector TYPE *p)
 {
-  return vec_extract (*p, 8);
+  return (RTYPE) vec_extract (*p, 8);
 }
 
-TYPE
+RTYPE
 get_pointer_9 (vector TYPE *p)
 {
-  return vec_extract (*p, 9);
+  return (RTYPE) vec_extract (*p, 9);
 }
 
-TYPE
+RTYPE
 get_pointer_10 (vector TYPE *p)
 {
-  return vec_extract (*p, 10);
+  return (RTYPE) vec_extract (*p, 10);
 }
 
-TYPE
+RTYPE
 get_pointer_11 (vector TYPE *p)
 {
-  return vec_extract (*p, 11);
+  return (RTYPE) vec_extract (*p, 11);
 }
 
-TYPE
+RTYPE
 get_pointer_12 (vector TYPE *p)
 {
-  return vec_extract (*p, 12);
+  return (RTYPE) vec_extract (*p, 12);
 }
 
-TYPE
+RTYPE
 get_pointer_13 (vector TYPE *p)
 {
-  return vec_extract (*p, 13);
+  return (RTYPE) vec_extract (*p, 13);
 }
 
-TYPE
+RTYPE
 get_pointer_14 (vector TYPE *p)
 {
-  return vec_extract (*p, 14);
+  return (RTYPE) vec_extract (*p, 14);
 }
 
-TYPE
+RTYPE
 get_pointer_15 (vector TYPE *p)
 {
-  return vec_extract (*p, 15);
+  return (RTYPE) vec_extract (*p, 15);
 }
 
 #endif
 #endif
 #endif
 
-typedef TYPE (*pointer_func_type) (vector TYPE *);
+typedef RTYPE (*pointer_func_type) (vector TYPE *);
 
 static pointer_func_type get_pointer_const[] = {
   get_pointer_0,
@@ -481,7 +522,10 @@ do_pointer (vector TYPE *p)
   size_t i;
 
   for (i = 0; i < sizeof (get_pointer_const) / sizeof (get_pointer_const[0]); i++)
-    check (get_pointer_n (p, i), (get_pointer_const[i]) (p));
+    {
+      TRACE ("pointer", i);
+      check (get_pointer_n (p, i), (get_pointer_const[i]) (p));
+    }
 }
 
 
@@ -489,116 +533,116 @@ do_pointer (vector TYPE *p)
    operation. This is to make sure that if the compiler optimizes vec_extract
    from memory to be a scalar load, the address is correctly adjusted. */
 
-TYPE
+RTYPE
 get_indexed_n (vector TYPE *p, size_t x, ssize_t n)
 {
-  return vec_extract (p[x], n);
+  return (RTYPE) vec_extract (p[x], n);
 }
 
-TYPE
+RTYPE
 get_indexed_0 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 0);
+  return (RTYPE) vec_extract (p[x], 0);
 }
 
-TYPE
+RTYPE
 get_indexed_1 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 1);
+  return (RTYPE) vec_extract (p[x], 1);
 }
 
 #if ELEMENTS >= 4
-TYPE
+RTYPE
 get_indexed_2 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 2);
+  return (RTYPE) vec_extract (p[x], 2);
 }
 
-TYPE
+RTYPE
 get_indexed_3 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 3);
+  return (RTYPE) vec_extract (p[x], 3);
 }
 
 #if ELEMENTS >= 8
-TYPE
+RTYPE
 get_indexed_4 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 4);
+  return (RTYPE) vec_extract (p[x], 4);
 }
 
-static TYPE
+RTYPE
 get_indexed_5 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 5);
+  return (RTYPE) vec_extract (p[x], 5);
 }
 
-TYPE
+RTYPE
 get_indexed_6 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 6);
+  return (RTYPE) vec_extract (p[x], 6);
 }
 
-TYPE
+RTYPE
 get_indexed_7 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 7);
+  return (RTYPE) vec_extract (p[x], 7);
 }
 
 #if ELEMENTS >= 16
-TYPE
+RTYPE
 get_indexed_8 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 8);
+  return (RTYPE) vec_extract (p[x], 8);
 }

-TYPE
+RTYPE
 get_indexed_9 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 9);
+  return (RTYPE) vec_extract (p[x], 9);
 }

-TYPE
+RTYPE
 get_indexed_10 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 10);
+  return (RTYPE) vec_extract (p[x], 10);
 }

-TYPE
+RTYPE
 get_indexed_11 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 11);
+  return (RTYPE) vec_extract (p[x], 11);
 }

-TYPE
+RTYPE
 get_indexed_12 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 12);
+  return (RTYPE) vec_extract (p[x], 12);
 }

-TYPE
+RTYPE
 get_indexed_13 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 13);
+  return (RTYPE) vec_extract (p[x], 13);
 }

-TYPE
+RTYPE
 get_indexed_14 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 14);
+  return (RTYPE) vec_extract (p[x], 14);
 }

-TYPE
+RTYPE
 get_indexed_15 (vector TYPE *p, size_t x)
 {
-  return vec_extract (p[x], 15);
+  return (RTYPE) vec_extract (p[x], 15);
 }

 #endif
 #endif
 #endif

-typedef TYPE (*indexed_func_type) (vector TYPE *, size_t);
+typedef RTYPE (*indexed_func_type) (vector TYPE *, size_t);

 static indexed_func_type get_indexed_const[] = {
   get_indexed_0,
@@ -633,7 +677,10 @@ do_indexed (vector TYPE *p, size_t x)
   size_t i;

   for (i = 0; i < sizeof (get_indexed_const) / sizeof (get_indexed_const[0]); i++)
-    check (get_indexed_n (p, x, i), (get_indexed_const[i]) (p, x));
+    {
+      TRACE ("indexed", i);
+      check (get_indexed_n (p, x, i), (get_indexed_const[i]) (p, x));
+    }
 }


@@ -641,116 +688,116 @@ do_indexed (vector TYPE *p, size_t x)
    with a pointer and a constant offset.  This will occur in ISA 3.0 which
    added d-form memory addressing for vectors.  */

-TYPE
+RTYPE
 get_ptr_plus1_n (vector TYPE *p, ssize_t n)
 {
-  return vec_extract (p[1], n);
+  return (RTYPE) vec_extract (p[1], n);
 }

-TYPE
+RTYPE
 get_ptr_plus1_0 (vector TYPE *p)
 {
-  return vec_extract (p[1], 0);
+  return (RTYPE) vec_extract (p[1], 0);
 }

-TYPE
+RTYPE
 get_ptr_plus1_1 (vector TYPE *p)
 {
-  return vec_extract (p[1], 1);
+  return (RTYPE) vec_extract (p[1], 1);
 }

 #if ELEMENTS >= 4
-TYPE
+RTYPE
 get_ptr_plus1_2 (vector TYPE *p)
 {
-  return vec_extract (p[1], 2);
+  return (RTYPE) vec_extract (p[1], 2);
 }

-TYPE
+RTYPE
 get_ptr_plus1_3 (vector TYPE *p)
 {
-  return vec_extract (p[1], 3);
+  return (RTYPE) vec_extract (p[1], 3);
 }

 #if ELEMENTS >= 8
-TYPE
+RTYPE
 get_ptr_plus1_4 (vector TYPE *p)
 {
-  return vec_extract (p[1], 4);
+  return (RTYPE) vec_extract (p[1], 4);
 }

-static TYPE
+RTYPE
 get_ptr_plus1_5 (vector TYPE *p)
 {
-  return vec_extract (p[1], 5);
+  return (RTYPE) vec_extract (p[1], 5);
 }

-TYPE
+RTYPE
 get_ptr_plus1_6 (vector TYPE *p)
 {
-  return vec_extract (p[1], 6);
+  return (RTYPE) vec_extract (p[1], 6);
 }

-TYPE
+RTYPE
 get_ptr_plus1_7 (vector TYPE *p)
 {
-  return vec_extract (p[1], 7);
+  return (RTYPE) vec_extract (p[1], 7);
 }

 #if ELEMENTS >= 16
-TYPE
+RTYPE
 get_ptr_plus1_8 (vector TYPE *p)
 {
-  return vec_extract (p[1], 8);
+  return (RTYPE) vec_extract (p[1], 8);
 }

-TYPE
+RTYPE
 get_ptr_plus1_9 (vector TYPE *p)
 {
-  return vec_extract (p[1], 9);
+  return (RTYPE) vec_extract (p[1], 9);
 }

-TYPE
+RTYPE
 get_ptr_plus1_10 (vector TYPE *p)
 {
-  return vec_extract (p[1], 10);
+  return (RTYPE) vec_extract (p[1], 10);
 }

-TYPE
+RTYPE
 get_ptr_plus1_11 (vector TYPE *p)
 {
-  return vec_extract (p[1], 11);
+  return (RTYPE) vec_extract (p[1], 11);
 }

-TYPE
+RTYPE
 get_ptr_plus1_12 (vector TYPE *p)
 {
-  return vec_extract (p[1], 12);
+  return (RTYPE) vec_extract (p[1], 12);
 }

-TYPE
+RTYPE
 get_ptr_plus1_13 (vector TYPE *p)
 {
-  return vec_extract (p[1], 13);
+  return (RTYPE) vec_extract (p[1], 13);
 }

-TYPE
+RTYPE
 get_ptr_plus1_14 (vector TYPE *p)
 {
-  return vec_extract (p[1], 14);
+  return (RTYPE) vec_extract (p[1], 14);
 }

-TYPE
+RTYPE
 get_ptr_plus1_15 (vector TYPE *p)
 {
-  return vec_extract (p[1], 15);
+  return (RTYPE) vec_extract (p[1], 15);
 }

 #endif
 #endif
 #endif

-typedef TYPE (*pointer_func_type) (vector TYPE *);
+typedef RTYPE (*pointer_func_type) (vector TYPE *);

 static pointer_func_type get_ptr_plus1_const[] = {
   get_ptr_plus1_0,
@@ -785,7 +832,10 @@ do_ptr_plus1 (vector TYPE *p)
   size_t i;

   for (i = 0; i < sizeof (get_ptr_plus1_const) / sizeof (get_ptr_plus1_const[0]); i++)
-    check (get_ptr_plus1_n (p, i), (get_ptr_plus1_const[i]) (p));
+    {
+      TRACE ("ptr_plus1", i);
+      check (get_ptr_plus1_n (p, i), (get_ptr_plus1_const[i]) (p));
+    }
 }


@@ -793,116 +843,116 @@ do_ptr_plus1 (vector TYPE *p)

 static vector TYPE s;

-TYPE
+RTYPE
 get_static_n (ssize_t n)
 {
-  return vec_extract (s, n);
+  return (RTYPE) vec_extract (s, n);
 }

-TYPE
+RTYPE
 get_static_0 (void)
 {
-  return vec_extract (s, 0);
+  return (RTYPE) vec_extract (s, 0);
 }

-TYPE
+RTYPE
 get_static_1 (void)
 {
-  return vec_extract (s, 1);
+  return (RTYPE) vec_extract (s, 1);
 }

 #if ELEMENTS >= 4
-TYPE
+RTYPE
 get_static_2 (void)
 {
-  return vec_extract (s, 2);
+  return (RTYPE) vec_extract (s, 2);
 }

-TYPE
+RTYPE
 get_static_3 (void)
 {
-  return vec_extract (s, 3);
+  return (RTYPE) vec_extract (s, 3);
 }

 #if ELEMENTS >= 8
-TYPE
+RTYPE
 get_static_4 (void)
 {
-  return vec_extract (s, 4);
+  return (RTYPE) vec_extract (s, 4);
 }

-TYPE
+RTYPE
 get_static_5 (void)
 {
-  return vec_extract (s, 5);
+  return (RTYPE) vec_extract (s, 5);
 }

-TYPE
+RTYPE
 get_static_6 (void)
 {
-  return vec_extract (s, 6);
+  return (RTYPE) vec_extract (s, 6);
 }

-TYPE
+RTYPE
 get_static_7 (void)
 {
-  return vec_extract (s, 7);
+  return (RTYPE) vec_extract (s, 7);
 }

 #if ELEMENTS >= 16
-TYPE
+RTYPE
 get_static_8 (void)
 {
-  return vec_extract (s, 8);
+  return (RTYPE) vec_extract (s, 8);
 }

-TYPE
+RTYPE
 get_static_9 (void)
 {
-  return vec_extract (s, 9);
+  return (RTYPE) vec_extract (s, 9);
 }

-TYPE
+RTYPE
 get_static_10 (void)
 {
-  return vec_extract (s, 10);
+  return (RTYPE) vec_extract (s, 10);
 }

-TYPE
+RTYPE
 get_static_11 (void)
 {
-  return vec_extract (s, 11);
+  return (RTYPE) vec_extract (s, 11);
 }

-TYPE
+RTYPE
 get_static_12 (void)
 {
-  return vec_extract (s, 12);
+  return (RTYPE) vec_extract (s, 12);
 }

-TYPE
+RTYPE
 get_static_13 (void)
 {
-  return vec_extract (s, 13);
+  return (RTYPE) vec_extract (s, 13);
 }

-TYPE
+RTYPE
 get_static_14 (void)
 {
-  return vec_extract (s, 14);
+  return (RTYPE) vec_extract (s, 14);
 }

-TYPE
+RTYPE
 get_static_15 (void)
 {
-  return vec_extract (s, 15);
+  return (RTYPE) vec_extract (s, 15);
 }

 #endif
 #endif
 #endif

-typedef TYPE (*static_func_type) (void);
+typedef RTYPE (*static_func_type) (void);

 static static_func_type get_static_const[] = {
   get_static_0,
@@ -937,7 +987,10 @@ do_static (void)
   size_t i;

   for (i = 0; i < sizeof (get_static_const) / sizeof (get_static_const[0]); i++)
-    check (get_static_n (i), (get_static_const[i]) ());
+    {
+      TRACE ("static", i);
+      check (get_static_n (i), (get_static_const[i]) ());
+    }
 }


@@ -945,116 +998,116 @@ do_static (void)

 vector TYPE g;

-TYPE
+RTYPE
 get_global_n (ssize_t n)
 {
-  return vec_extract (g, n);
+  return (RTYPE) vec_extract (g, n);
 }

-TYPE
+RTYPE
 get_global_0 (void)
 {
-  return vec_extract (g, 0);
+  return (RTYPE) vec_extract (g, 0);
 }

-TYPE
+RTYPE
 get_global_1 (void)
 {
-  return vec_extract (g, 1);
+  return (RTYPE) vec_extract (g, 1);
 }

 #if ELEMENTS >= 4
-TYPE
+RTYPE
 get_global_2 (void)
 {
-  return vec_extract (g, 2);
+  return (RTYPE) vec_extract (g, 2);
 }

-TYPE
+RTYPE
 get_global_3 (void)
 {
-  return vec_extract (g, 3);
+  return (RTYPE) vec_extract (g, 3);
 }

 #if ELEMENTS >= 8
-TYPE
+RTYPE
 get_global_4 (void)
 {
-  return vec_extract (g, 4);
+  return (RTYPE) vec_extract (g, 4);
 }

-TYPE
+RTYPE
 get_global_5 (void)
 {
-  return vec_extract (g, 5);
+  return (RTYPE) vec_extract (g, 5);
 }

-TYPE
+RTYPE
 get_global_6 (void)
 {
-  return vec_extract (g, 6);
+  return (RTYPE) vec_extract (g, 6);
 }

-TYPE
+RTYPE
 get_global_7 (void)
 {
-  return vec_extract (g, 7);
+  return (RTYPE) vec_extract (g, 7);
 }

 #if ELEMENTS >= 16
-TYPE
+RTYPE
 get_global_8 (void)
 {
-  return vec_extract (g, 8);
+  return (RTYPE) vec_extract (g, 8);
 }

-TYPE
+RTYPE
 get_global_9 (void)
 {
-  return vec_extract (g, 9);
+  return (RTYPE) vec_extract (g, 9);
 }

-TYPE
+RTYPE
 get_global_10 (void)
 {
-  return vec_extract (g, 10);
+  return (RTYPE) vec_extract (g, 10);
 }

-TYPE
+RTYPE
 get_global_11 (void)
 {
-  return vec_extract (g, 11);
+  return (RTYPE) vec_extract (g, 11);
 }

-TYPE
+RTYPE
 get_global_12 (void)
 {
-  return vec_extract (g, 12);
+  return (RTYPE) vec_extract (g, 12);
 }

-TYPE
+RTYPE
 get_global_13 (void)
 {
-  return vec_extract (g, 13);
+  return (RTYPE) vec_extract (g, 13);
 }

-TYPE
+RTYPE
 get_global_14 (void)
 {
-  return vec_extract (g, 14);
+  return (RTYPE) vec_extract (g, 14);
 }

-TYPE
+RTYPE
 get_global_15 (void)
 {
-  return vec_extract (g, 15);
+  return (RTYPE) vec_extract (g, 15);
 }

 #endif
 #endif
 #endif

-typedef TYPE (*global_func_type) (void);
+typedef RTYPE (*global_func_type) (void);

 static global_func_type get_global_const[] = {
   get_global_0,
@@ -1089,7 +1142,10 @@ do_global (void)
   size_t i;

   for (i = 0; i < sizeof (get_global_const) / sizeof (get_global_const[0]); i++)
-    check (get_global_n (i), (get_global_const[i]) ());
+    {
+      TRACE ("global", i);
+      check (get_global_n (i), (get_global_const[i]) ());
+    }
 }


Index: gcc/testsuite/gcc.target/powerpc/vec-extract-v2df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-v2df.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 243590)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-v2df.c (.../gcc/testsuite/gcc.target/powerpc) (working copy)
@@ -3,6 +3,8 @@
 /* { dg-options "-O2 -mvsx" } */

 #define TYPE double
+#define FAIL_FORMAT "%g"
+#define FAIL_CAST(X) ((double)(X))
 #define ELEMENTS 2
 #define INITIAL { 10.0, -20.0 }

Index: gcc/testsuite/gcc.target/powerpc/vec-extract-v4sf.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-v4sf.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 243590)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-v4sf.c (.../gcc/testsuite/gcc.target/powerpc) (working copy)
@@ -3,6 +3,8 @@
 /* { dg-options "-O2 -mvsx" } */

 #define TYPE float
+#define FAIL_FORMAT "%g"
+#define FAIL_CAST(X) ((double)(X))
 #define ELEMENTS 4
 #define INITIAL { 10.0f, -20.0f, 30.0f, -40.0f }

Index: gcc/testsuite/gcc.target/powerpc/vec-extract-v4si-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-v4si-df.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-v4si-df.c (.../gcc/testsuite/gcc.target/powerpc) (revision 243608)
@@ -0,0 +1,12 @@
+/* { dg-do run { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2 -mvsx" } */
+
+#define TYPE int
+#define RTYPE double
+#define FAIL_FORMAT "%g"
+#define FAIL_CAST(X) ((double)(X))
+#define ELEMENTS 4
+#define INITIAL { 10, -20, 30, -40 }
+
+#include "vec-extract.h"
Index: gcc/testsuite/gcc.target/powerpc/vec-extract-v4siu-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-v4siu-df.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-v4siu-df.c (.../gcc/testsuite/gcc.target/powerpc) (revision 243608)
@@ -0,0 +1,12 @@
+/* { dg-do run { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2 -mvsx" } */
+
+#define TYPE unsigned int
+#define RTYPE double
+#define FAIL_FORMAT "%g"
+#define FAIL_CAST(X) ((double)(X))
+#define ELEMENTS 4
+#define INITIAL { 1, 2, 0xff03, 0xff04 }
+
+#include "vec-extract.h"
Index: gcc/testsuite/gcc.target/powerpc/vec-extract-v16qiu-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-v16qiu-df.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-v16qiu-df.c (.../gcc/testsuite/gcc.target/powerpc) (revision 243608)
@@ -0,0 +1,13 @@
+/* { dg-do run { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2 -mvsx" } */
+
+#define TYPE unsigned char
+#define RTYPE double
+#define FAIL_FORMAT "%g"
+#define FAIL_CAST(X) ((double)(X))
+#define ELEMENTS 16
+#define INITIAL \
+  { 1, 2, 3, 4, 5, 6, 7, 8, 240, 241, 242, 243, 244, 245, 246, 247 }
+
+#include "vec-extract.h"
Index: gcc/testsuite/gcc.target/powerpc/vec-extract-v16qi-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-v16qi-df.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-v16qi-df.c (.../gcc/testsuite/gcc.target/powerpc) (revision 243608)
@@ -0,0 +1,14 @@
+/* { dg-do run { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2 -mvsx" } */
+
+#define TYPE signed char
+#define RTYPE double
+#define FAIL_FORMAT "%g"
+#define FAIL_CAST(X) ((double)(X))
+#define ELEMENTS 16
+#define INITIAL \
+  { 10, -20, 30, -40, 50, -60, 70, -80, \
+    90, -100, 110, -120, 30, -40, 50, -60 }
+
+#include "vec-extract.h"
Index: gcc/testsuite/gcc.target/powerpc/vec-extract-v8hiu-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-v8hiu-df.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-v8hiu-df.c (.../gcc/testsuite/gcc.target/powerpc) (revision 243608)
@@ -0,0 +1,12 @@
+/* { dg-do run { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2 -mvsx" } */
+
+#define TYPE unsigned short
+#define RTYPE double
+#define FAIL_FORMAT "%g"
+#define FAIL_CAST(X) ((double)(X))
+#define ELEMENTS 8
+#define INITIAL { 1, 2, 3, 4, 0xf1, 0xf2, 0xf3, 0xf4 }
+
+#include "vec-extract.h"
Index: gcc/testsuite/gcc.target/powerpc/vec-extract-v8hi-df.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-extract-v8hi-df.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-extract-v8hi-df.c (.../gcc/testsuite/gcc.target/powerpc) (revision 243608)
@@ -0,0 +1,12 @@
+/* { dg-do run { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2 -mvsx" } */
+
+#define TYPE short
+#define RTYPE double
+#define FAIL_FORMAT "%g"
+#define FAIL_CAST(X) ((double)(X))
+#define ELEMENTS 8
+#define INITIAL { 10, -20, 30, -40, 50, -60, 70, 80 }
+
+#include "vec-extract.h"
Index: gcc/testsuite/gcc.target/powerpc/p9-extract-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-extract-1.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 243590)
+++ gcc/testsuite/gcc.target/powerpc/p9-extract-1.c (.../gcc/testsuite/gcc.target/powerpc) (working copy)
@@ -3,24 +3,107 @@
 /* { dg-require-effective-target powerpc_p9vector_ok } */
 /* { dg-options "-mcpu=power9 -O2" } */

+/* Test to make sure VEXTU{B,H,W}{L,R}X is generated for various vector extract
+   operations for ISA 3.0 (-mcpu=power9).  In addition, make sure that neither
+   of the old methods of doing vector extracts are done either by
+   explicit stores to the stack or by using direct move instructions.  */
+
 #include <altivec.h>

-int extract_int_0 (vector int a) { return vec_extract (a, 0); }
-int extract_int_3 (vector int a) { return vec_extract (a, 3); }
+int
+extract_int_0 (vector int a)
+{
+  int b = vec_extract (a, 0);
+  return b;
+}
+
+int
+extract_int_3 (vector int a)
+{
+  int b = vec_extract (a, 3);
+  return b;
+}
+
+unsigned int
+extract_uint_0 (vector unsigned int a)
+{
+  unsigned int b = vec_extract (a, 0);
+  return b;
+}
+
+unsigned int
+extract_uint_3 (vector unsigned int a)
+{
+  unsigned int b = vec_extract (a, 3);
+  return b;
+}
+
+short
+extract_short_0 (vector short a)
+{
+  short b = vec_extract (a, 0);
+  return b;
+}
+
+short
+extract_short_7 (vector short a)
+{
+  short b = vec_extract (a, 7);
+  return b;
+}
+
+unsigned short
+extract_ushort_0 (vector unsigned short a)
+{
+  unsigned short b = vec_extract (a, 0);
+  return b;
+}
+
+unsigned short
+extract_ushort_7 (vector unsigned short a)
+{
+  unsigned short b = vec_extract (a, 7);
+  return b;
+}
+
+signed char
+extract_schar_0 (vector signed char a)
+{
+  signed char b = vec_extract (a, 0);
+  return b;
+}
+
+signed char
+extract_schar_15 (vector signed char a)
+{
+  signed char b = vec_extract (a, 15);
+  return b;
+}

-int extract_short_0 (vector short a) { return vec_extract (a, 0); }
-int extract_short_3 (vector short a) { return vec_extract (a, 7); }
+unsigned char
+extract_uchar_0 (vector unsigned char a)
+{
+  unsigned char b = vec_extract (a, 0);
+  return b;
+}

-int extract_schar_0 (vector signed char a) { return vec_extract (a, 0); }
-int extract_schar_3 (vector signed char a) { return vec_extract (a, 15); }
+unsigned char
+extract_uchar_15 (vector unsigned char a)
+{
+  signed char b = vec_extract (a, 15);
+  return b;
+}

-/* { dg-final { scan-assembler "vextractub" } } */
-/* { dg-final { scan-assembler "vextractuh" } } */
-/* { dg-final { scan-assembler "xxextractuw" } } */
-/* { dg-final { scan-assembler "mfvsr" } } */
-/* { dg-final { scan-assembler-not "stxvd2x" } } */
-/* { dg-final { scan-assembler-not "stxv" } } */
-/* { dg-final { scan-assembler-not "lwa" } } */
-/* { dg-final { scan-assembler-not "lwz" } } */
-/* { dg-final { scan-assembler-not "lha" } } */
-/* { dg-final { scan-assembler-not "lhz" } } */
+/* { dg-final { scan-assembler "vextub\[lr\]x " } } */
+/* { dg-final { scan-assembler "vextuh\[lr\]x " } } */
+/* { dg-final { scan-assembler "vextuw\[lr\]x " } } */
+/* { dg-final { scan-assembler "extsb " } } */
+/* { dg-final { scan-assembler "extsh " } } */
+/* { dg-final { scan-assembler "extsw " } } */
+/* { dg-final { scan-assembler-not "m\[ft\]vsr" } } */
+/* { dg-final { scan-assembler-not "stxvd2x " } } */
+/* { dg-final { scan-assembler-not "stxv " } } */
+/* { dg-final { scan-assembler-not "lwa " } } */
+/* { dg-final { scan-assembler-not "lwz " } } */
+/* { dg-final { scan-assembler-not "lha " } } */
+/* { dg-final { scan-assembler-not "lhz " } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-extract-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-extract-3.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-extract-3.c (.../gcc/testsuite/gcc.target/powerpc) (revision 243608)
@@ -0,0 +1,108 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+/* Test that under ISA 3.0 (-mcpu=power9), the compiler optimizes conversion to
+   double after a vec_extract to use the VEXTRACTU{B,H} or XXEXTRACTUW
+   instructions (which leaves the result in a vector register), and not the
+   VEXTU{B,H,W}{L,R}X instructions (which needs a direct move to do the floating
+   point conversion).  */
+
+#include <altivec.h>
+
+double
+fpcvt_int_0 (vector int a)
+{
+  int b = vec_extract (a, 0);
+  return (double)b;
+}
+
+double
+fpcvt_int_3 (vector int a)
+{
+  int b = vec_extract (a, 3);
+  return (double)b;
+}
+
+double
+fpcvt_uint_0 (vector unsigned int a)
+{
+  unsigned int b = vec_extract (a, 0);
+  return (double)b;
+}
+
+double
+fpcvt_uint_3 (vector unsigned int a)
+{
+  unsigned int b = vec_extract (a, 3);
+  return (double)b;
+}
+
+double
+fpcvt_short_0 (vector short a)
+{
+  short b = vec_extract (a, 0);
+  return (double)b;
+}
+
+double
+fpcvt_short_7 (vector short a)
+{
+  short b = vec_extract (a, 7);
+  return (double)b;
+}
+
+double
+fpcvt_ushort_0 (vector unsigned short a)
+{
+  unsigned short b = vec_extract (a, 0);
+  return (double)b;
+}
+
+double
+fpcvt_ushort_7 (vector unsigned short a)
+{
+  unsigned short b = vec_extract (a, 7);
+  return (double)b;
+}
+
+double
+fpcvt_schar_0 (vector signed char a)
+{
+  signed char b = vec_extract (a, 0);
+  return (double)b;
+}
+
+double
+fpcvt_schar_15 (vector signed char a)
+{
+  signed char b = vec_extract (a, 15);
+  return (double)b;
+}
+
+double
+fpcvt_uchar_0 (vector unsigned char a)
+{
+  unsigned char b = vec_extract (a, 0);
+  return (double)b;
+}
+
+double
+fpcvt_uchar_15 (vector unsigned char a)
+{
+  signed char b = vec_extract (a, 15);
+  return (double)b;
+}
+
+/* { dg-final { scan-assembler "vextractu\[bh\] " } } */
+/* { dg-final { scan-assembler "vexts\[bh\]2d " } } */
+/* { dg-final { scan-assembler "vspltw " } } */
+/* { dg-final { scan-assembler "xscvsxddp " } } */
+/* { dg-final { scan-assembler "xvcvsxwdp " } } */
+/* { dg-final { scan-assembler "xvcvuxwdp " } } */
+/* { dg-final { scan-assembler-not "exts\[bhw\] " } } */
+/* { dg-final { scan-assembler-not "stxv" } } */
+/* { dg-final { scan-assembler-not "m\[ft\]vsrd " } } */
+/* { dg-final { scan-assembler-not "m\[ft\]vsrw\[az\] " } } */
+/* { dg-final { scan-assembler-not "l\[hw\]\[az\] " } } */