From patchwork Mon Aug 7 13:18:30 2017
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 798634
Date: Mon, 7 Aug 2017 09:18:30 -0400
From: Michael Meissner
To: Segher Boessenkool
Cc: Michael Meissner, GCC Patches, David Edelsohn, Bill Schmidt
Subject: Re: [PATCH], PR target/81593, Optimize PowerPC vector sets coming from a vector extracts
References: <20170727232113.GA8723@ibm-tiger.the-meissners.org> <20170728210848.GC13471@gate.crashing.org> <20170802142855.GA11603@ibm-tiger.the-meissners.org> <20170803150141.GV13471@gate.crashing.org>
In-Reply-To: <20170803150141.GV13471@gate.crashing.org>
Message-Id: <20170807131830.GA753@ibm-tiger.the-meissners.org>

On Thu, Aug 03, 2017 at 10:01:41AM -0500, Segher Boessenkool wrote:
> Hi Mike,
>
> On Wed, Aug 02, 2017 at 10:28:55AM -0400, Michael Meissner wrote:
> > On Fri, Jul 28, 2017 at 04:08:50PM -0500, Segher Boessenkool wrote:
> > > I think calling this with the rtx elementN args makes this only more
> > > complicated (the function comment doesn't say what they are or what
> > > NULL means, btw).
>
> You didn't handle the first part of this as far as I see?  It's the
> big complicating issue here.
>
> > +   If ELEMENT1 is null, use the top 64-bit double word of ARG1.  If it is
> > +   non-NULL, it is a 0 or 1 constant that gives the vector element number to
> > +   use for extracting the 64-bit double word from ARG1.
> > +
> > +   If ELEMENT2 is null, use the top 64-bit double word of ARG2.  If it is
> > +   non-NULL, it is a 0 or 1 constant that gives the vector element number to
> > +   use for extracting the 64-bit double word from ARG2.
> > +
> > +   The element number is based on the user element ordering, set by the
> > +   endianess and by the -maltivec={le,be} options.  */
>
> ("endianness", two n's).
>
> I don't like using NULL as a magic value at all; it does not simplify
> this interface, it complicates it instead.
>
> Can you move the "which half is high" decision to the callers?

I rewrote the patch to eliminate the rs6000_output_xxpermdi function, and do
the calculation of the XXPERMDI mask in each of the vsx_concat_<mode>_{1,2,3}
insns.
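
As an illustration of the mask calculation described above, here is a minimal host-side sketch in C. It is not part of the patch. The names `vreg`, `xxpermdi`, `get_elt`, `concat_ref`, `concat_patch` and `count_mask_mismatches` are hypothetical helpers invented for this example. It assumes the usual XXPERMDI semantics (bit 1 of the mask selects the double word taken from the first source, bit 0 the double word taken from the second) and assumes that element 0 lives in the most significant double word on big endian and in the least significant one on little endian. The -maltivec=be element ordering is deliberately not modelled. The mask formulas are the ones used by the `_3` variant of the new patterns.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit VSX register: dw[0] is the most
   significant double word, dw[1] the least significant one.  */
typedef struct { uint64_t dw[2]; } vreg;

/* Model of "xxpermdi XT,XA,XB,DM": bit 1 of DM picks the double word
   taken from XA, bit 0 picks the double word taken from XB.  */
static vreg
xxpermdi (vreg a, vreg b, int dm)
{
  vreg t;
  t.dw[0] = a.dw[(dm >> 1) & 1];
  t.dw[1] = b.dw[dm & 1];
  return t;
}

/* Read element ELT of V (assumption: element 0 is dw[0] on big endian
   and dw[1] on little endian).  */
static uint64_t
get_elt (vreg v, int elt, int big_endian)
{
  return v.dw[big_endian ? elt : 1 - elt];
}

/* Reference result: element 0 is A[E1], element 1 is B[E2].  */
static vreg
concat_ref (vreg a, int e1, vreg b, int e2, int big_endian)
{
  vreg t;
  t.dw[big_endian ? 0 : 1] = get_elt (a, e1, big_endian);
  t.dw[big_endian ? 1 : 0] = get_elt (b, e2, big_endian);
  return t;
}

/* Same result via a single xxpermdi, using the mask formulas from the
   "_3" combiner pattern in the patch.  */
static vreg
concat_patch (vreg a, int e1, vreg b, int e2, int big_endian)
{
  if (big_endian)
    return xxpermdi (a, b, (2 * e1) + e2);
  else
    return xxpermdi (b, a, (2 * !e2) + !e1);
}

/* Count the (endianness, e1, e2) combinations where the two disagree.  */
int
count_mask_mismatches (void)
{
  const vreg a = { { 1, 2 } }, b = { { 3, 4 } };
  int bad = 0;

  for (int be = 0; be < 2; be++)
    for (int e1 = 0; e1 < 2; e1++)
      for (int e2 = 0; e2 < 2; e2++)
        {
          vreg r = concat_ref (a, e1, b, e2, be);
          vreg p = concat_patch (a, e1, b, e2, be);
          if (r.dw[0] != p.dw[0] || r.dw[1] != p.dw[1])
            bad++;
        }
  return bad;
}
```

Calling `count_mask_mismatches ()` from a small `main` covers all eight combinations of endianness and element numbers. The `_1` and `_2` variants follow the same scheme, with the scalar operand always taken from double word 0 of its register.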

Just to be sure I got things correct, I wrote a new executable test that tests
various methods of creating/inserting 2 element vectors with double word
elements, and tested in BE, LE -maltivec=be, and LE, and the results match
previous compilers.

I have done bootstrap/build checks on a big endian power7, a little endian
power8 system, and I have done a non-bootstrap/check on a power9 prototype (I
have script issues that prevent a bootstrap build on power9 that I need to
look into).  There are no regressions in the tests and the new tests were run
on each of the systems.  Can I check this into the trunk?

I would also like to backport it to all open branches (particularly GCC 7, but
GCC 6 if possible).  Note, the patch will need a slight tweak on the older
branches: GCC 7 still supports -mupper-regs-{df,di}, so I have to adjust the
constraints to accommodate this, and under GCC 6 DImode is not allowed in
traditional Altivec registers.

[gcc]
2017-08-07  Michael Meissner

	PR target/81593
	* config/rs6000/vsx.md (vsx_concat_<mode>, VSX_D): Cleanup
	constraints since the -mupper-regs-* switches have been
	eliminated.
	(vsx_concat_<mode>_1): New combiner insns to recognize inserting
	into a vector from a double word element that was extracted from
	another vector, and eliminate extra XXPERMDI instructions.
	(vsx_concat_<mode>_2): Likewise.
	(vsx_concat_<mode>_3): Likewise.
	(vsx_set_<mode>, VSX_D): Rewrite vector set in terms of vector
	concat to allow optimizing inserts from previous extracts.

[gcc/testsuite]
2017-08-07  Michael Meissner

	PR target/81593
	* gcc.target/powerpc/vec-setup.h: New tests to test various
	combinations of setting up vectors of 2 double word elements.
	* gcc.target/powerpc/vec-setup-long.c: Likewise.
	* gcc.target/powerpc/vec-setup-double.c: Likewise.
	* gcc.target/powerpc/vec-setup-be-long.c: Likewise.
	* gcc.target/powerpc/vec-setup-be-double.c: Likewise.
	* gcc.target/powerpc/vsx-extract-6.c: New tests for optimizing
	vector inserts from vector extracts.
	* gcc.target/powerpc/vsx-extract-7.c: Likewise.

Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 250858)
+++ gcc/config/rs6000/vsx.md (.../gcc/config/rs6000) (working copy)
@@ -2364,10 +2364,10 @@ (define_insn "*vsx_float_fix_v2df2"
 
 ;; Build a V2DF/V2DI vector from two scalars
 (define_insn "vsx_concat_<mode>"
-  [(set (match_operand:VSX_D 0 "gpc_reg_operand" "=<VSa>,we")
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa,we")
        (vec_concat:VSX_D
-         (match_operand:<VS_scalar> 1 "gpc_reg_operand" "<VS_64reg>,b")
-         (match_operand:<VS_scalar> 2 "gpc_reg_operand" "<VS_64reg>,b")))]
+         (match_operand:<VS_scalar> 1 "gpc_reg_operand" "wa,b")
+         (match_operand:<VS_scalar> 2 "gpc_reg_operand" "wa,b")))]
   "VECTOR_MEM_VSX_P (<mode>mode)"
 {
   if (which_alternative == 0)
@@ -2385,6 +2385,80 @@ (define_insn "vsx_concat_<mode>"
 }
   [(set_attr "type" "vecperm")])
 
+;; Combiner patterns to allow creating XXPERMDI's to access either double
+;; word element in a vector register.
+(define_insn "*vsx_concat_<mode>_1"
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa")
+        (vec_concat:VSX_D
+         (vec_select:<VS_scalar>
+          (match_operand:VSX_D 1 "gpc_reg_operand" "wa")
+          (parallel [(match_operand:QI 2 "const_0_to_1_operand" "n")]))
+         (match_operand:<VS_scalar> 3 "gpc_reg_operand" "wa")))]
+  "VECTOR_MEM_VSX_P (<mode>mode)"
+{
+  HOST_WIDE_INT dword = INTVAL (operands[2]);
+  if (BYTES_BIG_ENDIAN)
+    {
+      operands[4] = GEN_INT (2*dword);
+      return "xxpermdi %x0,%x1,%x3,%4";
+    }
+  else
+    {
+      operands[4] = GEN_INT (!dword);
+      return "xxpermdi %x0,%x3,%x1,%4";
+    }
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "*vsx_concat_<mode>_2"
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa")
+        (vec_concat:VSX_D
+         (match_operand:<VS_scalar> 1 "gpc_reg_operand" "wa")
+         (vec_select:<VS_scalar>
+          (match_operand:VSX_D 2 "gpc_reg_operand" "wa")
+          (parallel [(match_operand:QI 3 "const_0_to_1_operand" "n")]))))]
+  "VECTOR_MEM_VSX_P (<mode>mode)"
+{
+  HOST_WIDE_INT dword = INTVAL (operands[3]);
+  if (BYTES_BIG_ENDIAN)
+    {
+      operands[4] = GEN_INT (dword);
+      return "xxpermdi %x0,%x1,%x2,%4";
+    }
+  else
+    {
+      operands[4] = GEN_INT (2 * !dword);
+      return "xxpermdi %x0,%x2,%x1,%4";
+    }
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "*vsx_concat_<mode>_3"
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa")
+        (vec_concat:VSX_D
+         (vec_select:<VS_scalar>
+          (match_operand:VSX_D 1 "gpc_reg_operand" "wa")
+          (parallel [(match_operand:QI 2 "const_0_to_1_operand" "n")]))
+         (vec_select:<VS_scalar>
+          (match_operand:VSX_D 3 "gpc_reg_operand" "wa")
+          (parallel [(match_operand:QI 4 "const_0_to_1_operand" "n")]))))]
+  "VECTOR_MEM_VSX_P (<mode>mode)"
+{
+  HOST_WIDE_INT dword1 = INTVAL (operands[2]);
+  HOST_WIDE_INT dword2 = INTVAL (operands[4]);
+  if (BYTES_BIG_ENDIAN)
+    {
+      operands[5] = GEN_INT ((2 * dword1) + dword2);
+      return "xxpermdi %x0,%x1,%x3,%5";
+    }
+  else
+    {
+      operands[5] = GEN_INT ((2 * !dword2) + !dword1);
+      return "xxpermdi %x0,%x3,%x1,%5";
+    }
+}
+  [(set_attr "type" "vecperm")])
+
 ;; Special purpose concat using xxpermdi to glue two single precision values
 ;; together, relying on the fact that internally scalar floats are represented
 ;; as doubles.  This is used to initialize a V4SF vector with 4 floats
@@ -2585,25 +2659,35 @@ (define_expand "vsx_set_v1ti"
   DONE;
 })
 
-;; Set the element of a V2DI/VD2F mode
-(define_insn "vsx_set_<mode>"
-  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,?<VSa>")
-        (unspec:VSX_D
-         [(match_operand:VSX_D 1 "vsx_register_operand" "wd,<VSa>")
-          (match_operand:<VS_scalar> 2 "vsx_register_operand" "<VS_64reg>,<VSa>")
-          (match_operand:QI 3 "u5bit_cint_operand" "i,i")]
-         UNSPEC_VSX_SET))]
+;; Rewrite V2DF/V2DI set in terms of VEC_CONCAT
+(define_expand "vsx_set_<mode>"
+  [(use (match_operand:VSX_D 0 "vsx_register_operand"))
+   (use (match_operand:VSX_D 1 "vsx_register_operand"))
+   (use (match_operand:<VS_scalar> 2 "gpc_reg_operand"))
+   (use (match_operand:QI 3 "const_0_to_1_operand"))]
   "VECTOR_MEM_VSX_P (<mode>mode)"
 {
-  int idx_first = BYTES_BIG_ENDIAN ? 0 : 1;
-  if (INTVAL (operands[3]) == idx_first)
-    return \"xxpermdi %x0,%x2,%x1,1\";
-  else if (INTVAL (operands[3]) == 1 - idx_first)
-    return \"xxpermdi %x0,%x1,%x2,0\";
+  rtx dest = operands[0];
+  rtx vec_reg = operands[1];
+  rtx value = operands[2];
+  rtx ele = operands[3];
+  rtx tmp = gen_reg_rtx (<VS_scalar>mode);
+
+  if (ele == const0_rtx)
+    {
+      emit_insn (gen_vsx_extract_<mode> (tmp, vec_reg, const1_rtx));
+      emit_insn (gen_vsx_concat_<mode> (dest, value, tmp));
+      DONE;
+    }
+  else if (ele == const1_rtx)
+    {
+      emit_insn (gen_vsx_extract_<mode> (tmp, vec_reg, const0_rtx));
+      emit_insn (gen_vsx_concat_<mode> (dest, tmp, value));
+      DONE;
+    }
   else
     gcc_unreachable ();
-}
-  [(set_attr "type" "vecperm")])
+})
 
 ;; Extract a DF/DI element from V2DF/V2DI
 ;; Optimize cases were we can do a simple or direct move.
Index: gcc/testsuite/gcc.target/powerpc/vec-setup-be-long.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-setup-be-long.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-setup-be-long.c (.../gcc/testsuite/gcc.target/powerpc) (revision 250878)
@@ -0,0 +1,11 @@
+/* { dg-do run { target { powerpc64le*-*-linux* } } } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2 -mvsx -maltivec=be" } */
+
+/* Test various ways of creating vectors with 2 double words and accessing the
+   elements.  This test uses the long (on 64-bit systems) or long long datatype
+   (on 32-bit systems).
+
+   This test explicitly tests -maltivec=be to make sure things are correct.  */
+
+#include "vec-setup.h"
Index: gcc/testsuite/gcc.target/powerpc/vec-setup.h
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-setup.h (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-setup.h (.../gcc/testsuite/gcc.target/powerpc) (revision 250878)
@@ -0,0 +1,366 @@
+#include <altivec.h>
+
+/* Test various ways of creating vectors with 2 double words and accessing the
+   elements.  This include file supports:
+
+     testing double
+     testing long on 64-bit systems
+     testing long long on 32-bit systems.
+
+   The endian support is:
+
+     big endian
+     little endian with little endian element ordering
+     little endian with big endian element ordering.  */
+
+#ifdef DEBUG
+#include <stdio.h>
+#define DEBUG0(STR)      fputs (STR, stdout)
+#define DEBUG2(STR,A,B)  printf (STR, A, B)
+
+static int errors = 0;
+
+#else
+#include <stdlib.h>
+#define DEBUG0(STR)
+#define DEBUG2(STR,A,B)
+#endif
+
+#if defined(DO_DOUBLE)
+#define TYPE   double
+#define STYPE  "double"
+#define ZERO   0.0
+#define ONE    1.0
+#define TWO    2.0
+#define THREE  3.0
+#define FOUR   4.0
+#define FIVE   5.0
+#define SIX    6.0
+#define FMT    "g"
+
+#elif defined(_ARCH_PPC64)
+#define TYPE   long
+#define STYPE  "long"
+#define ZERO   0L
+#define ONE    1L
+#define TWO    2L
+#define THREE  3L
+#define FOUR   4L
+#define FIVE   5L
+#define SIX    6L
+#define FMT    "ld"
+
+#else
+#define TYPE   long long
+#define STYPE  "long long"
+#define ZERO   0LL
+#define ONE    1LL
+#define TWO    2LL
+#define THREE  3LL
+#define FOUR   4LL
+#define FIVE   5LL
+#define SIX    6LL
+#define FMT    "lld"
+#endif
+
+/* Macros to order the left/right values correctly.  Note, -maltivec=be does
+   not change the order for static initializations, so we have to handle it
+   specially.  */
+
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+#define INIT_ORDER(A, B)     (TYPE) A, (TYPE) B
+#define ELEMENT_ORDER(A, B)  (TYPE) A, (TYPE) B
+#define ENDIAN               "-mbig"
+
+#elif __VEC_ELEMENT_REG_ORDER__ == __ORDER_BIG_ENDIAN__
+#define NO_ARRAY
+#define INIT_ORDER(A, B)     (TYPE) B, (TYPE) A
+#define ELEMENT_ORDER(A, B)  (TYPE) A, (TYPE) B
+#define ENDIAN               "-mlittle -maltivec=be"
+
+#else
+#define INIT_ORDER(A, B)     (TYPE) B, (TYPE) A
+#define ELEMENT_ORDER(A, B)  (TYPE) B, (TYPE) A
+#define ENDIAN               "-mlittle"
+#endif
+
+static volatile TYPE five = FIVE;
+static volatile TYPE six = SIX;
+static volatile vector TYPE s_v12 = { ONE, TWO };
+static volatile vector TYPE g_v34 = { THREE, FOUR };
+
+
+__attribute__((__noinline__))
+static void
+vector_check (vector TYPE v, TYPE expect_hi, TYPE expect_lo)
+{
+  TYPE actual_hi, actual_lo;
+#ifdef DEBUG
+  const char *pass_fail;
+#endif
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&wa" (actual_hi) : "wa" (v));
+  __asm__ ("xxpermdi %x0,%x1,%x1,3" : "=&wa" (actual_lo) : "wa" (v));
+
+#ifdef DEBUG
+  if ((actual_hi == expect_hi) && (actual_lo == expect_lo))
+    pass_fail = ", pass";
+  else
+    {
+      pass_fail = ", fail";
+      errors++;
+    }
+
+  printf ("Expected %" FMT ", %" FMT ", got %" FMT ", %" FMT "%s\n",
+          expect_hi, expect_lo,
+          actual_hi, actual_lo,
+          pass_fail);
+#else
+  if ((actual_hi != expect_hi) || (actual_lo != expect_lo))
+    abort ();
+#endif
+}
+
+__attribute__((__noinline__))
+static vector TYPE
+combine (TYPE op0, TYPE op1)
+{
+  return (vector TYPE) { op0, op1 };
+}
+
+__attribute__((__noinline__))
+static vector TYPE
+combine_insert (TYPE op0, TYPE op1)
+{
+  vector TYPE ret = (vector TYPE) { ZERO, ZERO };
+  ret = vec_insert (op0, ret, 0);
+  ret = vec_insert (op1, ret, 1);
+  return ret;
+}
+
+__attribute__((__noinline__))
+static vector TYPE
+concat_extract_00 (vector TYPE a, vector TYPE b)
+{
+  return (vector TYPE) { vec_extract (a, 0), vec_extract (b, 0) };
+}
+
+__attribute__((__noinline__))
+static vector TYPE
+concat_extract_01 (vector TYPE a, vector TYPE b)
+{
+  return (vector TYPE) { vec_extract (a, 0), vec_extract (b, 1) };
+}
+
+__attribute__((__noinline__))
+static vector TYPE
+concat_extract_10 (vector TYPE a, vector TYPE b)
+{
+  return (vector TYPE) { vec_extract (a, 1), vec_extract (b, 0) };
+}
+
+__attribute__((__noinline__))
+static vector TYPE
+concat_extract_11 (vector TYPE a, vector TYPE b)
+{
+  return (vector TYPE) { vec_extract (a, 1), vec_extract (b, 1) };
+}
+
+__attribute__((__noinline__))
+static vector TYPE
+concat_extract2_0s (vector TYPE a, TYPE b)
+{
+  return (vector TYPE) { vec_extract (a, 0), b };
+}
+
+__attribute__((__noinline__))
+static vector TYPE
+concat_extract2_1s (vector TYPE a, TYPE b)
+{
+  return (vector TYPE) { vec_extract (a, 1), b };
+}
+
+__attribute__((__noinline__))
+static vector TYPE
+concat_extract2_s0 (TYPE a, vector TYPE b)
+{
+  return (vector TYPE) { a, vec_extract (b, 0) };
+}
+
+__attribute__((__noinline__))
+static vector TYPE
+concat_extract2_s1 (TYPE a, vector TYPE b)
+{
+  return (vector TYPE) { a, vec_extract (b, 1) };
+}
+
+__attribute__((__noinline__))
+static vector TYPE
+concat_extract_nn (vector TYPE a, vector TYPE b, size_t i, size_t j)
+{
+  return (vector TYPE) { vec_extract (a, i), vec_extract (b, j) };
+}
+
+#ifndef NO_ARRAY
+__attribute__((__noinline__))
+static vector TYPE
+array_0 (vector TYPE v, TYPE a)
+{
+  v[0] = a;
+  return v;
+}
+
+__attribute__((__noinline__))
+static vector TYPE
+array_1 (vector TYPE v, TYPE a)
+{
+  v[1] = a;
+  return v;
+}
+
+__attribute__((__noinline__))
+static vector TYPE
+array_01 (vector TYPE v, TYPE a, TYPE b)
+{
+  v[0] = a;
+  v[1] = b;
+  return v;
+}
+
+__attribute__((__noinline__))
+static vector TYPE
+array_01b (TYPE a, TYPE b)
+{
+  vector TYPE v = (vector TYPE) { 0, 0 };
+  v[0] = a;
+  v[1] = b;
+  return v;
+}
+#endif
+
+int
+main (void)
+{
+  vector TYPE a = (vector TYPE) { ONE, TWO };
+  vector TYPE b = (vector TYPE) { THREE, FOUR };
+  size_t i, j;
+
+#ifndef NO_ARRAY
+  vector TYPE z = (vector TYPE) { ZERO, ZERO };
+#endif
+
+  DEBUG2 ("Endian: %s, type: %s\n", ENDIAN, STYPE);
+  DEBUG0 ("\nStatic/global initialization\n");
+  vector_check (s_v12, INIT_ORDER (1, 2));
+  vector_check (g_v34, INIT_ORDER (3, 4));
+
+  DEBUG0 ("\nVector via constant runtime initialization\n");
+  vector_check (a, INIT_ORDER (1, 2));
+  vector_check (b, INIT_ORDER (3, 4));
+
+  DEBUG0 ("\nCombine scalars using vector initialization\n");
+  vector_check (combine (1, 2), INIT_ORDER (1, 2));
+  vector_check (combine (3, 4), INIT_ORDER (3, 4));
+
+  DEBUG0 ("\nSetup with vec_insert\n");
+  a = combine_insert (1, 2);
+  b = combine_insert (3, 4);
+  vector_check (a, ELEMENT_ORDER (1, 2));
+  vector_check (b, ELEMENT_ORDER (3, 4));
+
+#ifndef NO_ARRAY
+  DEBUG0 ("\nTesting array syntax\n");
+  vector_check (array_0 (a, FIVE), ELEMENT_ORDER (5, 2));
+  vector_check (array_1 (b, SIX), ELEMENT_ORDER (3, 6));
+  vector_check (array_01 (z, FIVE, SIX), ELEMENT_ORDER (5, 6));
+  vector_check (array_01b (FIVE, SIX), ELEMENT_ORDER (5, 6));
+
+  vector_check (array_0 (a, five), ELEMENT_ORDER (5, 2));
+  vector_check (array_1 (b, six), ELEMENT_ORDER (3, 6));
+  vector_check (array_01 (z, five, six), ELEMENT_ORDER (5, 6));
+  vector_check (array_01b (five, six), ELEMENT_ORDER (5, 6));
+#else
+  DEBUG0 ("\nSkipping array syntax on -maltivec=be\n");
+#endif
+
+  DEBUG0 ("\nTesting concat and extract\n");
+  vector_check (concat_extract_00 (a, b), INIT_ORDER (1, 3));
+  vector_check (concat_extract_01 (a, b), INIT_ORDER (1, 4));
+  vector_check (concat_extract_10 (a, b), INIT_ORDER (2, 3));
+  vector_check (concat_extract_11 (a, b), INIT_ORDER (2, 4));
+
+  DEBUG0 ("\nTesting concat and extract #2\n");
+  vector_check (concat_extract2_0s (a, FIVE), INIT_ORDER (1, 5));
+  vector_check (concat_extract2_1s (a, FIVE), INIT_ORDER (2, 5));
+  vector_check (concat_extract2_s0 (SIX, a), INIT_ORDER (6, 1));
+  vector_check (concat_extract2_s1 (SIX, a), INIT_ORDER (6, 2));
+
+  DEBUG0 ("\nTesting variable concat and extract\n");
+  for (i = 0; i < 2; i++)
+    {
+      for (j = 0; j < 2; j++)
+        {
+          static struct {
+            TYPE hi;
+            TYPE lo;
+          } hilo[2][2] =
+              { { { ONE, THREE }, { ONE, FOUR } },
+                { { TWO, THREE }, { TWO, FOUR } } };
+
+          vector_check (concat_extract_nn (a, b, i, j),
+                        INIT_ORDER (hilo[i][j].hi, hilo[i][j].lo));
+        }
+    }
+
+  DEBUG0 ("\nTesting separate function\n");
+  vector_check (combine (vec_extract (a, 0), vec_extract (b, 0)),
+                INIT_ORDER (1, 3));
+
+  vector_check (combine (vec_extract (a, 0), vec_extract (b, 1)),
+                INIT_ORDER (1, 4));
+
+  vector_check (combine (vec_extract (a, 1), vec_extract (b, 0)),
+                INIT_ORDER (2, 3));
+
+  vector_check (combine (vec_extract (a, 1), vec_extract (b, 1)),
+                INIT_ORDER (2, 4));
+
+  vector_check (combine_insert (vec_extract (a, 0), vec_extract (b, 0)),
+                ELEMENT_ORDER (1, 3));
+
+  vector_check (combine_insert (vec_extract (a, 0), vec_extract (b, 1)),
+                ELEMENT_ORDER (1, 4));
+
+  vector_check (combine_insert (vec_extract (a, 1), vec_extract (b, 0)),
+                ELEMENT_ORDER (2, 3));
+
+  vector_check (combine_insert (vec_extract (a, 1), vec_extract (b, 1)),
+                ELEMENT_ORDER (2, 4));
+
+
+#if defined(DO_DOUBLE)
+  DEBUG0 ("\nTesting explicit 2df concat\n");
+  vector_check (__builtin_vsx_concat_2df (FIVE, SIX), INIT_ORDER (5, 6));
+  vector_check (__builtin_vsx_concat_2df (five, six), INIT_ORDER (5, 6));
+
+#elif defined(_ARCH_PPC64)
+  DEBUG0 ("\nTesting explicit 2di concat\n");
+  vector_check (__builtin_vsx_concat_2di (FIVE, SIX), INIT_ORDER (5, 6));
+  vector_check (__builtin_vsx_concat_2di (five, six), INIT_ORDER (5, 6));
+
+#else
+  DEBUG0 ("\nSkip explicit 2di concat on 32-bit\n");
+#endif
+
+#ifdef DEBUG
+  if (errors)
+    printf ("\n%d error%s were found", errors, (errors == 1) ? "" : "s");
+  else
+    printf ("\nNo errors were found.\n");
+
+  return errors;
+
+#else
+  return 0;
+#endif
+}
Index: gcc/testsuite/gcc.target/powerpc/vec-setup-be-double.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-setup-be-double.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-setup-be-double.c (.../gcc/testsuite/gcc.target/powerpc) (revision 250878)
@@ -0,0 +1,12 @@
+/* { dg-do run { target { powerpc64le*-*-linux* } } } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2 -mvsx -maltivec=be" } */
+
+/* Test various ways of creating vectors with 2 double words and accessing the
+   elements.  This test uses the double datatype.
+
+   This test explicitly tests -maltivec=be to make sure things are correct.  */
+
+#define DO_DOUBLE
+
+#include "vec-setup.h"
Index: gcc/testsuite/gcc.target/powerpc/vec-setup-double.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-setup-double.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-setup-double.c (.../gcc/testsuite/gcc.target/powerpc) (revision 250878)
@@ -0,0 +1,11 @@
+/* { dg-do run { target { powerpc*-*-linux* } } } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2 -mvsx" } */
+
+/* Test various ways of creating vectors with 2 double words and accessing the
+   elements.  This test uses the double datatype and the default endian
+   order.  */
+
+#define DO_DOUBLE
+
+#include "vec-setup.h"
Index: gcc/testsuite/gcc.target/powerpc/vec-setup-long.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-setup-long.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-setup-long.c (.../gcc/testsuite/gcc.target/powerpc) (revision 250878)
@@ -0,0 +1,9 @@
+/* { dg-do run { target { powerpc*-*-linux* } } } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2 -mvsx" } */
+
+/* Test various ways of creating vectors with 2 double words and accessing the
+   elements.  This test uses the long (on 64-bit systems) or long long datatype
+   (on 32-bit systems).  The default endian order is used.  */
+
+#include "vec-setup.h"
Index: gcc/testsuite/gcc.target/powerpc/vsx-extract-6.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vsx-extract-6.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vsx-extract-6.c (.../gcc/testsuite/gcc.target/powerpc) (revision 250858)
@@ -0,0 +1,25 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+
+vector unsigned long
+test_vpasted (vector unsigned long high, vector unsigned long low)
+{
+  vector unsigned long res;
+  res[1] = high[1];
+  res[0] = low[0];
+  return res;
+}
+
+/* { dg-final { scan-assembler-times {\mxxpermdi\M} 1 } } */
+/* { dg-final { scan-assembler-not {\mvspltisw\M} } } */
+/* { dg-final { scan-assembler-not {\mxxlor\M} } } */
+/* { dg-final { scan-assembler-not {\mxxlxor\M} } } */
+/* { dg-final { scan-assembler-not {\mxxspltib\M} } } */
+/* { dg-final { scan-assembler-not {\mlxvx?\M} } } */
+/* { dg-final { scan-assembler-not {\mlxv[dw][24]x\M} } } */
+/* { dg-final { scan-assembler-not {\mlvx\M} } } */
+/* { dg-final { scan-assembler-not {\mstxvx?\M} } } */
+/* { dg-final { scan-assembler-not {\mstxv[dw][24]x\M} } } */
+/* { dg-final { scan-assembler-not {\mstvx\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/vsx-extract-7.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vsx-extract-7.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vsx-extract-7.c (.../gcc/testsuite/gcc.target/powerpc) (revision 250858)
@@ -0,0 +1,25 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+
+vector double
+test_vpasted (vector double high, vector double low)
+{
+  vector double res;
+  res[1] = high[1];
+  res[0] = low[0];
+  return res;
+}
+
+/* { dg-final { scan-assembler-times {\mxxpermdi\M} 1 } } */
+/* { dg-final { scan-assembler-not {\mvspltisw\M} } } */
+/* { dg-final { scan-assembler-not {\mxxlor\M} } } */
+/* { dg-final { scan-assembler-not {\mxxlxor\M} } } */
+/* { dg-final { scan-assembler-not {\mxxspltib\M} } } */
+/* { dg-final { scan-assembler-not {\mlxvx?\M} } } */
+/* { dg-final { scan-assembler-not {\mlxv[dw][24]x\M} } } */
+/* { dg-final { scan-assembler-not {\mlvx\M} } } */
+/* { dg-final { scan-assembler-not {\mstxvx?\M} } } */
+/* { dg-final { scan-assembler-not {\mstxv[dw][24]x\M} } } */
+/* { dg-final { scan-assembler-not {\mstvx\M} } } */
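
For readers without PowerPC hardware, the idea behind the vector set rewrite in the vsx.md part of the patch (extract the element that is kept, then concatenate it with the new value) can be tried on any host with the following sketch. It is not part of the patch. It uses the GCC/Clang generic vector extension instead of the Altivec `vector` keyword, and `v2di`, `concat`, `set_via_concat`, `set_direct` and `set_rewrite_agrees` are hypothetical names made up for this example.

```c
#include <assert.h>

/* Two 64-bit elements, using the GCC/Clang generic vector extension
   (an assumption of this sketch; the patch itself works on RTL).  */
typedef long long v2di __attribute__ ((vector_size (16)));

/* Stand-in for vec_concat: element 0 is E0, element 1 is E1.  */
static v2di
concat (long long e0, long long e1)
{
  v2di r = { e0, e1 };
  return r;
}

/* Set element ELE of V to VALUE the way the rewritten expander does:
   extract the element that is kept, then concatenate it with VALUE.  */
static v2di
set_via_concat (v2di v, long long value, int ele)
{
  assert (ele == 0 || ele == 1);
  if (ele == 0)
    return concat (value, v[1]);
  else
    return concat (v[0], value);
}

/* Reference: plain element assignment.  */
static v2di
set_direct (v2di v, long long value, int ele)
{
  v[ele] = value;
  return v;
}

/* Return 1 if both approaches agree for both element numbers.  */
int
set_rewrite_agrees (void)
{
  v2di v = { 1, 2 };

  for (int ele = 0; ele < 2; ele++)
    {
      v2di x = set_via_concat (v, 5, ele);
      v2di y = set_direct (v, 5, ele);
      if (x[0] != y[0] || x[1] != y[1])
        return 0;
    }
  return 1;
}
```

Expressing the set as an extract followed by a concat is what gives the combiner a chance to match the new `_1`, `_2` and `_3` patterns, which is the effect the vsx-extract-6.c and vsx-extract-7.c tests check for by expecting a single xxpermdi.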