From patchwork Fri Jun 26 18:20:43 2020
X-Patchwork-Submitter: will schmidt
X-Patchwork-Id: 1317945
Message-ID: <1391c282d282cff26e3ad3bff82789020d33c2fe.camel@vnet.ibm.com>
Subject: [PATCH, rs6000] Add support to enable vmsumudm behind vec_msum builtin.
From: will schmidt
To: gcc-patches@gcc.gnu.org
Cc: David Edelsohn, Segher Boessenkool
Date: Fri, 26 Jun 2020 13:20:43 -0500

Hi,

Add support for the vmsumudm instruction and tie it into the vec_msum
built-in, to support the variants of that built-in that take
vector __int128 parameters.

vector __uint128_t vec_msum (vector unsigned long long,
			     vector unsigned long long,
			     vector __uint128_t);
vector __int128_t vec_msum (vector signed long long,
			    vector signed long long,
			    vector __int128_t);

Regtests currently running on assorted powerpc targets.

OK for trunk?

Thanks,
-Will

[gcc]

2020-06-18  Will Schmidt

	* config/rs6000/altivec.h (vec_vmsumudm): New define.
	* config/rs6000/altivec.md (UNSPEC_VMSUMUDM): New unspec.
	(altivec_vmsumudm): New define_insn.
	* config/rs6000/rs6000-builtin.def (altivec_vmsumudm): New
	BU_ALTIVEC_3 entry.
	(vmsumudm): New BU_ALTIVEC_OVERLOAD_3 entry.
	* config/rs6000/rs6000-call.c (altivec_overloaded_builtins): Add
	entries for ALTIVEC_BUILTIN_VMSUMUDM variants of vec_msum.

[testsuite]

2020-06-18  Will Schmidt

	* gcc.target/powerpc/builtins-msum-runnable.c: New test.
	* gcc.target/powerpc/vsx-builtin-msum.c: New test.

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index bb1524f..0d19939 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -159,10 +159,11 @@
 #define vec_vmsumuhm __builtin_vec_vmsumuhm
 #define vec_vmsummbm __builtin_vec_vmsummbm
 #define vec_vmsumubm __builtin_vec_vmsumubm
 #define vec_vmsumshs __builtin_vec_vmsumshs
 #define vec_vmsumuhs __builtin_vec_vmsumuhs
+#define vec_vmsumudm __builtin_vec_vmsumudm
 #define vec_vmulesb __builtin_vec_vmulesb
 #define vec_vmulesh __builtin_vec_vmulesh
 #define vec_vmuleuh __builtin_vec_vmuleuh
 #define vec_vmuleub __builtin_vec_vmuleub
 #define vec_vmulosh __builtin_vec_vmulosh
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 2ce9227..0481642 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -19,10 +19,11 @@
 ;; .
 
 (define_c_enum "unspec"
   [UNSPEC_VCMPBFP
    UNSPEC_VMSUMU
+   UNSPEC_VMSUMUDM
    UNSPEC_VMSUMM
    UNSPEC_VMSUMSHM
    UNSPEC_VMSUMUHS
    UNSPEC_VMSUMSHS
    UNSPEC_VMHADDSHS
@@ -970,10 +971,20 @@
                      UNSPEC_VMSUMU))]
   "TARGET_ALTIVEC"
   "vmsumum %0,%1,%2,%3"
   [(set_attr "type" "veccomplex")])
 
+(define_insn "altivec_vmsumudm"
+  [(set (match_operand:V1TI 0 "register_operand" "=v")
+        (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
+                      (match_operand:V2DI 2 "register_operand" "v")
+                      (match_operand:V1TI 3 "register_operand" "v")]
+                     UNSPEC_VMSUMUDM))]
+  "TARGET_P8_VECTOR"
+  "vmsumudm %0,%1,%2,%3"
+  [(set_attr "type" "veccomplex")])
+
 (define_insn "altivec_vmsummm"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
         (unspec:V4SI [(match_operand:VIshort 1 "register_operand" "v")
                       (match_operand:VIshort 2 "register_operand" "v")
                       (match_operand:V4SI 3 "register_operand" "v")]
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 363656e..ee0d787 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1140,10 +1140,11 @@ BU_ALTIVEC_3 (VMHADDSHS, "vmhaddshs", SAT, altivec_vmhaddshs)
 BU_ALTIVEC_3 (VMHRADDSHS, "vmhraddshs", SAT, altivec_vmhraddshs)
 BU_ALTIVEC_3 (VMLADDUHM, "vmladduhm", CONST, fmav8hi4)
 BU_ALTIVEC_3 (VMSUMUBM, "vmsumubm", CONST, altivec_vmsumubm)
 BU_ALTIVEC_3 (VMSUMMBM, "vmsummbm", CONST, altivec_vmsummbm)
 BU_ALTIVEC_3 (VMSUMUHM, "vmsumuhm", CONST, altivec_vmsumuhm)
+BU_ALTIVEC_3 (VMSUMUDM, "vmsumudm", CONST, altivec_vmsumudm)
 BU_ALTIVEC_3 (VMSUMSHM, "vmsumshm", CONST, altivec_vmsumshm)
 BU_ALTIVEC_3 (VMSUMUHS, "vmsumuhs", SAT, altivec_vmsumuhs)
 BU_ALTIVEC_3 (VMSUMSHS, "vmsumshs", SAT, altivec_vmsumshs)
 BU_ALTIVEC_3 (VNMSUBFP, "vnmsubfp", FP, nfmsv4sf4)
 BU_ALTIVEC_3 (VPERM_1TI, "vperm_1ti", CONST, altivec_vperm_v1ti)
@@ -1497,10 +1498,11 @@ BU_ALTIVEC_OVERLOAD_3 (SEL, "sel")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMMBM, "vmsummbm")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMSHM, "vmsumshm")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMSHS, "vmsumshs")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMUBM, "vmsumubm")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMUHM, "vmsumuhm")
+BU_ALTIVEC_OVERLOAD_3 (VMSUMUDM, "vmsumudm")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMUHS, "vmsumuhs")
 
 /* Altivec DST overloaded builtins.  */
 BU_ALTIVEC_OVERLOAD_D (DST, "dst")
 BU_ALTIVEC_OVERLOAD_D (DSTT, "dstt")
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 3a109fe..c91980d3 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -3086,10 +3086,16 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_V4SI, RS6000_BTI_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_V4SI },
   { ALTIVEC_BUILTIN_VEC_MSUM, ALTIVEC_BUILTIN_VMSUMUHM,
     RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V4SI },
   { ALTIVEC_BUILTIN_VEC_MSUM, ALTIVEC_BUILTIN_VMSUMSHM,
     RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V4SI },
+
+  { ALTIVEC_BUILTIN_VEC_MSUM, ALTIVEC_BUILTIN_VMSUMUDM,
+    RS6000_BTI_V1TI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V1TI },
+  { ALTIVEC_BUILTIN_VEC_MSUM, ALTIVEC_BUILTIN_VMSUMUDM,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V1TI },
+
   { ALTIVEC_BUILTIN_VEC_VMSUMSHM, ALTIVEC_BUILTIN_VMSUMSHM,
     RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V4SI },
   { ALTIVEC_BUILTIN_VEC_VMSUMUHM, ALTIVEC_BUILTIN_VMSUMUHM,
     RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V4SI },
   { ALTIVEC_BUILTIN_VEC_VMSUMMBM, ALTIVEC_BUILTIN_VMSUMMBM,
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 95f7192..d95974a 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -20207,10 +20207,17 @@ bool scalar_test_data_class (double source, const int condition);
 bool scalar_test_data_class (__ieee128 source, const int condition);
 
 bool scalar_test_neg (float source);
 bool scalar_test_neg (double source);
 bool scalar_test_neg (__ieee128 source);
+
+vector __uint128_t vec_msum (vector unsigned long long,
+                             vector unsigned long long,
+                             vector __uint128_t);
+vector __int128_t vec_msum (vector signed long long,
+                            vector signed long long,
+                            vector __int128_t);
 @end smallexample
 
 The @code{scalar_extract_exp} and @code{scalar_extract_sig}
 functions require a 64-bit environment supporting ISA 3.0 or later.
 The @code{scalar_extract_exp} and @code{scalar_extract_sig} built-in
@@ -20226,10 +20233,13 @@ When supplied with a 128-bit @code{source} argument, the
 treated similarly.  Note that the sign of the significand is not
 represented in the result returned from the
 @code{scalar_extract_sig} function.  Use the
 @code{scalar_test_neg} function to test the sign of its @code{double}
 argument.
+The @code{vec_msum} functions perform a vector multiply-sum, returning
+the result of arg1*arg2+arg3.  ISA 3.0 adds support for @code{vec_msum}
+returning a vector int128 result.
 
 The @code{scalar_insert_exp}
 functions require a 64-bit environment supporting ISA 3.0 or later.
 When supplied with a 64-bit first argument, the
 @code{scalar_insert_exp} built-in function returns a double-precision
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-msum-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-msum-runnable.c
new file mode 100644
index 0000000..0fa5c31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-msum-runnable.c
@@ -0,0 +1,74 @@
+/* { dg-do run { target { p9vector_hw } } } */
+/* { dg-options "-mdejagnu-cpu=power9 -O2" } */
+
+#include <altivec.h>
+
+#ifdef DEBUG
+#include <stdio.h>
+#endif
+
+void abort (void);
+
+int
+main()
+{
+  vector __uint128_t arg_uint128, result_uint128, expected_uint128;
+  vector __int128_t arg_int128, result_int128, expected_int128;
+
+  arg_uint128[0] = 0x1627384950617243;
+  arg_uint128[0] = arg_uint128[0] << 64;
+  arg_uint128[0] |= 0x9405182930415263;
+  expected_uint128[0] = 0x1627384950617243;
+  expected_uint128[0] = expected_uint128[0] << 64;
+  expected_uint128[0] |= 0xb6b07e42a570e5fe;
+  vector unsigned long long arg_vull2 = {0x12345678,0x44445555};
+  vector unsigned long long arg_vull3 = {0x6789abcd,0x66667777};
+  result_uint128 = vec_msum (arg_vull2, arg_vull3, arg_uint128);
+
+  if (result_uint128[0] != expected_uint128[0])
+    {
+#ifdef DEBUG
+      printf("result_uint128[0] doesn't match expected_u128[0]\n");
+      printf("arg_vull2 %llx %llx \n", arg_vull2[0], arg_vull2[1]);
+      printf("arg_vull3 %llx %llx \n", arg_vull3[0], arg_vull3[1]);
+      printf("arg_uint128[0] = %llx ", arg_uint128[0] >> 64);
+      printf(" %llx\n", arg_uint128[0] & 0xFFFFFFFFFFFFFFFF);
+
+      printf("result_uint128[0] = %llx ", result_uint128[0] >> 64);
+      printf(" %llx\n", result_uint128[0] & 0xFFFFFFFFFFFFFFFF);
+
+      printf("expected_uint128[0] = %llx ", expected_uint128[0] >> 64);
+      printf(" %llx\n", expected_uint128[0] & 0xFFFFFFFFFFFFFFFF);
+#else
+      abort();
+#endif
+    }
+
+  arg_int128[0] = 0x1627384950617283;
+  arg_int128[0] = arg_int128[0] << 64;
+  arg_int128[0] |= 0x9405182930415263;
+  expected_int128[0] = 0x1627384950617283;
+  expected_int128[0] = expected_int128[0] << 64;
+  expected_int128[0] |= 0xd99f35969c11cbfa;
+  vector signed long long arg_vll2 = { 0x567890ab, 0x1233456 };
+  vector signed long long arg_vll3 = { 0xcdef0123, 0x9873451 };
+  result_int128 = vec_msum (arg_vll2, arg_vll3, arg_int128);
+
+  if (result_int128[0] != expected_int128[0])
+    {
+#ifdef DEBUG
+      printf("result_int128[0] doesn't match expected128[0]\n");
+      printf("arg_int128[0] = %llx ", arg_int128[0] >> 64);
+      printf(" %llx\n", arg_int128[0] & 0xFFFFFFFFFFFFFFFF);
+
+      printf("result_int128[0] = %llx ", result_int128[0] >> 64);
+      printf(" %llx\n", result_int128[0] & 0xFFFFFFFFFFFFFFFF);
+
+      printf("expected_int128[0] = %llx ", expected_int128[0] >> 64);
+      printf(" %llx\n", expected_int128[0] & 0xFFFFFFFFFFFFFFFF);
+#else
+      abort();
+#endif
+    }
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-msum.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-msum.c
new file mode 100644
index 0000000..f2f7395
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-msum.c
@@ -0,0 +1,25 @@
+/* Verify that overloaded built-ins for vec_msum with __int128
+   inputs generate the proper code.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power8 -O3" } */
+
+#include <altivec.h>
+
+vector signed __int128
+test_msum_si (vector signed long long vsll_1, vector signed long long vsll_2,
+	      vector signed __int128 vsi128)
+{
+  return vec_msum (vsll_1, vsll_2, vsi128);
+}
+
+vector unsigned __int128
+test_msum_ui (vector unsigned long long vull_1, vector unsigned long long vull_2,
+	      vector unsigned __int128 vui128)
+{
+  return vec_msum (vull_1, vull_2, vui128);
+}
+
+/* { dg-final { scan-assembler-times "vmsumudm" 2 } } */
+
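
For anyone wanting to sanity-check the expected value of the unsigned case in
builtins-msum-runnable.c without Power9 hardware, the following is a minimal
scalar sketch of what that vec_msum variant is expected to compute. It assumes
that vmsumudm widens each doubleword product to 128 bits and adds both products
to the 128-bit addend, wrapping modulo 2^128. The names u128, model_msum_u128
and check_unsigned_case are hypothetical and not part of the patch, and
unsigned __int128 is a GCC/Clang extension available on 64-bit targets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model, not part of the patch.  */
typedef unsigned __int128 u128;

/* Assumed semantics of the unsigned vec_msum variant:
   a[0]*b[0] + a[1]*b[1] + c, with each product widened to 128 bits
   and the final sum wrapping modulo 2^128.  */
static u128
model_msum_u128 (const uint64_t a[2], const uint64_t b[2], u128 c)
{
  return (u128) a[0] * b[0] + (u128) a[1] * b[1] + c;
}

/* Replay the unsigned inputs from builtins-msum-runnable.c.  */
static void
check_unsigned_case (void)
{
  const uint64_t a[2] = { 0x12345678, 0x44445555 };
  const uint64_t b[2] = { 0x6789abcd, 0x66667777 };
  u128 c = ((u128) 0x1627384950617243ULL << 64) | 0x9405182930415263ULL;
  u128 r = model_msum_u128 (a, b, c);

  /* Same halves as expected_uint128[0] in the test.  */
  assert ((uint64_t) (r >> 64) == 0x1627384950617243ULL);
  assert ((uint64_t) r == 0xb6b07e42a570e5feULL);
}
```

Because both products go into a single 128-bit sum, the model does not depend
on which doubleword is element 0, so it is independent of endianness.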