From patchwork Wed Nov 17 08:28:38 2021
X-Patchwork-Submitter: HAO CHEN GUI
X-Patchwork-Id: 1556111
Message-ID: <5313f969-9120-ff25-f685-d1c11ebec764@linux.ibm.com>
Date: Wed, 17 Nov 2021 16:28:38 +0800
From: HAO CHEN GUI
To: gcc-patches
Cc: Bill Schmidt, David, Segher Boessenkool
Subject: [PATCH, rs6000] optimization for vec_reve builtin [PR100868]

Hi,

  This patch optimizes the vec_reve builtin on rs6000. For V2DI and V2DF,
it is implemented with xxswapd on all targets. For V16QI, V8HI, V4SI and
V4SF, when p9_vector is set, it is implemented with a quadword byte
reverse (xxbrq), followed by a halfword byte reverse (xxbrh) for V8HI or a
word byte reverse (xxbrw) for V4SI and V4SF. V16QI needs only the quadword
byte reverse.

  Bootstrapped and tested on powerpc64le-linux with no regressions. Is
this okay for trunk? Any recommendations? Thanks a lot.
ChangeLog

2021-11-17 Haochen Gui

gcc/
	* config/rs6000/altivec.md (altivec_vreve<mode>2 for VEC_K): Use
	xxbrq for v16qi, xxbrq + xxbrh for v8hi and xxbrq + xxbrw for v4si
	or v4sf when p9_vector is set.
	(altivec_vreve<mode>2 for VEC_64): Defined.  Implemented by xxswapd.

gcc/testsuite/
	* gcc.target/powerpc/vec_reve_1.c: New test.
	* gcc.target/powerpc/vec_reve_2.c: Likewise.

patch.diff

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 93d237156d5..480db032495 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -4029,12 +4029,43 @@ (define_expand "altivec_vreveti2"
   DONE;
 })
 
+;; Vector reverse elements for V16QI V8HI V4SI V4SF
 (define_expand "altivec_vreve<mode>2"
-  [(set (match_operand:VEC_A 0 "register_operand" "=v")
-        (unspec:VEC_A [(match_operand:VEC_A 1 "register_operand" "v")]
+  [(set (match_operand:VEC_K 0 "register_operand" "=v")
+        (unspec:VEC_K [(match_operand:VEC_K 1 "register_operand" "v")]
                       UNSPEC_VREVEV))]
   "TARGET_ALTIVEC"
 {
+  if (TARGET_P9_VECTOR)
+    {
+      if (<MODE>mode == V16QImode)
+        emit_insn (gen_p9_xxbrq_v16qi (operands[0], operands[1]));
+      else if (<MODE>mode == V8HImode)
+        {
+          rtx subreg1 = simplify_gen_subreg (V1TImode, operands[1],
+                                             <MODE>mode, 0);
+          rtx temp = gen_reg_rtx (V1TImode);
+          emit_insn (gen_p9_xxbrq_v1ti (temp, subreg1));
+          rtx subreg2 = simplify_gen_subreg (<MODE>mode, temp,
+                                             V1TImode, 0);
+          emit_insn (gen_p9_xxbrh_v8hi (operands[0], subreg2));
+        }
+      else /* V4SI and V4SF.  */
+        {
+          rtx subreg1 = simplify_gen_subreg (V1TImode, operands[1],
+                                             <MODE>mode, 0);
+          rtx temp = gen_reg_rtx (V1TImode);
+          emit_insn (gen_p9_xxbrq_v1ti (temp, subreg1));
+          rtx subreg2 = simplify_gen_subreg (<MODE>mode, temp,
+                                             V1TImode, 0);
+          if (<MODE>mode == V4SImode)
+            emit_insn (gen_p9_xxbrw_v4si (operands[0], subreg2));
+          else
+            emit_insn (gen_p9_xxbrw_v4sf (operands[0], subreg2));
+        }
+      DONE;
+    }
+
   int i, j, size, num_elements;
   rtvec v = rtvec_alloc (16);
   rtx mask = gen_reg_rtx (V16QImode);
@@ -4053,6 +4084,17 @@ (define_expand "altivec_vreve<mode>2"
   DONE;
 })
 
+;; Vector reverse elements for V2DI V2DF
+(define_expand "altivec_vreve<mode>2"
+  [(set (match_operand:VEC_64 0 "register_operand" "=v")
+        (unspec:VEC_64 [(match_operand:VEC_64 1 "register_operand" "v")]
+                       UNSPEC_VREVEV))]
+  "TARGET_ALTIVEC"
+{
+  emit_insn (gen_xxswapd_<mode> (operands[0], operands[1]));
+  DONE;
+})
+
 ;; Vector SIMD PEM v2.06c defines LVLX, LVLXL, LVRX, LVRXL,
 ;; STVLX, STVLXL, STVVRX, STVRXL are available only on Cell.
 (define_insn "altivec_lvlx"
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_reve_1.c b/gcc/testsuite/gcc.target/powerpc/vec_reve_1.c
new file mode 100644
index 00000000000..120c318ddfa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_reve_1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-O2 -maltivec" } */
+
+#include <altivec.h>
+
+vector double foo1 (vector double a)
+{
+  return vec_reve (a);
+}
+
+vector long long foo2 (vector long long a)
+{
+  return vec_reve (a);
+}
+
+/* { dg-final { scan-assembler-times {\mxxpermdi\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_reve_2.c b/gcc/testsuite/gcc.target/powerpc/vec_reve_2.c
new file mode 100644
index 00000000000..966193951c3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_reve_2.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power9 -O2" } */
+
+#include <altivec.h>
+
+vector int foo1 (vector int a)
+{
+  return vec_reve (a);
+}
+
+vector float foo2 (vector float a)
+{
+  return vec_reve (a);
+}
+
+vector short foo3 (vector short a)
+{
+  return vec_reve (a);
+}
+
+vector char foo4 (vector char a)
+{
+  return vec_reve (a);
+}
+
+/* { dg-final { scan-assembler-times {\mxxbrq\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mxxbrw\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxxbrh\M} 1 } } */
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 1351dafbc41..a1698ce85c0 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -4049,13 +4049,43 @@ (define_expand "altivec_negv4sf2"
   DONE;
 })
 
-;; Vector reverse elements
+;; Vector reverse elements for V16QI V8HI V4SI V4SF
 (define_expand "altivec_vreve<mode>2"
-  [(set (match_operand:VEC_A 0 "register_operand" "=v")
-        (unspec:VEC_A [(match_operand:VEC_A 1 "register_operand" "v")]
+  [(set (match_operand:VEC_K 0 "register_operand" "=v")
+        (unspec:VEC_K [(match_operand:VEC_K 1 "register_operand" "v")]
                       UNSPEC_VREVEV))]
   "TARGET_ALTIVEC"
 {
+  if (TARGET_P9_VECTOR)
+    {
+      if (<MODE>mode == V16QImode)
+        emit_insn (gen_p9_xxbrq_v16qi (operands[0], operands[1]));
+      else if (<MODE>mode == V8HImode)
+        {
+          rtx subreg1 = simplify_gen_subreg (V1TImode, operands[1],
+                                             <MODE>mode, 0);
+          rtx temp = gen_reg_rtx (V1TImode);
+          emit_insn (gen_p9_xxbrq_v1ti (temp, subreg1));
+          rtx subreg2 = simplify_gen_subreg (<MODE>mode, temp,
+                                             V1TImode, 0);
+          emit_insn (gen_p9_xxbrh_v8hi (operands[0], subreg2));
+        }
+      else /* V4SI and V4SF.  */
+        {
+          rtx subreg1 = simplify_gen_subreg (V1TImode, operands[1],
+                                             <MODE>mode, 0);
+          rtx temp = gen_reg_rtx (V1TImode);
+          emit_insn (gen_p9_xxbrq_v1ti (temp, subreg1));
+          rtx subreg2 = simplify_gen_subreg (<MODE>mode, temp,
+                                             V1TImode, 0);
+          if (<MODE>mode == V4SImode)
+            emit_insn (gen_p9_xxbrw_v4si (operands[0], subreg2));
+          else
+            emit_insn (gen_p9_xxbrw_v4sf (operands[0], subreg2));
+        }
+      DONE;
+    }
+
   int i, j, size, num_elements;
   rtvec v = rtvec_alloc (16);
   rtx mask = gen_reg_rtx (V16QImode);
@@ -4074,6 +4104,17 @@ (define_expand "altivec_vreve<mode>2"
   DONE;
 })
 
+;; Vector reverse elements for V2DI V2DF
+(define_expand "altivec_vreve<mode>2"
+  [(set (match_operand:VEC_64 0 "register_operand" "=v")
+        (unspec:VEC_64 [(match_operand:VEC_64 1 "register_operand" "v")]
+                       UNSPEC_VREVEV))]
+  "TARGET_ALTIVEC"
+{
+  emit_insn (gen_xxswapd_<mode> (operands[0], operands[1]));
+  DONE;
+})
+
 ;; Vector SIMD PEM v2.06c defines LVLX, LVLXL, LVRX, LVRXL,
 ;; STVLX, STVLXL, STVVRX, STVRXL are available only on Cell.
 (define_insn "altivec_lvlx"
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_reve_1.c b/gcc/testsuite/gcc.target/powerpc/vec_reve_1.c
new file mode 100644
index 00000000000..83a9206758b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_reve_1.c
@@ -0,0 +1,16 @@
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-O2 -maltivec" } */
+
+#include <altivec.h>
+
+vector double foo1 (vector double a)
+{
+  return vec_reve (a);
+}
+
+vector long long foo2 (vector long long a)
+{
+  return vec_reve (a);
+}
+
+/* { dg-final { scan-assembler-times {\mxxpermdi\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_reve_2.c b/gcc/testsuite/gcc.target/powerpc/vec_reve_2.c
new file mode 100644
index 00000000000..b6dd33d6d79
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_reve_2.c
@@ -0,0 +1,28 @@
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power9 -O2 -maltivec" } */
+
+#include <altivec.h>
+
+vector int foo1 (vector int a)
+{
+  return vec_reve (a);
+}
+
+vector float foo2 (vector float a)
+{
+  return vec_reve (a);
+}
+
+vector short foo3 (vector short a)
+{
+  return vec_reve (a);
+}
+
+vector char foo4 (vector char a)
+{
+  return vec_reve (a);
+}
+
+/* { dg-final { scan-assembler-times {\mxxbrq\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mxxbrw\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxxbrh\M} 1 } } */