From patchwork Mon Mar 21 01:51:29 2022
X-Patchwork-Submitter: HAO CHEN GUI
X-Patchwork-Id: 1607542
Date: Mon, 21 Mar 2022 09:51:29 +0800
To: gcc-patches
Subject: [PATCH v3, rs6000] Add V1TI into vector comparison expand [PR103316]
List-Id: Gcc-patches mailing list
From: HAO CHEN GUI
Cc: Peter Bergner, David, Segher Boessenkool

Hi,
  This patch adds V1TI mode into a new mode iterator used in vector
comparison expands. Without the patch, comparisons between two vector
__int128 values are converted to scalar comparisons with branches, which
is suboptimal. The patch fixes the issue.
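
As a rough illustration only (not part of the patch), the scalar C model
below sketches what each vector __int128 comparison computes for its single
128-bit lane: an all-ones mask when the relation holds, all-zeros otherwise.
The names lane_t, ulane_t, ALL_ONES and mask_* are hypothetical and do not
exist in GCC. The derivation of ge, le and ne by inverting a gt or eq result
is an assumption, chosen because it matches the scan-assembler counts in the
new tests (4 each of vcmpequq, vcmpgtsq and vcmpgtuq, plus 6 xxlnor). It
assumes a 64-bit GCC or Clang target that provides __int128.

```c
/* Hypothetical scalar model of one 128-bit vector lane; these names are
   illustrative only and are not GCC internals.  */
typedef __int128 lane_t;
typedef unsigned __int128 ulane_t;

#define ALL_ONES (~(ulane_t) 0)

/* Base compares: one per Power10 instruction named in the tests.  */
static inline ulane_t
mask_eq (ulane_t a, ulane_t b)          /* models vcmpequq */
{
  return a == b ? ALL_ONES : 0;
}

static inline ulane_t
mask_gt_signed (lane_t a, lane_t b)     /* models vcmpgtsq */
{
  return a > b ? ALL_ONES : 0;
}

static inline ulane_t
mask_gt_unsigned (ulane_t a, ulane_t b) /* models vcmpgtuq */
{
  return a > b ? ALL_ONES : 0;
}

/* Derived compares (assumed): swap operands and/or invert the mask.  */
static inline ulane_t
mask_ne (ulane_t a, ulane_t b)
{
  return ~mask_eq (a, b);               /* eq, then invert */
}

static inline ulane_t
mask_lt_signed (lane_t a, lane_t b)
{
  return mask_gt_signed (b, a);         /* gt with swapped operands */
}

static inline ulane_t
mask_ge_signed (lane_t a, lane_t b)
{
  return ~mask_gt_signed (b, a);        /* not (a < b) */
}

static inline ulane_t
mask_le_signed (lane_t a, lane_t b)
{
  return ~mask_gt_signed (a, b);        /* not (a > b) */
}
```

In this model the same bit pattern can compare differently depending on
signedness, which is why signed and unsigned greater-than need separate
instructions: mask_gt_signed (-1, 1) is all-zeros, while
mask_gt_unsigned ((ulane_t) -1, 1) is all-ones.
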
Now all comparisons between two vector __int128 values generate the new
Power10 comparison instructions. The related built-ins also generate the
same instructions after gimple folding, so they are added back to the
list of folded built-ins.

Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2022-03-16  Haochen Gui

gcc/
	PR target/103316
	* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Enable
	gimple folding for RS6000_BIF_VCMPEQUT, RS6000_BIF_VCMPNET,
	RS6000_BIF_CMPGE_1TI, RS6000_BIF_CMPGE_U1TI, RS6000_BIF_VCMPGTUT,
	RS6000_BIF_VCMPGTST, RS6000_BIF_CMPLE_1TI, RS6000_BIF_CMPLE_U1TI.
	* config/rs6000/vector.md (VEC_IC): Define. Add support for new Power10
	V1TI instructions.
	(vec_cmp<mode><mode>): Set mode iterator to VEC_IC.
	(vec_cmpu<mode><mode>): Likewise.

gcc/testsuite/
	PR target/103316
	* gcc.target/powerpc/pr103316.c: New.
	* gcc.target/powerpc/fold-vec-cmp-int128.c: New cases for vector
	__int128.

patch.diff
diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 5d34c1bcfc9..fac7f43f438 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -1994,16 +1994,14 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
     case RS6000_BIF_VCMPEQUH:
     case RS6000_BIF_VCMPEQUW:
     case RS6000_BIF_VCMPEQUD:
-    /* We deliberately omit RS6000_BIF_VCMPEQUT for now, because gimple
-       folding produces worse code for 128-bit compares.  */
+    case RS6000_BIF_VCMPEQUT:
       fold_compare_helper (gsi, EQ_EXPR, stmt);
       return true;
 
     case RS6000_BIF_VCMPNEB:
     case RS6000_BIF_VCMPNEH:
     case RS6000_BIF_VCMPNEW:
-    /* We deliberately omit RS6000_BIF_VCMPNET for now, because gimple
-       folding produces worse code for 128-bit compares.  */
+    case RS6000_BIF_VCMPNET:
       fold_compare_helper (gsi, NE_EXPR, stmt);
       return true;
 
@@ -2015,9 +2013,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
     case RS6000_BIF_CMPGE_U4SI:
     case RS6000_BIF_CMPGE_2DI:
     case RS6000_BIF_CMPGE_U2DI:
-    /* We deliberately omit RS6000_BIF_CMPGE_1TI and RS6000_BIF_CMPGE_U1TI
-       for now, because gimple folding produces worse code for 128-bit
-       compares.  */
+    case RS6000_BIF_CMPGE_1TI:
+    case RS6000_BIF_CMPGE_U1TI:
       fold_compare_helper (gsi, GE_EXPR, stmt);
       return true;
 
@@ -2029,9 +2026,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
     case RS6000_BIF_VCMPGTUW:
     case RS6000_BIF_VCMPGTUD:
     case RS6000_BIF_VCMPGTSD:
-    /* We deliberately omit RS6000_BIF_VCMPGTUT and RS6000_BIF_VCMPGTST
-       for now, because gimple folding produces worse code for 128-bit
-       compares.  */
+    case RS6000_BIF_VCMPGTUT:
+    case RS6000_BIF_VCMPGTST:
       fold_compare_helper (gsi, GT_EXPR, stmt);
       return true;
 
@@ -2043,9 +2039,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
     case RS6000_BIF_CMPLE_U4SI:
     case RS6000_BIF_CMPLE_2DI:
     case RS6000_BIF_CMPLE_U2DI:
-    /* We deliberately omit RS6000_BIF_CMPLE_1TI and RS6000_BIF_CMPLE_U1TI
-       for now, because gimple folding produces worse code for 128-bit
-       compares.  */
+    case RS6000_BIF_CMPLE_1TI:
+    case RS6000_BIF_CMPLE_U1TI:
       fold_compare_helper (gsi, LE_EXPR, stmt);
       return true;
 
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index b87a742cca8..d88869cc8d0 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -26,6 +26,9 @@
 ;; Vector int modes
 (define_mode_iterator VEC_I [V16QI V8HI V4SI V2DI])
 
+;; Vector int modes for comparison
+(define_mode_iterator VEC_IC [V16QI V8HI V4SI V2DI (V1TI "TARGET_POWER10")])
+
 ;; 128-bit int modes
 (define_mode_iterator VEC_TI [V1TI TI])
 
@@ -533,10 +536,10 @@ (define_expand "vcond_mask_<mode><VEC_int>"
 
 ;; For signed integer vectors comparison.
 (define_expand "vec_cmp<mode><mode>"
-  [(set (match_operand:VEC_I 0 "vint_operand")
+  [(set (match_operand:VEC_IC 0 "vint_operand")
        (match_operator 1 "signed_or_equality_comparison_operator"
-         [(match_operand:VEC_I 2 "vint_operand")
-          (match_operand:VEC_I 3 "vint_operand")]))]
+         [(match_operand:VEC_IC 2 "vint_operand")
+          (match_operand:VEC_IC 3 "vint_operand")]))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
 {
   enum rtx_code code = GET_CODE (operands[1]);
@@ -573,10 +576,10 @@ (define_expand "vec_cmp<mode><mode>"
 
 ;; For unsigned integer vectors comparison.
 (define_expand "vec_cmpu<mode><mode>"
-  [(set (match_operand:VEC_I 0 "vint_operand")
+  [(set (match_operand:VEC_IC 0 "vint_operand")
        (match_operator 1 "unsigned_or_equality_comparison_operator"
-         [(match_operand:VEC_I 2 "vint_operand")
-          (match_operand:VEC_I 3 "vint_operand")]))]
+         [(match_operand:VEC_IC 2 "vint_operand")
+          (match_operand:VEC_IC 3 "vint_operand")]))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
 {
   enum rtx_code code = GET_CODE (operands[1]);
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-cmp-int128.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-cmp-int128.c
new file mode 100644
index 00000000000..1a4db0f45d4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-cmp-int128.c
@@ -0,0 +1,86 @@
+/* Verify that overloaded built-ins for vec_cmp with __int128
+   inputs produce the right code.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#include <altivec.h>
+
+vector bool __int128
+test3_eq (vector signed __int128 x, vector signed __int128 y)
+{
+  return vec_cmpeq (x, y);
+}
+
+vector bool __int128
+test6_eq (vector unsigned __int128 x, vector unsigned __int128 y)
+{
+  return vec_cmpeq (x, y);
+}
+
+vector bool __int128
+test3_ge (vector signed __int128 x, vector signed __int128 y)
+{
+  return vec_cmpge (x, y);
+}
+
+vector bool __int128
+test6_ge (vector unsigned __int128 x, vector unsigned __int128 y)
+{
+  return vec_cmpge (x, y);
+}
+
+vector bool __int128
+test3_gt (vector signed __int128 x, vector signed __int128 y)
+{
+  return vec_cmpgt (x, y);
+}
+
+vector bool __int128
+test6_gt (vector unsigned __int128 x, vector unsigned __int128 y)
+{
+  return vec_cmpgt (x, y);
+}
+
+vector bool __int128
+test3_le (vector signed __int128 x, vector signed __int128 y)
+{
+  return vec_cmple (x, y);
+}
+
+vector bool __int128
+test6_le (vector unsigned __int128 x, vector unsigned __int128 y)
+{
+  return vec_cmple (x, y);
+}
+
+vector bool __int128
+test3_lt (vector signed __int128 x, vector signed __int128 y)
+{
+  return vec_cmplt (x, y);
+}
+
+vector bool __int128
+test6_lt (vector unsigned __int128 x, vector unsigned __int128 y)
+{
+  return vec_cmplt (x, y);
+}
+
+vector bool __int128
+test3_ne (vector signed __int128 x, vector signed __int128 y)
+{
+  return vec_cmpne (x, y);
+}
+
+vector bool __int128
+test6_ne (vector unsigned __int128 x, vector unsigned __int128 y)
+{
+  return vec_cmpne (x, y);
+}
+
+/* { dg-final { scan-assembler-times "vcmpequq" 4 } } */
+/* { dg-final { scan-assembler-times "vcmpgtsq" 4 } } */
+/* { dg-final { scan-assembler-times "vcmpgtuq" 4 } } */
+/* { dg-final { scan-assembler-times "xxlnor" 6 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103316.c b/gcc/testsuite/gcc.target/powerpc/pr103316.c
new file mode 100644
index 00000000000..02f7dc5ca1b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103316.c
@@ -0,0 +1,80 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+vector bool __int128
+test_eq (vector signed __int128 a, vector signed __int128 b)
+{
+  return a == b;
+}
+
+vector bool __int128
+test_ne (vector signed __int128 a, vector signed __int128 b)
+{
+  return a != b;
+}
+
+vector bool __int128
+test_gt (vector signed __int128 a, vector signed __int128 b)
+{
+  return a > b;
+}
+
+vector bool __int128
+test_ge (vector signed __int128 a, vector signed __int128 b)
+{
+  return a >= b;
+}
+
+vector bool __int128
+test_lt (vector signed __int128 a, vector signed __int128 b)
+{
+  return a < b;
+}
+
+vector bool __int128
+test_le (vector signed __int128 a, vector signed __int128 b)
+{
+  return a <= b;
+}
+
+vector bool __int128
+testu_eq (vector unsigned __int128 a, vector unsigned __int128 b)
+{
+  return a == b;
+}
+
+vector bool __int128
+testu_ne (vector unsigned __int128 a, vector unsigned __int128 b)
+{
+  return a != b;
+}
+
+vector bool __int128
+testu_gt (vector unsigned __int128 a, vector unsigned __int128 b)
+{
+  return a > b;
+}
+
+vector bool __int128
+testu_ge (vector unsigned __int128 a, vector unsigned __int128 b)
+{
+  return a >= b;
+}
+
+vector bool __int128
+testu_lt (vector unsigned __int128 a, vector unsigned __int128 b)
+{
+  return a < b;
+}
+
+vector bool __int128
+testu_le (vector unsigned __int128 a, vector unsigned __int128 b)
+{
+  return a <= b;
+}
+
+/* { dg-final { scan-assembler-times "vcmpequq" 4 } } */
+/* { dg-final { scan-assembler-times "vcmpgtsq" 4 } } */
+/* { dg-final { scan-assembler-times "vcmpgtuq" 4 } } */
+/* { dg-final { scan-assembler-times "xxlnor" 6 } } */