From: HAO CHEN GUI
Date: Wed, 22 Jun 2022 16:26:15 +0800
To: gcc-patches
Cc: Peter Bergner, David, Segher Boessenkool
Subject: [PATCH v2, rs6000] Use CC for BCD operations [PR100736]
Message-ID: <85f7e36e-4a24-0e9b-ad8e-56f85cabf5b5@linux.ibm.com>
X-Patchwork-Id: 1646381

Hi,
  This patch uses CC instead of CCFP for all BCD operations. Thus, the
infinite math flag has no impact on BCD operations. To support BCD overflow
and invalid coding, an UNSPEC is defined to move the overflow bit to a
general register.
Patterns for conditional branch and return on the overflow bit are defined
so that the UNSPEC and the branch/return can be combined into one jump insn.
A split pattern for sign extension of the overflow bit is defined for
optimization. This patch also replaces bcdadd with bcdsub in the expander
for BCD invalid coding.

ChangeLog
2022-06-22 Haochen Gui

gcc/
	PR target/100736
	* config/rs6000/altivec.md (BCD_TEST): Remove unordered.
	(bcd_): Replace CCFP with CC.
	(*bcd_test_): Replace CCFP with CC. Generate condition insn with
	CC mode.
	(bcd_overflow_): New.
	(*bcdoverflow_): New.
	(*bcdinvalid_): Removed.
	(bcdinvalid_): Implement by UNSPEC_BCDSUB and UNSPEC_BCD_OVERFLOW.
	(nuun): New.
	(*overflow_cbranch): New.
	(*overflow_creturn): New.
	(*overflow_extendsidi): New.
	(bcdshift_v16qi): Replace CCFP with CC.
	(bcdmul10_v16qi): Likewise.
	(bcddiv10_v16qi): Likewise.
	(peephole for bcd_add/sub): Likewise.
	* config/rs6000/rs6000-builtins.def (__builtin_bcdadd_ov_v1ti): Set
	pattern to bcdadd_overflow_v1ti.
	(__builtin_bcdadd_ov_v16qi): Set pattern to bcdadd_overflow_v16qi.
	(__builtin_bcdsub_ov_v1ti): Set pattern to bcdsub_overflow_v1ti.
	(__builtin_bcdsub_ov_v16qi): Set pattern to bcdsub_overflow_v16qi.

gcc/testsuite/
	PR target/100736
	* gcc.target/powerpc/bcd-4.c: Adjust number of bcdadd and bcdsub.
	Scan no cror insns.
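
For readers following the overflow handling, here is a small portable C
model of what the new overflow pattern appears to compute. It is only a
sketch and not part of the patch. It assumes that the overflow/invalid
flag is the SO bit of CR6, that CR field f occupies big-endian bits
4f..4f+3 of the 32-bit CR image in the order LT, GT, EQ, SO, and that the
intent of the mfcr/rlwinm pair is "rotate left by 28, keep the lowest
bit". The names rotl32, cr_field_so and bcd_overflow_from_cr are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits; n is reduced modulo 32.  */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? (x << n) | (x >> (32u - n)) : x;
}

/* Hypothetical model: return the SO bit of CR field `field` (0..7)
   from a 32-bit image of the condition register.  Assumes field f
   holds LT, GT, EQ, SO in big-endian bits 4f .. 4f+3.  */
static unsigned cr_field_so(uint32_t cr, unsigned field)
{
    assert(field <= 7u);
    /* SO sits at big-endian bit 4*field + 3; rotating left by
       4*field + 4 moves it into the least significant bit.  */
    return rotl32(cr, 4u * field + 4u) & 1u;
}

/* Model of the overflow extraction for BCD insns, which set CR6:
   the rotate amount for field 6 is 28, as in the patch's pattern.  */
static unsigned bcd_overflow_from_cr(uint32_t cr)
{
    return cr_field_so(cr, 6u);
}
```

With this model, a CR image in which only the CR6 SO bit is set yields 1,
and any image with that bit clear yields 0, whatever the LT, GT and EQ
bits of CR6 hold. That matches the idea of testing overflow as a plain
bit rather than as an "unordered" floating-point comparison.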
patch.diff
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index efc8ae35c2e..26f131e61ea 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -4370,7 +4370,7 @@ (define_int_iterator UNSPEC_BCD_ADD_SUB [UNSPEC_BCDADD UNSPEC_BCDSUB])
 (define_int_attr bcd_add_sub [(UNSPEC_BCDADD "add")
                               (UNSPEC_BCDSUB "sub")])
 
-(define_code_iterator BCD_TEST [eq lt le gt ge unordered])
+(define_code_iterator BCD_TEST [eq lt le gt ge])
 (define_mode_iterator VBCD [V1TI V16QI])
 
 (define_insn "bcd_"
@@ -4379,7 +4379,7 @@ (define_insn "bcd_"
                       (match_operand:VBCD 2 "register_operand" "v")
                       (match_operand:QI 3 "const_0_to_1_operand" "n")]
                      UNSPEC_BCD_ADD_SUB))
-   (clobber (reg:CCFP CR6_REGNO))]
+   (clobber (reg:CC CR6_REGNO))]
   "TARGET_P8_VECTOR"
   "bcd. %0,%1,%2,%3"
   [(set_attr "type" "vecsimple")])
@@ -4389,9 +4389,9 @@ (define_insn "bcd_"
 ;; UNORDERED test on an integer type (like V1TImode) is not defined. The type
 ;; probably should be one that can go in the VMX (Altivec) registers, so we
 ;; can't use DDmode or DFmode.
-(define_insn "*bcd_test_"
-  [(set (reg:CCFP CR6_REGNO)
-        (compare:CCFP
+(define_insn "bcd_test_"
+  [(set (reg:CC CR6_REGNO)
+        (compare:CC
          (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")
                        (match_operand:VBCD 2 "register_operand" "v")
                        (match_operand:QI 3 "const_0_to_1_operand" "i")]
@@ -4408,8 +4408,8 @@ (define_insn "*bcd_test2_"
                       (match_operand:VBCD 2 "register_operand" "v")
                       (match_operand:QI 3 "const_0_to_1_operand" "i")]
                      UNSPEC_BCD_ADD_SUB))
-   (set (reg:CCFP CR6_REGNO)
-        (compare:CCFP
+   (set (reg:CC CR6_REGNO)
+        (compare:CC
          (unspec:V2DF [(match_dup 1)
                        (match_dup 2)
                        (match_dup 3)]
@@ -4502,8 +4502,8 @@ (define_insn "vclrrb"
   [(set_attr "type" "vecsimple")])
 
 (define_expand "bcd__"
-  [(parallel [(set (reg:CCFP CR6_REGNO)
-                   (compare:CCFP
+  [(parallel [(set (reg:CC CR6_REGNO)
+                   (compare:CC
                     (unspec:V2DF [(match_operand:VBCD 1 "register_operand")
                                   (match_operand:VBCD 2 "register_operand")
                                   (match_operand:QI 3 "const_0_to_1_operand")]
@@ -4511,46 +4511,138 @@ (define_expand "bcd__"
                     (match_dup 4)))
               (clobber (match_scratch:VBCD 5))])
    (set (match_operand:SI 0 "register_operand")
-        (BCD_TEST:SI (reg:CCFP CR6_REGNO)
+        (BCD_TEST:SI (reg:CC CR6_REGNO)
                      (const_int 0)))]
   "TARGET_P8_VECTOR"
 {
   operands[4] = CONST0_RTX (V2DFmode);
+  emit_insn (gen_bcd_test_ (operands[0], operands[1],
+                            operands[2], operands[3],
+                            operands[4]));
+
+  rtx cr6 = gen_rtx_REG (CCmode, CR6_REGNO);
+  rtx condition_rtx = gen_rtx_ (SImode, cr6, const0_rtx);
+
+  if ( == GE || == LE)
+    {
+      rtx not_result = gen_reg_rtx (CCEQmode);
+      rtx not_op, rev_cond_rtx;
+      rev_cond_rtx = gen_rtx_fmt_ee (rs6000_reverse_condition (SImode, ),
+                                     SImode, XEXP (condition_rtx, 0),
+                                     const0_rtx);
+      not_op = gen_rtx_COMPARE (CCEQmode, rev_cond_rtx, const0_rtx);
+      emit_insn (gen_rtx_SET (not_result, not_op));
+      condition_rtx = gen_rtx_EQ (SImode, not_result, const0_rtx);
+    }
+
+  emit_insn (gen_rtx_SET (operands[0], condition_rtx));
+  DONE;
 })
 
-(define_insn "*bcdinvalid_"
-  [(set (reg:CCFP CR6_REGNO)
-        (compare:CCFP
-         (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")]
-                      UNSPEC_BCDADD)
-         (match_operand:V2DF 2 "zero_constant" "j")))
-   (clobber (match_scratch:VBCD 0 "=v"))]
+(define_expand "bcd_overflow_"
+  [(parallel [(set (reg:CC CR6_REGNO)
+                   (compare:CC
+                    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")
+                                  (match_operand:VBCD 2 "register_operand")
+                                  (match_operand:QI 3 "const_0_to_1_operand")]
+                                 UNSPEC_BCD_ADD_SUB)
+                    (match_dup 4)))
+              (clobber (match_scratch:VBCD 5))])
+   (set (match_operand:SI 0 "register_operand")
+        (unspec:SI [(reg:CC CR6_REGNO)
+                    (const_int 0)]
+                   UNSPEC_BCD_OVERFLOW))]
   "TARGET_P8_VECTOR"
-  "bcdadd. %0,%1,%1,0"
+{
+  operands[4] = CONST0_RTX (V2DFmode);
+})
+
+(define_insn "*bcdoverflow_"
+  [(set (match_operand:SDI 0 "register_operand" "=r")
+        (unspec:SDI [(reg:CC CR6_REGNO)
+                     (const_int 0)]
+                    UNSPEC_BCD_OVERFLOW))]
+  "TARGET_P8_VECTOR"
+  "mfcr %0,2\;rlwinm %0,%0,28,1"
   [(set_attr "type" "vecsimple")])
 
 (define_expand "bcdinvalid_"
-  [(parallel [(set (reg:CCFP CR6_REGNO)
-                   (compare:CCFP
-                    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")]
-                                 UNSPEC_BCDADD)
+  [(parallel [(set (reg:CC CR6_REGNO)
+                   (compare:CC
+                    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")
+                                  (match_dup 1)
+                                  (const_int 0)]
+                                 UNSPEC_BCDSUB)
                     (match_dup 2)))
               (clobber (match_scratch:VBCD 3))])
    (set (match_operand:SI 0 "register_operand")
-        (unordered:SI (reg:CCFP CR6_REGNO)
-                      (const_int 0)))]
+        (unspec:SI [(reg:CC CR6_REGNO)
+                    (const_int 0)]
+                   UNSPEC_BCD_OVERFLOW))]
   "TARGET_P8_VECTOR"
 {
   operands[2] = CONST0_RTX (V2DFmode);
 })
 
+(define_code_attr nuun [(eq "nu")
+                        (ne "un")])
+
+(define_insn "*overflow_cbranch"
+  [(set (pc)
+        (if_then_else (eqne
+                        (unspec:SI [(reg:CC CR6_REGNO)
+                                    (const_int 0)]
+                                   UNSPEC_BCD_OVERFLOW)
+                        (const_int 0))
+                      (label_ref (match_operand 0))
+                      (pc)))]
+  "TARGET_P8_VECTOR"
+  "b 6,%l0"
+  [(set_attr "type" "branch")
+   (set (attr "length")
+        (if_then_else (and (ge (minus (match_dup 0) (pc))
+                               (const_int -32768))
+                           (lt (minus (match_dup 0) (pc))
+                               (const_int 32764)))
+                      (const_int 4)
+                      (const_int 8)))])
+
+(define_insn "*overflow_creturn"
+  [(set (pc)
+        (if_then_else (eqne
+                        (unspec:SI [(reg:CC CR6_REGNO)
+                                    (const_int 0)]
+                                   UNSPEC_BCD_OVERFLOW)
+                        (const_int 0))
+                      (simple_return)
+                      (pc)))]
+  "TARGET_P8_VECTOR"
+  "blr 6"
+  [(set_attr "type" "jmpreg")])
+
+(define_insn_and_split "*overflow_extendsidi"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+        (sign_extend:DI
+          (unspec:SI [(reg:CC CR6_REGNO)
+                      (const_int 0)]
+                     UNSPEC_BCD_OVERFLOW)))]
+  "TARGET_P8_VECTOR"
+  "#"
+  "&& 1"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+        (unspec:DI [(reg:CC CR6_REGNO)
+                    (const_int 0)]
+                   UNSPEC_BCD_OVERFLOW))]
+  ""
+  [(set_attr "type" "vecsimple")])
+
 (define_insn "bcdshift_v16qi"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
         (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
                        (match_operand:V16QI 2 "register_operand" "v")
                        (match_operand:QI 3 "const_0_to_1_operand" "n")]
                       UNSPEC_BCDSHIFT))
-   (clobber (reg:CCFP CR6_REGNO))]
+   (clobber (reg:CC CR6_REGNO))]
   "TARGET_P8_VECTOR"
   "bcds. %0,%1,%2,%3"
   [(set_attr "type" "vecsimple")])
@@ -4559,7 +4651,7 @@ (define_expand "bcdmul10_v16qi"
   [(set (match_operand:V16QI 0 "register_operand")
         (unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
                       UNSPEC_BCDSHIFT))
-   (clobber (reg:CCFP CR6_REGNO))]
+   (clobber (reg:CC CR6_REGNO))]
   "TARGET_P9_VECTOR"
 {
   rtx one = gen_reg_rtx (V16QImode);
@@ -4574,7 +4666,7 @@ (define_expand "bcddiv10_v16qi"
   [(set (match_operand:V16QI 0 "register_operand")
         (unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
                       UNSPEC_BCDSHIFT))
-   (clobber (reg:CCFP CR6_REGNO))]
+   (clobber (reg:CC CR6_REGNO))]
   "TARGET_P9_VECTOR"
 {
   rtx one = gen_reg_rtx (V16QImode);
@@ -4598,9 +4690,9 @@ (define_peephole2
                       (match_operand:V1TI 2 "register_operand")
                       (match_operand:QI 3 "const_0_to_1_operand")]
                      UNSPEC_BCD_ADD_SUB))
-              (clobber (reg:CCFP CR6_REGNO))])
-   (parallel [(set (reg:CCFP CR6_REGNO)
-                   (compare:CCFP
+              (clobber (reg:CC CR6_REGNO))])
+   (parallel [(set (reg:CC CR6_REGNO)
+                   (compare:CC
                     (unspec:V2DF [(match_dup 1)
                                   (match_dup 2)
                                   (match_dup 3)]
@@ -4613,8 +4705,8 @@ (define_peephole2
                       (match_dup 2)
                       (match_dup 3)]
                      UNSPEC_BCD_ADD_SUB))
-   (set (reg:CCFP CR6_REGNO)
-        (compare:CCFP
+   (set (reg:CC CR6_REGNO)
+        (compare:CC
          (unspec:V2DF [(match_dup 1)
                        (match_dup 2)
                        (match_dup 3)]
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index f4a9f24bcc5..8e94fe5c438 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2371,10 +2371,10 @@
     BCDADD_LT_V16QI bcdadd_lt_v16qi {}
 
   const signed int __builtin_bcdadd_ov_v1ti (vsq, vsq, const int<1>);
-    BCDADD_OV_V1TI bcdadd_unordered_v1ti {}
+    BCDADD_OV_V1TI bcdadd_overflow_v1ti {}
 
   const signed int __builtin_bcdadd_ov_v16qi (vsc, vsc, const int<1>);
-    BCDADD_OV_V16QI bcdadd_unordered_v16qi {}
+    BCDADD_OV_V16QI bcdadd_overflow_v16qi {}
 
   const signed int __builtin_bcdinvalid_v1ti (vsq);
     BCDINVALID_V1TI bcdinvalid_v1ti {}
@@ -2419,10 +2419,10 @@
     BCDSUB_LT_V16QI bcdsub_lt_v16qi {}
 
   const signed int __builtin_bcdsub_ov_v1ti (vsq, vsq, const int<1>);
-    BCDSUB_OV_V1TI bcdsub_unordered_v1ti {}
+    BCDSUB_OV_V1TI bcdsub_overflow_v1ti {}
 
   const signed int __builtin_bcdsub_ov_v16qi (vsc, vsc, const int<1>);
-    BCDSUB_OV_V16QI bcdsub_unordered_v16qi {}
+    BCDSUB_OV_V16QI bcdsub_overflow_v16qi {}
 
   const vuc __builtin_crypto_vpermxor_v16qi (vuc, vuc, vuc);
     VPERMXOR_V16QI crypto_vpermxor_v16qi {}
diff --git a/gcc/testsuite/gcc.target/powerpc/bcd-4.c b/gcc/testsuite/gcc.target/powerpc/bcd-4.c
index 2c8554dfe82..3c25ed60e17 100644
--- a/gcc/testsuite/gcc.target/powerpc/bcd-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/bcd-4.c
@@ -2,10 +2,11 @@
 /* { dg-require-effective-target int128 } */
 /* { dg-require-effective-target power10_hw } */
 /* { dg-options "-mdejagnu-cpu=power10 -O2 -save-temps" } */
-/* { dg-final { scan-assembler-times {\mbcdadd\M} 7 } } */
-/* { dg-final { scan-assembler-times {\mbcdsub\M} 18 } } */
+/* { dg-final { scan-assembler-times {\mbcdadd\M} 5 } } */
+/* { dg-final { scan-assembler-times {\mbcdsub\M} 20 } } */
 /* { dg-final { scan-assembler-times {\mbcds\M} 2 } } */
 /* { dg-final { scan-assembler-times {\mdenbcdq\M} 1 } } */
+/* { dg-final { scan-assembler-not {\mcror\M} 1 } } */
 
 #include