From patchwork Thu Feb 1 19:31:17 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 868395
Date: Thu, 1 Feb 2018 14:31:17 -0500
From: Michael Meissner
To: GCC Patches, Segher Boessenkool, David Edelsohn, Bill Schmidt
Subject: [PATCH] PowerPC PR target/84154, fix floating point to small integer conversion regression
Message-Id: <20180201193116.GA15164@ibm-tiger.the-meissners.org>

This patch fixes the optimization regression introduced in GCC 7 where
conversions from the various floating point types to small integers would at
times generate a store and a load.

For example, converting from double to unsigned char generated the following
code on GCC 6 for -mcpu=power8:

        fctiwuz 1,1
        mfvsrd 3,1
        rlwinm 3,3,0,0xff

On GCC 7 and 8 it generates:

        fctiwuz 0,1
        mfvsrwz 9,0
        stw 9,-16(1)
        ori 2,2,0
        lbz 3,-16(1)

The insns before register allocation are:

(insn 7 8 13 2 (set (subreg:SI (reg:QI 157) 0)
        (unsigned_fix:SI (reg:SF 33))))

(insn 13 7 14 2 (set (reg/i:DI 3 3)
        (zero_extend:DI (reg:QI 157))))

After reload, the insns are:

(insn 7 8 19 2 (set (reg:SI 32 0 [160])
        (unsigned_fix:SI (reg:SF 33))))

(insn 19 7 18 2 (set (reg:SI 9 9 [160])
        (reg:SI 32 0 [160])))

(insn 18 19 13 2 (set (mem/c:SI (plus:DI (reg/f:DI 1 1)
                (const_int -16)))
        (reg:SI 9)))

(insn 13 18 14 2 (set (reg/i:DI 3 3)
        (zero_extend:DI (mem/c:QI (plus:DI (reg/f:DI 1 1)
                    (const_int -16))))))

ISA 3.0 (Power9) did not have this problem, because it already had a
fixuns_truncdfqi2 pattern, since QI/HImode values are allowed in vector
registers.  Previous versions of the ISA did not allow QI/HImode values in
vector registers, because there were no vector load or store byte/half-word
operations.

I extended the ISA 3.0 conversion patterns to handle ISA 2.07, using splitters
to move the 32-bit int parts back to the GPRs to do sign/zero extension or
stores.

I also moved the ISA 3.0 optimization that prevents the register allocator
from doing a direct move to a GPR in order to do an offsettable store into a
separate insn, as I had previously done for SImode.  The rationale is to avoid
the cases where the register allocator decided to change a store into a move
followed by a later store.
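
For anyone who wants to reproduce the code sequences above, here is a minimal
sketch of the kind of source involved.  double_to_uc has the same shape as the
function in the new pr84154-1.c test.  The names double_to_uc_two_step and
check_examples are hypothetical and exist only for illustration: the first
spells out, in C, the two steps the GCC 6 sequence performs (convert to a
32-bit unsigned value, then keep the low 8 bits), and the second checks that
the two forms agree for in-range inputs.  It is not part of the patch.

```c
#include <assert.h>

/* Implicit double -> unsigned char conversion, as in pr84154-1.c.  */
unsigned char
double_to_uc (double x)
{
  return x;
}

/* Hypothetical helper: the same conversion written as two explicit steps.
   The first step corresponds to the fctiwuz conversion, the second to the
   rlwinm mask with 0xff.  Only meaningful for inputs in 0..255.  */
unsigned char
double_to_uc_two_step (double x)
{
  unsigned int wide = (unsigned int) x;
  return (unsigned char) (wide & 0xffu);
}

/* Hypothetical self-check: both forms truncate toward zero and agree.  */
void
check_examples (void)
{
  assert (double_to_uc (65.9) == 65);
  assert (double_to_uc_two_step (65.9) == 65);
  assert (double_to_uc (200.5) == double_to_uc_two_step (200.5));
}
```

Compiling double_to_uc alone with -O2 -mcpu=power8 and inspecting the assembly
is one way to see whether the store and load (stw/lbz) are present.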

I have tested this patch on a little endian power8 system (64-bit) and a big
endian power8 system (both 32-bit and 64-bit executables).  There were no
regressions in the test suite and the compiler bootstrapped fine.  I added some
tests, and verified that they ran in all 3 environments.

Can I check this into the trunk?  Since this is also a regression in GCC 7,
can I check the patch into the GCC 7 branch after a burn-in period, provided
it applies cleanly?

[gcc]
2018-02-01  Michael Meissner

        PR target/84154
        * config/rs6000/rs6000.md (fix_trunc2): Convert from define_expand
        to be define_insn_and_split.  Rework float/double/_Float128
        conversions to QI/HI/SImode to work with both ISA 2.07 (power8) and
        ISA 3.0 (power9).  Fix regression where conversions to QI/HImode
        types did a store and then a load to truncate the value.  For
        conversions to VSX registers, don't split the insn, instead emit
        the code directly.
        (fixuns_trunc2): Likewise.
        (fix_trunc2): Likewise.
        (fix_trunc2_internal): Delete, no longer used.
        (fixuns_trunc2_internal): Likewise.
        (fix__mem): Likewise.
        (fix_trunc2_mem): On ISA 3.0, prevent the register allocator from
        doing a direct move to the GPRs to do a store, and instead use the
        ISA 3.0 store byte/half-word from vector register instruction.  For
        IEEE 128-bit floating point, also optimize stores of 32-bit ints.
        (fixuns_trunc2_mem): Likewise.
        (fix_trunc2_mem): Likewise.
        (fixuns_trunc2_mem): Likewise.

[gcc/testsuite]
2018-02-01  Michael Meissner

        PR target/84154
        * gcc.target/powerpc/pr84154-1.c: New tests.
        * gcc.target/powerpc/pr84154-2.c: Likewise.
        * gcc.target/powerpc/pr84154-3.c: Likewise.

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 257269)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -5700,43 +5700,48 @@ (define_insn "*fix_truncdi2_fctidz
    xscvdpsxds %x0,%x1"
   [(set_attr "type" "fp")])
 
-(define_expand "fix_trunc2"
-  [(parallel [(set (match_operand: 0 "nonimmediate_operand")
-                   (fix:QHI (match_operand:SFDF 1 "gpc_reg_operand")))
-              (clobber (match_scratch:DI 2))])]
-  "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE_64BIT"
+;; If have ISA 3.0, QI/HImode values can go in both VSX registers and GPR
+;; registers.  If we have ISA 2.07, we don't allow QI/HImode values in the
+;; vector registers, so we need to do direct moves to the GPRs, but SImode
+;; values can go in VSX registers.  Keeping the direct move part through
+;; register allocation prevents the register allocator from doing a direct of
+;; the SImode value to a GPR, and then a store/load.
+(define_insn_and_split "fix_trunc2"
+  [(set (match_operand: 0 "gpc_reg_operand" "=wJ,wJwK,r")
+        (fix:QHI (match_operand:SFDF 1 "gpc_reg_operand" "wJ,wJwK,wa")))
+   (clobber (match_scratch:SI 2 "=X,X,wi"))]
+  "TARGET_DIRECT_MOVE"
+  "@
+   fctiwz %0,%1
+   xscvdpsxws %x0,%x1
+   #"
+  "&& reload_completed && int_reg_operand (operands[0], mode)"
+  [(set (match_dup 2)
+        (fix:SI (match_dup 1)))
+   (set (match_dup 3)
+        (match_dup 2))]
 {
-  if (MEM_P (operands[0]))
-    operands[0] = rs6000_address_for_fpconvert (operands[0]);
-})
+  operands[3] = gen_rtx_REG (SImode, REGNO (operands[0]));
+}
+  [(set_attr "length" "4,4,8")
+   (set_attr "type" "fp")])
 
-(define_insn_and_split "*fix_trunc2_internal"
-  [(set (match_operand: 0 "reg_or_indexed_operand" "=wIwJ,rZ")
-        (fix:QHI
-         (match_operand:SFDF 1 "gpc_reg_operand" ",")))
-   (clobber (match_scratch:DI 2 "=X,wi"))]
-  "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE_64BIT"
+;; Keep the convert and store together through register allocation to prevent
+;; the register allocator from getting clever and doing a direct move to a GPR
+;; and then store for reg+offset stores.
+(define_insn_and_split "*fix_trunc2_mem"
+  [(set (match_operand:QHI 0 "memory_operand" "=Z")
+        (fix:QHI (match_operand:SFDF 1 "gpc_reg_operand" "wa")))
+   (clobber (match_scratch:SI 2 "=wa"))]
+  "TARGET_P9_VECTOR"
   "#"
   "&& reload_completed"
-  [(const_int 0)]
+  [(set (match_dup 2)
+        (fix:SI (match_dup 1)))
+   (set (match_dup 0)
+        (match_dup 3))]
 {
-  rtx dest = operands[0];
-  rtx src = operands[1];
-
-  if (vsx_register_operand (dest, mode))
-    {
-      rtx di_dest = gen_rtx_REG (DImode, REGNO (dest));
-      emit_insn (gen_fix_truncdi2 (di_dest, src));
-    }
-  else
-    {
-      rtx tmp = operands[2];
-      rtx tmp2 = gen_rtx_REG (mode, REGNO (tmp));
-
-      emit_insn (gen_fix_truncdi2 (tmp, src));
-      emit_move_insn (dest, tmp2);
-    }
-  DONE;
+  operands[3] = gen_rtx_REG (mode, REGNO (operands[2]));
 })
 
 (define_expand "fixuns_truncsi2"
@@ -5803,48 +5808,51 @@ (define_insn "fixuns_truncdi2"
    xscvdpuxds %x0,%x1"
   [(set_attr "type" "fp")])
 
-(define_expand "fixuns_trunc2"
-  [(parallel [(set (match_operand: 0 "nonimmediate_operand")
-                   (unsigned_fix:QHI (match_operand:SFDF 1 "gpc_reg_operand")))
-              (clobber (match_scratch:DI 2))])]
-  "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE_64BIT"
+;; If have ISA 3.0, QI/HImode values can go in both VSX registers and GPR
+;; registers.  If we have ISA 2.07, we don't allow QI/HImode values in the
+;; vector registers, so we need to do direct moves to the GPRs, but SImode
+;; values can go in VSX registers.  Keeping the direct move part through
+;; register allocation prevents the register allocator from doing a direct of
+;; the SImode value to a GPR, and then a store/load.
+(define_insn_and_split "fixuns_trunc2"
+  [(set (match_operand: 0 "gpc_reg_operand" "=wJ,wJwK,r")
+        (unsigned_fix:QHI
+         (match_operand:SFDF 1 "gpc_reg_operand" "wJ,wJwK,wa")))
+   (clobber (match_scratch:SI 2 "=X,X,wi"))]
+  "TARGET_DIRECT_MOVE"
+  "@
+   fctiwuz %0,%1
+   xscvdpuxws %x0,%x1
+   #"
+  "&& reload_completed && int_reg_operand (operands[0], mode)"
+  [(set (match_dup 2)
+        (unsigned_fix:SI (match_dup 1)))
+   (set (match_dup 3)
+        (match_dup 2))]
 {
-  if (MEM_P (operands[0]))
-    operands[0] = rs6000_address_for_fpconvert (operands[0]);
-})
+  operands[3] = gen_rtx_REG (SImode, REGNO (operands[0]));
+}
+  [(set_attr "length" "4,4,8")
+   (set_attr "type" "fp")])
 
-(define_insn_and_split "*fixuns_trunc2_internal"
-  [(set (match_operand: 0 "reg_or_indexed_operand" "=wIwJ,rZ")
-        (unsigned_fix:QHI
-         (match_operand:SFDF 1 "gpc_reg_operand" ",")))
-   (clobber (match_scratch:DI 2 "=X,wi"))]
-  "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE_64BIT"
+;; Keep the convert and store together through register allocation to prevent
+;; the register allocator from getting clever and doing a direct move to a GPR
+;; and then store for reg+offset stores.
+(define_insn_and_split "*fixuns_trunc2_mem"
+  [(set (match_operand:QHI 0 "memory_operand" "=Z")
+        (unsigned_fix:QHI (match_operand:SFDF 1 "gpc_reg_operand" "wa")))
+   (clobber (match_scratch:SI 2 "=wa"))]
+  "TARGET_P9_VECTOR"
   "#"
   "&& reload_completed"
-  [(const_int 0)]
+  [(set (match_dup 2)
+        (unsigned_fix:SI (match_dup 1)))
+   (set (match_dup 0)
+        (match_dup 3))]
 {
-  rtx dest = operands[0];
-  rtx src = operands[1];
-
-  if (vsx_register_operand (dest, mode))
-    {
-      rtx di_dest = gen_rtx_REG (DImode, REGNO (dest));
-      emit_insn (gen_fixuns_truncdi2 (di_dest, src));
-    }
-  else
-    {
-      rtx tmp = operands[2];
-      rtx tmp2 = gen_rtx_REG (mode, REGNO (tmp));
-
-      emit_insn (gen_fixuns_truncdi2 (tmp, src));
-      emit_move_insn (dest, tmp2);
-    }
-  DONE;
+  operands[3] = gen_rtx_REG (mode, REGNO (operands[2]));
 })
 
-;; If -mvsx-small-integer, we can represent the FIX operation directly.  On
-;; older machines, we have to use an UNSPEC to produce a SImode and move it
-;; to another location, since SImode is not allowed in vector registers.
 (define_insn "*fctiwz__smallint"
   [(set (match_operand:SI 0 "vsx_register_operand" "=d,wi")
         (any_fix:SI (match_operand:SFDF 1 "gpc_reg_operand" ",")))]
@@ -14386,6 +14394,15 @@ (define_insn "fix_si2_hw"
   [(set_attr "type" "vecfloat")
    (set_attr "size" "128")])
 
+(define_insn "fix_trunc2"
+  [(set (match_operand: 0 "altivec_register_operand" "=v")
+        (fix:QHI
+         (match_operand:IEEE128 1 "altivec_register_operand" "v")))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
+  "xscvqpswz %0,%1"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
 (define_insn "fixuns_si2_hw"
   [(set (match_operand:SI 0 "altivec_register_operand" "=v")
         (unsigned_fix:SI (match_operand:IEEE128 1 "altivec_register_operand" "v")))]
@@ -14394,17 +14411,40 @@ (define_insn "fixuns_si2_hw"
   [(set_attr "type" "vecfloat")
    (set_attr "size" "128")])
 
-;; Combiner pattern to prevent moving the result of converting an IEEE 128-bit
-;; floating point value to 32-bit integer to GPR in order to save it.
-(define_insn_and_split "*fix__mem"
-  [(set (match_operand:SI 0 "memory_operand" "=Z")
-        (any_fix:SI (match_operand:IEEE128 1 "altivec_register_operand" "v")))
-   (clobber (match_scratch:SI 2 "=v"))]
+(define_insn "fixuns_trunc2"
+  [(set (match_operand: 0 "altivec_register_operand" "=v")
+        (unsigned_fix:QHI
+         (match_operand:IEEE128 1 "altivec_register_operand" "v")))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
+  "xscvqpuwz %0,%1"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
+;; Combiner patterns to prevent moving the result of converting an IEEE 128-bit
+;; floating point value to 8/16/32-bit integer to GPR in order to save it.
+(define_insn_and_split "*fix_trunc2_mem"
+  [(set (match_operand:QHSI 0 "memory_operand" "=Z")
+        (fix:QHSI
+         (match_operand:IEEE128 1 "altivec_register_operand" "v")))
+   (clobber (match_scratch:QHSI 2 "=v"))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
   "#"
   "&& reload_completed"
   [(set (match_dup 2)
-        (any_fix:SI (match_dup 1)))
+        (fix:QHSI (match_dup 1)))
+   (set (match_dup 0)
+        (match_dup 2))])
+
+(define_insn_and_split "*fixuns_trunc2_mem"
+  [(set (match_operand:QHSI 0 "memory_operand" "=Z")
+        (unsigned_fix:QHSI
+         (match_operand:IEEE128 1 "altivec_register_operand" "v")))
+   (clobber (match_scratch:QHSI 2 "=v"))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 2)
+        (unsigned_fix:QHSI (match_dup 1)))
    (set (match_dup 0)
         (match_dup 2))])
 
Index: gcc/testsuite/gcc.target/powerpc/pr84154-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr84154-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr84154-1.c	(revision 0)
@@ -0,0 +1,55 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mpower8-vector -O2" } */
+
+/* PR target/84154.  Make sure conversion to char/short does not generate a
+   store and a load on ISA 2.07 and newer systems.  */
+
+unsigned char
+double_to_uc (double x)
+{
+  return x;
+}
+
+signed char
+double_to_sc (double x)
+{
+  return x;
+}
+
+unsigned short
+double_to_us (double x)
+{
+  return x;
+}
+
+short
+double_to_ss (double x)
+{
+  return x;
+}
+
+unsigned int
+double_to_ui (double x)
+{
+  return x;
+}
+
+int
+double_to_si (double x)
+{
+  return x;
+}
+
+/* { dg-final { scan-assembler-times {\mextsb\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mextsh\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mfctiwuz\M|\mxscvdpuxws\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mfctiwz\M|\mxscvdpsxws\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mmfvsrwz\M} 6 } } */
+/* { dg-final { scan-assembler-times {\mrlwinm\M} 2 } } */
+/* { dg-final { scan-assembler-not {\mlbz\M} } } */
+/* { dg-final { scan-assembler-not {\mlhz\M} } } */
+/* { dg-final { scan-assembler-not {\mlha\M} } } */
+/* { dg-final { scan-assembler-not {\mmfvsrd\M} } } */
+/* { dg-final { scan-assembler-not {\mstw\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/pr84154-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr84154-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr84154-2.c	(revision 0)
@@ -0,0 +1,58 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O2" } */
+
+/* PR target/84154.  Make sure on ISA 2.07 (power8) that we store the result of
+   a conversion to char/short using an offsettable address does not generate
+   direct moves for storing 32-bit integers, but does do a direct move for
+   8/16-bit integers.  */
+
+void
+double_to_uc (double x, unsigned char *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_sc (double x, signed char *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_us (double x, unsigned short *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_ss (double x, short *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_ui (double x, unsigned int *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_si (double x, int *p)
+{
+  p[3] = x;
+}
+
+/* { dg-final { scan-assembler-times {\mfctiwuz\M|\mxscvdpuxws\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mfctiwz\M|\mxscvdpsxws\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mmfvsrwz\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mstfiwx\M|\mstxsiwx\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mstb\M} 2 } } */
+/* { dg-final { scan-assembler-times {\msth\M} 2 } } */
+/* { dg-final { scan-assembler-not {\mlbz\M} } } */
+/* { dg-final { scan-assembler-not {\mlhz\M} } } */
+/* { dg-final { scan-assembler-not {\mlha\M} } } */
+/* { dg-final { scan-assembler-not {\mmfvsrd\M} } } */
+/* { dg-final { scan-assembler-not {\mstw\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/pr84154-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr84154-3.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr84154-3.c	(revision 0)
@@ -0,0 +1,60 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+/* PR target/84154.  Make sure on ISA 3.0 we store the result of a conversion
+   to char/short using an offsettable address does not generate direct moves
+   for storing 8/16/32-bit integers.  */
+
+void
+double_to_uc (double x, unsigned char *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_sc (double x, signed char *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_us (double x, unsigned short *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_ss (double x, short *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_ui (double x, unsigned int *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_si (double x, int *p)
+{
+  p[3] = x;
+}
+
+/* { dg-final { scan-assembler-times {\maddi\M} 6 } } */
+/* { dg-final { scan-assembler-times {\mfctiwuz\M|\mxscvdpuxws\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mfctiwz\M|\mxscvdpsxws\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mstfiwx\M|\mstxsiwx\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mstxsibx\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mstxsihx\M} 2 } } */
+/* { dg-final { scan-assembler-not {\mlbz\M} } } */
+/* { dg-final { scan-assembler-not {\mlhz\M} } } */
+/* { dg-final { scan-assembler-not {\mlha\M} } } */
+/* { dg-final { scan-assembler-not {\mmfvsrwz\M} } } */
+/* { dg-final { scan-assembler-not {\mmfvsrd\M} } } */
+/* { dg-final { scan-assembler-not {\mstw\M} } } */
+/* { dg-final { scan-assembler-not {\mstb\M} } } */
+/* { dg-final { scan-assembler-not {\msth\M} } } */