From patchwork Thu Feb  1 19:31:17 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Meissner <meissner@linux.vnet.ibm.com>
X-Patchwork-Id: 868395
Return-Path: 
 <gcc-patches-return-472500-incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
	spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org
	(client-ip=209.132.180.131; helo=sourceware.org;
	envelope-from=gcc-patches-return-472500-incoming=patchwork.ozlabs.org@gcc.gnu.org;
	receiver=<UNKNOWN>)
Authentication-Results: ozlabs.org; dkim=pass (1024-bit key;
	unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org
	header.b="GSvzxFdL"; dkim-atps=neutral
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
	bits)) (No client certificate requested)
	by ozlabs.org (Postfix) with ESMTPS id 3zXVbr6hy4z9s82
	for <incoming@patchwork.ozlabs.org>;
	Fri,  2 Feb 2018 06:31:36 +1100 (AEDT)
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender:date
	:from:to:subject:mime-version:content-type:message-id; q=dns; s=
	default; b=lBJgNX8JQAtq3quueHA6QL6vlte2eQXfMnUxXjxpo5Boc8N+lbji2
	Pux7TRkRyza43auTY4hY33sHLNH6tCtgyJc0nsrEx1oe4idyimSuaF4FacOEDSet
	XSDI9TaJ1OJ8O09BUuW7LvEffmt74VfrlYkl65JIWBpwVML0r3mrNI=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender:date
	:from:to:subject:mime-version:content-type:message-id; s=
	default; bh=66cO0RLD4d/q/KfClpYe16BP3W4=; b=GSvzxFdLXJYZcR/m1V+8
	its/3TYDRWDfOhmW0gRj+qSDDWqxitPDKiL1jK5pzZ9frm5QXGDbNCVkaeb8bT9l
	zW/xQJm2IPHyceCI48Zx/Y1bm6A5ur8YKaplcULa+d5GxsQZ+iKxySc3LbZE3KEH
	zl7kn/HzaStvNWIocHEZs8o=
Received: (qmail 23018 invoked by alias); 1 Feb 2018 19:31:28 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: 
 <mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 20456 invoked by uid 89); 1 Feb 2018 19:31:26 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-10.1 required=5.0 tests=AWL, BAYES_00,
	GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS,
	KAM_LAZY_DOMAIN_SECURITY,
	RCVD_IN_DNSWL_LOW autolearn=ham version=3.3.2 spammy=mrlwinm
X-HELO: mx0a-001b2d01.pphosted.com
Received: from mx0a-001b2d01.pphosted.com (HELO mx0a-001b2d01.pphosted.com)
	(148.163.156.1) by sourceware.org
	(qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
	Thu, 01 Feb 2018 19:31:23 +0000
Received: from pps.filterd (m0098396.ppops.net [127.0.0.1])	by
	mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id
	w11JSpNZ042201	for <gcc-patches@gcc.gnu.org>;
	Thu, 1 Feb 2018 14:31:22 -0500
Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.152])	by
	mx0a-001b2d01.pphosted.com with ESMTP id
	2fv67krh4n-1	(version=TLSv1.2 cipher=AES256-SHA bits=256
	verify=NOT)	for <gcc-patches@gcc.gnu.org>;
	Thu, 01 Feb 2018 14:31:21 -0500
Received: from localhost	by e34.co.us.ibm.com with IBM ESMTP SMTP Gateway:
	Authorized Use Only! Violators will be prosecuted	for
	<gcc-patches@gcc.gnu.org> from
	<meissner@ibm-tiger.the-meissners.org>;
	Thu, 1 Feb 2018 12:31:21 -0700
Received: from b03cxnp08027.gho.boulder.ibm.com (9.17.130.19)	by
	e34.co.us.ibm.com (192.168.1.134) with IBM ESMTP SMTP
	Gateway: Authorized Use Only! Violators will be prosecuted;
	Thu, 1 Feb 2018 12:31:18 -0700
Received: from b03ledav003.gho.boulder.ibm.com
	(b03ledav003.gho.boulder.ibm.com [9.17.130.234])	by
	b03cxnp08027.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0)
	with ESMTP id w11JVG3e11796928; Thu, 1 Feb 2018 12:31:18 -0700
Received: from b03ledav003.gho.boulder.ibm.com (unknown [127.0.0.1])	by
	IMSVA (Postfix) with ESMTP id 74B136A03B;
	Thu,  1 Feb 2018 12:31:18 -0700 (MST)
Received: from ibm-tiger.the-meissners.org (unknown [9.32.77.111])	by
	b03ledav003.gho.boulder.ibm.com (Postfix) with ESMTP id
	4C4FA6A03F; Thu,  1 Feb 2018 12:31:18 -0700 (MST)
Received: by ibm-tiger.the-meissners.org (Postfix, from userid 500)	id
	8586D495F7; Thu,  1 Feb 2018 14:31:17 -0500 (EST)
Date: Thu, 1 Feb 2018 14:31:17 -0500
From: Michael Meissner <meissner@linux.vnet.ibm.com>
To: GCC Patches <gcc-patches@gcc.gnu.org>,
	Segher Boessenkool <segher@kernel.crashing.org>,
	David Edelsohn <dje.gcc@gmail.com>,
	Bill Schmidt <wschmidt@linux.vnet.ibm.com>
Subject: [PATCH] PowerPC PR target/84154,
	fix floating point to small integer conversion regression
Mail-Followup-To: Michael Meissner <meissner@linux.vnet.ibm.com>,
	GCC Patches <gcc-patches@gcc.gnu.org>,
	Segher Boessenkool <segher@kernel.crashing.org>,
	David Edelsohn <dje.gcc@gmail.com>,
	Bill Schmidt <wschmidt@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Disposition: inline
User-Agent: Mutt/1.5.20 (2009-12-10)
X-TM-AS-GCONF: 00
x-cbid: 18020119-0016-0000-0000-000008328DEB
X-IBM-SpamModules-Scores: 
X-IBM-SpamModules-Versions: BY=3.00008459; HX=3.00000241; KW=3.00000007;
	PH=3.00000004; SC=3.00000248; SDB=6.00983611; UDB=6.00498865;
	IPR=6.00762889; BA=6.00005808; NDR=6.00000001; ZLA=6.00000005;
	ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000;
	ZU=6.00000002; MB=3.00019323; XFM=3.00000015;
	UTC=2018-02-01 19:31:20
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 18020119-0017-0000-0000-00003D4B854D
Message-Id: <20180201193116.GA15164@ibm-tiger.the-meissners.org>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, ,
	definitions=2018-02-01_06:, , signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
	priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0
	bulkscore=0 spamscore=0 clxscore=1011 lowpriorityscore=0
	impostorscore=0 adultscore=0 classifier=spam adjust=0
	reason=mlx scancount=1 engine=8.0.1-1709140000
	definitions=main-1802010249
X-IsSubscribed: yes

This patch fixes the optimization regression that occurred on GCC 7 where
conversions from the various floating point types to small integers would at
times generate a store and a load.

For example, converting from double to unsigned char generated the following
code on GCC 6 for -mcpu=power8:

        fctiwuz 1,1
        mfvsrd 3,1
        rlwinm 3,3,0,0xff

on GCC 7 and 8 it generates:

        fctiwuz 0,1
        mfvsrwz 9,0
        stw 9,-16(1)
        ori 2,2,0
        lbz 3,-16(1)

The insns before register allocation are:

	(insn 7 8 13 2 (set (subreg:SI (reg:QI 157) 0)
			    (unsigned_fix:SI (reg:SF 33)))

	(insn 13 7 14 2 (set (reg/i:DI 3 3)
			     (zero_extend:DI (reg:QI 157))))

After reload, the insns are:

	(insn 7 8 19 2 (set (reg:SI 32 0 [160])
			    (unsigned_fix:SI (reg:SF 33))))

	(insn 19 7 18 2 (set (reg:SI 9 9 [160])
			     (reg:SI 32 0 [160])))

	(insn 18 19 13 2 (set (mem/c:SI (plus:DI (reg/f:DI 1 1)
						 (const_int -16))
			      (reg:SI 9))))

	(insn 13 18 14 2 (set (reg/i:DI 3 3)
			      (zero_extend:DI (mem/c:HI (plus:DI (reg/f:DI 1 1)
								 (const_int -16))))))

ISA 3.0 (Power9) did not have this problem, because it already had a
fixuns_truncdfqi2 pattern, since QI/HImode values are allowed in vector
registers.  Previous versions of the ISA did not allow QI/HImode into vector
registers, because there wasn't load or store byte/half-word operations.

I extended ISA 3.0 conversion patterns to handle ISA 2.07, using splitters to
move the 32-bit int parts back to the GPR to do sign/zero extension or stores.

I also moved the optimization to prevent the register allocator from doing a
direct move on ISA 3.0 to do an offsettable store via the GPR register to a
separate insn, like I had previously done for SImode.  The rationale for this
is to prevent some places where the register allocator decided to do change a
store into a move (and then later store).

I have tested this patch on a little endian power8 system (64-bit) and a big
endian power8 system (both 32-bit and 64-bit executables).  There were no
regressions in the test suite and the compiler bootstrapped fine.  I added some
tests, and verified they ran in all 3 environments.  Can I check this into the
trunk?  Given this is a regression in GCC 7 as well, can I check the patch if
it applies cleanly into GCC 7 after a burn-in period.

[gcc]
2018-02-01  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/84154
	* config/rs6000/rs6000.md (fix_trunc<SFDF:mode><QHI:mode>2):
	Convert from define_expand to be define_insn_and_split.  Rework
	float/double/_Float128 conversions to QI/HI/SImode to work with
	both ISA 2.07 (power8) or ISA 3.0 (power9).  Fix regression where
	conversions to QI/HImode types did a store and then a load to
	truncate the value.  For conversions to VSX registers, don't split
	the insn, instead emit the code directly.
	(fixuns_trunc<SFDF:mode><QHI:mode>2): Likewise.
	(fix_trunc<IEEE128:mode><QHI:mode>2): Likewise.
	(fix_trunc<SFDF:mode><QHI:mode>2_internal): Delete, no longer
	used.
	(fixuns_trunc<SFDF:mode><QHI:mode>2_internal): Likewise.
	(fix<uns>_<mode>_mem): Likewise.
	(fix_trunc<SFDF:mode><QHI:mode>2_mem): On ISA 3.0, prevent the
	register allocator from doing a direct move to the GPRs to do a
	store, and instead use the ISA 3.0 store byte/half-word from
	vector register instruction.  For IEEE 128-bit floating point,
	also optimize stores of 32-bit ints.
	(fixuns_trunc<SFDF:mode><QHI:mode>2_mem): Likewise.
	(fix_trunc<IEEE128:mode><QHSI:mode>2_mem): Likewise.
	(fixuns_trunc<IEEE128:mode><QHSI:mode>2_mem): Likewise.

[gcc/testsuite]
2018-02-01  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/84154
	* gcc.target/powerpc/pr84154-1.c: New tests.
	* gcc.target/powerpc/pr84154-2.c: Likewise.
	* gcc.target/powerpc/pr84154-3.c: Likewise.

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 257269)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -5700,43 +5700,48 @@ (define_insn "*fix_trunc<mode>di2_fctidz
    xscvdpsxds %x0,%x1"
   [(set_attr "type" "fp")])
 
-(define_expand "fix_trunc<SFDF:mode><QHI:mode>2"
-  [(parallel [(set (match_operand:<QHI:MODE> 0 "nonimmediate_operand")
-		   (fix:QHI (match_operand:SFDF 1 "gpc_reg_operand")))
-	      (clobber (match_scratch:DI 2))])]
-  "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE_64BIT"
+;; If have ISA 3.0, QI/HImode values can go in both VSX registers and GPR
+;; registers.  If we have ISA 2.07, we don't allow QI/HImode values in the
+;; vector registers, so we need to do direct moves to the GPRs, but SImode
+;; values can go in VSX registers.  Keeping the direct move part through
+;; register allocation prevents the register allocator from doing a direct of
+;; the SImode value to a GPR, and then a store/load.
+(define_insn_and_split "fix_trunc<SFDF:mode><QHI:mode>2"
+  [(set (match_operand:<QHI:MODE> 0 "gpc_reg_operand" "=wJ,wJwK,r")
+	(fix:QHI (match_operand:SFDF 1 "gpc_reg_operand" "wJ,wJwK,wa")))
+   (clobber (match_scratch:SI 2 "=X,X,wi"))]
+  "TARGET_DIRECT_MOVE"
+  "@
+   fctiwz %0,%1
+   xscvdpsxws %x0,%x1
+   #"
+  "&& reload_completed && int_reg_operand (operands[0], <QHI:MODE>mode)"
+  [(set (match_dup 2)
+	(fix:SI (match_dup 1)))
+   (set (match_dup 3)
+	(match_dup 2))]
 {
-  if (MEM_P (operands[0]))
-    operands[0] = rs6000_address_for_fpconvert (operands[0]);
-})
+  operands[3] = gen_rtx_REG (SImode, REGNO (operands[0]));
+}
+  [(set_attr "length" "4,4,8")
+   (set_attr "type" "fp")])
 
-(define_insn_and_split "*fix_trunc<SFDF:mode><QHI:mode>2_internal"
-  [(set (match_operand:<QHI:MODE> 0 "reg_or_indexed_operand" "=wIwJ,rZ")
-	(fix:QHI
-	 (match_operand:SFDF 1 "gpc_reg_operand" "<SFDF:Fv>,<SFDF:Fv>")))
-   (clobber (match_scratch:DI 2 "=X,wi"))]
-  "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE_64BIT"
+;; Keep the convert and store together through register allocation to prevent
+;; the register allocator from getting clever and doing a direct move to a GPR
+;; and then store for reg+offset stores.
+(define_insn_and_split "*fix_trunc<SFDF:mode><QHI:mode>2_mem"
+  [(set (match_operand:QHI 0 "memory_operand" "=Z")
+	(fix:QHI (match_operand:SFDF 1 "gpc_reg_operand" "wa")))
+   (clobber (match_scratch:SI 2 "=wa"))]
+  "TARGET_P9_VECTOR"
   "#"
   "&& reload_completed"
-  [(const_int 0)]
+  [(set (match_dup 2)
+	(fix:SI (match_dup 1)))
+   (set (match_dup 0)
+	(match_dup 3))]
 {
-  rtx dest = operands[0];
-  rtx src = operands[1];
-
-  if (vsx_register_operand (dest, <QHI:MODE>mode))
-    {
-      rtx di_dest = gen_rtx_REG (DImode, REGNO (dest));
-      emit_insn (gen_fix_trunc<SFDF:mode>di2 (di_dest, src));
-    }
-  else
-    {
-      rtx tmp = operands[2];
-      rtx tmp2 = gen_rtx_REG (<QHI:MODE>mode, REGNO (tmp));
-
-      emit_insn (gen_fix_trunc<SFDF:mode>di2 (tmp, src));
-      emit_move_insn (dest, tmp2);
-    }
-  DONE;
+  operands[3] = gen_rtx_REG (<QHI:MODE>mode, REGNO (operands[2]));
 })
 
 (define_expand "fixuns_trunc<mode>si2"
@@ -5803,48 +5808,51 @@ (define_insn "fixuns_trunc<mode>di2"
    xscvdpuxds %x0,%x1"
   [(set_attr "type" "fp")])
 
-(define_expand "fixuns_trunc<SFDF:mode><QHI:mode>2"
-  [(parallel [(set (match_operand:<QHI:MODE> 0 "nonimmediate_operand")
-		   (unsigned_fix:QHI (match_operand:SFDF 1 "gpc_reg_operand")))
-	      (clobber (match_scratch:DI 2))])]
-  "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE_64BIT"
+;; If have ISA 3.0, QI/HImode values can go in both VSX registers and GPR
+;; registers.  If we have ISA 2.07, we don't allow QI/HImode values in the
+;; vector registers, so we need to do direct moves to the GPRs, but SImode
+;; values can go in VSX registers.  Keeping the direct move part through
+;; register allocation prevents the register allocator from doing a direct of
+;; the SImode value to a GPR, and then a store/load.
+(define_insn_and_split "fixuns_trunc<SFDF:mode><QHI:mode>2"
+  [(set (match_operand:<QHI:MODE> 0 "gpc_reg_operand" "=wJ,wJwK,r")
+	(unsigned_fix:QHI
+	 (match_operand:SFDF 1 "gpc_reg_operand" "wJ,wJwK,wa")))
+   (clobber (match_scratch:SI 2 "=X,X,wi"))]
+  "TARGET_DIRECT_MOVE"
+  "@
+   fctiwuz %0,%1
+   xscvdpuxws %x0,%x1
+   #"
+  "&& reload_completed && int_reg_operand (operands[0], <QHI:MODE>mode)"
+  [(set (match_dup 2)
+	(unsigned_fix:SI (match_dup 1)))
+   (set (match_dup 3)
+	(match_dup 2))]
 {
-  if (MEM_P (operands[0]))
-    operands[0] = rs6000_address_for_fpconvert (operands[0]);
-})
+  operands[3] = gen_rtx_REG (SImode, REGNO (operands[0]));
+}
+  [(set_attr "length" "4,4,8")
+   (set_attr "type" "fp")])
 
-(define_insn_and_split "*fixuns_trunc<SFDF:mode><QHI:mode>2_internal"
-  [(set (match_operand:<QHI:MODE> 0 "reg_or_indexed_operand" "=wIwJ,rZ")
-	(unsigned_fix:QHI
-	 (match_operand:SFDF 1 "gpc_reg_operand" "<SFDF:Fv>,<SFDF:Fv>")))
-   (clobber (match_scratch:DI 2 "=X,wi"))]
-  "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE_64BIT"
+;; Keep the convert and store together through register allocation to prevent
+;; the register allocator from getting clever and doing a direct move to a GPR
+;; and then store for reg+offset stores.
+(define_insn_and_split "*fixuns_trunc<SFDF:mode><QHI:mode>2_mem"
+  [(set (match_operand:QHI 0 "memory_operand" "=Z")
+	(unsigned_fix:QHI (match_operand:SFDF 1 "gpc_reg_operand" "wa")))
+   (clobber (match_scratch:SI 2 "=wa"))]
+  "TARGET_P9_VECTOR"
   "#"
   "&& reload_completed"
-  [(const_int 0)]
+  [(set (match_dup 2)
+	(unsigned_fix:SI (match_dup 1)))
+   (set (match_dup 0)
+	(match_dup 3))]
 {
-  rtx dest = operands[0];
-  rtx src = operands[1];
-
-  if (vsx_register_operand (dest, <QHI:MODE>mode))
-    {
-      rtx di_dest = gen_rtx_REG (DImode, REGNO (dest));
-      emit_insn (gen_fixuns_trunc<SFDF:mode>di2 (di_dest, src));
-    }
-  else
-    {
-      rtx tmp = operands[2];
-      rtx tmp2 = gen_rtx_REG (<QHI:MODE>mode, REGNO (tmp));
-
-      emit_insn (gen_fixuns_trunc<SFDF:mode>di2 (tmp, src));
-      emit_move_insn (dest, tmp2);
-    }
-  DONE;
+  operands[3] = gen_rtx_REG (<QHI:MODE>mode, REGNO (operands[2]));
 })
 
-;; If -mvsx-small-integer, we can represent the FIX operation directly.  On
-;; older machines, we have to use an UNSPEC to produce a SImode and move it
-;; to another location, since SImode is not allowed in vector registers.
 (define_insn "*fctiw<u>z_<mode>_smallint"
   [(set (match_operand:SI 0 "vsx_register_operand" "=d,wi")
 	(any_fix:SI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
@@ -14386,6 +14394,15 @@ (define_insn "fix_<mode>si2_hw"
   [(set_attr "type" "vecfloat")
    (set_attr "size" "128")])
 
+(define_insn "fix_trunc<IEEE128:mode><QHI:mode>2"
+  [(set (match_operand:<QHI:MODE> 0 "altivec_register_operand" "=v")
+	(fix:QHI
+	 (match_operand:IEEE128 1 "altivec_register_operand" "v")))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<IEEE128:MODE>mode)"
+  "xscvqpswz %0,%1"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
 (define_insn "fixuns_<mode>si2_hw"
   [(set (match_operand:SI 0 "altivec_register_operand" "=v")
 	(unsigned_fix:SI (match_operand:IEEE128 1 "altivec_register_operand" "v")))]
@@ -14394,17 +14411,40 @@ (define_insn "fixuns_<mode>si2_hw"
   [(set_attr "type" "vecfloat")
    (set_attr "size" "128")])
 
-;; Combiner pattern to prevent moving the result of converting an IEEE 128-bit
-;; floating point value to 32-bit integer to GPR in order to save it.
-(define_insn_and_split "*fix<uns>_<mode>_mem"
-  [(set (match_operand:SI 0 "memory_operand" "=Z")
-	(any_fix:SI (match_operand:IEEE128 1 "altivec_register_operand" "v")))
-   (clobber (match_scratch:SI 2 "=v"))]
+(define_insn "fixuns_trunc<IEEE128:mode><QHI:mode>2"
+  [(set (match_operand:<QHI:MODE> 0 "altivec_register_operand" "=v")
+	(unsigned_fix:QHI
+	 (match_operand:IEEE128 1 "altivec_register_operand" "v")))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<IEEE128:MODE>mode)"
+  "xscvqpuwz %0,%1"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
+;; Combiner patterns to prevent moving the result of converting an IEEE 128-bit
+;; floating point value to 8/16/32-bit integer to GPR in order to save it.
+(define_insn_and_split "*fix_trunc<IEEE128:mode><QHSI:mode>2_mem"
+  [(set (match_operand:QHSI 0 "memory_operand" "=Z")
+	(fix:QHSI
+	 (match_operand:IEEE128 1 "altivec_register_operand" "v")))
+   (clobber (match_scratch:QHSI 2 "=v"))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "#"
   "&& reload_completed"
   [(set (match_dup 2)
-	(any_fix:SI (match_dup 1)))
+	(fix:QHSI (match_dup 1)))
+   (set (match_dup 0)
+	(match_dup 2))])
+
+(define_insn_and_split "*fixuns_trunc<IEEE128:mode><QHSI:mode>2_mem"
+  [(set (match_operand:QHSI 0 "memory_operand" "=Z")
+	(unsigned_fix:QHSI
+	 (match_operand:IEEE128 1 "altivec_register_operand" "v")))
+   (clobber (match_scratch:QHSI 2 "=v"))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 2)
+	(unsigned_fix:QHSI (match_dup 1)))
    (set (match_dup 0)
 	(match_dup 2))])
 
Index: gcc/testsuite/gcc.target/powerpc/pr84154-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr84154-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr84154-1.c	(revision 0)
@@ -0,0 +1,55 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mpower8-vector -O2" } */
+
+/* PR target/84154.  Make sure conversion to char/short does not generate a
+   store and a load on ISA 2.07 and newer systems.  */
+
+unsigned char
+double_to_uc (double x)
+{
+  return x;
+}
+
+signed char
+double_to_sc (double x)
+{
+  return x;
+}
+
+unsigned short
+double_to_us (double x)
+{
+  return x;
+}
+
+short
+double_to_ss (double x)
+{
+  return x;
+}
+
+unsigned int
+double_to_ui (double x)
+{
+  return x;
+}
+
+int
+double_to_si (double x)
+{
+  return x;
+}
+
+/* { dg-final { scan-assembler-times {\mextsb\M}                  1 } } */
+/* { dg-final { scan-assembler-times {\mextsh\M}                  1 } } */
+/* { dg-final { scan-assembler-times {\mfctiwuz\M|\mxscvdpuxws\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mfctiwz\M|\mxscvdpsxws\M}  3 } } */
+/* { dg-final { scan-assembler-times {\mmfvsrwz\M}                6 } } */
+/* { dg-final { scan-assembler-times {\mrlwinm\M}                 2 } } */
+/* { dg-final { scan-assembler-not   {\mlbz\M}                      } } */
+/* { dg-final { scan-assembler-not   {\mlhz\M}                      } } */
+/* { dg-final { scan-assembler-not   {\mlha\M}                      } } */
+/* { dg-final { scan-assembler-not   {\mmfvsrd\M}                   } } */
+/* { dg-final { scan-assembler-not   {\mstw\M}                      } } */
Index: gcc/testsuite/gcc.target/powerpc/pr84154-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr84154-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr84154-2.c	(revision 0)
@@ -0,0 +1,58 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O2" } */
+
+/* PR target/84154.  Make sure on ISA 2.07 (power8) that we store the result of
+   a conversion to char/short using an offsettable address does not generate
+   direct moves for storing 32-bit integers, but does do a direct move for
+   8/16-bit integers.  */
+
+void
+double_to_uc (double x, unsigned char *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_sc (double x, signed char *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_us (double x, unsigned short *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_ss (double x, short *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_ui (double x, unsigned int *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_si (double x, int *p)
+{
+  p[3] = x;
+}
+
+/* { dg-final { scan-assembler-times {\mfctiwuz\M|\mxscvdpuxws\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mfctiwz\M|\mxscvdpsxws\M}  3 } } */
+/* { dg-final { scan-assembler-times {\mmfvsrwz\M}                4 } } */
+/* { dg-final { scan-assembler-times {\mstfiwx\M|\mstxsiwx\M}     2 } } */
+/* { dg-final { scan-assembler-times {\mstb\M}                    2 } } */
+/* { dg-final { scan-assembler-times {\msth\M}                    2 } } */
+/* { dg-final { scan-assembler-not   {\mlbz\M}                      } } */
+/* { dg-final { scan-assembler-not   {\mlhz\M}                      } } */
+/* { dg-final { scan-assembler-not   {\mlha\M}                      } } */
+/* { dg-final { scan-assembler-not   {\mmfvsrd\M}                   } } */
+/* { dg-final { scan-assembler-not   {\mstw\M}                      } } */
Index: gcc/testsuite/gcc.target/powerpc/pr84154-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr84154-3.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr84154-3.c	(revision 0)
@@ -0,0 +1,60 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+/* PR target/84154.  Make sure on ISA 3.0 we store the result of a conversion
+   to char/short using an offsettable address does not generate direct moves
+   for storing 8/16/32-bit integers.  */
+
+void
+double_to_uc (double x, unsigned char *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_sc (double x, signed char *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_us (double x, unsigned short *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_ss (double x, short *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_ui (double x, unsigned int *p)
+{
+  p[3] = x;
+}
+
+void
+double_to_si (double x, int *p)
+{
+  p[3] = x;
+}
+
+/* { dg-final { scan-assembler-times {\maddi\M}                   6 } } */
+/* { dg-final { scan-assembler-times {\mfctiwuz\M|\mxscvdpuxws\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mfctiwz\M|\mxscvdpsxws\M}  3 } } */
+/* { dg-final { scan-assembler-times {\mstfiwx\M|\mstxsiwx\M}     2 } } */
+/* { dg-final { scan-assembler-times {\mstxsibx\M}                2 } } */
+/* { dg-final { scan-assembler-times {\mstxsihx\M}                2 } } */
+/* { dg-final { scan-assembler-not   {\mlbz\M}                      } } */
+/* { dg-final { scan-assembler-not   {\mlhz\M}                      } } */
+/* { dg-final { scan-assembler-not   {\mlha\M}                      } } */
+/* { dg-final { scan-assembler-not   {\mmfvsrwz\M}                  } } */
+/* { dg-final { scan-assembler-not   {\mmfvsrd\M}                   } } */
+/* { dg-final { scan-assembler-not   {\mstw\M}                      } } */
+/* { dg-final { scan-assembler-not   {\mstb\M}                      } } */
+/* { dg-final { scan-assembler-not   {\msth\M}                      } } */