From patchwork Tue Sep 17 03:40:45 2024
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 1986288
Date: Mon, 16 Sep 2024 23:40:45 -0400
From: Michael Meissner
To: gcc-patches@gcc.gnu.org, Michael Meissner, Segher Boessenkool,
 "Kewen.Lin", David Edelsohn, Peter Bergner
Subject: [REPOST, PATCH] PR 89213: Add better support for shifting vectors
 with 64-bit elements

I posted this patch in August, and I never got a reply, so I'm reposting this
now.

This patch fixes PR target/89213 to allow better code to be generated to do
constant shifts of V2DI/V2DF vectors.
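
As a concrete illustration of the kind of source code this affects (my own
example, not taken from the patch; the function name is made up), consider a
constant arithmetic right shift of a vector with 64-bit elements, compiled
with something like -mcpu=power9 -O2:

	/* Illustration only -- not part of the patch.  */
	#include <altivec.h>

	vector long long
	shift_right_by_4 (vector long long a)
	{
	  /* Shift each 64-bit element right (algebraic) by the constant 4.  */
	  return vec_sra (a, (vector unsigned long long) {4, 4});
	}
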
Previously GCC would do constant shifts of vectors with 64-bit elements by
using:

	XXSPLTIB 32,4
	VEXTSB2D 0,0
	VSRAD 2,2,0

I.e., the PowerPC does not have a VSPLTISD instruction to load -15..14 for
the 64-bit shift count in one instruction.  Instead, it would need to load a
byte and then convert it to 64-bit.

With this patch, GCC now realizes that the vector shift instructions will
look at the bottom 6 bits for the shift count, and it can use either a
VSPLTISW or XXSPLTIB instruction to load the shift count.

I have built these patches on both big endian PowerPC server systems and
little endian PowerPC server systems, and there were no regressions.  Can I
check in this patch to the master branch for GCC 15?

2024-09-16  Michael Meissner

gcc/

	PR target/89213
	* config/rs6000/altivec.md (UNSPEC_VECTOR_SHIFT): New unspec.
	(VSHIFT_MODE): New mode iterator.
	(vshift_code): New code iterator.
	(vshift_attr): New code attribute.
	(altivec_<mode>_<vshift_attr>_const): New pattern to optimize
	vector long long/int shifts by a constant.
	(altivec_<mode>_shift_const): New helper insn to load up a
	constant used by the shift operation.
	* config/rs6000/predicates.md (vector_shift_constant): New
	predicate.

gcc/testsuite/

	PR target/89213
	* gcc.target/powerpc/pr89213.c: New test.
	* gcc.target/powerpc/vec-rlmi-rlnm.c: Update instruction count.

---
 gcc/config/rs6000/altivec.md               |  51 +++++++++
 gcc/config/rs6000/predicates.md            |  63 +++++++++++
 gcc/testsuite/gcc.target/powerpc/pr89213.c | 106 ++++++++++++++++++
 .../gcc.target/powerpc/vec-rlmi-rlnm.c     |   4 +-
 4 files changed, 222 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr89213.c

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 1f5489b974f..8faece984e9 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -170,6 +170,7 @@ (define_c_enum "unspec"
    UNSPEC_VSTRIL
    UNSPEC_SLDB
    UNSPEC_SRDB
+   UNSPEC_VECTOR_SHIFT
   ])
 
 (define_c_enum "unspecv"
@@ -2176,6 +2177,56 @@ (define_insn "altivec_vsro"
   "vsro %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
+;; Optimize V2DI shifts by constants.  This relies on the shift instructions
+;; only looking at the bits needed to do the shift.  This means we can use
+;; VSPLTISW or XXSPLTIB to load up the constant, and not worry about the bits
+;; that the vector shift instructions will not use.
+(define_mode_iterator VSHIFT_MODE	[(V4SI "TARGET_P9_VECTOR")
+					 (V2DI "TARGET_P8_VECTOR")])
+
+(define_code_iterator vshift_code	[ashift ashiftrt lshiftrt])
+(define_code_attr vshift_attr		[(ashift   "ashift")
+					 (ashiftrt "ashiftrt")
+					 (lshiftrt "lshiftrt")])
+
+(define_insn_and_split "*altivec_<mode>_<vshift_attr>_const"
+  [(set (match_operand:VSHIFT_MODE 0 "register_operand" "=v")
+	(vshift_code:VSHIFT_MODE
+	 (match_operand:VSHIFT_MODE 1 "register_operand" "v")
+	 (match_operand:VSHIFT_MODE 2 "vector_shift_constant" "")))
+   (clobber (match_scratch:VSHIFT_MODE 3 "=&v"))]
+  "((<MODE>mode == V2DImode && TARGET_P8_VECTOR)
+    || (<MODE>mode == V4SImode && TARGET_P9_VECTOR))"
+  "#"
+  "&& 1"
+  [(set (match_dup 3)
+	(unspec:VSHIFT_MODE [(match_dup 4)] UNSPEC_VECTOR_SHIFT))
+   (set (match_dup 0)
+	(vshift_code:VSHIFT_MODE (match_dup 1)
+				 (match_dup 3)))]
+{
+  if (GET_CODE (operands[3]) == SCRATCH)
+    operands[3] = gen_reg_rtx (<MODE>mode);
+
+  operands[4] = ((GET_CODE (operands[2]) == CONST_VECTOR)
+		 ? CONST_VECTOR_ELT (operands[2], 0)
+		 : XEXP (operands[2], 0));
+})
+
+(define_insn "*altivec_<mode>_shift_const"
+  [(set (match_operand:VSHIFT_MODE 0 "register_operand" "=v")
+	(unspec:VSHIFT_MODE [(match_operand 1 "const_int_operand" "n")]
+			    UNSPEC_VECTOR_SHIFT))]
+  "TARGET_P8_VECTOR"
+{
+  if (UINTVAL (operands[1]) <= 15)
+    return "vspltisw %0,%1";
+
+  else if (TARGET_P9_VECTOR)
+    return "xxspltib %x0,%1";
+
+  else
+    gcc_unreachable ();
+})
+
 (define_insn "altivec_vsum4ubs"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
 	(unspec:V4SI [(match_operand:V16QI 1 "register_operand" "v")
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 7f0b4ab61e6..0b78901e94b 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -861,6 +861,69 @@ (define_predicate "vector_int_reg_or_same_bit"
   return op == CONST0_RTX (mode) || op == CONSTM1_RTX (mode);
 })
 
+;; Return 1 if the operand is a V2DI or V4SI const_vector, where each element
+;; is the same constant, and the constant can be used for a shift operation.
+;; This is to prevent sub-optimal code, that needs to load up the constant and
+;; then zero extend it 32 or 64-bit vectors or load up the constant from the
+;; literal pool.
+;;
+;; For V4SImode, we only recognize shifts by 16..31 on ISA 3.0, since shifts by
+;; 1..15 can be handled by the normal VSPLTISW and vector shift instruction.
+;; For V2DImode, we do this all of the time, since there is no convenient
+;; instruction to load up a vector long long splatted constant.
+;;
+;; If we can use XXSPLTIB, then allow constants up to 63.  If not, we restrict
+;; the constant to 0..15 that can be loaded with VSPLTISW.  V4SI shifts are
+;; only optimized for ISA 3.0 when the shift value is >= 16 and <= 31.  Values
+;; between 0 and 15 can use a normal VSPLTISW to load the value, and it doesn't
+;; need this optimization.
+(define_predicate "vector_shift_constant"
+  (match_code "const_vector,vec_duplicate")
+{
+  unsigned HOST_WIDE_INT min_value;
+
+  if (mode == V2DImode)
+    {
+      min_value = 0;
+      if (!TARGET_P8_VECTOR)
+	return 0;
+    }
+  else if (mode == V4SImode)
+    {
+      min_value = 16;
+      if (!TARGET_P9_VECTOR)
+	return 0;
+    }
+  else
+    return 0;
+
+  unsigned HOST_WIDE_INT max_value = TARGET_P9_VECTOR ? 63 : 15;
+
+  if (GET_CODE (op) == CONST_VECTOR)
+    {
+      unsigned HOST_WIDE_INT first = UINTVAL (CONST_VECTOR_ELT (op, 0));
+      unsigned nunits = GET_MODE_NUNITS (mode);
+      unsigned i;
+
+      if (!IN_RANGE (first, min_value, max_value))
+	return 0;
+
+      for (i = 1; i < nunits; i++)
+	if (first != UINTVAL (CONST_VECTOR_ELT (op, i)))
+	  return 0;
+
+      return 1;
+    }
+  else
+    {
+      rtx op0 = XEXP (op, 0);
+      if (!CONST_INT_P (op0))
+	return 0;
+
+      return IN_RANGE (UINTVAL (op0), min_value, max_value);
+    }
+})
+
 ;; Return 1 if operand is 0.0.
 (define_predicate "zero_fp_constant"
   (and (match_code "const_double")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr89213.c b/gcc/testsuite/gcc.target/powerpc/pr89213.c
new file mode 100644
index 00000000000..8f5fae2c3ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr89213.c
@@ -0,0 +1,106 @@
+/* { dg-do compile { target { lp64 } } } */
+/* { dg-require-effective-target powerpc_vsx } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+/* Optimize vector shifts by constants.  */
+
+#include <altivec.h>
+
+typedef vector long long vi64_t;
+typedef vector unsigned long long vui64_t;
+
+typedef vector int vi32_t;
+typedef vector unsigned int vui32_t;
+
+vi64_t
+shiftra_test64_4 (vi64_t a)
+{
+  vui64_t x = {4, 4};
+  return (vi64_t) vec_vsrad (a, x);
+}
+
+vi64_t
+shiftrl_test64_4 (vi64_t a)
+{
+  vui64_t x = {4, 4};
+  return (vi64_t) vec_vsrd (a, x);
+}
+
+vi64_t
+shiftl_test64_4 (vi64_t a)
+{
+  vui64_t x = {4, 4};
+  return (vi64_t) vec_vsld (a, x);
+}
+
+vi64_t
+shiftra_test64_29 (vi64_t a)
+{
+  vui64_t x = {29, 29};
+  return (vi64_t) vec_vsrad (a, x);
+}
+
+vi64_t
+shiftrl_test64_29 (vi64_t a)
+{
+  vui64_t x = {29, 29};
+  return (vi64_t) vec_vsrd (a, x);
+}
+
+vi64_t
+shiftl_test64_29 (vi64_t a)
+{
+  vui64_t x = {29, 29};
+  return (vi64_t) vec_vsld (a, x);
+}
+
+vi32_t
+shiftra_test32_4 (vi32_t a)
+{
+  vui32_t x = {4, 4, 4, 4};
+  return (vi32_t) vec_vsraw (a, x);
+}
+
+vi32_t
+shiftrl_test32_4 (vi32_t a)
+{
+  vui32_t x = {4, 4, 4, 4};
+  return (vi32_t) vec_vsrw (a, x);
+}
+
+vi32_t
+shiftl_test32_4 (vi32_t a)
+{
+  vui32_t x = {4, 4, 4, 4};
+  return (vi32_t) vec_vslw (a, x);
+}
+
+vi32_t
+shiftra_test32_29 (vi32_t a)
+{
+  vui32_t x = {29, 29, 29, 29};
+  return (vi32_t) vec_vsraw (a, x);
+}
+
+vi32_t
+shiftrl_test32_29 (vi32_t a)
+{
+  vui32_t x = {29, 29, 29, 29};
+  return (vi32_t) vec_vsrw (a, x);
+}
+
+vi32_t
+shiftl_test32_29 (vi32_t a)
+{
+  vui32_t x = {29, 29, 29, 29};
+  return (vi32_t) vec_vslw (a, x);
+}
+
+/* { dg-final { scan-assembler-times {\mxxspltib\M} 6 } } */
+/* { dg-final { scan-assembler-times {\mvsld\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvslw\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvspltisw\M} 6 } } */
+/* { dg-final { scan-assembler-times {\mvsrd\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvsrw\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvsrad\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvsraw\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c b/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c
index 6834733b1bf..01fa0a99d46 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c
@@ -54,12 +54,12 @@ rlnm_test_2 (vector unsigned long long x, vector unsigned long long y,
     - For rlnm_test_1: vspltisw, vslw, xxlor, vrlwnm.
     - For rlnm_test_2: xxspltib, vextsb2d, vsld, xxlor, vrldnm.
    There is a choice of splat instructions in both cases, so we
-   just check for "splt".  */
+   just check for "splt".  In the past vextsb2d would be generated for
+   rlnm_test_2, but the compiler no longer generates it.  */
 
 /* { dg-final { scan-assembler-times "vrlwmi" 1 } } */
 /* { dg-final { scan-assembler-times "vrldmi" 1 } } */
 /* { dg-final { scan-assembler-times "splt" 2 } } */
-/* { dg-final { scan-assembler-times "vextsb2d" 1 } } */
 /* { dg-final { scan-assembler-times "vslw" 1 } } */
 /* { dg-final { scan-assembler-times "vsld" 1 } } */
 /* { dg-final { scan-assembler-times "xxlor" 4 } } */
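
Postscript (my own illustration, not part of the patch): the reason a byte
splat such as XXSPLTIB 29 is a valid shift count here is that, as noted above,
the vector shift instructions only read the low-order bits of each element.
A small host-side sanity check of that arithmetic:

	/* Illustration only.  XXSPLTIB 29 replicates the byte 0x1d into every
	   byte of the vector, so each doubleword element holds
	   0x1d1d1d1d1d1d1d1d.  VSRAD/VSRD/VSLD use only the low 6 bits of each
	   64-bit element as the shift count, and VSRAW/VSRW/VSLW use only the
	   low 5 bits of each 32-bit element, so the splatted value still
	   shifts by exactly 29.  */
	#include <inttypes.h>
	#include <stdio.h>

	int
	main (void)
	{
	  uint64_t splat64 = 0x1d1d1d1d1d1d1d1dULL;	/* byte 29 splatted.  */
	  uint32_t splat32 = 0x1d1d1d1d;		/* same, per 32-bit element.  */

	  printf ("64-bit shift count used: %" PRIu64 "\n", splat64 & 0x3f); /* 29 */
	  printf ("32-bit shift count used: %u\n", splat32 & 0x1f);          /* 29 */
	  return 0;
	}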