From patchwork Fri Jul 19 20:04:12 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Carl Love <cel@linux.ibm.com>
X-Patchwork-Id: 1962614
Return-Path: <gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org;
	dkim=pass (2048-bit key;
 unprotected) header.d=ibm.com header.i=@ibm.com header.a=rsa-sha256
 header.s=pp1 header.b=dyr/vj5C;
	dkim-atps=neutral
Authentication-Results: legolas.ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org
 (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;
 envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;
 receiver=patchwork.ozlabs.org)
Received: from server2.sourceware.org (server2.sourceware.org
 [IPv6:2620:52:3:1:0:246e:9693:128c])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)
	(No client certificate requested)
	by legolas.ozlabs.org (Postfix) with ESMTPS id 4WQgcy1J6Lz1xrQ
	for <incoming@patchwork.ozlabs.org>; Sat, 20 Jul 2024 06:04:48 +1000 (AEST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id CB16A3847700
	for <incoming@patchwork.ozlabs.org>; Fri, 19 Jul 2024 20:04:46 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com
 [148.163.158.5])
 by sourceware.org (Postfix) with ESMTPS id E3CBE385DDC5
 for <gcc-patches@gcc.gnu.org>; Fri, 19 Jul 2024 20:04:18 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E3CBE385DDC5
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=linux.ibm.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E3CBE385DDC5
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=148.163.158.5
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1721419465; cv=none;
 b=sxWGMOopgT2lJe8dk8lGBgWLK9H8rPK3/XryqpyOrWbIKqKLeLZjkCofblK4nfoyrMh9gZJGU71UxL92ksSa1hIDx3ZC7zbXWwR/mbamNfq95at92niZFryHLvbXrt+IBwZfu1HLXIoP3uSofiPDknG4t7HaScbBIIcqG7vxOio=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1721419465; c=relaxed/simple;
 bh=N8qGUTbtfXdG352FqkIPx88GKVwwQdcxFPJM4UnQWVk=;
 h=DKIM-Signature:Message-ID:Date:MIME-Version:To:From:Subject;
 b=xS25IaF9g9JzBbDH6LUngmLlaHiqRpDwEkZSAfv7vh55cnYCfeeMzZjg1nbfEGh5Ng4b11+taHNaDxJXi/gjGZnmUOIiBQXBoQf9w/Mben+dvnnchcVJzqeT3tqeGEooMaYMLXCqovzdvQB+vPdQJwtcNxsTBc5mKbs7XzM+zWg=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: from pps.filterd (m0353723.ppops.net [127.0.0.1])
 by mx0a-001b2d01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id
 46JJuiFA009259;
 Fri, 19 Jul 2024 20:04:17 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=
 message-id:date:mime-version:to:from:subject:content-type
 :content-transfer-encoding; s=pp1; bh=pxePk78BM/TDbeCE6Bk7sTLYRo
 Y79efUljj+Zl9+FSg=; b=dyr/vj5Ch1F2f3nBx1DfX/n4oLfyDUpo4+M/AfJgY/
 3hiYJ5lHlXTdKHEjDAwbxubUor0Eo93GRB/EspJ8vjD7ZyyI+NDeGX2haFp94R3g
 2nrqZYd/O1AK9IlbrevBh9/A4/ULsaUQWWmttziQNY5yp50VqAUHm1TFDDKVSiBL
 H5MtpSZF4ik0LVBAUunYHyuCwPEBlZzbeGMTzNcZcsnmE9fF4FrPFG8Jfaymk/OV
 R7ZzH+F+aQfSnoqoUamNcR58P4DonOCNZFdX2sfqpG3AeKp6fvLNdxdgAx/vpiE2
 k4IAi8DGrU+iW/Ej+12SMhRMVJanncUf5RMrv1dpep3Q==
Received: from ppma11.dal12v.mail.ibm.com
 (db.9e.1632.ip4.static.sl-reverse.com [50.22.158.219])
 by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 40fw0v08mq-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Fri, 19 Jul 2024 20:04:17 +0000 (GMT)
Received: from pps.filterd (ppma11.dal12v.mail.ibm.com [127.0.0.1])
 by ppma11.dal12v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id
 46JHpTZF006337; Fri, 19 Jul 2024 20:04:16 GMT
Received: from smtprelay03.dal12v.mail.ibm.com ([172.16.1.5])
 by ppma11.dal12v.mail.ibm.com (PPS) with ESMTPS id 40dwkmsem6-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Fri, 19 Jul 2024 20:04:16 +0000
Received: from smtpav06.dal12v.mail.ibm.com (smtpav06.dal12v.mail.ibm.com
 [10.241.53.105])
 by smtprelay03.dal12v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id
 46JK4DJS23069242
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Fri, 19 Jul 2024 20:04:15 GMT
Received: from smtpav06.dal12v.mail.ibm.com (unknown [127.0.0.1])
 by IMSVA (Postfix) with ESMTP id B16A258043;
 Fri, 19 Jul 2024 20:04:13 +0000 (GMT)
Received: from smtpav06.dal12v.mail.ibm.com (unknown [127.0.0.1])
 by IMSVA (Postfix) with ESMTP id CE6EE58055;
 Fri, 19 Jul 2024 20:04:12 +0000 (GMT)
Received: from [9.67.179.228] (unknown [9.67.179.228])
 by smtpav06.dal12v.mail.ibm.com (Postfix) with ESMTP;
 Fri, 19 Jul 2024 20:04:12 +0000 (GMT)
Message-ID: <163c9a9d-6755-4fbc-9c5a-82475fd1d4b1@linux.ibm.com>
Date: Fri, 19 Jul 2024 13:04:12 -0700
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Content-Language: en-US
To: gcc-patches@gcc.gnu.org, Kewen <linkw@linux.ibm.com>,
 segher <segher@kernel.crashing.org>,
 Peter Bergner <bergner@linux.ibm.com>, cel <cel@linux.ibm.com>
From: Carl Love <cel@linux.ibm.com>
Subject: [PATCH] rs6000, Add new overloaded vector shift builtin int128,
 varients
X-TM-AS-GCONF: 00
X-Proofpoint-GUID: -Mwjg0QXpKL1am42AniWB8jlU_x-oK5A
X-Proofpoint-ORIG-GUID: -Mwjg0QXpKL1am42AniWB8jlU_x-oK5A
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.293,Aquarius:18.0.1039,Hydra:6.0.680,FMLib:17.12.28.16
 definitions=2024-07-19_06,2024-07-18_01,2024-05-17_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
 phishscore=0 mlxlogscore=999
 clxscore=1015 mlxscore=0 spamscore=0 impostorscore=0 lowpriorityscore=0
 priorityscore=1501 bulkscore=0 adultscore=0 suspectscore=0 malwarescore=0
 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.19.0-2407110000
 definitions=main-2407190148
X-Spam-Status: No, score=-8.5 required=5.0 tests=BAYES_00, BODY_8BITS,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_ASCII_DIVIDERS,
 KAM_SHORT, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

GCC developers:

The following patch adds the int128 varients to the existing overloaded 
built-ins vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo, vec_srdb, 
vec_srl, vec_sro.  These varients were requested by Steve Munroe.

The patch has been tested on a Power 10 system with no regressions.

Please let me know if the patch is acceptable for mainline.

                                    Carl


-------------------------------------------------------------------------------------------------------
  rs6000, Add new overloaded vector shift builtin int128 varients

Add the signed __int128 and unsigned __int128 argument types for the
overloaded built-ins vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo,
vec_srdb, vec_srl, vec_sro.  For each of the new argument types add a
testcase and update the documentation for the built-in.

Add the missing internal names for the float and double types for
overloaded builtin vec_sld for the float and double types.

gcc/ChangeLog:
     * config/rs6000/altivec.md (vs<SLDB_lr>db_<mode>): Change
     define_insn iterator to VEC_IC.
     * config/rs6000/rs6000-builtins.def (__builtin_altivec_vsldoi_v1ti,
     __builtin_vsx_xxsldwi_v1ti, __builtin_altivec_vsldb_v1ti,
     __builtin_altivec_vsrdb_v1ti): New builtin definitions.
     * config/rs6000/rs6000-overload.def (vec_sld, vec_sldb, vec_sldw,
     vec_sll, vec_slo, vec_srdb, vec_srl, vec_sro): New overloaded
     definitions.
     (vec_sld): Add missing internal names.
     * doc/extend.texi (vec_sld, vec_sldb, vec_sldw,    vec_sll, vec_slo,
     vec_srdb, vec_srl, vec_sro): Add documentation for new overloaded
     built-ins.

gcc/testsuite/ChangeLog:
     * gcc.target/powerpc/vec-shift-double-runnable-int128.c: New test
     file.
---
  gcc/config/rs6000/altivec.md                  |   6 +-
  gcc/config/rs6000/rs6000-builtins.def         |  12 +
  gcc/config/rs6000/rs6000-overload.def         |  44 ++-
  gcc/doc/extend.texi                           |  42 +++
  .../vec-shift-double-runnable-int128.c        | 349 ++++++++++++++++++
  5 files changed, 448 insertions(+), 5 deletions(-)
  create mode 100644 
gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c

+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {\mvsrdbi\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvsldbi\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvsl\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvsr\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvslo\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mvsro\M} 4 } } */

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 5af9bf920a2..2a18ee44526 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -878,9 +878,9 @@ (define_int_attr SLDB_lr [(UNSPEC_SLDB "l")
  (define_int_iterator VSHIFT_DBL_LR [UNSPEC_SLDB UNSPEC_SRDB])

  (define_insn "vs<SLDB_lr>db_<mode>"
- [(set (match_operand:VI2 0 "register_operand" "=v")
-  (unspec:VI2 [(match_operand:VI2 1 "register_operand" "v")
-           (match_operand:VI2 2 "register_operand" "v")
+ [(set (match_operand:VEC_IC 0 "register_operand" "=v")
+  (unspec:VEC_IC [(match_operand:VEC_IC 1 "register_operand" "v")
+           (match_operand:VEC_IC 2 "register_operand" "v")
             (match_operand:QI 3 "const_0_to_12_operand" "n")]
            VSHIFT_DBL_LR))]
    "TARGET_POWER10"
diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 77eb0f7e406..fbb6e1ddf85 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -964,6 +964,9 @@
    const vss __builtin_altivec_vsldoi_8hi (vss, vss, const int<4>);
      VSLDOI_8HI altivec_vsldoi_v8hi {}

+  const vsq __builtin_altivec_vsldoi_v1ti (vsq, vsq, const int<4>);
+    VSLDOI_V1TI altivec_vsldoi_v1ti {}
+
    const vss __builtin_altivec_vslh (vss, vus);
      VSLH vashlv8hi3 {}

@@ -1831,6 +1834,9 @@
    const vsll __builtin_vsx_xxsldwi_2di (vsll, vsll, const int<2>);
      XXSLDWI_2DI vsx_xxsldwi_v2di {}

+  const vsq __builtin_vsx_xxsldwi_v1ti (vsq, vsq, const int<2>);
+    XXSLDWI_Q vsx_xxsldwi_v1ti {}
+
    const vf __builtin_vsx_xxsldwi_4sf (vf, vf, const int<2>);
      XXSLDWI_4SF vsx_xxsldwi_v4sf {}

@@ -3299,6 +3305,9 @@
    const vss __builtin_altivec_vsldb_v8hi (vss, vss, const int<3>);
      VSLDB_V8HI vsldb_v8hi {}

+  const vsq __builtin_altivec_vsldb_v1ti (vsq, vsq, const int<3>);
+    VSLDB_V1TI vsldb_v1ti {}
+
    const vsq __builtin_altivec_vslq (vsq, vuq);
      VSLQ vashlv1ti3 {}

@@ -3317,6 +3326,9 @@
    const vss __builtin_altivec_vsrdb_v8hi (vss, vss, const int<3>);
      VSRDB_V8HI vsrdb_v8hi {}

+  const vsq __builtin_altivec_vsrdb_v1ti (vsq, vsq, const int<3>);
+    VSRDB_V1TI vsrdb_v1ti {}
+
    const vsq __builtin_altivec_vsrq (vsq, vuq);
      VSRQ vlshrv1ti3 {}

diff --git a/gcc/config/rs6000/rs6000-overload.def 
b/gcc/config/rs6000/rs6000-overload.def
index c4ecafc6f7e..302e0232533 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -3396,9 +3396,13 @@
    vull __builtin_vec_sld (vull, vull, const int);
      VSLDOI_2DI  VSLDOI_VULL
    vf __builtin_vec_sld (vf, vf, const int);
-    VSLDOI_4SF
+    VSLDOI_4SF VSLDOI_VF
    vd __builtin_vec_sld (vd, vd, const int);
-    VSLDOI_2DF
+    VSLDOI_2DF VSLDOI_VD
+  vsq __builtin_vec_sld (vsq, vsq, const int);
+    VSLDOI_V1TI  VSLDOI_VSQ
+  vuq __builtin_vec_sld (vuq, vuq, const int);
+    VSLDOI_V1TI  VSLDOI_VUQ

  [VEC_SLDB, vec_sldb, __builtin_vec_sldb]
    vsc __builtin_vec_sldb (vsc, vsc, const int);
@@ -3417,6 +3421,10 @@
      VSLDB_V2DI  VSLDB_VSLL
    vull __builtin_vec_sldb (vull, vull, const int);
      VSLDB_V2DI  VSLDB_VULL
+  vsq __builtin_vec_sldb (vsq, vsq, const int);
+    VSLDB_V1TI  VSLDB_VSQ
+  vuq __builtin_vec_sldb (vuq, vuq, const int);
+    VSLDB_V1TI  VSLDB_VUQ

  [VEC_SLDW, vec_sldw, __builtin_vec_sldw]
    vsc __builtin_vec_sldw (vsc, vsc, const int);
@@ -3439,6 +3447,10 @@
      XXSLDWI_4SF  XXSLDWI_VF
    vd __builtin_vec_sldw (vd, vd, const int);
      XXSLDWI_2DF  XXSLDWI_VD
+  vsq __builtin_vec_sldw (vsq, vsq, const int);
+    XXSLDWI_Q  XXSLDWI_VSQ
+  vuq __builtin_vec_sldw (vuq, vuq, const int);
+    XXSLDWI_Q  XXSLDWI_VUQ

  [VEC_SLL, vec_sll, __builtin_vec_sll]
    vsc __builtin_vec_sll (vsc, vuc);
@@ -3459,6 +3471,10 @@
      VSL  VSL_VSLL
    vull __builtin_vec_sll (vull, vuc);
      VSL  VSL_VULL
+  vsq __builtin_vec_sll (vsq, vuc);
+    VSL  VSL_VSQ
+  vuq __builtin_vec_sll (vuq, vuc);
+    VSL  VSL_VUQ
  ; The following variants are deprecated.
    vsc __builtin_vec_sll (vsc, vus);
      VSL  VSL_VSC_VUS
@@ -3554,6 +3570,14 @@
      VSLO  VSLO_VFS
    vf __builtin_vec_slo (vf, vuc);
      VSLO  VSLO_VFU
+  vsq __builtin_vec_slo (vsq, vsc);
+    VSLO  VSLDO_VSQS
+  vsq __builtin_vec_slo (vsq, vuc);
+    VSLO  VSLDO_VSQU
+  vuq __builtin_vec_slo (vuq, vsc);
+    VSLO  VSLDO_VUQS
+  vuq __builtin_vec_slo (vuq, vuc);
+    VSLO  VSLDO_VUQU

  [VEC_SLV, vec_slv, __builtin_vec_vslv]
    vuc __builtin_vec_vslv (vuc, vuc);
@@ -3699,6 +3723,10 @@
      VSRDB_V2DI  VSRDB_VSLL
    vull __builtin_vec_srdb (vull, vull, const int);
      VSRDB_V2DI  VSRDB_VULL
+  vsq __builtin_vec_srdb (vsq, vsq, const int);
+    VSRDB_V1TI  VSRDB_VSQ
+  vuq __builtin_vec_srdb (vuq, vuq, const int);
+    VSRDB_V1TI  VSRDB_VUQ

  [VEC_SRL, vec_srl, __builtin_vec_srl]
    vsc __builtin_vec_srl (vsc, vuc);
@@ -3719,6 +3747,10 @@
      VSR  VSR_VSLL
    vull __builtin_vec_srl (vull, vuc);
      VSR  VSR_VULL
+  vsq __builtin_vec_srl (vsq, vuc);
+    VSR  VSR_VSQ
+  vuq __builtin_vec_srl (vuq, vuc);
+    VSR  VSR_VUQ
  ; The following variants are deprecated.
    vsc __builtin_vec_srl (vsc, vus);
      VSR  VSR_VSC_VUS
@@ -3808,6 +3840,14 @@
      VSRO  VSRO_VFS
    vf __builtin_vec_sro (vf, vuc);
      VSRO  VSRO_VFU
+  vsq __builtin_vec_sro (vsq, vsc);
+    VSRO  VSRDO_VSQS
+  vsq __builtin_vec_sro (vsq, vuc);
+    VSRO  VSRDO_VSQU
+  vuq __builtin_vec_sro (vuq, vsc);
+    VSRO  VSRDO_VUQS
+  vuq __builtin_vec_sro (vuq, vuc);
+    VSRO  VSRDO_VUQU

  [VEC_SRV, vec_srv, __builtin_vec_vsrv]
    vuc __builtin_vec_vsrv (vuc, vuc);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 0b572afca72..5125a6d9def 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -23504,6 +23504,10 @@ const unsigned int);
  vector signed long long, const unsigned int);
  @exdent vector unsigned long long vec_sldb (vector unsigned long long,
  vector unsigned long long, const unsigned int);
+@exdent vector signed __int128 vec_sldb (vector signed __int128,
+vector signed __int128, const unsigned int);
+@exdent vector unsigned __int128 vec_sldb (vector unsigned __int128,
+vector unsigned __int128, const unsigned int);
  @end smallexample

  Shift the combined input vectors left by the amount specified by the 
low-order
@@ -23531,6 +23535,10 @@ const unsigned int);
  vector signed long long, const unsigned int);
  @exdent vector unsigned long long vec_srdb (vector unsigned long long,
  vector unsigned long long, const unsigned int);
+@exdent vector signed __int128 vec_srdb (vector signed __int128,
+vector signed __int128, const unsigned int);
+@exdent vector unsigned __int128 vec_srdb (vector unsigned __int128,
+vector unsigned __int128, const unsigned int);
  @end smallexample

  Shift the combined input vectors right by the amount specified by the 
low-order
@@ -24025,6 +24033,40 @@ int vec_any_le (vector signed __int128, vector 
signed __int128);
  int vec_any_le (vector unsigned __int128, vector unsigned __int128);
  @end smallexample

+@smallexample
+@exdent vector signed __int128 vec_sld (vector signed __int128,
+vector signed __int128, const unsigned int);
+@exdent vector unsigned __int128 vec_sld (vector unsigned __int128,
+vector unsigned __int128, const unsigned int);
+@exdent vector signed __int128 vec_sldw (vector signed __int128,
+vector signed __int128, const unsigned int);
+@exdent vector unsigned __int128 vec_sldw (vector unsigned __int,
+vector unsigned __int128, const unsigned int);
+@exdent vector signed __int128 vec_slo (vector signed __int128,
+vector signed char);
+@exdent vector signed __int128 vec_slo (vector signed __int128,
+vector unsigned char);
+@exdent vector unsigned __int128 vec_slo (vector unsigned __int128,
+vector signed char);
+@exdent vector unsigned __int128 vec_slo (vector unsigned __int128,
+vector unsigned char);
+@exdent vector signed __int128 vec_sro (vector signed __int128,
+vector signed char);
+@exdent vector signed __int128 vec_sro (vector signed __int128,
+vector unsigned char);
+@exdent vector unsigned __int128 vec_sro (vector unsigned __int128,
+vector signed char);
+@exdent vector unsigned __int128 vec_sro (vector unsigned __int128,
+vector unsigned char);
+@exdent vector signed __int128 vec_srl (vector signed __int128,
+vector unsigned char);
+@exdent vector unsigned __int128 vec_srl (vector unsigned __int128,
+vector unsigned char);
+@end smallexample
+
+The above instances are extension of the exiting overloaded built-ins
+@code{vec_sld}, @code{vec_sldw}, @code{vec_slo}, @code{vec_sro}, 
@code{vec_srl}
+that are documented in the PVIPR.

  @node PowerPC Hardware Transactional Memory Built-in Functions
  @subsection PowerPC Hardware Transactional Memory Built-in Functions
diff --git 
a/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c 
b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c
new file mode 100644
index 00000000000..bb90f489149
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c
@@ -0,0 +1,349 @@
+/* { dg-do run  { target { int128 } && { power10_hw } } } */
+/* { dg-do link { target { ! power10_hw } } } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
+
+#include <altivec.h>
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+
+void print_i128 (unsigned __int128 val)
+{
+  printf(" 0x%016llx%016llx",
+     (unsigned long long)(val >> 64),
+     (unsigned long long)(val & 0xFFFFFFFFFFFFFFFF));
+}
+#endif
+
+extern void abort (void);
+
+#if DEBUG
+#define ACTION_2ARG_UNSIGNED(NAME, TYPE_NAME)                \
+  printf ("vec_%s (vector TYPE __int128, vector TYPE __int128) \n", 
#NAME); \
+  printf(" src_va_s128[0] =      ");                    \
+  print_i128 ((unsigned __int128) src_va_##TYPE_NAME[0]);        \
+  printf("\n");                            \
+  printf(" src_vb_uc =            0x");                \
+  for (i = 0; i < 16; i++)                         \
+    printf("%02x",  src_vb_uc[i]);                    \
+  printf("\n");                            \
+  printf(" vresult[0] =          ");                    \
+  print_i128 ((unsigned __int128) vresult[0]);                \
+  printf("\n");                            \
+  printf(" expected_vresult[0] = ");                    \
+  print_i128 ((unsigned __int128) expected_vresult[0]);        \
+  printf("\n");
+
+#define ACTION_2ARG_SIGNED(NAME, TYPE_NAME)                \
+  printf ("vec_%s (vector TYPE __int128, vector TYPE __int128) \n", 
#NAME); \
+  printf(" src_va_s128[0] =      ");                    \
+  print_i128 ((unsigned __int128) src_va_##TYPE_NAME[0]);        \
+  printf("\n");                            \
+  printf(" src_vb_sc =            0x");                \
+  for (i = 0; i < 16; i++)                         \
+    printf("%02x",  src_vb_sc[i]);                    \
+  printf("\n");                            \
+  printf(" vresult[0] =          ");                    \
+  print_i128 ((unsigned __int128) vresult[0]);                \
+  printf("\n");                            \
+  printf(" expected_vresult[0] = ");                    \
+  print_i128 ((unsigned __int128) expected_vresult[0]);        \
+  printf("\n");
+
+#define ACTION_3ARG(NAME, TYPE_NAME, CONST)                \
+  printf ("vec_%s (vector TYPE __int128, vector TYPE __int128, %s) 
\n",    \
+    #NAME, #CONST);                            \
+  printf(" src_va_s128[0] =      ");                    \
+  print_i128 ((unsigned __int128) src_va_##TYPE_NAME[0]);        \
+  printf("\n");                            \
+  printf(" src_vb_s128[0] =      ");                    \
+  print_i128 ((unsigned __int128) src_vb_##TYPE_NAME[0]);        \
+  printf("\n");                            \
+  printf(" vresult[0] =          ");                    \
+  print_i128 ((unsigned __int128) vresult[0]);                \
+  printf("\n");                            \
+  printf(" expected_vresult[0] = ");                    \
+  print_i128 ((unsigned __int128) expected_vresult[0]);        \
+  printf("\n");
+#else
+#define ACTION_2ARG(NAME, TYPE_NAME)                        \
+  abort();
+
+#define ACTION_3ARG(NAME, TYPE_NAME, CONST)                \
+  abort();
+#endif
+
+/* Second argument of the builtin is vector unsigned char.  */
+#define TEST_2ARG_UNSIGNED(NAME, TYPE, TYPE_NAME, EXP_RESULT_HI, 
EXP_RESULT_LO) \
+  {                                    \
+    vector TYPE __int128 vresult;                    \
+    vector TYPE __int128 expected_vresult;                \
+    int i;                                \
+                                        \
+    expected_vresult = (vector TYPE __int128) { EXP_RESULT_HI }; \
+    expected_vresult = (expected_vresult << 64) | \
+      (vector TYPE __int128) { EXP_RESULT_LO };            \
+    vresult = vec_##NAME (src_va_##TYPE_NAME, src_vb_uc);        \
+                                    \
+    if (!vec_all_eq (vresult,  expected_vresult)) {            \
+      ACTION_2ARG_UNSIGNED(NAME, TYPE_NAME)                \
+    }                                    \
+  }
+
+/* Second argument of the builtin is vector signed char.  */
+#define TEST_2ARG_SIGNED(NAME, TYPE, TYPE_NAME, EXP_RESULT_HI, 
EXP_RESULT_LO) \
+  {                                    \
+    vector TYPE __int128 vresult;                    \
+    vector TYPE __int128 expected_vresult;                \
+    int i;                                \
+                                        \
+    expected_vresult = (vector TYPE __int128) { EXP_RESULT_HI }; \
+    expected_vresult = (expected_vresult << 64) | \
+      (vector TYPE __int128) { EXP_RESULT_LO };            \
+    vresult = vec_##NAME (src_va_##TYPE_NAME, src_vb_sc);        \
+                                    \
+    if (!vec_all_eq (vresult,  expected_vresult)) {            \
+      ACTION_2ARG_SIGNED(NAME, TYPE_NAME)                \
+    }                                    \
+  }
+
+#define TEST_3ARG(NAME, TYPE, TYPE_NAME, CONST, EXP_RESULT_HI, 
EXP_RESULT_LO) \
+  {                                    \
+    vector TYPE __int128 vresult;                    \
+    vector TYPE __int128 expected_vresult;                \
+                                        \
+    expected_vresult = (vector TYPE __int128) { EXP_RESULT_HI }; \
+    expected_vresult = (expected_vresult << 64) | \
+      (vector TYPE __int128) { EXP_RESULT_LO };            \
+    vresult = vec_##NAME (src_va_##TYPE_NAME, src_vb_##TYPE_NAME, 
CONST);    \
+                                    \
+    if (!vec_all_eq (vresult,  expected_vresult)) {            \
+      ACTION_3ARG(NAME, TYPE_NAME, CONST)                \
+    }                                    \
+  }
+
+int
+main (int argc, char *argv [])
+{
+  vector signed __int128 vresult_s128;
+  vector signed __int128 expected_vresult_s128;
+  vector signed __int128 src_va_s128;
+  vector signed __int128 src_vb_s128;
+  vector unsigned __int128 vresult_u128;
+  vector unsigned __int128 expected_vresult_u128;
+  vector unsigned __int128 src_va_u128;
+  vector unsigned __int128 src_vb_u128;
+  vector signed char src_vb_sc;
+  vector unsigned char src_vb_uc;
+
+  /* 128-bit vector shift right tests, vec_srdb. */
+  src_va_s128 = (vector signed __int128) {0x12345678};
+  src_vb_s128 = (vector signed __int128) {0xFEDCBA90};
+  TEST_3ARG(srdb, signed, s128, 4, 0x8000000000000000, 0xFEDCBA9)
+
+  src_va_u128 = (vector unsigned __int128) { 0xFEDCBA98 };
+  src_vb_u128 = (vector unsigned __int128) { 0x76543210};
+  TEST_3ARG(srdb, unsigned, u128, 4, 0x8000000000000000, 0x07654321)
+
+  /* 128-bit vector shift left tests, vec_sldb. */
+  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  src_va_s128 = (src_va_s128 << 64)
+    | (vector signed __int128) {0x123456789ABCDEF0};
+  src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
+  src_vb_s128 = (src_vb_s128 << 64)
+    | (vector signed __int128) {0xFEDCBA9876543210};
+  TEST_3ARG(sldb, signed, s128, 4, 0x23456789ABCDEF01, 0x23456789ABCDEF0F)
+
+  src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
+  src_va_u128 = src_va_u128 << 64
+    | (vector unsigned __int128) {0xFEDCBA9876543210};
+  src_vb_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
+  src_vb_u128 = src_vb_u128 << 64
+    | (vector unsigned __int128) {0x123456789ABCDEF0};
+  TEST_3ARG(sldb, unsigned, u128, 4, 0xEDCBA9876543210F, 
0xEDCBA98765432101)
+
+  /* Shift left by octect tests, vec_sld.  Shift is by immediate value
+     times 8. */
+  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  src_va_s128 = (src_va_s128 << 64)
+    | (vector signed __int128) {0x123456789ABCDEF0};
+  src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
+  src_vb_s128 = (src_vb_s128 << 64)
+    | (vector signed __int128) {0xFEDCBA9876543210};
+  TEST_3ARG(sld, signed, s128, 4, 0x9abcdef012345678, 0x9abcdef0fedcba98)
+
+  src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
+  src_va_u128 = src_va_u128 << 64
+    | (vector unsigned __int128) {0xFEDCBA9876543210};
+  src_vb_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
+  src_vb_u128 = src_vb_u128 << 64
+    | (vector unsigned __int128) {0x123456789ABCDEF0};
+  TEST_3ARG(sld, unsigned, u128, 4, 0x76543210fedcba98, 0x7654321012345678)
+
+  /* Vector left shift bytes within the vector, vec_sll. */
+  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  src_va_s128 = (src_va_s128 << 64)
+    | (vector signed __int128) {0x123456789ABCDEF0};
+  src_vb_uc = (vector unsigned char) {0x01, 0x01, 0x01, 0x01,
+                      0x01, 0x01, 0x01, 0x01,
+                      0x01, 0x01, 0x01, 0x01,
+                      0x01, 0x01, 0x01, 0x01};
+  TEST_2ARG_UNSIGNED(sll, signed, s128, 0x2468acf13579bde0,
+             0x2468acf13579bde0)
+
+  src_va_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
+  src_va_u128 = src_va_u128 << 64
+    | (vector unsigned __int128) {0x123456789ABCDEF0};
+  src_vb_uc = (vector unsigned char) {0x02, 0x02, 0x02, 0x02,
+                      0x02, 0x02, 0x02, 0x02,
+                      0x02, 0x02, 0x02, 0x02,
+                      0x02, 0x02, 0x02, 0x02};
+  TEST_2ARG_UNSIGNED(sll, unsigned, u128, 0x48d159e26af37bc0,
+             0x48d159e26af37bc0)
+
+  /* Vector right shift bytes within the vector, vec_srl. */
+  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  src_va_s128 = (src_va_s128 << 64)
+    | (vector signed __int128) {0x123456789ABCDEF0};
+  src_vb_uc = (vector unsigned char) {0x01, 0x01, 0x01, 0x01,
+                      0x01, 0x01, 0x01, 0x01,
+                      0x01, 0x01, 0x01, 0x01,
+                      0x01, 0x01, 0x01, 0x01};
+  TEST_2ARG_UNSIGNED(srl, signed, s128, 0x091a2b3c4d5e6f78,
+             0x091a2b3c4d5e6f78)
+
+  src_va_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
+  src_va_u128 = src_va_u128 << 64
+    | (vector unsigned __int128) {0x123456789ABCDEF0};
+  src_vb_uc = (vector unsigned char) {0x02, 0x02, 0x02, 0x02,
+                      0x02, 0x02, 0x02, 0x02,
+                      0x02, 0x02, 0x02, 0x02,
+                      0x02, 0x02, 0x02, 0x02};
+  TEST_2ARG_UNSIGNED(srl, unsigned, u128, 0x48d159e26af37bc,
+             0x48d159e26af37bc)
+
+  /* Shift left by octect tests, vec_slo.  Shift is by immediate value
+     bytes.  Shift amount in bits 121:124.  */
+  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  src_va_s128 = (src_va_s128 << 64)
+    | (vector signed __int128) {0x123456789ABCDEF0};
+  /* Note vb_sc is Endian specific, this is just LE.  */
+  /* The left shift amount is 1 byte, i.e. 1 * 8 bits.  */
+  src_vb_sc = (vector signed char) {0x1 << 3, 0x0, 0x0, 0x0,
+                    0x0, 0x0, 0x0, 0x0,
+                    0x0, 0x0, 0x0, 0x0,
+                    0x0, 0x0, 0x0, 0x0};
+
+  TEST_2ARG_SIGNED(slo, signed, s128, 0x3456789ABCDEF012,
+           0x3456789ABCDEF000)
+  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  src_va_s128 = (src_va_s128 << 64)
+    | (vector signed __int128) {0x123456789ABCDEF0};
+  /* Note vb_sc is Endian specific, this is just LE.  */
+  /* The left shift amount is 2 bytes, i.e. 2 * 8 bits.  */
+  src_vb_uc = (vector unsigned char) {0x2 << 3, 0x0, 0x0, 0x0,
+                      0x0, 0x0, 0x0, 0x0,
+                      0x0, 0x0, 0x0, 0x0,
+                      0x0, 0x0, 0x0, 0x0};
+  TEST_2ARG_UNSIGNED(slo, signed, s128, 0x56789ABCDEF01234,
+             0x56789ABCDEF00000)
+
+  src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
+  src_va_u128 = src_va_u128 << 64
+    | (vector unsigned __int128) {0xFEDCBA9876543210};
+  /* The left shift amount is 3 bytes, i.e. 3 * 8 bits.  */
+  src_vb_sc = (vector signed char) {0x03<<3, 0x0, 0x0, 0x0,
+                    0x0, 0x0, 0x0, 0x0,
+                    0x0, 0x0, 0x0, 0x0,
+                    0x00, 0x00, 0x00, 0x0};
+  TEST_2ARG_SIGNED(slo, unsigned, u128, 0x9876543210FEDCBA,
+               0x9876543210000000)
+
+  src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
+  src_va_u128 = src_va_u128 << 64
+    | (vector unsigned __int128) {0xFEDCBA9876543210};
+  /* The left shift amount is 4 bytes, i.e. 4 * 8 bits.  */
+  src_vb_uc = (vector unsigned char) {0x04<<3, 0x0, 0x0, 0x0,
+                      0x0, 0x0, 0x0, 0x0,
+                      0x0, 0x0, 0x0, 0x0,
+                      0x00, 0x00, 0x00, 0x0};
+  TEST_2ARG_UNSIGNED(slo, unsigned, u128, 0x76543210FEDCBA98,
+               0x7654321000000000)
+
+  /* Shift right by octect tests, vec_sro.  Shift is by immediate value
+     times 8.  Shift amount in bits 121:124.  */
+  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  src_va_s128 = (src_va_s128 << 64)
+    | (vector signed __int128) {0x123456789ABCDEF0};
+  /* Note vb_sc is Endian specific, this is just LE.  */
+  /* The left shift amount is 1 byte, i.e. 1 * 8 bits.  */
+  src_vb_sc = (vector signed char) {0x1 << 3, 0x0, 0x0, 0x0,
+                    0x0, 0x0, 0x0, 0x0,
+                    0x0, 0x0, 0x0, 0x0,
+                    0x0, 0x0, 0x0, 0x0};
+  TEST_2ARG_SIGNED(sro, signed, s128, 0x00123456789ABCDE, 
0xF0123456789ABCDE)
+
+  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  src_va_s128 = (src_va_s128 << 64)
+    | (vector signed __int128) {0x123456789ABCDEF0};
+  /* Note vb_sc is Endian specific, this is just LE.  */
+  /* The left shift amount is 1 byte, i.e. 1 * 8 bits.  */
+  src_vb_uc = (vector unsigned char) {0x2 << 3, 0x0, 0x0, 0x0,
+                      0x0, 0x0, 0x0, 0x0,
+                      0x0, 0x0, 0x0, 0x0,
+                      0x0, 0x0, 0x0, 0x0};
+  TEST_2ARG_UNSIGNED(sro, signed, s128, 0x0000123456789ABC,
+             0xDEF0123456789ABC)
+
+  src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
+  src_va_u128 = src_va_u128 << 64
+    | (vector unsigned __int128) {0xFEDCBA9876543210};
+  /* The left shift amount is 4 bytes, i.e. 4 * 8 bits.  */
+  src_vb_sc = (vector signed char) {0x03<<3, 0x0, 0x0, 0x0,
+                    0x0, 0x0, 0x0, 0x0,
+                    0x0, 0x0, 0x0, 0x0,
+                    0x00, 0x00, 0x00, 0x0};
+  TEST_2ARG_SIGNED(sro, unsigned, u128, 0x000000FEDCBA9876,
+           0x543210FEDCBA9876)
+
+  src_va_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
+  src_va_u128 = src_va_u128 << 64
+    | (vector unsigned __int128) {0xFEDCBA9876543210};
+  /* The left shift amount is 4 bytes, i.e. 4 * 8 bits.  */
+  src_vb_uc = (vector unsigned char) {0x04<<3, 0x0, 0x0, 0x0,
+                      0x0, 0x0, 0x0, 0x0,
+                      0x0, 0x0, 0x0, 0x0,
+                      0x00, 0x00, 0x00, 0x0};
+  TEST_2ARG_UNSIGNED(sro, unsigned, u128, 0x00000000FEDCBA98,
+               0x76543210FEDCBA98)
+
+  /* 128-bit vector shift left tests, vec_sldw. */
+  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  src_va_s128 = (src_va_s128 << 64)
+    | (vector signed __int128) {0x123456789ABCDEF0};
+  src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
+  src_vb_s128 = (src_vb_s128 << 64)
+    | (vector signed __int128) {0xFEDCBA9876543210};
+  TEST_3ARG(sldw, signed, s128, 1, 0x9ABCDEF012345678, 0x9ABCDEF0FEDCBA98)
+
+  src_va_u128 = (vector unsigned __int128) {0x123456789ABCDEF0};
+  src_va_u128 = (src_va_u128 << 64)
+    | (vector unsigned __int128) {0x123456789ABCDEF0};
+  src_vb_u128 = (vector unsigned __int128) {0xFEDCBA9876543210};
+  src_vb_u128 = (src_vb_u128 << 64)
+    | (vector unsigned __int128) {0xFEDCBA9876543210};
+  TEST_3ARG(sldw, unsigned, u128, 2, 0x123456789ABCDEF0, 
0xFEDCBA9876543210)
+