From patchwork Tue Aug 14 23:18:28 2018
X-Patchwork-Submitter: will schmidt
X-Patchwork-Id: 957739
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Subject: [PATCH, rs6000] Fix PR86731 vec_sl()
From: Will Schmidt
Reply-To: will_schmidt@vnet.ibm.com
To: Segher Boessenkool, Bill Schmidt, David Edelsohn, Richard Biener
Cc: GCC Patches
Date: Tue, 14 Aug 2018 18:18:28 -0500
Message-Id: <1534288708.18801.5.camel@brimstone.rchland.ibm.com>

Hi,

Here is a first pass at fixing PR86731, an issue introduced when
gimple folding the vec_sl() intrinsic.  This has been sniff-tested
(successfully) on a power7; full regtests for Linux/PowerPC systems
are pending.  I expect I'll need to tweak some of the testcase
scan-assembler stanzas after reviewing those results, but I wanted to
get this out for review sooner versus later.
:-)

Assuming good results, is this OK for trunk and backport to 8?

Thanks,
-Will

[gcc]

2018-08-14  Will Schmidt  <will_schmidt@vnet.ibm.com>

	PR target/86731
	* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Update
	logic around folding of vec_sl() to handle out of range shift
	values.

[testsuite]

2018-08-14  Will Schmidt  <will_schmidt@vnet.ibm.com>

	PR target/86731
	* gcc.target/powerpc/fold-vec-shift-altivectest-1.c: New test.
	* gcc.target/powerpc/fold-vec-shift-altivectest-2.c: New test.
	* gcc.target/powerpc/fold-vec-shift-altivectest-3.c: New test.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ec92e6a..0a84290 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15481,20 +15481,48 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
        builtin_altivec_vsl{b,h,w} -> vsl{b,h,w}.  */
     case ALTIVEC_BUILTIN_VSLB:
     case ALTIVEC_BUILTIN_VSLH:
     case ALTIVEC_BUILTIN_VSLW:
     case P8V_BUILTIN_VSLD:
-      arg0 = gimple_call_arg (stmt, 0);
-      if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (arg0)))
-	  && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (TREE_TYPE (arg0))))
-	return false;
-      arg1 = gimple_call_arg (stmt, 1);
-      lhs = gimple_call_lhs (stmt);
-      g = gimple_build_assign (lhs, LSHIFT_EXPR, arg0, arg1);
-      gimple_set_location (g, gimple_location (stmt));
-      gsi_replace (gsi, g, true);
-      return true;
+      {
+	location_t loc;
+	gimple_seq stmts = NULL;
+	arg0 = gimple_call_arg (stmt, 0);
+	tree arg0_type = TREE_TYPE (arg0);
+	if (INTEGRAL_TYPE_P (TREE_TYPE (arg0_type))
+	    && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg0_type)))
+	  return false;
+	arg1 = gimple_call_arg (stmt, 1);
+	tree arg1_type = TREE_TYPE (arg1);
+	tree unsigned_arg1_type = unsigned_type_for (TREE_TYPE (arg1));
+	tree unsigned_element_type = unsigned_type_for (TREE_TYPE (arg1_type));
+	loc = gimple_location (stmt);
+	lhs = gimple_call_lhs (stmt);
+	/* Force arg1 into the range valid matching the arg0 type.  */
+	/* Build a vector consisting of the max valid bit-size values.  */
+	int n_elts = VECTOR_CST_NELTS (arg1);
+	int tree_size_in_bits = TREE_INT_CST_LOW (size_in_bytes (arg1_type))
+				* BITS_PER_UNIT;
+	tree element_size = build_int_cst (unsigned_element_type,
+					   tree_size_in_bits / n_elts);
+	tree_vector_builder elts (unsigned_type_for (arg1_type), n_elts, 1);
+	for (int i = 0; i < n_elts; i++)
+	  elts.safe_push (element_size);
+	tree modulo_tree = elts.build ();
+	/* Modulo the provided shift value against that vector.  */
+	tree unsigned_arg1 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+					   unsigned_arg1_type, arg1);
+	tree new_arg1 = gimple_build (&stmts, loc, TRUNC_MOD_EXPR,
+				      unsigned_arg1_type, unsigned_arg1,
+				      modulo_tree);
+	gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+	/* And finally, do the shift.  */
+	g = gimple_build_assign (lhs, LSHIFT_EXPR, arg0, new_arg1);
+	gimple_set_location (g, gimple_location (stmt));
+	gsi_replace (gsi, g, true);
+	return true;
+      }
     /* Flavors of vector shift right.  */
     case ALTIVEC_BUILTIN_VSRB:
     case ALTIVEC_BUILTIN_VSRH:
     case ALTIVEC_BUILTIN_VSRW:
     case P8V_BUILTIN_VSRD:
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-altivectest-1.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-altivectest-1.c
new file mode 100644
index 0000000..e0546bf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-altivectest-1.c
@@ -0,0 +1,73 @@
+/* PR86731. Verify that the rs6000 gimple-folding code handles the
+   left shift operation properly.  This is a testcase variation that
+   explicitly specifies -fwrapv, which is a condition for the
+   gimple folding of the vec_sl() intrinsic.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O3 -fwrapv " } */
+
+#include <altivec.h>
+/* original test as reported.  */
+vector unsigned int splat(void)
+{
+  vector unsigned int mzero = vec_splat_u32(-1);
+  return (vector unsigned int) vec_sl(mzero, mzero);
+}
+
+/* more testcase variations.  */
+vector unsigned char splatu1(void)
+{
+  vector unsigned char mzero = vec_splat_u8(-1);
+  return (vector unsigned char) vec_sl(mzero, mzero);
+}
+
+vector unsigned short splatu2(void)
+{
+  vector unsigned short mzero = vec_splat_u16(-1);
+  return (vector unsigned short) vec_sl(mzero, mzero);
+}
+
+vector unsigned int splatu3(void)
+{
+  vector unsigned int mzero = vec_splat_u32(-1);
+  return (vector unsigned int) vec_sl(mzero, mzero);
+}
+
+vector unsigned long long splatu4(void)
+{
+  vector unsigned long long mzero = {-1,-1};
+  return (vector unsigned long long) vec_sl(mzero, mzero);
+}
+vector signed char splats1(void)
+{
+  vector unsigned char mzero = vec_splat_u8(-1);
+  return (vector signed char) vec_sl(mzero, mzero);
+}
+
+vector signed short splats2(void)
+{
+  vector unsigned short mzero = vec_splat_u16(-1);
+  return (vector signed short) vec_sl(mzero, mzero);
+}
+
+vector signed int splats3(void)
+{
+  vector unsigned int mzero = vec_splat_u32(-1);
+  return (vector signed int) vec_sl(mzero, mzero);
+}
+
+vector signed long long splats4(void)
+{
+  vector unsigned long long mzero = {-1,-1};
+  return (vector signed long long) vec_sl(mzero, mzero);
+}
+
+/* Codegen will consist of splat and shift instructions for most types.
+   If folding is enabled, the vec_sl tests using vector long long type will
+   generate a lvx instead of a vspltisw+vsld pair.  */
+
+/* { dg-final { scan-assembler-times {\mvspltisb\M|\mvspltish\M|\mvspltisw\M} 7 } } */
+/* { dg-final { scan-assembler-times {\mvslb\M|\mvslh\M|\mvslw\M|\mvsld\M} 7 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M} 2 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-altivectest-2.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-altivectest-2.c
new file mode 100644
index 0000000..20442ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-altivectest-2.c
@@ -0,0 +1,73 @@
+/* PR86731. Verify that the rs6000 gimple-folding code handles the
+   left shift operation properly.  This is a testcase variation that
+   explicitly disables gimple folding.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O3 -fwrapv -mno-fold-gimple" } */
+/* { dg-prune-output "gimple folding of rs6000 builtins has been disabled." } */
+
+
+#include <altivec.h>
+/* original test as reported.  */
+vector unsigned int splat(void)
+{
+  vector unsigned int mzero = vec_splat_u32(-1);
+  return (vector unsigned int) vec_sl(mzero, mzero);
+}
+
+/* more testcase variations.  */
+vector unsigned char splatu1(void)
+{
+  vector unsigned char mzero = vec_splat_u8(-1);
+  return (vector unsigned char) vec_sl(mzero, mzero);
+}
+
+vector unsigned short splatu2(void)
+{
+  vector unsigned short mzero = vec_splat_u16(-1);
+  return (vector unsigned short) vec_sl(mzero, mzero);
+}
+
+vector unsigned int splatu3(void)
+{
+  vector unsigned int mzero = vec_splat_u32(-1);
+  return (vector unsigned int) vec_sl(mzero, mzero);
+}
+
+vector unsigned long long splatu4(void)
+{
+  vector unsigned long long mzero = {-1,-1};
+  return (vector unsigned long long) vec_sl(mzero, mzero);
+}
+vector signed char splats1(void)
+{
+  vector unsigned char mzero = vec_splat_u8(-1);
+  return (vector signed char) vec_sl(mzero, mzero);
+}
+
+vector signed short splats2(void)
+{
+  vector unsigned short mzero = vec_splat_u16(-1);
+  return (vector signed short) vec_sl(mzero, mzero);
+}
+
+vector signed int splats3(void)
+{
+  vector unsigned int mzero = vec_splat_u32(-1);
+  return (vector signed int) vec_sl(mzero, mzero);
+}
+
+vector signed long long splats4(void)
+{
+  vector unsigned long long mzero = {-1,-1};
+  return (vector signed long long) vec_sl(mzero, mzero);
+}
+
+/* Codegen will consist of splat and shift instructions for most types.
+   Noted variations: if gimple folding is disabled, or if -fwrapv is not
+   specified, the long long tests will generate a vspltisw+vsld pair,
+   versus generating a lvx.  */
+/* { dg-final { scan-assembler-times {\mvspltisb\M|\mvspltish\M|\mvspltisw\M} 9 } } */
+/* { dg-final { scan-assembler-times {\mvslb\M|\mvslh\M|\mvslw\M|\mvsld\M} 9 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M} 0 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-altivectest-3.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-altivectest-3.c
new file mode 100644
index 0000000..df0b88c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-altivectest-3.c
@@ -0,0 +1,71 @@
+/* PR86731. Verify that the rs6000 gimple-folding code handles the
+   left shift properly.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O3" } */
+
+#include <altivec.h>
+/* The original test as reported.  */
+vector unsigned int splat(void)
+{
+  vector unsigned int mzero = vec_splat_u32(-1);
+  return (vector unsigned int) vec_sl(mzero, mzero);
+}
+
+/* more testcase variations.  */
+vector unsigned char splatu1(void)
+{
+  vector unsigned char mzero = vec_splat_u8(-1);
+  return (vector unsigned char) vec_sl(mzero, mzero);
+}
+
+vector unsigned short splatu2(void)
+{
+  vector unsigned short mzero = vec_splat_u16(-1);
+  return (vector unsigned short) vec_sl(mzero, mzero);
+}
+
+vector unsigned int splatu3(void)
+{
+  vector unsigned int mzero = vec_splat_u32(-1);
+  return (vector unsigned int) vec_sl(mzero, mzero);
+}
+
+vector unsigned long long splatu4(void)
+{
+  vector unsigned long long mzero = {-1,-1};
+  return (vector unsigned long long) vec_sl(mzero, mzero);
+}
+vector signed char splats1(void)
+{
+  vector unsigned char mzero = vec_splat_u8(-1);
+  return (vector signed char) vec_sl(mzero, mzero);
+}
+
+vector signed short splats2(void)
+{
+  vector unsigned short mzero = vec_splat_u16(-1);
+  return (vector signed short) vec_sl(mzero, mzero);
+}
+
+vector signed int splats3(void)
+{
+  vector unsigned int mzero = vec_splat_u32(-1);
+  return (vector signed int) vec_sl(mzero, mzero);
+}
+
+vector signed long long splats4(void)
+{
+  vector unsigned long long mzero = {-1,-1};
+  return (vector signed long long) vec_sl(mzero, mzero);
+}
+
+/* Codegen will consist of splat and shift instructions for most types.
+   Noted variations: if gimple folding is disabled, or if -fwrapv is not
+   specified, the long long tests will generate a vspltisw+vsld pair,
+   versus generating a single lvx.  */
+/* { dg-final { scan-assembler-times {\mvspltisb\M|\mvspltish\M|\mvspltisw\M} 9 } } */
+/* { dg-final { scan-assembler-times {\mvslb\M|\mvslh\M|\mvslw\M|\mvsld\M} 9 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M} 0 } } */