From patchwork Mon Feb 11 13:36:11 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bill Schmidt <wschmidt@linux.ibm.com>
X-Patchwork-Id: 1039825
To: GCC Patches <gcc-patches@gcc.gnu.org>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
From: Bill Schmidt <wschmidt@linux.ibm.com>
Subject: [PATCH, v2] rs6000: Vector shift-right should honor modulo semantics
Date: Mon, 11 Feb 2019 07:36:11 -0600
Message-Id: <e54579dc-5249-a748-cd86-56ac47bc9564@linux.ibm.com>

Hi!

We received a problem report about code that implements a vector right-shift for the
vector long long (V2DImode) type.  The programmer noted that we don't have a vector
splat-immediate for this mode, but realized that a vector char splat-immediate works
just as well, since only the low 6 bits of each doubleword shift count are read.  This
is documented behavior of both the vector shift built-ins and the underlying instruction.
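
For illustration only, here is a portable scalar sketch of why the byte splat is
enough.  It is not AltiVec code, and the helper names (splat_byte_lane,
effective_count, check_splat_trick) are made up for this example.  It builds the
value one doubleword lane holds after a byte has been splatted across the vector,
then keeps the low 6 bits, which is the part the shift reads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: contents of one 64-bit lane after byte B has been
   splatted into every byte of the vector.  */
static uint64_t
splat_byte_lane (uint8_t b)
{
  return (uint64_t) b * UINT64_C (0x0101010101010101);
}

/* Hypothetical helper: the count a doubleword shift actually uses,
   namely the low 6 bits of the lane.  */
static unsigned int
effective_count (uint64_t lane)
{
  return (unsigned int) (lane & 63);
}

/* Self-check: a byte splat of 4 yields a doubleword shift count of 4.  */
static void
check_splat_trick (void)
{
  uint64_t lane = splat_byte_lane (4);
  assert (lane == UINT64_C (0x0404040404040404));
  assert (effective_count (lane) == 4);
}
```

The lane value itself is far larger than 63, so the trick depends entirely on the
upper bits being ignored.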

Starting with GCC 8, the vector shifts are expanded early in rs6000_gimple_fold_builtin.
However, the GIMPLE folding does not currently emit the TRUNC_MOD_EXPR needed to reduce
the shift count modulo the element size, so the folded code does not implement the
built-in semantics.  It appears that this was caught earlier for vector shift-left and
fixed, but the same problem was not fixed for vector shift-right.  This patch fixes that.
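
As a rough scalar model of what the folded sequence computes for a single V2DI
lane (a sketch, not the code GCC generates; model_vsrd_lane, model_vsrad_lane,
check_folded_shifts and LANE_BITS are names invented here), the count is first
reduced modulo the element width and only then used for the shift.  The inputs
mirror the values used in the new test cases.

```c
#include <assert.h>
#include <stdint.h>

/* Element width for a V2DI vector: 128 bits split into 2 lanes.  */
#define LANE_BITS (128 / 2)

/* Hypothetical model of the folded shift-right logical for one lane.  */
static uint64_t
model_vsrd_lane (uint64_t value, uint64_t count)
{
  /* TRUNC_MOD_EXPR step: reduce the count modulo the element width.  */
  unsigned int n = (unsigned int) (count % LANE_BITS);
  /* RSHIFT_EXPR step on the unsigned value.  */
  return value >> n;
}

/* Hypothetical model of the folded shift-right algebraic for one lane.  */
static int64_t
model_vsrad_lane (int64_t value, uint64_t count)
{
  unsigned int n = (unsigned int) (count % LANE_BITS);
  /* Spell out the sign fill so the result does not depend on how the
     compiler shifts negative values.  */
  if (value < 0)
    return -1 - ((-1 - value) >> n);
  return value >> n;
}

/* Self-check using a count lane produced by a byte splat of 4.  */
static void
check_folded_shifts (void)
{
  uint64_t count = UINT64_C (0x0404040404040404);
  assert (model_vsrad_lane (-256, count) == -16);
  assert (model_vsrad_lane (1025, count) == 64);
  assert (model_vsrd_lane (1992357, count) == 124522);
}
```

Without the modulo step the count would exceed the element width, and a shift by
such an amount has no defined result.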

I've added executable tests for both shift-right algebraic and shift-right logical.
Both fail prior to applying the patch, and work correctly afterwards.

Minor differences from v1 of this patch:
 * Deleted code to defer some vector splats to expand, which was unnecessary.
 * Removed powerpc64*-*-* target selector, added -mvsx option, removed extra
   braces from dg-options directive.
 * Added __attribute__ ((noinline)) to test_s*di_4 functions.
 * Corrected misspelled function names.
 * Changed test case names.
 * Added vec-sld-modulo.c as requested (works both before and after this patch).

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.  Is
this okay for trunk, and for GCC 8.3 if there is no fallout by the end of the
week?

Thanks,
Bill


[gcc]

2019-02-11  Bill Schmidt  <wschmidt@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Apply
	TRUNC_MOD_EXPR to the shift count when folding shift-right vector
	built-ins, as is already done for shift-left.

[gcc/testsuite]

2019-02-11  Bill Schmidt  <wschmidt@linux.ibm.com>

	* gcc.target/powerpc/vec-sld-modulo.c: New.
	* gcc.target/powerpc/vec-srad-modulo.c: New.
	* gcc.target/powerpc/vec-srd-modulo.c: New.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 268707)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -15735,13 +15735,37 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *
     case ALTIVEC_BUILTIN_VSRAH:
     case ALTIVEC_BUILTIN_VSRAW:
     case P8V_BUILTIN_VSRAD:
-      arg0 = gimple_call_arg (stmt, 0);
-      arg1 = gimple_call_arg (stmt, 1);
-      lhs = gimple_call_lhs (stmt);
-      g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, arg1);
-      gimple_set_location (g, gimple_location (stmt));
-      gsi_replace (gsi, g, true);
-      return true;
+      {
+	arg0 = gimple_call_arg (stmt, 0);
+	arg1 = gimple_call_arg (stmt, 1);
+	lhs = gimple_call_lhs (stmt);
+	tree arg1_type = TREE_TYPE (arg1);
+	tree unsigned_arg1_type = unsigned_type_for (TREE_TYPE (arg1));
+	tree unsigned_element_type = unsigned_type_for (TREE_TYPE (arg1_type));
+	location_t loc = gimple_location (stmt);
+	/* Force arg1 into the valid range for the arg0 type.  */
+	/* Build a vector consisting of the max valid bit-size values.  */
+	int n_elts = VECTOR_CST_NELTS (arg1);
+	tree element_size = build_int_cst (unsigned_element_type,
+					   128 / n_elts);
+	tree_vector_builder elts (unsigned_arg1_type, n_elts, 1);
+	for (int i = 0; i < n_elts; i++)
+	  elts.safe_push (element_size);
+	tree modulo_tree = elts.build ();
+	/* Modulo the provided shift value against that vector.  */
+	gimple_seq stmts = NULL;
+	tree unsigned_arg1 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+					   unsigned_arg1_type, arg1);
+	tree new_arg1 = gimple_build (&stmts, loc, TRUNC_MOD_EXPR,
+				      unsigned_arg1_type, unsigned_arg1,
+				      modulo_tree);
+	gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+	/* And finally, do the shift.  */
+	g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, new_arg1);
+	gimple_set_location (g, loc);
+	gsi_replace (gsi, g, true);
+	return true;
+      }
    /* Flavors of vector shift left.
       builtin_altivec_vsl{b,h,w} -> vsl{b,h,w}.  */
     case ALTIVEC_BUILTIN_VSLB:
@@ -15795,14 +15819,34 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *
 	arg0 = gimple_call_arg (stmt, 0);
 	arg1 = gimple_call_arg (stmt, 1);
 	lhs = gimple_call_lhs (stmt);
+	tree arg1_type = TREE_TYPE (arg1);
+	tree unsigned_arg1_type = unsigned_type_for (TREE_TYPE (arg1));
+	tree unsigned_element_type = unsigned_type_for (TREE_TYPE (arg1_type));
+	location_t loc = gimple_location (stmt);
 	gimple_seq stmts = NULL;
 	/* Convert arg0 to unsigned.  */
 	tree arg0_unsigned
 	  = gimple_build (&stmts, VIEW_CONVERT_EXPR,
 			  unsigned_type_for (TREE_TYPE (arg0)), arg0);
+	/* Force arg1 into the valid range for the arg0 type.  */
+	/* Build a vector consisting of the max valid bit-size values.  */
+	int n_elts = VECTOR_CST_NELTS (arg1);
+	tree element_size = build_int_cst (unsigned_element_type,
+					   128 / n_elts);
+	tree_vector_builder elts (unsigned_arg1_type, n_elts, 1);
+	for (int i = 0; i < n_elts; i++)
+	  elts.safe_push (element_size);
+	tree modulo_tree = elts.build ();
+	/* Modulo the provided shift value against that vector.  */
+	tree unsigned_arg1 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+					   unsigned_arg1_type, arg1);
+	tree new_arg1 = gimple_build (&stmts, loc, TRUNC_MOD_EXPR,
+				      unsigned_arg1_type, unsigned_arg1,
+				      modulo_tree);
+	/* Do the shift.  */
 	tree res
 	  = gimple_build (&stmts, RSHIFT_EXPR,
-			  TREE_TYPE (arg0_unsigned), arg0_unsigned, arg1);
+			  TREE_TYPE (arg0_unsigned), arg0_unsigned, new_arg1);
 	/* Convert result back to the lhs type.  */
 	res = gimple_build (&stmts, VIEW_CONVERT_EXPR, TREE_TYPE (lhs), res);
 	gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
Index: gcc/testsuite/gcc.target/powerpc/vec-sld-modulo.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-sld-modulo.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/vec-sld-modulo.c	(working copy)
@@ -0,0 +1,42 @@
+/* Test that using a character splat to set up a shift-left
+   for a doubleword vector works correctly after gimple folding.  */
+
+/* { dg-do run { target { vsx_hw } } } */
+/* { dg-options "-O2 -mvsx" } */
+
+#include <altivec.h>
+
+typedef __vector unsigned long long vui64_t;
+
+static inline vui64_t
+vec_sldi (vui64_t vra, const unsigned int shb)
+{
+  vui64_t lshift;
+  vui64_t result;
+
+  /* Note legitimate use of wrong-type splat due to expectation that only
+     lower 6 bits are read.  */
+  lshift = (vui64_t) vec_splat_s8((const unsigned char)shb);
+
+  /* Vector Shift Left Doublewords based on the lower 6 bits
+     of corresponding element of lshift.  */
+  result = vec_vsld (vra, lshift);
+
+  return (vui64_t) result;
+}
+
+__attribute__ ((noinline)) vui64_t
+test_sldi_4 (vui64_t a)
+{
+  return vec_sldi (a, 4);
+}
+
+int
+main ()
+{
+  vui64_t x = {-256, 1025};
+  x = test_sldi_4 (x);
+  if (x[0] != -4096 || x[1] != 16400)
+    __builtin_abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.target/powerpc/vec-srad-modulo.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-srad-modulo.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/vec-srad-modulo.c	(working copy)
@@ -0,0 +1,43 @@
+/* Test that using a character splat to set up a shift-right algebraic
+   for a doubleword vector works correctly after gimple folding.  */
+
+/* { dg-do run { target { vsx_hw } } } */
+/* { dg-options "-O2 -mvsx" } */
+
+#include <altivec.h>
+
+typedef __vector unsigned long long vui64_t;
+typedef __vector long long vi64_t;
+
+static inline vi64_t
+vec_sradi (vi64_t vra, const unsigned int shb)
+{
+  vui64_t rshift;
+  vi64_t result;
+
+  /* Note legitimate use of wrong-type splat due to expectation that only
+     lower 6 bits are read.  */
+  rshift = (vui64_t) vec_splat_s8((const unsigned char)shb);
+
+  /* Vector Shift Right Algebraic Doublewords based on the lower 6 bits
+     of corresponding element of rshift.  */
+  result = vec_vsrad (vra, rshift);
+
+  return (vi64_t) result;
+}
+
+__attribute__ ((noinline)) vi64_t
+test_sradi_4 (vi64_t a)
+{
+  return vec_sradi (a, 4);
+}
+
+int
+main ()
+{
+  vi64_t x = {-256, 1025};
+  x = test_sradi_4 (x);
+  if (x[0] != -16 || x[1] != 64)
+    __builtin_abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.target/powerpc/vec-srd-modulo.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-srd-modulo.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/vec-srd-modulo.c	(working copy)
@@ -0,0 +1,42 @@
+/* Test that using a character splat to set up a shift-right logical
+   for a doubleword vector works correctly after gimple folding.  */
+
+/* { dg-do run { target { vsx_hw } } } */
+/* { dg-options "-O2 -mvsx" } */
+
+#include <altivec.h>
+
+typedef __vector unsigned long long vui64_t;
+
+static inline vui64_t
+vec_srdi (vui64_t vra, const unsigned int shb)
+{
+  vui64_t rshift;
+  vui64_t result;
+
+  /* Note legitimate use of wrong-type splat due to expectation that only
+     lower 6 bits are read.  */
+  rshift = (vui64_t) vec_splat_s8((const unsigned char)shb);
+
+  /* Vector Shift Right [Logical] Doublewords based on the lower 6 bits
+     of corresponding element of rshift.  */
+  result = vec_vsrd (vra, rshift);
+
+  return (vui64_t) result;
+}
+
+__attribute__ ((noinline)) vui64_t
+test_srdi_4 (vui64_t a)
+{
+  return vec_srdi (a, 4);
+}
+
+int
+main ()
+{
+  vui64_t x = {1992357, 1025};
+  x = test_srdi_4 (x);
+  if (x[0] != 124522 || x[1] != 64)
+    __builtin_abort ();
+  return 0;
+}