From patchwork Fri Mar 21 01:38:14 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bill Schmidt <wschmidt@linux.vnet.ibm.com>
X-Patchwork-Id: 332472
Return-Path: 
 <gcc-patches-return-363692-incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
	bits)) (No client certificate requested)
	by ozlabs.org (Postfix) with ESMTPS id F0FE82C00AE
	for <incoming@patchwork.ozlabs.org>;
	Fri, 21 Mar 2014 12:38:39 +1100 (EST)
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender
	:message-id:subject:from:to:cc:date:content-type
	:content-transfer-encoding:mime-version; q=dns; s=default; b=Rkw
	5nPIio7r7MsTcvxoPGUf8E0aqLJOowsuwD+9uS0Kum6Yv7ZI9+EsdkgbxqTFJnKO
	K8OpFYrAT7Pc67aJOd6qQpD2do07MvZ4PdOWhLw8w2rUIl/6x4eMIDrTZjK7zWp7
	aFfpylaCTqISU9A32nQnceHqpeJ/6y1HDE7O8pug=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender
	:message-id:subject:from:to:cc:date:content-type
	:content-transfer-encoding:mime-version; s=default; bh=1AJ9FfxHu
	EU1nlr1epH9PkSMYBQ=; b=cUHWLFQ2GFYHqc2XhTrYSKoB4CTO+BwSDBLfXVv6o
	QYJ2qqSzjwMDbY6IAwWnYjYHPzZAHXW8duq+4PC5kC22lcE/gSfYlJUmzbpuITdk
	N63PPwTwau2DFh235Oi8KlKLt7+OVSHnd6F3kA1LmqBqdZ9DQWP5wQTB+CUjjD8a
	f8=
Received: (qmail 20632 invoked by alias); 21 Mar 2014 01:38:28 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: 
 <mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 20609 invoked by uid 89); 21 Mar 2014 01:38:26 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.6 required=5.0 tests=AWL, BAYES_20,
	T_RP_MATCHES_RCVD autolearn=ham version=3.3.2
X-HELO: e28smtp04.in.ibm.com
Received: from e28smtp04.in.ibm.com (HELO e28smtp04.in.ibm.com)
	(122.248.162.4) by sourceware.org
	(qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted)
	ESMTPS; Fri, 21 Mar 2014 01:38:24 +0000
Received: from /spool/local	by e28smtp04.in.ibm.com with IBM ESMTP SMTP
	Gateway: Authorized Use Only! Violators will be
	prosecuted	for <gcc-patches@gcc.gnu.org> from
	<wschmidt@linux.vnet.ibm.com>; Fri, 21 Mar 2014 07:08:19 +0530
Received: from d28dlp03.in.ibm.com (9.184.220.128)	by e28smtp04.in.ibm.com
	(192.168.1.134) with IBM ESMTP SMTP Gateway: Authorized Use
	Only! Violators will be prosecuted; Fri, 21 Mar 2014 07:08:18 +0530
Received: from d28relay02.in.ibm.com (d28relay02.in.ibm.com
	[9.184.220.59])	by d28dlp03.in.ibm.com (Postfix) with ESMTP
	id 32BD4125803E	for <gcc-patches@gcc.gnu.org>;
	Fri, 21 Mar 2014 07:10:38 +0530 (IST)
Received: from d28av05.in.ibm.com (d28av05.in.ibm.com [9.184.220.67])	by
	d28relay02.in.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id
	s2L1c9fr65601644	for <gcc-patches@gcc.gnu.org>;
	Fri, 21 Mar 2014 07:08:09 +0530
Received: from d28av05.in.ibm.com (localhost [127.0.0.1])	by
	d28av05.in.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP
	id s2L1cGeE008189	for <gcc-patches@gcc.gnu.org>;
	Fri, 21 Mar 2014 07:08:16 +0530
Received: from [9.65.83.83] (sig-9-65-83-83.mts.ibm.com [9.65.83.83])	by
	d28av05.in.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP
	id s2L1cEwD008048; Fri, 21 Mar 2014 07:08:15 +0530
Message-ID: <1395365894.3599.35.camel@gnopaine>
Subject: [PATCH, rs6000] More efficient vector permute for little endian
From: Bill Schmidt <wschmidt@linux.vnet.ibm.com>
To: gcc-patches@gcc.gnu.org
Cc: dje.gcc@gmail.com
Date: Thu, 20 Mar 2014 20:38:14 -0500
Mime-Version: 1.0
X-TM-AS-MML: disable
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 14032101-5564-0000-0000-00000C9B47DF
X-IsSubscribed: yes

Hi,

The original workaround for vector permute on a little endian platform
subtracts each element of the permute control vector from 31.  Because
only the low-order five bits of each element are used, this was
implemented as subtracting the whole vector from a splat of -1.  On
reflection, the same adjustment can be done more efficiently with a
single vector nor operation, which avoids materializing the splat.
This patch makes that change.
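
For anyone who wants to convince themselves of the equivalence, here is
a small scalar sketch in plain C.  It is not part of the patch, and the
function names are made up for illustration.  It models one byte of the
permute control vector and compares the old subtract-from-splat(-1)
adjustment and the new nor adjustment against 31 - sel, looking only at
the low-order five bits.

```c
#include <assert.h>
#include <stdint.h>

/* Old adjustment, per element: subtract from a splat of -1 (0xFF).
   Hypothetical helper, not a GCC function.  */
uint8_t
invert_by_subtract (uint8_t sel)
{
  return (uint8_t) (0xFFu - sel);
}

/* New adjustment, per element: nor of the element with itself,
   ~(sel | sel), which is simply ~sel.  */
uint8_t
invert_by_nor (uint8_t sel)
{
  return (uint8_t) ~(sel | sel);
}

/* Intended adjustment: 31 - sel, reduced to the low-order five bits,
   the only bits of each selector element that are used.  */
uint8_t
invert_reference (uint8_t sel)
{
  return (uint8_t) ((31u - sel) & 31u);
}

/* Count the byte values for which either adjustment disagrees with
   the reference in the low-order five bits.  */
int
count_selector_mismatches (void)
{
  int mismatches = 0;
  unsigned int v;

  for (v = 0; v < 256; v++)
    {
      uint8_t sel = (uint8_t) v;
      uint8_t want = invert_reference (sel);

      if ((invert_by_subtract (sel) & 31u) != want
	  || (invert_by_nor (sel) & 31u) != want)
	mismatches++;
    }
  return mismatches;
}
```

Calling count_selector_mismatches () should return 0.  For 8-bit
elements, 0xFF - sel and ~sel are the same value, so the two forms agree
in all eight bits and not just the low five.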

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Is this ok for trunk?

Thanks,
Bill


2014-03-20  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_expand_vector_set): Generate a
	pattern for vector nor instead of subtract from splat(-1).
	(altivec_expand_vec_perm_le): Likewise.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 208704)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -5622,12 +5622,10 @@ rs6000_expand_vector_set (rtx target, rtx val, int
   else 
     {
       /* Invert selector.  */
-      rtx splat = gen_rtx_VEC_DUPLICATE (V16QImode,
-					 gen_rtx_CONST_INT (QImode, -1));
+      rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x));
+      rtx andx = gen_rtx_AND (V16QImode, notx, notx);
       rtx tmp = gen_reg_rtx (V16QImode);
-      emit_move_insn (tmp, splat);
-      x = gen_rtx_MINUS (V16QImode, tmp, force_reg (V16QImode, x));
-      emit_move_insn (tmp, x);
+      emit_move_insn (tmp, andx);
 
       /* Permute with operands reversed and adjusted selector.  */
       x = gen_rtx_UNSPEC (mode, gen_rtvec (3, reg, target, tmp),
@@ -30321,18 +30319,18 @@ altivec_expand_vec_perm_const_le (rtx operands[4])
 
 /* Similarly to altivec_expand_vec_perm_const_le, we must adjust the
    permute control vector.  But here it's not a constant, so we must
-   generate a vector splat/subtract to do the adjustment.  */
+   generate a vector NOR to do the adjustment.  */
 
 void
 altivec_expand_vec_perm_le (rtx operands[4])
 {
-  rtx splat, unspec;
+  rtx notx, andx, unspec;
   rtx target = operands[0];
   rtx op0 = operands[1];
   rtx op1 = operands[2];
   rtx sel = operands[3];
   rtx tmp = target;
-  rtx splatreg = gen_reg_rtx (V16QImode);
+  rtx norreg = gen_reg_rtx (V16QImode);
   enum machine_mode mode = GET_MODE (target);
 
   /* Get everything in regs so the pattern matches.  */
@@ -30345,18 +30343,14 @@ altivec_expand_vec_perm_le (rtx operands[4])
   if (!REG_P (target))
     tmp = gen_reg_rtx (mode);
 
-  /* SEL = splat(31) - SEL.  */
-  /* We want to subtract from 31, but we can't vspltisb 31 since
-     it's out of range.  -1 works as well because only the low-order
-     five bits of the permute control vector elements are used.  */
-  splat = gen_rtx_VEC_DUPLICATE (V16QImode,
-				 gen_rtx_CONST_INT (QImode, -1));
-  emit_move_insn (splatreg, splat);
-  sel = gen_rtx_MINUS (V16QImode, splatreg, sel);
-  emit_move_insn (splatreg, sel);
+  /* Invert the selector with a VNOR.  */
+  notx = gen_rtx_NOT (V16QImode, sel);
+  andx = gen_rtx_AND (V16QImode, notx, notx);
+  emit_move_insn (norreg, andx);
 
   /* Permute with operands reversed and adjusted selector.  */
-  unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, splatreg), UNSPEC_VPERM);
+  unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, norreg),
+			   UNSPEC_VPERM);
 
   /* Copy into target, possibly by way of a register.  */
   if (!REG_P (target))