From patchwork Wed Jun  4 17:06:04 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Evgeny Stupachenko <evstupac@gmail.com>
X-Patchwork-Id: 356040
Return-Path: 
 <gcc-patches-return-369453-incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
	bits)) (No client certificate requested)
	by ozlabs.org (Postfix) with ESMTPS id 024CA1400A6
	for <incoming@patchwork.ozlabs.org>;
	Thu,  5 Jun 2014 03:06:15 +1000 (EST)
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender
	:mime-version:in-reply-to:references:date:message-id:subject
	:from:to:cc:content-type; q=dns; s=default; b=xbhLn7m+P/eMc9/8ra
	GIU6sXmC4pT3ccIPykvB4yQV8D1buORTociSwrLfh4/Va+Ou4fIJ1eJkycsM131G
	khzKvUA0m2y8IQ/93O4BDxpg2ieUvMYPS+U08/45nRcr2HE2djgSLbQ6AcpHYK8p
	IS3Uxfu7QOAJI2L00jf/Cddvs=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender
	:mime-version:in-reply-to:references:date:message-id:subject
	:from:to:cc:content-type; s=default; bh=GBuHMNX8fV8wf1i5luGl6xDw
	WNI=; b=bnum2OIiPynQIQ45ROhbEmqPO9M0K5WDO4oAiLbeiODSS+kTI1x9hjFw
	NnDhFQ0OR77QOXawLZr6U+xh8lL04DZjCpHReL/fAcejxKbBlIjIP4270BjNvj9l
	rTkJ9hkmi7Px85bQ9QgCmyBRCXlMHGrN2azU4YKP0gqDPtZxm8c=
Received: (qmail 3111 invoked by alias); 4 Jun 2014 17:06:08 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: 
 <mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 3098 invoked by uid 89); 4 Jun 2014 17:06:08 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.4 required=5.0 tests=AWL, BAYES_00,
	FREEMAIL_FROM, RCVD_IN_DNSWL_LOW,
	SPF_PASS autolearn=ham version=3.3.2
X-HELO: mail-ob0-f171.google.com
Received: from mail-ob0-f171.google.com (HELO mail-ob0-f171.google.com)
	(209.85.214.171) by sourceware.org
	(qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted)
	ESMTPS; Wed, 04 Jun 2014 17:06:06 +0000
Received: by mail-ob0-f171.google.com with SMTP id wn1so8087847obc.16 for
	<gcc-patches@gcc.gnu.org>; Wed, 04 Jun 2014 10:06:04 -0700 (PDT)
X-Received: by 10.60.145.225 with SMTP id sx1mr23943249oeb.75.1401901564720;
	Wed, 04 Jun 2014 10:06:04 -0700 (PDT)
Received: by 10.76.18.209 with HTTP; Wed, 4 Jun 2014 10:06:04 -0700 (PDT)
In-Reply-To: <537A2A91.3000809@redhat.com>
References: 
 <CAOvf_xyiA5uaZGHd+86Z6X_6=02pRQ7Nc48nbMrHRuyj+kj_kQ@mail.gmail.com>
	<535FBC20.1000400@redhat.com>
	<CAOvf_xzXNYBAAMdZr8d-6PLnQnvJyZaDaZ7LSXnoBDy7opmuPw@mail.gmail.com>
	<535FE3CF.2020005@redhat.com>
	<CAOvf_xxqpnta9SToYjSY+=WXfcTZApnCrMDr4RXJZAPohWeJbg@mail.gmail.com>
	<CAOvf_xwRM17xGzaLoqxHXJ9U=iWJMq26ZyC4f1sp3_TUctnTVA@mail.gmail.com>
	<537A2A91.3000809@redhat.com>
Date: Wed, 4 Jun 2014 21:06:04 +0400
Message-ID: 
 <CAOvf_xwPk6-XTCpvkruL2jkXk_-weZo1TFpFdqA1MTr1q9VEhg@mail.gmail.com>
Subject: Re: [PATCH 2/2, x86] Add palignr support for AVX2.
From: Evgeny Stupachenko <evstupac@gmail.com>
To: Richard Henderson <rth@redhat.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>, Richard Biener <rguenther@suse.de>,
	Uros Bizjak <ubizjak@gmail.com>, "H.J. Lu" <hjl.tools@gmail.com>
X-IsSubscribed: yes

Is it OK to use the following pattern?

The patch passed bootstrap and make check, but one test failed:
gcc/testsuite/gcc.target/i386/vect-rebuild.c
It failed on /* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */
because that permutation is now emitted as palignr. However, both palignr
and permilpd cost 1 tick and take 6 bytes to encode.
I vote for modifying the test to scan for palignr:
/* { dg-final { scan-assembler-times "\tv?palignr\[ \t\]" 1 } } */

2014-06-04  Evgeny Stupachenko  <evstupac@gmail.com>

         * config/i386/sse.md (*ssse3_palignr<mode>_perm): New.
         * config/i386/predicates.md (palignr_operand): New.
         Indicates if permutation is suitable for palignr instruction.
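
For reference, the check that palignr_operand performs can be sketched as
standalone C outside of GCC. This is only an illustration under my own
assumptions: is_rotate_perm and palignr_byte_imm are hypothetical names that
are not part of the patch, and the permutation is a plain int array rather
than an RTL parallel. The sketch also rejects an out-of-range first index,
which the predicate itself does not check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not in the patch): return true if PERM, a
   selector of NELT element indices, is a rotation of 0..NELT-1,
   i.e. perm[i] == (perm[0] + i) % nelt for every i.  */
static bool
is_rotate_perm (const int *perm, size_t nelt)
{
  if (nelt == 0)
    return false;

  /* Extra sanity check that the predicate in the patch does not do.  */
  if (perm[0] < 0 || (size_t) perm[0] >= nelt)
    return false;

  size_t first = (size_t) perm[0];
  for (size_t i = 1; i < nelt; ++i)
    if ((size_t) perm[i] != (first + i) % nelt)
      return false;
  return true;
}

/* Hypothetical helper (not in the patch): the insn pattern scales the
   first selected element index by the element size in bytes to get
   the palignr byte immediate.  */
static int
palignr_byte_imm (const int *perm, size_t elt_size)
{
  return perm[0] * (int) elt_size;
}
```

With 2-byte elements (V8HI), the selector {5 6 7 0 1 2 3 4} is accepted and
gives a byte immediate of 10, while {5 6 7 0 1 2 4 3} is rejected.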


On Mon, May 19, 2014 at 8:00 PM, Richard Henderson <rth@redhat.com> wrote:
> On 05/05/2014 09:54 AM, Evgeny Stupachenko wrote:
>> @@ -42943,6 +42944,10 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d)
>>    if (expand_vec_perm_vpermil (d))
>>      return true;
>>
>> +  /* Try palignr on one operand.  */
>> +  if (d->one_operand_p && expand_vec_perm_palignr (d))
>> +    return true;
>
> No, because unless in_order and SSSE3, expand_vec_perm_palignr generates at
> least 2 insns, and by contract expand_vec_perm_1 must generate only one.
>
> I think what might help you out is to have the rotate permutation matched
> directly, rather than have to have it converted to a shift.
>
> Thus I think you'd do well to start this series with a patch that adds a
> pattern of the form
>
> (define_insn "*ssse3_palignr<mode>_perm"
>   [(set (match_operand:V_128 0 "register_operand" "=x,x")
>         (vec_select:V_128
>            (match_operand:V_128 1 "register_operand" "0,x")
>            (match_operand:V_128 2 "nonimmediate_operand" "xm,xm")
>            (match_parallel 3 "palign_operand"
>              [(match_operand 4 "const_int_operand" "")]
>   "TARGET_SSSE3"
> {
>   enum machine_mode imode = GET_INNER_MODE (GET_MODE (operands[0]));
>   operands[3] = GEN_INT (INTVAL (operands[4]) * GET_MODE_SIZE (imode));
>
>   switch (which_alternative)
>     {
>     case 0:
>       return "palignr\t{%3, %2, %0|%0, %2, %3}";
>     case 1:
>       return "vpalignr\t{%3, %2, %1, %0|%0, %1, %2, %3}";
>     default:
>       gcc_unreachable ();
>     }
> }
>   [(set_attr "isa" "noavx,avx")
>    (set_attr "type" "sseishft")
>    (set_attr "atom_unit" "sishuf")
>    (set_attr "prefix_data16" "1,*")
>    (set_attr "prefix_extra" "1")
>    (set_attr "length_immediate" "1")
>    (set_attr "prefix" "orig,vex")])
>
> where the palign_operand function verifies that the constants are all in order.
>  This is very similar to the way we define the broadcast type patterns.
>
> You'll need a similar pattern with a different predicate for the avx2 palignr,
> since it's not a simple increment, but also verifying the cross-lane constraint.
>
> With that as patch 1/1, I believe that will significantly tidy up what else
> you're attempting to change with this series.
>
>
>
> r~

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 2ef1384..8266f3e 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1417,6 +1417,22 @@
   return true;
 })

+;; Return true if OP is a parallel for a palignr permute.
+(define_predicate "palignr_operand"
+  (and (match_code "parallel")
+       (match_code "const_int" "a"))
+{
+  int elt = INTVAL (XVECEXP (op, 0, 0));
+  int i, nelt = XVECLEN (op, 0);
+
+  /* Check that the permutation is a rotation, which palignr can do.
+     For example, {5 6 7 0 1 2 3 4} is "palignr 5, xmm, xmm".  */
+  for (i = 1; i < nelt; ++i)
+    if (INTVAL (XVECEXP (op, 0, i)) != ((elt + i) % nelt))
+      return false;
+  return true;
+})
+
 ;; Return true if OP is a proper third operand to vpblendw256.
 (define_predicate "avx2_pblendw_operand"
   (match_code "const_int")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index c91626b..5e8fd65 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -11454,6 +11454,36 @@
     }
 })

+(define_insn "*ssse3_palignr<mode>_perm"
+  [(set (match_operand:V_128 0 "register_operand" "=x,x")
+      (vec_select:V_128
+       (match_operand:V_128 1 "register_operand" "0,x")
+       (match_parallel 2 "palignr_operand"
+         [(match_operand 3 "const_int_operand" "n, n")])))]
+  "TARGET_SSSE3"
+{
+  enum machine_mode imode = GET_MODE_INNER (GET_MODE (operands[0]));
+  operands[2] = GEN_INT (INTVAL (operands[3]) * GET_MODE_SIZE (imode));
+
+  switch (which_alternative)
+    {
+    case 0:
+      return "palignr\t{%2, %1, %0|%0, %1, %2}";
+    case 1:
+      return "vpalignr\t{%2, %1, %1, %0|%0, %1, %1, %2}";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sseishft")
+   (set_attr "atom_unit" "sishuf")
+   (set_attr "prefix_data16" "1,*")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "orig,vex")])
+
+
 (define_insn "abs<mode>2"
   [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
        (abs:MMXMODEI