From patchwork Wed Jun 4 17:06:04 2014
X-Patchwork-Submitter: Evgeny Stupachenko
X-Patchwork-Id: 356040
Delivered-To: mailing list gcc-patches@gcc.gnu.org
In-Reply-To: <537A2A91.3000809@redhat.com>
References: <535FBC20.1000400@redhat.com> <535FE3CF.2020005@redhat.com> <537A2A91.3000809@redhat.com>
Date: Wed, 4 Jun 2014 21:06:04 +0400
Subject: Re: [PATCH 2/2, x86] Add palignr support for AVX2.
From: Evgeny Stupachenko
To: Richard Henderson
Cc: GCC Patches, Richard Biener, Uros Bizjak, "H.J. Lu"

Is it OK to use the following pattern?
The patch passed bootstrap and make check, but one test failed:
gcc/testsuite/gcc.target/i386/vect-rebuild.c
It failed on
/* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */
because that instruction is now palignr. However, both palignr and
permilpd cost 1 tick and take 6 bytes in the opcode.
I vote for modifying the test to scan for palignr:
/* { dg-final { scan-assembler-times "\tv?palignr\[ \t\]" 1 } } */

2014-06-04  Evgeny Stupachenko

        * config/i386/sse.md (*ssse3_palignr_perm): New.
        * config/i386/predicates.md (palignr_operand): New.
        Indicates if permutation is suitable for palignr instruction.

On Mon, May 19, 2014 at 8:00 PM, Richard Henderson wrote:
> On 05/05/2014 09:54 AM, Evgeny Stupachenko wrote:
>> @@ -42943,6 +42944,10 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d)
>>    if (expand_vec_perm_vpermil (d))
>>      return true;
>>
>> +  /* Try palignr on one operand.  */
>> +  if (d->one_operand_p && expand_vec_perm_palignr (d))
>> +    return true;
>
> No, because unless in_order and SSSE3, expand_vec_perm_palignr generates at
> least 2 insns, and by contract expand_vec_perm_1 must generate only one.
>
> I think what might help you out is to have the rotate permutation matched
> directly, rather than have to have it converted to a shift.
>
> Thus I think you'd do well to start this series with a patch that adds a
> pattern of the form
>
> (define_insn "*ssse3_palignr_perm"
>   [(set (match_operand:V_128 0 "register_operand" "=x,x")
>       (vec_select:V_128
>         (match_operand:V_128 1 "register_operand" "0,x")
>         (match_operand:V_128 2 "nonimmediate_operand" "xm,xm")
>         (match_parallel 3 "palign_operand"
>           [(match_operand 4 "const_int_operand" "")]
>   "TARGET_SSSE3"
> {
>   enum machine_mode imode = GET_INNER_MODE (GET_MODE (operands[0]));
>   operands[3] = GEN_INT (INTVAL (operands[4]) * GET_MODE_SIZE (imode));
>
>   switch (which_alternative)
>     {
>     case 0:
>       return "palignr\t{%3, %2, %0|%0, %2, %3}";
>     case 1:
>       return "vpalignr\t{%3, %2, %1, %0|%0, %1, %2, %3}";
>     default:
>       gcc_unreachable ();
>     }
> }
>   [(set_attr "isa" "noavx,avx")
>    (set_attr "type" "sseishft")
>    (set_attr "atom_unit" "sishuf")
>    (set_attr "prefix_data16" "1,*")
>    (set_attr "prefix_extra" "1")
>    (set_attr "length_immediate" "1")
>    (set_attr "prefix" "orig,vex")])
>
> where the palign_operand function verifies that the constants are all in order.
> This is very similar to the way we define the broadcast type patterns.
>
> You'll need a similar pattern with a different predicate for the avx2 palignr,
> since it's not a simple increment, but also verifying the cross-lane constraint.
>
> With that as patch 1/1, I believe that will significantly tidy up what else
> you're attempting to change with this series.
>
>
>
> r~

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 2ef1384..8266f3e 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1417,6 +1417,22 @@
   return true;
 })
 
+;; Return true if OP is a parallel for a palignr permute.
+(define_predicate "palignr_operand"
+  (and (match_code "parallel")
+       (match_code "const_int" "a"))
+{
+  int elt = INTVAL (XVECEXP (op, 0, 0));
+  int i, nelt = XVECLEN (op, 0);
+
+  /* Check that an order in the permutation is suitable for palignr.
+     For example, {5 6 7 0 1 2 3 4} is "palignr 5, xmm, xmm".  */
+  for (i = 1; i < nelt; ++i)
+    if (INTVAL (XVECEXP (op, 0, i)) != ((elt + i) % nelt))
+      return false;
+  return true;
+})
+
 ;; Return true if OP is a proper third operand to vpblendw256.
 (define_predicate "avx2_pblendw_operand"
   (match_code "const_int")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index c91626b..5e8fd65 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -11454,6 +11454,36 @@
     }
 })
 
+(define_insn "*ssse3_palignr_perm"
+  [(set (match_operand:V_128 0 "register_operand" "=x,x")
+      (vec_select:V_128
+        (match_operand:V_128 1 "register_operand" "0,x")
+        (match_parallel 2 "palignr_operand"
+          [(match_operand 3 "const_int_operand" "n, n")])))]
+  "TARGET_SSSE3"
+{
+  enum machine_mode imode = GET_MODE_INNER (GET_MODE (operands[0]));
+  operands[2] = GEN_INT (INTVAL (operands[3]) * GET_MODE_SIZE (imode));
+
+  switch (which_alternative)
+    {
+    case 0:
+      return "palignr\t{%2, %1, %0|%0, %1, %2}";
+    case 1:
+      return "vpalignr\t{%2, %1, %1, %0|%0, %1, %1, %2}";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sseishft")
+   (set_attr "atom_unit" "sishuf")
+   (set_attr "prefix_data16" "1,*")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "orig,vex")])
+
+
 (define_insn "abs2"
   [(set (match_operand:MMXMODEI 0 "register_operand" "=y")
         (abs:MMXMODEI
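
Illustrative note (not part of the patch): the rotation test done by
palignr_operand, and the byte immediate computed in the output code of
*ssse3_palignr_perm, can be sketched in plain C outside of GCC. The names
is_rotation and rotation_imm are hypothetical and exist only for this
sketch; the array of ints stands in for the const_int elements of the
parallel, and elt_size stands in for GET_MODE_SIZE (imode).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the palignr_operand check: return true when
   PERM is a rotation of 0..NELT-1, i.e. perm[i] == (perm[0] + i) % nelt
   for every i.  Like the predicate in the patch, it does not separately
   range-check perm[0].  */
static bool
is_rotation (const int *perm, int nelt)
{
  assert (nelt > 0);

  int elt = perm[0];
  for (int i = 1; i < nelt; ++i)
    if (perm[i] != (elt + i) % nelt)
      return false;
  return true;
}

/* Byte immediate for the rotation: first selected index times the element
   size in bytes, mirroring INTVAL (operands[3]) * GET_MODE_SIZE (imode).  */
static int
rotation_imm (const int *perm, int elt_size)
{
  return perm[0] * elt_size;
}
```

Under these assumptions, the permutation {5 6 7 0 1 2 3 4} from the
predicate's comment is accepted, and with 2-byte elements rotation_imm
gives 5 * 2 = 10, while {5 6 7 0 1 2 4 3} is rejected because its last
two indices are out of order.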