From patchwork Fri Dec 5 16:33:02 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ilya Tocar
X-Patchwork-Id: 418173
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Date: Fri, 5 Dec 2014 19:33:02 +0300
From: Ilya Tocar
To: Uros Bizjak
Cc: Jakub Jelinek, "H.J. Lu", GCC Patches
Subject: Re: [PATCH x86] Enable v64qi permutations.
Message-ID: <20141205163302.GA48900@msticlxl7.ims.intel.com>
References: <20141204094959.GA67582@msticlxl7.ims.intel.com>
 <20141204115733.GA1923@tucnak.redhat.com>
 <20141204120426.GB1923@tucnak.redhat.com>
 <20141204135359.GB16358@msticlxl7.ims.intel.com>
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.23 (2014-03-12)

On 04 Dec 15:16, Uros Bizjak wrote:
> On Thu, Dec 4, 2014 at 2:53 PM, Ilya Tocar wrote:
>
> >> >>> >> Can you add a few testcases?
> >> >>> >
> >> >>> > Isn't it already covered by gcc.dg/torture/vshuf* ?
> >> >>> >
> >> >>>
> >> >>> I didn't see them fail on my machines today.
> >> >>
> >> >> Those are executable testcases, those better should not fail.
> >> >> The patch just improved code generation and the testcases test
> >> >> if the improved code generation works well.
> >> >> Did you mean some scan-assembler test that verifies the better code
> >> >> generation? Guess it is possible, though fragile.
> >> >
> >> > I think that existing executable testcases adequately cover the
> >> > functionality of the patch.
> >> >
> >> > The patch is OK.
> >>
> >> BTW, the ChangeLog is missing.
> >>
> > * config/i386/i386.c (ix86_expand_vec_perm_vpermi2): Handle v64qi.
> > (expand_vec_perm_broadcast_1): Ditto.
> > (expand_vec_perm_vpermi2_vpshub2): New.
> > (ix86_expand_vec_perm_const_1): Use it.
> > (ix86_vectorize_vec_perm_const_ok): Handle v64qi.
> > * config/i386/sse.md (VEC_PERM_AVX2): Add v64qi.
> > (VEC_PERM_CONST): Ditto.
> >> index ca5d720..6252e7e 100644
> >> --- a/gcc/config/i386/sse.md
> >> +++ b/gcc/config/i386/sse.md
> >> @@ -10678,7 +10678,7 @@
> >>     (V8SF "TARGET_AVX2") (V4DF "TARGET_AVX2")
> >>     (V16SF "TARGET_AVX512F") (V8DF "TARGET_AVX512F")
> >>     (V16SI "TARGET_AVX512F") (V8DI "TARGET_AVX512F")
> >> -   (V32HI "TARGET_AVX512BW")])
> >> +   (V32HI "TARGET_AVX512BW") (V64QI "TARGET_AVX512VBMI")])
> >>
> >> I don't think change for VBMI target belongs in this patch.
> >>
> > Those changes enable non-const v64qi permutes
> > (via single vpermi2b insn), should I split them into separate patch?
>
> If they are not on the same topic, then please yes. Please don't mix
> separate issues together.
>
OK. The patch below adds variable v64qi permutations.
OK for trunk?
(I plan to commit both of them simultaneously, if this part is approved.)

	* config/i386/i386.c (ix86_expand_vec_perm_vpermi2): Handle v64qi.
	* config/i386/sse.md (VEC_PERM_AVX2): Add v64qi.
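
For readers who want to see what a "variable" v64qi permute looks like at the source level, here is a minimal sketch. It is not part of the patch and not taken from the testsuite. The names perm2_ref, perm2_vec and check_perm2 are made up for illustration. The scalar loop follows the documented semantics of the two-operand __builtin_shuffle (selectors taken modulo twice the element count), and the vector part is guarded so that it is only built by GCC. Whether the vector version ends up as a single vpermi2b depends on the target flags in use, which this sketch does not check.

```c
#include <string.h>

#define N 64

/* Scalar reference for a two-operand byte permute with a run-time mask.
   Each selector is reduced modulo 2*N: 0..63 pick from a, 64..127 pick
   from b.  This follows the documented behaviour of
   __builtin_shuffle (a, b, mask).  */
static void
perm2_ref (unsigned char *dst, const unsigned char *a,
           const unsigned char *b, const unsigned char *mask)
{
  for (int i = 0; i < N; i++)
    {
      unsigned idx = mask[i] % (2 * N);
      dst[i] = idx < N ? a[idx] : b[idx - N];
    }
}

#if defined (__GNUC__) && !defined (__clang__)
typedef unsigned char v64qi __attribute__ ((vector_size (64)));

/* The mask is loaded through a pointer in a noinline function, so it is
   not a compile-time constant here.  Vectors are passed by pointer to
   avoid ABI warnings when 512-bit vectors are not enabled.  */
__attribute__ ((noinline)) static void
perm2_vec (v64qi *dst, const v64qi *a, const v64qi *b, const v64qi *mask)
{
  *dst = __builtin_shuffle (*a, *b, *mask);
}
#endif

/* Returns 0 when the vector result agrees with the scalar reference
   (or when the vector path is not compiled in).  */
static int
check_perm2 (void)
{
  unsigned char a[N], b[N], m[N], want[N];

  for (int i = 0; i < N; i++)
    {
      a[i] = (unsigned char) i;
      b[i] = (unsigned char) (100 + i);
      m[i] = (unsigned char) (i * 37 + 5);	/* arbitrary selectors */
    }
  perm2_ref (want, a, b, m);

#if defined (__GNUC__) && !defined (__clang__)
  v64qi va, vb, vm, vr;
  memcpy (&va, a, N);
  memcpy (&vb, b, N);
  memcpy (&vm, m, N);
  perm2_vec (&vr, &va, &vb, &vm);
  if (memcmp (&vr, want, N) != 0)
    return 1;
#endif
  return 0;
}
```

Calling check_perm2 () from a test driver and aborting on a non-zero return gives an executable check in the same spirit as the existing vshuf tests, although those remain the authoritative coverage.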
---
 gcc/config/i386/i386.c | 4 ++++
 gcc/config/i386/sse.md | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ce5dfad..c4dbf78 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -21831,6 +21831,10 @@ ix86_expand_vec_perm_vpermi2 (rtx target, rtx op0, rtx mask, rtx op1,
       if (TARGET_AVX512VL && TARGET_AVX512BW)
         gen = gen_avx512vl_vpermi2varv16hi3;
       break;
+    case V64QImode:
+      if (TARGET_AVX512VBMI)
+        gen = gen_avx512bw_vpermi2varv64qi3;
+      break;
     case V32HImode:
       if (TARGET_AVX512BW)
         gen = gen_avx512bw_vpermi2varv32hi3;
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 734e6b4..cfbe40c 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -10691,7 +10691,7 @@
    (V8SF "TARGET_AVX2") (V4DF "TARGET_AVX2")
    (V16SF "TARGET_AVX512F") (V8DF "TARGET_AVX512F")
    (V16SI "TARGET_AVX512F") (V8DI "TARGET_AVX512F")
-   (V32HI "TARGET_AVX512BW")])
+   (V32HI "TARGET_AVX512BW") (V64QI "TARGET_AVX512VBMI")])
 
 (define_expand "vec_perm"
   [(match_operand:VEC_PERM_AVX2 0 "register_operand")