From patchwork Wed Apr 13 13:29:56 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Enkovich X-Patchwork-Id: 610023 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3qlPqH1Gj5z9s4x for ; Wed, 13 Apr 2016 23:32:10 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=jVGun3dF; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:mime-version:content-type; q=dns; s=default; b=Pee6KfuUW9DdiEBrVRLo5ryTcyVSAMUJdRWyQRcrhmQc2AXrro ODDrkNrNJbXpcXjyhCe9+YM7+xT0HNpAbgBGsw6tTnQXjgBP5HMUaaSCzTnqXhFr NhSS2R8iD1wq+53G0uPPVDhPhnaI9cWhoIwCTGweO7zuQizcTScBXr1og= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:mime-version:content-type; s= default; bh=8DDj06qe0y5LVv+SlOqfVtKCzH8=; b=jVGun3dFuBVwCsj9mI4f 3B1ueUYj6bBzWAC5I7Aato+1feukQEzwpxC6gRBIbecx26n1OZKkuQHj1jLSXllz foL6CskrgQXI/uSm0AzjjPc/ZJTff46Ax+j6M0VGpvDiXhQKusrOdJhJq9JP/VmX sOLL0/4t1ozw3GH8bi2N2uM= Received: (qmail 64094 invoked by alias); 13 Apr 2016 13:32:00 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 64070 invoked by uid 89); 13 Apr 2016 13:31:59 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.6 required=5.0 tests=BAYES_00, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 spammy=compensated X-HELO: mail-qg0-f66.google.com Received: from mail-qg0-f66.google.com (HELO mail-qg0-f66.google.com) (209.85.192.66) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Wed, 13 Apr 2016 13:31:49 +0000 Received: by mail-qg0-f66.google.com with SMTP id 7so4607338qgj.0 for ; Wed, 13 Apr 2016 06:31:49 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id:mime-version :content-disposition:user-agent; bh=FEUrMhq+ucO5BM+ltDEaThxweAq797EwSCi/lSjl62c=; b=CZmJAtBbPlhWeV0nZturjNoJFhsInkXLIfMy6J2kdr5E3VD61WPcE8fIUWvc6H++ge 3UZAS42ElBWQFhzzyG6OVeenRGFFUmp6i/BzO13t7lkTgFXUB4bn+EYNsRsUqB6p0iib zVdI1VAMXtFOXJpo44L6IEz4wjtAwiqpsNeTq2YNWL09rOqvKg6GXJsgxJvowY5X6Pi9 Zd+SqWi5RFTsMJbHBl5IIycp/50GDTac8x39es41YHfJ5viFlrrdLlrgXywmXGtHBqe6 xAUaHcdgJ26PggVeJot/SM6Vh9sChPljbDZrAIaf/ERLZCNrHTNtA4U+B69MZsYX3g/S jWlQ== X-Gm-Message-State: AOPr4FXLxrVrpjd8FuNGmWYtEH0xIAPS6jUxtQR2AlhJ1yk2OzAyHZvf1ULQPMuwCnRbJQ== X-Received: by 10.140.27.182 with SMTP id 51mr11140535qgx.4.1460554306931; Wed, 13 Apr 2016 06:31:46 -0700 (PDT) Received: from msticlxl57.ims.intel.com (irdmzpr02-ext.ir.intel.com. [192.198.151.37]) by smtp.gmail.com with ESMTPSA id j185sm7934683qke.40.2016.04.13.06.31.44 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 13 Apr 2016 06:31:46 -0700 (PDT) Date: Wed, 13 Apr 2016 16:29:56 +0300 From: Ilya Enkovich To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com;, kirill.yukhin@gmail.com Subject: [PATCH, i386] Fix operands order in kunpck* insns and corresponding expands Message-ID: <20160413132956.GA13305@msticlxl57.ims.intel.com> MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.5.23 (2014-03-12) X-IsSubscribed: yes Hi, Current kunpck[hi|si|di] patterns emit operands in a wrong order. This is compensated by a wrong operands order in vec_pack_trunc_[qi|hi|si] expands and therefore we get correct code for vectorized loops. Code using kunpck* intrinsics would be wrong though. This patch fixes operands order and adds runtime tests for _mm512_kunpack* intrinsics. Bootstrapped and regtested on x86_64-pc-linux-gnu. OK for trunk? Thanks, Ilya --- gcc/ 2016-04-13 Ilya Enkovich * config/i386/i386.md (kunpckhi): Swap operands. (kunpcksi): Likewise. (kunpckdi): Likewise. * config/i386/sse.md (vec_pack_trunc_qi): Likewise. (vec_pack_trunc_): Likewise. gcc/testsuite/ 2016-04-13 Ilya Enkovich * gcc.target/i386/avx512bw-kunpckdq-2.c: New test. * gcc.target/i386/avx512bw-kunpckwd-2.c: New test. * gcc.target/i386/avx512f-kunpckbw-2.c: New test. diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 09da69e..56a3050 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -8907,7 +8907,7 @@ (const_int 8)) (zero_extend:HI (match_operand:QI 2 "register_operand" "k"))))] "TARGET_AVX512F" - "kunpckbw\t{%1, %2, %0|%0, %2, %1}" + "kunpckbw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "mode" "HI") (set_attr "type" "msklog") (set_attr "prefix" "vex")]) @@ -8920,7 +8920,7 @@ (const_int 16)) (zero_extend:SI (match_operand:HI 2 "register_operand" "k"))))] "TARGET_AVX512BW" - "kunpckwd\t{%1, %2, %0|%0, %2, %1}" + "kunpckwd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "mode" "SI")]) (define_insn "kunpckdi" @@ -8931,7 +8931,7 @@ (const_int 32)) (zero_extend:DI (match_operand:SI 2 "register_operand" "k"))))] "TARGET_AVX512BW" - "kunpckdq\t{%1, %2, %0|%0, %2, %1}" + "kunpckdq\t{%2, %1, %0|%0, %1, %2}" [(set_attr "mode" "DI")]) ;; See comment for addsi_1_zext why we do use nonimmediate_operand diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 5132955..b64457e 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -11747,16 +11747,16 @@ (define_expand "vec_pack_trunc_qi" [(set (match_operand:HI 0 ("register_operand")) - (ior:HI (ashift:HI (zero_extend:HI (match_operand:QI 1 ("register_operand"))) + (ior:HI (ashift:HI (zero_extend:HI (match_operand:QI 2 ("register_operand"))) (const_int 8)) - (zero_extend:HI (match_operand:QI 2 ("register_operand")))))] + (zero_extend:HI (match_operand:QI 1 ("register_operand")))))] "TARGET_AVX512F") (define_expand "vec_pack_trunc_" [(set (match_operand: 0 ("register_operand")) - (ior: (ashift: (zero_extend: (match_operand:SWI24 1 ("register_operand"))) + (ior: (ashift: (zero_extend: (match_operand:SWI24 2 ("register_operand"))) (match_dup 3)) - (zero_extend: (match_operand:SWI24 2 ("register_operand")))))] + (zero_extend: (match_operand:SWI24 1 ("register_operand")))))] "TARGET_AVX512BW" { operands[3] = GEN_INT (GET_MODE_BITSIZE (mode)); diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-kunpckdq-2.c b/gcc/testsuite/gcc.target/i386/avx512bw-kunpckdq-2.c new file mode 100644 index 0000000..4fe503e --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512bw-kunpckdq-2.c @@ -0,0 +1,24 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -mavx512bw" } */ +/* { dg-require-effective-target avx512bw } */ + +#define AVX512BW + +#include "avx512f-helper.h" + +static __mmask64 __attribute__((noinline,noclone)) +unpack (__mmask64 arg1, __mmask64 arg2) +{ + __mmask64 res; + + res = _mm512_kunpackd (arg1, arg2); + + return res; +} + +void +TEST (void) +{ + if (unpack (0x07UL, 0x70UL) != 0x0700000070UL) + __builtin_abort (); +} diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-kunpckwd-2.c b/gcc/testsuite/gcc.target/i386/avx512bw-kunpckwd-2.c new file mode 100644 index 0000000..5d7f895 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512bw-kunpckwd-2.c @@ -0,0 +1,24 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -mavx512bw" } */ +/* { dg-require-effective-target avx512bw } */ + +#define AVX512BW + +#include "avx512f-helper.h" + +static __mmask32 __attribute__((noinline,noclone)) +unpack (__mmask32 arg1, __mmask32 arg2) +{ + __mmask32 res; + + res = _mm512_kunpackw (arg1, arg2); + + return res; +} + +void +TEST (void) +{ + if (unpack (0x07, 0x70) != 0x070070) + __builtin_abort (); +} diff --git a/gcc/testsuite/gcc.target/i386/avx512f-kunpckbw-2.c b/gcc/testsuite/gcc.target/i386/avx512f-kunpckbw-2.c new file mode 100644 index 0000000..86580f2 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512f-kunpckbw-2.c @@ -0,0 +1,24 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -mavx512f" } */ +/* { dg-require-effective-target avx512f } */ + +#define AVX512F + +#include "avx512f-helper.h" + +static __mmask16 __attribute__((noinline,noclone)) +unpack (__mmask16 arg1, __mmask16 arg2) +{ + __mmask16 res; + + res = _mm512_kunpackb (arg1, arg2); + + return res; +} + +void +TEST (void) +{ + if (unpack (0x07, 0x70) != 0x0770) + __builtin_abort (); +}