From patchwork Wed Jun  2 05:41:54 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 1486386
To: segher@kernel.crashing.org
Subject: [PATCH] Canonicalize (vec_duplicate (not A)) to (not (vec_duplicate A)).
Date: Wed,  2 Jun 2021 13:41:54 +0800
Message-Id: <20210602054154.86889-1-hongtao.liu@intel.com>
In-Reply-To: <20210601140254.GJ3085@gate.crashing.org>
References: <20210601140254.GJ3085@gate.crashing.org>
X-Patchwork-Original-From: liuhongt via Gcc-patches
From: liuhongt
Reply-To: liuhongt
Cc: gcc-patches@gcc.gnu.org

On i386, this enables the following optimization. The sequence

	notl	%edi
	vpbroadcastd	%edi, %xmm0
	vpand	%xmm1, %xmm0, %xmm0

becomes

	vpbroadcastd	%edi, %xmm0
	vpandn	%xmm1, %xmm0, %xmm0

Moving the NOT outside the broadcast lets it be merged with the AND
into a single vpandn.

gcc/ChangeLog:

	PR target/100711
	* simplify-rtx.c (simplify_unary_operation_1): Canonicalize
	(vec_duplicate (not A)) to (not (vec_duplicate A)).
	* doc/md.texi (Insn Canonicalizations): Document
	canonicalization of vec_duplicate.

gcc/testsuite/ChangeLog:

	PR target/100711
	* gcc.target/i386/avx2-pr100711.c: New test.
	* gcc.target/i386/avx512bw-pr100711.c: New test.
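The rewrite relies on bitwise NOT acting on each lane independently, so inverting the scalar before the broadcast and inverting the vector after it give the same result. The sketch below checks that for one vector type using the GNU vector extensions that the new tests also use. The function names (dup_of_not, not_of_dup, forms_agree, check_samples) are made up for illustration and are not part of the patch.

```c
#include <assert.h>

typedef int v4si __attribute__((vector_size(16)));

/* Form before canonicalization: invert the scalar, then broadcast it.  */
static v4si
dup_of_not (int a, v4si c)
{
  int b = ~a;
  return (v4si) {b, b, b, b} & c;
}

/* Canonical form: broadcast first, then invert the whole vector.  */
static v4si
not_of_dup (int a, v4si c)
{
  v4si d = {a, a, a, a};
  return ~d & c;
}

/* Hypothetical helper: return 1 if both forms agree in every lane.  */
static int
forms_agree (int a, v4si c)
{
  v4si x = dup_of_not (a, c);
  v4si y = not_of_dup (a, c);
  for (int i = 0; i < 4; i++)
    if (x[i] != y[i])
      return 0;
  return 1;
}

/* Hypothetical self-check over a few sample inputs.  */
static void
check_samples (void)
{
  v4si c = {0x0f0f0f0f, -1, 0, 0x12345678};
  assert (forms_agree (0x00ff00ff, c));
  assert (forms_agree (-1, c));
  assert (forms_agree (0, c));
}
```

The second form is the one that an and-not instruction such as vpandn can implement directly, since the inversion now applies to a full vector operand of the AND.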
---
 gcc/doc/md.texi                               |  5 ++
 gcc/simplify-rtx.c                            |  6 ++
 gcc/testsuite/gcc.target/i386/avx2-pr100711.c | 73 +++++++++++++++++++
 .../gcc.target/i386/avx512bw-pr100711.c       | 48 ++++++++++++
 4 files changed, 132 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx2-pr100711.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512bw-pr100711.c

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 0e65b3ae663..06b42901413 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -8297,6 +8297,11 @@ operand of @code{mult} is also a shift, then that is extended also.
 This transformation is only applied when it can be proven that the
 original operation had sufficient precision to prevent overflow.
 
+@cindex @code{vec_duplicate}, canonicalization of
+@item
+@code{(vec_duplicate (not @var{a}))} is converted to
+@code{(not (vec_duplicate @var{a}))}.
+
 @end itemize
 
 Further canonicalization rules are defined in the function
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 04423bbd195..171fc447d50 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -1708,6 +1708,12 @@ simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
 #endif
       break;
 
+    /* Canonicalize (vec_duplicate (not A)) to (not (vec_duplicate A)).  */
+    case VEC_DUPLICATE:
+      if (GET_CODE (op) == NOT)
+	return gen_rtx_NOT (mode, gen_rtx_VEC_DUPLICATE (mode, XEXP (op, 0)));
+      break;
+
     default:
       break;
     }
diff --git a/gcc/testsuite/gcc.target/i386/avx2-pr100711.c b/gcc/testsuite/gcc.target/i386/avx2-pr100711.c
new file mode 100644
index 00000000000..5b144623873
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx2-pr100711.c
@@ -0,0 +1,73 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "pandn" 8 } } */
+/* { dg-final { scan-assembler-not "not\[bwlq\]" } } */
+typedef char v16qi __attribute__((vector_size(16)));
+typedef char v32qi __attribute__((vector_size(32)));
+typedef short v8hi __attribute__((vector_size(16)));
+typedef short v16hi __attribute__((vector_size(32)));
+typedef int v4si __attribute__((vector_size(16)));
+typedef int v8si __attribute__((vector_size(32)));
+typedef long long v2di __attribute__((vector_size(16)));
+typedef long long v4di __attribute__((vector_size(32)));
+
+v16qi
+f1 (char a, v16qi c)
+{
+  char b = ~a;
+  return (__extension__(v16qi) {b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b}) & c;
+}
+
+v32qi
+f2 (char a, v32qi c)
+{
+  char b = ~a;
+  return (__extension__(v32qi) {b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b}) & c;
+}
+
+v8hi
+f3 (short a, v8hi c)
+{
+  short b = ~a;
+  return (__extension__(v8hi) {b, b, b, b, b, b, b, b}) & c;
+}
+
+v16hi
+f4 (short a, v16hi c)
+{
+  short b = ~a;
+  return (__extension__(v16hi) {b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b}) & c;
+}
+
+v4si
+f5 (int a, v4si c)
+{
+  int b = ~a;
+  return (__extension__(v4si) {b, b, b, b}) & c;
+}
+
+v8si
+f6 (int a, v8si c)
+{
+  int b = ~a;
+  return (__extension__(v8si) {b, b, b, b, b, b, b, b}) & c;
+}
+
+v2di
+f7 (long long a, v2di c)
+{
+  long long b = ~a;
+  return (__extension__(v2di) {b, b}) & c;
+}
+
+v4di
+f8 (long long a, v4di c)
+{
+  long long b = ~a;
+  return (__extension__(v4di) {b, b, b, b}) & c;
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bw-pr100711.c b/gcc/testsuite/gcc.target/i386/avx512bw-pr100711.c
new file mode 100644
index 00000000000..f0a103d0bc2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bw-pr100711.c
@@ -0,0 +1,48 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -O2" } */
+/* { dg-final { scan-assembler-times "pandn" 4 } } */
+/* { dg-final { scan-assembler-not "not\[bwlq\]" } } */
+
+typedef char v64qi __attribute__((vector_size(64)));
+typedef short v32hi __attribute__((vector_size(64)));
+typedef int v16si __attribute__((vector_size(64)));
+typedef long long v8di __attribute__((vector_size(64)));
+
+v64qi
+f1 (char a, v64qi c)
+{
+  char b = ~a;
+  return (__extension__(v64qi) {b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b}) & c;
+}
+
+v32hi
+f2 (short a, v32hi c)
+{
+  short b = ~a;
+  return (__extension__(v32hi) {b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b}) & c;
+}
+
+v16si
+f3 (int a, v16si c)
+{
+  int b = ~a;
+  return (__extension__(v16si) {b, b, b, b, b, b, b, b,
+				 b, b, b, b, b, b, b, b}) & c;
+}
+
+v8di
+f4 (long long a, v8di c)
+{
+  long long b = ~a;
+  return (__extension__(v8di) {b, b, b, b, b, b, b, b}) & c;
+}
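
For readers less familiar with RTL, the shape of the simplify-rtx.c change can be modelled on a toy expression tree. This is only a sketch: the toy_* types and functions are invented for illustration and are not GCC's rtx API, and returning NULL when nothing matches is a convention chosen for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for RTL codes; these are not GCC's definitions.  */
enum toy_code { TOY_REG, TOY_NOT, TOY_VEC_DUPLICATE };

struct toy_expr
{
  enum toy_code code;
  struct toy_expr *op;	/* Single operand, NULL for TOY_REG.  */
};

/* Allocate a node with the given code and operand.  */
static struct toy_expr *
toy_make (enum toy_code code, struct toy_expr *op)
{
  struct toy_expr *e = malloc (sizeof *e);
  if (!e)
    abort ();
  e->code = code;
  e->op = op;
  return e;
}

/* Rewrite (vec_duplicate (not A)) as (not (vec_duplicate A)).
   Return the new tree on a match and NULL otherwise.  */
static struct toy_expr *
toy_canonicalize (const struct toy_expr *e)
{
  assert (e != NULL);
  if (e->code == TOY_VEC_DUPLICATE && e->op && e->op->code == TOY_NOT)
    return toy_make (TOY_NOT, toy_make (TOY_VEC_DUPLICATE, e->op->op));
  return NULL;
}
```

Applying toy_canonicalize to its own result returns NULL, because the outermost code is then TOY_NOT, so the rewrite in this model cannot loop.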