From patchwork Thu Jan 6 06:51:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 1575947
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] [RTL/fwprop] Allow propagations from inner loop to outer loop.
Date: Thu, 6 Jan 2022 14:51:48 +0800
Message-Id: <20220106065148.64387-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.18.1
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: liuhongt via Gcc-patches
From: liuhongt
Reply-To: liuhongt

> that's flow_loop_nested_p (loop *outer, loop *inner) which
> is implemented in O(1).  Note behavior for outer == inner
> might be different (didn't check your implementation too hard)
>

Thanks, it seems flow_loop_nested_p assumes that outer and inner are not NULL.
So I added some NULL checks; a NULL loop is treated as an outer loop of any other loop.

gcc/ChangeLog:

	PR rtl/103750
	* fwprop.c (forward_propagate_into): Allow propagations from
	inner loop to outer loop.

gcc/testsuite/ChangeLog:

	* g++.target/i386/pr103750-fwprop-1.C: New test.
---
 gcc/fwprop.c                                  |  7 +++--
 .../g++.target/i386/pr103750-fwprop-1.C       | 26 +++++++++++++++++++
 2 files changed, 31 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/pr103750-fwprop-1.C

diff --git a/gcc/fwprop.c b/gcc/fwprop.c
index 2eab4fd4614..af2e9d1c189 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -866,10 +866,13 @@ forward_propagate_into (use_info *use, bool reg_prop_only = false)
   rtx src = SET_SRC (def_set);
 
   /* Allow propagations into a loop only for reg-to-reg copies, since
-     replacing one register by another shouldn't increase the cost.  */
+     replacing one register by another shouldn't increase the cost.
+     Propagations from inner loop to outer loop should be also ok.  */
   struct loop *def_loop = def_insn->bb ()->cfg_bb ()->loop_father;
   struct loop *use_loop = use->bb ()->cfg_bb ()->loop_father;
-  if ((reg_prop_only || def_loop != use_loop)
+  if ((reg_prop_only
+       || (use_loop && def_loop != use_loop
+           &&(!def_loop || !flow_loop_nested_p (use_loop, def_loop))))
       && (!reg_single_def_p (dest) || !reg_single_def_p (src)))
     return false;
 
diff --git a/gcc/testsuite/g++.target/i386/pr103750-fwprop-1.C b/gcc/testsuite/g++.target/i386/pr103750-fwprop-1.C
new file mode 100644
index 00000000000..26987d307aa
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr103750-fwprop-1.C
@@ -0,0 +1,26 @@
+/* PR target/103750.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -std=c++1y -march=cannonlake -fdump-rtl-fwprop1" } */
+/* { dg-final { scan-rtl-dump-not "subreg:HI\[ \\\(\]*reg:SI\[^\n]*\n\[^\n]*UNSPEC_TZCNT" "fwprop1" } } */
+
+#include <immintrin.h>
+const char16_t *qustrchr(char16_t *n, char16_t *e, char16_t c) noexcept
+{
+    __m256i mch256 = _mm256_set1_epi16(c);
+    for ( ; n < e; n += 32) {
+        __m256i data1 = _mm256_loadu_si256(reinterpret_cast<const __m256i *>(n));
+        __m256i data2 = _mm256_loadu_si256(reinterpret_cast<const __m256i *>(n) + 1);
+        __mmask16 mask1 = _mm256_cmpeq_epu16_mask(data1, mch256);
+        __mmask16 mask2 = _mm256_cmpeq_epu16_mask(data2, mch256);
+        if (_kortestz_mask16_u8(mask1, mask2))
+            continue;
+
+        unsigned idx = _tzcnt_u32(mask1);
+        if (mask1 == 0) {
+            idx = __tzcnt_u16(mask2);
+            n += 16;
+        }
+        return n + idx;
+    }
+    return e;
+}
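
For reviewers who want to check the new condition in isolation, here is a
minimal standalone sketch of the loop part of the test. It is not GCC code:
Loop, nested_p and crosses_into_loop are hypothetical names, and nested_p is
only a simplified stand-in that walks parent links, assuming (as I read it)
that flow_loop_nested_p is a strict test that returns false when
outer == inner. The reg_prop_only and reg_single_def_p parts of the real
condition are left out.

```cpp
#include <cassert>

// Hypothetical stand-in for GCC's `struct loop`; only the parent link matters.
struct Loop {
    const Loop *outer;  // enclosing loop, or nullptr if there is none
};

// Simplified analogue of flow_loop_nested_p (outer, inner): true when `inner`
// is strictly nested inside `outer`.  Walks parent links instead of using
// GCC's O(1) lookup; `inner` must not be null.
static bool nested_p(const Loop *outer, const Loop *inner)
{
    for (const Loop *l = inner->outer; l; l = l->outer)
        if (l == outer)
            return true;
    return false;
}

// Loop part of the patched condition.  Returns true when the propagation is
// restricted to single-def reg-to-reg copies, i.e. when the use is in a loop
// that neither equals nor encloses the loop of the definition.  A null loop
// is treated as an outer loop of any other loop.
static bool crosses_into_loop(const Loop *def_loop, const Loop *use_loop)
{
    return use_loop && def_loop != use_loop
           && (!def_loop || !nested_p(use_loop, def_loop));
}
```

With an outer loop A and a loop B nested in A, crosses_into_loop (B, A) is
false, so a definition in B can be propagated to a use in A, while
crosses_into_loop (A, B) stays true and keeps the old restriction for
propagations into an inner loop.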