From patchwork Thu May 16 02:08:04 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kugan Vivekanandarajah
X-Patchwork-Id: 1100248
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-500849-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=)
Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org
Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="WMFZJppd"; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="HWznyh0F"; dkim-atps=neutral
Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 454FGK0L7Wz9sB3 for ; Thu, 16 May 2019 12:08:56 +1000 (AEST)
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references; q=dns; s= default; b=jK1fzOlCxTs43G3mLSO8hWDuKceCHLdJvtGUckCan1jbPdfGaigxn VeoWpq/mKFFgCmiN89kAeBawjOMCLcMAB/Rlzw5+S+0DmClTudDPIouvx3sdigb/ 3lu1sIqBWeJHW2+UEo4i+P+XaTPdegyB05gFfEZqc9Y6g8CtrQrLnI=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references; s= default; bh=ZmmxiwGV4091Z2GEgdS79oa9p7w=; b=WMFZJppd5shq/vrtDrBt ngN5XGmWmOT6hSrpE0LsDJPdoKJweXJlHgjxTYsHC7vEca398e59+VmkbLIYWbuC cv066t46vNtE6V8yhQNDiv0Mh6pRRhAEkjkwX4PlMSqvzx3KQtpantVsxVyREbVP msPCk5V/YKw3ETRJykvoawY=
Received: (qmail 72708 invoked by alias); 16 May 2019 02:08:40 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Unsubscribe:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 72667 invoked by uid 89); 16 May 2019 02:08:40 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-23.2 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.1 spammy=quantity, inaccessible, Nonzero, sk:can_thr
X-HELO: mail-pf1-f193.google.com
Received: from mail-pf1-f193.google.com (HELO mail-pf1-f193.google.com) (209.85.210.193) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 16 May 2019 02:08:37 +0000
Received: by mail-pf1-f193.google.com with SMTP id c6so942550pfa.10 for ; Wed, 15 May 2019 19:08:37 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=pU7pL+rnfRRZs2kGV7vpI6cLw1hR5KaB845wgx+Zuq4=; b=HWznyh0FhtCVQl/+b4Z++hKXETkGn+IfCRE5KC0rwXlFoEV/yXc9O36ZoWS0Y2EmTv X1y6T4Ht7c2nIE235BYEwT4aCDvHaK3PuEC6SzHpeNp9k8FhtAKLQty5kxy9IHB9O2PU FpJ8P2etwMGAx8C745UxQXsFwO/QjyviAB5sEdzWXVgoozgfkTMZMuAod4yOlm/7BJfB nBZUt7ATaYdfiqnlwjs7yCaYq0cGdNaR0apkRcdobI6tS0weFOVZingNN6W+FHIZigbY 9ni0QXNmu2IoPYOhl8d5SmHSMAjiFUfqRlfyvu0j77FDYXTHK/LyKXZYeiCVncWBQe/I Av0Q==
Received: from localhost.localdomain (203-219-253-77.static.tpgi.com.au. [203.219.253.77]) by smtp.gmail.com with ESMTPSA id w6sm3827266pge.30.2019.05.15.19.08.33 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 15 May 2019 19:08:34 -0700 (PDT)
From: kugan.vivekanandarajah@linaro.org
To: gcc-patches@gcc.gnu.org
Cc: Kugan Vivekanandarajah
Subject: [PATCH 2/2] [PR88836][aarch64] Fix CSE to process parallel rtx dest one by one
Date: Thu, 16 May 2019 12:08:04 +1000
Message-Id: <1557972484-24599-3-git-send-email-kugan.vivekanandarajah@linaro.org>
In-Reply-To: <1557972484-24599-1-git-send-email-kugan.vivekanandarajah@linaro.org>
References: <1557972484-24599-1-git-send-email-kugan.vivekanandarajah@linaro.org>
X-IsSubscribed: yes

From: Kugan Vivekanandarajah

This patch changes cse_insn to process the sets of a PARALLEL rtx one
at a time, so that any destination rtx already recorded in the CSE
table is invalidated before the next set is processed.

gcc/ChangeLog:

2019-05-16  Kugan Vivekanandarajah

	PR target/88836
	* cse.c (hash_rtx_cb): Handle VEC_DUPLICATE.
	(exp_equiv_p): Likewise.
	(safe_hash): Change to accept const_rtx.
	(struct set): Add invalidate_dest_p field to record whether dest
	still needs to be invalidated.
	(cse_insn): For a PARALLEL rtx, invalidate the register set by one
	set before processing the next.

gcc/testsuite/ChangeLog:

2019-05-16  Kugan Vivekanandarajah

	PR target/88836
	* gcc.target/aarch64/pr88836.c: New test.

Change-Id: I7c3a61f034128f38abe0c2b7dab5d81dec28146c
---
 gcc/cse.c                                  | 67 ++++++++++++++++++++++++++----
 gcc/testsuite/gcc.target/aarch64/pr88836.c | 14 +++++++
 2 files changed, 73 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr88836.c

diff --git a/gcc/cse.c b/gcc/cse.c
index 6c9cda1..9dc31f5 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -570,7 +570,7 @@ static void invalidate_for_call (void);
 static rtx use_related_value (rtx, struct table_elt *);
 
 static inline unsigned canon_hash (rtx, machine_mode);
-static inline unsigned safe_hash (rtx, machine_mode);
+static inline unsigned safe_hash (const_rtx, machine_mode);
 static inline unsigned hash_rtx_string (const char *);
 
 static rtx canon_reg (rtx, rtx_insn *);
@@ -2369,6 +2369,11 @@ hash_rtx_cb (const_rtx x, machine_mode mode,
       hash += fixed_hash (CONST_FIXED_VALUE (x));
       return hash;
 
+    case VEC_DUPLICATE:
+      return hash_rtx_cb (XEXP (x, 0), VOIDmode,
+                          do_not_record_p, hash_arg_in_memory_p,
+                          have_reg_qty, cb);
+
     case CONST_VECTOR:
       {
         int units;
@@ -2599,7 +2604,7 @@ canon_hash (rtx x, machine_mode mode)
    and hash_arg_in_memory are not changed.  */
 
 static inline unsigned
-safe_hash (rtx x, machine_mode mode)
+safe_hash (const_rtx x, machine_mode mode)
 {
   int dummy_do_not_record;
   return hash_rtx (x, mode, &dummy_do_not_record, NULL, true);
@@ -2630,6 +2635,16 @@ exp_equiv_p (const_rtx x, const_rtx y, int validate, bool for_gcse)
     return x == y;
 
   code = GET_CODE (x);
+  if ((code == CONST_VECTOR && GET_CODE (y) == VEC_DUPLICATE)
+      || (code == VEC_DUPLICATE && GET_CODE (y) == CONST_VECTOR))
+    {
+      if (code == VEC_DUPLICATE)
+        std::swap (x, y);
+      if (const_vector_encoded_nelts (x) != 1)
+        return 0;
+      return exp_equiv_p (CONST_VECTOR_ENCODED_ELT (x, 0), XEXP (y, 0),
+                          validate, for_gcse);
+    }
   if (code != GET_CODE (y))
     return 0;
 
@@ -4192,7 +4207,8 @@ struct set
   char src_in_memory;
   /* Nonzero if the SET_SRC contains something
      whose value cannot be predicted and understood.  */
-  char src_volatile;
+  char src_volatile : 1;
+  char invalidate_dest_p : 1;
   /* Original machine mode, in case it becomes a CONST_INT.
      The size of this field should match the size of the mode
      field of struct rtx_def (see rtl.h).  */
@@ -4639,7 +4655,7 @@ cse_insn (rtx_insn *insn)
   for (i = 0; i < n_sets; i++)
     {
       bool repeat = false;
-      bool mem_noop_insn = false;
+      bool noop_insn = false;
       rtx src, dest;
       rtx src_folded;
       struct table_elt *elt = 0, *p;
@@ -4736,6 +4752,7 @@ cse_insn (rtx_insn *insn)
       sets[i].src = src;
       sets[i].src_hash = HASH (src, mode);
       sets[i].src_volatile = do_not_record;
+      sets[i].invalidate_dest_p = 1;
       sets[i].src_in_memory = hash_arg_in_memory;
 
       /* If SRC is a MEM, there is a REG_EQUIV note for SRC, and DEST is
@@ -5365,7 +5382,7 @@ cse_insn (rtx_insn *insn)
                        || insn_nothrow_p (insn)))
             {
               SET_SRC (sets[i].rtl) = trial;
-              mem_noop_insn = true;
+              noop_insn = true;
               break;
             }
 
@@ -5418,6 +5435,19 @@ cse_insn (rtx_insn *insn)
               src_folded_cost = constant_pool_entries_cost;
               src_folded_regcost = constant_pool_entries_regcost;
             }
+          else if (n_sets == 1
+                   && REG_P (trial)
+                   && REG_P (SET_DEST (sets[i].rtl))
+                   && GET_MODE_CLASS (mode) == MODE_CC
+                   && REGNO (trial) == REGNO (SET_DEST (sets[i].rtl))
+                   && !side_effects_p (dest)
+                   && (cfun->can_delete_dead_exceptions
+                       || insn_nothrow_p (insn)))
+            {
+              SET_SRC (sets[i].rtl) = trial;
+              noop_insn = true;
+              break;
+            }
         }
 
       /* If we changed the insn too much, handle this set from scratch.  */
@@ -5588,7 +5618,7 @@ cse_insn (rtx_insn *insn)
         }
 
       /* Similarly for no-op MEM moves.  */
-      else if (mem_noop_insn)
+      else if (noop_insn)
         {
           if (cfun->can_throw_non_call_exceptions && can_throw_internal (insn))
             cse_cfg_altered = true;
@@ -5760,6 +5790,26 @@ cse_insn (rtx_insn *insn)
               }
             elt = insert (src, classp, sets[i].src_hash, mode);
             elt->in_memory = sets[i].src_in_memory;
+
+            if (REG_P (dest)
+                && ! reg_mentioned_p (dest, src))
+              {
+                sets[i].invalidate_dest_p = 0;
+                unsigned int regno = REGNO (dest);
+                unsigned int endregno = END_REGNO (dest);
+                unsigned int j;
+
+                for (j = regno; j < endregno; j++)
+                  {
+                    if (REG_IN_TABLE (j) >= 0)
+                      {
+                        remove_invalid_refs (j);
+                        REG_IN_TABLE (j) = -1;
+                      }
+                  }
+                invalidate (dest, VOIDmode);
+              }
+
             /* If inline asm has any clobbers, ensure we only reuse
                existing inline asms and never try to put the ASM_OPERANDS
                into an insn that isn't inline asm.  */
@@ -5853,7 +5903,8 @@ cse_insn (rtx_insn *insn)
          previous quantity's chain.
          Needed for memory if this is a nonvarying address, unless
          we have just done an invalidate_memory that covers even those.  */
-      if (REG_P (dest) || GET_CODE (dest) == SUBREG)
+      if ((REG_P (dest) || GET_CODE (dest) == SUBREG)
+          && sets[i].invalidate_dest_p)
         invalidate (dest, VOIDmode);
       else if (MEM_P (dest))
         invalidate (dest, VOIDmode);
@@ -5887,7 +5938,7 @@ cse_insn (rtx_insn *insn)
 
         if (!REG_P (x))
           mention_regs (x);
-        else
+        else if (sets[i].invalidate_dest_p)
           {
             /* We used to rely on all references to a register becoming
                inaccessible when a register changes to a new quantity,
diff --git a/gcc/testsuite/gcc.target/aarch64/pr88836.c b/gcc/testsuite/gcc.target/aarch64/pr88836.c
new file mode 100644
index 0000000..442e8a7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr88836.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-S -O3 -march=armv8.2-a+sve" } */
+
+void
+f (int *restrict x, int *restrict y, int *restrict z, int n)
+{
+  for (int i = 0; i < n; i += 2)
+    {
+      x[i] = y[i] + z[i];
+      x[i + 1] = y[i + 1] - z[i + 1];
+    }
+}
+
+/* { dg-final { scan-assembler-times {ptest} 0 } } */
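
Reading aid for the exp_equiv_p hunk: the sketch below is a toy model
of the mixed CONST_VECTOR / VEC_DUPLICATE comparison, written in plain
C so it can be compiled outside GCC.  Everything in it (toy_code,
toy_expr, toy_equiv_p and the sample expressions) is hypothetical and
is not GCC's rtx representation.  It only assumes that a constant
vector whose encoding holds a single element is a broadcast of that
element, which is the condition the patch tests with
const_vector_encoded_nelts.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes.  Illustrative only; these are not
   GCC's real data structures.  */
enum toy_code { TOY_CONST_INT, TOY_CONST_VECTOR, TOY_VEC_DUPLICATE };

struct toy_expr
{
  enum toy_code code;
  long value;                  /* Payload of a TOY_CONST_INT.  */
  size_t encoded_nelts;        /* TOY_CONST_VECTOR: encoded element count.  */
  const struct toy_expr *op0;  /* First encoded element, or the operand
                                  of a TOY_VEC_DUPLICATE.  */
};

/* Return true if X and Y are known to denote the same value.  */
static bool
toy_equiv_p (const struct toy_expr *x, const struct toy_expr *y)
{
  if (x == y)
    return true;
  if (x == NULL || y == NULL)
    return false;

  /* Mixed case: normalise so that X is the constant vector.  */
  if ((x->code == TOY_CONST_VECTOR && y->code == TOY_VEC_DUPLICATE)
      || (x->code == TOY_VEC_DUPLICATE && y->code == TOY_CONST_VECTOR))
    {
      if (x->code == TOY_VEC_DUPLICATE)
        {
          const struct toy_expr *tmp = x;
          x = y;
          y = tmp;
        }
      /* Only a vector encoded as one repeated element is a broadcast.  */
      if (x->encoded_nelts != 1)
        return false;
      return toy_equiv_p (x->op0, y->op0);
    }

  if (x->code != y->code)
    return false;

  switch (x->code)
    {
    case TOY_CONST_INT:
      return x->value == y->value;
    case TOY_VEC_DUPLICATE:
      return toy_equiv_p (x->op0, y->op0);
    case TOY_CONST_VECTOR:
      /* Conservative: this sketch stores only the first encoded element,
         so it compares two constant vectors only when both are
         broadcasts.  */
      return (x->encoded_nelts == 1 && y->encoded_nelts == 1
              && toy_equiv_p (x->op0, y->op0));
    }
  return false;
}

/* Hypothetical sample expressions.  */
static const struct toy_expr toy_one = { TOY_CONST_INT, 1, 0, NULL };
static const struct toy_expr toy_two = { TOY_CONST_INT, 2, 0, NULL };
static const struct toy_expr toy_cv_ones = { TOY_CONST_VECTOR, 0, 1, &toy_one };
static const struct toy_expr toy_cv_series = { TOY_CONST_VECTOR, 0, 3, &toy_one };
static const struct toy_expr toy_dup_one = { TOY_VEC_DUPLICATE, 0, 0, &toy_one };
static const struct toy_expr toy_dup_two = { TOY_VEC_DUPLICATE, 0, 0, &toy_two };
```

In this model toy_cv_ones and toy_dup_one compare equal in either
argument order, while toy_cv_series (three encoded elements) is
rejected before its elements are even looked at.  The swap keeps the
mixed case to a single code path, which appears to be the same reason
the patch calls std::swap.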