From patchwork Wed Dec 20 18:30:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1878682
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: prathamesh.kulkarni@linaro.org
Subject: [PATCH] cse: Fix handling of fake vec_select sets [PR111702]
Date: Wed, 20 Dec 2023 18:30:33 +0000

If cse sees:

  (set (reg R) (const_vector [A B ...]))

it creates fake sets of the form:

  (set R[0] A)
  (set R[1] B)
  ...

(with R[n] replaced by appropriate rtl) and then adds them to the tables
in the same way as for normal sets.
This allows a sequence like:

  (set (reg R2) A)
  ...(reg R2)...

to try to use R[0] instead of (reg R2).

But the pass was taking the analogy too far, and was trying to simplify
these fake sets based on costs.  That is, if there was an earlier:

  (set (reg T) A)

the pass would go to considerable effort trying to work out whether:

  (set R[0] A)

or:

  (set R[0] (reg T))

was more profitable.  This included running validate*_change on the sets,
which has no meaning given that the sets are not part of the insn.

In this example, the equivalence A == T is already known, and the
purpose of the fake sets is to add A == T == R[0].  We can do that
just as easily (or, as the PR shows, more easily) if we keep the
original form of the fake set, with A instead of T.

The problem in the PR occurred if we had:

(1) something that establishes an equivalence between a vector V1 of
    M-bit scalar integers and a hard register H

(2) something that establishes an equivalence between a vector V2 of
    N-bit scalar integers, where Nnext_same_value)
      if (rtx_equal_p (elt->exp, x))
        return elt;

to insert_with_costs, or by making cse_insn check whether previous
sets have recorded the same equivalence.  The latter seems more
appealing from a compile-time perspective.  But in this case, doing
that would be adding yet more spurious work to the handling of fake sets.

The handling of fake sets therefore seems like the more fundamental bug.

While there, the patch also makes sure that we don't apply REG_EQUAL
notes to these fake sets.  They only describe the "real" (first) set.

gcc/
        PR rtl-optimization/111702
        * cse.cc (set::mode): Move earlier.
        (set::src_in_memory, set::src_volatile): Convert to bitfields.
        (set::is_fake_set): New member variable.
        (add_to_set): Add an is_fake_set parameter.
        (find_sets_in_insn): Update calls accordingly.
        (cse_insn): Do not apply REG_EQUAL notes to fake sets.  Do not
        try to optimize them either, or validate changes to them.

gcc/testsuite/
        PR rtl-optimization/111702
        * gcc.dg/rtl/aarch64/pr111702.c: New test.
---
 gcc/cse.cc                                  | 38 +++++++++++-------
 gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c | 43 +++++++++++++++++++++
 2 files changed, 67 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c

diff --git a/gcc/cse.cc b/gcc/cse.cc
index f9603fdfd43..9fd51ca2832 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -4128,13 +4128,17 @@ struct set
   unsigned dest_hash;
   /* The SET_DEST, with SUBREG, etc., stripped.  */
   rtx inner_dest;
+  /* Original machine mode, in case it becomes a CONST_INT.  */
+  ENUM_BITFIELD(machine_mode) mode : MACHINE_MODE_BITSIZE;
   /* Nonzero if the SET_SRC is in memory.  */
-  char src_in_memory;
+  unsigned int src_in_memory : 1;
   /* Nonzero if the SET_SRC contains something
      whose value cannot be predicted and understood.  */
-  char src_volatile;
-  /* Original machine mode, in case it becomes a CONST_INT.  */
-  ENUM_BITFIELD(machine_mode) mode : MACHINE_MODE_BITSIZE;
+  unsigned int src_volatile : 1;
+  /* Nonzero if RTL is an artificial set that has been created to describe
+     part of an insn's effect.  Zero means that RTL appears directly in
+     the insn pattern.  */
+  unsigned int is_fake_set : 1;
   /* Hash value of constant equivalent for SET_SRC.  */
   unsigned src_const_hash;
   /* A constant equivalent for SET_SRC, if any.  */
@@ -4229,12 +4233,15 @@ try_back_substitute_reg (rtx set, rtx_insn *insn)
     }
 }
 
-/* Add an entry containing RTL X into SETS.  */
+/* Add an entry containing RTL X into SETS.  IS_FAKE_SET is true if X is
+   an artificial set that has been created to describe part of an insn's
+   effect.  */
 static inline void
-add_to_set (vec<struct set> *sets, rtx x)
+add_to_set (vec<struct set> *sets, rtx x, bool is_fake_set)
 {
   struct set entry = {};
   entry.rtl = x;
+  entry.is_fake_set = is_fake_set;
   sets->safe_push (entry);
 }
 
@@ -4271,7 +4278,7 @@ find_sets_in_insn (rtx_insn *insn, vec<struct set> *psets)
               && known_eq (GET_MODE_NUNITS (GET_MODE (SET_SRC (x))), 1)))
         {
           /* First register the vector itself.  */
-          add_to_set (psets, x);
+          add_to_set (psets, x, false);
           rtx src = SET_SRC (x);
           /* Go over the constants of the CONST_VECTOR in forward order, to
              put them in the same order in the SETS array.  */
@@ -4281,11 +4288,12 @@ find_sets_in_insn (rtx_insn *insn, vec<struct set> *psets)
                  used to tell CSE how to get to a particular constant.  */
               rtx y = simplify_gen_vec_select (SET_DEST (x), i);
               gcc_assert (y);
-              add_to_set (psets, gen_rtx_SET (y, CONST_VECTOR_ELT (src, i)));
+              rtx set = gen_rtx_SET (y, CONST_VECTOR_ELT (src, i));
+              add_to_set (psets, set, true);
             }
         }
       else
-        add_to_set (psets, x);
+        add_to_set (psets, x, false);
     }
   else if (GET_CODE (x) == PARALLEL)
     {
@@ -4306,7 +4314,7 @@ find_sets_in_insn (rtx_insn *insn, vec<struct set> *psets)
           else if (GET_CODE (SET_SRC (y)) == CALL)
             ;
           else
-            add_to_set (psets, y);
+            add_to_set (psets, y, false);
         }
     }
 }
@@ -4616,6 +4624,7 @@ cse_insn (rtx_insn *insn)
       int src_related_regcost = MAX_COST;
       int src_elt_regcost = MAX_COST;
       scalar_int_mode int_mode;
+      bool is_fake_set = sets[i].is_fake_set;
 
       dest = SET_DEST (sets[i].rtl);
       src = SET_SRC (sets[i].rtl);
@@ -4627,7 +4636,7 @@ cse_insn (rtx_insn *insn)
       mode = GET_MODE (src) == VOIDmode ? GET_MODE (dest) : GET_MODE (src);
       sets[i].mode = mode;
 
-      if (src_eqv)
+      if (!is_fake_set && src_eqv)
         {
           machine_mode eqvmode = mode;
           if (GET_CODE (dest) == STRICT_LOW_PART)
@@ -4648,7 +4657,7 @@ cse_insn (rtx_insn *insn)
       /* If this is a STRICT_LOW_PART assignment, src_eqv corresponds to the
          value of the INNER register, not the destination.  So it is not
          a valid substitution for the source.  But save it for later.  */
-      if (GET_CODE (dest) == STRICT_LOW_PART)
+      if (is_fake_set || GET_CODE (dest) == STRICT_LOW_PART)
         src_eqv_here = 0;
       else
         src_eqv_here = src_eqv;
@@ -5158,7 +5167,7 @@ cse_insn (rtx_insn *insn)
 
       /* Terminate loop when replacement made.  This must terminate since
          the current contents will be tested and will always be valid.  */
-      while (1)
+      while (!is_fake_set)
         {
           rtx trial;
 
@@ -5425,7 +5434,8 @@ cse_insn (rtx_insn *insn)
          with the head of the class.  If we do not do this, we will have
          both registers live over a portion of the basic block.  This way,
          their lifetimes will likely abut instead of overlapping.  */
-      if (REG_P (dest)
+      if (!is_fake_set
+          && REG_P (dest)
           && REGNO_QTY_VALID_P (REGNO (dest)))
         {
           int dest_q = REG_QTY (REGNO (dest));
diff --git a/gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c b/gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c
new file mode 100644
index 00000000000..8af2c54de3c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/rtl/aarch64/pr111702.c
@@ -0,0 +1,43 @@
+/* { dg-do compile { target aarch64*-*-* } } */
+/* { dg-options "-O2" } */
+
+extern int data[];
+
+void __RTL (startwith ("vregs")) foo ()
+{
+  (function "foo"
+    (insn-chain
+      (block 2
+        (edge-from entry (flags "FALLTHRU"))
+        (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
+        (insn 4 (set (reg:V16QI <0>)
+                     (const_vector:V16QI [(const_int 1) (const_int 1)
+                                          (const_int 0) (const_int 0)
+                                          (const_int 1) (const_int 1)
+                                          (const_int 0) (const_int 0)
+                                          (const_int 1) (const_int 1)
+                                          (const_int 0) (const_int 0)
+                                          (const_int 1) (const_int 1)
+                                          (const_int 0) (const_int 0)])))
+        (insn 5 (set (reg:V2SI v0)
+                     (const_vector:V2SI [(const_int 1) (const_int 0)])))
+        (insn 6 (set (reg:V16QI v1)
+                     (const_vector:V16QI [(const_int 0) (const_int 0)
+                                          (const_int 1) (const_int 1)
+                                          (const_int 0) (const_int 0)
+                                          (const_int 1) (const_int 1)
+                                          (const_int 0) (const_int 0)
+                                          (const_int 1) (const_int 1)
+                                          (const_int 0) (const_int 0)
+                                          (const_int 1) (const_int 1)])))
+        (insn 7 (set (reg:QI x0) (subreg:QI (reg:V16QI <0>) 0))
+                (expr_list:REG_EQUAL (const_int 1) (nil)))
+        (insn 8 (use (reg:V16QI <0>)))
+        (insn 9 (use (reg:V2SI v0)))
+        (insn 10 (use (reg:V16QI v1)))
+        (insn 11 (use (reg:QI x0)))
+        (edge-to exit (flags "FALLTHRU"))
+      ) ;; block 2
+    ) ;; insn-chain
+  ) ;; function
+}
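
For readers who want the shape of the change without the rest of cse.cc,
the following is a minimal standalone sketch of the idea, not GCC code.
ToySet, expand_const_vector_set, Result and process_sets are hypothetical
names invented for this illustration; strings stand in for rtl.  It shows
a constant-vector set being expanded into one real entry plus one flagged
fake entry per element, and a later pass that records an equivalence for
every entry but only considers real entries for cost-based rewriting.

```cpp
// Toy model only: these types and functions are hypothetical stand-ins,
// not GCC's rtx, struct set, find_sets_in_insn or cse_insn.
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One entry in the per-insn list of sets, loosely mirroring the
// is_fake_set flag that the patch adds to struct set.
struct ToySet {
  std::string dest;   // e.g. "R" or "R[0]"
  std::string src;    // e.g. "const_vector" or "1"
  bool is_fake_set;   // true if the set does not appear in the insn pattern
};

// Expand "R = {elts...}" into the real set followed by one fake set per
// element, in forward order.
std::vector<ToySet> expand_const_vector_set(const std::string &reg,
                                            const std::vector<int> &elts) {
  std::vector<ToySet> sets;
  sets.push_back({reg, "const_vector", false});  // the real set comes first
  for (std::size_t i = 0; i < elts.size(); ++i)
    sets.push_back({reg + "[" + std::to_string(i) + "]",
                    std::to_string(elts[i]), true});
  return sets;
}

struct Result {
  std::size_t equivalences_recorded = 0;  // every set contributes one
  std::size_t optimizations_tried = 0;    // only real sets are candidates
};

// Walk the sets the way the patched loop is described: fake sets still
// feed the equivalence tables, but are never candidates for rewriting.
Result process_sets(const std::vector<ToySet> &sets) {
  // Fake sets only ever follow the real set they were derived from.
  assert(sets.empty() || !sets.front().is_fake_set);
  Result r;
  for (const ToySet &s : sets) {
    if (!s.is_fake_set)
      ++r.optimizations_tried;   // stands in for the cost-based search
    ++r.equivalences_recorded;   // stands in for inserting into the tables
  }
  return r;
}
```

Calling expand_const_vector_set ("R", {1, 0}) and passing the result to
process_sets gives three recorded equivalences but only one optimization
attempt, which is the asymmetry the is_fake_set checks in cse_insn are
meant to introduce.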